Non-convex constrained optimization has become a powerful framework for modeling a wide range of machine learning problems, with applications in k-means clustering, large-scale semidefinite programs (SDPs), and various other tasks. As the perfor ...
In contemporary machine learning, the efficiency of the optimization process depends on the properties of the model and the nature of the available data, which poses a significant problem as the complexity of either increases ad infinit ...
Stochastic gradient descent (SGD) and randomized coordinate descent (RCD) are two of the workhorses for training modern automated decision systems. Intriguingly, the convergence properties of these methods are not well established as we move away from the spec ...
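The abstract is truncated; for orientation, here is a minimal sketch of the two update rules it names, applied to a least-squares objective ½‖Ax − b‖². The problem instance, stepsize, and sampling scheme are illustrative assumptions, not the paper's setting.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def sgd(x, steps=1000, lr=0.01):
    """SGD: at each step, follow the gradient of one randomly sampled summand."""
    for _ in range(steps):
        i = rng.integers(n)
        g = (A[i] @ x - b[i]) * A[i]       # gradient of the i-th summand
        x = x - lr * g
    return x

def rcd(x, steps=1000):
    """RCD: at each step, exactly minimize along one randomly sampled coordinate."""
    col_sq = (A ** 2).sum(axis=0)          # coordinate-wise Lipschitz constants
    for _ in range(steps):
        j = rng.integers(d)
        g_j = A[:, j] @ (A @ x - b)        # j-th partial derivative
        x[j] -= g_j / col_sq[j]            # exact minimization along coordinate j
    return x

x_sgd = sgd(np.zeros(d))
x_rcd = rcd(np.zeros(d))
```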
Robust and distributionally robust optimization are modeling paradigms for decision-making under uncertainty where the uncertain parameters are only known to reside in an uncertainty set or are governed by any probability distribution from within an ambigu ...
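For reference, the two paradigms can be written as generic min-max templates; the loss ℓ, the uncertainty set, and the ambiguity set below are placeholders, not this paper's specific models.

```latex
% Robust optimization: hedge against the worst-case parameter u in a set U.
\[
  \min_{x}\;\max_{u \in \mathcal{U}}\; \ell(x, u)
\]
% Distributionally robust optimization: hedge against the worst-case
% probability distribution P in an ambiguity set of distributions.
\[
  \min_{x}\;\sup_{\mathbb{P} \in \mathcal{P}}\; \mathbb{E}_{\xi \sim \mathbb{P}}\big[\ell(x, \xi)\big]
\]
```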
We propose an adaptive variance-reduction method, called AdaSpider, for minimizing L-smooth, non-convex functions with a finite-sum structure. In essence, AdaSpider combines an AdaGrad-inspired [Duchi et al., 2011; McMahan & Streeter, 2010], but a fai ...
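The abstract is cut off before the algorithm is fully specified. The sketch below combines an AdaGrad-style accumulated stepsize with the SPIDER recursive gradient estimator [Fang et al., 2018], which is one plausible reading of the description, not the paper's actual method; the constants and the refresh period are assumptions.

```python
import numpy as np

def adaspider_like(grad_i, x0, n, epochs=10, q=None, eta=1.0, eps=1e-8, rng=None):
    """One plausible AdaGrad + SPIDER combination (illustrative, not the paper's method).

    grad_i(x, i): gradient of the i-th summand; n: number of summands.
    """
    rng = rng or np.random.default_rng(0)
    q = q or int(np.sqrt(n))               # full-gradient refresh period
    x_prev, x = x0.copy(), x0.copy()
    acc = 0.0                               # accumulated squared estimator norms
    v = np.zeros_like(x0)
    for t in range(epochs * q):
        if t % q == 0:
            # periodic full-gradient refresh
            v = np.mean([grad_i(x, i) for i in range(n)], axis=0)
        else:
            # SPIDER recursive correction using a single sampled summand
            i = rng.integers(n)
            v = v + grad_i(x, i) - grad_i(x_prev, i)
        acc += np.dot(v, v)
        step = eta / np.sqrt(eps + acc)     # AdaGrad-style adaptive stepsize
        x_prev, x = x, x - step * v
    return x
```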
We consider the problem of finding a saddle point for the convex-concave objective $\min_x \max_y f(x) + \langle Ax, y \rangle - g^*(y)$, where f is a convex function with locally Lipschitz gradient and g is convex and possibly non-smooth. We propose an ...
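As background for this template, one standard primal-dual iteration (in the Condat-Vũ / Chambolle-Pock family) takes a gradient step on f and a proximal step on g*; this is a known baseline scheme, not necessarily the method the paper proposes. Here τ, σ > 0 are stepsizes subject to the usual condition involving ‖A‖ and the smoothness of f.

```latex
% One standard primal-dual iteration for min_x max_y f(x) + <Ax, y> - g*(y),
% shown for background only; tau, sigma > 0 are stepsizes.
\[
  x^{k+1} = x^k - \tau\left(\nabla f(x^k) + A^\top y^k\right)
\]
\[
  y^{k+1} = \operatorname{prox}_{\sigma g^*}\!\left(y^k + \sigma A\,(2x^{k+1} - x^k)\right)
\]
```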
It has been experimentally observed that the efficiency of distributed training with stochastic gradient descent (SGD) depends decisively on the batch size and, in asynchronous implementations, on the gradient staleness. In particular, it has been observed that the spe ...
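For concreteness, an asynchronous SGD step with staleness can be written as follows; the notation (delay τ_t, sampled index i_t) is assumed for illustration rather than taken from the truncated abstract.

```latex
% Asynchronous SGD: the gradient applied at step t was computed at the
% parameters from tau_t steps earlier (its staleness), on sampled index i_t.
\[
  x_{t+1} = x_t - \eta_t\, \nabla f_{i_t}\!\left(x_{t - \tau_t}\right)
\]
```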
We present a strikingly simple proof that two rules are sufficient to automate gradient descent: 1) don’t increase the stepsize too fast and 2) don’t overstep the local curvature. No need for functional values, no line search, no information about the func ...
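As a concrete instantiation of the two rules, the sketch below follows Malitsky & Mishchenko's adaptive stepsize: rule 1 bounds the growth factor of the stepsize, and rule 2 caps it by a local inverse-curvature estimate. The initial stepsize and iteration budget are illustrative assumptions.

```python
import numpy as np

def adgd(grad, x0, steps=1000, lam=1e-6):
    """Gradient descent automated by the two rules above (illustrative constants).

    Rule 1: the stepsize may grow by at most sqrt(1 + theta), where
            theta = lam_k / lam_{k-1} tracks its recent growth.
    Rule 2: the stepsize never exceeds the local inverse-curvature estimate
            ||x_k - x_{k-1}|| / (2 ||grad_k - grad_{k-1}||).
    """
    x_prev, g_prev = x0.copy(), grad(x0)
    x = x_prev - lam * g_prev               # one plain step to initialize
    theta = np.inf
    for _ in range(steps):
        g = grad(x)
        dx = np.linalg.norm(x - x_prev)
        dg = np.linalg.norm(g - g_prev)
        curv = dx / (2 * dg) if dg > 0 else np.inf
        lam_new = min(np.sqrt(1 + theta) * lam, curv)   # rules 1 and 2
        theta, lam = lam_new / lam, lam_new
        x_prev, g_prev = x, g
        x = x - lam * g
    return x

# Example: minimize f(x) = ||x||^2 without tuning a stepsize.
x_star = adgd(lambda x: 2 * x, np.array([5.0, -3.0]))
```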
A broad class of convex optimization problems can be formulated as a semidefinite program (SDP): the minimization of a convex function over the positive-semidefinite cone subject to affine constraints. The majority of classical SDP solvers are designed fo ...
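For reference, the canonical linear instance of this template reads as follows (standard notation, not taken from the truncated abstract):

```latex
% Canonical linear SDP: C and the A_i are symmetric matrices, b in R^m.
\[
  \min_{X \succeq 0}\; \langle C, X \rangle
  \quad \text{subject to} \quad
  \langle A_i, X \rangle = b_i,\; i = 1, \dots, m
\]
```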
This paper introduces a method for computing points satisfying the second-order necessary optimality conditions for nonconvex minimization problems subject to a closed and convex constraint set. The method comprises two independent steps corresponding to t ...
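For a closed convex set C, a common statement of the conditions such a method targets is given below; the exact (possibly approximate or tolerance-based) variant used in the paper may differ.

```latex
% First-order necessary condition: no feasible descent direction at x*.
\[
  \langle \nabla f(x^\star),\, x - x^\star \rangle \ge 0
  \quad \text{for all } x \in C
\]
% Second-order necessary condition: nonnegative curvature along critical
% directions (T_C is the tangent cone of C at x*).
\[
  \langle d,\, \nabla^2 f(x^\star)\, d \rangle \ge 0
  \quad \text{for all } d \in T_C(x^\star) \text{ with } \langle \nabla f(x^\star), d \rangle = 0
\]
```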