We present a strikingly simple proof that two rules are sufficient to automate gradient descent: 1) don’t increase the stepsize too fast and 2) don’t overstep the local curvature. No need for function values, no line search, no information about the function except for the gradients. By following these rules, you get a method adaptive to the local geometry, with convergence guarantees depending only on the smoothness in a neighborhood of a solution. Provided the problem is convex, our method converges even if the global smoothness constant is infinite. As an illustration, it can minimize any twice continuously differentiable convex function. We examine its performance on a range of convex and nonconvex problems, including logistic regression and matrix factorization.
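To make the two rules concrete, below is a minimal sketch in Python/NumPy of how an adaptive stepsize can be built from gradients alone. It uses one natural reading of the rules: cap the stepsize growth by sqrt(1 + theta_{k-1}) * lambda_{k-1}, and cap it by the local curvature estimate ||x_k - x_{k-1}|| / (2 ||grad(x_k) - grad(x_{k-1})||). The function and variable names (adaptive_gd, grad, lam0) and the synthetic logistic-regression example are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def adaptive_gd(grad, x0, n_iters=1000, lam0=1e-10, eps=1e-12):
    """Gradient descent with a stepsize assembled from the two rules:
    (1) do not increase the stepsize too fast, and
    (2) do not overstep the locally estimated curvature.
    Only gradients are used: no function values, no line search."""
    x_prev = x0.copy()
    g_prev = grad(x_prev)
    lam_prev = lam0                     # tiny first step; curvature cap takes over
    x = x_prev - lam_prev * g_prev
    theta = np.inf                      # ratio lam_k / lam_{k-1}, "infinite" at start
    for _ in range(n_iters):
        g = grad(x)
        diff_x = np.linalg.norm(x - x_prev)
        diff_g = np.linalg.norm(g - g_prev)
        # Rule 1: do not grow the stepsize too fast.
        growth_cap = np.sqrt(1.0 + theta) * lam_prev
        # Rule 2: do not overstep the local curvature.
        curvature_cap = diff_x / (2.0 * diff_g) if diff_g > eps else np.inf
        lam = min(growth_cap, curvature_cap)
        if not np.isfinite(lam):        # degenerate case (e.g. constant gradient)
            lam = lam_prev
        x_prev, g_prev = x, g
        x = x - lam * g
        theta = lam / lam_prev
        lam_prev = lam
    return x

# Example usage: logistic regression on synthetic data (hypothetical setup).
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 10))
b = rng.integers(0, 2, size=200) * 2 - 1     # labels in {-1, +1}

def grad(w):
    margins = b * (A @ w)
    return -(A.T @ (b / (1.0 + np.exp(margins)))) / len(b)

w_star = adaptive_gd(grad, np.zeros(10), n_iters=500)
```

Note that no global smoothness constant appears anywhere in the sketch: the curvature cap is recomputed at every iteration from two consecutive gradients, which is what lets the stepsize adapt to the local geometry.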