
Publication

Augmented Lagrangian Methods for Provable and Scalable Machine Learning

Abstract

Non-convex constrained optimization has become a powerful framework for modeling a wide range of machine learning problems, with applications in k-means clustering, large-scale semidefinite programs (SDPs), and various other tasks. As these methods have shown promising results on real-world tasks, the need to develop efficient algorithms and to understand their theoretical properties has become increasingly important. Augmented Lagrangian methods provide strong tools for solving constrained problems by breaking them down into a sequence of relatively easier unconstrained subproblems. The theoretical properties of these methods have been studied extensively for problems in which both the objective function and the constraints are convex, and there have been efforts to establish convergence guarantees to first-order stationary points of constrained problems when only the objective function is non-convex. The scenario in which both the objective function and the constraints are non-convex, however, has yet to be sufficiently explored. This thesis is dedicated to the development and theoretical analysis of algorithms grounded in the augmented Lagrangian method, with an emphasis on efficiency and practicality.

First, we introduce a generic inexact augmented Lagrangian framework that allows either first-order or second-order subsolvers to be used in the subproblems, yielding convergence guarantees to first- (or second-) order stationary points of the original constrained problem. This is accomplished with an algorithm that can be implemented practically, and its success relies on a verifiable geometric regularity condition.
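The inexact augmented Lagrangian idea described above can be sketched on a toy problem. Everything in the sketch below (the objective, the single non-convex equality constraint, the penalty parameter, and the step sizes) is an illustrative assumption, not the thesis's actual setup: we minimize the distance to a point subject to staying on the unit circle, using plain gradient descent as the inexact first-order subsolver.

```python
import numpy as np

c = np.array([2.0, 0.0])       # target point (illustrative)

def f(x):                      # smooth objective: squared distance to c
    return np.sum((x - c) ** 2)

def grad_f(x):
    return 2.0 * (x - c)

def g(x):                      # non-convex equality constraint: x on the unit circle
    return x @ x - 1.0

def grad_g(x):
    return 2.0 * x

def augmented_lagrangian(x0, beta=10.0, lam=0.0, outer=30, inner=400, lr=5e-3):
    """Inexact ALM: approximately minimize L(x; lam, beta) = f + lam*g + (beta/2)*g^2
    with gradient descent, then take a dual ascent step on the multiplier."""
    x = x0.copy()
    for _ in range(outer):
        for _ in range(inner):                          # inexact primal subproblem
            grad = grad_f(x) + (lam + beta * g(x)) * grad_g(x)
            x -= lr * grad
        lam += beta * g(x)                              # dual ascent step
    return x, lam

x, lam = augmented_lagrangian(np.array([0.5, 0.5]))
```

For this instance the solution is the projection of c onto the circle, x = (1, 0), with multiplier lam = 1 from the KKT condition grad_f(x) + lam * grad_g(x) = 0.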
We showcase the effectiveness of the algorithm on a range of numerical examples, including k-means clustering and an ℓ∞ image denoising problem that incorporates a generative adversarial network (GAN) prior to counteract adversarial examples.

Next, we turn to a more specialized algorithm aimed at solving semidefinite programming (SDP) problems by factorizing the decision variable. The resulting optimization problems are inherently non-convex and nonlinear. We obtain a rate of convergence for an augmented Lagrangian method that combines aspects of both linearized and inexact augmented Lagrangian methods.

Finally, we present a linearized alternating direction method of multipliers (ADMM) framework that is more practical, requiring only two consecutive proximal gradient steps on the primal variables and a gradient ascent step on the dual. This framework addresses the increasingly important problem of minimizing, over two sets of variables, a separable non-convex composite objective with nonlinear constraints. Our analysis recovers known convergence rates with a single-loop algorithm, which is less complex than the inexact augmented Lagrangian variants, and the framework accommodates the linearized augmented Lagrangian as a special case. Numerical evidence on various machine learning tasks, including causal learning, clustering, and maximum cut, illustrates that the proposed algorithm is versatile, scalable, and accurate while requiring minimal tuning.
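As a concrete but heavily simplified instance of the linearized ADMM template (proximal-gradient-style primal updates plus a dual gradient ascent step), the sketch below solves a tiny convex splitting, min ½‖x − b‖² + ‖z‖₁ subject to x = z. The data b, penalty rho, and step size tau are illustrative assumptions; the thesis framework, unlike this toy, handles non-convex objectives and nonlinear constraints.

```python
import numpy as np

b = np.array([3.0, -0.5])          # data for the smooth term f(x) = 0.5*||x - b||^2
rho, tau, lam_reg = 1.0, 0.4, 1.0  # penalty, primal step size, l1 weight (assumed)

def soft(v, t):                    # proximal operator of t*||.||_1 (soft-thresholding)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

x = np.zeros(2)
z = np.zeros(2)
y = np.zeros(2)                    # dual variable for the constraint x - z = 0
for _ in range(1000):
    # gradient step on the smooth part of the augmented Lagrangian in x
    x = x - tau * ((x - b) + y + rho * (x - z))
    # proximal step in z (exact minimizer of the z-subproblem)
    z = soft(x + y / rho, lam_reg / rho)
    # dual gradient ascent on the multiplier
    y = y + rho * (x - z)
```

At the fixed point, z equals the soft-thresholding of b, here (2, 0), and y = b − x satisfies the stationarity conditions of both subproblems.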

Related MOOCs (32)

Related concepts (37)

Introduction to optimization on smooth manifolds: first order methods

Learn to optimize on smooth, nonlinear spaces: Join us to build your foundations (starting at "what is a manifold?") and confidently implement your first algorithm (Riemannian gradient descent).

Optimization: principles and algorithms - Linear optimization

Introduction to linear optimization, duality and the simplex algorithm.

Convex optimization

Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets (or, equivalently, maximizing concave functions over convex sets). Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is in general NP-hard.

Constrained optimization

In mathematical optimization, constrained optimization (in some contexts called constraint optimization) is the process of optimizing an objective function with respect to some variables in the presence of constraints on those variables. The objective function is either a cost function or energy function, which is to be minimized, or a reward function or utility function, which is to be maximized.

Genetic algorithm

In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EAs). Genetic algorithms are commonly used to generate high-quality solutions to optimization and search problems by relying on biologically inspired operators such as mutation, crossover, and selection. Examples of GA applications include optimizing decision trees for better performance, solving sudoku puzzles, hyperparameter optimization, and causal inference.
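The operators named above (selection, crossover, mutation) fit in a few lines. The minimal sketch below uses an illustrative population size, mutation rate, and a toy "OneMax" fitness (count the 1-bits in a bitstring); these choices are assumptions for demonstration, not drawn from the text beyond the named operators.

```python
import random

random.seed(0)
GENOME_LEN = 20

def fitness(bits):                     # OneMax: number of 1-bits (maximum = GENOME_LEN)
    return sum(bits)

def select(pop, k=3):                  # tournament selection among k random individuals
    return max(random.sample(pop, k), key=fitness)

def crossover(a, b):                   # single-point crossover
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(bits, rate=0.05):           # independent bit flips
    return [bit ^ 1 if random.random() < rate else bit for bit in bits]

def genetic_algorithm(pop_size=40, generations=60):
    pop = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
           for _ in range(pop_size)]
    for _ in range(generations):
        elite = max(pop, key=fitness)  # elitism: carry the current best unchanged
        pop = [elite] + [mutate(crossover(select(pop), select(pop)))
                         for _ in range(pop_size - 1)]
    return max(pop, key=fitness)

best = genetic_algorithm()
```

Elitism keeps the best fitness monotone across generations, which is a common practical safeguard against losing good solutions to mutation.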

Related publications (439)

Distributed learning is key to enabling the training of modern large-scale machine learning models by parallelising the learning process. Collaborative learning is essential for learning from privacy-sensitive data that is distributed across various ...

Nikolaos Geroliminis, Claudia Bongiovanni, Mor Kaspi

This paper offers a new algorithm to efficiently optimize scheduling decisions for dial-a-ride problems (DARPs), including problem variants considering electric and autonomous vehicles (e-ADARPs). The scheduling heuristic, based on linear programming theor ...

This paper develops a fast algorithm for computing the equilibrium assignment with the perturbed utility route choice (PURC) model. Without compromise, this allows the significant advantages of the PURC model to be used in large-scale applications. We form ...