Publication

Adaptive Stochastic Variance Reduction for Non-convex Finite-Sum Minimization

Graph Chatbot

Chat with Graph Search

Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.

DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.

Leveraging Continuous Time to Understand Momentum When Training Diagonal Linear Networks

Nicolas Henri Bernard Flammarion, Hristo Georgiev Papazov, Scott William Pesme

In this work, we investigate the effect of momentum on the optimisation trajectory of gradient descent. We leverage a continuous-time approach in the analysis of momentum gradient descent with step size

\gamma

and momentum parameter

\beta

that allows u ...

2024

On the Generalization of Stochastic Gradient Descent with Momentum

Volkan Cevher, Kimon Antonakopoulos

While momentum-based accelerated variants of stochastic gradient descent (SGD) are widely used when training machine learning models, there is little theoretical understanding on the generalization error of such methods. In this work, we first show that th ...

Brookline2024

Scalable constrained optimization

Maria-Luiza Vladarean

Modern optimization is tasked with handling applications of increasingly large scale, chiefly due to the massive amounts of widely available data and the ever-growing reach of Machine Learning. Consequently, this area of research is under steady pressure t ...

EPFL2024

Perturbed Utility Stochastic Traffic Assignment

This paper develops a fast algorithm for computing the equilibrium assignment with the perturbed utility route choice (PURC) model. Without compromise, this allows the significant advantages of the PURC model to be used in large-scale applications. We form ...

Informs2024

Understanding generalization and robustness in modern deep learning

Maksym Andriushchenko

In this thesis, we study two closely related directions: robustness and generalization in modern deep learning. Deep learning models based on empirical risk minimization are known to be often non-robust to small, worst-case perturbations known as adversari ...

EPFL2024

Data-driven LPV Control for Micro-disturbance Rejection in a Hybrid Isolation Platform

Alireza Karimi, Elias Sebastian Klauser

A novel approach for linear parameter-varying (LPV) controller synthesis for adaptive rejection of time-varying sinusoidal disturbances is proposed. Only the frequency response data of a linear time-invariant (LTI) multiple-input multiple-output (MIMO) sys ...

2024

Universal and adaptive methods for robust stochastic optimization

Ali Kavis

Within the context of contemporary machine learning problems, efficiency of optimization process depends on the properties of the model and the nature of the data available, which poses a significant problem as the complexity of either increases ad infinit ...

EPFL2023

Augmented Lagrangian Methods for Provable and Scalable Machine Learning

Mehmet Fatih Sahin

Non-convex constrained optimization problems have become a powerful framework for modeling a wide range of machine learning problems, with applications in k-means clustering, large- scale semidefinite programs (SDPs), and various other tasks. As the perfor ...

EPFL2023

The statistical complexity of early-stopped mirror descent

Tomas Vaskevicius, Varun Kanade

Recently there has been a surge of interest in understanding implicit regularization properties of iterative gradient-based optimization algorithms. In this paper, we study the statistical guarantees on the excess risk achieved by early-stopped unconstrain ...

Oxford2023

Localizing Unsynchronized Sensors With Unknown Sources

Ivan Dokmanic, Dalia Salem Hassan Fahmy El Badawy

We propose a method for sensor array self-localization using a set of sources at unknown locations. The sources produce signals whose times of arrival are registered at the sensors. We look at the general case where neither the emission times of the source ...

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC2023

Leveraging Continuous Time to Understand Momentum When Training Diagonal Linear Networks

Nicolas Henri Bernard Flammarion, Hristo Georgiev Papazov, Scott William Pesme

\gamma

and momentum parameter

\beta

that allows u ...

2024