On the Generalization of Stochastic Gradient Descent with Momentum
Related publications (51)
In this work, we investigate the effect of momentum on the optimisation trajectory of gradient descent. We leverage a continuous-time approach in the analysis of momentum gradient descent with step size γ and momentum parameter β that allows u ...
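For context, the heavy-ball update this snippet analyzes takes a step along the negative gradient plus a momentum term in the direction of the previous displacement. The sketch below illustrates it on a toy quadratic; the objective, step size, and iteration count are assumptions chosen for illustration, not the paper's setting.

import numpy as np

def momentum_gd(grad, x0, gamma=0.01, beta=0.9, steps=500):
    # Heavy-ball update: x_{t+1} = x_t - gamma * grad(x_t) + beta * (x_t - x_{t-1})
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(steps):
        x_next = x - gamma * grad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x

# Illustrative quadratic f(x) = 0.5 * x^T A x with an assumed diagonal A.
A = np.diag([1.0, 10.0])
x_min = momentum_gd(lambda x: A @ x, np.ones(2))

Setting beta = 0 recovers plain gradient descent with step size gamma.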
Deep neural networks have become ubiquitous in today's technological landscape, finding their way into a vast array of applications. Deep supervised learning, which relies on large labeled datasets, has been particularly successful in areas such as image cla ...
EPFL, 2023
Within the context of contemporary machine learning problems, the efficiency of the optimization process depends on the properties of the model and the nature of the data available, which poses a significant problem as the complexity of either increases ad infinit ...
EPFL, 2023
We develop a principled approach to end-to-end learning in stochastic optimization. First, we show that the standard end-to-end learning algorithm admits a Bayesian interpretation and trains a posterior Bayes action map. Building on the insights of this an ...
The monumental progress in the development of machine learning models has led to a plethora of applications with transformative effects in engineering and science. This has also turned the attention of the research community towards the pursuit of construc ...
Poisoning attacks compromise the training data utilized to train machine learning (ML) models, diminishing their overall performance, manipulating predictions on specific test samples, and implanting backdoors. This article thoughtfully explores these atta ...
Recently there has been a surge of interest in understanding implicit regularization properties of iterative gradient-based optimization algorithms. In this paper, we study the statistical guarantees on the excess risk achieved by early-stopped unconstrain ...
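As a rough illustration of early stopping for an unconstrained gradient method: the data, train/validation split, and stopping rule below are all assumptions made for the sketch, not the paper's analysis, which concerns statistical excess-risk guarantees.

import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 50
X, w_true = rng.normal(size=(n, d)), rng.normal(size=d)
y = X @ w_true + 0.5 * rng.normal(size=n)

# Hold out a validation set; stop once its loss deteriorates past the best seen.
X_tr, y_tr, X_val, y_val = X[:80], y[:80], X[80:], y[80:]
w = np.zeros(d)
eta = 1e-3
best_val, best_w = np.inf, w.copy()
for t in range(10_000):
    w -= eta * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)   # GD on least squares
    val = np.mean((X_val @ w - y_val) ** 2)
    if val < best_val:
        best_val, best_w = val, w.copy()
    elif val > 1.05 * best_val:   # crude illustrative stopping threshold
        break

The iterate returned at the stopping time (best_w) plays the role of an implicitly regularized estimator: stopping early keeps w close to the small-norm initialization.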
In this thesis, we study two closely related directions: robustness and generalization in modern deep learning. Deep learning models based on empirical risk minimization are known to be often non-robust to small, worst-case perturbations known as adversari ...
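The "small, worst-case perturbations" referred to here are adversarial examples; one standard construction is a signed-gradient step under an L-infinity budget, sketched below on an assumed linear classifier. The model, loss, and budget are illustrative and not taken from the thesis.

import numpy as np

def fgsm_perturbation(grad_x, eps):
    # Worst-case L-infinity perturbation of budget eps for a locally linear loss.
    return eps * np.sign(grad_x)

# Illustrative fixed linear classifier w, input x, label y in {-1, +1},
# with logistic loss log(1 + exp(-y * w.x)).
w = np.array([1.0, -2.0, 0.5])
x, y = np.array([0.2, 0.1, -0.3]), 1.0
grad_x = -y * w / (1.0 + np.exp(y * w @ x))   # gradient of the loss w.r.t. x
x_adv = x + fgsm_perturbation(grad_x, eps=0.1)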
We study the performance of Stochastic Cubic Regularized Newton (SCRN) on a class of functions satisfying the gradient dominance property with 1 ≤ α ≤ 2, which holds in a wide range of applications in machine learning and signal processing. This conditio ...
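For reference, a common statement of the gradient dominance (Łojasiewicz-type) condition with exponent α is given below; the constant τ and this exact normalization are assumptions on my part, with α = 2 recovering the Polyak–Łojasiewicz condition:

f(x) - \min_{x'} f(x') \;\le\; \tau \, \|\nabla f(x)\|^{\alpha}, \qquad 1 \le \alpha \le 2 .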
This work aims to study the effects of wind uncertainties in civil engineering structural design. Optimising the design of a structure for safety or operability without factoring in these uncertainties can result in a design that is not robust to these per ...