On the linear convergence of the stochastic gradient method with constant step-size
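The method in the title is plain stochastic gradient descent run with a fixed (non-decaying) step size. As an illustration only, and not code taken from the paper, here is a minimal sketch of that update on a least-squares problem; the objective, data, and step size are assumptions made for the example.

```python
import numpy as np

def sgd_constant_step(A, b, step_size=0.02, n_iters=5000, seed=0):
    """Minimize 0.5/n * ||A x - b||^2 with SGD using a constant step size.

    Each iteration samples one row (a_i, b_i) and moves along the
    negative stochastic gradient a_i * (a_i^T x - b_i).
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(n_iters):
        i = rng.integers(n)                   # sample one data point
        grad_i = A[i] * (A[i] @ x - b[i])     # stochastic gradient of that term
        x -= step_size * grad_i               # constant step size: no decay
    return x

# Usage on a small synthetic least-squares problem.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5))
x_true = rng.standard_normal(5)
b = A @ x_true                                # noiseless, interpolating system
x_hat = sgd_constant_step(A, b)
print(np.linalg.norm(x_hat - x_true))
```

With noiseless labels as above the problem is interpolating, which is the kind of regime in which constant step-size SGD can converge to the solution itself rather than only to a neighbourhood of it.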
Related publications (32)
We study the global well-posedness and asymptotic behavior for a semilinear damped wave equation with Neumann boundary conditions, modeling a one-dimensional linearly elastic body interacting with a rigid substrate through an adhesive material. The key fea ...
In this thesis, we study two closely related directions: robustness and generalization in modern deep learning. Deep learning models based on empirical risk minimization are known to be often non-robust to small, worst-case perturbations known as adversari ...
In the past few years, Machine Learning (ML) techniques have ushered in a paradigm shift, allowing the harnessing of ever more abundant sources of data to automate complex tasks. The technical workhorse behind these important breakthroughs arguably lies in ...
We propose a local, non-intrusive model order reduction technique to accurately approximate the solution of coupled multi-component parametrized systems governed by partial differential equations. Our approach is based on the approximation of the boundar ...
While momentum-based accelerated variants of stochastic gradient descent (SGD) are widely used when training machine learning models, there is little theoretical understanding of the generalization error of such methods. In this work, we first show that th ...
This work focuses on the coupling of trimmed shell patches using Isogeometric Analysis, based on higher continuity splines that seamlessly meet the C^1 ...
In inverse problems, the task is to reconstruct an unknown signal from its possibly noise-corrupted measurements. Penalized-likelihood-based estimation and Bayesian estimation are two powerful statistical paradigms for the resolution of such problems. They ...
In this PhD manuscript, we explore optimisation phenomena which occur in complex neural networks through the lens of 2-layer diagonal linear networks. This rudimentary architecture, which consists of a two-layer feedforward linear network with a diagonal ...
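For concreteness, the diagonal architecture referred to here is usually parameterized so that the end-to-end predictor has weights u ⊙ v. The sketch below assumes that standard parameterization; it is illustrative code, not code from the thesis.

```python
import numpy as np

def diagonal_linear_net(u, v, X):
    """Forward pass of a 2-layer diagonal linear network.

    The first layer applies the diagonal matrix diag(u), the second layer
    the weights v, so the end-to-end predictor is linear with weights u * v.
    """
    return (X * u) @ v                       # equivalent to X @ (u * v)

def squared_loss_grads(u, v, X, y):
    """Gradients of the half mean-squared error 0.5 * mean((pred - y)^2)."""
    residual = diagonal_linear_net(u, v, X) - y
    grad_w = X.T @ residual / len(y)         # gradient w.r.t. the effective weights u * v
    return grad_w * v, grad_w * u            # chain rule through the product u * v

# One gradient-descent step on (u, v), for illustration.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
y = X @ (rng.standard_normal(10) * (rng.random(10) < 0.3))   # sparse target
u = np.full(10, 0.1)
v = np.full(10, 0.1)
gu, gv = squared_loss_grads(u, v, X, y)
u, v = u - 0.1 * gu, v - 0.1 * gv
```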
Modern optimization is tasked with handling applications of increasingly large scale, chiefly due to the massive amounts of widely available data and the ever-growing reach of Machine Learning. Consequently, this area of research is under steady pressure t ...
In this work, we investigate the effect of momentum on the optimisation trajectory of gradient descent. We leverage a continuous-time approach in the analysis of momentum gradient descent with step size γ and momentum parameter β that allows u ...
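For reference, the discrete recursion behind momentum gradient descent with step size γ and momentum parameter β is the heavy-ball update x_{k+1} = x_k - γ∇f(x_k) + β(x_k - x_{k-1}). The sketch below is a generic illustration of that recursion on an assumed quadratic objective, not code from this work.

```python
import numpy as np

def heavy_ball(grad, x0, step_size=0.05, momentum=0.8, n_iters=200):
    """Gradient descent with heavy-ball momentum:
        x_{k+1} = x_k - step_size * grad(x_k) + momentum * (x_k - x_{k-1})
    """
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(n_iters):
        x_next = x - step_size * grad(x) + momentum * (x - x_prev)
        x_prev, x = x, x_next
    return x

# Usage on a simple quadratic f(x) = 0.5 * x^T A x, minimized at the origin.
A = np.diag([1.0, 10.0])
x_star = heavy_ball(lambda x: A @ x, x0=np.array([5.0, 5.0]))
print(x_star)   # approaches the minimizer at the origin
```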