Publication

Stochasticity helps to navigate rough landscapes: comparing gradient-descent-based algorithms in the phase retrieval problem

Related publications (32)

Leveraging Continuous Time to Understand Momentum When Training Diagonal Linear Networks

Nicolas Henri Bernard Flammarion, Hristo Georgiev Papazov, Scott William Pesme

In this work, we investigate the effect of momentum on the optimisation trajectory of gradient descent. We leverage a continuous-time approach in the analysis of momentum gradient descent with step size γ and momentum parameter β that allows u ...
2024
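The abstract above describes momentum (heavy-ball) gradient descent with step size γ and momentum parameter β. As a minimal illustrative sketch of that update rule, applied here to a toy ill-conditioned quadratic (the objective, step size, and iteration count are illustrative choices, not taken from the paper):

```python
import numpy as np

def momentum_gd(grad, x0, gamma=0.1, beta=0.9, steps=200):
    """Heavy-ball gradient descent: v mixes past velocity with a fresh gradient."""
    x, v = x0.copy(), np.zeros_like(x0)
    for _ in range(steps):
        v = beta * v - gamma * grad(x)  # momentum update
        x = x + v                        # parameter update
    return x

# Toy objective f(x) = x^T A x / 2 with an ill-conditioned A.
A = np.diag([1.0, 10.0])
x_star = momentum_gd(lambda x: A @ x, np.array([5.0, 5.0]))
# x_star converges toward the minimiser at the origin.
```

The continuous-time view studied in the paper arises by letting the step size shrink, so the discrete recursion above approaches an ordinary differential equation.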

Phase Retrieval: From Computational Imaging to Machine Learning: A tutorial

Michaël Unser, Thanh-An Michel Pham, Jonathan Yuelin Dong

Phase retrieval consists in the recovery of a complex-valued signal from intensity-only measurements. As it pervades a broad variety of applications, many researchers have striven to develop phase-retrieval algorithms. Classical approaches involve techniqu ...
2023
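The tutorial above concerns recovering a signal from intensity-only measurements. A minimal gradient-descent sketch of the real-valued version of the problem, using the standard quartic intensity loss (this is a textbook formulation for illustration, not the specific algorithm of any listed paper; the warm start near the true signal is an assumption, since plain gradient descent from a random start can stall in this non-convex landscape):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 10, 80
x_true = rng.normal(size=n)
A = rng.normal(size=(m, n))
y = (A @ x_true) ** 2            # intensity-only measurements y_i = (a_i^T x)^2

def grad(x):
    """Gradient of the quartic loss (1/4m) * sum_i ((a_i^T x)^2 - y_i)^2."""
    z = A @ x
    return (A.T @ ((z ** 2 - y) * z)) / m

x = x_true + 0.1 * rng.normal(size=n)  # assumed warm start (e.g. spectral init)
for _ in range(3000):
    x = x - 0.005 * grad(x)
# The sign of x is unrecoverable: both x_true and -x_true fit the intensities.
```

Note the global sign ambiguity, so recovery is measured up to sign; the complex-valued setting discussed in the tutorial has an analogous global-phase ambiguity.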

Phase Reconstruction of Low-Energy Electron Holograms of Individual Proteins

Klaus Kern, Stephan Rauschenbach, Sven Alexander Szilagyi, Hannah Julia Ochner

Low-energy electron holography (LEEH) is one of the few techniques capable of imaging large and complex three-dimensional molecules, such as proteins, on the single molecule level at subnanometer resolution. During the imaging process, the structural infor ...
AMER CHEMICAL SOC, 2022

Masked Training of Neural Networks with Partial Gradients

Martin Jaggi, Sebastian Urban Stich, Amirkeivan Mohtashami

State-of-the-art training algorithms for deep learning models are based on stochastic gradient descent (SGD). Recently, many variations have been explored: perturbing parameters for better accuracy (such as in Extra-gradient), limiting SGD updates to a sub ...
JMLR-JOURNAL MACHINE LEARNING RESEARCH, 2022

Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks

Florent Gérard Krzakala, Lenka Zdeborová, Ludovic Théo Stephan, Bruno Loureiro

Despite the non-convex optimization landscape, over-parametrized shallow networks are able to achieve global convergence under gradient descent. The picture can be radically different for narrow networks, which tend to get stuck in badly-generalizing loca ...
2022

Complexity analysis of stochastic gradient methods for PDE-constrained optimal control problems with uncertain parameters

Fabio Nobile, Sebastian Krumscheid, Matthieu Claude Martin

We consider the numerical approximation of an optimal control problem for an elliptic Partial Differential Equation (PDE) with random coefficients. Specifically, the control function is a deterministic, distributed forcing term that minimizes the expected ...
2021

Implicit Bias of SGD for Diagonal Linear Networks: a Provable Benefit of Stochasticity

Nicolas Henri Bernard Flammarion, Scott William Pesme, Loucas Pillaud-Vivien

Understanding the implicit bias of training algorithms is of crucial importance in order to explain the success of overparametrised neural networks. In this paper, we study the dynamics of stochastic gradient descent over diagonal linear networks through i ...
2021

A Continuized View on Nesterov Acceleration

Nicolas Henri Bernard Flammarion, Raphaël Jean Berthier

We introduce the "continuized" Nesterov acceleration, a close variant of Nesterov acceleration whose variables are indexed by a continuous time parameter. The two variables continuously mix following a linear ordinary differential equation and take gradien ...
2021

Large Steps in Inverse Rendering of Geometry

Wenzel Alban Jakob, Baptiste Alexandre Marie Philippe Claude Nicolet

Inverse reconstruction from images is a central problem in many scientific and engineering disciplines. Recent progress on differentiable rendering has led to methods that can efficiently differentiate the full process of image formation with respect to mi ...
ASSOC COMPUTING MACHINERY, 2021
