On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent

Related publications (32)

Bayes-optimal Learning of Deep Random Networks of Extensive-width

Florent Gérard Krzakala, Lenka Zdeborová, Hugo Chao Cui

We consider the problem of learning a target function corresponding to a deep, extensive-width, non-linear neural network with random Gaussian weights. We consider the asymptotic limit where the number of samples, the input dimension and the network width ...

2023

On the symmetries in the dynamics of wide two-layer neural networks

Lénaïc Chizat

We consider the idealized setting of gradient flow on the population risk for infinitely wide two-layer ReLU neural networks (without bias), and study the effect of symmetries on the learned parameters and predictors. We first describe a general class of s ...

AMER INST MATHEMATICAL SCIENCES-AIMS, 2023