Berfin Simsek | EPFL Graph Search

A Theory of Finite-Width Neural Networks: Generalization, Scaling Laws, and the Loss Landscape

Deep learning has achieved remarkable success in various challenging tasks such as generating images from natural language or engaging in lengthy conversations with humans. The success in practice stems from the ability to successfully train massive neural ...

EPFL, 2023

Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetries and Invariances

Wulfram Gerstner, Clément Hongler, Johanni Michael Brea, Francesco Spadaro, Berfin Simsek, Arthur Jacot

We study how permutation symmetries in overparameterized multi-layer neural networks generate 'symmetry-induced' critical points. Assuming a network with $L$ layers of minimal widths $r_1^*, \ldots, r_{L-1}^*$ reaches a zero-loss minimum at $ r_1^*! \c ...

2021

Implicit Regularization of Random Feature Models

Clément Hongler, Francesco Spadaro, Franck Raymond Gabriel, Berfin Simsek, Arthur Ulysse Jacot-Guillarmod

Random Feature (RF) models are used as efficient parametric approximations of kernel methods. We investigate, by means of random matrix theory, the connection between Gaussian RF models and Kernel Ridge Regression (KRR). For a Gaussian RF model with P feat ...

Journal of Machine Learning Research (JMLR), 2020