Publications related to Towards Understanding Sharpness-Aware Minimization

Deep Learning Theory Through the Lens of Diagonal Linear Networks

In this PhD manuscript, we explore optimisation phenomena which occur in complex neural networks through the lens of 2-layer diagonal linear networks. This rudimentary architecture, which consists of a two-layer feedforward linear network with a diagonal ...

EPFL, 2024

Understanding generalization and robustness in modern deep learning

Maksym Andriushchenko

In this thesis, we study two closely related directions: robustness and generalization in modern deep learning. Deep learning models based on empirical risk minimization are known to be often non-robust to small, worst-case perturbations known as adversari ...

EPFL, 2024

Topics in statistical physics of high-dimensional machine learning

Hugo Chao Cui

In the past few years, Machine Learning (ML) techniques have ushered in a paradigm shift, allowing the harnessing of ever more abundant sources of data to automate complex tasks. The technical workhorse behind these important breakthroughs arguably lies in ...

EPFL, 2024

Generalization of Scaled Deep ResNets in the Mean-Field Regime

Volkan Cevher, Grigorios Chrysos, Fanghui Liu

Despite the widespread empirical success of ResNet, the generalization properties of deep ResNet are rarely explored beyond the lazy training regime. In this work, we investigate scaled ResNet in the limit of infinitely deep and wide neural networks, of wh ...

2024

Random matrix methods for high-dimensional machine learning models

Antoine Philippe Michel Bodin

In the rapidly evolving landscape of machine learning research, neural networks stand out with their ever-expanding number of parameters and reliance on increasingly large datasets. The financial cost and computational resources required for the training p ...

EPFL, 2024

On the number of regions of piecewise linear neural networks

Michaël Unser, Alexis Marie Frederic Goujon

Many feedforward neural networks (NNs) generate continuous and piecewise-linear (CPWL) mappings. Specifically, they partition the input domain into regions on which the mapping is affine. The number of these so-called linear regions offers a natural metric ...

2024

Residual-based attention in physics-informed neural networks

Nikolaos Stergiopoulos, Sokratis Anagnostopoulos

Driven by the need for more efficient and seamless integration of physical models and data, physics-informed neural networks (PINNs) have seen a surge of interest in recent years. However, ensuring the reliability of their convergence and accuracy remains ...

Lausanne, 2024

Optimization Algorithms for Decentralized, Distributed and Collaborative Machine Learning

Anastasiia Koloskova

Distributed learning is the key for enabling training of modern large-scale machine learning models, through parallelising the learning process. Collaborative learning is essential for learning from privacy-sensitive data that is distributed across various ...

EPFL, 2024

Deep learning approach for identification of H II regions during reionization in 21-cm observations - II. Foreground contamination

Jean-Paul Richard Kneib, Emma Elizabeth Tolley, Tianyue Chen, Michele Bianco

The upcoming Square Kilometre Array Observatory will produce images of neutral hydrogen distribution during the epoch of reionization by observing the corresponding 21-cm signal. However, the 21-cm signal will be subject to instrumental limitations such as ...

Oxford Univ Press, 2024

Error assessment of an adaptive finite elements-neural networks method for an elliptic parametric PDE

Marco Picasso, Alexandre Caboussat, Maude Girardin

We present a finite elements-neural network approach for the numerical approximation of parametric partial differential equations. The algorithm generates training data from finite element simulations, and uses a data-driven (supervised) feedforward neura ...

Lausanne, 2024
