Publications related to Regularization Techniques for Low-Resource Machine Translation

Efficient local linearity regularization to overcome catastrophic overfitting

Volkan Cevher, Grigorios Chrysos, Fanghui Liu, Elias Abad Rocamora

Catastrophic overfitting (CO) in single-step adversarial training (AT) results in abrupt drops in the adversarial test accuracy (even down to 0%). For models trained with multi-step AT, it has been observed that the loss function behaves locally linearly w ...

2024

Understanding generalization and robustness in modern deep learning

Maksym Andriushchenko

In this thesis, we study two closely related directions: robustness and generalization in modern deep learning. Deep learning models based on empirical risk minimization are known to be often non-robust to small, worst-case perturbations known as adversari ...

EPFL, 2024

On the Generalization of Stochastic Gradient Descent with Momentum

Volkan Cevher, Kimon Antonakopoulos

While momentum-based accelerated variants of stochastic gradient descent (SGD) are widely used when training machine learning models, there is little theoretical understanding on the generalization error of such methods. In this work, we first show that th ...

Brookline, 2024

Statistical Inference for Inverse Problems: From Sparsity-Based Methods to Neural Networks

Pakshal Narendra Bohra

In inverse problems, the task is to reconstruct an unknown signal from its possibly noise-corrupted measurements. Penalized-likelihood-based estimation and Bayesian estimation are two powerful statistical paradigms for the resolution of such problems. They ...

EPFL, 2024

Leveraging Continuous Time to Understand Momentum When Training Diagonal Linear Networks

Nicolas Henri Bernard Flammarion, Hristo Georgiev Papazov, Scott William Pesme

In this work, we investigate the effect of momentum on the optimisation trajectory of gradient descent. We leverage a continuous-time approach in the analysis of momentum gradient descent with step size $\gamma$ and momentum parameter $\beta$ that allows u ...

2024

Deep Learning Theory Through the Lens of Diagonal Linear Networks

Scott William Pesme

In this PhD manuscript, we explore optimisation phenomena which occur in complex neural networks through the lens of 2-layer diagonal linear networks. This rudimentary architecture, which consists of a two layer feedforward linear network with a diagonal ...

EPFL, 2024

Topics in statistical physics of high-dimensional machine learning

Hugo Chao Cui

In the past few years, Machine Learning (ML) techniques have ushered in a paradigm shift, allowing the harnessing of ever more abundant sources of data to automate complex tasks. The technical workhorse behind these important breakthroughs arguably lies in ...

EPFL, 2024

High-Dimensional Kernel Methods under Covariate Shift: Data-Dependent Implicit Regularization

Volkan Cevher, Fanghui Liu

This paper studies kernel ridge regression in high dimensions under covariate shifts and analyzes the role of importance re-weighting. We first derive the asymptotic expansion of high dimensional kernels under covariate shifts. By a bias-variance decomposi ...

2024

Towards Trustworthy Deep Learning for Image Reconstruction

Alexis Marie Frederic Goujon

The remarkable ability of deep learning (DL) models to approximate high-dimensional functions from samples has sparked a revolution across numerous scientific and industrial domains that cannot be overemphasized. In sensitive applications, the good perform ...

EPFL, 2024

Robust machine learning for neuroscientific inference

Steffen Schneider

Modern neuroscience research is generating increasingly large datasets, from recording thousands of neurons over long timescales to behavioral recordings of animals spanning weeks, months, or even years. Despite a great variety in recording setups and expe ...

EPFL, 2024
