Related publications (81)

On the Generalization of Stochastic Gradient Descent with Momentum

Volkan Cevher, Kimon Antonakopoulos

While momentum-based accelerated variants of stochastic gradient descent (SGD) are widely used when training machine learning models, there is little theoretical understanding on the generalization error of such methods. In this work, we first show that th ...
Brookline, 2024

Efficient local linearity regularization to overcome catastrophic overfitting

Volkan Cevher, Grigorios Chrysos, Fanghui Liu, Elias Abad Rocamora

Catastrophic overfitting (CO) in single-step adversarial training (AT) results in abrupt drops in the adversarial test accuracy (even down to 0%). For models trained with multi-step AT, it has been observed that the loss function behaves locally linearly w ...
2024

Gradient boosting with extreme-value theory for wildfire prediction

Jonathan Koh Boon Han

This paper details the approach of the team Kohrrelation in the 2021 Extreme Value Analysis data challenge, dealing with the prediction of wildfire counts and sizes over the contiguous US. Our approach uses ideas from extreme-value theory in a machine lear ...
2023

Bayes-optimal Learning of Deep Random Networks of Extensive-width

Florent Gérard Krzakala, Lenka Zdeborová, Hugo Chao Cui

We consider the problem of learning a target function corresponding to a deep, extensive-width, non-linear neural network with random Gaussian weights. We consider the asymptotic limit where the number of samples, the input dimension and the network width ...
2023

A Theory of Finite-Width Neural Networks: Generalization, Scaling Laws, and the Loss Landscape

Berfin Simsek

Deep learning has achieved remarkable success in various challenging tasks such as generating images from natural language or engaging in lengthy conversations with humans. The success in practice stems from the ability to successfully train massive neural ...
EPFL, 2023

Regularization Techniques for Low-Resource Machine Translation

Alejandro Ramírez Atrio

Neural machine translation (MT) and text generation have recently reached very high levels of quality. However, both areas share a problem: in order to reach these levels, they require massive amounts of data. When this is not present, they lack generaliza ...
EPFL, 2023
