Publications related to The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Updates

Deep Learning Theory Through the Lens of Diagonal Linear Networks

In this PhD manuscript, we explore optimisation phenomena which occur in complex neural networks through the lens of 2-layer diagonal linear networks. This rudimentary architecture, which consists of a two-layer feedforward linear network with a diagonal ...

EPFL 2024

Leveraging Continuous Time to Understand Momentum When Training Diagonal Linear Networks

Nicolas Henri Bernard Flammarion, Hristo Georgiev Papazov, Scott William Pesme

In this work, we investigate the effect of momentum on the optimisation trajectory of gradient descent. We leverage a continuous-time approach in the analysis of momentum gradient descent with step size $\gamma$ and momentum parameter $\beta$ that allows u ...

2024

Deep Learning Generalization with Limited and Noisy Labels

Mahsa Forouzesh

Deep neural networks have become ubiquitous in today's technological landscape, finding their way into a vast array of applications. Deep supervised learning, which relies on large labeled datasets, has been particularly successful in areas such as image cla ...

EPFL 2023

Transformer Models for Vision

Jean-Baptiste Francis Marie Juliette Cordonnier

The recent developments of deep learning cover a wide variety of tasks such as image classification, text translation, playing Go, and folding proteins. All these successful methods depend on a gradient-based learning algorithm to train a model on massive a ...

EPFL 2023

Fundamental Limits in Statistical Learning Problems: Block Models and Neural Networks

Elisabetta Cornacchia

This thesis focuses on two selected learning problems: 1) statistical inference on graph models, and 2) gradient descent on neural networks, with the common objective of defining and analysing the measures that characterize the fundamental limits. In the ...

EPFL 2023

End-to-End Learning for Stochastic Optimization: A Bayesian Perspective

Daniel Kuhn, Yves Rychener, Tobias Sutter

We develop a principled approach to end-to-end learning in stochastic optimization. First, we show that the standard end-to-end learning algorithm admits a Bayesian interpretation and trains a posterior Bayes action map. Building on the insights of this an ...

2023

Hamiltonian Deep Neural Networks Guaranteeing Non-Vanishing Gradients by Design

Giancarlo Ferrari Trecate, Luca Furieri, Clara Lucía Galimberti, Liang Xu

Deep Neural Networks (DNNs) training can be difficult due to vanishing and exploding gradients during weight optimization through backpropagation. To address this problem, we propose a general class of Hamiltonian DNNs (H-DNNs) that stem from the discretiz ...

2023

Probabilistic methods for neural combinatorial optimization

Nikolaos Karalias

The monumental progress in the development of machine learning models has led to a plethora of applications with transformative effects in engineering and science. This has also turned the attention of the research community towards the pursuit of construc ...

EPFL 2023

Byzantine Fault-Tolerance in Federated Local SGD Under 2f-Redundancy

Nirupam Gupta

In this article, we study the problem of Byzantine fault-tolerance in a federated optimization setting, where there is a group of agents communicating with a centralized coordinator. We allow up to $f$ Byzantine-faulty agents, which may not follow a prescr ...

Piscataway 2023

Towards Verifiable, Generalizable and Efficient Robust Deep Neural Networks.

Chen Liu

In the last decade, deep neural networks have achieved tremendous success in many fields of machine learning. However, they are shown to be vulnerable to adversarial attacks: well-designed, yet imperceptible, perturbations can make the state-of-the-art deep ...

EPFL 2022
