Publications related to Generalization Properties of NAS under Activation and Skip Connection Search

Deep Learning Theory Through the Lens of Diagonal Linear Networks

In this PhD manuscript, we explore optimisation phenomena which occur in complex neural networks through the lens of

2

-layer diagonal linear networks. This rudimentary architecture, which consists of a two layer feedforward linear network with a diagonal ...

EPFL2024

Robust NAS under adversarial training: benchmark, theory, and beyond

Volkan Cevher, Grigorios Chrysos, Fanghui Liu, Yongtao Wu

Recent developments in neural architecture search (NAS) emphasize the significance of considering robust architectures against malicious data. However, there is a notable absence of benchmark evaluations and theoretical guarantees for searching these robus ...

2024

Explainable Face Verification via Feature-Guided Gradient Backpropagation

Touradj Ebrahimi, Yuhang Lu, Zewei Xu

Recent years have witnessed significant advance- ment in face recognition (FR) techniques, with their applications widely spread in people’s lives and security-sensitive areas. There is a growing need for reliable interpretations of decisions of such syste ...

2024

From Kernel Methods to Neural Networks: A Unifying Variational Formulation

Michaël Unser

The minimization of a data-fidelity term and an additive regularization functional gives rise to a powerful framework for supervised learning. In this paper, we present a unifying regularization functional that depends on an operator L\documentclass[12pt]{ ...

New York2023

Deep Learning Generalization with Limited and Noisy Labels

Mahsa Forouzesh

Deep neural networks have become ubiquitous in today's technological landscape, finding their way in a vast array of applications. Deep supervised learning, which relies on large labeled datasets, has been particularly successful in areas such as image cla ...

EPFL2023

Benign Overfitting in Deep Neural Networks under Lazy Training

Volkan Cevher, Grigorios Chrysos, Fanghui Liu, Zhenyu Zhu

This paper focuses on over-parameterized deep neural networks (DNNs) with ReLU activation functions and proves that when the data distribution is well-separated, DNNs can achieve Bayesoptimal test error for classification while obtaining (nearly) zero-trai ...

2023

Hamiltonian Deep Neural Networks Guaranteeing Non-Vanishing Gradients by Design

Giancarlo Ferrari Trecate, Luca Furieri, Clara Lucía Galimberti, Liang Xu

Deep Neural Networks (DNNs) training can be difficult due to vanishing and exploding gradients during weight optimization through backpropagation. To address this problem, we propose a general class of Hamiltonian DNNs (H-DNNs) that stem from the discretiz ...

2023

Transformer Models for Vision

Jean-Baptiste Francis Marie Juliette Cordonnier

The recent developments of deep learning cover a wide variety of tasks such as image classification, text translation, playing go, and folding proteins.All these successful methods depend on a gradient-based learning algorithm to train a model on massive a ...

EPFL2023

Breaking the Curse of Dimensionality in Deep Neural Networks by Learning Invariant Representations

Leonardo Petrini

Artificial intelligence, particularly the subfield of machine learning, has seen a paradigm shift towards data-driven models that learn from and adapt to data. This has resulted in unprecedented advancements in various domains such as natural language proc ...

EPFL2023

Probabilistic methods for neural combinatorial optimization

Nikolaos Karalias

The monumental progress in the development of machine learning models has led to a plethora of applications with transformative effects in engineering and science. This has also turned the attention of the research community towards the pursuit of construc ...

EPFL2023

Generalization Properties of NAS under Activation and Skip Connection Search

Graph Chatbot

Chat with Graph Search

Deep Learning Theory Through the Lens of Diagonal Linear Networks

Robust NAS under adversarial training: benchmark, theory, and beyond

Explainable Face Verification via Feature-Guided Gradient Backpropagation

From Kernel Methods to Neural Networks: A Unifying Variational Formulation

Deep Learning Generalization with Limited and Noisy Labels

Benign Overfitting in Deep Neural Networks under Lazy Training

Hamiltonian Deep Neural Networks Guaranteeing Non-Vanishing Gradients by Design

Transformer Models for Vision

Breaking the Curse of Dimensionality in Deep Neural Networks by Learning Invariant Representations

Probabilistic methods for neural combinatorial optimization

Deep Learning Theory Through the Lens of Diagonal Linear Networks

From Kernel Methods to Neural Networks: A Unifying Variational Formulation

Transformer Models for Vision

Breaking the Curse of Dimensionality in Deep Neural Networks by Learning Invariant Representations

Probabilistic methods for neural combinatorial optimization

Deep Learning Generalization with Limited and Noisy Labels

Robust NAS under adversarial training: benchmark, theory, and beyond

Explainable Face Verification via Feature-Guided Gradient Backpropagation

Benign Overfitting in Deep Neural Networks under Lazy Training

Hamiltonian Deep Neural Networks Guaranteeing Non-Vanishing Gradients by Design