Publication

Breaking the Curse of Dimensionality in Deep Neural Networks by Learning Invariant Representations

Leonardo Petrini
2023
EPFL Thesis
Abstract

Artificial intelligence, particularly the subfield of machine learning, has seen a paradigm shift towards data-driven models that learn from and adapt to data. This has resulted in unprecedented advancements in various domains such as natural language processing and computer vision, largely attributed to deep learning, a special class of machine learning models. Deep learning arguably surpasses traditional approaches by learning the relevant features from raw data through a series of computational layers.

This thesis explores the theoretical foundations of deep learning by studying the relationship between the architecture of these models and the inherent structures found within the data they process. In particular, we ask: What drives the efficacy of deep learning algorithms and allows them to beat the so-called curse of dimensionality, i.e. the difficulty of learning generic functions in high dimensions due to the exponentially increasing need for data points as dimensionality grows? Is it their ability to learn relevant representations of the data by exploiting their structure? How do different architectures exploit different data structures? To address these questions, we push forward the idea that the structure of the data can be effectively characterized by its invariances, i.e. aspects that are irrelevant for the task at hand.

Our methodology takes an empirical approach to deep learning, combining experimental studies with physics-inspired toy models. These simplified models allow us to investigate and interpret the complex behaviors we observe in deep learning systems, offering insights into their inner workings, with the far-reaching goal of bridging the gap between theory and practice.

Specifically, we compute tight generalization error rates of shallow fully connected networks, demonstrating that they are capable of performing well by learning linear invariances, i.e. by becoming insensitive to irrelevant linear directions in input space. However, we show that these architectures can perform poorly when learning non-linear invariances such as rotation invariance or invariance with respect to smooth deformations of the input. This result illustrates that, if a chosen architecture is not suited to a task, it may overfit, making a kernel method, for which representations are not learned, potentially a better choice.

Modern architectures like convolutional neural networks, however, are particularly well suited to learning the non-linear invariances present in real data. In image classification, for example, the exact position of an object or feature might not be crucial for recognizing it. This property gives rise to an invariance with respect to small deformations. Our findings show that neural networks that are more invariant to deformations tend to achieve higher performance, underlining the importance of exploiting such invariance.

Another key property that gives structure to real data is that high-level features are hierarchical compositions of lower-level features: a dog is made of a head and limbs, the head is made of eyes, nose, and mouth, which are in turn made of simple textures and edges. These features can be realized in multiple synonymous ways, giving rise to an invariance. To investigate the synonymic invariance that arises from the hierarchical structure of data, we introduce a toy data model that allows us to examine how features are extracted and combined to form increasingly complex representations.
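To make the curse of dimensionality concrete: without further structure, learning a generic (say, Lipschitz) function requires the training points to cover the input space, and coverage degrades exponentially with dimension. The minimal numpy sketch below (an illustration, not code from the thesis) estimates the average nearest-neighbour distance among a fixed number of uniform samples in [0, 1]^d; as d grows, even thousands of points leave every test point far from its nearest example.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_nn_distance(n_points: int, dim: int) -> float:
    """Average nearest-neighbour distance among n_points uniform samples in [0, 1]^dim."""
    x = rng.random((n_points, dim))
    sq = (x ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * x @ x.T   # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                     # exclude self-distances
    return float(np.sqrt(np.maximum(d2.min(axis=1), 0.0)).mean())

# With the sample size fixed, the nearest neighbour drifts away as dimension grows:
for dim in (1, 2, 10, 100):
    print(f"d = {dim:3d}   mean NN distance = {mean_nn_distance(2000, dim):.3f}")
```

Keeping that nearest-neighbour distance constant as d grows would require a number of samples exponential in d, which is the sense in which learning generic high-dimensional functions is cursed.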
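The notion of a linear invariance can likewise be illustrated with a small PyTorch experiment (a hedged sketch of the phenomenon, not the thesis's actual setup): the target depends on a single direction of a 20-dimensional input, and a shallow fully connected network trained on it becomes far more sensitive to that relevant direction than to an orthogonal, irrelevant one.

```python
import torch

torch.manual_seed(0)
d, n = 20, 2000
w_star = torch.zeros(d)
w_star[0] = 1.0                                   # the target depends on x[0] only
x = torch.randn(n, d)
y = torch.tanh(x @ w_star).unsqueeze(1)           # y = tanh(x_1): one relevant direction

net = torch.nn.Sequential(torch.nn.Linear(d, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    loss = ((net(x) - y) ** 2).mean()
    loss.backward()
    opt.step()

# Finite-difference sensitivity along the relevant vs. an irrelevant direction:
x0 = torch.randn(256, d)
eps = 1e-2
for name, v in [("relevant", w_star), ("irrelevant", torch.eye(d)[1])]:
    delta = (net(x0 + eps * v) - net(x0)).abs().mean() / eps
    print(name, float(delta))
```

After training, the sensitivity along the irrelevant direction is close to zero: the learned representation has become insensitive to that linear direction in input space.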
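Deformation invariance can also be probed empirically. The sketch below is loosely inspired by the relative-stability idea the abstract alludes to: it compares a network's output change under a small smooth deformation of the input image to its change under Gaussian noise of matched norm, so a ratio below one would indicate relatively more stability to deformations. The toy CNN is untrained and the deformation generator is our own simplification, purely for illustration; in practice one would measure this on a trained network.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

def smooth_deform(img: torch.Tensor, amplitude: float = 2.0) -> torch.Tensor:
    """Warp img (B, C, H, W) with a smooth random displacement field (in pixels)."""
    B, C, H, W = img.shape
    coarse = torch.randn(B, 2, 4, 4)                 # low-frequency displacement field
    disp = F.interpolate(coarse, size=(H, W), mode="bicubic", align_corners=False)
    disp = amplitude * disp / disp.abs().amax()
    # Sampling grid in normalized [-1, 1] coordinates expected by grid_sample.
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(B, H, W, 2).clone()
    grid[..., 0] += 2 * disp[:, 0] / (W - 1)
    grid[..., 1] += 2 * disp[:, 1] / (H - 1)
    return F.grid_sample(img, grid, align_corners=False)

net = torch.nn.Sequential(                           # untrained toy CNN, for the sketch only
    torch.nn.Conv2d(1, 8, 3, padding=1), torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(), torch.nn.Linear(8, 10))

x = torch.rand(16, 1, 32, 32)
x_def = smooth_deform(x)
noise = torch.randn_like(x)
# Rescale the noise so both perturbations have the same average norm.
noise = noise * (x_def - x).flatten(1).norm(dim=1).mean() / noise.flatten(1).norm(dim=1).mean()

sens_def = (net(x_def) - net(x)).norm() / (x_def - x).norm()
sens_noise = (net(x + noise) - net(x)).norm() / noise.norm()
print("relative stability =", float(sens_def / sens_noise))  # < 1 would mean more stable to deformations
```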
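Finally, the synonymic invariance of hierarchically compositional data can be made concrete with a small generative sketch, written in the spirit of the hierarchical toy model the abstract mentions; the parameters v, m, s, L and the exact construction here are illustrative assumptions, not the thesis's definition. Each symbol at one level expands into one of m synonymous tuples of lower-level symbols, so the same class label admits many equivalent input realizations.

```python
import random

random.seed(0)

# Vocabulary size v, number of synonyms m, branching factor s, depth L (all illustrative).
v, m, s, L = 8, 2, 2, 3

# For each level and each symbol, draw m synonymous productions of s lower-level symbols.
rules = {
    level: {sym: [tuple(random.randrange(v) for _ in range(s)) for _ in range(m)]
            for sym in range(v)}
    for level in range(L)
}

def sample(symbol: int, level: int) -> list:
    """Expand `symbol` down to the input level, picking a random synonym at each step."""
    if level == L:
        return [symbol]
    production = random.choice(rules[level][symbol])   # synonymic choice -> invariance
    return [leaf for child in production for leaf in sample(child, level + 1)]

label = 0
print(sample(label, 0))   # one realization of class 0 (s**L low-level symbols)
print(sample(label, 0))   # a synonymous realization of the same class
```

A learner that recovers the hierarchy can collapse synonyms level by level, which is one way deep architectures may evade the curse of dimensionality on such data.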

Related concepts (34)
Deep learning
Deep learning (also deep structured learning or hierarchical learning) is a subfield of artificial intelligence that uses neural networks to solve complex tasks through architectures composed of multiple non-linear transformations. These techniques have enabled significant and rapid progress in the analysis of audio and visual signals, notably facial recognition, speech recognition, computer vision, and automated language processing.
Artificial neural network
An artificial neural network is a system whose design was originally loosely inspired by the functioning of biological neurons and which subsequently moved closer to statistical methods. Neural networks are generally optimized with probabilistic learning methods, in particular Bayesian ones.
Machine learning
Machine learning (also statistical learning) is a field of study in artificial intelligence grounded in mathematical and statistical approaches that give computers the ability to "learn" from data, i.e. to improve their performance at solving tasks without being explicitly programmed for each one. More broadly, it concerns the design, analysis, optimization, development, and implementation of such methods.
Related publications (574)

Infusing structured knowledge priors in neural models for sample-efficient symbolic reasoning

Mattia Atzeni

The ability to reason, plan and solve highly abstract problems is a hallmark of human intelligence. Recent advancements in artificial intelligence, propelled by deep neural networks, have revolutionized disciplines like computer vision and natural language ...
EPFL, 2024

Topics in statistical physics of high-dimensional machine learning

Hugo Chao Cui

In the past few years, Machine Learning (ML) techniques have ushered in a paradigm shift, allowing the harnessing of ever more abundant sources of data to automate complex tasks. The technical workhorse behind these important breakthroughs arguably lies in ...
EPFL, 2024

Deep learning approach for identification of H II regions during reionization in 21-cm observations - II. Foreground contamination

Jean-Paul Richard Kneib, Emma Elizabeth Tolley, Tianyue Chen, Michele Bianco

The upcoming Square Kilometre Array Observatory will produce images of neutral hydrogen distribution during the epoch of reionization by observing the corresponding 21-cm signal. However, the 21-cm signal will be subject to instrumental limitations such as ...
Oxford Univ Press, 2024
Related MOOCs (32)
Neuronal Dynamics - Computational Neuroscience of Single Neurons
The activity of neurons in the brain and the code used by these neurons is described by mathematical neuron models at different levels of detail.
Neuronal Dynamics 2- Computational Neuroscience: Neuronal Dynamics of Cognition
This course explains the mathematical and computational models that are used in the field of theoretical neuroscience to analyze the collective dynamics of thousands of interacting neurons.