
Publication

# Random matrix methods for high-dimensional machine learning models

## Abstract

In the rapidly evolving landscape of machine learning research, neural networks stand out with their ever-expanding number of parameters and reliance on increasingly large datasets. The financial cost and computational resources required for the training phase have sparked debates and raised concerns regarding the environmental impact of this process. As a result, it has become paramount to construct a theoretical framework that can provide deeper insight into how model performance scales with the size of the data, the number of parameters, and the number of training epochs.

This thesis is concerned with the analysis of such large machine learning models through a theoretical lens. The sheer sizes considered in these models make them suitable for the application of statistical methods in the limit of high dimensions, akin to the thermodynamic limit in statistical physics. Our approach is based on results from random matrix theory, the study of large matrices with random entries. We take a deep dive into this field and use a spectrum of tools and techniques that underpin our investigations of these models across various settings.

We begin by constructing a model starting from linear regression, then extend and build upon it to allow for a wider range of architectures, culminating in a model that closely resembles the structure of a multi-layer neural network. Using gradient-flow dynamics, we derive analytical formulas predicting the learning curves of both the training and generalization errors. The equations derived in the process reveal several phenomena emerging from the dynamics, such as double descent and specific descent structures over time.

We then take a detour to explore the dynamics of the rank-one matrix estimation problem, commonly referred to as the spiked Wigner model. This model is particularly intriguing due to the presence of a phase transition with respect to the signal-to-noise ratio, as well as challenges related to the non-convexity of the loss function and the non-linear learning equations. Finally, we address the extensive-rank matrix denoising problem, an extension of the previous model. It is of particular interest in the context of sample covariance matrix estimation, and presents further challenges stemming from the initialization and the tracking of eigenvector alignment.
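The phase transition mentioned above can be observed numerically. The sketch below (illustrative only, not code from the thesis) simulates a spiked Wigner matrix: a rank-one signal plus symmetric Gaussian noise. The well-known BBP-type result is that when the signal-to-noise ratio exceeds 1, the top eigenvalue detaches from the bulk edge at 2 and lands near `snr + 1/snr`; below the threshold it sticks to the edge. All function names and parameter choices here are my own.

```python
import numpy as np

def top_eigenvalue_spiked_wigner(n, snr, rng):
    """Largest eigenvalue of Y = (snr/n) * v v^T + W / sqrt(n),
    where W is a symmetric Gaussian (Wigner) matrix and ||v||^2 = n."""
    v = rng.standard_normal(n)
    v *= np.sqrt(n) / np.linalg.norm(v)      # normalize the spike so ||v||^2 = n
    W = rng.standard_normal((n, n))
    W = (W + W.T) / np.sqrt(2)               # symmetrize: off-diagonal variance 1
    Y = (snr / n) * np.outer(v, v) + W / np.sqrt(n)
    return np.linalg.eigvalsh(Y)[-1]

rng = np.random.default_rng(0)
n = 400
# Below the transition (snr < 1) the top eigenvalue sticks to the bulk edge 2;
# above it (snr > 1) it detaches to roughly snr + 1/snr (2.5 for snr = 2).
for snr in (0.5, 2.0):
    lam = top_eigenvalue_spiked_wigner(n, snr, rng)
    print(f"snr={snr}: top eigenvalue = {lam:.2f}")
```

At finite `n` the detachment is approximate, but already at `n = 400` the two regimes are clearly distinguishable.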


## Related concepts (38)

Machine learning

Machine learning (ML) is an umbrella term for solving problems for which developing explicit algorithms by hand would be cost-prohibitive; instead, machines 'discover' their own algorithms from data, without being explicitly programmed for the task. Recently, generative artificial neural networks have been able to surpass the results of many previous approaches.

Artificial neural network

Artificial neural networks (ANNs, also shortened to neural networks (NNs) or neural nets) are machine learning models built on principles of neuronal organization found in the biological neural networks that constitute animal brains. An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like a synapse in a biological brain, can transmit a signal to other neurons.
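The "connected units" picture above can be made concrete with a few lines of numpy. This is a minimal illustrative sketch (not any specific library's API): each artificial neuron computes a weighted sum of its inputs plus a bias, passed through a nonlinearity (here a sigmoid); stacking rows of weights gives a layer, and feeding one layer's outputs into the next gives a network.

```python
import numpy as np

def neuron(x, w, b):
    """A single artificial neuron: weighted sum of inputs plus a bias,
    passed through a sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))

def two_layer_net(x, W1, b1, w2, b2):
    """A tiny feedforward network: a hidden layer of neurons feeding one output.
    Each row of W1 holds the incoming connection weights of one hidden neuron."""
    hidden = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))
    return neuron(hidden, w2, b2)

rng = np.random.default_rng(0)
x = rng.standard_normal(3)                                    # 3 input features
W1, b1 = rng.standard_normal((4, 3)), rng.standard_normal(4)  # 4 hidden neurons
w2, b2 = rng.standard_normal(4), 0.0                          # 1 output neuron
y = two_layer_net(x, W1, b1, w2, b2)
print(f"network output: {y:.3f}")   # a value in (0, 1)
```

Training then amounts to adjusting the weights `W1, b1, w2, b2` to reduce a loss; the weights here are random placeholders.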

Random matrix

In probability theory and mathematical physics, a random matrix is a matrix-valued random variable—that is, a matrix in which some or all elements are random variables. Many important properties of physical systems can be represented mathematically as matrix problems. For example, the thermal conductivity of a lattice can be computed from the dynamical matrix of the particle-particle interactions within the lattice. In nuclear physics, random matrices were introduced by Eugene Wigner to model the nuclei of heavy atoms.
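A classical fact about Wigner's model, which a short simulation makes visible, is the semicircle law: the eigenvalues of a large symmetric random matrix with i.i.d. entries, normalized by the square root of its dimension, fill the interval [-2, 2] with density sqrt(4 - x^2) / (2*pi), regardless of the entry distribution. The sketch below (illustrative, with my own function names) checks the spectral range empirically.

```python
import numpy as np

def wigner_spectrum(n, rng):
    """Eigenvalues of a normalized symmetric Gaussian (Wigner) matrix W / sqrt(n)."""
    W = rng.standard_normal((n, n))
    W = (W + W.T) / np.sqrt(2)          # symmetrize: off-diagonal variance 1
    return np.linalg.eigvalsh(W) / np.sqrt(n)

rng = np.random.default_rng(1)
eigs = wigner_spectrum(1000, rng)
# Semicircle law: the empirical spectrum concentrates on [-2, 2].
print(f"spectral range: [{eigs.min():.2f}, {eigs.max():.2f}]")
frac_inside = np.mean(np.abs(eigs) <= 2.05)
print(f"fraction of eigenvalues in [-2.05, 2.05]: {frac_inside:.3f}")
```

Plotting a histogram of `eigs` against sqrt(4 - x^2) / (2*pi) shows the agreement directly.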

## Related MOOCs (26)

Algebra (part 1)

A French-language MOOC on linear algebra, open to everyone, taught rigorously and requiring no prerequisites.

Algebra (part 2)

A French-language MOOC on linear algebra, open to everyone, taught rigorously and requiring no prerequisites.

## Related publications (176)

A key challenge across many disciplines is to extract meaningful information from data which is often obscured by noise. These datasets are typically represented as large matrices. Given the current trend of ever-increasing data volumes, with datasets grow ...

In inverse problems, the task is to reconstruct an unknown signal from its possibly noise-corrupted measurements. Penalized-likelihood-based estimation and Bayesian estimation are two powerful statistical paradigms for the resolution of such problems. They ...

Demetri Psaltis, Mario Paolone, Christophe Moser, Luisa Lambertini

With the significant increase in photovoltaic (PV) electricity generation, more attention has been given to PV power forecasting. Indeed, accurate forecasting allows power grid operators to better schedule and dispatch their assets, such as energy storage ...