**Are you an EPFL student looking for a semester project?**

Work with us on data science and visualisation projects, and deploy your project as an app on top of GraphSearch.

Person# Nicolas Henri Bernard Flammarion

Biography

Nicolas Flammarion is a tenure-track assistant professor in computer science at EPFL. Prior to that, he was a postdoctoral fellow at UC Berkeley, hosted by Michael I. Jordan. He received his PhD in 2017 from Ecole Normale Superieure in Paris, where he was advised by Alexandre d’Aspremont and Francis Bach. In 2018 he received the prize of the Fondation Mathematique Jacques Hadamard for the best PhD thesis in the field of optimization. His research focuses primarily on learning problems at the interface of machine learning, statistics and optimization.

Official source

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Related units

Loading

Courses taught by this person

Loading

Related research domains

Loading

Related publications

Loading

People doing similar research

Loading

Courses taught by this person (3)

CS-433: Machine learning

Machine learning methods are becoming increasingly central in many sciences and applications. In this course, fundamental principles and methods of machine learning will be introduced, analyzed and practically implemented.

CS-439: Optimization for machine learning

This course teaches an overview of modern optimization methods, for applications in machine learning and data science. In particular, scalability of algorithms to large datasets will be discussed in theory and in implementation.

ENG-704: EECS Seminar: Advanced Topics in Machine Learning

Students learn about advanced topics in machine learning, artificial intelligence, optimization, and data science. Students also learn to interact with scientific work, analyze and understand strengths and weaknesses of scientific arguments of both theoretical and experimental results.

Related research domains (26)

Related publications (32)

Gradient descent

In mathematics, gradient descent (also often called steepest descent) is a iterative optimization algorithm for finding a local minimum of a differentiable function. The idea is to take repeated ste

Stochastic gradient descent

Stochastic gradient descent (often abbreviated SGD) is an iterative method for optimizing an objective function with suitable smoothness properties (e.g. differentiable or subdifferentiable). It can b

Algorithm

In mathematics and computer science, an algorithm (ˈælɡərɪðəm) is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algo

Loading

Loading

Loading

People doing similar research (101)

Related units (4)

Nicolas Henri Bernard Flammarion

Controlling the parameters' norm often yields good generalisation when training neural networks. Beyond simple intuitions, the relation between parameters' norm and obtained estimators theoretically remains misunderstood. For one hidden ReLU layer networks with unidimensional data, this work shows the minimal parameters' norm required to represent a function is given by the total variation of its second derivative, weighted by a $\sqrt{1+x^2}$ factor. As a comparison, this $\sqrt{1+x^2}$ weighting disappears when the norm of the bias terms are ignored. This additional weighting is of crucial importance, since it is shown in this work to enforce uniqueness and sparsity (in number of kinks) of the minimal norm interpolator. On the other hand, omitting the bias' norm allows for non-sparse solutions. Penalising the bias terms in the regularisation, either explicitly or implicitly, thus leads to sparse estimators. This sparsity might take part in the good generalisation of neural networks that is empirically observed.

2023Nicolas Henri Bernard Flammarion, Oguz Kaan Yüksel

Due to its empirical success on few shot classification and reinforcement learning, meta-learning recently received a lot of interest. Meta-learning leverages data from previous tasks to quickly learn a new task, despite limited data. In particular, model agnostic methods look for initialisation points from which gradient descent quickly adapts to any new task. Although it has been empirically suggested that such methods learn a good shared representation during training, there is no strong theoretical evidence of such behavior. More importantly, it is unclear whether these methods truly are model agnostic, i.e., whether they still learn a shared structure despite architecture misspecifications. To fill this gap, this work shows in the limit of an infinite number of tasks that first order ANIL with a linear two-layer network architecture successfully learns a linear shared representation. Moreover, this result holds despite misspecifications: having a large width with respect to the hidden dimension of the shared representation does not harm the algorithm performance. The learnt parameters then allow to get a small test loss after a single gradient step on any new task. Overall this illustrates how well model agnostic methods can adapt to any (unknown) model structure.

2023Nicolas Henri Bernard Flammarion, Aditya Vardhan Varre

We consider stochastic approximation for the least squares regression problem in the non-strongly convex setting. We present the first practical algorithm that achieves the optimal prediction error rates in terms of dependence on the noise of the problem, as $O(d/t)$ while accelerating the forgetting of the initial conditions to $O(d/t^2)$. Our new algorithm is based on a simple modification of the accelerated gradient descent. We provide convergence results for both the averaged and the last iterate of the algorithm. In order to describe the tightness of these new bounds, we present a matching lower bound in the noiseless setting and thus show the optimality of our algorithm.

2022