
# Armin Eftekhari



Courses taught by this person

No results

Related research domains (5)

Principal component analysis

Principal component analysis (PCA) is a popular technique for analyzing large datasets containing a high number of dimensions/features per observation, increasing the interpretability of the data while preserving the maximum amount of information.
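
As a rough sketch of the idea (purely illustrative, not tied to any work listed on this page), PCA can be computed from the singular value decomposition of the centred data matrix; all names and parameters below are hypothetical:

```python
import numpy as np

def pca(X, n_components=2):
    """Project X (n_samples x n_features) onto its top principal components."""
    # Center each feature at zero mean.
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data; the right singular vectors are the principal axes.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]          # principal directions
    scores = X_centered @ components.T      # low-dimensional coordinates
    explained_variance = (S[:n_components] ** 2) / (len(X) - 1)
    return scores, components, explained_variance

# Toy usage: 200 samples in 10 dimensions, reduced to 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
scores, components, var = pca(X, n_components=2)
print(scores.shape)  # (200, 2)
```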

Data

In common usage and statistics, data (US: /ˈdætə/; UK: /ˈdeɪtə/) is a collection of discrete or continuous values that convey information, describing quantity, quality, facts, statistics, or other basic units of meaning.

Dimensionality reduction

Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data.
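
PCA above is one instance; another simple illustration of dimensionality reduction is a random linear projection in the spirit of the Johnson–Lindenstrauss lemma. The sketch below is a generic example, not a method from the publications listed here:

```python
import numpy as np

def random_projection(X, target_dim, seed=0):
    """Map X (n_samples x n_features) to target_dim dimensions with a random Gaussian matrix."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Scaling by 1/sqrt(target_dim) approximately preserves pairwise distances.
    R = rng.normal(size=(n_features, target_dim)) / np.sqrt(target_dim)
    return X @ R

X = np.random.default_rng(1).normal(size=(500, 1000))
X_low = random_projection(X, target_dim=50)
print(X_low.shape)  # (500, 50)
```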

Related publications (10)

Volkan Cevher, Armin Eftekhari, Thomas Michaelsen Pethick, Chaehwan Song (2021)

Overparameterization refers to the important phenomenon where the width of a neural network is chosen such that learning algorithms can provably attain zero loss in nonconvex training. The existing theory establishes such global convergence using various initialization strategies, training modifications, and width scalings. In particular, the state-of-the-art results require the width to scale quadratically with the number of training data under standard initialization strategies used in practice for best generalization performance. In contrast, the most recent results obtain linear scaling either by requiring initializations that lead to "lazy training" or by training only a single layer. In this work, we provide an analytical framework that allows us to adopt standard initialization strategies, possibly avoid lazy training, and train all layers simultaneously in basic shallow neural networks while attaining a desirable subquadratic scaling on the network width. We achieve the desiderata via the Polyak–Łojasiewicz condition, smoothness, and standard assumptions on the data, and use tools from random matrix theory.
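
The result itself is theoretical; as a loose, hypothetical illustration of the overparameterization phenomenon (not the paper's framework, initialization, or width scaling), one can train a small two-layer ReLU network whose width far exceeds the number of training points and watch the training loss shrink:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, width = 30, 5, 2000            # width far exceeds the number of samples (illustrative choice)
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm inputs
y = rng.normal(size=(n, 1))

# Gaussian initialization; both layers are trained.
W = rng.normal(size=(d, width))
v = rng.normal(size=(width, 1))
scale = 1.0 / np.sqrt(width)         # keeps the output O(1) at initialization

lr = 1.0
for step in range(3000):
    H = np.maximum(X @ W, 0.0)       # ReLU features, shape (n, width)
    pred = scale * H @ v
    err = pred - y
    loss = 0.5 * np.mean(err ** 2)
    # Gradients of the squared loss through both layers.
    grad_v = scale * H.T @ err / n
    grad_H = scale * (err @ v.T) * (H > 0)
    grad_W = X.T @ grad_H / n
    v -= lr * grad_v
    W -= lr * grad_W

print(f"final training loss: {loss:.2e}")   # should be far below the initial loss for large width
```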


Volkan Cevher, Armin Eftekhari, Ali Kavis, Paul Thierry Yves Rolland (2020)

A well-known first-order method for sampling from log-concave probability distributions is the Unadjusted Langevin Algorithm (ULA). This work proposes a new annealing step-size schedule for ULA, which allows us to prove new convergence guarantees for sampling from a smooth log-concave distribution that are not covered by the existing state-of-the-art convergence guarantees. To establish this result, we derive a new theoretical bound that relates the Wasserstein distance to the total variation distance between any two log-concave distributions, complementing the reach of Talagrand's $T_2$ inequality. Moreover, applying this new step-size schedule to an existing constrained sampling algorithm, we show state-of-the-art convergence rates for sampling from a constrained log-concave distribution, as well as improved dimension dependence.
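
For context, the baseline method mentioned here, the Unadjusted Langevin Algorithm, iterates a gradient step on the potential (negative log-density) plus injected Gaussian noise: $x_{k+1} = x_k - \gamma_k \nabla f(x_k) + \sqrt{2\gamma_k}\,\xi_k$. The sketch below uses a generic decaying step-size schedule on a toy Gaussian target; it is not the annealing schedule proposed in this work:

```python
import numpy as np

def ula_sample(grad_potential, x0, step_sizes, seed=0):
    """ULA: x_{k+1} = x_k - gamma_k * grad f(x_k) + sqrt(2 * gamma_k) * standard Gaussian noise."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    trajectory = []
    for gamma in step_sizes:
        noise = rng.normal(size=x.shape)
        x = x - gamma * grad_potential(x) + np.sqrt(2.0 * gamma) * noise
        trajectory.append(x.copy())
    return np.array(trajectory)

# Toy target: standard Gaussian, i.e. potential f(x) = ||x||^2 / 2, so grad f(x) = x.
grad_f = lambda x: x
# A generic decaying step-size schedule (placeholder, not the paper's annealing schedule).
steps = 0.5 / np.sqrt(1.0 + np.arange(5000))
samples = ula_sample(grad_f, x0=np.zeros(2), step_sizes=steps)
print(samples[-1000:].mean(axis=0), samples[-1000:].std(axis=0))  # roughly zero mean, unit std
```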

We consider the problem of non-negative super-resolution, which concerns reconstructing a non-negative signal $x = \sum_{i=1}^{k} a_i \delta_{t_i}$ from $m$ samples of its convolution with a window function $\phi(s - t)$, of the form $y(s_j) = \sum_{i=1}^{k} a_i \phi(s_j - t_i) + \delta_j$, where $\delta_j$ indicates an inexactness in the sample value. We first show that $x$ is the unique non-negative measure consistent with the samples, provided the samples are exact. Moreover, we characterise non-negative solutions $\hat{x}$ consistent with the samples within the bound $\sum_{j=1}^{m} \delta_j^2$
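
To make the sampling model concrete, here is a tiny synthetic example assuming a Gaussian window $\phi$ and equispaced sample locations (all choices are illustrative and not taken from the paper); the grid-based non-negative least-squares fit at the end is just one naive way to search for a non-negative solution consistent with the samples:

```python
import numpy as np
from scipy.optimize import nnls

# Non-negative spike signal x = sum_i a_i * delta_{t_i} (amplitudes a_i >= 0).
a = np.array([1.0, 0.5, 2.0])
t = np.array([0.20, 0.45, 0.80])

# Hypothetical window function phi: a narrow Gaussian, chosen only for illustration.
phi = lambda u: np.exp(-u ** 2 / (2 * 0.05 ** 2))

# Noisy samples y(s_j) = sum_i a_i * phi(s_j - t_i) + delta_j at equispaced locations s_j.
s = np.linspace(0.0, 1.0, 30)
rng = np.random.default_rng(0)
delta = 1e-3 * rng.normal(size=s.shape)
y = phi(s[:, None] - t[None, :]) @ a + delta

# Naive recovery: non-negative least squares over a fine grid of candidate locations.
grid = np.linspace(0.0, 1.0, 200)
A = phi(s[:, None] - grid[None, :])     # (num samples) x (grid size) dictionary
x_hat, residual = nnls(A, y)
print("largest recovered amplitudes near:", grid[np.argsort(x_hat)[-3:]])
```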