**Êtes-vous un étudiant de l'EPFL à la recherche d'un projet de semestre?**

Travaillez avec nous sur des projets en science des données et en visualisation, et déployez votre projet sous forme d'application sur GraphSearch.

Personne# Amir Zandieh

Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.

Unités associées

Chargement

Cours enseignés par cette personne

Chargement

Domaines de recherche associés

Chargement

Publications associées

Chargement

Personnes menant des recherches similaires

Chargement

Cours enseignés par cette personne

Aucun résultat

Domaines de recherche associés (10)

Transformation de Fourier

thumb|Portrait de Joseph Fourier.
En mathématiques, plus précisément en analyse, la transformation de Fourier est une extension, pour les fonctions non périodiques, du développement en série de Fou

Algorithme

thumb|Algorithme de découpe d'un polygone quelconque en triangles (triangulation).
Un algorithme est une suite finie et non ambiguë d'instructions et d’opérations permettant de résoudre une classe de

Sample complexity

The sample complexity of a machine learning algorithm represents the number of training-samples that it needs in order to successfully learn a target function.
More precisely, the sample complexity

Personnes menant des recherches similaires (135)

, , , , , , , , ,

Unités associées (6)

Publications associées (6)

Chargement

Chargement

Chargement

This thesis focuses on developing efficient algorithmic tools for processing large datasets. In many modern data analysis tasks, the sheer volume of available datasets far outstrips our abilities to process them. This scenario commonly arises in tasks including parameter tuning of machine learning models (e.g., Google Vizier) and training neural networks. These tasks often require solving numerical linear algebraic problems on large matrices, making the classical primitives prohibitively expensive. Hence, there is a crucial need to efficiently compress the available datasets. In other settings, even collecting the input dataset is extremely expensive, making it vital to design optimal data sampling strategies. This is common in applications such as MRI acquisition and spectrum sensing.
The fundamental questions above are often dual to each other, and hence can be addressed using the same set of core techniques. Indeed, exploiting structured Fourier sparsity is a recurring source of efficiency in this thesis, leading to both fast numerical linear algebra methods and sample efficient data acquisition schemes.
One of the main results that we present in this thesis is the first Sublinear-time Model-based Sparse FFT algorithm that achieves a nearly optimal sample complexity for recovery of every signal whose Fourier transform is well approximated by a small number of blocks (e.g., such structure is common in spectrum sensing). Our method matches in sublinear time the result of Baraniuk et. al. (2010), which started the field of model-based compressed sensing. Another highlight of this thesis includes the first Dimension-independent Sparse FFT algorithm that, computes the Fourier transform of a sparse signal in sublinear runtime in any dimension. This is the first algorithm that just like the FFT of Cooley and Tukey is dimension independent and avoids the curse of dimensionality inherent to all previously known techniques. Finally, we give a Universal Sampling Scheme for the reconstruction of structured Fourier signals from continuous measurements. Our approach matches the classical results of Slepian, Pollak, and Landau (1960s) on the reconstruction of bandlimited signals via Prolate Spheroidal Wave Functions and extends these results to a wide class of Fourier structure types.
Besides having classical applications in signal processing and data analysis, Fourier techniques have been at the core of many machine learning tasks such as Kernel Matrix Approximation. The second half of this thesis is dedicated to finding compressed and low-rank representations of kernel matrices, which are the primary means of computation with large kernel matrices in machine learning. We build on Fourier techniques and achieve spectral approximation guarantees to the Gaussian kernel using an optimal number of samples, significantly improving upon the classical Random Fourier Features of Rahimi and Recht (2008). Finally, we present a nearly-optimal Oblivious Subspace Embedding for high-degree Polynomial kernels which leads to nearly-optimal embeddings of the high-dimensional Gaussian kernel. This is the first result that does not suffer from an exponential loss in the degree of the polynomial kernel or the dimension of the input point set, providing exponential improvements over the prior works, including the TensorSketch of Pagh (2013) and application of the celebrated Fast Multipole Method of Greengard and Rokhlin (1986) to the kernel approximation problem.

Mikhail Kapralov, Amir Zandieh

Kernel methods are fundamental tools in machine learning that allow detection of non-linear dependencies between data without explicitly constructing feature vectors in high dimensional spaces. A major disadvantage of kernel methods is their poor scalability: primitives such as kernel PCA or kernel ridge regression generally take prohibitively large quadratic space and (at least) quadratic time, as kernel matrices are usually dense. Some methods for speeding up kernel linear algebra are known, but they all invariably take time exponential in either the dimension of the input point set (e.g., fast multipole methods suffer from the curse of dimensionality) or in the degree of the kernel function. Oblivious sketching has emerged as a powerful approach to speeding up numerical linear algebra over the past decade, but our understanding of oblivious sketching solutions for kernel matrices has remained quite limited, suffering from the aforementioned exponential dependence on input parameters. Our main contribution is a general method for applying sketching solutions developed in numerical linear algebra over the past decade to a tensoring of data points without forming the tensoring explicitly. This leads to the first oblivious sketch for the polynomial kernel with a target dimension that is only polynomially dependent on the degree of the kernel function, as well as the first oblivious sketch for the Gaussian kernel on bounded datasets that does not suffer from an exponential dependence on the dimensionality of input data points.

Mikhail Kapralov, Navid Nouri, Amir Zandieh

Random binning features, introduced in the seminal paper of Rahimi and Recht '07, are an efficient method for approximating a kernel matrix using locality sensitive hashing. Random binning features provide a very simple and efficient way to approximate the Laplace kernel but unfortunately do not apply to many important classes of kernels, notably ones that generate smooth Gaussian processes, such as the Gaussian kernel and Matern kernel. In this paper we introduce a simple weighted version of random binning features, and show that the corresponding kernel function generates Gaussian processes of any desired smoothness. We show that our weighted random binning features provide a spectral approximation to the corresponding kernel matrix, leading to efficient algorithms for kernel ridge regression. Experiments on large scale regression datasets show that our method outperforms the accuracy of random Fourier features method.