**Êtes-vous un étudiant de l'EPFL à la recherche d'un projet de semestre?**

Travaillez avec nous sur des projets en science des données et en visualisation, et déployez votre projet sous forme d'application sur GraphSearch.

Personne# Navid Nouri

Official source

Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.

Unités associées

Chargement

Cours enseignés par cette personne

Chargement

Domaines de recherche associés

Chargement

Publications associées

Chargement

Personnes menant des recherches similaires

Chargement

Publications associées (4)

Cours enseignés par cette personne

Domaines de recherche associés

Aucun résultat

Aucun résultat

Chargement

Chargement

Chargement

Personnes menant des recherches similaires

Unités associées (2)

Aucun résultat

As it has become easier and cheaper to collect big datasets in the last few decades, designing efficient and low-cost algorithms for these datasets has attracted unprecedented attention. However, in most applications, even storing datasets as acquired has become extremely costly and inefficient, which motivates the study of sublinear algorithms. This thesis focuses on studying two fundamental graph problems in the sublinear regime. Furthermore, it presents a fast kernel density estimation algorithm and data structure. The first part of this thesis focuses on graph spectral sparsification in dynamic streams. Our algorithm achieves almost optimal runtime and space simultaneously in a single pass. Our method is based on a novel bucketing scheme that enables us to recover high effective resistance edges faster. This contribution presents a novel approach to the effective resistance embedding of the graph, using locality-sensitive hash functions, with possible further future applications.The second part of this thesis presents spanner construction results in the dynamic streams and the simultaneous communication models. First, we show how one can construct a $\tilde{O}(n^{2/3})$-spanner using the above-mentioned almost-optimal single-pass spectral sparsifier, resulting in the first single-pass algorithm for non-trivial spanner construction in the literature. Then, we generalize this result to constructing $\tilde{O}(n^{2/3(1-\alpha)})$-spanners using $\tilde{O}(n^{1+\alpha})$ space for any $\alpha \in [0,1]$, providing a smooth trade-off between distortion and memory complexity. Moreover, we study the simultaneous communication model and propose a novel protocol with low per player information. Also, we show how one can leverage more rounds of communication in this setting to achieve better distortion guarantees. Finally, in the third part of this thesis, we study the kernel density estimation problem. In this problem, given a kernel function, an input dataset imposes a kernel density on the space. The goal is to design fast and memory-efficient data structures that can output approximations to the kernel density at queried points. This thesis presents a data structure based on the classical near neighbor search and locality-sensitive hashing techniques that improves or matches the query time and space complexity for radial kernels considered in the literature. The approach is based on an implementation of (approximate) importance sampling for each distance range and then using near neighbor search algorithms to recover points from these distance ranges. Later, we show how to improve the runtime, using recent advances in the data-dependent near neighbor search data structures, for a class of radial kernels that includes the Gaussian kernel.

Mikhail Kapralov, Aidasadat Mousavifar, Navid Nouri, Jakab Tardos

In this paper, we resolve the complexity problem of spectral graph sparcification in dynamic streams up to polylogarithmic factors. Using a linear sketch we design a streaming algorithm that uses (O) over tilde (n) space, and with high probability, recovers a spectral sparsifier from the sketch in (O) over tilde (n) time.(1) Prior results either achieved near optimal (O) over tilde (n) space, but Omega(n(2)) recovery time [Kapralov et al. '14], or ran in o(n(2)) time, but used polynomially suboptimal space [Ahn et al '13]. Our main technical contribution is a novel method for recovering graph edges with high effective resistance from a linear sketch. We show how to do so in nearly linear time by 'bucketing' vertices of the input graph into clusters using a coarse approximation to the graph's effective resistance metric. A second main contribution is a new pseudorandom generator (PRG) for linear sketching algorithms. Constructed from a locally computable randomness extractor, our PRG stretches a seed of (O) over tilde (n) random bits polynomially in length with just log(O(1)) n runtime cost per evaluation. This improves on Nisan's commonly used PRG, which in our setting would require (O) over tilde (n) time per evaluation. Our faster PRG is essential to simultaneously achieving near optimal space and time complexity.

In this paper we revisit the kernel density estimation problem: given a kernel K(x, y) and a dataset of n points in high dimensional Euclidean space, prepare a data structure that can quickly output, given a query q, a (1 + epsilon)-approximation to mu := 1/vertical bar P vertical bar Sigma p is an element of P K(p, q). First, we give a single data structure based on classical near neighbor search techniques that improves upon or essentially matches the query time and space complexity for all radial kernels considered in the literature so far. We then show how to improve both the query complexity and runtime by using recent advances in data-dependent near neighbor search. We achieve our results by giving an new implementation of the natural importance sampling scheme. Unlike previous approaches, our algorithm first samples the dataset uniformly (considering a geometric sequence of sampling rates), and then uses existing approximate near neighbor search techniques on the resulting smaller dataset to retrieve the sampled points that lie at an appropriate distance from the query. We show that the resulting sampled dataset has strong geometric structure, making approximate near neighbor search return the required samples much more efficiently than for worst case datasets of the same size. As an example application, we show that this approach yields a data structure that achieves query time mu-((1+o(1))/4) and space complexity mu-((1+o(1))) for the Gaussian kernel. Our data dependent approach achieves query time mu(-0.173-o(1)) and space mu-(1+o(1)) for the Gaussian kernel. The data dependent analysis relies on new techniques for tracking the geometric structure of the input datasets in a recursive hashing process that we hope will be of interest in other applications in near neighbor search.