Person: Navid Nouri

Related publications (4)

As collecting large datasets has become easier and cheaper over the last few decades, the design of efficient, low-cost algorithms for such datasets has attracted unprecedented attention. In most applications, however, even storing a dataset as acquired has become extremely costly and inefficient, which motivates the study of sublinear algorithms. This thesis studies two fundamental graph problems in the sublinear regime and, in addition, presents a fast kernel density estimation algorithm and data structure.

The first part of this thesis focuses on graph spectral sparsification in dynamic streams. Our algorithm achieves almost optimal runtime and space simultaneously in a single pass. The method is based on a novel bucketing scheme that enables us to recover edges with high effective resistance faster. This contribution presents a novel approach to the effective resistance embedding of the graph, using locality-sensitive hash functions, with possible further applications.

The second part of this thesis presents spanner construction results in the dynamic streams and the simultaneous communication models. First, we show how one can construct a $\tilde{O}(n^{2/3})$-spanner using the almost-optimal single-pass spectral sparsifier from the first part, resulting in the first single-pass algorithm for non-trivial spanner construction in the literature. Then, we generalize this result to constructing $\tilde{O}(n^{\frac{2}{3}(1-\alpha)})$-spanners using $\tilde{O}(n^{1+\alpha})$ space for any $\alpha \in [0,1]$, providing a smooth trade-off between distortion and memory complexity. Moreover, we study the simultaneous communication model and propose a novel protocol with low per-player information. We also show how one can leverage more rounds of communication in this setting to achieve better distortion guarantees.

Finally, in the third part of this thesis, we study the kernel density estimation problem. In this problem, given a kernel function, an input dataset imposes a kernel density on the space. The goal is to design fast and memory-efficient data structures that can output approximations to the kernel density at queried points. This thesis presents a data structure, based on classical near neighbor search and locality-sensitive hashing techniques, that improves or matches the query time and space complexity for radial kernels considered in the literature. The approach implements (approximate) importance sampling for each distance range and then uses near neighbor search algorithms to recover points from these distance ranges. We then show how to improve the runtime, using recent advances in data-dependent near neighbor search data structures, for a class of radial kernels that includes the Gaussian kernel.
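
To make the distortion/memory trade-off stated for spanners concrete, the following simply plugs three values of $\alpha$ into the two bounds quoted in the abstract. It assumes the exponent is read as $\frac{2}{3}(1-\alpha)$, which is the reading under which more space gives smaller distortion; no constants or polylogarithmic factors are implied beyond the $\tilde{O}$ notation.

$$
\begin{aligned}
\alpha = 0 &: \quad \text{distortion } \tilde{O}(n^{2/3}), && \text{space } \tilde{O}(n), \\
\alpha = \tfrac{1}{2} &: \quad \text{distortion } \tilde{O}(n^{1/3}), && \text{space } \tilde{O}(n^{3/2}), \\
\alpha = 1 &: \quad \text{distortion } \tilde{O}(1), && \text{space } \tilde{O}(n^{2}).
\end{aligned}
$$

The endpoint $\alpha = 0$ recovers the $\tilde{O}(n^{2/3})$-spanner obtained in near-linear space, while $\alpha = 1$ corresponds to space comparable to storing a dense graph explicitly.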


In this paper, we resolve the complexity of spectral graph sparsification in dynamic streams up to polylogarithmic factors. Using a linear sketch, we design a streaming algorithm that uses $\tilde{O}(n)$ space and, with high probability, recovers a spectral sparsifier from the sketch in $\tilde{O}(n)$ time. Prior results either achieved near-optimal $\tilde{O}(n)$ space but $\Omega(n^2)$ recovery time [Kapralov et al. '14], or ran in $o(n^2)$ time but used polynomially suboptimal space [Ahn et al. '13]. Our main technical contribution is a novel method for recovering graph edges with high effective resistance from a linear sketch. We show how to do so in nearly linear time by 'bucketing' vertices of the input graph into clusters using a coarse approximation to the graph's effective resistance metric. A second main contribution is a new pseudorandom generator (PRG) for linear sketching algorithms. Constructed from a locally computable randomness extractor, our PRG stretches a seed of $\tilde{O}(n)$ random bits polynomially in length with just $\log^{O(1)} n$ runtime cost per evaluation. This improves on Nisan's commonly used PRG, which in our setting would require $\tilde{O}(n)$ time per evaluation. Our faster PRG is essential to simultaneously achieving near-optimal space and time complexity.
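
As a small illustration of what "edges with high effective resistance" means, the sketch below computes effective resistances exactly on a tiny unit-weight graph. It is an offline, brute-force computation and not the streaming algorithm of the paper: it assumes the whole graph is available, that it is connected, and that exact rational arithmetic is affordable. The function name `effective_resistance` and the example graph are hypothetical choices made for this illustration.

```python
from fractions import Fraction


def effective_resistance(n, edges, u, v):
    """Effective resistance between u and v in a connected, unit-weight graph.

    Grounds vertex v, solves the reduced Laplacian system L x = e_u by
    Gauss-Jordan elimination, and returns the potential at u.
    """
    if u == v:
        return Fraction(0)
    # Build the graph Laplacian L = D - A with exact rational entries.
    lap = [[Fraction(0)] * n for _ in range(n)]
    for a, b in edges:
        lap[a][a] += 1
        lap[b][b] += 1
        lap[a][b] -= 1
        lap[b][a] -= 1
    # Drop row and column v (grounding) and append the right-hand side e_u.
    keep = [i for i in range(n) if i != v]
    rows = [[lap[i][j] for j in keep] + [Fraction(int(i == u))] for i in keep]
    m = len(rows)
    for col in range(m):
        # For a connected graph the reduced Laplacian is positive definite,
        # so a nonzero pivot always exists.
        pivot = next(r for r in range(col, m) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for r in range(m):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return rows[keep.index(u)][-1]


# Two triangles {0, 1, 2} and {3, 4, 5} joined by the bridge (2, 3).
EDGES = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
bridge = effective_resistance(6, EDGES, 2, 3)    # the only route between the clusters
triangle = effective_resistance(6, EDGES, 0, 1)  # has a parallel two-hop route
```

In this example the bridge has the larger effective resistance of the two edges, since no alternative route connects its endpoints. Edges of this kind are the ones a sparsifier cannot afford to drop.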


In this paper we revisit the kernel density estimation problem: given a kernel $K(x, y)$ and a dataset $P$ of $n$ points in high-dimensional Euclidean space, prepare a data structure that can quickly output, given a query $q$, a $(1 + \epsilon)$-approximation to $\mu := \frac{1}{|P|} \sum_{p \in P} K(p, q)$. First, we give a single data structure based on classical near neighbor search techniques that improves upon or essentially matches the query time and space complexity for all radial kernels considered in the literature so far. We then show how to improve both the query complexity and runtime by using recent advances in data-dependent near neighbor search. We achieve our results by giving a new implementation of the natural importance sampling scheme. Unlike previous approaches, our algorithm first samples the dataset uniformly (considering a geometric sequence of sampling rates), and then uses existing approximate near neighbor search techniques on the resulting smaller dataset to retrieve the sampled points that lie at an appropriate distance from the query. We show that the resulting sampled dataset has strong geometric structure, making approximate near neighbor search return the required samples much more efficiently than for worst-case datasets of the same size. As an example application, we show that this approach yields a data structure that achieves query time $\mu^{-(1+o(1))/4}$ and space complexity $\mu^{-(1+o(1))}$ for the Gaussian kernel. Our data-dependent approach achieves query time $\mu^{-0.173-o(1)}$ and space $\mu^{-(1+o(1))}$ for the Gaussian kernel. The data-dependent analysis relies on new techniques for tracking the geometric structure of the input datasets in a recursive hashing process that we hope will be of interest in other applications in near neighbor search.
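
The quantity $\mu$ and the idea of sampling points at different rates depending on their kernel value can be sketched as follows. This is a minimal illustration under stated assumptions, not the data structure from the paper: the kernel is taken to be $K(p, q) = e^{-\|p - q\|^2}$ (unit bandwidth is an assumption), every point is scanned by brute force where the paper uses near neighbor search, and the per-level sampling rates are supplied by a hypothetical caller-provided function `rate_for_level`.

```python
import math
import random


def gaussian_kernel(p, q):
    """K(p, q) = exp(-||p - q||^2); unit bandwidth is an assumption."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)))


def exact_kde(points, query):
    """Brute-force mu = (1/|P|) * sum over p in P of K(p, q)."""
    return sum(gaussian_kernel(p, query) for p in points) / len(points)


def level_of(k):
    """Index j such that the kernel value k lies in (2^-(j+1), 2^-j]."""
    return math.floor(-math.log2(k))


def level_sampled_kde(points, query, rate_for_level, rng):
    """Importance-sampling estimate of mu.

    A point at level j is kept with probability rate_for_level(j) and
    reweighted by the inverse of that probability, which keeps the
    estimate unbiased whenever every rate is positive.
    """
    total = 0.0
    for p in points:
        k = gaussian_kernel(p, query)
        if k == 0.0:  # numerical underflow: the point contributes nothing
            continue
        prob = rate_for_level(level_of(k))
        if rng.random() < prob:
            total += k / prob
    return total / len(points)


POINTS = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
QUERY = (0.0, 0.0)
mu = exact_kde(POINTS, QUERY)
# With every rate equal to 1 nothing is dropped, so the estimate is exact.
estimate = level_sampled_kde(POINTS, QUERY, lambda j: 1.0, random.Random(0))
```

Passing smaller rates for some levels reduces the number of points that contribute, at the cost of variance. Choosing those rates well, and retrieving the kept points without scanning the whole dataset, is where the analysis in the abstract comes in.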