**Are you an EPFL student looking for a semester project?**

Work with us on data science and visualisation projects, and deploy your project as an app on top of GraphSearch.

Unit# Theory of calculation 4

Laboratory

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Related people

Loading

Units doing similar research

Loading

Related research domains

Loading

Related publications

Loading

Related people (10)

Related publications (24)

Loading

Loading

Loading

Related research domains (19)

Algorithm

In mathematics and computer science, an algorithm (ˈælɡərɪðəm) is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algo

Hypergraph

In mathematics, a hypergraph is a generalization of a graph in which an edge can join any number of vertices. In contrast, in an ordinary graph, an edge connects exactly two vertices.
Formally, a

Time complexity

In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the

Units doing similar research (106)

Submodular functions are a widely studied topic in theoretical computer science. They have found several applications both theoretical and practical in the fields of economics, combinatorial optimization and machine learning. More recently, there have also been numerous works that study combinatorial problems with submodular objective functions. This is motivated by their natural diminishing returns property which is useful in real-world applications. The thesis at hand is concerned with the study of streaming and matching problems with submodular functions.Firstly, motivated by developing robust algorithms, we propose a new adversarial injections model, in which the input is ordered randomly, but an adversary may inject misleading elements at arbitrary positions. We study the maximum matching problem and cardinality constrained monotone submodular maximization. We show that even under this seemingly powerful adversary, it is possible to break the barrier of 1/2 for both these problems in the streaming setting. Our main result is a novel streaming algorithm that computes a 0.55-approximation for cardinality constrained monotone submodular maximization.In the second part of the thesis, we study the problem of matroid intersection in the semi-streaming setting. Our main result is a (2 + e)-approximate semi-streaming algorithm for weighted matroid inter- section improving upon the previous best guarantee of 4 + e. While our algorithm is based on the local ratio technique, its analysis differs from the related problem of weighted maximum matching and uses the concept of matroid kernels. We are also able to generalize our results to work for submodular functions by adapting ideas from a recent result by Levin and Wajc (SODA'21) on submodular maximization subject to matching constraints.Finally, we study the submodular Santa Claus problem in the restricted assignment case. The submodular Santa Claus problem was introduced in a seminal work by Goemans, Harvey, Iwata, and Mirrokni (SODA'09) as an application of their structural result. In the mentioned problem n unsplittable resources have to be assigned to m players, each with a monotone submodular utility function fi. The goal is to maximize mini fi(Si) where S1, . . . , Sm is a partition of the resources. The result by Goemans et al. implies a polynomial time O(n1/2+e)-approximation algorithm. In the restricted assignment case, each player is given a set of desired resources Gi and the individual valuation functions are defined as fi(S)= f(SnGi). OurmainresultisaO(loglog(n))-approximation algorithm for the problem. Our proof is inspired by the approach of Bansal and Srividenko (STOC'06) to the Santa Claus problem. Com- pared to the more basic linear setting, the introduction of submodularity requires a much more involved analysis and several new ideas.

With the increasing prevalence of massive datasets, it becomes important to design algorithmic techniques for dealing with scenarios where the input to be processed does not fit in the memory of a single machine. Many highly successful approaches have emerged in recent decades, such as processing the data in a stream, parallel processing, and data compression. The aim of this thesis is to apply these techniques to various important graph theoretical problems. Our contributions can be broadly classified into two categories: spectral graph theory, and maximum matching.Spectral Graph Theory. Spectral sparsification is a technique of rendering an arbitrary graph sparse, while approximately preserving the quadratic form of the Laplacian matrix. In this thesis, we extend the result of (Kapralov et al.), and propose a sketch and corresponding decoding algorithm for constructing a spectral sparsifier from a dynamic stream of edge insertions and deletions. The size of the resulting sparsifier, the size of the sketch, and the decoding time are all nearly linear in the number of vertices, and consequently nearly optimal.The concept of spectral sparsification has recently been generalized to hypergraphs (Soma and Yoshida) -- an analogue of graphs for modeling higher order relationships. As one of the main contributions of the thesis, we prove for the first time the existence of nearly-linear sized spectral sparsifiers for arbitrary hypergraphs, and provide a corresponding nearly-linear time algorithm for constructing them. Through a lower bound construction, we show that our sparsifiers achieve nearly-optimal compression of the hypergraph spectral structure.On the more applied side of spectral graph theory, we present a fully scalable MPC (massively parallel computation) algorithm which is capable of simulating a large number of independent random walks of length L from an arbitrary starting distribution in O(log(L)) rounds.Maximum Matching. We propose a novel randomized composable coreset for the problem of maximum matching, called the matching skeleton. The coreset achieves a 1/2 approximation, while having fewer than n edges.We also propose a new, highly space-efficient variant of a peeling algorithm for maximum matching. With this, we are able to approximate the maximum matching size of a graph to within a constant factor, using a stream of m uniformly random edges (where m is the total number of edges), in as little as O(log^2(n)) space. Conversely, we show that significantly fewer (that is m^(1-Omega(1))) samples do not suffice, even with unlimited space. Finally, we design a Local Computation Algorithm, which implicitly construct a constant-approximate maximum matching in time and space that is nearly linear in the maximum degree.

As it has become easier and cheaper to collect big datasets in the last few decades, designing efficient and low-cost algorithms for these datasets has attracted unprecedented attention. However, in most applications, even storing datasets as acquired has become extremely costly and inefficient, which motivates the study of sublinear algorithms. This thesis focuses on studying two fundamental graph problems in the sublinear regime. Furthermore, it presents a fast kernel density estimation algorithm and data structure. The first part of this thesis focuses on graph spectral sparsification in dynamic streams. Our algorithm achieves almost optimal runtime and space simultaneously in a single pass. Our method is based on a novel bucketing scheme that enables us to recover high effective resistance edges faster. This contribution presents a novel approach to the effective resistance embedding of the graph, using locality-sensitive hash functions, with possible further future applications.The second part of this thesis presents spanner construction results in the dynamic streams and the simultaneous communication models. First, we show how one can construct a $\tilde{O}(n^{2/3})$-spanner using the above-mentioned almost-optimal single-pass spectral sparsifier, resulting in the first single-pass algorithm for non-trivial spanner construction in the literature. Then, we generalize this result to constructing $\tilde{O}(n^{2/3(1-\alpha)})$-spanners using $\tilde{O}(n^{1+\alpha})$ space for any $\alpha \in [0,1]$, providing a smooth trade-off between distortion and memory complexity. Moreover, we study the simultaneous communication model and propose a novel protocol with low per player information. Also, we show how one can leverage more rounds of communication in this setting to achieve better distortion guarantees. Finally, in the third part of this thesis, we study the kernel density estimation problem. In this problem, given a kernel function, an input dataset imposes a kernel density on the space. The goal is to design fast and memory-efficient data structures that can output approximations to the kernel density at queried points. This thesis presents a data structure based on the classical near neighbor search and locality-sensitive hashing techniques that improves or matches the query time and space complexity for radial kernels considered in the literature. The approach is based on an implementation of (approximate) importance sampling for each distance range and then using near neighbor search algorithms to recover points from these distance ranges. Later, we show how to improve the runtime, using recent advances in the data-dependent near neighbor search data structures, for a class of radial kernels that includes the Gaussian kernel.