Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of GraphSearch.
We consider the classical k-means clustering problem in the setting of bi-criteria approximation, in which an algorithm is allowed to output betak > k clusters, and must produce a clustering with cost at most alpha times the to the cost of the optimal set of k clusters. We argue that this approach is natural in many settings, for which the exact number of clusters is a priori unknown, or unimportant up to a constant factor. We give new bi-criteria approximation algorithms, based on linear programming and local search, respectively, which attain a guarantee alpha(beta) depending on the number betak of clusters that may be opened. Our guarantee alpha(beta) is always at most 9 + epsilon and improves rapidly with beta (for example: alpha(2) < 2.59, and alpha(3) < 1.4). Moreover, our algorithms have only polynomial dependence on the dimension of the input data, and so are applicable in high-dimensional settings.
Loading
Loading
Loading
polynomial-time'' means
efficient''. That algorithm is sequential and deterministic. We have also known since the 1980s that the matching problem has efficient parallel algorithms if the use of randomness is allowed. Formally, it is in the class RNC, i.e., it has randomized algorithms that use polynomially many processors and run in polylogarithmic time. However, we do not know if randomness is necessary - that is, whether the matching problem is in the class NC.
In this thesis we show that the matching problem is in quasi-NC. That is, we give a deterministic parallel algorithm that runs in O(log^3 n) time on n^{O(log^2 n)} processors. The result is obtained by a derandomization of the Isolation Lemma for perfect matchings, which was introduced in the classic paper by Mulmuley, Vazirani and Vazirani to obtain an RNC algorithm. Our proof extends the framework of Fenner, Gurjar and Thierauf, who proved the analogous result in the special case of bipartite graphs. Compared to that setting, several new ingredients are needed due to the significantly more complex structure of perfect matchings in general graphs. In particular, our proof heavily relies on the laminar structure of the faces of the perfect matching polytope.