Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
Clustering is a classic topic in optimization with k-means being one of the most fundamental such problems. In the absence of any restrictions on the input, the best-known algorithm for k-means in Euclidean space with a provable guarantee is a simple local search heuristic yielding an approximation guarantee of 9+epsilon, a ratio that is known to be tight with respect to such methods. We overcome this barrier by presenting a new primal-dual approach that allows us to (1) exploit the geometric structure of k-means and (2) satisfy the hard constraint that at most k clusters are selected without deteriorating the approximation guarantee. Our main result is a 6.357-approximation algorithm with respect to the standard linear programming (LP) relaxation. Our techniques are quite general, and we also show improved guarantees for k-median in Euclidean metrics and for a generalization of k-means in which the underlying metric is not required to be Euclidean.
, ,