Improving K-means Clustering Using Speculation

K-means is one of the fundamental unsupervised data clustering and machine learning methods. It has been well studied over the years: parallelized, approximated, and optimized for different cases and applications. With increasingly higher parallelism leading to processing becoming bandwidth-bound, further speeding up iterative k-means would require data reduction using sampling at the cost of accuracy. We examine the use of speculation techniques to expedite the convergence of the iterative k-means algorithm without affecting the accuracy of the results. We achieve this by introducing two cooperative and concurrent phases: one works on the overall input data, and the other speculates and explores the space faster using sampling. At the end of every iteration, the two phases vote and choose the best centroids according to the objective function. Our speculative technique reduces the number of steps needed for convergence without compromising accuracy and, at the same time, provides an effective mechanism to escape local minima without prior initialization cost through resampling.

Improving K-means Clustering Using Speculation

Graph Chatbot

Chattez avec Graph Search

Fly-Swat or Cannon? Cost-Effective Language Model Choice via Meta-Modeling

Interpret3C: Interpretable Student Clustering Through Individualized Feature Selection

Partial discharge localization in power transformer tanks using machine learning methods

Interpret3C: Interpretable Student Clustering Through Individualized Feature Selection

Fly-Swat or Cannon? Cost-Effective Language Model Choice via Meta-Modeling

Partial discharge localization in power transformer tanks using machine learning methods