Lecture

Clustering: k-means

Description

This lecture covers k-means clustering, in which each data point is assigned to the cluster whose centroid is nearest. The instructor walks through the algorithm step by step, from its basic idea to the minimization of the within-cluster sum of squared distances. The lecture also covers the application of k-means to grouping similar data points together, the use of Euclidean distance to measure proximity, and the iterative nature of the algorithm.
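The iterative procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the lecture's own code: the function name `kmeans`, the random initialisation from data points, the empty-cluster handling, and the toy data are all assumptions made for the example.

```python
import numpy as np

def kmeans(points, k, n_iters=100, seed=0):
    """Minimal k-means sketch (hypothetical helper, not from the lecture)."""
    rng = np.random.default_rng(seed)
    # Assumed initialisation: pick k distinct data points as starting centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: squared Euclidean distance from every point to every centroid.
        sq_dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = sq_dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points.
        # An empty cluster keeps its previous centroid (an assumed convention).
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving, so the assignments are stable
        centroids = new_centroids
    # Recompute assignments and the within-cluster sum of squared distances.
    sq_dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = sq_dists.argmin(axis=1)
    inertia = sq_dists[np.arange(len(points)), labels].sum()
    return centroids, labels, inertia

# Toy data: two well-separated Gaussian blobs (made up for this example).
data_rng = np.random.default_rng(42)
blob_a = data_rng.normal(loc=(0.0, 0.0), scale=0.3, size=(50, 2))
blob_b = data_rng.normal(loc=(5.0, 5.0), scale=0.3, size=(50, 2))
points = np.vstack([blob_a, blob_b])

centroids, labels, inertia = kmeans(points, k=2)
print(centroids)
print(inertia)
```

Each pass alternates an assignment step and an update step, and neither step can increase the within-cluster sum of squared distances, which is why the loop settles. The result can depend on the initial centroids, so in practice several restarts with different seeds are common.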

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

In course

PHYS-467: Machine learning for physicists

Machine learning and data analysis are becoming increasingly central in sciences including physics. In this course, fundamental principles and methods of machine learning will be introduced and practi

Related concepts (205)

Computer cluster

A computer cluster is a set of computers that work together so that they can be viewed as a single system. Unlike grid computers, computer clusters have each node set to perform the same task, controlled and scheduled by software. The components of a cluster are usually connected to each other through fast local area networks, with each node (computer used as a server) running its own instance of an operating system. In most circumstances, all of the nodes use the same hardware and the same operating system, although in some setups (e.

Set theory

Set theory is the branch of mathematical logic that studies sets, which can be informally described as collections of objects. Although objects of any kind can be collected into a set, set theory, as a branch of mathematics, is mostly concerned with those that are relevant to mathematics as a whole. The modern study of set theory was initiated by the German mathematicians Richard Dedekind and Georg Cantor in the 1870s. In particular, Georg Cantor is commonly considered the founder of set theory.

K-means clustering

k-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean (cluster centers or cluster centroid), serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells. k-means clustering minimizes within-cluster variances (squared Euclidean distances), but not regular Euclidean distances, which would be the more difficult Weber problem: the mean optimizes squared errors, whereas only the geometric median minimizes Euclidean distances.
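The distinction between the two objectives can be checked numerically. The sketch below uses a made-up one-dimensional sample, where the geometric median coincides with the ordinary median, and a brute-force grid search over candidate centers; the data and grid are assumptions chosen only for illustration.

```python
import numpy as np

# Made-up 1-D sample with one outlier.
values = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Candidate centers on a fine grid covering the data range.
candidates = np.linspace(0.0, 100.0, 10001)

# Cost of each candidate under the two objectives.
diffs = values[None, :] - candidates[:, None]
sq_cost = (diffs ** 2).sum(axis=1)     # sum of squared distances (k-means objective)
abs_cost = np.abs(diffs).sum(axis=1)   # sum of plain Euclidean distances

best_sq = candidates[sq_cost.argmin()]
best_abs = candidates[abs_cost.argmin()]

print(best_sq, values.mean())       # minimizer of squared distances vs. the mean
print(best_abs, np.median(values))  # minimizer of plain distances vs. the median
```

The outlier pulls the squared-distance minimizer far from the bulk of the data, while the plain-distance minimizer stays at the median. This sensitivity of the mean to outliers carries over to k-means centroids.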

Fuzzy set

In mathematics, fuzzy sets (a.k.a. uncertain sets) are sets whose elements have degrees of membership. Fuzzy sets were introduced independently by Lotfi A. Zadeh in 1965 as an extension of the classical notion of set. At the same time, a more general kind of structure called an L-relation was defined and studied in an abstract algebraic context. Fuzzy relations, which are now used throughout fuzzy mathematics and have applications in areas such as linguistics, decision-making, and clustering, are special cases of L-relations when L is the unit interval [0, 1].

Universal set

In set theory, a universal set is a set which contains all objects, including itself. In set theory as usually formulated, it can be proven in multiple ways that a universal set does not exist, with the different arguments resting on different choices of axioms. However, some non-standard variants of set theory include a universal set. In Zermelo–Fraenkel set theory, the axiom of regularity and axiom of pairing prevent any set from containing itself.

Related lectures (1,000)

Nearest Neighbor Rules: Part 2 (EE-566: Adaptation and learning)

Explores the Nearest Neighbor Rules, k-NN algorithm challenges, Bayes classifier, and k-means algorithm for clustering.

Clustering & Density Estimation (DH-406: Machine learning for DH)

Covers dimensionality reduction, PCA, clustering techniques, and density estimation methods.

Clustering Methods

Covers K-means, hierarchical, and DBSCAN clustering methods with practical examples.

Document Analysis: Topic Modeling (DH-406: Machine learning for DH)

Explores document analysis, topic modeling, and generative models for data generation in machine learning.

Clustering & Density Estimation (DH-406: Machine learning for DH)

Covers clustering, PCA, LDA, K-means, GMM, KDE, and Mean Shift algorithms for density estimation and clustering.