Vector quantization (VQ) is a classical quantization technique from signal processing that allows the modeling of probability density functions by the distribution of prototype vectors. It was originally used for data compression. It works by dividing a large set of points (vectors) into groups having approximately the same number of points closest to them. Each group is represented by its centroid point, as in k-means and some other clustering algorithms.
The density matching property of vector quantization is powerful, especially for identifying the density of large and high-dimensional data. Since data points are represented by the index of their closest centroid, commonly occurring data have low error, and rare data high error. This is why VQ is suitable for lossy data compression. It can also be used for lossy data correction and density estimation.
Vector quantization is based on the competitive learning paradigm, so it is closely related to the self-organizing map model and to sparse coding models used in deep learning algorithms such as autoencoder.
The simplest training algorithm for vector quantization is:
Pick a sample point at random
Move the nearest quantization vector centroid towards this sample point, by a small fraction of the distance
Repeat
A more sophisticated algorithm reduces the bias in the density matching estimation, and ensures that all points are used, by including an extra sensitivity parameter :
Increase each centroid's sensitivity by a small amount
Pick a sample point at random
For each quantization vector centroid , let denote the distance of and
Find the centroid for which is the smallest
Move towards by a small fraction of the distance
Set to zero
Repeat
It is desirable to use a cooling schedule to produce convergence: see Simulated annealing. Another (simpler) method is LBG which is based on K-Means.
The algorithm can be iteratively updated with 'live' data, rather than by picking random points from a data set, but this will introduce some bias if the data are temporally correlated over many samples.
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
The goal of this course is to introduce the engineering students state-of-the-art speech and audio coding techniques with an emphasis on the integration of knowledge about sound production and auditor
This course covers fundamental notions in image and video processing, as well as covers most popular tools used, such as edge detection, motion estimation, segmentation, and compression. It is compose
This course provides the students with 1) a set of theoretical concepts to understand the machine learning approach; and 2) a subset of the tools to use this approach for problems arising in mechanica
A self-organizing map (SOM) or self-organizing feature map (SOFM) is an unsupervised machine learning technique used to produce a low-dimensional (typically two-dimensional) representation of a higher dimensional data set while preserving the topological structure of the data. For example, a data set with variables measured in observations could be represented as clusters of observations with similar values for the variables.
k-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean (cluster centers or cluster centroid), serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells. k-means clustering minimizes within-cluster variances (squared Euclidean distances), but not regular Euclidean distances, which would be the more difficult Weber problem: the mean optimizes squared errors, whereas only the geometric median minimizes Euclidean distances.
In electrical engineering and computer science, Lloyd's algorithm, also known as Voronoi iteration or relaxation, is an algorithm named after Stuart P. Lloyd for finding evenly spaced sets of points in subsets of Euclidean spaces and partitions of these subsets into well-shaped and uniformly sized convex cells. Like the closely related k-means clustering algorithm, it repeatedly finds the centroid of each set in the partition and then re-partitions the input according to which of these centroids is closest.
Explores consensus algorithms in networked control systems, covering topics like Metropolis-Hasting models and distributed computation of Least-Squares regression.
Point cloud representation is a popular modality to code immersive 3D contents. Several solutions and standards have been recently proposed in order to efficiently compress the large volume of data that point clouds require, in order to make them feasible ...
Visual Question Answering (VQA) on remote sensing imagery can help non-expert users in extracting information from Earth observation data. Current approaches follow a neural encoder-decoder design, combining convolutional and recurrent encoders together wi ...
A logconcave likelihood is as important to proper statistical inference as a convex cost function is important to variational optimization. Quantization is often disregarded when writing likelihood models, ignoring the limitations of the physical detectors ...