Concept

Variation of information

Summary
In probability theory and information theory, the variation of information or shared information distance is a measure of the distance between two clusterings (partitions of elements). It is closely related to mutual information; indeed, it is a simple linear expression involving the mutual information. Unlike the mutual information, however, the variation of information is a true metric, in that it obeys the triangle inequality. Suppose we have two partitions and of a set into disjoint subsets, namely and . Let: and Then the variation of information between the two partitions is: This is equivalent to the shared information distance between the random variables i and j with respect to the uniform probability measure on defined by for . We can rewrite this definition in terms that explicitly highlight the information content of this metric. The set of all partitions of a set form a compact Lattice where the partial order induces two operations, the meet and the join , where the maximum is the partition with only one block, i.e., all elements grouped together, and the minimum is , the partition consisting of all elements as singletons. The meet of two partitions and is easy to understand as that partition formed by all pair intersections of one block of, , of and one, , of . It then follows that and . Let's define the entropy of a partition as where . Clearly, and . The entropy of a partition is a monotonous function on the lattice of partitions in the sense that . Then the VI distance between and is given by The difference is a pseudo-metric as doesn't necessarily imply that . From the definition of , it is . If in the Hasse diagram we draw an edge from every partition to the maximum and assign it a weight equal to the VI distance between the given partition and , we can interpret the VI distance as basically an average of differences of edge weights to the maximum For as defined above, it holds that the joint information of two partitions coincides with the entropy of the meet and we also have that coincides with the conditional entropy of the meet (intersection) relative to .
About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Related courses (7)
COM-406: Foundations of Data Science
We discuss a set of topics that are important for the understanding of modern data science but that are typically not taught in an introductory ML course. In particular we discuss fundamental ideas an
COM-621: Advanced Topics in Information Theory
The class will focus on information-theoretic progress of the last decade. Topics include: Network Information Theory ; Information Measures: definitions, properties, and applications to probabilistic
EE-543: Advanced wireless receivers
Students extend their knowledge on wireless communication systems to spread-spectrum communication and to multi-antenna systems. They also learn about the basic information theoretic concepts, about c
Show more