In probability theory and information theory, the mutual information (MI) of two random variables is a measure of the mutual dependence between the two variables. More specifically, it quantifies the "amount of information" (in units such as shannons (bits), nats or hartleys) obtained about one random variable by observing the other random variable. The concept of mutual information is intimately linked to that of entropy of a random variable, a fundamental notion in information theory that quantifies the expected "amount of information" held in a random variable.
Not limited to real-valued random variables and linear dependence like the correlation coefficient, MI is more general and determines how different the joint distribution of the pair \((X, Y)\) is from the product of the marginal distributions of \(X\) and \(Y\). MI is the expected value of the pointwise mutual information (PMI).
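To make the "expected value of the PMI" reading concrete, here is a minimal sketch in NumPy, assuming a made-up joint distribution of two binary variables (the array values are illustrative, not taken from the text): it computes the pointwise mutual information of every outcome pair and then averages it under the joint distribution.

```python
import numpy as np

# Hypothetical joint distribution p(x, y) of two binary variables (values are assumptions).
p_xy = np.array([[0.30, 0.10],
                 [0.15, 0.45]])

p_x = p_xy.sum(axis=1, keepdims=True)   # marginal p(x), shape (2, 1)
p_y = p_xy.sum(axis=0, keepdims=True)   # marginal p(y), shape (1, 2)

# Pointwise mutual information of each outcome pair, in nats.
pmi = np.log(p_xy / (p_x * p_y))

# Mutual information is the expectation of the PMI under the joint distribution.
mi = np.sum(p_xy * pmi)
print(f"I(X;Y) = {mi:.4f} nats")
```

Because no joint probability in this toy example is zero, the PMI is finite everywhere; outcomes with zero probability would be handled with the usual convention \(0 \log 0 = 0\).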
The quantity was defined and analyzed by Claude Shannon in his landmark paper "A Mathematical Theory of Communication", although he did not call it "mutual information". This term was coined later by Robert Fano. Mutual information is also known as information gain.
Let \((X, Y)\) be a pair of random variables with values over the space \(\mathcal{X} \times \mathcal{Y}\). If their joint distribution is \(P_{(X,Y)}\) and the marginal distributions are \(P_X\) and \(P_Y\), the mutual information is defined as
\[
I(X; Y) = D_{\mathrm{KL}}\!\left(P_{(X,Y)} \,\big\|\, P_X \otimes P_Y\right),
\]
where \(D_{\mathrm{KL}}\) is the Kullback–Leibler divergence, and \(P_X \otimes P_Y\) is the outer product distribution, which assigns probability \(P_X(x) \cdot P_Y(y)\) to each \((x, y)\).
Notice, as per a property of the Kullback–Leibler divergence, that \(I(X; Y)\) is equal to zero precisely when the joint distribution coincides with the product of the marginals, i.e. when \(X\) and \(Y\) are independent (and hence observing \(Y\) tells you nothing about \(X\)). \(I(X; Y)\) is non-negative; it is a measure of the price for encoding \((X, Y)\) as a pair of independent random variables when in reality they are not.
If the natural logarithm is used, the unit of mutual information is the nat. If the log base 2 is used, the unit of mutual information is the shannon, also known as the bit. If the log base 10 is used, the unit of mutual information is the hartley, also known as the ban or the dit.
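As a sketch of how the definition and the choice of units fit together, the following example (again with an illustrative joint distribution; the helper function name and the numbers are assumptions, not from the text) computes \(I(X;Y)\) as the Kullback–Leibler divergence between the joint distribution and the product of its marginals, once for each logarithm base mentioned above.

```python
import numpy as np

def mutual_information(p_xy, base=np.e):
    """I(X;Y) as the KL divergence between the joint p(x,y) and the product of its marginals."""
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    ratio = p_xy / (p_x * p_y)
    return np.sum(p_xy * np.log(ratio)) / np.log(base)

# Illustrative joint distribution (an assumption chosen for the example).
p_xy = np.array([[0.30, 0.10],
                 [0.15, 0.45]])

print(mutual_information(p_xy, base=np.e))  # nats      (natural logarithm)
print(mutual_information(p_xy, base=2))     # shannons / bits
print(mutual_information(p_xy, base=10))    # hartleys / bans / dits
```

The three printed values differ only by the constant change-of-base factor, so converting between units is just a rescaling.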
In a first part, we will first study how to solve a problem very concretely by means of an algorithm, which will then lead us, in a second stage, to one of the major questions of
Biology is becoming more and more a data science, as illustrated by the explosion of available genome sequences. This course aims to show how we can make sense of such data and harness it in order to
We discuss a set of topics that are important for the understanding of modern data science but that are typically not taught in an introductory ML course. In particular we discuss fundamental ideas an
The aim of this doctoral course by Nicolas Sangouard is to lay the theoretical groundwork that is needed for students to understand how to take advantage of quantum effects for communication technolog
In information theory, the entropy of a random variable is the average level of "information", "surprise", or "uncertainty" inherent to the variable's possible outcomes. Given a discrete random variable \(X\), which takes values in the alphabet \(\mathcal{X}\) and is distributed according to \(p : \mathcal{X} \to [0, 1]\), the entropy is
\[
H(X) = -\sum_{x \in \mathcal{X}} p(x) \log p(x),
\]
where \(\sum\) denotes the sum over the variable's possible values. The choice of base for \(\log\), the logarithm, varies for different applications. Base 2 gives the unit of bits (or "shannons"), while base \(e\) gives "natural units" nat, and base 10 gives units of "dits", "bans", or "hartleys".
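As a quick numerical check of the formula, this sketch evaluates \(H(X)\) for a hypothetical four-symbol distribution (the probabilities are assumptions, chosen so the entropy is exactly 1.75 bits):

```python
import numpy as np

def entropy(p, base=2):
    """Shannon entropy H(X) = -sum_x p(x) log p(x); zero-probability symbols are skipped (0 log 0 = 0)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p)) / np.log(base)

# Hypothetical distribution over a four-letter alphabet (an assumption for the example).
p = [0.5, 0.25, 0.125, 0.125]
print(entropy(p, base=2))      # 1.75 bits
print(entropy(p, base=np.e))   # the same entropy expressed in nats
```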
Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of exploratory data analysis, and a common technique for statistical data analysis, used in many fields, including pattern recognition, information retrieval, bioinformatics, data compression, computer graphics and machine learning.
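Mutual information is one common way to compare two such groupings of the same objects: treating the cluster labels produced by two algorithms as discrete random variables, their mutual information measures how much knowing one partition reduces uncertainty about the other. A minimal sketch, assuming scikit-learn is installed and using made-up label vectors:

```python
from sklearn.metrics import adjusted_mutual_info_score, mutual_info_score

# Hypothetical cluster assignments of six objects from two different algorithms (assumptions).
labels_a = [0, 0, 1, 1, 2, 2]
labels_b = [1, 1, 0, 0, 0, 2]

print(mutual_info_score(labels_a, labels_b))           # raw mutual information, in nats
print(adjusted_mutual_info_score(labels_a, labels_b))  # chance-corrected variant, close to 1 for matching partitions
```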
Differential entropy (also referred to as continuous entropy) is a concept in information theory that began as an attempt by Claude Shannon to extend the idea of (Shannon) entropy, a measure of average surprisal of a random variable, to continuous probability distributions, by defining \(h(X) = -\int f(x) \log f(x) \, dx\) for a random variable with density \(f\). Unfortunately, Shannon did not derive this formula, but rather just assumed it was the correct continuous analogue of discrete entropy; it is not. The actual continuous version of discrete entropy is the limiting density of discrete points (LDDP).
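One way to see that the naive continuous formula is not a direct analogue of discrete entropy is that it can be negative. For a uniform distribution on \((0, a)\) the closed form is \(h(X) = \log a\) (a standard result, used here only as an illustration), so any interval shorter than 1 already gives a negative value, whereas discrete entropy is always non-negative:

```python
import math

def differential_entropy_uniform(a):
    """h(X) = -integral of f(x) log f(x) dx = log(a) for X ~ Uniform(0, a)."""
    return math.log(a)

print(differential_entropy_uniform(2.0))   # ~ 0.693 nats, positive
print(differential_entropy_uniform(0.5))   # ~ -0.693 nats: negative, which discrete entropy can never be
```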
Covers information measures like entropy, Kullback-Leibler divergence, and data processing inequality, along with probability kernels and mutual information.
Biochemistry, ecology, and neuroscience are examples of prominent fields aiming at describing interacting systems that exhibit nontrivial couplings to complex, ever-changing environments. We have rece
Since the birth of Information Theory, researchers have defined and exploited various information measures, as well as endowed them with operational meanings. Some were born as a "solution to a proble
We consider increasingly complex models of matrix denoising and dictionary learning in the Bayes-optimal setting, in the challenging regime where the matrices to infer have a rank growing linearly wit