Summary
In information theory, joint entropy is a measure of the uncertainty associated with a set of variables. The joint Shannon entropy (in bits) of two discrete random variables and with images and is defined as where and are particular values of and , respectively, is the joint probability of these values occurring together, and is defined to be 0 if . For more than two random variables this expands to where are particular values of , respectively, is the probability of these values occurring together, and is defined to be 0 if . The joint entropy of a set of random variables is a nonnegative number. The joint entropy of a set of variables is greater than or equal to the maximum of all of the individual entropies of the variables in the set. The joint entropy of a set of variables is less than or equal to the sum of the individual entropies of the variables in the set. This is an example of subadditivity. This inequality is an equality if and only if and are statistically independent. Joint entropy is used in the definition of conditional entropy and It is also used in the definition of mutual information In quantum information theory, the joint entropy is generalized into the joint quantum entropy. The above definition is for discrete random variables and just as valid in the case of continuous random variables. The continuous version of discrete joint entropy is called joint differential (or continuous) entropy. Let and be a continuous random variables with a joint probability density function . The differential joint entropy is defined as For more than two continuous random variables the definition is generalized to: The integral is taken over the support of . It is possible that the integral does not exist in which case we say that the differential entropy is not defined. As in the discrete case the joint differential entropy of a set of random variables is smaller or equal than the sum of the entropies of the individual random variables: The following chain rule holds for two random variables: In the case of m
About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Related courses (3)
COM-406: Foundations of Data Science
We discuss a set of topics that are important for the understanding of modern data science but that are typically not taught in an introductory ML course. In particular we discuss fundamental ideas an
COM-102: Advanced information, computation, communication II
Text, sound, and images are examples of information sources stored in our computers and/or communicated over the Internet. How do we measure, compress, and protect the informatin they contain?
COM-404: Information theory and coding
The mathematical principles of communication that govern the compression and transmission of data and the design of efficient methods of doing so.
Related lectures (32)
Conditional Entropy: Huffman Coding
Explores conditional entropy and Huffman coding for efficient data compression techniques.
Conditional Entropy: Review and Definitions
Covers conditional entropy, weather conditions, function entropy, and the chain rule.
Entropy Bounds: Conditional Entropy Theorems
Explores entropy bounds, conditional entropy theorems, and the chain rule for entropies, illustrating their application through examples.
Show more
Related publications (8)
Related concepts (7)
Cross-entropy
In information theory, the cross-entropy between two probability distributions and over the same underlying set of events measures the average number of bits needed to identify an event drawn from the set if a coding scheme used for the set is optimized for an estimated probability distribution , rather than the true distribution . The cross-entropy of the distribution relative to a distribution over a given set is defined as follows: where is the expected value operator with respect to the distribution .
Quantities of information
The mathematical theory of information is based on probability theory and statistics, and measures information with several quantities of information. The choice of logarithmic base in the following formulae determines the unit of information entropy that is used. The most common unit of information is the bit, or more correctly the shannon, based on the binary logarithm.
Conditional entropy
In information theory, the conditional entropy quantifies the amount of information needed to describe the outcome of a random variable given that the value of another random variable is known. Here, information is measured in shannons, nats, or hartleys. The entropy of conditioned on is written as . The conditional entropy of given is defined as where and denote the support sets of and . Note: Here, the convention is that the expression should be treated as being equal to zero. This is because .
Show more