# Joint entropy

Summary

In information theory, joint entropy is a measure of the uncertainty associated with a set of variables.
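The joint Shannon entropy (in bits) of two discrete random variables $X$ and $Y$ with images $\mathcal{X}$ and $\mathcal{Y}$ is defined as

$$H(X,Y) = -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} P(x,y) \log_2 P(x,y)$$

where $x$ and $y$ are particular values of $X$ and $Y$, respectively, $P(x,y)$ is the joint probability of these values occurring together, and $P(x,y) \log_2 P(x,y)$ is defined to be 0 if $P(x,y) = 0$.
For more than two random variables $X_1, \ldots, X_n$ this expands to

$$H(X_1, \ldots, X_n) = -\sum_{x_1 \in \mathcal{X}_1} \cdots \sum_{x_n \in \mathcal{X}_n} P(x_1, \ldots, x_n) \log_2 P(x_1, \ldots, x_n)$$

where $x_1, \ldots, x_n$ are particular values of $X_1, \ldots, X_n$, respectively, $P(x_1, \ldots, x_n)$ is the probability of these values occurring together, and $P(x_1, \ldots, x_n) \log_2 P(x_1, \ldots, x_n)$ is defined to be 0 if $P(x_1, \ldots, x_n) = 0$.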
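The definition can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `joint_entropy`, the dictionary representation of the joint distribution, and the example probabilities are all assumptions made for this sketch.

```python
import math

def joint_entropy(joint):
    """Joint Shannon entropy in bits.

    `joint` maps each outcome tuple, e.g. (x, y) or (x1, ..., xn),
    to its joint probability (hypothetical representation).
    """
    total = 0.0
    for p in joint.values():
        if p > 0:  # the term p * log2(p) is defined to be 0 when p == 0
            total -= p * math.log2(p)
    return total

# Hypothetical joint distribution of two binary variables X and Y.
joint = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.0}
print(joint_entropy(joint))  # 0.5*1 + 0.25*2 + 0.25*2 = 1.5 bits
```

Because the outcomes are stored as tuples of any length, the same function covers the case of more than two variables.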
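The joint entropy of a set of random variables is a nonnegative number: $H(X,Y) \geq 0$ and, more generally, $H(X_1, \ldots, X_n) \geq 0$.
The joint entropy of a set of variables is greater than or equal to the maximum of all of the individual entropies of the variables in the set: $H(X,Y) \geq \max[H(X), H(Y)]$ and $H(X_1, \ldots, X_n) \geq \max_{1 \leq i \leq n} H(X_i)$.
The joint entropy of a set of variables is less than or equal to the sum of the individual entropies of the variables in the set: $H(X,Y) \leq H(X) + H(Y)$ and $H(X_1, \ldots, X_n) \leq H(X_1) + \cdots + H(X_n)$. This is an example of subadditivity. The two-variable inequality is an equality if and only if $X$ and $Y$ are statistically independent.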
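These bounds can be checked numerically. The sketch below assumes a joint distribution stored as a dictionary keyed by `(x, y)` pairs; the helper names `entropy` and `entropies` and both example distributions are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def entropies(joint):
    """Return (H(X), H(Y), H(X,Y)) for a joint pmf {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p  # marginal of X
        py[y] = py.get(y, 0.0) + p  # marginal of Y
    return entropy(px.values()), entropy(py.values()), entropy(joint.values())

# Hypothetical examples: one dependent pair, one independent pair.
dependent = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.0}
independent = {(x, y): p_x * p_y
               for x, p_x in ((0, 0.75), (1, 0.25))
               for y, p_y in ((0, 0.5), (1, 0.5))}

for name, joint in (("dependent", dependent), ("independent", independent)):
    hx, hy, hxy = entropies(joint)
    # max(H(X), H(Y)) <= H(X,Y) <= H(X) + H(Y), up to rounding error
    print(name, max(hx, hy) <= hxy <= hx + hy + 1e-12)
```

For the independent pair the upper bound is attained up to floating-point rounding, while for the dependent pair the joint entropy is strictly smaller than the sum.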
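Joint entropy is used in the definition of conditional entropy

$$H(X \mid Y) = H(X,Y) - H(Y)$$

and in the chain rule

$$H(X_1, \ldots, X_n) = \sum_{k=1}^{n} H(X_k \mid X_{k-1}, \ldots, X_1).$$

It is also used in the definition of mutual information

$$I(X;Y) = H(X) + H(Y) - H(X,Y).$$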
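As a rough illustration, conditional entropy and mutual information can both be obtained from a joint distribution using the standard identities $H(X \mid Y) = H(X,Y) - H(Y)$ and $I(X;Y) = H(X) + H(Y) - H(X,Y)$. The function name, the dictionary representation, and the example probabilities below are assumptions made for this sketch.

```python
import math

def conditional_entropy_and_mi(joint):
    """Return (H(X|Y), I(X;Y)) in bits for a joint pmf {(x, y): probability}."""
    def h(probs):
        # Shannon entropy in bits; zero-probability terms contribute 0.
        return -sum(p * math.log2(p) for p in probs if p > 0)

    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p

    hx, hy, hxy = h(px.values()), h(py.values()), h(joint.values())
    h_x_given_y = hxy - hy           # H(X|Y) = H(X,Y) - H(Y)
    mutual_info = hx + hy - hxy      # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return h_x_given_y, mutual_info

# Hypothetical joint distribution of two binary variables.
joint = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.0}
h_x_given_y, mi = conditional_entropy_and_mi(joint)
print(round(h_x_given_y, 4), round(mi, 4))
```

Computing both quantities from the three entropies avoids forming the conditional distribution explicitly, at the cost of some rounding error when the entropies are nearly equal.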
In quantum information theory, the joint entropy is generalized into the joint quantum entropy.
The definition above is for discrete random variables; an analogous definition applies to continuous random variables. The continuous version of discrete joint entropy is called joint differential (or continuous) entropy. Let $X$ and $Y$ be continuous random variables with a joint probability density function $f(x,y)$. The differential joint entropy $h(X,Y)$ is defined as

$$h(X,Y) = -\int_{\mathcal{X},\mathcal{Y}} f(x,y) \log f(x,y) \, dx \, dy$$

For more than two continuous random variables $X_1, \ldots, X_n$ the definition is generalized to:

$$h(X_1, \ldots, X_n) = -\int f(x_1, \ldots, x_n) \log f(x_1, \ldots, x_n) \, dx_1 \cdots dx_n$$

The integral is taken over the support of $f$. It is possible that the integral does not exist, in which case we say that the differential entropy is not defined.
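For a concrete case, the integral has a well-known closed form when $(X, Y)$ is bivariate normal with covariance matrix $\Sigma$: $h(X,Y) = \tfrac{1}{2} \log\left((2\pi e)^2 \det \Sigma\right)$. The sketch below evaluates that formula in bits; the function name and the parameterization by standard deviations and correlation are assumptions chosen for illustration.

```python
import math

def gaussian_joint_differential_entropy(sigma_x, sigma_y, rho):
    """Joint differential entropy (in bits) of a bivariate normal.

    Uses the closed form 0.5 * log2((2*pi*e)**2 * det(Sigma)), where
    det(Sigma) = sigma_x**2 * sigma_y**2 * (1 - rho**2).
    """
    det = (sigma_x * sigma_y) ** 2 * (1.0 - rho ** 2)
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    return 0.5 * math.log2((2 * math.pi * math.e) ** 2 * det)

print(gaussian_joint_differential_entropy(1.0, 1.0, 0.0))  # independent, unit variance
print(gaussian_joint_differential_entropy(1.0, 1.0, 0.9))  # strongly correlated
print(gaussian_joint_differential_entropy(0.1, 0.1, 0.0))  # narrow density
```

Unlike the discrete joint entropy, this quantity can be negative, as the last call shows for a sufficiently narrow density.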
As in the discrete case, the joint differential entropy of a set of random variables is less than or equal to the sum of the entropies of the individual random variables:

$$h(X_1, \ldots, X_n) \leq \sum_{i=1}^{n} h(X_i)$$

The following chain rule holds for two random variables:

$$h(X,Y) = h(X \mid Y) + h(Y)$$
In the case of m

Related concepts (8)

Quantities of information

The mathematical theory of information is based on probability theory and statistics, and measures information with several quantities of information. The choice of logarithmic base in the following formulae determines the unit of information entropy that is used. The most common unit of information is the bit, or more correctly the shannon, based on the binary logarithm.

Cross-entropy

In information theory, the cross-entropy between two probability distributions $p$ and $q$ over the same underlying set of events measures the average number of bits needed to identify an event drawn from the set if a coding scheme used for the set is optimized for an estimated probability distribution $q$, rather than the true distribution $p$. The cross-entropy of the distribution $q$ relative to a distribution $p$ over a given set is defined as $H(p,q) = -\operatorname{E}_p[\log q]$, where $\operatorname{E}_p[\cdot]$ is the expected value operator with respect to the distribution $p$.

Related courses (3)

COM-406: Foundations of Data Science

We discuss a set of topics that are important for the understanding of modern data science but that are typically not taught in an introductory ML course. In particular we discuss fundamental ideas an

COM-102: Advanced information, computation, communication II

Text, sound, and images are examples of information sources stored in our computers and/or communicated over the Internet. How do we measure, compress, and protect the information they contain?

COM-404: Information theory and coding

The mathematical principles of communication that govern the compression and transmission of data and the design of efficient methods of doing so.

Related lectures (33)

Conditional Entropy: Huffman Coding

Explores conditional entropy and Huffman coding for efficient data compression techniques.

Conditional Entropy: Review and Definitions

Covers conditional entropy, weather conditions, function entropy, and the chain rule.

Entropy Bounds: Conditional Entropy Theorems

Explores entropy bounds, conditional entropy theorems, and the chain rule for entropies, illustrating their application through examples.
