In information theory, the conditional entropy quantifies the amount of information needed to describe the outcome of a random variable $Y$ given that the value of another random variable $X$ is known. Here, information is measured in shannons, nats, or hartleys. The entropy of $Y$ conditioned on $X$ is written as $H(Y\mid X)$.

The conditional entropy of $Y$ given $X$ is defined as
$H(Y\mid X) = -\sum_{x\in\mathcal X,\; y\in\mathcal Y} p(x,y)\,\log\frac{p(x,y)}{p(x)},$
where $\mathcal X$ and $\mathcal Y$ denote the support sets of $X$ and $Y$. Note: here, the convention is that the expression $0\log 0$ should be treated as being equal to zero. This is because $\lim_{\theta\to 0^{+}} \theta\log\theta = 0$.

Intuitively, notice that by the definition of expected value and of conditional probability, $H(Y\mid X)$ can be written as $H(Y\mid X) = \mathbb{E}[f(X,Y)]$, where $f$ is defined as $f(x,y) = -\log\frac{p(x,y)}{p(x)} = -\log p(y\mid x)$. One can think of $f$ as associating each pair $(x,y)$ with a quantity measuring the information content of $(Y=y)$ given $(X=x)$. This quantity is directly related to the amount of information needed to describe the event $(Y=y)$ given $(X=x)$. Hence, by computing the expected value of $f$ over all pairs of values $(x,y)\in\mathcal X\times\mathcal Y$, the conditional entropy measures how much information, on average, is needed to describe $Y$ once $X$ is known.

Let $H(Y\mid X=x)$ be the entropy of the discrete random variable $Y$ conditioned on the discrete random variable $X$ taking a certain value $x$. Denote the support sets of $X$ and $Y$ by $\mathcal X$ and $\mathcal Y$. Let $Y$ have probability mass function $p_Y(y)$. The unconditional entropy of $Y$ is calculated as $H(Y) := \mathbb{E}[\operatorname{I}(Y)]$, i.e.
$H(Y) = \sum_{y\in\mathcal Y} \Pr(Y=y)\,\operatorname{I}(y) = -\sum_{y\in\mathcal Y} p_Y(y)\log_2 p_Y(y),$
where $\operatorname{I}(y)$ is the information content of the outcome of $Y$ taking the value $y$. The entropy of $Y$ conditioned on $X$ taking the value $x$ is defined analogously by conditional expectation:
$H(Y\mid X=x) = -\sum_{y\in\mathcal Y} \Pr(Y=y\mid X=x)\log_2 \Pr(Y=y\mid X=x).$
Note that $H(Y\mid X)$ is the result of averaging $H(Y\mid X=x)$ over all possible values $x$ that $X$ may take. Also, if the above sum is taken over a sample $y_1,\dots,y_n$, the expected value $\mathbb{E}_X[H(y_1,\dots,y_n\mid X=x)]$ is known in some domains as equivocation.

Given discrete random variables $X$ with image $\mathcal X$ and $Y$ with image $\mathcal Y$, the conditional entropy of $Y$ given $X$ is defined as the weighted sum of $H(Y\mid X=x)$ for each possible value of $x$, using $p(x)$ as the weights:
$H(Y\mid X) = \sum_{x\in\mathcal X} p(x)\,H(Y\mid X=x) = -\sum_{x\in\mathcal X}\sum_{y\in\mathcal Y} p(x,y)\log_2 p(y\mid x).$
$H(Y\mid X) = 0$ if and only if the value of $Y$ is completely determined by the value of $X$. Conversely, $H(Y\mid X) = H(Y)$ if and only if $Y$ and $X$ are independent random variables. Assume that the combined system determined by two random variables $X$ and $Y$ has joint entropy $H(X,Y)$, that is, we need $H(X,Y)$ bits of information on average to describe its exact state. By the chain rule, $H(X,Y) = H(X) + H(Y\mid X)$: once $H(X)$ bits have been spent describing $X$, only $H(Y\mid X)$ additional bits are needed to describe $Y$.
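
As a concrete illustration, here is a minimal Python sketch that computes $H(Y\mid X)$ for a small, hypothetical joint distribution, once from the direct double-sum definition and once as the weighted sum of $H(Y\mid X=x)$, and checks the result against $H(X,Y) - H(X)$. The distribution `p_xy` and the helper names are illustrative assumptions, not part of any standard library or course material.

```python
import math

# A small, hypothetical joint distribution p(x, y) over X in {0, 1} and
# Y in {0, 1, 2}; the numbers are illustrative and only need to sum to 1.
p_xy = {
    (0, 0): 0.25, (0, 1): 0.25, (0, 2): 0.00,
    (1, 0): 0.10, (1, 1): 0.10, (1, 2): 0.30,
}

def marginal_x(p):
    """Marginal p(x), obtained by summing the joint pmf over y."""
    px = {}
    for (x, _), q in p.items():
        px[x] = px.get(x, 0.0) + q
    return px

def entropy(probs):
    """Shannon entropy in bits, using the 0 log 0 = 0 convention."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

def cond_entropy_direct(p):
    """H(Y|X) = -sum_{x,y} p(x,y) log2( p(x,y) / p(x) )."""
    px = marginal_x(p)
    return -sum(q * math.log2(q / px[x]) for (x, _), q in p.items() if q > 0)

def cond_entropy_weighted(p):
    """H(Y|X) = sum_x p(x) H(Y | X = x), the weighted-sum form."""
    px = marginal_x(p)
    total = 0.0
    for x0, qx in px.items():
        cond = [q / qx for (x, _), q in p.items() if x == x0]  # p(y | x0)
        total += qx * entropy(cond)
    return total

h_x = entropy(marginal_x(p_xy).values())
h_xy = entropy(p_xy.values())

print(f"H(Y|X), direct double sum : {cond_entropy_direct(p_xy):.4f} bits")
print(f"H(Y|X), weighted sum      : {cond_entropy_weighted(p_xy):.4f} bits")
print(f"H(X,Y) - H(X)             : {h_xy - h_x:.4f} bits")  # chain rule check
```

Keeping the pmf as a dictionary keyed by $(x, y)$ pairs makes the $0\log 0 = 0$ convention explicit: zero-probability pairs are simply skipped in each sum.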

Related courses (6)
COM-621: Advanced Topics in Information Theory
The class will focus on information-theoretic progress of the last decade. Topics include: Network Information Theory; Information Measures: definitions, properties, and applications to probabilistic
COM-102: Advanced information, computation, communication II
Text, sound, and images are examples of information sources stored in our computers and/or communicated over the Internet. How do we measure, compress, and protect the information they contain?
COM-404: Information theory and coding
The mathematical principles of communication that govern the compression and transmission of data and the design of efficient methods of doing so.
Related concepts (11)
Differential entropy
Differential entropy (also referred to as continuous entropy) is a concept in information theory that began as an attempt by Claude Shannon to extend the idea of (Shannon) entropy, a measure of the average surprisal of a random variable, to continuous probability distributions. Unfortunately, Shannon did not derive this formula, and rather just assumed it was the correct continuous analogue of discrete entropy, but it is not. The actual continuous version of discrete entropy is the limiting density of discrete points (LDDP).
Quantities of information
The mathematical theory of information is based on probability theory and statistics, and measures information with several quantities of information. The choice of logarithmic base in the following formulae determines the unit of information entropy that is used. The most common unit of information is the bit, or more correctly the shannon, based on the binary logarithm.
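
As a small illustration of the role of the logarithmic base, the Python sketch below (assuming a hypothetical biased coin with Pr(heads) = 0.9) reports the same entropy in shannons, nats, and hartleys; only the base of the logarithm changes, not the quantity being measured.

```python
import math

# Entropy of an assumed biased coin with Pr(heads) = 0.9, in three units.
p = [0.9, 0.1]

def entropy(probs, base):
    return -sum(q * math.log(q, base) for q in probs if q > 0)

print(f"{entropy(p, 2):.4f} shannons (bits)")  # binary logarithm
print(f"{entropy(p, math.e):.4f} nats")        # natural logarithm
print(f"{entropy(p, 10):.4f} hartleys")        # base-10 logarithm
```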
Joint entropy
In information theory, joint entropy is a measure of the uncertainty associated with a set of variables. The joint Shannon entropy (in bits) of two discrete random variables $X$ and $Y$ with images $\mathcal X$ and $\mathcal Y$ is defined as $H(X,Y) = -\sum_{x\in\mathcal X}\sum_{y\in\mathcal Y} P(x,y)\log_2 P(x,y)$, where $x$ and $y$ are particular values of $X$ and $Y$, respectively, $P(x,y)$ is the joint probability of these values occurring together, and $P(x,y)\log_2 P(x,y)$ is defined to be 0 if $P(x,y)=0$. For more than two random variables $X_1,\dots,X_n$ this expands to $H(X_1,\dots,X_n) = -\sum_{x_1}\cdots\sum_{x_n} P(x_1,\dots,x_n)\log_2 P(x_1,\dots,x_n)$, where $x_1,\dots,x_n$ are particular values of $X_1,\dots,X_n$, respectively, $P(x_1,\dots,x_n)$ is the probability of these values occurring together, and $P(x_1,\dots,x_n)\log_2 P(x_1,\dots,x_n)$ is defined to be 0 if $P(x_1,\dots,x_n)=0$.
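
A minimal Python sketch of this definition, assuming a hypothetical joint pmf over three binary variables, is:

```python
import math

# A hypothetical joint pmf over three binary variables (X1, X2, X3);
# the probabilities are illustrative and sum to 1.
p = {
    (0, 0, 0): 0.20, (0, 0, 1): 0.05, (0, 1, 0): 0.10, (0, 1, 1): 0.15,
    (1, 0, 0): 0.05, (1, 0, 1): 0.20, (1, 1, 0): 0.10, (1, 1, 1): 0.15,
}

def joint_entropy(pmf):
    """H(X1,...,Xn) = -sum P(x1,...,xn) log2 P(x1,...,xn), with 0 log 0 = 0."""
    return -sum(q * math.log2(q) for q in pmf.values() if q > 0)

print(f"H(X1, X2, X3) = {joint_entropy(p):.4f} bits")
```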
