Summary
In information theory, the information content, self-information, surprisal, or Shannon information is a basic quantity derived from the probability of a particular event occurring from a random variable. It can be thought of as an alternative way of expressing probability, much like odds or log-odds, but one with particular mathematical advantages in the setting of information theory. The Shannon information can be interpreted as quantifying the level of "surprise" of a particular outcome. Because it is such a basic quantity, it also appears in several other settings, such as the length of a message needed to transmit the event given an optimal source coding of the random variable.

The Shannon information is closely related to entropy, which is the expected value of the self-information of a random variable and quantifies how surprising the random variable is "on average". Entropy is the average amount of self-information an observer would expect to gain about a random variable when measuring it. The information content can be expressed in various units of information, of which the most common is the "bit" (more correctly called the shannon), as explained below.

Claude Shannon's definition of self-information was chosen to meet three axioms:
1. An event with probability 100% is perfectly unsurprising and yields no information.
2. The less probable an event is, the more surprising it is and the more information it yields.
3. If two independent events are measured separately, the total amount of information is the sum of the self-informations of the individual events.

It can be shown that there is a unique function of probability that meets these three axioms, up to a multiplicative scaling factor. Broadly, given a real number $b > 1$ and an event $x$ with probability $P$, the information content is defined as follows:

$$\mathrm{I}(x) := -\log_b\left[\Pr(x)\right] = -\log_b(P).$$

The base $b$ corresponds to the scaling factor above.
Different choices of b correspond to different units of information: when b = 2, the unit is the shannon (symbol Sh), often called a 'bit'; when b = e, the unit is the natural unit of information (symbol nat); and when b = 10, the unit is the hartley (symbol Hart).
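The definition and its units can be illustrated with a minimal Python sketch. It assumes the standard form I(x) = -log_b(P) with b > 1; the function names `information_content` and `entropy` are hypothetical, chosen only for this illustration.

```python
import math


def information_content(p: float, base: float = 2.0) -> float:
    """Self-information -log_base(p) of an event with probability p.

    base=2 gives shannons (bits), base=e gives nats, base=10 gives hartleys.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if base <= 1.0:
        raise ValueError("base must be greater than 1")
    if p == 1.0:
        # A certain event yields no information (first axiom).
        return 0.0
    return -math.log(p) / math.log(base)


def entropy(probs, base: float = 2.0) -> float:
    """Expected self-information of a discrete distribution.

    Outcomes with zero probability contribute nothing to the sum.
    """
    return sum(p * information_content(p, base) for p in probs if p > 0.0)


if __name__ == "__main__":
    p = 0.5  # a fair coin landing heads
    print(f"{information_content(p, 2):.3f} Sh")         # 1.000 Sh
    print(f"{information_content(p, math.e):.3f} nat")   # 0.693 nat
    print(f"{information_content(p, 10):.3f} Hart")      # 0.301 Hart
```

For independent events the probabilities multiply, so the sketch gives `information_content(0.5 * 0.25)` equal to `information_content(0.5) + information_content(0.25)`, which is the additivity axiom. Changing `base` only rescales the result by a constant factor, matching the statement that the function is unique up to a multiplicative scaling factor.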