**Are you an EPFL student looking for a semester project?**

Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.

Concept# Typical set

Summary

In information theory, the typical set is a set of sequences whose probability is close to two raised to the negative power of the entropy of their source distribution. That this set has total probability close to one is a consequence of the asymptotic equipartition property (AEP) which is a kind of law of large numbers. The notion of typicality is only concerned with the probability of a sequence and not the actual sequence itself.
This has great use in compression theory as it provides a theoretical means for compressing data, allowing us to represent any sequence Xn using nH(X) bits on average, and, hence, justifying the use of entropy as a measure of information from a source.
The AEP can also be proven for a large class of stationary ergodic processes, allowing typical set to be defined in more general cases.
If a sequence x1, ..., xn is drawn from an i.i.d. distribution X defined over a finite alphabet , then the typical set, Aε(n)(n) is defined as those sequences which satisfy:
where
is the information entropy of X. The probability above need only be within a factor of 2n ε. Taking the logarithm on all sides and dividing by -n, this definition can be equivalently stated as
For i.i.d sequence, since
we further have
By the law of large numbers, for sufficiently large n
An essential characteristic of the typical set is that, if one draws a large number n of independent random samples from the distribution X, the resulting sequence (x1, x2, ..., xn) is very likely to be a member of the typical set, even though the typical set comprises only a small fraction of all the possible sequences. Formally, given any , one can choose n such that:
The probability of a sequence from X(n) being drawn from Aε(n) is greater than 1 − ε, i.e.
If the distribution over is not uniform, then the fraction of sequences that are typical is
as n becomes very large, since where is the cardinality of .
For a general stochastic process {X(t)} with AEP, the (weakly) typical set can be defined similarly with p(x1, x2, ..., xn) replaced by p(x0τ) (i.

Official source

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Related lectures (7)

Related publications (29)

Related people (6)

Related concepts (1)

Noisy-channel coding theorem

In information theory, the noisy-channel coding theorem (sometimes Shannon's theorem or Shannon's limit), establishes that for any given degree of noise contamination of a communication channel, it is possible to communicate discrete data (digital information) nearly error-free up to a computable maximum rate through the channel. This result was presented by Claude Shannon in 1948 and was based in part on earlier work and ideas of Harry Nyquist and Ralph Hartley.

Information Theory: Source Coding

Covers source coding, typical sequences, stationarity, and efficient encoding in information theory.

Information Theory and Coding

Covers expected code word length, Huffman procedure, and entropy in coding theory.

Advanced Information Theory: Random Binning

Explores random binning in advanced information theory, focusing on assigning labels based on typicality and achieving negligible error rates in source coding.

Michael Christoph Gastpar, Sung Hoon Lim, Adriano Pastore, Chen Feng

Compute–forward is a coding technique that enables receiver(s) in a network to directly decode one or more linear combinations of the transmitted codewords. Initial efforts focused on Gaussian channels and derived achievable rate regions via nested lattice ...

2022We propose a new class of efficient decoding algorithms for Reed-Muller (RM) codes over binary-input memoryless channels. The algorithms are based on projecting the code on its cosets, recursively decoding the projected codes (which are lower-order RM code ...

We consider that a network is an observation, and a collection of observed networks forms a sample. In this setting, we provide methods to test whether all observations in a network sample are drawn from a specified model. We achieve this by deriving the j ...