Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
In information theory, Shannon's source coding theorem (or noiseless coding theorem) establishes the limits to possible data compression, and the operational meaning of the Shannon entropy. Named after Claude Shannon, the source coding theorem shows that (in the limit, as the length of a stream of independent and identically-distributed random variable (i.i.d.) data tends to infinity) it is impossible to compress the data such that the code rate (average number of bits per symbol) is less than the Shannon entropy of the source, without it being virtually certain that information will be lost. However it is possible to get the code rate arbitrarily close to the Shannon entropy, with negligible probability of loss. The source coding theorem for symbol codes places an upper and a lower bound on the minimal possible expected length of codewords as a function of the entropy of the input word (which is viewed as a random variable) and of the size of the target alphabet. Source coding is a mapping from (a sequence of) symbols from an information source to a sequence of alphabet symbols (usually bits) such that the source symbols can be exactly recovered from the binary bits (lossless source coding) or recovered within some distortion (lossy source coding). This is the concept behind data compression. In information theory, the source coding theorem (Shannon 1948) informally states that (MacKay 2003, pg. 81, Cover 2006, Chapter 5): N i.i.d. random variables each with entropy H(X) can be compressed into more than N H(X) bits with negligible risk of information loss, as N → ∞; but conversely, if they are compressed into fewer than N H(X) bits it is virtually certain that information will be lost.The coded sequence represents the compressed message in a biunivocal way, under the assumption that the decoder knows the source. From a practical point of view, this hypothesis is not always true. Consequently, when the entropy encoding is applied the transmitted message is .
Martin Jaggi, Sebastian Urban Stich, Quentin Rebjock, Sai Praneeth Reddy Karimireddy
Michael Christoph Gastpar, Sung Hoon Lim, Jingge Zhu
Martin Jaggi, Sebastian Urban Stich, Quentin Rebjock, Sai Praneeth Reddy Karimireddy