In various science and engineering applications, such as independent component analysis, genetic analysis, speech recognition, manifold learning, and time delay estimation, it is useful to estimate the differential entropy of a system or process, given some observations. The simplest and most common approach uses histogram-based estimation, but other approaches have been developed and used, each with its own benefits and drawbacks. The main factor in choosing a method is often a trade-off between the bias and the variance of the estimate, although the nature of the (suspected) distribution of the data may also be a factor.

The histogram approach uses the idea that the differential entropy of a probability distribution f(x) for a continuous random variable X, h(f) = −∫ f(x) log f(x) dx, can be approximated by first approximating f(x) with a histogram of the observations, and then finding the discrete entropy of a quantization of X with bin probabilities given by that histogram. The histogram is itself a maximum-likelihood (ML) estimate of the discretized frequency distribution, and the resulting plug-in estimate is −Σ_i p_i log(p_i / w_i), where p_i is the estimated probability of the ith bin and w_i is the width of the ith bin. Histograms are quick to calculate and simple, so this approach has some attraction. However, the estimate produced is biased, and although corrections can be made to the estimate, they may not always be satisfactory.

A method better suited for multidimensional probability density functions (pdfs) is to first make a pdf estimate with some method, and then compute the entropy from that pdf estimate. A useful pdf estimation method is, for example, Gaussian mixture modeling (GMM), where the expectation-maximization (EM) algorithm is used to find an ML estimate of a weighted sum of Gaussian pdfs approximating the data pdf.

If the data are one-dimensional, we can imagine taking all the observations and putting them in order of their value. The spacing between one value and the next then gives a rough idea of (the reciprocal of) the probability density in that region: the closer together the values are, the higher the probability density.
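The following is a minimal sketch (not part of this page) of the three approaches just described: a histogram plug-in estimate, a GMM/EM plug-in estimate using scikit-learn's GaussianMixture with a Monte Carlo evaluation of the entropy integral, and a simple one-dimensional spacing estimate with the common Euler–Mascheroni bias correction. The function names, bin and component counts, and the particular bias correction are illustrative assumptions, not values prescribed by the text.

```python
# Illustrative sketch of three differential-entropy estimators (results in nats).
# All function names and default parameters are assumptions for demonstration.
import numpy as np
from sklearn.mixture import GaussianMixture  # EM-fitted Gaussian mixture


def histogram_entropy(samples, bins=32):
    """Histogram plug-in estimate: -sum_i p_i * log(p_i / w_i) over non-empty bins."""
    counts, edges = np.histogram(samples, bins=bins)
    widths = np.diff(edges)
    p = counts / counts.sum()
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz] / widths[nz])))


def gmm_entropy(samples, n_components=3, n_mc=100_000, seed=0):
    """Fit a GMM with EM, then estimate h(f_hat) = -E[log f_hat(X)] by Monte Carlo,
    drawing the evaluation points from the fitted mixture itself."""
    X = np.asarray(samples, dtype=float).reshape(-1, 1)
    gmm = GaussianMixture(n_components=n_components, random_state=seed).fit(X)
    draws, _ = gmm.sample(n_mc)
    return float(-gmm.score_samples(draws).mean())


def spacing_entropy(samples):
    """1-spacing estimate: average log of n * (gap between consecutive order
    statistics), plus the Euler-Mascheroni constant as a standard bias correction.
    Assumes continuous data with no tied observations."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = len(x)
    gaps = np.diff(x)  # small gaps <=> high local density, large gaps <=> low density
    return float(np.mean(np.log(n * gaps)) + np.euler_gamma)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=5000)
    exact = 0.5 * np.log(2 * np.pi * np.e)  # differential entropy of N(0, 1)
    print(f"exact      {exact:.3f}")
    print(f"histogram  {histogram_entropy(x):.3f}")
    print(f"GMM + MC   {gmm_entropy(x):.3f}")
    print(f"spacing    {spacing_entropy(x):.3f}")
```

For a standard normal sample, all three estimates should land near 0.5·log(2πe) ≈ 1.419 nats; the histogram plug-in estimate typically comes out slightly low for finite samples, which illustrates the bias issue (and the bias/variance trade-off) mentioned above.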

Related courses (1)
BIO-369: Randomness and information in biological data
Biology is becoming more and more a data science, as illustrated by the explosion of available genome sequences. This course aims to show how we can make sense of such data and harness it in order to
Related lectures (29)
Quantum Entropy: Markov Chains and Bell States
Explores quantum entropy in Markov chains and Bell states, emphasizing entanglement.
Bipartite Graphs: Independent Sets
Explores bipartite graphs, independent sets, Shearer's Lemma, labeled graphs, and entropy analysis.
Thermodynamics: Entropy and Ideal Gases
Explores entropy, ideal gases, and TDS equations in thermodynamics, emphasizing the importance of the Clausius inequality and the Carnot cycle.
Related concepts (1)
Entropy (information theory)
In information theory, the entropy of a random variable is the average level of "information", "surprise", or "uncertainty" inherent to the variable's possible outcomes. Given a discrete random variable X, which takes values in the alphabet 𝒳 and is distributed according to p : 𝒳 → [0, 1], the entropy is H(X) = −Σ_{x ∈ 𝒳} p(x) log p(x), where Σ denotes the sum over the variable's possible values. The choice of base for log, the logarithm, varies for different applications. Base 2 gives the unit of bits (or "shannons"), while base e gives "natural units" nat, and base 10 gives units of "dits", "bans", or "hartleys".
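As a small illustration (not from the source), the definition above can be computed directly from a probability vector; only the choice of logarithm base changes the units. The helper name below is a hypothetical one for demonstration.

```python
# Minimal sketch: discrete entropy H(X) = -sum_x p(x) * log(p(x)), base chosen per application.
import numpy as np

def discrete_entropy(p, base=2.0):
    """Entropy of a probability vector p; base 2 -> bits, base e -> nats, base 10 -> hartleys."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # outcomes with p(x) = 0 contribute nothing to the sum
    return float(-np.sum(p * np.log(p)) / np.log(base))

# A fair coin has 1 bit of entropy; the same distribution is about 0.693 nats.
print(discrete_entropy([0.5, 0.5], base=2), discrete_entropy([0.5, 0.5], base=np.e))
```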
