In probability theory, concentration inequalities provide bounds on how a random variable deviates from some value (typically, its expected value). The law of large numbers of classical probability theory states that sums of independent random variables are, under very mild conditions, close to their expectation with a large probability. Such sums are the most basic examples of random variables concentrated around their mean. Recent results show that such behavior is shared by other functions of independent random variables.
Concentration inequalities can be sorted according to how much information about the random variable is needed in order to use them.
Markov's inequality
Let $X$ be a random variable that is non-negative (almost surely). Then, for every constant $a > 0$,
\[ \Pr(X \ge a) \le \frac{\mathbb{E}[X]}{a}. \]
Note the following extension to Markov's inequality: if $\Phi$ is a strictly increasing and non-negative function, then
\[ \Pr(X \ge a) = \Pr(\Phi(X) \ge \Phi(a)) \le \frac{\mathbb{E}[\Phi(X)]}{\Phi(a)}. \]
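As an illustration (not part of the original text), the following short Python sketch checks Markov's inequality by Monte Carlo for an exponential random variable; the distribution, sample size and threshold are arbitrary choices.

import numpy as np

rng = np.random.default_rng(0)
X = rng.exponential(scale=1.0, size=1_000_000)  # non-negative random variable with E[X] = 1
a = 3.0                                         # arbitrary threshold a > 0
empirical = np.mean(X >= a)                     # Monte Carlo estimate of Pr(X >= a)
markov = X.mean() / a                           # Markov bound E[X]/a
print(f"Pr(X >= {a}) ~ {empirical:.4f} <= Markov bound {markov:.4f}")

Here the empirical tail probability is roughly $e^{-3} \approx 0.05$, comfortably below the (loose) Markov bound of about $1/3$.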
Chebyshev's inequality
Chebyshev's inequality requires the following information on a random variable $X$:
The expected value $\mathbb{E}[X]$ is finite.
The variance $\operatorname{Var}[X]$ is finite.
Then, for every constant $a > 0$,
\[ \Pr(|X - \mathbb{E}[X]| \ge a) \le \frac{\operatorname{Var}[X]}{a^2}, \]
or equivalently,
\[ \Pr(|X - \mathbb{E}[X]| \ge a\sigma) \le \frac{1}{a^2}, \]
where $\sigma$ is the standard deviation of $X$.
Chebyshev's inequality can be seen as a special case of the generalized Markov's inequality applied to the random variable $|X - \mathbb{E}[X]|$ with $\Phi(t) = t^2$.
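As a hedged illustration (not from the original page), the same kind of Monte Carlo check for Chebyshev's inequality; the standard normal test case and the threshold are arbitrary.

import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal(1_000_000)  # any distribution with finite mean and variance
mu, var = X.mean(), X.var()
a = 2.0                             # arbitrary deviation threshold a > 0
empirical = np.mean(np.abs(X - mu) >= a)
chebyshev = var / a**2              # Chebyshev bound Var[X]/a^2
print(f"Pr(|X - E[X]| >= {a}) ~ {empirical:.4f} <= Chebyshev bound {chebyshev:.4f}")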
Vysochanskij–Petunin inequality
Let $X$ be a random variable with unimodal distribution, mean $\mu$ and finite, non-zero variance $\sigma^2$. Then, for any $\lambda > \sqrt{8/3} \approx 1.633$,
\[ \Pr(|X - \mu| \ge \lambda\sigma) \le \frac{4}{9\lambda^2}. \]
(For a relatively elementary proof see e.g. ).
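To give a concrete sense of the improvement over Chebyshev for unimodal distributions, here is an illustrative numerical comparison (not from the original text; the standard normal test case and $\lambda = 2$ are arbitrary):

import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal(1_000_000)  # unimodal, mu = 0, sigma = 1
lam = 2.0                           # any lambda > sqrt(8/3) ~ 1.633
empirical = np.mean(np.abs(X) >= lam)
vp_bound = 4.0 / (9.0 * lam**2)     # Vysochanskij-Petunin bound
cheb_bound = 1.0 / lam**2           # Chebyshev bound, for comparison
print(f"empirical {empirical:.4f} <= V-P {vp_bound:.4f} <= Chebyshev {cheb_bound:.4f}")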
One-sided Vysochanskij–Petunin inequality
For a unimodal random variable $X$ and $r \ge 0$, the one-sided Vysochanskij–Petunin inequality holds as follows:
\[ \Pr(X - \mathbb{E}[X] \ge r) \le \begin{cases} \dfrac{4}{9}\,\dfrac{\operatorname{Var}(X)}{\operatorname{Var}(X) + r^2} & \text{for } r^2 \ge \tfrac{5}{3}\operatorname{Var}(X), \\[6pt] \dfrac{4}{3}\,\dfrac{\operatorname{Var}(X)}{\operatorname{Var}(X) + r^2} - \dfrac{1}{3} & \text{otherwise.} \end{cases} \]
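A quick sanity check of the one-sided bound, again purely illustrative (the exponential test case and the value of $r$ are arbitrary, chosen so that the first case of the bound applies):

import numpy as np

rng = np.random.default_rng(3)
X = rng.exponential(scale=1.0, size=1_000_000)  # unimodal, Var(X) = 1
r = 2.0                                         # r^2 = 4 >= (5/3) * Var(X), so the first case applies
empirical = np.mean(X - X.mean() >= r)
bound = (4.0 / 9.0) * X.var() / (X.var() + r**2)
print(f"Pr(X - E[X] >= {r}) ~ {empirical:.4f} <= one-sided V-P bound {bound:.4f}")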
Paley–Zygmund inequality
Cantelli's inequality
Gauss's inequality
Chernoff bound
The generic Chernoff bound requires the moment generating function of $X$, defined as $M_X(t) := \mathbb{E}\!\left[e^{tX}\right]$. It always exists, but may be infinite. From Markov's inequality, for every $t > 0$:
\[ \Pr(X \ge a) \le \frac{\mathbb{E}\!\left[e^{tX}\right]}{e^{ta}}, \]
and for every $t < 0$:
\[ \Pr(X \le a) \le \frac{\mathbb{E}\!\left[e^{tX}\right]}{e^{ta}}. \]
There are various Chernoff bounds for different distributions and different values of the parameter $t$. See for a compilation of more concentration inequalities.
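To make the optimization over $t$ concrete, the following sketch (not from the original text) evaluates the generic bound $\inf_{t>0} \mathbb{E}[e^{tX}]/e^{ta}$ on a coarse grid for a Binomial sum and compares it with a Monte Carlo estimate of the tail; the parameters $n$, $p$ and $a$ are arbitrary.

import numpy as np

n, p, a = 100, 0.5, 60                   # X ~ Binomial(n, p), tail threshold a
ts = np.linspace(1e-3, 2.0, 2000)        # grid over t > 0
mgf = (1 - p + p * np.exp(ts)) ** n      # E[e^{tX}] for a Binomial(n, p) random variable
chernoff = np.min(mgf / np.exp(ts * a))  # generic Chernoff bound, minimized over the grid
rng = np.random.default_rng(4)
empirical = np.mean(rng.binomial(n, p, size=1_000_000) >= a)
print(f"Pr(X >= {a}) ~ {empirical:.5f} <= Chernoff bound {chernoff:.5f}")

Optimizing over $t$ is what makes the bound exponentially small in $n$; any fixed $t > 0$ already gives a valid, if weaker, bound.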
In probability theory, Hoeffding's inequality provides an upper bound on the probability that the sum of bounded independent random variables deviates from its expected value by more than a certain amount. Hoeffding's inequality was proven by Wassily Hoeffding in 1963. Hoeffding's inequality is a special case of the Azuma–Hoeffding inequality and McDiarmid's inequality. It is similar to the Chernoff bound, but tends to be less sharp, in particular when the variance of the random variables is small.
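As an illustration of the kind of statement involved, the sketch below assumes the standard form of Hoeffding's bound for the sample mean of independent $[0, 1]$-valued variables, $\Pr(|\bar{X} - \mathbb{E}[\bar{X}]| \ge t) \le 2\exp(-2nt^2)$, which the excerpt above does not spell out; the uniform test case and the parameters are arbitrary.

import numpy as np

rng = np.random.default_rng(5)
n, t = 200, 0.1
samples = rng.random((50_000, n))      # each row: n i.i.d. Uniform[0, 1] variables, so each X_i lies in [0, 1]
means = samples.mean(axis=1)
empirical = np.mean(np.abs(means - 0.5) >= t)
hoeffding = 2 * np.exp(-2 * n * t**2)  # two-sided Hoeffding bound for the sample mean of [0, 1]-valued variables
print(f"Pr(|mean - 0.5| >= {t}) ~ {empirical:.5f} <= Hoeffding bound {hoeffding:.5f}")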
In probability theory, Bernstein inequalities give bounds on the probability that the sum of random variables deviates from its mean. In the simplest case, let $X_1, \ldots, X_n$ be independent Bernoulli random variables taking values $+1$ and $-1$ with probability $1/2$ (this distribution is also known as the Rademacher distribution); then for every positive $\varepsilon$,
\[ \Pr\!\left( \left| \frac{1}{n}\sum_{i=1}^{n} X_i \right| > \varepsilon \right) \le 2\exp\!\left( -\frac{n\varepsilon^2}{2(1+\varepsilon/3)} \right). \]
Bernstein inequalities were proven and published by Sergei Bernstein in the 1920s and 1930s. Later, these inequalities were rediscovered several times in various forms.
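A quick Monte Carlo check of this Rademacher case (illustrative only; $n$ and $\varepsilon$ are arbitrary):

import numpy as np

rng = np.random.default_rng(6)
n, eps = 500, 0.15
signs = 2 * rng.integers(0, 2, size=(50_000, n), dtype=np.int8) - 1  # i.i.d. +1/-1 with probability 1/2
means = signs.mean(axis=1)
empirical = np.mean(np.abs(means) > eps)
bernstein = 2 * np.exp(-n * eps**2 / (2 * (1 + eps / 3)))
print(f"Pr(|mean| > {eps}) ~ {empirical:.5f} <= Bernstein bound {bernstein:.5f}")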
In probability theory, a Chernoff bound is an exponentially decreasing upper bound on the tail of a random variable based on its moment generating function. The minimum of all such exponential bounds forms the Chernoff or Chernoff-Cramér bound, which may decay faster than exponential (e.g. sub-Gaussian). It is especially useful for sums of independent random variables, such as sums of Bernoulli random variables. The bound is commonly named after Herman Chernoff who described the method in a 1952 paper, though Chernoff himself attributed it to Herman Rubin.