Summary
In mathematics, Jensen's inequality, named after the Danish mathematician Johan Jensen, relates the value of a convex function of an integral to the integral of the convex function. It was proved by Jensen in 1906, building on an earlier proof of the same inequality for twice-differentiable functions by Otto Hölder in 1889. Given its generality, the inequality appears in many forms depending on the context, some of which are presented below. In its simplest form, the inequality states that the convex transformation of a mean is less than or equal to the mean applied after convex transformation; as a simple corollary, the opposite holds for concave transformations.
Jensen's inequality generalizes the statement that the secant line of a convex function lies above the graph of the function, which is Jensen's inequality for two points: the secant line consists of weighted means of the values of the convex function, $t f(x_1) + (1-t) f(x_2)$ for $t \in [0,1]$, while the graph of the function is the convex function of the weighted means, $f(t x_1 + (1-t) x_2)$. Thus, Jensen's inequality for two points is
$$f(t x_1 + (1-t) x_2) \le t f(x_1) + (1-t) f(x_2).$$
In the context of probability theory, it is generally stated in the following form: if $X$ is a random variable and $\varphi$ is a convex function, then
$$\varphi(\operatorname{E}[X]) \le \operatorname{E}[\varphi(X)].$$
The difference between the two sides of the inequality, $\operatorname{E}[\varphi(X)] - \varphi(\operatorname{E}[X])$, is called the Jensen gap.
The classical form of Jensen's inequality involves several numbers and weights. The inequality can be stated quite generally using either the language of measure theory or (equivalently) probability. In the probabilistic setting, the inequality can be further generalized to its full strength. For a real convex function $\varphi$, numbers $x_1, x_2, \ldots, x_n$ in its domain, and positive weights $a_i$, Jensen's inequality can be stated as
$$\varphi\left(\frac{\sum_{i=1}^{n} a_i x_i}{\sum_{i=1}^{n} a_i}\right) \le \frac{\sum_{i=1}^{n} a_i \varphi(x_i)}{\sum_{i=1}^{n} a_i},$$
and the inequality is reversed if $\varphi$ is concave:
$$\varphi\left(\frac{\sum_{i=1}^{n} a_i x_i}{\sum_{i=1}^{n} a_i}\right) \ge \frac{\sum_{i=1}^{n} a_i \varphi(x_i)}{\sum_{i=1}^{n} a_i}.$$
Equality holds if and only if $x_1 = x_2 = \cdots = x_n$ or $\varphi$ is linear on a domain containing $x_1, x_2, \ldots, x_n$.
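As a rough numerical illustration of the finite (weighted) form, the Python sketch below evaluates both sides of the inequality for one convex and one concave function. The helper name `jensen_sides`, the sample points, and the weights are hypothetical choices made for this example and do not come from the source. With the weights normalized to probabilities, the difference between the two sides is the Jensen gap of the corresponding discrete random variable.

```python
import math


def jensen_sides(phi, xs, weights):
    """Return (phi(weighted mean of xs), weighted mean of phi(xs)).

    Illustrative helper (not from the source). Weights must be positive,
    matching the hypotheses of the finite form of Jensen's inequality.
    """
    if not xs or len(xs) != len(weights):
        raise ValueError("xs and weights must be non-empty and of equal length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, xs)) / total
    lhs = phi(mean)                                             # phi of the mean
    rhs = sum(w * phi(x) for w, x in zip(weights, xs)) / total  # mean of phi
    return lhs, rhs


def square(x):
    """A convex function, chosen for this example."""
    return x * x


# Hypothetical sample points and weights.
xs = [1.0, 2.0, 4.0]
weights = [1.0, 2.0, 1.0]

# Convex case: phi(mean) <= mean of phi.
lhs, rhs = jensen_sides(square, xs, weights)
gap = rhs - lhs  # Jensen gap for the discrete distribution weights / sum(weights)
print(lhs, rhs, gap)  # prints: 5.0625 6.25 1.1875

# Concave case: the inequality is reversed for log.
log_lhs, log_rhs = jensen_sides(math.log, xs, weights)
print(log_lhs >= log_rhs)  # prints: True
```

For φ(x) = x² the gap computed here coincides with the variance of the weighted sample, which is one way to see why it cannot be negative.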
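As a particular case, if the weights $a_i$ are all equal, the convex and concave forms of the inequality become
$$\varphi\left(\frac{\sum_{i=1}^{n} x_i}{n}\right) \le \frac{\sum_{i=1}^{n} \varphi(x_i)}{n} \quad\text{and}\quad \varphi\left(\frac{\sum_{i=1}^{n} x_i}{n}\right) \ge \frac{\sum_{i=1}^{n} \varphi(x_i)}{n},$$
respectively. For instance, the function $\log(x)$ is concave, so substituting $\varphi(x) = \log(x)$ into the concave form establishes the (logarithm of the) familiar arithmetic-mean/geometric-mean inequality:
$$\log\left(\frac{\sum_{i=1}^{n} x_i}{n}\right) \ge \frac{\sum_{i=1}^{n} \log(x_i)}{n}, \quad\text{that is,}\quad \frac{x_1 + x_2 + \cdots + x_n}{n} \ge \sqrt[n]{x_1 x_2 \cdots x_n}.$$
A common application has $x$ as a function of another variable (or set of variables) $t$, that is, $x_i = g(t_i)$.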
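The step from the concave form to the arithmetic-mean/geometric-mean inequality can be checked numerically. The sketch below is a minimal illustration on hypothetical positive sample values; the function names are chosen for this example. It computes the geometric mean as the exponential of the mean logarithm, mirroring the derivation above.

```python
import math


def arithmetic_mean(values):
    """Plain average of the values."""
    return sum(values) / len(values)


def geometric_mean(values):
    """Geometric mean computed as exp(mean of logs); requires positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("values must be positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical positive sample values.
samples = [1.0, 4.0, 16.0]

am = arithmetic_mean(samples)  # (1 + 4 + 16) / 3 = 7.0
gm = geometric_mean(samples)   # cube root of 64, approximately 4.0

# Jensen's inequality for the concave log: log(mean) >= mean(log).
mean_of_logs = sum(math.log(v) for v in samples) / len(samples)
print(math.log(am) >= mean_of_logs)  # prints: True

# Exponentiating both sides gives AM >= GM.
print(am >= gm)  # prints: True
```

Because the exponential is increasing, applying it to both sides preserves the direction of the inequality, which is the only extra step the derivation needs.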
Related courses (6)
MATH-476: Optimal transport
The first part is devoted to Monge and Kantorovitch problems, discussing the existence and the properties of the optimal plan. The second part introduces the Wasserstein distance on measures and devel
EE-556: Mathematics of data: from theory to computation
This course provides an overview of key advances in continuous optimization and statistical analysis for machine learning. We review recent learning formulations and models as well as their guarantees
COM-406: Foundations of Data Science
We discuss a set of topics that are important for the understanding of modern data science but that are typically not taught in an introductory ML course. In particular we discuss fundamental ideas an