In probability theory, an $f$-divergence is a function $D_f(P \parallel Q)$ that measures the difference between two probability distributions $P$ and $Q$. Many common divergences, such as KL-divergence, Hellinger distance, and total variation distance, are special cases of $f$-divergence.
These divergences were introduced by Alfréd Rényi in the same paper where he introduced the well-known Rényi entropy. He proved that these divergences decrease in Markov processes. $f$-divergences were studied further, independently, by Csiszár, by Morimoto, and by Ali and Silvey, and are sometimes known as Csiszár $f$-divergences, Csiszár–Morimoto divergences, or Ali–Silvey distances.
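Let $P$ and $Q$ be two probability distributions over a space $\Omega$ such that $P \ll Q$, that is, $P$ is absolutely continuous with respect to $Q$. Then, for a convex function $f : [0, +\infty) \to (-\infty, +\infty]$ such that $f(x)$ is finite for all $x > 0$, $f(1) = 0$, and $f(0) = \lim_{t \to 0^+} f(t)$ (which could be infinite), the $f$-divergence of $P$ from $Q$ is defined as
$$D_f(P \parallel Q) \equiv \int_\Omega f\left(\frac{dP}{dQ}\right) dQ.$$
We call $f$ the generator of $D_f$.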
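In concrete applications, there is usually a reference distribution $\mu$ on $\Omega$ (for example, when $\Omega = \mathbb{R}^n$, the reference distribution is the Lebesgue measure) such that $P, Q \ll \mu$. We can then use the Radon–Nikodym theorem to take their probability densities $p$ and $q$, giving
$$D_f(P \parallel Q) = \int_\Omega f\left(\frac{p(x)}{q(x)}\right) q(x) \, d\mu(x).$$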
When there is no such reference distribution ready at hand, we can simply define $\mu = P + Q$ and proceed as above. This is a useful technique in more abstract proofs.
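For a finite sample space with the counting measure as reference, the density form of the definition reduces to a sum of $q_i f(p_i / q_i)$. The following is a minimal sketch under that assumption; the function names and the example vectors `p` and `q` are illustrative and not taken from the source. The squared Hellinger generator uses the convention without a factor of 1/2.

```python
import math

def f_divergence(p, q, f):
    """Sum of q_i * f(p_i / q_i) over a finite space.

    Assumes P << Q, i.e. q_i > 0 wherever p_i > 0.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if qi == 0.0:
            if pi > 0.0:
                raise ValueError("P is not absolutely continuous w.r.t. Q")
            continue  # points with p_i = q_i = 0 contribute nothing
        total += qi * f(pi / qi)
    return total

def kl(t):
    # Generator t*log(t), with f(0) = lim_{t->0+} f(t) = 0.
    return t * math.log(t) if t > 0.0 else 0.0

def total_variation(t):
    # Generator |t - 1| / 2.
    return 0.5 * abs(t - 1.0)

def squared_hellinger(t):
    # Generator (1 - sqrt(t))^2 (one common convention).
    return (1.0 - math.sqrt(t)) ** 2

# Illustrative distributions on three points; P << Q holds here.
p = [0.5, 0.5, 0.0]
q = [0.25, 0.25, 0.5]

kl_value = f_divergence(p, q, kl)                 # equals log(2)
tv_value = f_divergence(p, q, total_variation)    # equals 0.5
h2_value = f_divergence(p, q, squared_hellinger)  # equals 2 - sqrt(2)
```

Swapping the generator is the only change needed to move between the three divergences, which is the sense in which they are special cases of one definition.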
The above definition can be extended to cases where $P \ll Q$ is no longer satisfied (Definition 7.1 of ).
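Since $f$ is convex and $f(1) = 0$, the function $\frac{f(x)}{x - 1}$ is nondecreasing, so the limit $f'(\infty) := \lim_{x \to \infty} f(x)/x$ exists, taking values in $(-\infty, +\infty]$.
Since for any $p(x) > 0$ we have $\lim_{q(x) \to 0} q(x) f\left(\frac{p(x)}{q(x)}\right) = p(x) f'(\infty)$, we can extend the $f$-divergence to the case $P \not\ll Q$.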
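On a finite space, the limit above suggests one way to write the extension: sum $q_i f(p_i / q_i)$ over points with $q_i > 0$, and add $f'(\infty)$ times the $P$-mass of the points where $q_i = 0$. The sketch below assumes that reading. The function name and example vectors are hypothetical, and the caller supplies $f'(\infty)$ by hand for each generator.

```python
import math

def f_divergence_ext(p, q, f, f_prime_inf):
    """Extended f-divergence on a finite space (illustrative sketch).

    f_prime_inf is lim_{x -> inf} f(x) / x for the generator f,
    supplied by the caller; it may be math.inf.
    """
    total = 0.0
    singular_mass = 0.0  # P-mass sitting where Q has no mass
    for pi, qi in zip(p, q):
        if qi > 0.0:
            total += qi * f(pi / qi)
        else:
            singular_mass += pi
    # Only add the singular term when there is singular mass,
    # which avoids evaluating inf * 0.
    if singular_mass > 0.0:
        total += f_prime_inf * singular_mass
    return total

def total_variation(t):
    return 0.5 * abs(t - 1.0)   # f'(inf) = 1/2

def kl(t):
    return t * math.log(t) if t > 0.0 else 0.0   # f'(inf) = +inf

# Q puts no mass on the second point, so P << Q fails.
p = [0.5, 0.5]
q = [1.0, 0.0]

tv_value = f_divergence_ext(p, q, total_variation, 0.5)   # equals 0.5
kl_value = f_divergence_ext(p, q, kl, math.inf)           # equals +inf
```

Total variation stays finite because its generator grows linearly, while the KL generator grows superlinearly and the divergence becomes infinite as soon as $P$ has mass outside the support of $Q$.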
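Linearity: $D_{\sum_i a_i f_i} = \sum_i a_i D_{f_i}$, given a finite sequence of nonnegative real numbers $a_i$ and generators $f_i$.
$D_f = D_g$ iff $f(x) = g(x) + c(x - 1)$ for some $c \in \mathbb{R}$.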
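Both relations can be checked numerically on a small example. The sketch below assumes a finite space with strictly positive probabilities; the distributions, weights, and the constant `c` are arbitrary illustrative choices. The second check works because $\sum_i q_i (p_i / q_i - 1) = \sum_i (p_i - q_i) = 0$.

```python
import math

def f_divergence(p, q, f):
    # Finite-space f-divergence; assumes every q_i > 0.
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

def kl(t):
    return t * math.log(t)

def chi_squared(t):
    return (t - 1.0) ** 2

p = [0.2, 0.3, 0.5]
q = [0.4, 0.4, 0.2]

# Linearity: the divergence of a nonnegative combination of generators
# equals the same combination of the individual divergences.
a, b = 2.0, 0.5
combined = lambda t: a * kl(t) + b * chi_squared(t)
lhs = f_divergence(p, q, combined)
rhs = a * f_divergence(p, q, kl) + b * f_divergence(p, q, chi_squared)

# Adding c * (t - 1) to a generator leaves the divergence unchanged.
c = 3.0
shifted = lambda t: kl(t) + c * (t - 1.0)
d_original = f_divergence(p, q, kl)
d_shifted = f_divergence(p, q, shifted)
```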
In particular, the monotonicity implies that if a Markov process has a positive equilibrium probability distribution $P^*$, then $D_f(P(t) \parallel P^*)$ is a monotonic (non-increasing) function of time $t$, where the probability distribution $P(t)$ is a solution of the Kolmogorov forward equations (or master equation), used to describe the time evolution of the probability distribution in the Markov process. This means that all $f$-divergences are Lyapunov functions of the Kolmogorov forward equations.
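The Lyapunov property can be illustrated on a small chain. The sketch below is an assumption-laden toy, not the continuous-time solution itself: it uses a hypothetical 3-state rate matrix `G` with equilibrium $(0.25, 0.5, 0.25)$, and integrates the master equation $dP/dt = P G$ with explicit Euler steps. The step size is chosen so that $I + \Delta t \, G$ is a stochastic matrix, which makes each step a Markov transition that preserves the equilibrium.

```python
import math

def f_divergence(p, q, f):
    # Finite-space f-divergence; assumes every q_i > 0.
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

def kl(t):
    # Generator t*log(t), with the convention f(0) = 0.
    return t * math.log(t) if t > 0.0 else 0.0

def euler_step(p, G, dt):
    # One explicit Euler step of dP/dt = P G (P is a row vector).
    n = len(p)
    return [p[j] + dt * sum(p[i] * G[i][j] for i in range(n))
            for j in range(n)]

# Hypothetical rate matrix: rows sum to zero, off-diagonals are rates.
G = [[-1.0,  1.0,  0.0],
     [ 0.5, -1.5,  1.0],
     [ 0.0,  2.0, -2.0]]
equilibrium = [0.25, 0.5, 0.25]  # satisfies equilibrium * G = 0

dt = 0.1             # small enough that I + dt*G has nonnegative entries
p = [1.0, 0.0, 0.0]  # start concentrated on the first state

values = [f_divergence(p, equilibrium, kl)]
for _ in range(50):
    p = euler_step(p, G, dt)
    values.append(f_divergence(p, equilibrium, kl))

# Up to floating-point tolerance, the divergence never increases.
non_increasing = all(values[k + 1] <= values[k] + 1e-12
                     for k in range(len(values) - 1))
```

Replacing `kl` with another generator, such as the total variation generator, should give a different curve with the same non-increasing behaviour.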