In probability theory, an $f$-divergence is a function $D_f(P \| Q)$ that measures the difference between two probability distributions $P$ and $Q$. Many common divergences, such as the KL-divergence, Hellinger distance, and total variation distance, are special cases of $f$-divergence.

These divergences were introduced by Alfréd Rényi in the same paper where he introduced the well-known Rényi entropy. He proved that these divergences decrease in Markov processes. $f$-divergences were studied further independently by Csiszár, Morimoto, and Ali & Silvey, and are sometimes known as Csiszár $f$-divergences, Csiszár–Morimoto divergences, or Ali–Silvey distances.

Let $P$ and $Q$ be two probability distributions over a space $\Omega$, such that $P \ll Q$, that is, $P$ is absolutely continuous with respect to $Q$. Then, for a convex function $f : [0, +\infty) \to (-\infty, +\infty]$ such that $f(x)$ is finite for all $x > 0$, $f(1) = 0$, and $f(0) = \lim_{t \to 0^+} f(t)$ (which could be infinite), the $f$-divergence of $P$ from $Q$ is defined as

$$D_f(P \| Q) = \int_\Omega f\!\left(\frac{dP}{dQ}\right) dQ.$$

We call $f$ the generator of $D_f$.

In concrete applications, there is usually a reference distribution $\mu$ on $\Omega$ (for example, when $\Omega = \mathbb{R}^n$, the reference distribution is the Lebesgue measure) such that $P, Q \ll \mu$. We can then use the Radon–Nikodym theorem to take their probability densities $p$ and $q$, giving

$$D_f(P \| Q) = \int_\Omega f\!\left(\frac{p(x)}{q(x)}\right) q(x) \, d\mu(x).$$

When there is no such reference distribution ready at hand, we can simply define $\mu = P + Q$ and proceed as above. This is a useful technique in more abstract proofs.

The above definition can be extended to cases where $P \ll Q$ is no longer satisfied (Definition 7.1 of ). Since $f$ is convex and $f(1) = 0$, the function $x \mapsto f(x)/(x-1)$ must be nondecreasing, so the limit $f'(\infty) := \lim_{x \to +\infty} f(x)/x$ exists, taking value in $(-\infty, +\infty]$. Since, for any $p(x) > 0$, we have $\lim_{q(x) \to 0^+} q(x) f\!\left(\frac{p(x)}{q(x)}\right) = p(x) f'(\infty)$, we can extend the $f$-divergence to the case $P \not\ll Q$:

$$D_f(P \| Q) = \int_{q > 0} q(x) f\!\left(\frac{p(x)}{q(x)}\right) d\mu(x) + f'(\infty) \, P[q = 0].$$

Basic properties include the following.

Linearity: $D_{\sum_i c_i f_i} = \sum_i c_i D_{f_i}$ for any finite sequence of nonnegative real numbers $c_i$ and generators $f_i$.

$D_f = D_g$ if and only if $f(x) = g(x) + c(x - 1)$ for some $c \in \mathbb{R}$.

Monotonicity (data-processing inequality): applying the same Markov kernel (stochastic transition) to both $P$ and $Q$ cannot increase $D_f(P \| Q)$. In particular, the monotonicity implies that if a Markov process has a positive equilibrium probability distribution $P^*$, then $D_f(P(t) \| P^*)$ is a monotonic (non-increasing) function of the time $t$, where the probability distribution $P(t)$ is a solution of the Kolmogorov forward equations (or Master equation), used to describe the time evolution of the probability distribution in the Markov process. This means that all $f$-divergences $D_f(P(t) \| P^*)$ are Lyapunov functions of the Kolmogorov forward equations.
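To make the density formula above concrete, here is a minimal numerical sketch for discrete distributions, where the reference measure is the counting measure and the densities are just probability vectors. It assumes strictly positive $q$; the function names and the example vectors are illustrative, not from the original text. The generators shown recover the KL divergence, the total variation distance, and (in one common convention) the squared Hellinger distance.

```python
import numpy as np

def f_divergence(p, q, f):
    """D_f(P || Q) = sum_x q(x) * f(p(x) / q(x)) for discrete p, q with q > 0 everywhere."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(q * f(p / q)))

# Generators of some classical special cases (convex, with f(1) = 0):
kl_gen  = lambda t: t * np.log(t)           # Kullback-Leibler divergence
tv_gen  = lambda t: 0.5 * np.abs(t - 1)     # total variation distance
hel_gen = lambda t: (np.sqrt(t) - 1) ** 2   # squared Hellinger distance (one common convention)

p = np.array([0.2, 0.5, 0.3])  # illustrative distributions
q = np.array([0.4, 0.4, 0.2])

print(f_divergence(p, q, kl_gen))   # equals sum p * log(p / q)
print(f_divergence(p, q, tv_gen))   # equals 0.5 * sum |p - q|
print(f_divergence(p, q, hel_gen))  # equals sum (sqrt(p) - sqrt(q))**2
```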
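The Lyapunov-function statement above concerns the continuous-time Kolmogorov forward equations. As a rough numerical sanity check of the analogous discrete-time behaviour, the sketch below iterates a Markov chain and watches the KL divergence to the stationary distribution shrink; each application of the transition matrix is a data-processing step, so the printed values are non-increasing. The transition matrix, its stationary distribution, and the initial distribution are arbitrary illustrative choices.

```python
import numpy as np

# Illustrative 2-state transition matrix (rows sum to 1) and its stationary distribution.
K  = np.array([[0.9, 0.1],
               [0.2, 0.8]])
pi = np.array([2/3, 1/3])            # satisfies pi @ K == pi

def kl(p, q):
    """KL divergence, the f-divergence generated by f(t) = t * log(t)."""
    return float(np.sum(p * np.log(p / q)))

p = np.array([0.05, 0.95])           # arbitrary initial distribution
for step in range(10):
    print(step, kl(p, pi))           # non-increasing in step
    p = p @ K                        # one step of the discrete-time chain
```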