Résumé
In probability theory, Dirichlet processes (after the distribution associated with Peter Gustav Lejeune Dirichlet) are a family of stochastic processes whose realizations are probability distributions. In other words, a Dirichlet process is a probability distribution whose range is itself a set of probability distributions. It is often used in Bayesian inference to describe the prior knowledge about the distribution of random variables—how likely it is that the random variables are distributed according to one or another particular distribution. As an example, a bag of 100 real-world dice is a random probability mass function (random pmf)—to sample this random pmf you put your hand in the bag and draw out a die, that is, you draw a pmf. A bag of dice manufactured using a crude process 100 years ago will likely have probabilities that deviate wildly from the uniform pmf, whereas a bag of state-of-the-art dice used by Las Vegas casinos may have barely perceptible imperfections. We can model the randomness of pmfs with the Dirichlet distribution. The Dirichlet process is specified by a base distribution and a positive real number called the concentration parameter (also known as scaling parameter). The base distribution is the expected value of the process, i.e., the Dirichlet process draws distributions "around" the base distribution the way a normal distribution draws real numbers around its mean. However, even if the base distribution is continuous, the distributions drawn from the Dirichlet process are almost surely discrete. The scaling parameter specifies how strong this discretization is: in the limit of , the realizations are all concentrated at a single value, while in the limit of the realizations become continuous. Between the two extremes the realizations are discrete distributions with less and less concentration as increases. The Dirichlet process can also be seen as the infinite-dimensional generalization of the Dirichlet distribution.
À propos de ce résultat
Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.
Cours associés (2)
DH-406: Machine learning for DH
This course aims to introduce the basic principles of machine learning in the context of the digital humanities. We will cover both supervised and unsupervised learning techniques, and study and imple
MATH-305: Introduction to partial differential equations
This is an introductory course on Elliptic Partial Differential Equations. The course will cover the theory of both classical and generalized (weak) solutions of elliptic PDEs.
Publications associées (35)
Unités associées (1)
Concepts associés (6)
Dirichlet-multinomial distribution
In probability theory and statistics, the Dirichlet-multinomial distribution is a family of discrete multivariate probability distributions on a finite support of non-negative integers. It is also called the Dirichlet compound multinomial distribution (DCM) or multivariate Pólya distribution (after George Pólya). It is a compound probability distribution, where a probability vector p is drawn from a Dirichlet distribution with parameter vector , and an observation drawn from a multinomial distribution with probability vector p and number of trials n.
Échantillonnage de Gibbs
L' est une méthode MCMC. Étant donné une distribution de probabilité sur un univers , cet algorithme définit une chaîne de Markov dont la distribution stationnaire est . Il permet ainsi de tirer aléatoirement un élément de selon la loi (on parle d'échantillonnage). Comme pour toutes les méthodes de Monte-Carlo à chaîne de Markov, on se place dans un espace vectoriel Ɛ de dimension finie n ; on veut générer aléatoirement N vecteurs x(i) suivant une distribution de probabilité π ; pour simplifier le problème, on détermine une distribution qx(i) permettant de générer aléatoirement x(i + 1) à partir de x(i).
Modèle de mélange
In statistics, a mixture model is a probabilistic model for representing the presence of subpopulations within an overall population, without requiring that an observed data set should identify the sub-population to which an individual observation belongs. Formally a mixture model corresponds to the mixture distribution that represents the probability distribution of observations in the overall population.
Afficher plus