In probability and statistics, the Dirichlet distribution (after Peter Gustav Lejeune Dirichlet), often denoted Dir(α), is a family of continuous multivariate probability distributions parameterized by a vector α of positive reals. It is a multivariate generalization of the beta distribution, hence its alternative name of multivariate beta distribution (MBD). Dirichlet distributions are commonly used as prior distributions in Bayesian statistics; in fact, the Dirichlet distribution is the conjugate prior of the categorical distribution and the multinomial distribution. The infinite-dimensional generalization of the Dirichlet distribution is the Dirichlet process.

The Dirichlet distribution of order K ≥ 2 with parameters α_1, ..., α_K > 0 has a probability density function with respect to Lebesgue measure on the Euclidean space R^(K−1) given by

  f(x_1, ..., x_K; α_1, ..., α_K) = (1 / B(α)) ∏_{i=1}^{K} x_i^{α_i − 1},

where x_1, ..., x_K belong to the standard (K − 1)-simplex, or in other words:

  x_1, ..., x_K > 0 and x_1 + ... + x_K = 1.

The normalizing constant B(α) is the multivariate beta function, which can be expressed in terms of the gamma function:

  B(α) = (∏_{i=1}^{K} Γ(α_i)) / Γ(∑_{i=1}^{K} α_i),  α = (α_1, ..., α_K).

The support of the Dirichlet distribution is the set of K-dimensional vectors x whose entries are real numbers in the interval [0,1] such that x_1 + ... + x_K = 1, i.e. the sum of the coordinates is equal to 1. These can be viewed as the probabilities of a K-way categorical event. Another way to express this is that the domain of the Dirichlet distribution is itself a set of probability distributions, specifically the set of K-dimensional discrete distributions. The technical term for the set of points in the support of a K-dimensional Dirichlet distribution is the open standard (K − 1)-simplex, a generalization of a triangle embedded in the next-higher dimension. For example, with K = 3, the support is an equilateral triangle embedded in a downward-angle fashion in three-dimensional space, with vertices at (1,0,0), (0,1,0) and (0,0,1), i.e. touching each of the coordinate axes at a point 1 unit away from the origin.
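The conjugate-prior relationship can be sketched in a few lines of Python (the helper names below are illustrative, not from any particular library): observing multinomial counts n_1, ..., n_K under a Dirichlet(α) prior yields a Dirichlet posterior whose parameters are simply α_i + n_i.

```python
def dirichlet_posterior(alpha, counts):
    """Conjugate update: a Dirichlet(alpha) prior on category probabilities,
    combined with observed multinomial counts, yields a
    Dirichlet(alpha_1 + n_1, ..., alpha_K + n_K) posterior."""
    return [a + n for a, n in zip(alpha, counts)]

def dirichlet_mean(alpha):
    """Mean of Dirichlet(alpha): E[x_i] = alpha_i / sum(alpha)."""
    total = sum(alpha)
    return [a / total for a in alpha]

# Uniform prior over 3 categories, then 10 observed trials with counts (2, 5, 3):
post = dirichlet_posterior([1.0, 1.0, 1.0], [2, 5, 3])
print(post)                 # [3.0, 6.0, 4.0]
print(dirichlet_mean(post)) # category 2 is now the most probable on average
```

The posterior mean interpolates between the prior guess and the empirical frequencies, which is one reason the Dirichlet is so convenient as a prior for categorical data.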
A common special case is the symmetric Dirichlet distribution, where all of the elements making up the parameter vector have the same value α, often called the concentration parameter.
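A minimal sketch of sampling and density evaluation, using only the Python standard library (function names are illustrative): a Dirichlet(α) draw can be obtained by normalizing independent Gamma(α_i, 1) draws, and the density follows the formula above.

```python
import math
import random

def sample_dirichlet(alpha, rng=random):
    """Draw one sample from Dirichlet(alpha) by normalizing Gamma(a_i, 1) draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]

def dirichlet_pdf(x, alpha):
    """Density of Dirichlet(alpha) at a point x in the open simplex."""
    # Multivariate beta function: B(alpha) = prod Gamma(a_i) / Gamma(sum a_i)
    log_b = sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))
    log_p = sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x)) - log_b
    return math.exp(log_p)

x = sample_dirichlet([2.0, 3.0, 4.0])
assert abs(sum(x) - 1.0) < 1e-9      # the sample lies on the simplex
# The symmetric Dirichlet with alpha = 1 is uniform on the simplex;
# for K = 3 its constant density is 1/B(1,1,1) = Gamma(3) = 2.
print(dirichlet_pdf([0.2, 0.3, 0.5], [1.0, 1.0, 1.0]))  # ≈ 2.0
```

Working in log space via `math.lgamma` avoids overflow for large parameter values; this is the standard trick rather than computing the gamma functions directly.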

About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Related courses (22)
DH-406: Machine learning for DH
This course aims to introduce the basic principles of machine learning in the context of the digital humanities. We will cover both supervised and unsupervised learning techniques, and study and imple ...
FIN-417: Quantitative risk management
This course is an introduction to quantitative risk management that covers standard statistical methods, multivariate risk factor models, non-linear dependence structures (copula models), as well as p ...
MATH-231: Probability and statistics I
Introduction to notions of probability and basic statistics.
Related lectures (54)
Review of Probability
Covers the review of probability concepts including Poisson distribution and moment generating functions.
Counting: Bit Strings and Committees
Explores counting bit strings, committees, balls distribution, poker hands, and coefficients, with examples of the Pigeonhole Principle and card selection.
Fundamental Limits of Gradient-Based Learning
Delves into the fundamental limits of gradient-based learning on neural networks, covering topics such as binomial theorem, exponential series, and moment-generating functions.
Related publications (123)

Network-based kinetic models: Emergence of a statistical description of the graph topology

Matteo Raviola

In this paper, we propose a novel approach that employs kinetic equations to describe the collective dynamics emerging from graph-mediated pairwise interactions in multi-agent systems. We formally show that for large graphs and specific classes of interact ...
Cambridge, 2024

Convergence and nonconvergence of scaled self-interacting random walks to Brownian motion perturbed at extrema

Thomas Mountford

We use generalized Ray-Knight theorems, introduced by B. Toth in 1996, together with techniques developed for excited random walks as main tools for establishing positive and negative results concerning convergence of some classes of diffusively scaled sel ...
Cleveland, 2023

Parabolic stochastic PDEs on bounded domains with rough initial conditions: moment and correlation bounds

Le Chen, Cheuk Yin Lee, David Jean-Michel Candil

We consider nonlinear parabolic stochastic PDEs on a bounded Lipschitz domain driven by a Gaussian noise that is white in time and colored in space, with Dirichlet or Neumann boundary condition. We establish existence, uniqueness and moment bounds of the r ...
New York, 2023
Related concepts (21)
Categorical distribution
In probability theory and statistics, a categorical distribution (also called a generalized Bernoulli distribution or multinoulli distribution) is a discrete probability distribution that describes the possible results of a random variable that can take on one of K possible categories, with the probability of each category separately specified. There is no innate underlying ordering of these outcomes, but numerical labels are often attached for convenience in describing the distribution (e.g. 1 to K).
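Sampling from a categorical distribution can be sketched with inverse-CDF sampling over the category probabilities (the function name is illustrative): draw a uniform number and return the first category whose cumulative probability exceeds it.

```python
import random

def sample_categorical(probs, rng=random):
    """Sample an index 0..K-1 with the given category probabilities
    (assumed non-negative and summing to 1)."""
    u = rng.random()          # uniform draw in [0, 1)
    cum = 0.0
    for k, p in enumerate(probs):
        cum += p
        if u < cum:
            return k
    return len(probs) - 1     # guard against floating-point round-off

# Empirical frequencies should track the specified probabilities:
counts = [0, 0, 0]
for _ in range(10_000):
    counts[sample_categorical([0.2, 0.3, 0.5])] += 1
print(counts)  # roughly [2000, 3000, 5000]
```

This linear scan is O(K) per draw; for repeated sampling from a fixed distribution, precomputing the cumulative sums and binary-searching (or using the alias method) is the usual optimization.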
Gibbs sampling
In statistics, Gibbs sampling or a Gibbs sampler is a Markov chain Monte Carlo (MCMC) algorithm for obtaining a sequence of observations which are approximated from a specified multivariate probability distribution, when direct sampling is difficult. This sequence can be used to approximate the joint distribution (e.g., to generate a histogram of the distribution); to approximate the marginal distribution of one of the variables, or some subset of the variables (for example, the unknown parameters or latent variables); or to compute an integral (such as the expected value of one of the variables).
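As an illustrative toy sketch of the idea (not drawn from any course material here): a Gibbs sampler alternates draws from each variable's full conditional distribution. For a standard bivariate normal with correlation ρ, both conditionals are univariate normals, x | y ~ N(ρy, 1 − ρ²) and y | x ~ N(ρx, 1 − ρ²), so each step is a direct draw.

```python
import random

def gibbs_bivariate_normal(rho, n_steps, rng=random):
    """Gibbs sampler for a standard bivariate normal with correlation rho.
    Each full conditional is a univariate normal, so every step is exact."""
    sd = (1.0 - rho * rho) ** 0.5
    x, y = 0.0, 0.0
    samples = []
    for _ in range(n_steps):
        x = rng.gauss(rho * y, sd)   # draw x | y
        y = rng.gauss(rho * x, sd)   # draw y | x
        samples.append((x, y))
    return samples

chain = gibbs_bivariate_normal(0.5, 5000)
mean_x = sum(s[0] for s in chain) / len(chain)
print(mean_x)  # close to the true mean 0
```

In practice one discards an initial burn-in and remembers that successive Gibbs samples are correlated, so the effective sample size is smaller than the chain length.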
Multinomial distribution
In probability theory, the multinomial distribution is a generalization of the binomial distribution. For example, it models the probability of counts for each side of a k-sided die rolled n times. For n independent trials, each of which leads to a success for exactly one of k categories, with each category having a given fixed success probability, the multinomial distribution gives the probability of any particular combination of numbers of successes for the various categories.
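The multinomial probability mass function, P(n_1, ..., n_k) = n!/(n_1!···n_k!) ∏ p_i^{n_i}, can be sketched directly from this definition (the helper name is illustrative), again computed in log space for numerical stability:

```python
import math

def multinomial_pmf(counts, probs):
    """P(counts) for n = sum(counts) independent trials with category
    probabilities probs: n!/(n_1!...n_k!) * prod p_i^{n_i}."""
    n = sum(counts)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    log_p = log_coef + sum(c * math.log(p)
                           for c, p in zip(counts, probs) if c > 0)
    return math.exp(log_p)

# Sanity check against the binomial special case (k = 2):
# two fair coin flips, one head and one tail, has probability 0.5.
print(multinomial_pmf([1, 1], [0.5, 0.5]))  # 0.5
```

With k = 2 this reduces exactly to the binomial pmf, matching the statement that the multinomial generalizes the binomial.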