Summary
A prior probability distribution of an uncertain quantity, often simply called the prior, is its assumed probability distribution before some evidence is taken into account. For example, the prior could be the probability distribution representing the relative proportions of voters who will vote for a particular politician in a future election. The unknown quantity may be a parameter of the model or a latent variable rather than an observable variable. In Bayesian statistics, Bayes' rule prescribes how to update the prior with new information to obtain the posterior probability distribution, which is the conditional distribution of the uncertain quantity given the new data.
Historically, the choice of prior was often restricted to a family conjugate to the given likelihood function, because this yields a tractable posterior in the same family. The widespread availability of Markov chain Monte Carlo methods, however, has made this less of a concern.
There are many ways to construct a prior distribution. In some cases, a prior may be determined from past information, such as previous experiments. A prior can also be elicited from the purely subjective assessment of an experienced expert. When no information is available, an uninformative prior may be adopted, as justified by the principle of indifference. In modern applications, priors are also often chosen for their mechanical properties, such as regularization and feature selection.
The prior distributions of model parameters often depend on parameters of their own. Uncertainty about these hyperparameters can, in turn, be expressed as hyperprior probability distributions. For example, if one uses a beta distribution to model the distribution of the parameter p of a Bernoulli distribution, then p is a parameter of the underlying system (the Bernoulli distribution), while α and β are parameters of the prior (the beta distribution) and are therefore hyperparameters.
In principle, priors can be decomposed into many conditional levels of distributions, so-called hierarchical priors.
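The beta and Bernoulli example can be made concrete with a small sketch of a conjugate update. Under a Beta(α, β) prior on p, observing s successes and f failures gives a Beta(α + s, β + f) posterior, so the update reduces to adding counts to the hyperparameters. The function names `update_beta_prior` and `beta_mean`, the uniform Beta(1, 1) prior, and the data below are illustrative assumptions, not part of the source.

```python
from fractions import Fraction


def update_beta_prior(alpha, beta, observations):
    """Return the posterior hyperparameters (alpha, beta) of a beta prior
    after a sequence of Bernoulli observations (1 = success, 0 = failure)."""
    successes = sum(observations)
    failures = len(observations) - successes
    # Conjugacy: the posterior is again a beta distribution, with the
    # observed counts added to the prior hyperparameters.
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, kept exact as a fraction."""
    return Fraction(alpha, alpha + beta)


# Hypothetical setup: a uniform Beta(1, 1) prior and eight made-up trials.
prior = (1, 1)
data = [1, 0, 1, 1, 0, 1, 1, 1]  # 6 successes, 2 failures

posterior = update_beta_prior(*prior, data)
print(posterior)              # (7, 3)
print(beta_mean(*prior))      # 1/2
print(beta_mean(*posterior))  # 7/10
```

The posterior mean moves from the prior mean of 1/2 toward the observed success rate of 6/8. With a non-conjugate prior there is generally no such closed form, which is where methods like Markov chain Monte Carlo come in.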
Related courses (32)
MATH-232: Probability and statistics
A basic course in probability and statistics
MATH-131: Probability and statistics
The course presents the basic notions of probability theory and statistical inference. Emphasis is placed on the main concepts and the most commonly used methods.
CS-101: Advanced information, computation, communication I
Discrete mathematics is a discipline with applications to almost all areas of study. It provides a set of indispensable tools to computer science in particular. This course reviews (familiar) topics a
Related concepts (38)
Principle of maximum entropy
The principle of maximum entropy states that the probability distribution which best represents the current state of knowledge about a system is the one with largest entropy, in the context of precisely stated prior data (such as a proposition that expresses testable information). Another way of stating this: Take precisely stated prior data or testable information about a probability distribution function. Consider the set of all trial probability distributions that would encode the prior data.
Principle of indifference
The principle of indifference (also called principle of insufficient reason) is a rule for assigning epistemic probabilities. The principle of indifference states that in the absence of any relevant evidence, agents should distribute their credence (or 'degrees of belief') equally among all the possible outcomes under consideration. In Bayesian probability, this is the simplest non-informative prior.
Bayesian statistics
Bayesian statistics (/ˈbeɪziən/ or /ˈbeɪʒən/) is a theory in the field of statistics based on the Bayesian interpretation of probability where probability expresses a degree of belief in an event. The degree of belief may be based on prior knowledge about the event, such as the results of previous experiments, or on personal beliefs about the event. This differs from a number of other interpretations of probability, such as the frequentist interpretation that views probability as the limit of the relative frequency of an event after many trials.