
# Expectation propagation in the large data limit

## Abstract

Expectation propagation (EP) is a widely successful algorithm for variational inference. EP is an iterative algorithm used to approximate complicated distributions, typically to find a Gaussian approximation of posterior distributions. In many applications of this type, EP performs extremely well. Surprisingly, despite its widespread use, there are very few theoretical guarantees on Gaussian EP, and it is quite poorly understood. To analyse EP, we first introduce a variant of EP: averaged EP, which operates on a smaller parameter space. We then consider averaged EP and EP in the limit of infinite data, where the overall contribution of each likelihood term is small and where posteriors are almost Gaussian. In this limit, we prove that the iterations of both averaged EP and EP are simple: they behave like iterations of Newton's algorithm for finding the mode of a function. We use this limit behaviour to prove that EP is asymptotically exact, and to obtain other insights into the dynamic behaviour of EP, e.g. that it may diverge under poor initialization exactly like Newton's method. EP is a simple algorithm to state, but a difficult one to study. Our results should facilitate further research into the theoretical properties of this important method.
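The abstract's central claim is that, in the large-data limit, EP iterations behave like iterations of Newton's method for finding the mode of a function. The limiting object can be sketched with a minimal Newton mode-finder on a log-density; this is an illustrative sketch of Newton's method only, not the paper's EP algorithm, and the function name and toy target are assumptions for the example.

```python
def newton_mode(grad, hess, x0, n_iter=20):
    """Find the mode of a function by Newton's method on its gradient.

    grad, hess: callables returning the first and second derivative
    of the (log-)density whose mode we seek.
    """
    x = x0
    for _ in range(n_iter):
        # Newton step: move to the stationary point of the local
        # quadratic model of the objective around x.
        x = x - grad(x) / hess(x)
    return x

# Toy target: log-density of N(3, 1), i.e. -(x - 3)^2 / 2.
grad = lambda x: -(x - 3.0)
hess = lambda x: -1.0
print(newton_mode(grad, hess, x0=0.0))  # converges to 3.0
```

As with EP under poor initialization, this iteration can diverge when the local curvature estimate `hess(x)` is a bad model of the objective, which mirrors the divergence behaviour mentioned in the abstract.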


## Related concepts (14)

Data mining

Data mining is the process of extracting and discovering patterns in large data sets, involving methods at the intersection of machine learning, statistics, and database systems.

Algorithm

In mathematics and computer science, an algorithm (ˈælɡərɪðəm) is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation.

Normal distribution

In statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is f(x) = (1/(σ√(2π))) exp(−(x − μ)² / (2σ²)).

## Related publications (7)

Mixture models form the essential basis of data clustering within a statistical framework. Here, the estimation of the parameters of a mixture of Gaussian densities is considered. In this context, it is well known that the maximum likelihood approach is statistically ill-posed: the likelihood function is not bounded above, because of singularities at the boundary of the parameter domain. We show that such degeneracy can be avoided by penalizing the likelihood function with a suitable penalty function. Recently, the resulting penalized maximum likelihood estimator has been proved to be asymptotically well-behaved. Local maximization of the likelihood function can be performed by means of Green's modified EM algorithm: provided that an inverse gamma is chosen as the penalty function, the EM re-estimation equations remain explicit and automatically ensure that the estimates are not singular. Numerical examples in the finite-data case show the performance of the penalized estimator compared to the standard one. Our penalized approach is also compared to a constrained approach which, to the authors' knowledge, represents the only alternative solution to likelihood degeneracy. Our contribution mainly addresses the case of an independent, identically distributed mixture of Gaussian densities, but the more general case of dependent classes is also tackled, with particular reference to the important case of hidden Markov models.
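To illustrate why an inverse-gamma penalty keeps the EM variance update away from the singularities described above, here is a minimal one-dimensional sketch of a penalized M-step for a single mixture component. The function name, the hyperparameters `alpha` and `beta`, and the simplified single-component form are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def penalized_em_variance(x, resp, mu, alpha=1.0, beta=0.1):
    """Penalized M-step variance update for one mixture component.

    With an inverse-gamma(alpha, beta) penalty on the variance, the
    re-estimation equation stays explicit, and the numerator is at
    least 2*beta > 0, so the update never collapses to zero even if
    the component concentrates on a single data point.
    """
    nk = resp.sum()                       # effective count for the component
    s = np.sum(resp * (x - mu) ** 2)      # weighted sum of squared deviations
    return (s + 2.0 * beta) / (nk + 2.0 * (alpha + 1.0))

# Degenerate case: all responsibility on identical points (s = 0).
x = np.array([1.0, 1.0, 1.0])
resp = np.ones(3)
print(penalized_em_variance(x, resp, mu=1.0))  # strictly positive, not 0
```

Without the penalty (beta = 0, alpha = -1 in this parameterization) the same update would return exactly zero here, which is the likelihood degeneracy the abstract refers to.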

2003

We address the problem of non-supervised bidimensional segmentation, based upon a discrete-level unilateral Markov field model for the labels and conditionally Gaussian densities for the observed voxels. Such models have been shown to yield numerically efficient algorithms, for segmentation and for estimation of the model parameters as well. Our contribution is twofold. First, we deal with the degeneracy of the likelihood function with respect to the parameters of the Gaussian densities, which is a well-known problem for such mixture models. We introduce a bounded penalized likelihood function that has recently been shown to provide a consistent estimator in the simpler case of independent Gaussian mixtures. On the other hand, implementation with EM re-estimation formulas remains possible with only limited changes with respect to the standard case. Second, we propose a telegraphic parameterization of the unilateral Markov field. On a theoretical level, this parameterization ensures that some important properties of the field (e.g., stationarity) do hold. On a practical level, it reduces the computational complexity of the algorithm used in the segmentation and parameter estimation stages of the procedure. In addition, it decreases the number of model parameters that must be estimated, thereby improving the convergence speed and accuracy of the corresponding estimation method.

1997

This communication presents a non-supervised three-dimensional segmentation method based upon a discrete-level unilateral Markov field model for the labels and conditionally Gaussian densities for the observed voxels. Such models have been shown to yield numerically efficient algorithms, for segmentation and for estimation of the model parameters as well. Our contribution is twofold. First, we deal with the degeneracy of the likelihood function with respect to the parameters of the Gaussian densities, which is a well-known problem for such mixture models. We introduce a bounded penalized likelihood function that has recently been shown to provide a consistent estimator in the simpler cases of independent Gaussian mixtures. On the other hand, implementation with EM re-estimation formulas remains possible with only limited changes with respect to the standard case. Second, we propose a telegraphic parametrization of the unilateral Markov field. On a theoretical level, this parametrization ensures that some important properties of the field (e.g., stationarity) do hold. On a practical level, it reduces the computational complexity of the algorithm used in the segmentation and parameter estimation stages of the procedure. In addition, it decreases the number of model parameters that must be estimated, thereby improving convergence speed and accuracy of the corresponding estimation method.