
Publication

Stratégies de compromis pour l'estimation des paramètres de régression et pour la classification floue (Compromise strategies for the estimation of regression parameters and for fuzzy classification)

Abstract

Model specification is an integral part of any statistical inference problem. Several model selection techniques have been developed to determine which model is best among a list of candidates. Another way to address this question is so-called model averaging, and in particular its frequentist variant: an estimate of the parameters of interest is obtained by forming a weighted average of the estimates of these quantities under each candidate model. We develop compromise frequentist strategies for the estimation of regression parameters, as well as for the probabilistic clustering problem. In the regression context, we construct compromise strategies based on the Pitman estimators associated with various underlying error distributions. The weight given to each model is equal to its profile likelihood, which provides a measure of goodness of fit. Asymptotic properties of both the Pitman estimators and the profile likelihood allow us to define a minimax strategy for choosing the distributions entering the compromise, involving a notion of distance between distributions. The performance of these estimators is then compared with that of standard and robust procedures. In the second part of the thesis, we develop compromise strategies for probabilistic clustering. Although this clustering method is based on mixtures of distributions, our compromise strategies are applied not to the parameter estimates directly, but to the posterior probabilities of membership. Two types of compromise are presented, and the performance of the resulting classification rules is investigated.
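As a rough illustration of the frequentist model-averaging idea described in the abstract (a minimal sketch of likelihood-weighted averaging, not the thesis's actual Pitman-estimator construction), one can estimate a location parameter under two candidate error models and weight each estimate by its maximized likelihood:

```python
import math
import random

def normal_loglik(data, mu, sigma):
    # log-likelihood of i.i.d. N(mu, sigma^2) observations
    return sum(-0.5 * math.log(2 * math.pi * sigma**2)
               - (x - mu)**2 / (2 * sigma**2) for x in data)

def laplace_loglik(data, mu, b):
    # log-likelihood of i.i.d. Laplace(mu, b) observations
    return sum(-math.log(2 * b) - abs(x - mu) / b for x in data)

random.seed(0)
data = [random.gauss(5.0, 2.0) for _ in range(200)]
n = len(data)

# location MLE under each candidate error model
mu_normal = sum(data) / n              # sample mean (normal errors)
mu_laplace = sorted(data)[n // 2]      # sample median (Laplace errors)
sigma_hat = math.sqrt(sum((x - mu_normal) ** 2 for x in data) / n)
b_hat = sum(abs(x - mu_laplace) for x in data) / n

# weights proportional to each model's maximized (profile) likelihood
loglik = [normal_loglik(data, mu_normal, sigma_hat),
          laplace_loglik(data, mu_laplace, b_hat)]
top = max(loglik)
raw = [math.exp(l - top) for l in loglik]  # subtract max to avoid underflow
weights = [r / sum(raw) for r in raw]

# compromise estimate: likelihood-weighted average of the two estimates
mu_compromise = weights[0] * mu_normal + weights[1] * mu_laplace
print(round(mu_compromise, 3))
```

The better-fitting model dominates the average, so the compromise estimate falls between the two per-model estimates and leans toward the more plausible one.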


Related MOOCs (4)

Selected Topics on Discrete Choice

Discrete choice models are used extensively in many disciplines where it is important to predict human behavior at a disaggregate level. This course is a follow-up of the online course “Introduction t…”

Neuronal Dynamics - Computational Neuroscience of Single Neurons

The activity of neurons in the brain and the code used by these neurons is described by mathematical neuron models at different levels of detail.

Related concepts (22)

Estimator

In statistics, an estimator is a rule for calculating an estimate of a given quantity based on observed data: thus the rule (the estimator), the quantity of interest (the estimand) and its result (the estimate) are distinguished. For example, the sample mean is a commonly used estimator of the population mean. There are point and interval estimators. The point estimators yield single-valued results. This is in contrast to an interval estimator, where the result would be a range of plausible values.
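To make the point/interval distinction above concrete, here is a minimal sketch (with simulated data, not data from the publication): the sample mean as a point estimator of the population mean, plus a normal-approximation 95% confidence interval as an interval estimator.

```python
import math
import random

random.seed(1)
sample = [random.gauss(10.0, 3.0) for _ in range(500)]
n = len(sample)

# point estimator: the sample mean estimates the population mean
mean_hat = sum(sample) / n

# interval estimator: normal-approximation 95% confidence interval
sd_hat = math.sqrt(sum((x - mean_hat) ** 2 for x in sample) / (n - 1))
half_width = 1.96 * sd_hat / math.sqrt(n)
interval = (mean_hat - half_width, mean_hat + half_width)

print(round(mean_hat, 3), tuple(round(v, 3) for v in interval))
```

The point estimator returns a single value; the interval estimator returns a range of plausible values around it.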

Normal distribution

In statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is f(x) = (1 / (σ√(2π))) exp(−(x − μ)² / (2σ²)). The parameter μ is the mean or expectation of the distribution (and also its median and mode), while the parameter σ is its standard deviation. The variance of the distribution is σ². A random variable with a Gaussian distribution is said to be normally distributed and is called a normal deviate.
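The density can be evaluated directly; as a quick sanity check, at the mean the standard normal density equals 1/√(2π) ≈ 0.3989:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    # density of N(mu, sigma^2) at x
    coef = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coef * math.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2))

print(round(normal_pdf(0.0), 4))  # → 0.3989
```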

Likelihood function

In statistical inference, the likelihood function quantifies the plausibility of parameter values characterizing a statistical model in light of observed data. Its most typical use is to compare possible parameter values (under a fixed set of observations and a particular model): higher likelihood values are preferred because they indicate parameter values under which the observed data are more probable.

Related publications (16)
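As an illustration of this comparison (a sketch with simulated data, assuming a normal model with known σ = 1): the log-likelihood at a candidate value near the true mean exceeds the log-likelihood at a distant candidate value, under the same data and model.

```python
import math
import random

def log_likelihood(data, mu, sigma=1.0):
    # log-likelihood of i.i.d. N(mu, sigma^2) observations
    return sum(-0.5 * math.log(2 * math.pi * sigma**2)
               - (x - mu)**2 / (2 * sigma**2) for x in data)

random.seed(2)
data = [random.gauss(3.0, 1.0) for _ in range(100)]

# compare two candidate values of mu under the same data and model
ll_near = log_likelihood(data, 3.0)
ll_far = log_likelihood(data, 5.0)
print(ll_near > ll_far)  # → True
```

For i.i.d. normal data the log-likelihood is a downward parabola in μ, maximized at the sample mean, so candidates closer to the sample mean always score higher.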

Victor Panaretos, Laya Ghodrati

We present a framework for performing regression when both covariate and response are probability distributions on a compact interval. Our regression model is based on the theory of optimal transporta…

Victor Panaretos, Tomas Masák, Tomas Rubin

Nonparametric inference for functional data over two-dimensional domains entails additional computational and statistical challenges, compared to the one-dimensional case. Separability of the covarian…

A functional time series is a temporally ordered sequence of not necessarily independent random curves. While the statistical analysis of such data has been traditionally carried out under the assumptio…