Errors-in-variables models

Summary

In statistics, errors-in-variables models or measurement error models are regression models that account for measurement errors in the independent variables. In contrast, standard regression models assume that those regressors have been measured exactly, or observed without error; as such, those models account only for errors in the dependent variables, or responses.
When some regressors have been measured with error, estimation based on the standard assumption leads to inconsistent estimates: the parameter estimates do not tend to the true values even in very large samples. For simple linear regression the effect is an underestimate of the coefficient, known as attenuation bias. In non-linear models the direction of the bias is likely to be more complicated.
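The size of the attenuation can be made explicit. In a standard sketch of the classical errors-in-variables setting (the notation below is assumed for illustration, not defined on this page), if the true regressor has variance \sigma_{*}^{2} and the measurement error has variance \sigma_{\eta}^{2}, the OLS slope satisfies

```latex
\hat{\beta}_{\mathrm{OLS}} \;\xrightarrow{\;p\;}\; \beta \cdot \frac{\sigma_{*}^{2}}{\sigma_{*}^{2} + \sigma_{\eta}^{2}}
```

The multiplier lies strictly between 0 and 1, so the estimate is pulled toward zero; it approaches 1 as the measurement-error variance vanishes.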
Motivating example
Consider a simple linear regression model of the form
y_{t} = \alpha + \beta x_{t}^{*} + \varepsilon_t, \quad t = 1, \ldots, T,

where x_{t}^{*} denotes the true but unobserved value of the regressor.
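The attenuation can be checked numerically. The simulation below is an illustrative sketch (variable names and noise levels are assumptions, not taken from this page): with equal variances for the true regressor and the measurement error, the OLS slope fitted on the noisy regressor should be roughly half the true coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100_000
alpha, beta = 1.0, 2.0
sigma_x, sigma_eta = 1.0, 1.0  # equal variances -> attenuation factor 1/2

x_star = rng.normal(0.0, sigma_x, T)            # true regressor x_t^*
x_obs = x_star + rng.normal(0.0, sigma_eta, T)  # observed with measurement error
y = alpha + beta * x_star + rng.normal(0.0, 0.5, T)

# OLS slope using the error-free regressor recovers beta
b_true = np.polyfit(x_star, y, 1)[0]
# OLS slope using the noisy regressor is attenuated toward zero
b_noisy = np.polyfit(x_obs, y, 1)[0]

print(round(b_true, 2))   # close to 2.0
print(round(b_noisy, 2))  # close to 2.0 * 0.5 = 1.0
```

With a large sample the bias does not shrink, which is exactly what "inconsistent" means here: more data makes the attenuated estimate more precise, not more correct.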


Related concepts (12)

Linear regression

In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables).

Ordinary least squares

In statistics, ordinary least squares (OLS) is a linear least squares method for estimating the unknown parameters in a linear regression model by minimizing the sum of squared differences between the observed responses and those predicted by the linear function of the regressors.

Instrumental variables estimation

In statistics, econometrics, epidemiology and related disciplines, the method of instrumental variables (IV) is used to estimate causal relationships when controlled experiments are not feasible or when a treatment is not successfully delivered to every unit in a randomized experiment.
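Instrumental variables are the classical remedy for measurement error in a regressor: an instrument correlated with the true regressor but independent of both the measurement error and the regression error restores consistency. The sketch below uses a second, independently noisy measurement of the regressor as the instrument; the whole setup and all names are assumptions for illustration, not from this page.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200_000
beta = 2.0

x_star = rng.normal(size=T)                      # true regressor
z = x_star + rng.normal(scale=1.0, size=T)       # instrument: a second noisy measurement
x_obs = x_star + rng.normal(scale=1.0, size=T)   # mismeasured regressor used in the regression
y = beta * x_star + rng.normal(scale=0.5, size=T)

# Simple IV estimator: cov(z, y) / cov(z, x_obs).
# z is uncorrelated with the measurement error in x_obs, so this is consistent.
b_iv = np.cov(z, y)[0, 1] / np.cov(z, x_obs)[0, 1]
# OLS on the noisy regressor, for comparison (attenuated)
b_ols = np.cov(x_obs, y)[0, 1] / np.var(x_obs)

print(round(b_iv, 2))   # close to 2.0
print(round(b_ols, 2))  # close to 1.0
```

The key design point is that the instrument's noise must be independent of the noise in the regressor actually used; two readings from the same faulty device would not qualify.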

Related courses (4)

The course presents the basic notions of probability theory and statistical inference. The emphasis is on the main concepts and the most widely used methods.

Identification of discrete-time linear models using experimental data is studied. The correlation method and spectral analysis are used to identify nonparametric models and the subspace and prediction error methods to estimate the plant and noise model parameters. Hands-on labs are included.

This course teaches how to apply exploratory spatial data analysis to health data. Teaching focuses on the basics of spatial statistics and of epidemiology, and provides a context for analysing geodatasets, making it possible to study the relationship between health and the environment.

Related lectures (7)

Error-correcting codes are normally employed in storage devices to guarantee the integrity of data in the presence of errors. This paper presents two schemes where error-correcting codes are used for entirely different purposes. In the first part of the paper, a new coding paradigm is proposed to improve the write performance of multi-level flash devices. By slightly relaxing the accuracy of cell programming, significant speed-up can be achieved. The resulting write inaccuracies are then corrected by codes that are tailored for the appropriately restricted error model. In the second part, new low-complexity codes are proposed to protect the security of sensitive data in the presence of imperfect physical erasure processes. Codes that have optimal encoding and decoding complexities are constructed to allow fast storing and retrieval of secret data, and guarantee unconditional security of data against an adversary with access to parts of the secret that failed to erase.

Adrien Georges Jean Besson, Paul Hurley, Matthieu Martin Jean-André Simeoni, Martin Vetterli

Finite rate of innovation (FRI) is a powerful reconstruction framework enabling the recovery of sparse Dirac streams from uniform low-pass filtered samples. An extension of this framework, called generalised FRI (genFRI), has been recently proposed for handling cases with arbitrary linear measurement models. In this context, signal reconstruction amounts to solving a joint constrained optimisation problem, yielding estimates of both the Fourier series coefficients of the Dirac stream and its so-called annihilating filter, involved in the regularisation term. This optimisation problem is however highly non-convex and non-linear in the data. Moreover, the proposed numerical solver is computationally intensive and without convergence guarantee. In this work, we propose an implicit formulation of the genFRI problem. To this end, we leverage a novel regularisation term which does not depend explicitly on the unknown annihilating filter yet enforces sufficient structure in the solution for stable recovery. The resulting optimisation problem is still non-convex, but simpler since it is linear in the data and has fewer unknowns. We solve it by means of a provably convergent proximal gradient descent (PGD) method. Since the proximal step does not admit a simple closed-form expression, we propose an inexact PGD method, coined Cadzow plug-and-play gradient descent (CPGD). The latter approximates the proximal steps by means of Cadzow denoising, a well-known denoising algorithm in FRI. We provide local fixed-point convergence guarantees for CPGD. Through extensive numerical simulations, we demonstrate the superiority of CPGD against the state-of-the-art in the case of non-uniform time samples.

2020

We extend a basic result of Huber's on least favorable distributions to the setting of conditional inference, using an approach based on the notion of log-Gâteaux differentiation and perturbed models. Whereas Huber considered intervals of fixed width for location parameters and their average coverage rates, we study error models having longest confidence intervals, conditional on the location configuration of the sample. Our version of the problem does not have a global solution, but one that changes from configuration to configuration. Asymptotically, the conditionally least-informative shape minimizes the conditional Fisher information. We characterize the asymptotic solution within Huber's contamination model. © 2004 Elsevier B.V. All rights reserved.

2005