Summary
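In statistics, errors-in-variables models, or measurement error models, are regression models that account for measurement errors in the independent variables. Standard regression models, in contrast, assume that the regressors have been measured exactly, or observed without error; they account only for errors in the dependent variable, or response. When some regressors have been measured with error, estimation based on the standard assumption yields inconsistent estimates: the parameter estimates do not converge to the true values even in very large samples. For simple linear regression the effect is an underestimate of the magnitude of the coefficient, known as attenuation bias. In non-linear models the direction of the bias is likely to be more complicated.

Consider a simple linear regression model of the form

$$y_t = \alpha + \beta x_t^* + \varepsilon_t, \qquad t = 1, \ldots, T,$$

where $x_t^*$ denotes the true but unobserved regressor. Instead we observe this value with an error:

$$x_t = x_t^* + \eta_t,$$

where the measurement error $\eta_t$ is assumed to be independent of the true value $x_t^*$. If the $y_t$'s are simply regressed on the $x_t$'s (see simple linear regression), then the estimator for the slope coefficient is

$$\hat\beta_x = \frac{\tfrac{1}{T}\sum_{t=1}^{T}(x_t - \bar x)(y_t - \bar y)}{\tfrac{1}{T}\sum_{t=1}^{T}(x_t - \bar x)^2},$$

which converges as the sample size $T$ increases without bound:

$$\hat\beta_x \xrightarrow{p} \frac{\operatorname{Cov}[x_t, y_t]}{\operatorname{Var}[x_t]} = \frac{\beta\,\sigma_{x^*}^2}{\sigma_{x^*}^2 + \sigma_\eta^2} = \frac{\beta}{1 + \sigma_\eta^2 / \sigma_{x^*}^2},$$

where $\sigma_{x^*}^2$ and $\sigma_\eta^2$ denote the variances of $x_t^*$ and $\eta_t$. The second equality uses the independence of $\eta_t$ and $x_t^*$, together with the usual assumption that $\varepsilon_t$ is uncorrelated with both. This is in contrast to the "true" effect $\beta$, estimated using the $x_t^*$:

$$\hat\beta = \frac{\tfrac{1}{T}\sum_{t=1}^{T}(x_t^* - \bar x^*)(y_t - \bar y)}{\tfrac{1}{T}\sum_{t=1}^{T}(x_t^* - \bar x^*)^2} \xrightarrow{p} \beta.$$

Variances are non-negative, so in the limit the estimated $\hat\beta_x$ is no larger in magnitude than $\beta$, and strictly smaller whenever $\sigma_\eta^2 > 0$ and $\beta \neq 0$. Statisticians call this effect attenuation or regression dilution. Thus the "naive" least squares estimator $\hat\beta_x$ is an inconsistent estimator for $\beta$. However, $\hat\beta_x$ is a consistent estimator of the parameter required for a best linear predictor of $y$ given the observed $x_t$. In some applications this may be what is required, rather than an estimate of the "true" regression coefficient $\beta$, although this presumes that the measurement-error variance is the same in the data used for estimation and in the data used for prediction.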
This follows directly from the result quoted immediately above, and the fact that the regression coefficient relating the $y_t$'s to the actually observed $x_t$'s, in a simple linear regression, is given by

$$\beta_x = \frac{\operatorname{Cov}[x_t, y_t]}{\operatorname{Var}[x_t]}.$$

It is this coefficient, rather than $\beta$, that would be required for constructing a predictor of $y$ based on an observed $x$ which is subject to noise.
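The attenuation effect can be checked numerically. The following is a minimal simulation sketch, not part of the original text: it assumes normally distributed regressor, measurement error, and regression noise, and the function name `simulate_attenuation` and all parameter values are hypothetical choices made for illustration. It compares the slope obtained from the noisy regressor with the slope obtained from the true regressor and with the theoretical limit $\beta / (1 + \sigma_\eta^2 / \sigma_{x^*}^2)$.

```python
import numpy as np


def ols_slope(x, y):
    """Simple-regression slope: sample Cov[x, y] / sample Var[x]."""
    c = np.cov(x, y)  # 2x2 sample covariance matrix of (x, y)
    return c[0, 1] / c[0, 0]


def simulate_attenuation(n=200_000, alpha=1.0, beta=2.0,
                         sigma_x=1.0, sigma_eta=0.5, sigma_eps=1.0, seed=0):
    """Illustrative simulation (assumed Gaussian setup, hypothetical values)."""
    rng = np.random.default_rng(seed)
    x_star = rng.normal(0.0, sigma_x, n)   # true, unobserved regressor
    eta = rng.normal(0.0, sigma_eta, n)    # measurement error, independent of x_star
    eps = rng.normal(0.0, sigma_eps, n)    # regression noise
    x = x_star + eta                       # observed, error-contaminated regressor
    y = alpha + beta * x_star + eps        # response depends on the true regressor

    return {
        "naive_slope": ols_slope(x, y),        # regress y on noisy x
        "oracle_slope": ols_slope(x_star, y),  # regress y on true x*
        "predicted_limit": beta / (1.0 + sigma_eta**2 / sigma_x**2),
        "beta": beta,
    }


if __name__ == "__main__":
    for name, value in simulate_attenuation().items():
        print(f"{name}: {value:.4f}")
```

With these assumed settings, the slope from the noisy regressor should settle near the predicted limit rather than near $\beta$, while the slope from the true regressor should settle near $\beta$. Increasing `sigma_eta` relative to `sigma_x` strengthens the attenuation.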