Regression dilution, also known as regression attenuation, is the biasing of the linear regression slope towards zero (the underestimation of its absolute value), caused by errors in the independent variable.
Consider fitting a straight line for the relationship of an outcome variable y to a predictor variable x, and estimating the slope of the line. Statistical variability, measurement error or random noise in the y variable causes uncertainty in the estimated slope, but not bias: on average, the procedure calculates the right slope. However, variability, measurement error or random noise in the x variable causes bias in the estimated slope (as well as imprecision). The greater the variance of the error in the x measurement, the further the estimated slope is pulled towards zero and away from the true value.
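The contrast between noise in y and noise in x can be checked with a small simulation. The following is a minimal sketch, not part of the original text: the function names, the true slope of 2, the unit noise level and the sample size are all illustrative assumptions.

```python
import numpy as np

def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    return float(np.dot(xc, y - y.mean()) / np.dot(xc, xc))

def simulate_slopes(n=200_000, true_slope=2.0, noise_sd=1.0, seed=0):
    """Fit the same underlying line twice: once with noise added to y only,
    once with noise added to x only (all values are illustrative)."""
    rng = np.random.default_rng(seed)
    x_true = rng.normal(0.0, 1.0, n)       # error-free predictor, variance 1
    y_clean = true_slope * x_true          # exact linear relationship
    y_noisy = y_clean + rng.normal(0.0, noise_sd, n)
    x_noisy = x_true + rng.normal(0.0, noise_sd, n)
    return {
        "noise_in_y": ols_slope(x_true, y_noisy),   # imprecise but unbiased
        "noise_in_x": ols_slope(x_noisy, y_clean),  # attenuated towards zero
    }

if __name__ == "__main__":
    print(simulate_slopes())
```

With these assumed settings the error variance in x equals the variance of the true x, so the slope fitted on the noisy predictor should come out near half the true slope, while the slope fitted with noise only in y should stay close to the true value.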
It may seem counter-intuitive that noise in the predictor variable x induces a bias, but noise in the outcome variable y does not. Recall that linear regression is not symmetric: the line of best fit for predicting y from x (the usual linear regression) is not the same as the line of best fit for predicting x from y.
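The asymmetry can be made concrete by computing both lines of best fit from the same data. This is an illustrative sketch with a hypothetical helper name; it relies only on the standard identities that the y-on-x slope is cov(x, y)/var(x) and that the x-on-y line, redrawn in the (x, y) plane, has slope var(y)/cov(x, y).

```python
import numpy as np

def both_regression_slopes(x, y):
    """Return (slope of y on x, slope of the x-on-y line drawn in the x-y plane)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.cov(x, y)                    # 2x2 sample covariance matrix
    s_xx, s_xy, s_yy = cov[0, 0], cov[0, 1], cov[1, 1]
    slope_y_on_x = s_xy / s_xx            # usual regression: predict y from x
    slope_x_on_y = s_yy / s_xy            # predict x from y, expressed as dy/dx
    return float(slope_y_on_x), float(slope_x_on_y)

if __name__ == "__main__":
    x = np.arange(10.0)
    y = 2.0 * x + np.array([0.5, -0.3, 0.8, -0.6, 0.1, -0.9, 0.4, 0.7, -0.2, -0.5])
    print(both_regression_slopes(x, y))
```

The ratio of the two slopes equals the squared correlation coefficient, so the two lines coincide only when the points lie exactly on a straight line.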
Regression slope and other regression coefficients can be disattenuated as follows.
The case that x is fixed, but measured with noise, is known as the functional model or functional relationship.
Attenuation in this case can be corrected using total least squares or, more generally, errors-in-variables models.
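As a rough illustration of the total least squares idea, the sketch below fits an orthogonal regression line via the singular value decomposition. It assumes, beyond anything stated in the text, that the errors in x and y have equal variance; the helper names and the simulated design (fixed x values on a grid, true slope 2) are hypothetical.

```python
import numpy as np

def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    return float(np.dot(xc, y - y.mean()) / np.dot(xc, xc))

def tls_slope(x, y):
    """Orthogonal (total least squares) slope.
    Assumes equal error variance in x and y; fails for a vertical line."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    centered = np.column_stack([x - x.mean(), y - y.mean()])
    # The first right singular vector is the direction of greatest spread,
    # which is the line minimising the sum of squared orthogonal distances.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    dx, dy = vt[0]
    return float(dy / dx)

def demo(n=50_000, true_slope=2.0, error_sd=0.5, seed=0):
    """Functional-model style example: fixed x values, both variables noisy."""
    rng = np.random.default_rng(seed)
    x_fixed = np.linspace(-2.0, 2.0, n)                  # non-random true x
    x_obs = x_fixed + rng.normal(0.0, error_sd, n)
    y_obs = true_slope * x_fixed + rng.normal(0.0, error_sd, n)
    return ols_slope(x_obs, y_obs), tls_slope(x_obs, y_obs)

if __name__ == "__main__":
    print(demo())
```

Under these assumptions the ordinary least-squares slope is visibly attenuated, whereas the orthogonal fit should land close to the true slope. If the two error variances differ, plain orthogonal regression is no longer appropriate and a weighted variant would be needed.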
The case that the x variable arises randomly is known as the structural model or structural relationship. For example, in a medical study patients are recruited as a sample from a population, and their characteristics such as blood pressure may be viewed as arising from a random sample.
Under certain assumptions (typically, normal distribution assumptions) there is a known ratio between the true slope and the expected estimated slope. Frost and Thompson (2000) review several methods for estimating this ratio and hence correcting the estimated slope.
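One simple way to see such a correction at work is sketched below. It is an illustration under stated assumptions and is not claimed to be any particular method from Frost and Thompson: the error is taken to be classical (additive, independent of the true x and of the noise in y), and each subject is assumed to have two replicate measurements of x with independent errors. In that setting the expected estimated slope is the true slope multiplied by the ratio var(true x) / (var(true x) + var(error)), which can be estimated from the replicates. All names and numbers are hypothetical.

```python
import numpy as np

def corrected_slope(w1, w2, y):
    """Return (naive slope, estimated attenuation ratio, corrected slope).
    w1, w2: two replicate noisy measurements of the same predictor."""
    w1 = np.asarray(w1, dtype=float)
    w2 = np.asarray(w2, dtype=float)
    y = np.asarray(y, dtype=float)
    var_w1 = np.var(w1, ddof=1)
    naive = np.cov(w1, y)[0, 1] / var_w1    # slope of y on one noisy measurement
    # cov(w1, w2) estimates the variance of the true x, because the two
    # measurement errors are assumed independent of each other.
    ratio = np.cov(w1, w2)[0, 1] / var_w1
    return float(naive), float(ratio), float(naive / ratio)

def demo(n=200_000, true_slope=2.0, error_sd=1.0, seed=0):
    """Structural-model style example: x itself is drawn at random."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, n)
    w1 = x + rng.normal(0.0, error_sd, n)   # first measurement
    w2 = x + rng.normal(0.0, error_sd, n)   # replicate measurement
    y = true_slope * x + rng.normal(0.0, 1.0, n)
    return corrected_slope(w1, w2, y)

if __name__ == "__main__":
    print(demo())
```

With the assumed unit variances, the attenuation ratio should be about one half, the naive slope about half the true slope, and the corrected slope close to the true value.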
In statistics, econometrics and machine learning, a linear regression model is a regression model that seeks to establish a linear relationship between one variable, called the explained variable, and one or more variables, called explanatory variables. It is also referred to simply as a linear model. Among linear regression models, the simplest is the affine fit, which consists of finding the straight line that explains the behaviour of a statistical variable y as an affine function of another statistical variable x.
In statistics, errors-in-variables models or measurement error models are regression models that account for measurement errors in the independent variables. In contrast, standard regression models assume that those regressors have been measured exactly, or observed without error; as such, those models account only for errors in the dependent variables, or responses. In the case when some regressors have been measured with errors, estimation based on the standard assumption leads to inconsistent estimates, meaning that the parameter estimates do not tend to the true values even in very large samples.
In statistics, simple linear regression is a linear regression model with a single explanatory variable. That is, it concerns two-dimensional sample points with one independent variable and one dependent variable (conventionally, the x and y coordinates in a Cartesian coordinate system) and finds a linear function (a non-vertical straight line) that, as accurately as possible, predicts the dependent variable values as a function of the independent variable. The adjective simple refers to the fact that the outcome variable is related to a single predictor.
Machine learning methods are becoming increasingly central in many sciences and applications. In this course, fundamental principles and methods of machine learning will be introduced, analyzed and pr
This course provides in-depth understanding of the most fundamental algorithms in statistical pattern recognition or machine learning (including Deep Learning) as well as concrete tools (as Python sou
Explores supervised learning in financial econometrics, covering linear regression, model fitting, potential problems, basis functions, subset selection, cross-validation, regularization and random forests.
Explores how instrumental variables correct biases arising from measurement error and reverse causality in regression models.
Here we discuss "hidden variables", which are typically introduced during an experiment as a consequence of the application of two independent variables together to create a stimulus. With increased sophistication in modern chemical biology tools and relat ...
We report the successful acquisition of surface electromyography (sEMG) signals from an intelligent armband and its application in localized muscle fatigue monitoring with a custom-designed, small-scale front-end readout circuitry. The correlation coeffic ...
We probe the accuracy of linear ridge regression employing a three-body local density representation derived from the atomic cluster expansion. We benchmark the accuracy of this framework in the prediction of formation energies and atomic forces in molecul ...