Concept

Statistical model specification

Summary
In statistics, model specification is part of the process of building a statistical model: specification consists of selecting an appropriate functional form for the model and choosing which variables to include. For example, given personal income $y$ together with years of schooling $s$ and on-the-job experience $x$, we might specify a functional relationship $y = f(s, x)$ as follows:
$$\ln y = \ln y_0 + \rho s + \beta_1 x + \beta_2 x^2 + \varepsilon$$
where $\varepsilon$ is the unexplained error term, which is assumed to comprise independent and identically distributed Gaussian variables. The statistician Sir David Cox has said, "How [the] translation from subject-matter problem to statistical model is done is often the most critical part of an analysis".
Specification error occurs when the functional form or the choice of independent variables poorly represents relevant aspects of the true data-generating process. In particular, bias (the expected value of the difference between an estimated parameter and the true underlying value) occurs if an independent variable is correlated with the errors inherent in the underlying process. There are several possible causes of specification error; some are listed below.
- An inappropriate functional form could be employed.
- A variable omitted from the model may have a relationship with both the dependent variable and one or more of the independent variables (causing omitted-variable bias).
- An irrelevant variable may be included in the model (although this does not create bias, it involves overfitting and so can lead to poor predictive performance).
- The dependent variable may be part of a system of simultaneous equations (giving simultaneity bias).
Additionally, measurement errors may affect the independent variables: while this is not a specification error, it can create statistical bias.
Note that all models will have some specification error. Indeed, in statistics there is a common aphorism that "all models are wrong".
In the words of Burnham & Anderson, "Modeling is an art as well as a science and is directed toward finding a good approximating model ...
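The effect of two of the causes of specification error named above, an omitted variable and an irrelevant variable, can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: it uses a simplified log-linear earnings model, ln y = b0 + rho*s + b1*x + e, with made-up coefficient values, a made-up correlation between schooling and experience, and a hypothetical irrelevant variable `z`. The helper names `ols` and `run_demo` are likewise hypothetical. It compares the estimated schooling coefficient under a correct specification, with experience omitted, and with the irrelevant variable included.

```python
import numpy as np


def ols(columns, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def run_demo(n=20_000, seed=0):
    rng = np.random.default_rng(seed)

    # Assumed data-generating process (all values are illustrative only).
    rho, beta1 = 0.10, 0.03
    s = rng.normal(12.0, 2.0, n)                   # years of schooling
    x = 20.0 - 0.8 * s + rng.normal(0.0, 3.0, n)   # experience, correlated with s
    z = rng.normal(0.0, 1.0, n)                    # irrelevant variable
    eps = rng.normal(0.0, 0.3, n)                  # i.i.d. Gaussian errors
    log_y = 1.5 + rho * s + beta1 * x + eps

    return {
        # Correct specification: schooling and experience both included.
        "correct": ols([s, x], log_y)[1],
        # Experience omitted: the schooling coefficient absorbs part of its effect.
        "omitted": ols([s], log_y)[1],
        # Irrelevant variable added: schooling coefficient stays unbiased.
        "irrelevant": ols([s, x, z], log_y)[1],
        # Textbook omitted-variable value: rho + beta1 * cov(s, x) / var(s).
        "predicted_omitted": rho + beta1 * (-0.8),
    }


results = run_demo()
for name, value in results.items():
    print(f"{name:>18}: {value:.4f}")
```

Under these assumptions the schooling coefficient from the misspecified regression should land near 0.076 rather than the true 0.10, while adding the irrelevant variable leaves the estimate close to 0.10. This matches the distinction drawn above between omitted-variable bias and the inclusion of an irrelevant variable.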