In statistics, generalized least squares (GLS) is a method for estimating the unknown parameters in a linear regression model when there is a certain degree of correlation between the residuals of the model. In these cases, ordinary least squares and weighted least squares can be statistically inefficient, or even give misleading inferences. GLS was first described by Alexander Aitken in 1935.

In standard linear regression models one observes data on $n$ statistical units. The response values are placed in a vector $\mathbf{y} = (y_1, \dots, y_n)^{\mathsf{T}}$, and the predictor values are placed in the design matrix $\mathbf{X}$, whose $i$th row $\mathbf{x}_i^{\mathsf{T}}$ is a vector of the $k$ predictor variables (including a constant) for the $i$th unit. The model forces the conditional mean of $\mathbf{y}$ given $\mathbf{X}$ to be a linear function of $\mathbf{X}$ and assumes the conditional variance of the error term given $\mathbf{X}$ is a known nonsingular covariance matrix $\mathbf{\Omega}$. This is usually written as

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}, \qquad \operatorname{E}[\boldsymbol{\varepsilon} \mid \mathbf{X}] = 0, \qquad \operatorname{Cov}[\boldsymbol{\varepsilon} \mid \mathbf{X}] = \mathbf{\Omega}.$$

Here $\boldsymbol{\beta} \in \mathbb{R}^k$ is a vector of unknown constants (known as "regression coefficients") that must be estimated from the data.

Suppose $\mathbf{b}$ is a candidate estimate for $\boldsymbol{\beta}$. Then the residual vector for $\mathbf{b}$ is $\mathbf{y} - \mathbf{X}\mathbf{b}$. The generalized least squares method estimates $\boldsymbol{\beta}$ by minimizing the squared Mahalanobis length of this residual vector:

$$\hat{\boldsymbol{\beta}} = \operatorname*{argmin}_{\mathbf{b}} \, (\mathbf{y} - \mathbf{X}\mathbf{b})^{\mathsf{T}} \mathbf{\Omega}^{-1} (\mathbf{y} - \mathbf{X}\mathbf{b}),$$

where the two cross terms of the expansion, $\mathbf{y}^{\mathsf{T}}\mathbf{\Omega}^{-1}\mathbf{X}\mathbf{b}$ and $\mathbf{b}^{\mathsf{T}}\mathbf{X}^{\mathsf{T}}\mathbf{\Omega}^{-1}\mathbf{y}$, evaluate to the same scalar, resulting in

$$\hat{\boldsymbol{\beta}} = \operatorname*{argmin}_{\mathbf{b}} \, \mathbf{y}^{\mathsf{T}}\mathbf{\Omega}^{-1}\mathbf{y} - 2\mathbf{b}^{\mathsf{T}}\mathbf{X}^{\mathsf{T}}\mathbf{\Omega}^{-1}\mathbf{y} + \mathbf{b}^{\mathsf{T}}\mathbf{X}^{\mathsf{T}}\mathbf{\Omega}^{-1}\mathbf{X}\mathbf{b}.$$

This objective is a quadratic form in $\mathbf{b}$. Taking the gradient of this quadratic form with respect to $\mathbf{b}$ and equating it to zero (when $\mathbf{b} = \hat{\boldsymbol{\beta}}$) gives

$$2\mathbf{X}^{\mathsf{T}}\mathbf{\Omega}^{-1}\mathbf{X}\hat{\boldsymbol{\beta}} - 2\mathbf{X}^{\mathsf{T}}\mathbf{\Omega}^{-1}\mathbf{y} = 0.$$

Therefore, the minimum of the objective function can be computed, yielding the explicit formula

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}^{\mathsf{T}}\mathbf{\Omega}^{-1}\mathbf{X})^{-1}\mathbf{X}^{\mathsf{T}}\mathbf{\Omega}^{-1}\mathbf{y}.$$

The quantity $\mathbf{\Omega}^{-1}$ is known as the precision matrix, a generalization of the diagonal weight matrix used in weighted least squares (the covariance matrix $\mathbf{\Omega}$ itself is the dispersion matrix). The GLS estimator is unbiased, consistent, efficient, and asymptotically normal, with $\operatorname{E}[\hat{\boldsymbol{\beta}} \mid \mathbf{X}] = \boldsymbol{\beta}$ and $\operatorname{Cov}[\hat{\boldsymbol{\beta}} \mid \mathbf{X}] = (\mathbf{X}^{\mathsf{T}}\mathbf{\Omega}^{-1}\mathbf{X})^{-1}$.

GLS is equivalent to applying ordinary least squares to a linearly transformed version of the data. To see this, factor $\mathbf{\Omega} = \mathbf{C}\mathbf{C}^{\mathsf{T}}$, for instance using the Cholesky decomposition. Then if one pre-multiplies both sides of the equation $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$ by $\mathbf{C}^{-1}$, we get an equivalent linear model $\mathbf{y}^{*} = \mathbf{X}^{*}\boldsymbol{\beta} + \boldsymbol{\varepsilon}^{*}$, where $\mathbf{y}^{*} = \mathbf{C}^{-1}\mathbf{y}$, $\mathbf{X}^{*} = \mathbf{C}^{-1}\mathbf{X}$, and $\boldsymbol{\varepsilon}^{*} = \mathbf{C}^{-1}\boldsymbol{\varepsilon}$.
In this model $\operatorname{Var}[\boldsymbol{\varepsilon}^{*} \mid \mathbf{X}] = \mathbf{C}^{-1}\mathbf{\Omega}(\mathbf{C}^{-1})^{\mathsf{T}} = \mathbf{I}$, where $\mathbf{I}$ is the identity matrix, so ordinary least squares applied to the transformed data recovers the GLS estimator.
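The estimator formula and the whitening equivalence can be checked numerically. The following is a minimal sketch using synthetic, made-up data (the AR(1) covariance, sample sizes, and variable names such as `Omega` and `beta_gls` are illustrative choices, not part of the original text):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3

# Design matrix with a constant column, and true coefficients.
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta_true = np.array([1.0, 2.0, -0.5])

# AR(1)-style error covariance: correlated errors with |i-j| decay.
rho = 0.7
Omega = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))

# Draw errors with covariance Omega and form the response.
C = np.linalg.cholesky(Omega)          # Omega = C @ C.T
y = X @ beta_true + C @ rng.normal(size=n)

# GLS estimator: solve (X' Omega^{-1} X) b = X' Omega^{-1} y.
Oinv = np.linalg.inv(Omega)
beta_gls = np.linalg.solve(X.T @ Oinv @ X, X.T @ Oinv @ y)

# Equivalent whitened OLS: pre-multiply the model by C^{-1}.
Xs = np.linalg.solve(C, X)             # X* = C^{-1} X
ys = np.linalg.solve(C, y)             # y* = C^{-1} y
beta_ols_whitened, *_ = np.linalg.lstsq(Xs, ys, rcond=None)

# The two estimates agree up to floating-point error.
assert np.allclose(beta_gls, beta_ols_whitened)
print(beta_gls)
```

Solving the normal equations with `np.linalg.solve` (rather than forming the inverse of $\mathbf{X}^{\mathsf{T}}\mathbf{\Omega}^{-1}\mathbf{X}$ explicitly) is the numerically preferable route in practice.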

Related concepts
Linear regression
In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression. This term is distinct from multivariate linear regression, where multiple correlated dependent variables are predicted, rather than a single scalar variable.
Homoscedasticity and heteroscedasticity
In statistics, a sequence (or a vector) of random variables is homoscedastic (/ˌhoʊmoʊskəˈdæstɪk/) if all its random variables have the same finite variance; this is also known as homogeneity of variance. The complementary notion is called heteroscedasticity, also known as heterogeneity of variance. The spellings homoskedasticity and heteroskedasticity are also frequently used.
Gauss–Markov theorem
In statistics, the Gauss–Markov theorem (or simply Gauss theorem for some authors) states that the ordinary least squares (OLS) estimator has the lowest sampling variance within the class of linear unbiased estimators, provided the errors in the linear regression model are uncorrelated, have equal variances, and have expectation zero. The errors do not need to be normal, nor do they need to be independent and identically distributed (only uncorrelated with mean zero and homoscedastic with finite variance).