**Are you an EPFL student looking for a semester project?**

Work with us on data science and visualisation projects, and deploy your project as an app on top of GraphSearch.

Concept# Ordinary least squares

Summary

In statistics, ordinary least squares (OLS) is a type of linear least squares method for choosing the unknown parameters in a linear regression model (with fixed level-one effects of a linear function of a set of explanatory variables) by the principle of least squares: minimizing the sum of the squares of the differences between the observed dependent variable (values of the variable being observed) in the input dataset and the output of the (linear) function of the independent variable.
Geometrically, this is seen as the sum of the squared distances, parallel to the axis of the dependent variable, between each data point in the set and the corresponding point on the regression surface—the smaller the differences, the better the model fits the data. The resulting estimator can be expressed by a simple formula, especially in the case of a simple linear regression, in which there is a single regressor on the right side of the regression equation.
The OLS estimator is consistent for the level-one fixed effects when the regressors are exogenous and forms perfect colinearity (rank condition), consistent for the variance estimate of the residuals when regressors have finite fourth moments and—by the Gauss–Markov theorem—optimal in the class of linear unbiased estimators when the errors are homoscedastic and serially uncorrelated. Under these conditions, the method of OLS provides minimum-variance mean-unbiased estimation when the errors have finite variances. Under the additional assumption that the errors are normally distributed with zero mean, OLS is the maximum likelihood estimator that outperforms any non-linear unbiased estimator.
Linear regression model
Suppose the data consists of observations . Each observation includes a scalar response and a column vector of parameters (regressors), i.e., . In a linear regression model, the response variable, , is a linear function of the regressors:
or in vector form,
where , as introduced previously, is a column vector of the -th observation of all the explanatory variables; is a vector of unknown parameters; and the scalar represents unobserved random variables (errors) of the -th observation.

Official source

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Related courses (81)

FIN-414: Optimization methods

This course presents the problem of static optimization, with and without (equality and inequality) constraints, both from the theoretical (optimality conditions) and methodological (algorithms) point

CS-233(a): Introduction to machine learning (BA3)

Machine learning and data analysis are becoming increasingly central in many sciences and applications. In this course, fundamental principles and methods of machine learning will be introduced, analy

MATH-413: Statistics for data science

Statistics lies at the foundation of data science, providing a unifying theoretical and methodological backbone for the diverse tasks enountered in this emerging field. This course rigorously develops

Related MOOCs (2)

Selected Topics on Discrete Choice

Discrete choice models are used extensively in many disciplines where it is important to predict human behavior at a disaggregate level. This course is a follow up of the online course “Introduction t

Selected Topics on Discrete Choice

Discrete choice models are used extensively in many disciplines where it is important to predict human behavior at a disaggregate level. This course is a follow up of the online course “Introduction t

Related concepts (92)

Orthogonality

In mathematics, orthogonality is the generalization of the geometric notion of perpendicularity. Orthogonality is also used with various meanings that are often weakly related or not related at all with the mathematical meanings. The word comes from the Ancient Greek ὀρθός (), meaning "upright", and γωνία (), meaning "angle". The Ancient Greek ὀρθογώνιον () and Classical Latin orthogonium originally denoted a rectangle. Later, they came to mean a right triangle.

Gauss–Markov theorem

In statistics, the Gauss–Markov theorem (or simply Gauss theorem for some authors) states that the ordinary least squares (OLS) estimator has the lowest sampling variance within the class of linear unbiased estimators, if the errors in the linear regression model are uncorrelated, have equal variances and expectation value of zero. The errors do not need to be normal, nor do they need to be independent and identically distributed (only uncorrelated with mean zero and homoscedastic with finite variance).

Ordinary least squares

In statistics, ordinary least squares (OLS) is a type of linear least squares method for choosing the unknown parameters in a linear regression model (with fixed level-one effects of a linear function of a set of explanatory variables) by the principle of least squares: minimizing the sum of the squares of the differences between the observed dependent variable (values of the variable being observed) in the input dataset and the output of the (linear) function of the independent variable.

Related lectures (658)

Density matrix and von Neumann entropy: first intro

Introduces density matrix and von Neumann entropy in quantum systems.

Supervised Learning: Linear Regression

Covers supervised learning with a focus on linear regression, including topics like digit classification, spam detection, and wind speed prediction.

Diagonalization of Symmetric Matrices

Covers the diagonalization of symmetric matrices, the spectral theorem, and the use of spectral decomposition.