
# Mean squared error

Summary

In statistics, the mean squared error (MSE) or mean squared deviation (MSD) of an estimator (of a procedure for estimating an unobserved quantity) measures the average of the squares of the errors—that is, the average squared difference between the estimated values and the actual value. MSE is a risk function, corresponding to the expected value of the squared error loss. The fact that MSE is almost always strictly positive (and not zero) is because of randomness or because the estimator does not account for information that could produce a more accurate estimate. In machine learning, specifically empirical risk minimization, MSE may refer to the empirical risk (the average loss on an observed data set), as an estimate of the true MSE (the true risk: the average loss on the actual population distribution).
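For a vector of $n$ predictions $\hat{Y}$ and a vector of observed values $Y$, the empirical MSE described above is

$$\operatorname{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2.$$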
The MSE is a measure of the quality of an estimator. As it is derived from the square of Euclidean distance, it is always a positive value that decreases as the error approaches zero.
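As a concrete illustration (a minimal sketch, not part of the original page), the empirical MSE of a set of predictions can be computed in a few lines of NumPy:

```python
import numpy as np

def mse(y_true, y_pred):
    """Average of squared differences between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Squared errors are 0.25, 0.0, 1.0, so the mean is 1.25 / 3.
print(mse([1.0, 2.0, 3.0], [1.5, 2.0, 2.0]))  # 0.4166666666666667
```

A perfect predictor gives an MSE of exactly zero, consistent with the property noted above.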


Related concepts (57)

Statistics

Statistics (from German: Statistik, "description of a state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data.

Linear regression

In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables).

Estimator

In statistics, an estimator is a rule for calculating an estimate of a given quantity based on observed data: thus the rule (the estimator), the quantity of interest (the estimand) and its result (the estimate) are distinguished.

Related publications (45)

This paper is concerned with frequency domain theory for functional time series, which are temporally dependent sequences of functions in a Hilbert space. We consider a variance decomposition, which is more suitable for such a data structure than the variance decomposition based on the Karhunen-Loeve expansion. The decomposition we study uses eigenvalues of spectral density operators, which are functional analogs of the spectral density of a stationary scalar time series. We propose estimators of the variance components and derive convergence rates for their mean square error as well as their asymptotic normality. The latter is derived from a frequency domain invariance principle for the estimators of the spectral density operators. This principle, established for a broad class of linear time series models, is a main contribution of the paper.

In this paper, we derive elementary M- and optimally robust asymptotic linear (AL)-estimates for the parameters of an Ornstein-Uhlenbeck process. Simulation and estimation of the process are already well-studied, see Iacus (Simulation and inference for stochastic differential equations. Springer, New York, 2008). However, in order to protect against outliers and deviations from the ideal law, the formulation of suitable neighborhood models and a corresponding robustification of the estimators are necessary. As a measure of robustness, we consider the maximum asymptotic mean square error (maxasyMSE), which is determined by the influence curve (IC) of AL estimates. The IC represents the standardized influence of an individual observation on the estimator given the past. In a first step, we extend the method of M-estimation from Huber (Robust statistics. Wiley, New York, 1981). In a second step, we apply the general theory based on local asymptotic normality, AL estimates, and shrinking neighborhoods due to Kohl et al. (Stat Methods Appl 19:333-354, 2010), Rieder (Robust asymptotic statistics. Springer, New York, 1994), Rieder (2003), and Staab (1984). This leads to optimally robust ICs whose graph exhibits surprising behavior. In the end, we discuss the estimator construction, i.e., the problem of constructing an estimator from the family of optimal ICs. We therefore carry out in our context the One-Step construction dating back to LeCam (Asymptotic methods in statistical decision theory. Springer, New York, 1969) and compare it by means of simulations with the MLE and the M-estimator.

In this thesis, we treat robust estimation for the parameters of the Ornstein–Uhlenbeck process, which are the mean, the variance, and the friction. We start by considering classical maximum likelihood estimation. For the simulation study, where we also investigate the choice of the time lag, we use the method of moments (MoM) estimator as initial estimator for the friction parameter of the maximum likelihood estimator (MLE). However, in several aspects the MLE is not robust. For robustification, we first derive elementary M-estimates by extending the method of M-estimation from Huber (1981). We use an intuitively robustified MoM estimate as initial estimate and compare by means of simulation the M-estimate with the MLE. This approach is, however, only ad hoc, since Huber's minimum Fisher information and minimax asymptotic variance theory remains incomplete for simultaneous location and scale, and does not cover more general models (such as the Ornstein–Uhlenbeck process). A more general robustness concept due to Kohl et al. (2010), Rieder (1994), and Staab (1984) is based on local asymptotic normality (LAN), asymptotically linear (AL) estimates, and shrinking neighborhoods. We then apply this concept to the Ornstein–Uhlenbeck process. As a measure of robustness, we consider the maximum asymptotic mean square error (maxasyMSE), which is determined by the influence curve (IC) of AL estimates. The IC represents the standardized influence of an individual observation on the estimator given the past. For two kinds of neighborhoods (average and average square neighborhoods) we obtain optimally robust ICs. In the case of average neighborhoods, their graph exhibits surprising, redescending behavior. For average square neighborhoods the graph is between the one of the elementary M-estimates and the MLE. Finally, we discuss the estimator construction, that is, the problem of constructing an estimator from the family of optimal ICs.
We carry out in our context the One-Step construction dating back to LeCam and use both an intuitively robustified MoM estimate and the elementary M-estimate as initial estimates. This results in optimally robust AL estimates (for average and average square neighborhoods). By means of simulation we then compare the different estimators: MLE, elementary M-estimates, and optimally robust AL estimates. In addition, we give an application to electricity prices.
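As a rough illustration of the setting in these publications (a minimal sketch with illustrative parameter values, not the robust estimators the works develop), an Ornstein–Uhlenbeck path can be simulated through its exact AR(1) transition and its three parameters recovered with simple method-of-moments (MoM) estimates:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed (illustrative) OU parameters: dX = theta*(mu - X) dt + sigma dW.
theta, mu, sigma = 2.0, 1.0, 0.5   # friction, mean, diffusion
dt, n = 0.01, 100_000

# Exact discrete-time transition: X_{k+1} = mu + (X_k - mu)*a + noise.
a = np.exp(-theta * dt)
noise_sd = sigma * np.sqrt((1 - a**2) / (2 * theta))
x = np.empty(n)
x[0] = mu
for k in range(n - 1):
    x[k + 1] = mu + (x[k] - mu) * a + noise_sd * rng.standard_normal()

# Method-of-moments estimates from the stationary AR(1) structure.
mu_hat = x.mean()
rho_hat = np.corrcoef(x[:-1], x[1:])[0, 1]    # lag-1 autocorr ~ exp(-theta*dt)
theta_hat = -np.log(rho_hat) / dt
sigma_hat = np.sqrt(2 * theta_hat * x.var())  # stationary var = sigma^2 / (2*theta)
```

This intuitive MoM estimate is the kind of initial estimate the works above start from before robustifying; it is itself sensitive to outliers, which is what motivates the M- and AL-estimates they construct.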

Related courses (50)

COM-406: Foundations of Data Science

We discuss a set of topics that are important for understanding modern data science but that are typically not taught in an introductory ML course. In particular, we discuss fundamental ideas and techniques from probability, information theory, and signal processing.

EE-512: Applied biomedical signal processing

The goal of this course is twofold: (1) to introduce the physiological basis, signal acquisition solutions (sensors), and state-of-the-art signal processing techniques, and (2) to propose concrete examples of applications for vital sign monitoring and diagnosis purposes.

CS-233(a): Introduction to machine learning (BA3)

Machine learning and data analysis are becoming increasingly central in many sciences and applications. In this course, fundamental principles and methods of machine learning will be introduced, analyzed and practically implemented.