Summary
In statistics, sometimes the covariance matrix of a multivariate random variable is not known but has to be estimated. Estimation of covariance matrices then deals with the question of how to approximate the actual covariance matrix on the basis of a sample from the multivariate distribution. Simple cases, where observations are complete, can be dealt with by using the sample covariance matrix. The sample covariance matrix (SCM) is an unbiased and efficient estimator of the covariance matrix if the space of covariance matrices is viewed as an extrinsic convex cone in Rp×p; however, measured using the intrinsic geometry of positive-definite matrices, the SCM is a biased and inefficient estimator. In addition, if the random variable has a normal distribution, the sample covariance matrix has a Wishart distribution and a slightly differently scaled version of it is the maximum likelihood estimate. Cases involving missing data, heteroscedasticity, or autocorrelated residuals require deeper considerations. Another issue is the robustness to outliers, to which sample covariance matrices are highly sensitive. Statistical analyses of multivariate data often involve exploratory studies of the way in which the variables change in relation to one another and this may be followed up by explicit statistical models involving the covariance matrix of the variables. Thus the estimation of covariance matrices directly from observational data plays two roles: to provide initial estimates that can be used to study the inter-relationships; to provide sample estimates that can be used for model checking. Estimates of covariance matrices are required at the initial stages of principal component analysis and factor analysis, and are also involved in versions of regression analysis that treat the dependent variables in a data-set, jointly with the independent variable as the outcome of a random sample. Given a sample consisting of n independent observations x1,...
About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Related courses (29)
MATH-444: Multivariate statistics
Multivariate statistics focusses on inferring the joint distributional properties of several random variables, seen as random vectors, with a main focus on uncovering their underlying dependence struc
EE-607: Advanced Methods for Model Identification
This course introduces the principles of model identification for non-linear dynamic systems, and provides a set of possible solution methods that are thoroughly characterized in terms of modelling as
FIN-417: Quantitative risk management
This course is an introduction to quantitative risk management that covers standard statistical methods, multivariate risk factor models, non-linear dependence structures (copula models), as well as p
Show more
Related publications (141)