# Statistical theory

Summary

The theory of statistics provides a basis for the whole range of techniques, in both study design and data analysis, that are used within applications of statistics. The theory covers approaches to statistical-decision problems and to statistical inference, and the actions and deductions that satisfy the basic principles stated for these different approaches. Within a given approach, statistical theory gives ways of comparing statistical procedures; it can find a best possible procedure within a given context for given statistical problems, or can provide guidance on the choice between alternative procedures.
Apart from philosophical considerations about how to make statistical inferences and decisions, much of statistical theory consists of mathematical statistics, and is closely linked to probability theory, to utility theory, and to optimization.
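
As a minimal sketch of what "comparing statistical procedures" can mean in practice, the following Python snippet approximates the mean squared error of two location estimators (sample mean and sample median) by simulation. The two data models, the sample size, and the helper name `monte_carlo_mse` are illustrative assumptions, not part of the source text.

```python
import random
import statistics


def monte_carlo_mse(estimator, sampler, true_value, n=50, reps=2000, seed=0):
    """Approximate the mean squared error of `estimator` by simulation."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [sampler(rng) for _ in range(n)]
        total += (estimator(sample) - true_value) ** 2
    return total / reps


def normal_model(rng):
    """Standard normal observation centred at 0."""
    return rng.gauss(0.0, 1.0)


def contaminated_model(rng):
    """Mostly standard normal, but 10% of draws have standard deviation 10."""
    scale = 10.0 if rng.random() < 0.1 else 1.0
    return rng.gauss(0.0, scale)


# Estimated risk of each procedure under each (assumed) model.
mse = {
    (model.__name__, est.__name__): monte_carlo_mse(est, model, true_value=0.0)
    for model in (normal_model, contaminated_model)
    for est in (statistics.fmean, statistics.median)
}

for key, value in mse.items():
    print(key, round(value, 4))
```

Under the normal model the mean should show the smaller risk, while under the contaminated model the median should. The choice between procedures therefore depends on the context and on the assumptions one is willing to make.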
Scope
Statistical theory provides an underlying rationale and provides a consistent basis for the choice

Related concepts (21)

Statistics

Statistics (from German: Statistik, "description of a state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and present

Design of experiments

The design of experiments (DOE or DOX), also known as experiment design or experimental design, is the design of any task that aims to describe and explain the variation of information under conditi

Statistical assumption

Statistics, like all mathematical disciplines, does not infer valid conclusions from nothing. Inferring interesting conclusions about real statistical populations almost always requires some backgroun

Related courses (12)

MATH-442: Statistical theory

The course aims at developing certain key aspects of the theory of statistics, providing a common general framework for statistical methodology. While the main emphasis will be on the mathematical aspects of statistics, an effort will be made to balance rigor and intuition.

MSE-421: Statistical mechanics

This course presents an introduction to statistical mechanics geared towards materials scientists. The concepts of macroscopic thermodynamics will be related to a microscopic picture and a statistical interpretation. Lectures and exercises will be complemented with hands-on simulation projects.

FIN-403: Econometrics

The course covers basic econometric models and methods that are routinely applied to obtain inference results in economic and financial applications.

Related publications (12)



Garance Hélène Salomé Durr-Legoupil-Nicoud

The StatComp package is a Matlab statistical toolbox developed over the years by Dr. Testa and his students. It was inspired by M. R. Brown's paper Magnetohydrodynamic Turbulence: Observation and experiment [2]. It was first used to analyse the edge magnetic turbulent field in the TCV. It was started in 2015 by A. Yantchenko and has been constantly improved and supplemented since then. The latest addition to the package was a set of separate functions for the "big data" analysis of the results, written by S. Ogier-Collin. The entire code is currently under review for release in the MHD analysis package within the SPC's General Analysis Toolkit. The present document reports the latest evolution of this package, with a view to using the characterisation of plasma turbulence to possibly provide useful information for the optimisation of real-time plasma control and the fusion performance of a tokamak. The mathematical theory of the StatComp analyses and some examples of application are presented in Section 2. Section 3 presents the evolution of the existing functions as well as the addition of the loading function for the electrostatic data from the edge of the plasma, and the multifractality and predictability analyses. These enhancements are put in the perspective of one particular usage: the characterisation of the turbulence in order to potentially optimise plasma control. Then, the up-to-date running instructions and interpretation guidelines are detailed in Section 4. The latter are based on the output figures resulting from the analysis of a standard dataset consisting of a white noise sample, three fractional Brownian motions with different known Hurst indices, a linear ramp and a sample of the solar wind. Section 5 shows the results of the test on four actual shots carried out on the TCV tokamak. The varying parameters are the signs of the poloidal magnetic field and of the plasma current.
Each of the four shots results from a positive or negative poloidal field combined with a positive or negative plasma current. The shape and position of the plasma in the vacuum vessel are the same for each shot, as is the amplitude of the varied parameters, i.e. the magnetic field and plasma current. Emphasis is placed on the presentation and interpretation of the results obtained with the electrostatic data on the low-field side of the plasma. The results are discussed along with the limits of the package and its possible improvements in Section 6, before concluding in Section 7. In the appendix, the structures necessary for using the package are detailed and examples of run commands are presented. To give the reader a frame of reference, the main parameters and orders of magnitude related to the plasma shots in TCV are given. Some of the mathematical foundations of the statistical theory are also elaborated to complete the description of the different tools of the package. Finally, the short bibliography of all the sources explicitly mentioned in this report is complemented by a second bibliography presenting a wider selection of relevant sources, each accompanied by a brief description of its content and its link to the present study.
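
The abstract above mentions test signals with a known Hurst index. As a rough illustration of what such an index measures, the Python sketch below estimates it for an ordinary Brownian motion (for which the index is 0.5) from the scaling of squared increments with the lag. This is a generic textbook-style estimator chosen for illustration; it is not taken from the StatComp package, and the function names, path length, and lags are assumptions.

```python
import math
import random


def brownian_path(n, seed=0):
    """Ordinary Brownian motion on a unit grid: cumulative sum of white noise."""
    rng = random.Random(seed)
    path, level = [], 0.0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def hurst_from_increments(path, lags=(1, 2, 4, 8, 16)):
    """Estimate H from the scaling law E[(X[t+k] - X[t])**2] ~ k**(2H)."""
    xs, ys = [], []
    for k in lags:
        diffs = [path[i + k] - path[i] for i in range(len(path) - k)]
        mean_sq = sum(d * d for d in diffs) / len(diffs)
        xs.append(math.log(k))
        ys.append(math.log(mean_sq))
    # Ordinary least-squares slope of log mean-square increment against log lag.
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope / 2.0


hurst_estimate = hurst_from_increments(brownian_path(50_000))
print(round(hurst_estimate, 2))
```

Running the same estimator on signals with a different known index is one simple way to check that an analysis chain behaves as expected before applying it to measured data.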

2022

During the last twenty years, random matrix theory (RMT) has produced numerous results that allow a better understanding of large random matrices. These advances have enabled interesting applications in the domain of communication. Although this theory can contribute to many other domains such as brain imaging or genetic research, it has rarely been applied there. The main barrier to the adoption of RMT may be the lack of concrete statistical results from probabilistic random matrix theory. Indeed, direct generalisation of classical multivariate theory to high-dimensional assumptions is often difficult, and the proposed procedures often assume strong hypotheses on the data matrix such as normality or overly restrictive independence conditions on the data.
This thesis proposes a statistical procedure for testing the equality of two independent estimated covariance matrices when the number of potentially dependent data vectors is large and proportional to the size of the vectors, that is, to the number of observed variables. Although the existing theory builds a very good intuition of the behaviour of these matrices, it does not provide enough results to build a test that is satisfactory in both power and robustness. Hence, inspired by spike models, we define the residual spikes and prove many theorems describing the behaviour of many statistics using eigenvectors and eigenvalues in very general cases. Examples include the two central theorems of this thesis, the Invariant Angle Theorem and the Invariant Dot Product Theorem.
Using numerous generalisations of the theory, this thesis finally proposes a description of the behaviour of a statistic under a null hypothesis. This statistic allows the user to test the equality of two populations, but also other null hypotheses such as the independence of two sets of variables. Finally, the robustness of the procedure is demonstrated for different classes of models, and criteria for evaluating robustness are proposed to the reader.
The major contribution of this thesis is therefore a methodology that is both easy to apply and has good properties. In addition, a large number of theoretical results are demonstrated and could easily be used to build other applications.

In this paper, we derive elementary M- and optimally robust asymptotic linear (AL)-estimates for the parameters of an Ornstein-Uhlenbeck process. Simulation and estimation of the process are already well studied, see Iacus (Simulation and inference for stochastic differential equations. Springer, New York, 2008). However, in order to protect against outliers and deviations from the ideal law, the formulation of suitable neighborhood models and a corresponding robustification of the estimators are necessary. As a measure of robustness, we consider the maximum asymptotic mean square error (maxasyMSE), which is determined by the influence curve (IC) of AL estimates. The IC represents the standardized influence of an individual observation on the estimator given the past. In a first step, we extend the method of M-estimation from Huber (Robust statistics. Wiley, New York, 1981). In a second step, we apply the general theory based on local asymptotic normality, AL estimates, and shrinking neighborhoods due to Kohl et al. (Stat Methods Appl 19:333-354, 2010), Rieder (Robust asymptotic statistics. Springer, New York, 1994), Rieder (2003), and Staab (1984). This leads to optimally robust ICs whose graph exhibits surprising behavior. Finally, we discuss the estimator construction, i.e. the problem of constructing an estimator from the family of optimal ICs. To this end, we carry out in our context the One-Step construction dating back to LeCam (Asymptotic methods in statistical decision theory. Springer, New York, 1969) and compare it by means of simulations with the MLE and the M-estimator.
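
To make the motivation for robust estimation concrete, here is a minimal Python sketch, under stated assumptions, of a zero-mean Ornstein-Uhlenbeck process dX = -theta X dt + sigma dW simulated with its exact Gaussian transition law and fitted by least squares on the implied AR(1) recursion (which coincides with the conditional Gaussian maximum-likelihood estimate in this parametrisation). The parametrisation, the additive-outlier contamination in `add_outliers`, and all function names are illustrative assumptions; this is not the paper's M- or optimally robust estimator, only the non-robust baseline that such estimators aim to improve on.

```python
import math
import random


def simulate_ou(theta, sigma, dt, n, seed=0):
    """Simulate dX = -theta * X dt + sigma dW on a grid via the exact transition."""
    rng = random.Random(seed)
    a = math.exp(-theta * dt)
    noise_sd = sigma * math.sqrt((1.0 - a * a) / (2.0 * theta))
    x = [rng.gauss(0.0, sigma / math.sqrt(2.0 * theta))]  # stationary start
    for _ in range(n - 1):
        x.append(a * x[-1] + rng.gauss(0.0, noise_sd))
    return x


def fit_ou_least_squares(x, dt):
    """Least-squares AR(1) fit mapped back to (theta, sigma)."""
    a_hat = sum(u * v for u, v in zip(x[:-1], x[1:])) / sum(u * u for u in x[:-1])
    resid_var = sum((v - a_hat * u) ** 2 for u, v in zip(x[:-1], x[1:])) / (len(x) - 1)
    theta_hat = -math.log(a_hat) / dt
    sigma_hat = math.sqrt(resid_var * 2.0 * theta_hat / (1.0 - a_hat * a_hat))
    return theta_hat, sigma_hat


def add_outliers(x, rate=0.01, size=5.0, seed=1):
    """Hypothetical contamination: shift a small fraction of observations."""
    rng = random.Random(seed)
    return [v + size if rng.random() < rate else v for v in x]


dt = 0.1
clean = simulate_ou(theta=1.0, sigma=0.5, dt=dt, n=20_000)
theta_clean, sigma_clean = fit_ou_least_squares(clean, dt)
theta_dirty, sigma_dirty = fit_ou_least_squares(add_outliers(clean), dt)
print(theta_clean, sigma_clean, theta_dirty)
```

On clean data the fit should recover values close to the simulated parameters, whereas a small fraction of shifted observations is enough to inflate the estimated mean-reversion rate substantially. This sensitivity is the kind of behaviour that bounded influence curves are designed to limit.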