
# Consistent estimator

Summary

In statistics, a consistent estimator or asymptotically consistent estimator is an estimator—a rule for computing estimates of a parameter θ0—having the property that as the number of data points used increases indefinitely, the resulting sequence of estimates converges in probability to θ0. This means that the distributions of the estimates become more and more concentrated near the true value of the parameter being estimated, so that the probability of the estimator being arbitrarily close to θ0 converges to one.
In practice one constructs an estimator as a function of an available sample of size n, and then imagines being able to keep collecting data and expanding the sample ad infinitum. In this way one would obtain a sequence of estimates indexed by n, and consistency is a property of what occurs as the sample size “grows to infinity”. If the sequence of estimates can be mathematically shown to converge in probability to the true value θ0, it is called a consistent estimator.
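The definition lends itself to a quick numerical check: for the sample mean of i.i.d. draws, a textbook consistent estimator of the population mean, the probability of landing farther than a fixed ε from θ0 shrinks to zero as n grows. A minimal Monte Carlo sketch (the Gaussian model, ε, and sample sizes are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
theta0 = 2.5   # true parameter: the mean of the sampled distribution
eps = 0.1      # tolerance in the definition of consistency

# For each sample size n, estimate P(|mean_n - theta0| > eps) over
# 1000 replications; consistency predicts this probability -> 0.
miss_rates = []
for n in [10, 100, 1000, 10000]:
    estimates = rng.normal(theta0, 1.0, size=(1000, n)).mean(axis=1)
    miss_rates.append(float(np.mean(np.abs(estimates - theta0) > eps)))

print(miss_rates)   # probabilities shrink toward 0 as n grows
```

By the central limit theorem the miss probability here is approximately 2(1 − Φ(ε√n)), so it drops from about 0.75 at n = 10 to essentially 0 at n = 10000.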



Related concepts (28)

Statistics is the discipline that studies phenomena through the collection of data, their processing and analysis, the interpretation of the results, and their presentation, in order to make these data understandable.

In probability theory and statistics, normal distributions are among the probability distributions most widely used to model natural phenomena resulting from several random events.

Example of samples from two populations with the same mean but different variances: the population in red has a mean of 100 and a variance of 100 (standard deviation = 10).


Related courses (22)

MATH-442: Statistical theory

The course aims at developing certain key aspects of the theory of statistics, providing a common general framework for statistical methodology. While the main emphasis will be on the mathematical aspects of statistics, an effort will be made to balance rigor and intuition.

FIN-403: Econometrics

The course covers basic econometric models and methods that are routinely applied to obtain inference results in economic and financial applications.

EE-206: Méthodes de mesure

This course aims to convey the theoretical concepts and practical skills needed to carry out good-quality measurements. The methodological and technological content is presented in lectures, and the practical skills are trained during lab sessions.

Related lectures (56)

Traditional approaches to analysing functional data typically follow a two-step procedure, consisting of first smoothing and then carrying out a functional principal component analysis. The idea underlying this procedure is that functional data are well approximated by smooth functions, and that rough variations are due to noise. However, it may very well happen that localised features are rough at a global scale but still smooth at some finer scale. In this thesis we put forward a new statistical approach for functional data arising as the sum of two uncorrelated components: one smooth plus one rough.
We give non-parametric conditions under which the covariance operators of the smooth and of the rough components are jointly identifiable on the basis of discretely observed data: the covariance operator corresponding to the smooth component must be of finite rank and have real analytic eigenfunctions, while the one corresponding to the rough component must have a banded covariance function. We construct consistent estimators of both covariance operators without assuming knowledge of the true rank or bandwidth. We then use them to estimate the best linear predictors of the smooth and the rough components of each functional datum.
In both the identifiability and the inference part, we do not follow the usual strategy in functional data analysis, which is to first employ smoothing and then work with a continuous estimate of the covariance operator. Instead, we work directly with the covariance matrix of the discretely observed data, which allows us to use results and tools from linear algebra. In fact, we show that the whole problem of uniquely recovering the covariance operator of the smooth component from that of the raw data can be seen as a low-rank matrix completion problem, and we make extensive use of a classical relation between the rank and the minors of a matrix to solve this matrix completion problem.
The finite-sample performance of our approach is studied by means of a simulation study.
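The identifiability argument hinges on a structural fact: if the rough component has a banded covariance, then outside that band the covariance of the observed data coincides with the low-rank covariance of the smooth component alone. The following simulation is a hedged sketch of that structure and not the thesis's estimator; the eigenfunctions, score variances, and MA-type rough component are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 50)    # common observation grid
n = 8000                     # number of observed curves

# Smooth component: rank 2, with analytic eigenfunctions (illustrative).
phi = np.stack([np.sqrt(2) * np.sin(np.pi * t),
                np.sqrt(2) * np.sin(2 * np.pi * t)])     # 2 x 50
scores = rng.normal(size=(n, 2)) * np.array([1.0, 0.5])  # sd 1 and 0.5
smooth = scores @ phi                                    # n x 50

# Rough component: MA(1)-type noise, whose covariance is banded
# with bandwidth 1 (nonzero only for |i - j| <= 1).
e = rng.normal(size=(n, 52))
rough = 0.3 * (e[:, 1:-1] + 0.5 * e[:, :-2])

X = smooth + rough
C = np.cov(X, rowvar=False)  # raw 50 x 50 empirical covariance

# Outside the band, C should match the rank-2 smooth covariance.
C_smooth = phi.T @ np.diag([1.0, 0.25]) @ phi
off_band = np.abs(np.subtract.outer(np.arange(50), np.arange(50))) > 1
off_band_err = np.max(np.abs((C - C_smooth)[off_band]))
print(off_band_err)          # small: only sampling noise remains off-band
```

Recovering the in-band entries of the smooth covariance from the off-band ones is then exactly a low-rank matrix completion problem, which is the reformulation the thesis exploits.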

A functional time series is a temporally ordered sequence of not necessarily independent random curves. While the statistical analysis of such data has traditionally been carried out under the assumption of completely observed functional data, it may well happen that the statistician only has access to a relatively small number of sparse measurements for each random curve. These discrete measurements may moreover be irregularly scattered in each curve's domain, missing altogether for some curves, and contaminated by measurement noise. This sparse sampling protocol lies beyond the reach of established estimators in functional time series analysis and therefore requires the development of a novel methodology.
The core objective of this thesis is the development of a non-parametric statistical toolbox for the analysis of sparsely observed functional time series data. Assuming smoothness of the latent curves, we construct a local-polynomial-smoother-based estimator of the spectral density operator, producing a consistent estimator of the complete second-order structure of the data. Moreover, the spectral-domain recovery approach allows for prediction of the latent curve data at a given time by borrowing strength from the estimated dynamic correlations across the entire time series. Beyond predicting the latent curves from their noisy point samples, the method fills in gaps in the sequence (curves nowhere sampled), denoises the data, and serves as a basis for forecasting.
A classical non-parametric apparatus for encoding the dependence between a pair of, or among multiple, functional time series, whether sparsely or fully observed, is the functional lagged regression model. This consists of a linear filter between the regressor time series and the response. We show how to tailor the smoother-based estimators to the estimation of the cross-spectral density operators and the cross-covariance operators and, by means of spectral truncation and Tikhonov regularisation techniques, how to estimate the lagged regression filter and predict the response process.
The simulation studies revealed the following findings: (i) if one is free to design a sampling scheme with a fixed number of measurements, it is advantageous, for reducing the spectral density estimation error, to distribute these measurements sparsely over a longer time horizon rather than concentrating them densely over a shorter one; (ii) the developed functional recovery predictor surpasses the static predictor that does not exploit the temporal dependence; (iii) neither of the two considered regularisation techniques can, in general, dominate the other for estimation in functional lagged regression models. The new methodologies are illustrated by applications to real data: meteorological data revolving around the fair-weather atmospheric electricity measured in Tashkent, Uzbekistan, and at Wank mountain, Germany; and a case study analysing the dependence of the US Treasury yield curve on macroeconomic variables.
As a secondary contribution, we present a novel simulation method for general stationary functional time series defined through their spectral properties. A simulation study shows the universality of this approach and the superiority of spectral-domain simulation over temporal-domain simulation in some situations.
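The spectral estimation step has a familiar scalar analogue: the smoothed periodogram is a consistent estimator of the spectral density of an ordinary, fully observed time series. A sketch under an AR(1) model whose spectral density is known in closed form (the model, bandwidth m, and evaluation frequency are illustrative choices, not taken from the thesis):

```python
import numpy as np

rng = np.random.default_rng(2)

def smoothed_periodogram(x, m):
    """Average the periodogram over 2m+1 neighbouring Fourier frequencies."""
    n = len(x)
    I = np.abs(np.fft.fft(x - x.mean()))**2 / (2 * np.pi * n)  # periodogram
    Iext = np.concatenate([I[-m:], I, I[:m]])  # circular extension
    return np.convolve(Iext, np.ones(2 * m + 1) / (2 * m + 1), mode='valid')

# AR(1) series x_t = 0.5 x_{t-1} + e_t with unit-variance innovations;
# its spectral density is f(w) = 1 / (2*pi*|1 - 0.5*exp(-iw)|^2).
n = 4096
e = rng.normal(size=n + 200)
x = np.zeros(n + 200)
for s in range(1, n + 200):
    x[s] = 0.5 * x[s - 1] + e[s]
x = x[200:]                      # drop burn-in

w = np.pi / 2                    # evaluate at frequency pi/2 (index n/4)
f_true = 1 / (2 * np.pi * abs(1 - 0.5 * np.exp(-1j * w))**2)
f_hat = smoothed_periodogram(x, m=50)[n // 4]
print(f_true, f_hat)             # the two values should be close
```

Averaging over 2m+1 ordinates shrinks the variance by a factor of order 1/(2m+1); letting m grow with n (while m/n → 0) yields consistency, the same bias-variance trade-off the smoother-based operator estimators navigate.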

This work is about time series of functional data (functional time series), and consists of three main parts. In the first part (Chapter 2), we develop a doubly spectral decomposition for functional time series that generalizes the Karhunen–Loève expansion. In the second part (Chapter 3), we develop the theory of estimation for the spectral density operators, which are the main tool involved in the doubly spectral decomposition. The third part (Chapter 4) is concerned with the problem of understanding and comparing the dynamics of DNA. It proposes a methodology for comparing the dynamics of DNA minicircles that are vibrating in solution, using tools developed in this thesis. In the first part, we develop a doubly spectral representation of a stationary functional time series that generalizes the Karhunen–Loève expansion to the functional time series setting. The representation decomposes the time series into an integral of uncorrelated frequency components (Cramér representation), each of which is in turn expanded in a Karhunen–Loève series, thus yielding a Cramér–Karhunen–Loève decomposition of the series. The construction is based on the spectral density operators (whose Fourier coefficients are the lag-t autocovariance operators), which characterise the second-order dynamics of the process. The spectral density operators are the functional analogues of the spectral density matrices, whose eigenvalues and eigenfunctions at different frequencies provide the building blocks of the representation. By truncating the representation at a finite level, we obtain a harmonic principal component analysis of the time series: an optimal finite-dimensional reduction of the time series that captures both the temporal dynamics of the process and the within-curve dynamics, and dominates functional PCA. The proofs rely on the construction of a stochastic integral of operator-valued functions, whose construction is similar to that of the Itô integral.
In practice, the spectral density operators are unknown. In the second part, we therefore develop the basic theory of a frequency domain framework for drawing statistical inferences on the spectral density operators of a stationary functional time series. Our main tool is the functional Discrete Fourier Transform (fDFT). We derive an asymptotic Gaussian representation of the fDFT, thus allowing the transformation of the original collection of dependent random functions into a collection of approximately independent complex-valued Gaussian random functions. Our results are then employed to construct estimators of the spectral density operators based on smoothed versions of the periodogram kernel, the functional generalisation of the periodogram matrix. The consistency and asymptotic law of these estimators are studied in detail. As immediate consequences, we obtain central limit theorems for the mean and the long-run covariance operator of a stationary functional time series. Our results do not depend on structural modelling assumptions, but only on functional versions of classical cumulant mixing conditions. The effect of discrete noisy observations on the consistency of the estimators is studied in a framework general enough to apply to a wide range of smoothing techniques for converting discrete noisy observations into functional data. We also perform a simulation study to assess the finite-sample performance of our estimators, discuss the technical assumptions of our results and at what cost our weak dependence assumptions could be changed or weakened, and provide examples of processes satisfying the technical assumptions of our asymptotic results. As an application, we consider in the third part the problem of comparing the dynamics of the trajectories of two DNA minicircles vibrating in solution, which are obtained via Molecular Dynamics simulations.
The approach we take is to view and compare the dynamics through their spectral density operators, which contain the entire second-order structure of the trajectories. As a first step, we compare the spectral density operators of the two DNA minicircles using a new test we develop, which allows us to compare the spectral density operators at a fixed frequency. Using multiple testing procedures, we are able to localize in frequency the differences between the spectral density operators of the two DNA minicircles, while controlling the type-I error, and we conduct numerical simulations to assess the performance of our method. We further investigate the differences between the two minicircles by comparing their spectral density operators within frequencies. This allows us to localize their differences both in frequency and along the minicircles, while controlling the averaged false discovery rate over the selected frequencies. Our methodology is general enough to be applied to the comparison of the dynamics of any pair of stationary functional time series.
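The test statistic used in the thesis is not reproduced here; as a rough scalar analogue of comparing spectral densities at fixed frequencies with family-wise error control, one can compare smoothed periodograms of two independent series via a two-sided ratio statistic, calibrating a max-over-frequencies critical value by Monte Carlo under the null. Every concrete choice below (series models, bandwidth, tested frequencies) is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(3)

def smoothed_pgram(x, m):
    """Periodogram averaged over 2m+1 neighbouring Fourier frequencies."""
    n = len(x)
    I = np.abs(np.fft.fft(x - x.mean()))**2 / n
    Iext = np.concatenate([I[-m:], I, I[:m]])
    return np.convolve(Iext, np.ones(2 * m + 1) / (2 * m + 1), mode='valid')

n, m = 2048, 32
freqs = np.arange(64, n // 2, 128)   # the fixed frequencies being tested

def ratio_stat(x, y):
    fx, fy = smoothed_pgram(x, m)[freqs], smoothed_pgram(y, m)[freqs]
    r = fx / fy
    return np.maximum(r, 1 / r)      # two-sided: large if spectra differ

# Null calibration: both series white noise, hence equal spectra; the 95%
# quantile of the max over frequencies controls the family-wise error.
null_max = np.array([ratio_stat(rng.normal(size=n),
                                rng.normal(size=n)).max()
                     for _ in range(500)])
crit = np.quantile(null_max, 0.95)

# Two series with genuinely different spectra: white noise vs AR(1).
x = rng.normal(size=n)
e = rng.normal(size=n + 100)
y = np.zeros(n + 100)
for s in range(1, n + 100):
    y[s] = 0.6 * y[s - 1] + e[s]
stat = ratio_stat(x, y[100:])
rejected = stat > crit               # frequencies flagged as different
print(crit, rejected)
```

The AR(1) spectrum is largest at low frequencies, so the low-frequency entries of `rejected` are the ones expected to be flagged.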