How can we discern whether the covariance operator of a stochastic process is of reduced rank, and if so, what its precise rank is? And how can we do so at a given level of confidence? This question is central to a great many methods for functional data, which require low-dimensional representations, whether by functional PCA or other methods. The difficulty is that the determination is to be made on the basis of i.i.d. replications of the process observed discretely and with measurement error contamination. This adds a ridge to the empirical covariance, obfuscating the underlying dimension. We build a matrix-completion inspired test statistic that circumvents this issue by measuring the best possible least squares fit of the empirical covariance's off-diagonal elements, optimised over covariances of given finite rank. For a fixed grid of sufficiently large size, we determine the statistic's asymptotic null distribution as the number of replications grows. We then use it to construct a bootstrap implementation of a stepwise testing procedure controlling the familywise error rate corresponding to the collection of hypotheses formalising the question at hand. Under minimal regularity assumptions, we prove that the procedure is consistent and that its bootstrap implementation is valid. The procedure circumvents smoothing and associated smoothing parameters, is indifferent to measurement error heteroskedasticity, and does not assume a low-noise regime. An extensive simulation study reveals excellent practical performance, stable across a wide range of settings, and the procedure is further illustrated by means of two data analyses.
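The off-diagonal least squares fit described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `offdiag_rank_fit`, the simulated data, the L-BFGS-B optimiser with random restarts, and the absence of any scaling are all assumptions made for illustration, and neither the asymptotic null distribution nor the bootstrap stepwise procedure is reproduced here.

```python
import numpy as np
from scipy.optimize import minimize


def offdiag_rank_fit(cov, r, n_starts=5, seed=0):
    """Smallest off-diagonal squared residual between cov and C @ C.T,
    where C is K x r, so C @ C.T is a covariance of rank at most r.
    Hypothetical helper: a local optimiser with a few random restarts."""
    K = cov.shape[0]
    mask = ~np.eye(K, dtype=bool)  # ignore the noise-contaminated diagonal
    rng = np.random.default_rng(seed)

    def objective(theta):
        C = theta.reshape(K, r)
        resid = (cov - C @ C.T) * mask
        value = np.sum(resid ** 2)
        grad = -4.0 * resid @ C  # gradient of the masked squared residual
        return value, grad.ravel()

    best = np.inf
    for _ in range(n_starts):
        start = rng.standard_normal(K * r)
        res = minimize(objective, start, jac=True, method="L-BFGS-B")
        best = min(best, res.fun)
    return best


# Simulated example (assumed setup): rank-2 process on a grid of K points,
# observed with heteroskedastic measurement error.
rng = np.random.default_rng(1)
n, K, true_rank = 2000, 12, 2
t = np.linspace(0, 1, K)
basis = np.column_stack([np.sin(np.pi * t), np.cos(np.pi * t)])
scores = rng.standard_normal((n, true_rank)) * np.array([2.0, 1.0])
noise = rng.standard_normal((n, K)) * rng.uniform(0.5, 1.5, size=K)
X = scores @ basis.T + noise

emp_cov = np.cov(X, rowvar=False)
stats = {r: offdiag_rank_fit(emp_cov, r) for r in (1, 2, 3)}
for r, value in stats.items():
    print(f"rank {r}: off-diagonal residual = {value:.4f}")
```

In this sketch the residual should drop sharply once the candidate rank reaches the true rank, because the measurement error only affects the diagonal, which the fit ignores. Turning the residual into a test at a given confidence level requires the null distribution and bootstrap calibration developed in the paper.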
Victor Panaretos, Yoav Zemel, Valentina Masarotto
Daniel Kuhn, Yves Rychener, Viet Anh Nguyen