# Functional data analysis by matrix completion

Abstract

Traditional approaches to analysing functional data typically follow a two-step procedure: first smoothing, then carrying out a functional principal component analysis. The idea underlying this procedure is that functional data are well approximated by smooth functions, and that rough variations are due to noise. However, it may very well happen that localised features are rough at a global scale but still smooth at some finer scale. In this thesis we put forward a new statistical approach for functional data arising as the sum of two uncorrelated components: one smooth and one rough. We give non-parametric conditions under which the covariance operators of the smooth and rough components are jointly identifiable on the basis of discretely observed data: the covariance operator of the smooth component must be of finite rank and have real analytic eigenfunctions, while that of the rough component must have a banded covariance function. We construct consistent estimators of both covariance operators without assuming knowledge of the true rank or bandwidth, and use them to estimate the best linear predictors of the smooth and rough components of each functional datum. In both the identifiability and the inference parts, we depart from the usual strategy in functional data analysis, which is to first smooth and then work with a continuous estimate of the covariance operator. Instead, we work directly with the covariance matrix of the discretely observed data, which allows us to use results and tools from linear algebra. In fact, we show that the problem of uniquely recovering the covariance operator of the smooth component from that of the raw data can be cast as a low-rank matrix completion problem, which we solve by exploiting a classical relation between the rank and the minors of a matrix. The finite-sample performance of our approach is studied by means of a simulation study.
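The decomposition at the heart of the abstract can be illustrated numerically. The following is a minimal sketch with made-up parameters (grid size, rank, bandwidth), not the thesis's estimator: it builds a covariance matrix as the sum of a low-rank part with analytic eigenfunctions and a banded part, and checks that off the band the raw covariance already agrees with the low-rank part, which is precisely the set of entries a low-rank matrix completion can exploit.

```python
import numpy as np

# Toy model (illustrative parameters, not from the thesis): on a grid of
# K points, the raw covariance is C = S + B, with S low-rank ("smooth")
# and B banded ("rough").
K = 50
t = np.linspace(0, 1, K)

# Rank-2 smooth covariance built from two analytic eigenfunctions.
phi1 = np.sqrt(2) * np.sin(np.pi * t)
phi2 = np.sqrt(2) * np.sin(2 * np.pi * t)
S = 3.0 * np.outer(phi1, phi1) + 1.0 * np.outer(phi2, phi2)

# Banded rough covariance: entries vanish beyond bandwidth delta.
delta = 3
i, j = np.indices((K, K))
B = np.where(np.abs(i - j) <= delta, 0.5 * np.exp(-np.abs(i - j)), 0.0)

C = S + B  # covariance matrix of the discretely observed raw data

# Off the band, C coincides with S, so recovering S from those entries
# is a low-rank matrix completion problem.
off_band = np.abs(i - j) > delta
print(np.allclose(C[off_band], S[off_band]))  # True
print(np.linalg.matrix_rank(S))               # 2
```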

Related publications (16)

Related MOOCs (16)

Related concepts (23)

Optimization: principles and algorithms - Linear optimization

Introduction to linear optimization, duality and the simplex algorithm.

Optimization: principles and algorithms - Network and discrete optimization

Introduction to network optimization and discrete optimization.

In statistics, a consistent estimator or asymptotically consistent estimator is an estimator—a rule for computing estimates of a parameter θ0—having the property that as the number of data points used increases indefinitely, the resulting sequence of estimates converges in probability to θ0. This means that the distributions of the estimates become more and more concentrated near the true value of the parameter being estimated, so that the probability of the estimator being arbitrarily close to θ0 converges to one.
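A quick numerical illustration of this definition (a sketch with an assumed Gaussian model, not taken from the page): the sample mean is a consistent estimator of the true mean, and its error shrinks as the sample grows.

```python
import numpy as np

# Consistency in action: the sample mean estimates theta0, and the
# estimation error shrinks (roughly like 1/sqrt(n)) as n grows.
rng = np.random.default_rng(0)
theta0 = 2.0

for n in (100, 10_000, 1_000_000):
    x = rng.normal(loc=theta0, scale=1.0, size=n)
    print(n, abs(x.mean() - theta0))
```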

In probability theory and statistics, a covariance matrix (also known as auto-covariance matrix, dispersion matrix, variance matrix, or variance–covariance matrix) is a square matrix giving the covariance between each pair of elements of a given random vector. Any covariance matrix is symmetric and positive semi-definite and its main diagonal contains variances (i.e., the covariance of each element with itself). Intuitively, the covariance matrix generalizes the notion of variance to multiple dimensions.
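These properties are easy to verify empirically. The following short sketch (random data of my own choosing) checks symmetry, the variance diagonal, and positive semi-definiteness of a sample covariance matrix.

```python
import numpy as np

# Check the stated properties on a random sample: the empirical
# covariance matrix is symmetric, its diagonal holds the variances,
# and it is positive semi-definite (all eigenvalues >= 0).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))          # 200 observations of a 3-dim vector
C = np.cov(X, rowvar=False)            # 3 x 3 covariance matrix

print(np.allclose(C, C.T))                             # symmetric
print(np.allclose(np.diag(C), X.var(axis=0, ddof=1)))  # diagonal = variances
print(np.all(np.linalg.eigvalsh(C) >= -1e-12))         # PSD up to rounding
```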

Functional data analysis (FDA) is a branch of statistics that analyses data providing information about curves, surfaces or anything else varying over a continuum. In its most general form, under an FDA framework, each sample element of functional data is considered to be a random function. The physical continuum over which these functions are defined is often time, but may also be spatial location, wavelength, probability, etc. Intrinsically, functional data are infinite dimensional.
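A minimal sketch of this viewpoint (toy data, assumed sine basis): each datum is a random function, but in practice it is recorded only at finitely many grid points, which is the discretely observed setting the thesis works in.

```python
import numpy as np

# Toy functional data: each datum is a random function X_i(t), here a
# random combination of two smooth basis functions, observed only at K
# grid points and perturbed by a rough component.
rng = np.random.default_rng(2)
K, n = 100, 5
t = np.linspace(0, 1, K)

# Random smooth curves: random coefficients on two sine basis functions.
basis = np.vstack([np.sin(np.pi * t), np.sin(2 * np.pi * t)])
coefs = rng.normal(size=(n, 2)) * np.array([1.5, 0.5])
smooth = coefs @ basis

# Observed data = smooth part + rough perturbation at the grid points.
X = smooth + 0.1 * rng.normal(size=(n, K))
print(X.shape)  # (5, 100): n curves, each observed at K grid points
```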

This thesis focuses on non-parametric covariance estimation for random surfaces, i.e. functional data on a two-dimensional domain. Non-parametric covariance estimation lies at the heart of functional …

Victor Panaretos, Tomas Masák, Tomas Rubin

Nonparametric inference for functional data over two-dimensional domains entails additional computational and statistical challenges, compared to the one-dimensional case. Separability of the covariance …

The problem of covariance estimation for replicated surface-valued processes is examined from the functional data analysis perspective. Considerations of statistical and computational efficiency often …