Publication: Fourier Analysis of Functional Time Series, with Applications to DNA Dynamics

Abstract

This work is about time series of functional data (functional time series), and consists of three main parts. In the first part (Chapter 2), we develop a doubly spectral decomposition for functional time series that generalizes the Karhunen–Loève expansion. In the second part (Chapter 3), we develop the theory of estimation for the spectral density operators, which are the main tool involved in the doubly spectral decomposition. The third part (Chapter 4) is concerned with the problem of understanding and comparing the dynamics of DNA. It proposes a methodology for comparing the dynamics of DNA minicircles that are vibrating in solution, using tools developed in this thesis.

In the first part, we develop a doubly spectral representation of a stationary functional time series that generalizes the Karhunen–Loève expansion to the functional time series setting. The representation decomposes the time series into an integral of uncorrelated frequency components (Cramér representation), each of which is in turn expanded in a Karhunen–Loève series, thus yielding a Cramér–Karhunen–Loève decomposition of the series. The construction is based on the spectral density operators, whose Fourier coefficients are the lag-t autocovariance operators and which characterise the second-order dynamics of the process. The spectral density operators are the functional analogues of spectral density matrices, and their eigenvalues and eigenfunctions at different frequencies provide the building blocks of the representation. By truncating the representation at a finite level, we obtain a harmonic principal component analysis of the time series: an optimal finite-dimensional reduction that captures both the temporal dynamics of the process and the within-curve dynamics, and that dominates functional PCA. The proofs rely on a stochastic integral of operator-valued functions, whose construction is similar to that of the Itô integral.
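For orientation, the objects named in this part can be written out as follows. This is a sketch under assumed notation, not necessarily the notation of the thesis: $X_t$ is taken to be a zero-mean stationary series in a Hilbert space, $(u \otimes v) f = \langle f, v \rangle u$, $\mathscr{R}_t$ denotes the lag-$t$ autocovariance operator, $\mathscr{F}_\omega$ the spectral density operator with eigenvalues $\lambda_n(\omega)$ and eigenfunctions $\varphi_n^{\omega}$, and $Z_\omega$ the orthogonal-increment process of the Cramér representation. The $1/(2\pi)$ normalisation is one common convention.

```latex
\begin{align*}
\mathscr{R}_t &= \mathbb{E}\left[ X_t \otimes X_0 \right], \qquad t \in \mathbb{Z}, \\
\mathscr{F}_\omega &= \frac{1}{2\pi} \sum_{t \in \mathbb{Z}} e^{-\mathrm{i}\omega t}\, \mathscr{R}_t,
\qquad
\mathscr{R}_t = \int_{-\pi}^{\pi} e^{\mathrm{i}\omega t}\, \mathscr{F}_\omega \, d\omega, \\
\mathscr{F}_\omega &= \sum_{n \ge 1} \lambda_n(\omega)\, \varphi_n^{\omega} \otimes \varphi_n^{\omega}, \\
X_t &= \int_{-\pi}^{\pi} e^{\mathrm{i}\omega t}\, dZ_\omega
     = \int_{-\pi}^{\pi} e^{\mathrm{i}\omega t} \sum_{n \ge 1}
       \left( \varphi_n^{\omega} \otimes \varphi_n^{\omega} \right) dZ_\omega .
\end{align*}
```

In this notation, keeping only the first $K$ terms of the inner sum at each frequency corresponds to the finite-level truncation described above.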
In practice, the spectral density operators are unknown. In the second part, we therefore develop the basic theory of a frequency domain framework for drawing statistical inferences on the spectral density operators of a stationary functional time series. Our main tool is the functional Discrete Fourier Transform (fDFT). We derive an asymptotic Gaussian representation of the fDFT, thus allowing the transformation of the original collection of dependent random functions into a collection of approximately independent complex-valued Gaussian random functions. Our results are then employed in order to construct estimators of the spectral density operators based on smoothed versions of the periodogram kernel, the functional generalisation of the periodogram matrix. The consistency and asymptotic law of these estimators are studied in detail. As immediate consequences, we obtain central limit theorems for the mean and the long-run covariance operator of a stationary functional time series. Our results do not depend on structural modeling assumptions, but only on functional versions of classical cumulant mixing conditions. The effect of discrete noisy observations on the consistency of the estimators is studied in a framework general enough to apply to a wide range of smoothing techniques for converting discrete noisy observations into functional data. We also perform a simulation study to assess the finite sample performance of our estimators. Finally, we discuss the technical assumptions of our results, including the cost at which our weak dependence assumptions could be changed or weakened, and provide examples of processes satisfying the technical assumptions of our asymptotic results.

As an application, we consider in the third part the problem of comparing the dynamics of the trajectories of two DNA minicircles that are vibrating in solution, which are obtained via Molecular Dynamics simulations. The approach we take is to view and compare the dynamics through their spectral density operators, which contain the entire second-order structure of the trajectories. As a first step, we compare the spectral density operators of the two DNA minicircles using a new test we develop, which allows us to compare the spectral density operators at a fixed frequency. Using multiple testing procedures, we are able to localize in frequencies the differences in spectral density operators of the two DNA minicircles, while controlling a type-I error rate. We conduct numerical simulations to assess the performance of our method. We further investigate the differences between the two minicircles by comparing their spectral density operators within frequencies. This allows us to localize their differences both in frequencies and on the minicircles, while controlling the averaged false discovery rate over the selected frequencies. Our methodology is general enough to be applied to the comparison of the dynamics of any pair of stationary functional time series.
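The estimation idea described in the second part (fDFT, periodogram kernel, smoothing over neighbouring frequencies) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it assumes the curves are already centered and observed on a common grid of `m` points, and the function names (`fdft`, `smoothed_periodogram`), the triangular weights, and the `half_width` parameter are hypothetical choices made for illustration.

```python
import numpy as np


def fdft(X):
    """Functional DFT of discretized curves (illustrative sketch).

    X is a (T, m) array: row t holds curve X_t evaluated on m grid points.
    Curves are assumed to be centered already. Row k of the result is the
    fDFT at the Fourier frequency 2*pi*k/T, normalised by (2*pi*T)**-0.5.
    """
    T = X.shape[0]
    return np.fft.fft(X, axis=0) / np.sqrt(2 * np.pi * T)


def smoothed_periodogram(X, k, half_width):
    """Estimate the spectral density kernel at frequency 2*pi*k/T.

    Averages the periodogram kernels d d^* over the 2*half_width + 1
    Fourier frequencies around index k, using triangular weights
    (an assumed choice; other weight functions are possible).
    """
    T, m = X.shape
    D = fdft(X)
    offsets = np.arange(-half_width, half_width + 1)
    weights = 1.0 - np.abs(offsets) / (half_width + 1)
    weights /= weights.sum()
    F = np.zeros((m, m), dtype=complex)
    for w, j in zip(weights, offsets):
        d = D[(k + j) % T]               # wrap around: frequencies are periodic
        F += w * np.outer(d, d.conj())   # periodogram kernel at this frequency
    return F


# Usage: white-noise curves, whose spectral density is flat in frequency.
rng = np.random.default_rng(0)
X = rng.standard_normal((512, 5))
F_hat = smoothed_periodogram(X, k=100, half_width=20)
eigenvalues = np.linalg.eigvalsh(F_hat)  # real, since F_hat is Hermitian
```

Because the estimate is a non-negatively weighted sum of rank-one Hermitian matrices, it is Hermitian and positive semi-definite by construction; its eigendecomposition at each frequency is the discretized counterpart of the building blocks used in the harmonic principal component analysis of the first part.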

Related concepts (43)

Time series

Example of a data visualization showing a medium- and long-term warming trend, based on time series of temperatures by country (here grouped by continent, from north

Fourier analysis

In mathematics, Fourier analysis (/ˈfʊrieɪ, -iər/) is the study of the way general functions may be represented or approximated by sums of simpler trigonometric functions. Fourier analysis grew from

Central limit theorem

The normal distribution, often called the "bell curve".
The central limit theorem (in French also called théorème limite central, théorème de la limite centrale or théorème de la limite cent

Related publications (53)

Victor Panaretos, Shahin Tavakoli

We develop the basic building blocks of a frequency domain framework for drawing statistical inferences on the second-order structure of a stationary sequence of functional data. The key element in such a context is the spectral density operator, which generalises the notion of a spectral density matrix to the functional setting, and characterises the second-order dynamics of the process. Our main tool is the functional Discrete Fourier Transform (fDFT). We derive an asymptotic Gaussian representation of the fDFT, thus allowing the transformation of the original collection of dependent random functions into a collection of approximately independent complex-valued Gaussian random functions. Our results are then employed in order to construct estimators of the spectral density operator based on smoothed versions of the periodogram kernel, the functional generalisation of the periodogram matrix. The consistency and asymptotic law of these estimators are studied in detail. As immediate consequences, we obtain central limit theorems for the mean and the long-run covariance operator of a stationary functional time series. Our results do not depend on structural modelling assumptions, but only on functional versions of classical cumulant mixing conditions, and are shown to be stable under discrete observation of the individual curves.

This paper is concerned with frequency domain theory for functional time series, which are temporally dependent sequences of functions in a Hilbert space. We consider a variance decomposition that is more suitable for such a data structure than the variance decomposition based on the Karhunen–Loève expansion. The decomposition we study uses eigenvalues of spectral density operators, which are functional analogs of the spectral density of a stationary scalar time series. We propose estimators of the variance components and derive convergence rates for their mean square error as well as their asymptotic normality. The latter is derived from a frequency domain invariance principle for the estimators of the spectral density operators. This principle, a main contribution of the paper, is established for a broad class of linear time series models.
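One way to see how eigenvalues of spectral density operators can carry a variance decomposition is the identity below. It is a sketch under assumed notation and may differ from the exact decomposition studied in the paper: $X_t$ is a zero-mean stationary series, $\mathscr{R}_0$ its lag-zero covariance operator, and $\mathscr{F}_\omega$ its spectral density operator with eigenvalues $\lambda_n(\omega)$, normalised so that $\mathscr{R}_0 = \int_{-\pi}^{\pi} \mathscr{F}_\omega \, d\omega$.

```latex
\begin{equation*}
\mathbb{E}\,\| X_t \|^2
  = \operatorname{tr}(\mathscr{R}_0)
  = \int_{-\pi}^{\pi} \operatorname{tr}(\mathscr{F}_\omega)\, d\omega
  = \sum_{n \ge 1} \int_{-\pi}^{\pi} \lambda_n(\omega)\, d\omega .
\end{equation*}
```

Each term $\int_{-\pi}^{\pi} \lambda_n(\omega)\, d\omega$ can then be read as the share of total variance attributed to the $n$-th frequency-dependent component.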

Functional time series analysis, whether based on time or frequency domain methodology, has traditionally been carried out under the assumption of complete observation of the constituent series of curves, assumed stationary. Nevertheless, as is often the case with independent functional data, it may well happen that the data available to the analyst are not the actual sequence of curves, but relatively few and noisy measurements per curve, potentially at different locations in each curve's domain. Under this sparse sampling regime, neither the established estimators of the time series' dynamics nor their corresponding theoretical analysis will apply. The subject of this paper is to tackle the problem of estimating the dynamics and of recovering the latent process of smooth curves in the sparse regime. Assuming smoothness of the latent curves, we construct a consistent nonparametric estimator of the series' spectral density operator and use it to develop a frequency-domain recovery approach that predicts the latent curve at a given time by borrowing strength from the (estimated) dynamic correlations in the series across time. This new methodology is seen to comprehensively outperform a naive recovery approach that would ignore temporal dependence and use only methodology employed in the i.i.d. setting and hinging on the lag zero covariance. Further to predicting the latent curves from their noisy point samples, the method fills in gaps in the sequence (curves nowhere sampled), denoises the data, and serves as a basis for forecasting. Means of providing corresponding confidence bands are also investigated. A simulation study interestingly suggests that sparse observation for a longer time period may provide better performance than dense observation for a shorter period, in the presence of smoothness. The methodology is further illustrated by application to an environmental data set on fair-weather atmospheric electricity, which naturally leads to a sparse functional time series.