# Extremal behaviour of aggregated data with an application to downscaling

Raphaël Gérard Théodore Michel Marie de Deloÿe et Fourcade de Fondeville, Sebastian Engelke

*OXFORD UNIV PRESS*, 2019

Article

Abstract

The distribution of spatially aggregated data from a stochastic process $X$ may exhibit tail behaviour different from that of its marginal distributions. For a large class of aggregating functionals $\ell$ we introduce the $\ell$-extremal coefficient, which quantifies this difference as a function of the extremal spatial dependence in $X$. We also obtain the joint extremal dependence for multiple aggregation functionals applied to the same process. Formulae for the $\ell$-extremal coefficients and multivariate dependence structures are derived in important special cases. The results provide a theoretical link between the extremal distribution of the aggregated data and the corresponding underlying process, which we exploit to develop a method for statistical downscaling. We apply our framework to downscale daily temperature maxima in the south of France from a gridded dataset and use our model to generate high-resolution maps of the warmest day during the heatwave.
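As a toy illustration of the abstract's first claim, the following sketch compares the tail of a spatial average with the marginal tail under two extreme dependence scenarios. It is not the paper's method or its definition of the extremal coefficient: the Pareto margins, the number of sites, the threshold and the function name `tail_ratio` are all assumptions chosen for illustration. For independent Pareto margins with index `alpha`, the ratio P(mean > x) / P(X > x) is known to approach `n_sites ** (1 - alpha)` as x grows, whereas for fully dependent sites the ratio is exactly 1.

```python
import numpy as np


def tail_ratio(alpha=2.0, n_sites=4, n_samples=1_000_000, quantile=0.999, seed=0):
    """Estimate P(spatial mean > x) / P(X > x) at a high marginal quantile x.

    Returns the ratio for independent sites and for fully dependent sites.
    All settings are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)

    # Classical Pareto(alpha) margins on [1, inf): P(X > x) = x ** (-alpha).
    # NumPy's pareto() draws Lomax variates, so shift by 1.
    indep = rng.pareto(alpha, size=(n_samples, n_sites)) + 1.0

    # Marginal tail probability and the matching theoretical quantile.
    p_marginal = 1.0 - quantile
    x = p_marginal ** (-1.0 / alpha)

    # Independent sites: aggregate by the spatial mean.
    ratio_indep = np.mean(indep.mean(axis=1) > x) / p_marginal

    # Fully dependent sites: every site equals the first one,
    # so the spatial mean is just that site's value.
    ratio_dep = np.mean(indep[:, 0] > x) / p_marginal

    return ratio_indep, ratio_dep


if __name__ == "__main__":
    r_indep, r_dep = tail_ratio()
    print(f"independent sites: {r_indep:.3f}")
    print(f"fully dependent sites: {r_dep:.3f}")
```

The gap between the two ratios shows how the strength of extremal dependence across sites changes the tail of the aggregate relative to the margins, which is the kind of difference the coefficient in the abstract is meant to quantify.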

Related concepts (16)

Stochastic process

A process, random process (see stochastic calculus) or random function (see probability) represents the evolution, in discrete or continuous time, of a random variable. The latter …

Normal distribution

In probability theory and statistics, normal distributions are among the probability distributions most commonly used to model natural phenomena arising from several random events.

Process

The word process comes from the Latin pro ("forward") and cessus, cedere ("to go, to walk"), and thus means to go forward, to advance. This word is also the origin of the word …

Related publications (16)

This thesis is a contribution to multivariate extreme value statistics. The tail of a multivariate distribution function is characterized by its spectral distribution, for which we propose a new semi-parametric model based on mixtures of Dirichlet distributions. To estimate the components of this model, reversible jump Markov chain Monte Carlo and EM algorithms are developed. Their performance is illustrated on real and simulated data, obtained using new representations of the extremal logistic and Dirichlet models. In parallel with the estimation of the spectral distribution, the machinery of extreme value statistics requires the selection of a threshold in order to classify data as extreme or not. This selection is achieved by a new method based on heuristic arguments. It allows a selection independent of the dimension of the data. Its performance is illustrated on real and simulated data. The primary scientific interests behind a multivariate extreme value analysis lie in the estimation of quantiles of rare events and in the exploration of the dependence structure, for which the estimation of the spectral measure is a means rather than an end. These two issues are addressed. For the first, a Monte Carlo method is developed based on simulation of extremes. It is compared with classical and new methods from the literature. For the second, an original conditional dependence analysis is proposed, which sheds light on various aspects of the dependence structure of the data. Examples using real data sets are given. In the last part, the semi-parametric model and the presented methods are extended to spatial extremes. This is made possible by considering the spectral distribution as the distribution of a random probability, an original viewpoint adopted throughout this thesis. Classical multivariate extremes are extended to extremes of random measures. The application is illustrated on rainfall data in China.

Simone Padoan, Stefano Rizzelli

The classical multivariate extreme-value theory concerns the modeling of extremes in a multivariate random sample, suggesting the use of max-stable distributions. In this work, the classical theory is extended to the case where aggregated data, such as maxima of a random number of observations, are considered. We derive a limit theorem concerning the attractors for the distributions of the aggregated data, which boil down to a new family of max-stable distributions. We also connect the extremal dependence structure of classical max-stable distributions and that of our new family of max-stable distributions. Using an inversion method, we derive a semiparametric composite-estimator for the extremal dependence of the unobservable data, starting from a preliminary estimator of the extremal dependence of the aggregated data. Furthermore, we develop the large-sample theory of the composite-estimator and illustrate its finite-sample performance via a simulation study.

The increasing interest in using statistical extreme value theory to analyse environmental data is mainly driven by the large impact extreme events can have. A difficulty with spatial data is that most existing inference methods for asymptotically justified models for extremes are computationally intractable for data at several hundreds of sites, a number easily attained or surpassed by the output of physical climate models or satellite-based data sets. This thesis does not directly tackle this problem, but it provides some elements that might be useful in doing so. The first part of the thesis contains a pointwise marginal analysis of satellite-based measurements of total column ozone in the northern and southern mid-latitudes. At each grid cell, the r-largest order statistics method is used to analyse extremely low and high values of total ozone, and an autoregressive moving average time series model is used for an analogous analysis of mean values. Both models include the same set of global covariates describing the dynamical and chemical state of the atmosphere. The results show that the influence of the covariates is captured in both the "bulk" and the tails of the statistical distribution of ozone. For some covariates, our results are in good agreement with findings of earlier studies, whereas unprecedented influences are retrieved for two dynamical covariates. The second part concerns the frameworks of multivariate and spatial modelling of extremes. We review one class of multivariate extreme value distributions, the so-called Hüsler–Reiss model, as well as its spatial extension, the Brown–Resnick process. For the former, we provide a detailed discussion of its parameter matrix, including the case of degeneracy, which arises if the correlation matrices of underlying multivariate Gaussian distributions are singular. We establish a simplification for computing the partial derivatives of the exponent function of these two models. As a consequence of the considerably reduced number of terms in each partial derivative, computation time for the multivariate joint density of these models can be reduced, which could be helpful for (composite) likelihood inference. Finally, we propose a new variant of the Brown–Resnick process based on the Karhunen–Loève expansion of its underlying Gaussian process. As an illustration, we use composite likelihood to fit a simplified version of our model to a hindcast data set of wave heights that shows highly dependent extremes.