Publication: Efficient inference for spatial extreme value processes associated to log-Gaussian random functions

Abstract

Max-stable processes arise as the only possible nontrivial limits for maxima of affinely normalized identically distributed stochastic processes, and thus form an important class of models for the extreme values of spatial processes. Until recently, inference for max-stable processes has been restricted to the use of pairwise composite likelihoods, due to intractability of higher-dimensional distributions. In this work we consider random fields that are in the domain of attraction of a widely used class of max-stable processes, namely those constructed via manipulation of log-Gaussian random functions. For this class, we exploit limiting d-dimensional multivariate Poisson process intensities of the underlying process for inference on all d-vectors exceeding a high marginal threshold in at least one component, employing a censoring scheme to incorporate information below the marginal threshold. We also consider the d-dimensional distributions for the equivalent max-stable process, and perform full likelihood inference by exploiting the methods of Stephenson & Tawn (2005), where information on the occurrence times of extreme events is shown to dramatically simplify the likelihood. The Stephenson-Tawn likelihood is in fact simply a special case of the censored Poisson process likelihood. We assess the improvements in inference from both methods over pairwise likelihood methodology by simulation.
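As a rough illustration of what "constructed via manipulation of log-Gaussian random functions" can mean, the sketch below simulates one approximate realization of a max-stable field with unit Fréchet margins through the spectral representation Z(s) = max_i W_i(s) / Γ_i, where the Γ_i are arrival times of a unit-rate Poisson process and W_i(s) = exp(ε_i(s) − σ²/2) for a stationary Gaussian field ε_i. The exponential covariance, the truncation to `n_points` Poisson points, and all function and parameter names are assumptions made for illustration; none of them is taken from the paper.

```python
import numpy as np

def simulate_max_stable(coords, n_points=500, sigma=1.0, range_=1.0, seed=0):
    """Approximate draw of a max-stable field built from log-Gaussian functions.

    coords: array of shape (d, k) holding d sites in k-dimensional space.
    The infinite maximum is truncated to n_points terms (an approximation).
    """
    rng = np.random.default_rng(seed)
    coords = np.asarray(coords, dtype=float)
    d = len(coords)

    # Assumed exponential covariance for the underlying Gaussian field.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    cov = sigma**2 * np.exp(-dist / range_)
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(d))

    # Arrival times of a unit-rate Poisson process (increasing sequence).
    gamma = np.cumsum(rng.exponential(size=n_points))

    # Log-Gaussian spectral functions, normalized so that E[W(s)] = 1.
    eps = rng.standard_normal((n_points, d)) @ chol.T
    w = np.exp(eps - sigma**2 / 2)

    # Pointwise maximum over the truncated set of spectral functions.
    return np.max(w / gamma[:, None], axis=0)

sites = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
z = simulate_max_stable(sites)
```

Because the Γ_i increase, later terms contribute less and less to the maximum, which is why a finite truncation is a common practical approximation; exact simulation algorithms exist but are not sketched here.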

Related publications (29)

Related concepts (22)

Multivariate normal distribution

In probability theory, the multivariate normal distribution, also called the multivariate Gaussian or multinormal distribution, is the probability distribution that is the generalization…

Bayesian inference

Bayesian inference is a method of statistical inference by which one calculates the probabilities…

Random field

In physics and mathematics, a random field is a random function over an arbitrary domain (usually a multi-dimensional space such as $\mathbb{R}^n$). That is, it is a function $f(x)$…

Extreme value analysis is concerned with the modelling of extreme events such as floods and heatwaves, which can have large impacts. Statistical modelling can be useful to better assess risks even if, due to scarcity of measurements, there is inherently very large residual uncertainty in any analysis. Driven by the increase in environmental databases, spatial modelling of extremes has expanded rapidly in the last decade. This thesis presents contributions to such analysis.
The first chapter is about likelihood-based inference in the univariate setting and investigates the use of bias-correction and higher-order asymptotic methods for extremes, highlighting through examples and illustrations the unique challenge posed by data scarcity. We focus on parametric modelling of extreme values, which relies on limiting distributional results and for which, as a result, uncertainty quantification is complicated. We find that, in certain cases, small-sample asymptotic methods can give improved inference by reducing the error rate of confidence intervals. Two data illustrations, linked to assessment of the frequency of extreme rainfall episodes in Venezuela and the analysis of survival of supercentenarians, illustrate the methods developed.
In the second chapter, we review the major methods for the analysis of spatial extremes models. We highlight the similarities and provide a thorough literature review along with novel simulation algorithms. The methods described therein are made available through a statistical software package.
The last chapter focuses on estimation for a Bayesian hierarchical model derived from a multivariate generalized Pareto process. We review approaches for the estimation of censored components in models derived from (log)-elliptical distributions, paying particular attention to the estimation of a high-dimensional Gaussian distribution function via Monte Carlo methods. The impacts of model misspecification and of censoring are explored through extensive simulations and we conclude with a case study of rainfall extremes in Eastern Switzerland.
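To make the phrase "estimation of a high-dimensional Gaussian distribution function via Monte Carlo methods" concrete, here is a minimal sketch of the crudest possible estimator: draw from N(0, Σ) and count the fraction of draws that fall below the upper limits. This is an assumption about the general idea only; the thesis may well rely on more refined schemes (for example importance sampling or quasi-Monte Carlo), and the function name and parameters below are hypothetical.

```python
import numpy as np

def gaussian_cdf_mc(upper, cov, n_samples=200_000, seed=0):
    """Crude Monte Carlo estimate of P(X <= upper) for X ~ N(0, cov).

    Returns the estimate and its binomial standard error.
    """
    rng = np.random.default_rng(seed)
    upper = np.asarray(upper, dtype=float)
    cov = np.asarray(cov, dtype=float)

    # Correlate independent standard normals through the Cholesky factor.
    chol = np.linalg.cholesky(cov)
    x = rng.standard_normal((n_samples, upper.size)) @ chol.T

    # Indicator that every component lies below its upper limit.
    inside = np.all(x <= upper, axis=1)
    p = inside.mean()
    se = np.sqrt(p * (1.0 - p) / n_samples)
    return float(p), float(se)

p_hat, se_hat = gaussian_cdf_mc([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]])
```

For a bivariate standard normal with correlation ρ, the orthant probability P(X₁ ≤ 0, X₂ ≤ 0) equals 1/4 + arcsin(ρ)/(2π), which is 1/3 for ρ = 0.5, so the estimate can be checked against a known value. The relative error of this crude estimator grows quickly when the target probability is small, which is one reason more careful methods are used in high dimensions.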

The thesis is a contribution to extreme-value statistics, more precisely to the estimation of clustering characteristics of extreme values. One summary measure of the tendency to form groups is the inverse average cluster size. In the extreme-value context, this parameter is called the extremal index, and apart from its relation with the size of groups, it appears as an important parameter measuring the effects of serial dependence on extreme levels in time series. Although several methods exist for its estimation in univariate sequences, these methods are only applicable for strictly stationary series satisfying a long-range asymptotic independence condition on extreme levels, cannot take covariates into consideration, and yield only crude estimates for the corresponding multivariate quantity. These are strong restrictions and great drawbacks. In climatic time series, both stationarity and asymptotic independence can be broken, due to climate change and possible long memory of the data, and not including information from simultaneously measured linked variables may lead to inefficient estimation. The thesis addresses these issues. First, we extend the theorem of Ferro and Segers (2003) concerning the distribution of inter-exceedance times: we introduce truncated inter-exceedance times, called K-gaps, and show that they follow the same exponential-point mass mixture distribution as the inter-exceedance times. The maximization of the likelihood built on this distribution yields a simple closed-form estimator for the extremal index. The method can admit covariates and can be applied with smoothing techniques, which allows its use in a nonstationary setting. Simulated and real data examples demonstrate the smooth estimation of the extremal index. The likelihood, based on an assumption of independence of the K-gaps, is misspecified whenever K is too small.
This motivates another contribution of the thesis, the introduction into extreme-value statistics of misspecification tests based on the information matrix. For our likelihood, they are able to detect misspecification from any source, not only those due to a bad choice of the truncation parameter. They provide help also in threshold selection, and show whether the fundamental assumptions of stationarity or asymptotic independence are broken. Moreover, these diagnostic tests are of general use, and could be adapted to many kinds of extreme-value models, which are always approximate. Simulated examples demonstrate the performance of the misspecification tests in the context of extremal index estimation. Two data examples with complex behaviour, one univariate and the other bivariate, offer insight into their power in discovering situations where the fundamental assumptions of the likelihood model are not valid. In the multivariate case, the parameter corresponding to the univariate extremal index is the multivariate extremal index function. As in the univariate case, its appearance is linked to serial dependence in the observed processes. Univariate estimation methods can be applied, but are likely to give crude, unreasonably varying, estimates, and the constraints on the extremal index function implied by the characteristics of the stable tail dependence function are not automatically satisfied. The third contribution of the thesis is the development of methodology based on the M4 approximation of Smith and Weissman (1996), which can be used to estimate the multivariate extremal index, as well as other cluster characteristics. For this purpose, we give a preliminary cluster selection procedure, and approximate the noise on finite levels with a flexible semiparametric model, the Dirichlet mixtures used widely in Bayesian analysis. The model is fitted by the EM algorithm. 
Advantages and drawbacks of the method are discussed using the same univariate and bivariate examples as the likelihood methods.
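The closed-form K-gaps estimator mentioned in this abstract can be sketched as follows. The sketch assumes the likelihood takes the form (1 − θ)^{N₀} θ^{2N₁} exp(−θ q Σ S_i), where S_i = max(T_i − K, 0) are the K-gaps built from the inter-exceedance times T_i, N₀ and N₁ count the zero and positive K-gaps, and q is the exceedance probability. That form is our reading of the "exponential-point mass mixture" and is not spelled out in the abstract, so treat it as an assumption. Setting the derivative of the log-likelihood to zero gives a quadratic in θ whose smaller root is the estimate. The function name and defaults are hypothetical.

```python
import numpy as np

def k_gaps_extremal_index(x, threshold, k=1):
    """Closed-form maximizer of the assumed K-gaps likelihood."""
    x = np.asarray(x, dtype=float)
    exceed_times = np.flatnonzero(x > threshold)
    if exceed_times.size < 2:
        raise ValueError("need at least two exceedances of the threshold")

    q = exceed_times.size / x.size            # empirical exceedance probability
    gaps = np.diff(exceed_times)              # inter-exceedance times T_i
    k_gaps = np.maximum(gaps - k, 0)          # truncated gaps S_i

    n_zero = int(np.sum(k_gaps == 0))
    n_pos = int(np.sum(k_gaps > 0))
    s = q * float(np.sum(k_gaps))
    if s == 0.0:
        return 0.0                            # all K-gaps are zero

    # Smaller root of s*theta^2 - (n_zero + 2*n_pos + s)*theta + 2*n_pos = 0.
    b = n_zero + 2 * n_pos + s
    disc = max(b * b - 8.0 * n_pos * s, 0.0)
    return float((b - np.sqrt(disc)) / (2.0 * s))

rng = np.random.default_rng(1)
e = rng.standard_exponential(50_001)
iid_series = e[1:]
moving_max = np.maximum(e[1:], e[:-1])        # extremal index 0.5 in theory

theta_iid = k_gaps_extremal_index(iid_series, np.quantile(iid_series, 0.98))
theta_mm = k_gaps_extremal_index(moving_max, np.quantile(moving_max, 0.98))
```

On an independent series the estimate should be close to 1, and on the moving-maximum series, whose exceedances arrive in pairs, it should be close to 0.5. With K = 0 no gap is ever truncated to zero and the estimate degenerates to 1, so the choice of K matters, in line with the abstract's remark about misspecification when K is too small.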

Extreme events are responsible for huge material damage and are costly in terms of their human and economic impacts. They strike all facets of modern society, such as physical infrastructure and insurance companies through environmental hazards, banking and finance through stock market crises, and the internet and communication systems through network and server overloads. It is thus of increasing importance to accurately assess the risk of extreme events in order to mitigate them. Extreme value theory is a statistical approach to extrapolation of probabilities beyond the range of the data, which provides a robust framework to learn from an often small number of recorded extreme events.
In this thesis, we consider a conditional approach to modelling extreme values that is more flexible than standard models for simultaneously extreme events. We explore the subasymptotic properties of this conditional approach and prove that in specific situations its finite-sample behaviour can differ significantly from its limit characterisation.
For modelling extremes in time series with short-range dependence, the standard peaks-over-threshold method relies on a pre-processing step that retains only a subset of observations exceeding a high threshold and can result in badly-biased estimates. This method focuses on the marginal distribution of the extremes and does not estimate temporal extremal dependence.
We propose a new methodology to model time series extremes using Bayesian semiparametrics and allowing estimation of functionals of clusters of extremes.
We apply our methodology to model river flow data in England and improve flood risk assessment by explicitly describing extremal dependence in time, using information from all exceedances of a high threshold.
We develop two new bivariate models which are based on the conditional tail approach, and use all observations having at least one extreme component in our inference procedure, thus extracting more information from the data than existing approaches. We compare the efficiency of these models in a simulation study and discuss generalisations to higher-dimensional setups.
Existing models for extremes of Markov chains generally rely on a strong assumption of asymptotic dependence at all lags and separately consider marginal and joint features. We introduce a more flexible model and show how Bayesian semiparametrics can provide a suitable framework allowing simultaneous inference for the margins and the extremal dependence structure, yielding efficient risk estimates and a reliable assessment of uncertainty.