# Likelihood-Based Inference for Max-Stable Processes

Abstract

The last decade has seen max-stable processes emerge as a common tool for the statistical modeling of spatial extremes. However, their application is complicated due to the unavailability of the multivariate density function, and so likelihood-based methods remain far from providing a complete and flexible framework for inference. In this article we develop inferentially practical, likelihood-based methods for fitting max-stable processes derived from a composite-likelihood approach. The procedure is sufficiently reliable and versatile to permit the simultaneous modeling of marginal and dependence parameters in the spatial context at a moderate computational cost. The utility of this methodology is examined via simulation, and illustrated by the analysis of United States precipitation extremes.
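
For orientation, a composite likelihood built from pairs of sites typically takes the generic form below. This is a sketch rather than a formula quoted from the article: the notation is assumed here, with $z_{m,i}$ the observed maximum at site $i$ in block $m$, $n$ sites, $M$ independent blocks, and $f$ a bivariate density of the fitted max-stable model with parameter vector $\theta$.

$$
\ell_p(\theta) = \sum_{m=1}^{M} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \log f\left(z_{m,i}, z_{m,j}; \theta\right)
$$

Only bivariate densities enter this sum, which is what makes the approach workable when the full multivariate density is unavailable.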

Related concepts (9)

Statistical model

A statistical model is an approximate mathematical description of the mechanism that generated the observations, which is assumed to be a stochastic process rather than a deterministic one. …

Statistical inference

Statistical inference is the set of techniques for inferring the characteristics of a general group (the population) …

Maximum likelihood

In statistics, the maximum likelihood estimator is a statistical estimator used to infer the parameters of the probability distribution of a given sample by finding the values …

Related publications (4)

The thesis is a contribution to extreme-value statistics, more precisely to the estimation of clustering characteristics of extreme values. One summary measure of the tendency to form groups is the inverse average cluster size. In the extreme-value context, this parameter is called the extremal index, and apart from its relation with the size of groups, it appears as an important parameter measuring the effects of serial dependence on extreme levels in time series. Although several methods exist for its estimation in univariate sequences, these methods are only applicable for strictly stationary series satisfying a long-range asymptotic independence condition on extreme levels, cannot take covariates into consideration, and yield only crude estimates for the corresponding multivariate quantity. These are strong restrictions and great drawbacks. In climatic time series, both stationarity and asymptotic independence can be broken, due to climate change and possible long memory of the data, and not including information from simultaneously measured linked variables may lead to inefficient estimation. The thesis addresses these issues. First, we extend the theorem of Ferro and Segers (2003) concerning the distribution of inter-exceedance times: we introduce truncated inter-exceedance times, called K-gaps, and show that they follow the same exponential-point mass mixture distribution as the inter-exceedance times. The maximization of the likelihood built on this distribution yields a simple closed-form estimator for the extremal index. The method can admit covariates and can be applied with smoothing techniques, which allows its use in a nonstationary setting. Simulated and real data examples demonstrate the smooth estimation of the extremal index. The likelihood, based on an assumption of independence of the K-gaps, is misspecified whenever K is too small.
This motivates another contribution of the thesis, the introduction into extreme-value statistics of misspecification tests based on the information matrix. For our likelihood, they are able to detect misspecification from any source, not only those due to a bad choice of the truncation parameter. They provide help also in threshold selection, and show whether the fundamental assumptions of stationarity or asymptotic independence are broken. Moreover, these diagnostic tests are of general use, and could be adapted to many kinds of extreme-value models, which are always approximate. Simulated examples demonstrate the performance of the misspecification tests in the context of extremal index estimation. Two data examples with complex behaviour, one univariate and the other bivariate, offer insight into their power in discovering situations where the fundamental assumptions of the likelihood model are not valid. In the multivariate case, the parameter corresponding to the univariate extremal index is the multivariate extremal index function. As in the univariate case, its appearance is linked to serial dependence in the observed processes. Univariate estimation methods can be applied, but are likely to give crude, unreasonably varying, estimates, and the constraints on the extremal index function implied by the characteristics of the stable tail dependence function are not automatically satisfied. The third contribution of the thesis is the development of methodology based on the M4 approximation of Smith and Weissman (1996), which can be used to estimate the multivariate extremal index, as well as other cluster characteristics. For this purpose, we give a preliminary cluster selection procedure, and approximate the noise on finite levels with a flexible semiparametric model, the Dirichlet mixtures used widely in Bayesian analysis. The model is fitted by the EM algorithm. 
Advantages and drawbacks of the method are discussed using the same univariate and bivariate examples as the likelihood methods.
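
The closed-form estimator mentioned in this abstract can be illustrated with a minimal sketch. Several details are assumptions not stated in the text above: the K-gap is taken as max(T - K, 0) for an inter-exceedance time T, the gaps are rescaled by the exceedance probability `q`, and the mixture puts mass 1 - θ at zero with density θ² exp(-θ s) on positive values. Under those assumptions the score equation is a quadratic in θ whose smaller root is the maximizer. The function names are hypothetical.

```python
import math


def k_gaps(exceedance_times, K=1):
    """Truncated inter-exceedance times: max(T - K, 0) for each gap T (assumed definition)."""
    return [max(t_next - t - K, 0)
            for t, t_next in zip(exceedance_times, exceedance_times[1:])]


def extremal_index_mle(exceedance_times, q, K=1):
    """Maximize (1 - theta)^N0 * theta^(2*N1) * exp(-theta * q * sum(gaps)) over theta."""
    if len(exceedance_times) < 2:
        raise ValueError("need at least two exceedances to form a gap")
    gaps = k_gaps(exceedance_times, K)
    n_zero = sum(1 for s in gaps if s == 0)   # N0: gaps inside a cluster
    n_pos = len(gaps) - n_zero                # N1: gaps between clusters
    a = q * sum(gaps)                         # rescaled total gap length
    if n_pos == 0:
        # Degenerate case: the likelihood (1 - theta)^N0 peaks at the boundary.
        return 0.0
    # Score equation: a*theta^2 - (a + N0 + 2*N1)*theta + 2*N1 = 0.
    b = a + n_zero + 2 * n_pos
    return (b - math.sqrt(b * b - 8.0 * a * n_pos)) / (2.0 * a)


# Illustrative (made-up) exceedance times and exceedance probability.
theta_hat = extremal_index_mle([1, 2, 3, 10, 11, 20], q=0.1, K=1)
print(round(theta_hat, 4))
```

The sketch treats the K-gaps as independent, which is exactly the assumption the abstract says fails when K is too small.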

Multiple generalized additive models are a class of statistical regression models wherein parameters of probability distributions incorporate information through additive smooth functions of predictors. The functions are represented by basis function expansions, whose coefficients are the regression parameters. The smoothness is induced by a quadratic roughness penalty on the functions' curvature, which is equivalent to a weighted $L_2$ regularization controlled by smoothing parameters. Regression fitting relies on maximum penalized likelihood estimation for the regression coefficients, and smoothness selection relies on maximum marginal likelihood estimation for the smoothing parameters.
Owing to their nonlinearity, flexibility and interpretability, generalized additive models are widely used in statistical modeling, but despite recent advances, reliable and fast methods for automatic smoothing in massive datasets are unavailable. Existing approaches are either reliable, complex and slow, or unreliable, simpler and fast, so a compromise must be made. A bridge between these categories is needed to extend use of multiple generalized additive models to settings beyond those possible in existing software. This thesis is one step in this direction. We adopt the marginal likelihood approach to develop approximate expectation-maximization methods for automatic smoothing, which avoid evaluation of expensive and unstable terms. This results in simpler algorithms that do not sacrifice reliability and achieve state-of-the-art accuracy and computational efficiency.
We extend the proposed approach to big-data settings and produce the first reliable, high-performance and distributed-memory algorithm for fitting massive multiple generalized additive models. Furthermore, we develop the underlying generic software libraries and make them accessible to the open-source community.
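
To make the penalized-likelihood idea concrete, here is a minimal sketch in the simplest setting: a Gaussian response, a single smooth term, and a fixed smoothing parameter `lam`. It is not the thesis's method (which selects smoothing parameters automatically); the discrete second-difference penalty is used as an assumed stand-in for a curvature penalty, and the function names are hypothetical.

```python
import numpy as np


def second_difference_penalty(p):
    """Quadratic penalty matrix S = D^T D, with D the second-difference operator."""
    D = np.diff(np.eye(p), n=2, axis=0)  # shape (p - 2, p), rows like [1, -2, 1]
    return D.T @ D


def penalized_fit(X, y, lam):
    """Minimize ||y - X b||^2 + lam * b^T S b, i.e. Gaussian penalized likelihood."""
    S = second_difference_penalty(X.shape[1])
    return np.linalg.solve(X.T @ X + lam * S, X.T @ y)


# Illustrative data: identity basis, so the coefficients are the fitted values.
X = np.eye(5)
y = np.array([0.0, 1.0, 0.0, 1.0, 0.0])
print(penalized_fit(X, y, lam=0.0))    # no penalty: reproduces y
print(penalized_fit(X, y, lam=100.0))  # heavy penalty: much smoother fit
```

Larger values of `lam` shrink the fit toward a straight line, which the second-difference penalty leaves unpenalized.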

Extreme value analysis is concerned with the modelling of extreme events such as floods and heatwaves, which can have large impacts. Statistical modelling can be useful to better assess risks even if, due to scarcity of measurements, there is inherently very large residual uncertainty in any analysis. Driven by the increase in environmental databases, spatial modelling of extremes has expanded rapidly in the last decade. This thesis presents contributions to such analysis.
The first chapter is about likelihood-based inference in the univariate setting and investigates the use of bias-correction and higher-order asymptotic methods for extremes, highlighting through examples and illustrations the unique challenge posed by data scarcity. We focus on parametric modelling of extreme values, which relies on limiting distributional results and for which, as a result, uncertainty quantification is complicated. We find that, in certain cases, small-sample asymptotic methods can give improved inference by reducing the error rate of confidence intervals. Two data illustrations, linked to assessment of the frequency of extreme rainfall episodes in Venezuela and the analysis of survival of supercentenarians, illustrate the methods developed.
In the second chapter, we review the major methods for the analysis of spatial extremes models. We highlight the similarities and provide a thorough literature review along with novel simulation algorithms. The methods described therein are made available through a statistical software package.
The last chapter focuses on estimation for a Bayesian hierarchical model derived from a multivariate generalized Pareto process. We review approaches for the estimation of censored components in models derived from (log)-elliptical distributions, paying particular attention to the estimation of a high-dimensional Gaussian distribution function via Monte Carlo methods. The impacts of model misspecification and of censoring are explored through extensive simulations and we conclude with a case study of rainfall extremes in Eastern Switzerland.