Publication: Strong convergence of multivariate maxima

Abstract

It is well known and readily seen that the maximum of n independent random variables uniformly distributed on [0, 1], suitably standardised, converges in total variation distance, as n increases, to the standard negative exponential distribution. We extend this result to higher dimensions by considering copulas. We show that the strong convergence result holds for copulas that are in a differential neighbourhood of a multivariate generalised Pareto copula. Sklar's theorem then implies convergence in variational distance of the maximum of n independent and identically distributed random vectors with arbitrary common distribution function and (under conditions on the marginals) of its appropriately normalised version. We illustrate how these convergence results can be exploited to establish the almost-sure consistency of some estimation procedures for max-stable models, using sample maxima.
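The univariate statement can be checked numerically. If M_n is the maximum of n independent uniform variables on [0, 1], then n(M_n − 1) has density (1 + x/n)^(n−1) on [−n, 0], while the standard negative exponential limit has density e^x on (−∞, 0]. The sketch below is an illustration only and is not taken from the publication: the function name `tv_distance_uniform_max` and the midpoint-rule integration are assumptions made for this example. It approximates the total variation distance as half the integral of the absolute difference of the two densities.

```python
import math


def tv_distance_uniform_max(n, steps=100_000):
    """Approximate the total variation distance between the law of
    n * (M_n - 1), with M_n the maximum of n uniforms on [0, 1], and the
    standard negative exponential law (density e^x on x <= 0).

    Hypothetical helper for illustration: midpoint rule on [-n, 0].
    """
    h = n / steps
    total = 0.0
    for k in range(steps):
        x = -n + (k + 0.5) * h                # midpoint of the k-th cell
        f_n = (1.0 + x / n) ** (n - 1)        # density of n * (M_n - 1)
        g = math.exp(x)                       # limiting density
        total += abs(f_n - g) * h
    # Below -n the standardised maximum has no mass, the limit has e^{-n}.
    total += math.exp(-n)
    return 0.5 * total


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, tv_distance_uniform_max(n))
```

The printed distances shrink as n grows, which is the univariate convergence the abstract starts from; the multivariate copula result is not reproduced here.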

Related concepts (16)

Multivariate normal distribution

In probability theory and statistics, the multivariate normal distribution, multivariate Gaussian distribution, or joint normal distribution is a generalization of the one-dimensional (univariate) no

Random variable

A random variable (also called random quantity, aleatory variable, or stochastic variable) is a mathematical formalization of a quantity or object which depends on random events. The term 'random va

Probability density function

In probability theory, a probability density function (PDF), density function, or density of an absolutely continuous random variable, is a function whose value at any given sample (or point) in th

Related publications (4)

Modern data storage systems are extremely large and consist of several tens or hundreds of nodes. In such systems, node failures are daily events, and safeguarding data from them poses a serious design challenge. The focus of this thesis is on the data reliability analysis of storage systems and, in particular, on the effect of different design choices and parameters on the system reliability. Data redundancy, in the form of replication or advanced erasure codes, is used to protect data from node failures. By storing redundant data across several nodes, the surviving redundant data on surviving nodes can be used to rebuild the data lost by the failed nodes if node failures occur. As these rebuild processes take a finite amount of time to complete, there exists a nonzero probability of additional node failures during rebuild, which eventually may lead to a situation in which some of the data have lost so much redundancy that they become irrecoverably lost from the system. The average time taken by the system to suffer an irrecoverable data loss, also known as the mean time to data loss (MTTDL), is a measure of data reliability that is commonly used to compare different redundancy schemes and to study the effect of various design parameters. The theoretical analysis of MTTDL, however, is a challenging problem for non-exponential real-world failure and rebuild time distributions and for general data placement schemes. To address this issue, a methodology for reliability analysis is developed in this thesis that is based on the probability of direct path to data loss during rebuild. The reliability analysis is detailed in the sense that it accounts for the rebuild times involved, the amounts of partially rebuilt data when additional nodes fail during rebuild, and the fact that modern systems use an intelligent rebuild process that will first rebuild the data having the least amount of redundancy left. 
Through rigorous arguments and simulations, it is established that the methodology developed is well suited to the reliability analysis of real-world data storage systems. Applying this methodology to data storage systems with different types of redundancy, various data placement schemes, and rebuild constraints, the effect of the design parameters on the system reliability is studied. When sufficient network bandwidth is available for rebuild processes, it is shown that spreading the redundant data corresponding to the data on each node across a larger number of other nodes, and using a distributed and intelligent rebuild process, improves the system MTTDL. In particular, declustered placement, which spreads the redundant data corresponding to each node equally across all other nodes of the system, is found to potentially have significantly higher MTTDL values than other placement schemes, especially for large storage systems. This implies that more reliable data storage systems can be designed merely by changing the data placement, without compromising storage efficiency or performance. The effect of a limited network rebuild bandwidth on the system reliability is also analyzed, and it is shown that, for certain redundancy schemes, spreading redundant data across a larger number of nodes can actually have a detrimental effect on reliability. It is also shown that the MTTDL values are invariant within a large class of node failure time distributions with the same mean. This class includes the exponential distribution as well as real-world distributions such as Weibull or gamma. This result implies that the system MTTDL will not be affected if the failure distribution is replaced by an exponential one with the same mean.
This observation is also of great importance because it suggests that the MTTDL results obtained in the literature by assuming exponential node failure distributions may still be valid for real-world storage systems despite the fact that real-world failure distributions are non-exponential. In contrast, it is shown that the MTTDL is sensitive to the node rebuild time distribution. A storage system reliability simulator is built to verify the theoretical results mentioned above. The simulator is sufficiently complex to perform all required failure events and rebuild tasks in a storage system, to use real-world failure and rebuild time distributions for scheduling failures and rebuilds, to take into account partial rebuilds when additional node failures occur, and to simulate different data placement schemes and compare their reliability. The simulation results are found to match the theoretical predictions with high confidence for a wide range of system parameters, thereby validating the methodology of reliability analysis developed.
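To make the MTTDL notion concrete, here is a minimal sketch of the simplest textbook case, not the methodology of the thesis: two mirrored nodes, each failing at rate `lam`, with an exponentially distributed rebuild at rate `mu`. Both the exponential assumptions and the function names are choices made for this illustration; the abstract itself stresses that MTTDL is sensitive to the rebuild time distribution. For this Markov model the mean time to data loss is (3·lam + mu) / (2·lam²), and a small Monte Carlo simulation can be compared against it.

```python
import random


def mttdl_closed_form(lam, mu):
    """MTTDL of a two-node mirror with exponential failures (rate lam per
    node) and exponential rebuilds (rate mu). Illustrative model only."""
    return (3 * lam + mu) / (2 * lam ** 2)


def simulate_mttdl(lam, mu, trials=20_000, seed=0):
    """Monte Carlo estimate of the same quantity (hypothetical helper)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        t = 0.0
        while True:
            # Both nodes healthy: the first failure arrives at rate 2 * lam.
            t += rng.expovariate(2 * lam)
            # One node left: race between the rebuild and a second failure.
            rebuild = rng.expovariate(mu)
            second_failure = rng.expovariate(lam)
            if second_failure < rebuild:
                t += second_failure   # data loss before the rebuild ends
                break
            t += rebuild              # rebuild done, back to two nodes
        total += t
    return total / trials


if __name__ == "__main__":
    lam, mu = 1.0, 10.0
    print("closed form:", mttdl_closed_form(lam, mu))
    print("simulated:  ", simulate_mttdl(lam, mu))
```

Replacing `rng.expovariate` with draws from Weibull or gamma distributions of the same mean would be one way to probe the invariance and sensitivity claims of the abstract, although the simple restart logic above relies on memoryless failures and would need to track node ages in that case.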

We address adaptive multivariate polynomial approximation by means of the discrete least-squares method with random evaluations, to approximate in the L2 probability sense a smooth function depending on a random variable distributed according to a given probability density. The polynomial least-squares approximation is computed using random noiseless pointwise evaluations of the target function. Here noiseless means that the pointwise evaluation of the function is not polluted by the presence of noise. Recent works by Migliorati et al. (Found Comput Math 14:419–456, 2014), Cohen et al. (Found Comput Math 13:819–834, 2013), and Chkifa et al. (Discrete least squares polynomial approximation with random evaluations – application to parametric and stochastic elliptic PDEs, EPFL MATHICSE report 35/2013, submitted) have analyzed the univariate and multivariate cases, providing error estimates for sequences of polynomial spaces given a priori. In the present work, we apply the results developed in the aforementioned analyses to devise adaptive least-squares polynomial approximations. We build a sequence of quasi-optimal best n-term sets to approximate multivariate functions that feature strong anisotropy in moderately high dimensions. The adaptive approximation relies on a greedy selection of basis functions, which preserves the downward closedness property of the polynomial approximation space. Numerical results show that the adaptive approximation effectively captures the anisotropy in the function.
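The basic building block, discrete least squares from random noiseless evaluations, can be sketched in a few lines. The example below is a deliberately reduced illustration and not the adaptive multivariate algorithm of the paper: it assumes one variable, a uniform density on [−1, 1], a fixed Legendre basis, and normal equations solved by Gaussian elimination. All function names are hypothetical.

```python
import random


def legendre_values(x, degree):
    """Return [P_0(x), ..., P_degree(x)] via the three-term recurrence."""
    values = [1.0]
    if degree >= 1:
        values.append(x)
    for k in range(1, degree):
        values.append(((2 * k + 1) * x * values[k] - k * values[k - 1]) / (k + 1))
    return values


def solve_linear_system(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def least_squares_fit(target, degree, num_samples, seed=0):
    """Fit Legendre coefficients from random noiseless evaluations of target."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
    rows = [legendre_values(x, degree) for x in xs]
    ys = [target(x) for x in xs]
    size = degree + 1
    # Normal equations: (A^T A) c = A^T y, with A the design matrix.
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(size)] for i in range(size)]
    rhs = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(size)]
    return solve_linear_system(gram, rhs)


def evaluate(coefficients, x):
    """Evaluate the fitted Legendre expansion at x."""
    basis = legendre_values(x, len(coefficients) - 1)
    return sum(c * p for c, p in zip(coefficients, basis))


if __name__ == "__main__":
    coeffs = least_squares_fit(lambda x: 1 + 2 * x + 3 * x * x, degree=3, num_samples=50)
    print(coeffs, evaluate(coeffs, 0.3))
```

Because the evaluations are noiseless, a target that already lies in the polynomial space is recovered up to rounding error. An adaptive scheme in the spirit of the abstract would repeat such fits while greedily enlarging a downward closed index set, which this sketch does not attempt.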

The reliability of new overhead electric and telecommunication lines depends principally on the quality of their support structures. These structures are generally made of wood, metal or concrete. The complexity of a natural material such as wood requires a thorough analysis of the various factors that influence its overall quality. In the case of wood poles, such factors include the initial forest growth pattern, the species of wood and its preservative treatment, ageing characteristics, and mechanical defects such as knots and cracks. Accumulated knowledge of how these variables contribute to the overall quality of a wood support structure permits optimum use of the resource. For example, lower variability and higher strength of wood support structures permit optimum loading and spacing between structures, thus reducing the number needed for a given length of overhead line. If one assumes that in Western Europe one wood pole is employed for every two inhabitants, and that this proportion increases in less densely populated countries such as the US and Scandinavia, the economics of optimum use of wood as a resource soon become apparent. In less developed countries, the proportions and the economics vary depending on the natural resources, such as wood, that they employ. The goal of this research is to establish, by means of non-destructive evaluation, a general probabilistic ageing law for wooden poles based on two distinct laws: one for the new pole, obtained by studying the effect of grading out the weak elements of a normal law ("left-truncation of a normal distribution", point 1); and another for the in-field pole, exploiting parameters such as the age of the pole, its chemical treatment, its species and its knots in order to define the pole's damage law (point 2).
Point 1. Statistical distribution law of the new wooden pole after grading, by non-destructive sorting (ultrasound), of the supports with high mechanical performance: this new distribution law is a Gaussian law, or evolves to a Log or Weibull law with three parameters, according to the inspected species. This grading allows the properties of new poles and the design values to be upgraded while guaranteeing the reliability index required by the design standards, or allows this nominal reliability to be improved directly (an economic gain and a reliability gain).

Point 2. Statistical distribution law of an aged in-field population (20 to 50 years old), approximated by a bimodal law which depends on: the distribution law of the new component (see point 1) and its minimal extreme law, which is asymmetrical, for an observation period of 50 years; and the statistical distribution at time t of the residual mechanical performance of a group of supports forming a local network, evaluated by non-destructive methods. The non-destructive evaluation is based on measurements of physical variables (density, biological moisture content) and of descriptive variables of natural origin (diameter, knots, cracks...) and of accidental origin (diameter reduction, lightning cracks...). The statistical distribution at time t is then obtained from a multivariate non-destructive evaluation model, generalized to all species and treatments. This model is the other concrete goal of this thesis.

In conclusion, the research demonstrates the influence of the new-pole grading (distribution at t0) on the modelling of the distribution at ti (multivariate non-destructive model), and the interaction between the two. The data used for these models come from a significant international database containing a large number of inspected wood poles and case studies. This database is the synthesis of about 15 years of research and development led by IBOIS-EPFL and its international partners.
The probabilistic approaches are thus validated against a very large database, which makes them directly exploitable. On this basis, all the standards dealing with new poles and with the inspection and maintenance of wooden pole networks could be re-examined for a double gain. Concerning the economy: increasing the capacity of new poles, thanks to an objective quality assurance, and increasing the lifetime of in-field poles by removing only those that are below the critical damage threshold. Concerning reliability: increasing the reliability of the network from the "new pole" stage by eliminating the weakest components, and maintaining this reliability throughout the lifetime of the network through cyclic preventive maintenance (every 5 to 8 years) and the replacement of only the weakened poles.
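The "left-truncation of a normal distribution" behind the grading of new poles has standard closed-form moments. The sketch below is an illustration under stated assumptions, not the thesis's model: it supposes that strength is exactly normal with mean `mu` and standard deviation `sigma` and that grading removes every pole below a hard `cutoff`. The function names and the numerical values in the usage lines are hypothetical.

```python
import math


def standard_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def standard_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def left_truncated_normal_moments(mu, sigma, cutoff):
    """Mean and variance of X given X > cutoff, for X ~ Normal(mu, sigma^2)."""
    alpha = (cutoff - mu) / sigma
    # Ratio of density to survival probability at the standardised cutoff.
    hazard = standard_normal_pdf(alpha) / (1.0 - standard_normal_cdf(alpha))
    mean = mu + sigma * hazard
    variance = sigma ** 2 * (1.0 + alpha * hazard - hazard ** 2)
    return mean, variance


if __name__ == "__main__":
    # Hypothetical strength values, chosen only to show the effect.
    mean, variance = left_truncated_normal_moments(mu=50.0, sigma=10.0, cutoff=40.0)
    print(mean, variance)
```

Under these assumptions the graded population has a higher mean and a smaller variance than the ungraded one, which matches the abstract's remark that lower variability and higher strength permit better loading and spacing.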