# Linear Autoregression for Prediction: Small Sample Inference

1993

Report or working paper

Abstract

This paper describes inference based on linear predictors for stationary time series. These methods are flexible, since relatively few assumptions are needed to fit a linear predictor. We discuss a confidence interval for the predicted value that takes account of the variance of the estimated parameters. A drawback often mentioned in the literature is that linear prediction may be less parsimonious than the classical ARMA forecasting method. On the other hand, as we show in a small simulation study, the usual predictive inference based on ARMA modelling is overoptimistic in small samples, whereas the coverage rate of our confidence interval is close to the nominal value even for short series.
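
As an illustration of the kind of interval the abstract describes, the sketch below fits an autoregression of order `p` by ordinary least squares and forms a one-step-ahead interval whose variance includes a term for the estimated coefficients. This is the textbook regression prediction interval, not necessarily the paper's construction; the function name `ar_predict_interval`, the order `p`, and the use of a normal quantile are assumptions made for the example.

```python
import numpy as np
from statistics import NormalDist


def ar_predict_interval(y, p=2, alpha=0.05):
    """One-step-ahead linear prediction with an interval that
    includes the uncertainty of the estimated coefficients.

    Hypothetical helper for illustration; not the paper's method.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Design matrix: intercept plus the p most recent lags of each target.
    X = np.column_stack(
        [np.ones(n - p)] + [y[p - k:n - k] for k in range(1, p + 1)]
    )
    target = y[p:]
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    dof = X.shape[0] - X.shape[1]
    s2 = resid @ resid / dof  # residual variance estimate
    # Regressor vector for the next, unobserved value: 1, y[n-1], ..., y[n-p].
    x0 = np.concatenate([[1.0], y[-1:-p - 1:-1]])
    pred = x0 @ beta
    # The "1" covers innovation noise; the quadratic form covers
    # the variance of the estimated coefficients.
    var = s2 * (1.0 + x0 @ np.linalg.solve(X.T @ X, x0))
    # Assumption: normal quantile, kept to avoid extra dependencies.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * np.sqrt(var)
    return pred, pred - half, pred + half
```

For very short series, a Student t quantile with `dof` degrees of freedom would give a slightly wider interval than the normal quantile used here.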

Related concepts (15)

Linear regression

In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variable

Prediction

A prediction (Latin præ-, "before," and dicere, "to say"), or forecast, is a statement about a future event or data. They are often, but not always, based upon experience or knowledge. There is no u

Bayesian inference

Bayesian inference (ˈbeɪziən or ˈbeɪʒən ) is a method of statistical inference in which Bayes' theorem is used to update the probability for a hypothesis as more evidence or information becomes avail

Related publications (7)

The analysis of an observed univariate time series is often undertaken in order to predict a future event. For this purpose one can fix a class of predictors from which the optimal one will be identified and estimated. The simplest and most common choice is the linear family, that is, linear combinations of the lags of the series. However, it is well known that considering non-linearities in the lags may improve the prediction. We introduce in this paper a class of non-linear predictors based on polynomials and neural network methodology. These predictors have the advantages both of being relatively simple to identify and of introducing non-linearity without increasing the number of estimated parameters by much compared to linear predictors.

1995

The thesis is a contribution to extreme-value statistics, more precisely to the estimation of clustering characteristics of extreme values. One summary measure of the tendency to form groups is the inverse average cluster size. In extreme-value context, this parameter is called the extremal index, and apart from its relation with the size of groups, it appears as an important parameter measuring the effects of serial dependence on extreme levels in time series. Although several methods exist for its estimation in univariate sequences, these methods are only applicable for strictly stationary series satisfying a long-range asymptotic independence condition on extreme levels, cannot take covariates into consideration, and yield only crude estimates for the corresponding multivariate quantity. These are strong restrictions and great drawbacks. In climatic time series, both stationarity and asymptotic independence can be broken, due to climate change and possible long memory of the data, and not including information from simultaneously measured linked variables may lead to inefficient estimation. The thesis addresses these issues.

First, we extend the theorem of Ferro and Segers (2003) concerning the distribution of inter-exceedance times: we introduce truncated inter-exceedance times, called K-gaps, and show that they follow the same exponential-point mass mixture distribution as the inter-exceedance times. The maximization of the likelihood built on this distribution yields a simple closed-form estimator for the extremal index. The method can admit covariates and can be applied with smoothing techniques, which allows its use in a nonstationary setting. Simulated and real data examples demonstrate the smooth estimation of the extremal index. The likelihood, based on an assumption of independence of the K-gaps, is misspecified whenever K is too small.

This motivates another contribution of the thesis, the introduction into extreme-value statistics of misspecification tests based on the information matrix. For our likelihood, they are able to detect misspecification from any source, not only those due to a bad choice of the truncation parameter. They provide help also in threshold selection, and show whether the fundamental assumptions of stationarity or asymptotic independence are broken. Moreover, these diagnostic tests are of general use, and could be adapted to many kinds of extreme-value models, which are always approximate. Simulated examples demonstrate the performance of the misspecification tests in the context of extremal index estimation. Two data examples with complex behaviour, one univariate and the other bivariate, offer insight into their power in discovering situations where the fundamental assumptions of the likelihood model are not valid.

In the multivariate case, the parameter corresponding to the univariate extremal index is the multivariate extremal index function. As in the univariate case, its appearance is linked to serial dependence in the observed processes. Univariate estimation methods can be applied, but are likely to give crude, unreasonably varying, estimates, and the constraints on the extremal index function implied by the characteristics of the stable tail dependence function are not automatically satisfied. The third contribution of the thesis is the development of methodology based on the M4 approximation of Smith and Weissman (1996), which can be used to estimate the multivariate extremal index, as well as other cluster characteristics. For this purpose, we give a preliminary cluster selection procedure, and approximate the noise on finite levels with a flexible semiparametric model, the Dirichlet mixtures used widely in Bayesian analysis. The model is fitted by the EM algorithm. Advantages and drawbacks of the method are discussed using the same univariate and bivariate examples as the likelihood methods.
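
A minimal sketch of how a closed-form estimator of this kind can arise, assuming the K-gaps are defined as max(T - K, 0) for inter-exceedance times T, scaled by the empirical exceedance probability, and treated as independent draws from a mixture that puts mass 1 - θ at zero and otherwise follows an exponential distribution with rate θ. Under those assumptions the log-likelihood is N0 log(1 - θ) + 2 N1 log θ - θ ΣS, where N0 and N1 count the zero and positive K-gaps, and setting its derivative to zero gives a quadratic in θ with one root in (0, 1]. The function name `kgaps_extremal_index` and the default `K` are hypothetical; the thesis's exact definitions may differ.

```python
import numpy as np


def kgaps_extremal_index(x, threshold, K=1):
    """Closed-form maximiser of the assumed K-gaps mixture likelihood.

    Illustrative sketch only; definitions are assumptions, not
    taken from the thesis.
    """
    x = np.asarray(x, dtype=float)
    times = np.flatnonzero(x > threshold)   # indices of exceedances
    if len(times) < 2:
        raise ValueError("need at least two exceedances")
    q = len(times) / len(x)                 # empirical exceedance probability
    gaps = np.diff(times)                   # inter-exceedance times T
    s = q * np.maximum(gaps - K, 0)         # normalised K-gaps (assumed form)
    n0 = int(np.sum(s == 0))                # gaps truncated to zero
    n1 = int(np.sum(s > 0))                 # strictly positive gaps
    total = float(s.sum())
    if n1 == 0:
        # Likelihood reduces to (1 - theta)^n0, maximised at theta = 0.
        return 0.0
    # Score equation: total*theta^2 - (n0 + 2*n1 + total)*theta + 2*n1 = 0.
    b = n0 + 2 * n1 + total
    disc = max(b * b - 8 * n1 * total, 0.0)  # guard against rounding
    theta = (b - np.sqrt(disc)) / (2 * total)  # smaller root lies in (0, 1]
    return float(min(theta, 1.0))
```

On an independent series the estimate should be close to 1, while a series whose extremes arrive in clusters should give a smaller value.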

Rémi Emonet, Jean-Marc Odobez, Jagannadan Varadarajan

In this article, we present a new model for unsupervised discovery of recurrent temporal patterns (or motifs) in time series (or documents). The model is designed to handle the difficult case of multivariate time series obtained from a mixture of activities, that is, our observations are caused by the superposition of multiple phenomena occurring concurrently and with no synchronization. The model uses nonparametric Bayesian methods to describe both the motifs and their occurrences in documents. We derive an inference scheme to automatically and simultaneously recover the recurrent motifs (both their characteristics and number) and their occurrence instants in each document. The model is widely applicable and is illustrated on datasets coming from multiple modalities, mainly videos from static cameras and audio localization data. The rich semantic interpretation that the model offers can be leveraged in tasks such as event counting or scene analysis. The approach is also used as a means of performing soft camera calibration in a camera network. A thorough study of the model parameters is provided and a cross-platform implementation of the inference algorithm will be made publicly available.

2014