Autoregressive integrated moving average
In statistics and econometrics, and in particular in time series analysis, an autoregressive integrated moving average (ARIMA) model is a generalization of an autoregressive moving average (ARMA) model. Both models are fitted to time series data either to better understand the data or to forecast future points in the series. ARIMA models are applied in some cases where the data show evidence of non-stationarity in the mean (but not in the variance or autocovariance); an initial differencing step (corresponding to the "integrated" part of the model) can then be applied one or more times to eliminate the non-stationarity of the mean function.
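The differencing step can be illustrated with a minimal sketch in plain Python. The helper name `difference` and the toy series are assumptions made for illustration; this shows only the "integrated" part, not a full ARIMA fit.

```python
def difference(series, d=1):
    """Apply first-order differencing d times (hypothetical helper)."""
    out = list(series)
    for _ in range(d):
        # Each pass replaces the series by consecutive changes x[t] - x[t-1].
        out = [b - a for a, b in zip(out, out[1:])]
    return out


# A linear trend becomes constant after one difference,
# a quadratic trend after two.
linear = [2.0 * t + 1.0 for t in range(6)]
quadratic = [float(t * t) for t in range(6)]

print(difference(linear, d=1))     # [2.0, 2.0, 2.0, 2.0, 2.0]
print(difference(quadratic, d=2))  # [2.0, 2.0, 2.0, 2.0]
```

Each pass shortens the series by one observation, which is why the order of differencing is usually kept as small as possible.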
Trend-stationary process
In the statistical analysis of time series, a trend-stationary process is a stochastic process from which an underlying trend (a function solely of time) can be removed, leaving a stationary process. The trend does not have to be linear. Conversely, if the process requires differencing to be made stationary, then it is called difference stationary and possesses one or more unit roots. The two concepts are sometimes confused; although they share many properties, they differ in many respects.
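As a rough sketch of removing a trend, the following fits a straight line by least squares and keeps the residuals. A linear trend is assumed here only for simplicity (the trend need not be linear in general), and `detrend_linear` is a hypothetical helper name.

```python
def detrend_linear(y):
    """Fit y[t] = intercept + slope * t by least squares; return the fit and residuals."""
    n = len(y)
    t = list(range(n))
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    slope = (
        sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
        / sum((ti - t_mean) ** 2 for ti in t)
    )
    intercept = y_mean - slope * t_mean
    residuals = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]
    return slope, intercept, residuals


# Toy trend-stationary series: a line plus a bounded, alternating disturbance.
series = [3.0 + 0.5 * t + (0.2 if t % 2 == 0 else -0.2) for t in range(20)]
slope, intercept, residuals = detrend_linear(series)
print(round(slope, 3), round(intercept, 3))
```

For a difference-stationary series this regression would not leave a stationary remainder, which is the practical reason the two cases need to be told apart.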
Training, validation, and test data sets
In machine learning, a common task is the study and construction of algorithms that can learn from and make predictions on data. Such algorithms function by making data-driven predictions or decisions, through building a mathematical model from input data. The input data used to build the model are usually divided into multiple data sets. In particular, three data sets are commonly used in different stages of the creation of the model: training, validation, and test sets.
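One simple way to produce the three sets is a seeded shuffle followed by slicing. The function name `split_dataset` and the 70/15/15 proportions below are assumptions chosen for illustration, not a recommended standard.

```python
import random


def split_dataset(items, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle items reproducibly and split into (train, validation, test) lists."""
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for the test set")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # local RNG, so global state is untouched
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

A plain random split like this assumes the observations are exchangeable; time-ordered or grouped data generally call for a split that respects that structure.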
Instrumental variables estimation
In statistics, econometrics, epidemiology and related disciplines, the method of instrumental variables (IV) is used to estimate causal relationships when controlled experiments are not feasible or when a treatment is not successfully delivered to every unit in a randomized experiment. Intuitively, IVs are used when an explanatory variable of interest is correlated with the error term, in which case ordinary least squares and ANOVA give biased results.
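The bias of ordinary least squares and the correction offered by an instrument can be seen in a small simulation. Everything below is an assumed toy setup: one regressor `x`, one instrument `z`, an unobserved confounder `u`, and the simple ratio estimator cov(z, y) / cov(z, x) for the single-instrument case.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


rng = random.Random(42)
n = 20000
true_beta = 2.0

z = [rng.gauss(0, 1) for _ in range(n)]  # instrument: affects x, unrelated to the error
u = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder
x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
# The error term (2*u + noise) is correlated with x through u.
y = [true_beta * xi + 2.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

beta_ols = cov(x, y) / cov(x, x)  # biased upward in this setup
beta_iv = cov(z, y) / cov(z, x)   # close to true_beta in large samples

print(round(beta_ols, 2), round(beta_iv, 2))
```

The IV estimate is only as good as the instrument: it relies on `z` moving `x` while being uncorrelated with the error, which holds here by construction.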
Projection matrix
In statistics, the projection matrix, sometimes also called the influence matrix or hat matrix, maps the vector of response values (dependent variable values) to the vector of fitted values (or predicted values). It describes the influence each response value has on each fitted value. The diagonal elements of the projection matrix are the leverages, which describe the influence each response value has on the fitted value for that same observation.
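For simple linear regression with an intercept, the entries of this matrix have the closed form h_ij = 1/n + (x_i - mean)(x_j - mean) / S_xx, where S_xx is the sum of squared deviations of x. The sketch below builds it in plain Python; the helper name and the toy data are assumptions for illustration.

```python
def hat_matrix_simple(x):
    """Hat matrix for the model y = a + b*x, using the closed-form entries."""
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return [
        [1.0 / n + (xi - x_mean) * (xj - x_mean) / sxx for xj in x]
        for xi in x
    ]


x = [1.0, 2.0, 3.0, 4.0, 10.0]  # the last point lies far from the others
y = [1.1, 1.9, 3.2, 3.8, 10.5]

H = hat_matrix_simple(x)
# Fitted values are the hat matrix applied to the response vector.
fitted = [sum(h * yj for h, yj in zip(row, y)) for row in H]
# Leverages are the diagonal entries.
leverages = [H[i][i] for i in range(len(x))]

print([round(v, 3) for v in leverages])
```

The leverages sum to the number of fitted parameters (two here), and the isolated point at x = 10 carries the largest one.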
CUSUM
In statistical quality control, the CUSUM (or cumulative sum control chart) is a sequential analysis technique developed by E. S. Page of the University of Cambridge. It is typically used for monitoring change detection. CUSUM was announced in Biometrika in 1954, a few years after the publication of Wald's sequential probability ratio test (SPRT). E. S. Page referred to a "quality number", by which he meant a parameter of the probability distribution; for example, the mean.
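A common one-sided form of the chart accumulates deviations above a target and signals when the running sum crosses a threshold. The sketch below assumes that form; the parameter names `target`, `k` (allowance) and `h` (decision threshold) and their values are illustrative choices, not taken from the text above.

```python
def cusum_upper(samples, target, k, h):
    """One-sided upper CUSUM.

    Returns the list of cumulative statistics and the index of the first
    sample at which the statistic exceeds h (or None if it never does).
    """
    s = 0.0
    stats = []
    alarm_index = None
    for i, x in enumerate(samples):
        # Accumulate deviations above target + k; never drop below zero.
        s = max(0.0, s + (x - target - k))
        stats.append(s)
        if alarm_index is None and s > h:
            alarm_index = i
    return stats, alarm_index


# Toy data: the mean shifts upward from about 0 to about 1 at index 5.
data = [0.1, -0.2, 0.0, 0.3, -0.1, 1.2, 0.9, 1.1, 1.3, 1.0]
stats, alarm = cusum_upper(data, target=0.0, k=0.5, h=2.0)
print(alarm)
```

Larger values of `k` and `h` make the chart slower to react but less prone to false alarms; a mirrored statistic can be run alongside to catch downward shifts.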
Quasi-likelihood
In statistics, quasi-likelihood methods are used to estimate parameters in a statistical model when exact likelihood methods, for example maximum likelihood estimation, are computationally infeasible. Due to the wrong likelihood being used, quasi-likelihood estimators lose asymptotic efficiency compared to, e.g., maximum likelihood estimators. Under broadly applicable conditions, quasi-likelihood estimators are consistent and asymptotically normal. The asymptotic covariance matrix can be obtained using the so-called sandwich estimator.
Generalized linear mixed model
In statistics, a generalized linear mixed model (GLMM) is an extension to the generalized linear model (GLM) in which the linear predictor contains random effects in addition to the usual fixed effects. GLMMs also inherit from GLMs the idea of extending linear mixed models to non-normal data. GLMMs provide a broad range of models for the analysis of grouped data, since the differences between groups can be modelled as a random effect. These models are useful in the analysis of many kinds of data, including longitudinal data.
Omitted-variable bias
In statistics, omitted-variable bias (OVB) occurs when a statistical model leaves out one or more relevant variables. The bias results in the model attributing the effect of the missing variables to those that were included. More specifically, OVB is the bias that appears in the estimates of parameters in a regression analysis, when the assumed specification is incorrect in that it omits an independent variable that is a determinant of the dependent variable and correlated with one or more of the included independent variables.
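The mechanism can be checked numerically. In the assumed toy setup below, y depends on `x1` and `x2`, the two regressors are correlated, and `x2` is left out of the regression; the slope on `x1` then absorbs roughly beta2 times the slope of `x2` on `x1`. All names and coefficient values are illustrative.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


rng = random.Random(7)
n = 20000
beta1, beta2 = 1.0, 2.0

x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.5 * a + rng.gauss(0, 1) for a in x1]  # correlated with x1
y = [beta1 * a + beta2 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

# "Short" regression of y on x1 alone, omitting x2.
short_slope = cov(x1, y) / cov(x1, x1)

# Slope of the omitted variable on the included one.
delta = cov(x1, x2) / cov(x1, x1)
predicted = beta1 + beta2 * delta  # what the short slope should be close to

print(round(short_slope, 2), round(predicted, 2))
```

If `x2` were uncorrelated with `x1` (delta near zero), or had no effect on y (beta2 equal to zero), the short regression would recover beta1 without bias.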
Positive and negative predictive values
The positive and negative predictive values (PPV and NPV respectively) are the proportions of positive and negative results in statistics and diagnostic tests that are true positive and true negative results, respectively. The PPV and NPV describe the performance of a diagnostic test or other statistical measure. A high result can be interpreted as indicating the accuracy of such a statistic. The PPV and NPV are not intrinsic to the test (as true positive rate and true negative rate are); they depend also on the prevalence.
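The dependence on prevalence follows from Bayes' theorem and can be made concrete with a short sketch. The function name and the example figures (a test with 90% sensitivity and 90% specificity) are assumptions for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from the test's rates and the prevalence of the condition."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same test, two populations: a rare condition and a common one.
for prevalence in (0.01, 0.5):
    ppv, npv = predictive_values(0.9, 0.9, prevalence)
    print(prevalence, round(ppv, 3), round(npv, 3))
```

With the same test, the PPV is far lower in the low-prevalence population because false positives from the large unaffected group outnumber the true positives.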