Test statisticA test statistic is a statistic (a quantity derived from the sample) used in statistical hypothesis testing. A hypothesis test is typically specified in terms of a test statistic, considered as a numerical summary of a data-set that reduces the data to one value that can be used to perform the hypothesis test. In general, a test statistic is selected or defined in such a way as to quantify, within observed data, behaviours that would distinguish the null from the alternative hypothesis, where such an alternative is prescribed, or that would characterize the null hypothesis if there is no explicitly stated alternative hypothesis.
Normalization (statistics)In statistics and applications of statistics, normalization can have a range of meanings. In the simplest cases, normalization of ratings means adjusting values measured on different scales to a notionally common scale, often prior to averaging. In more complicated cases, normalization may refer to more sophisticated adjustments where the intention is to bring the entire probability distributions of adjusted values into alignment. In the case of normalization of scores in educational assessment, there may be an intention to align distributions to a normal distribution.
Prediction intervalIn statistical inference, specifically predictive inference, a prediction interval is an estimate of an interval in which a future observation will fall, with a certain probability, given what has already been observed. Prediction intervals are often used in regression analysis.
Pivotal quantityIn statistics, a pivotal quantity or pivot is a function of observations and unobservable parameters such that the function's probability distribution does not depend on the unknown parameters (including nuisance parameters). A pivot quantity need not be a statistic—the function and its value can depend on the parameters of the model, but its distribution must not. If it is a statistic, then it is known as an ancillary statistic. More formally, let be a random sample from a distribution that depends on a parameter (or vector of parameters) .
Bootstrapping (statistics)Bootstrapping is any test or metric that uses random sampling with replacement (e.g. mimicking the sampling process), and falls under the broader class of resampling methods. Bootstrapping assigns measures of accuracy (bias, variance, confidence intervals, prediction error, etc.) to sample estimates. This technique allows estimation of the sampling distribution of almost any statistic using random sampling methods. Bootstrapping estimates the properties of an estimand (such as its variance) by measuring those properties when sampling from an approximating distribution.
Homoscedasticity and heteroscedasticityIn statistics, a sequence (or a vector) of random variables is homoscedastic (ˌhoʊmoʊskəˈdæstɪk) if all its random variables have the same finite variance; this is also known as homogeneity of variance. The complementary notion is called heteroscedasticity, also known as heterogeneity of variance. The spellings homoskedasticity and heteroskedasticity are also frequently used.
68–95–99.7 ruleIn statistics, the 68–95–99.7 rule, also known as the empirical rule, is a shorthand used to remember the percentage of values that lie within an interval estimate in a normal distribution: 68%, 95%, and 99.7% of the values lie within one, two, and three standard deviations of the mean, respectively. In mathematical notation, these facts can be expressed as follows, where Pr() is the probability function, Χ is an observation from a normally distributed random variable, μ (mu) is the mean of the distribution, and σ (sigma) is its standard deviation: The usefulness of this heuristic especially depends on the question under consideration.
Studentized residualIn statistics, a studentized residual is the quotient resulting from the division of a residual by an estimate of its standard deviation. It is a form of a Student's t-statistic, with the estimate of error varying between points. This is an important technique in the detection of outliers. It is among several named in honor of William Sealey Gosset, who wrote under the pseudonym Student. Dividing a statistic by a sample standard deviation is called studentizing, in analogy with standardizing and normalizing.
Ancillary statisticAn ancillary statistic is a measure of a sample whose distribution (or whose pmf or pdf) does not depend on the parameters of the model. An ancillary statistic is a pivotal quantity that is also a statistic. Ancillary statistics can be used to construct prediction intervals. They are also used in connection with Basu's theorem to prove independence between statistics. This concept was first introduced by Ronald Fisher in the 1920s, but its formal definition was only provided in 1964 by Debabrata Basu.
Likelihood-ratio testIn statistics, the likelihood-ratio test assesses the goodness of fit of two competing statistical models, specifically one found by maximization over the entire parameter space and another found after imposing some constraint, based on the ratio of their likelihoods. If the constraint (i.e., the null hypothesis) is supported by the observed data, the two likelihoods should not differ by more than sampling error. Thus the likelihood-ratio test tests whether this ratio is significantly different from one, or equivalently whether its natural logarithm is significantly different from zero.