Index of dispersionIn probability theory and statistics, the index of dispersion, dispersion index, coefficient of dispersion, relative variance, or variance-to-mean ratio (VMR), like the coefficient of variation, is a normalized measure of the dispersion of a probability distribution: it is a measure used to quantify whether a set of observed occurrences are clustered or dispersed compared to a standard statistical model.
Linear regressionIn statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression. This term is distinct from multivariate linear regression, where multiple correlated dependent variables are predicted, rather than a single scalar variable.
AverageIn ordinary language, an average is a single number taken as representative of a list of numbers, usually the sum of the numbers divided by how many numbers are in the list (the arithmetic mean). For example, the average of the numbers 2, 3, 4, 7, and 9 (summing to 25) is 5. Depending on the context, an average might be another statistic such as the median, or mode. For example, the average personal income is often given as the median—the number below which are 50% of personal incomes and above which are 50% of personal incomes—because the mean would be higher by including personal incomes from a few billionaires.
High-dimensional statisticsIn statistical theory, the field of high-dimensional statistics studies data whose dimension is larger than typically considered in classical multivariate analysis. The area arose owing to the emergence of many modern data sets in which the dimension of the data vectors may be comparable to, or even larger than, the sample size, so that justification for the use of traditional techniques, often based on asymptotic arguments with the dimension held fixed as the sample size increased, was lacking.
Overdetermined systemIn mathematics, a system of equations is considered overdetermined if there are more equations than unknowns. An overdetermined system is almost always inconsistent (it has no solution) when constructed with random coefficients. However, an overdetermined system will have solutions in some cases, for example if some equation occurs several times in the system, or if some equations are linear combinations of the others. The terminology can be described in terms of the concept of constraint counting.
Resampling (statistics)In statistics, resampling is the creation of new samples based on one observed sample. Resampling methods are: Permutation tests (also re-randomization tests) Bootstrapping Cross validation Permutation test Permutation tests rely on resampling the original data assuming the null hypothesis. Based on the resampled data it can be concluded how likely the original data is to occur under the null hypothesis.
InterestIn finance and economics, interest is payment from a borrower or deposit-taking financial institution to a lender or depositor of an amount above repayment of the principal sum (that is, the amount borrowed), at a particular rate. It is distinct from a fee which the borrower may pay to the lender or some third party. It is also distinct from dividend which is paid by a company to its shareholders (owners) from its profit or reserve, but not at a particular rate decided beforehand, rather on a pro rata basis as a share in the reward gained by risk taking entrepreneurs when the revenue earned exceeds the total costs.
Homoscedasticity and heteroscedasticityIn statistics, a sequence (or a vector) of random variables is homoscedastic (ˌhoʊmoʊskəˈdæstɪk) if all its random variables have the same finite variance; this is also known as homogeneity of variance. The complementary notion is called heteroscedasticity, also known as heterogeneity of variance. The spellings homoskedasticity and heteroskedasticity are also frequently used.
Congestion pricingCongestion pricing or congestion charges is a system of surcharging users of public goods that are subject to congestion through excess demand, such as through higher peak charges for use of bus services, electricity, metros, railways, telephones, and road pricing to reduce traffic congestion; airlines and shipping companies may be charged higher fees for slots at airports and through canals at busy times. Advocates claim this pricing strategy regulates demand, making it possible to manage congestion without increasing supply.
Linear discriminant analysisLinear discriminant analysis (LDA), normal discriminant analysis (NDA), or discriminant function analysis is a generalization of Fisher's linear discriminant, a method used in statistics and other fields, to find a linear combination of features that characterizes or separates two or more classes of objects or events. The resulting combination may be used as a linear classifier, or, more commonly, for dimensionality reduction before later classification.