In statistics, dispersion (also called variability, scatter, or spread) is the extent to which a distribution is stretched or squeezed. Common examples of measures of statistical dispersion are the variance, standard deviation, and interquartile range. For instance, when the variance of data in a set is large, the data is widely scattered. On the other hand, when the variance is small, the data in the set is clustered.
Dispersion is contrasted with location or central tendency, and together they are the most used properties of distributions.
A measure of statistical dispersion is a nonnegative real number that is zero if all the data are the same and increases as the data become more diverse.
Most measures of dispersion have the same units as the quantity being measured. In other words, if the measurements are in metres or seconds, so is the measure of dispersion. Examples of dispersion measures include:
Standard deviation
Interquartile range (IQR)
Range
Mean absolute difference (also known as Gini mean absolute difference)
Median absolute deviation (MAD)
Average absolute deviation (or simply called average deviation)
Distance standard deviation
These are frequently used (together with scale factors) as estimators of scale parameters, in which capacity they are called estimates of scale. Robust measures of scale are those unaffected by a small number of outliers, and include the IQR and MAD.
All the above measures of statistical dispersion have the useful property that they are location-invariant and linear in scale. This means that if a random variable has a dispersion of then a linear transformation for real and should have dispersion , where is the absolute value of , that is, ignores a preceding negative sign .
Other measures of dispersion are dimensionless. In other words, they have no units even if the variable itself has units.
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
In the academic or industrial world, to optimize a system, it is necessary to establish strategies for the experimental approach. The DOE allows you to choose the best set of measurement points to min
We cover the theory and applications of sparse stochastic processes (SSP). SSP are solutions of differential equations driven by non-Gaussian innovations. They admit a parsimonious representation in a
This is an introductory course to the concentration of measure phenomenon - random functions that depend on many random variables tend to be often close to constant functions.
In statistics, the median absolute deviation (MAD) is a robust measure of the variability of a univariate sample of quantitative data. It can also refer to the population parameter that is estimated by the MAD calculated from a sample. For a univariate data set X1, X2, ..., Xn, the MAD is defined as the median of the absolute deviations from the data's median : that is, starting with the residuals (deviations) from the data's median, the MAD is the median of their absolute values. Consider the data (1, 1, 2, 2, 4, 6, 9).
In probability theory and statistics, the coefficient of variation (COV), also known as Normalized Root-Mean-Square Deviation (NRMSD), Percent RMS, and relative standard deviation (RSD), is a standardized measure of dispersion of a probability distribution or frequency distribution. It is defined as the ratio of the standard deviation to the mean (or its absolute value, , and often expressed as a percentage ("%RSD"). The CV or RSD is widely used in analytical chemistry to express the precision and repeatability of an assay.
In statistics, count data is a statistical data type describing countable quantities, data which can take only the counting numbers, non-negative integer values {0, 1, 2, 3, ...}, and where these integers arise from counting rather than ranking. The statistical treatment of count data is distinct from that of binary data, in which the observations can take only two values, usually represented by 0 and 1, and from ordinal data, which may also consist of integers but where the individual values fall on an arbitrary scale and only the relative ranking is important.
As large, data-driven artificial intelligence models become ubiquitous, guaranteeing high data quality is imperative for constructing models. Crowdsourcing, community sensing, and data filtering have long been the standard approaches to guaranteeing or imp ...
EPFL2024
, ,
In this paper we consider two aspects of the inverse problem of how to construct merge trees realizing a given barcode. Much of our investigation exploits a recently discovered connection between the symmetric group and barcodes in general position, based ...
Minimising the longest travel distance for a group of mobile robots with interchangeable goals requires knowledge of the shortest length paths between all robots and goal destinations. Determining the exact length of the shortest paths in an environment wi ...