Concept: Summary statistics

Summary

In descriptive statistics, summary statistics are used to summarize a set of observations, in order to communicate the largest amount of information as simply as possible. Statisticians commonly try to describe the observations in terms of:
- a measure of location, or central tendency, such as the arithmetic mean
- a measure of statistical dispersion, such as the standard deviation or the mean absolute deviation
- a measure of the shape of the distribution, such as skewness or kurtosis
- if more than one variable is measured, a measure of statistical dependence, such as a correlation coefficient
A common collection of order statistics used as summary statistics is the five-number summary, sometimes extended to a seven-number summary, together with the associated box plot.
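As an illustration of the five-number summary, here is a minimal Python sketch. The function name `five_number_summary` and the sample data are hypothetical, and the quartiles use the "inclusive" interpolation rule of the standard library's `statistics.quantiles`, which is only one of several quartile conventions in use.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for at least two values."""
    ordered = sorted(data)
    # quantiles(..., n=4) returns the three cut points that split the data
    # into four groups; "inclusive" treats the data as the whole population.
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, q2, q3, ordered[-1]


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # illustrative data only
summary = five_number_summary(sample)
print(summary)
```

A box plot draws exactly these five values: the box spans Q1 to Q3, the line inside it marks the median, and the whiskers reach toward the minimum and maximum.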
Entries in an analysis of variance table can also be regarded as summary statistics.
Common measures of location, or central tendency, are the arithmetic mean, median, mode, and interquartile mean.
Common measures of statistical dispersion are the standard deviation, variance, range, interquartile range, absolute deviation, mean absolute difference and the distance standard deviation. Measures that assess spread in comparison to the typical size of data values include the coefficient of variation.
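Several of the location and dispersion measures named above can be computed with Python's standard `statistics` module. The sketch below uses made-up data and population (not sample) formulas for the variance and standard deviation; that choice is an assumption made for the example.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data only

# Measures of location
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of dispersion (population versions)
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)
value_range = max(data) - min(data)
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
interquartile_range = q3 - q1

# Spread relative to the typical size of the values
coefficient_of_variation = std_dev / mean

print(mean, median, mode)
print(variance, std_dev, value_range, interquartile_range)
print(coefficient_of_variation)
```

The coefficient of variation is only meaningful when the mean is non-zero and the data are measured on a ratio scale.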
The Gini coefficient was originally developed to measure income inequality and is equivalent to one of the L-moments.
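One standard way to compute the Gini coefficient is as half the relative mean absolute difference, i.e. the average of |x_i - x_j| over all ordered pairs divided by twice the mean. The sketch below follows that definition directly; the function name `gini` is hypothetical, the input is assumed to be non-negative with a positive total, and the O(n²) double loop is chosen for clarity rather than speed.

```python
def gini(values):
    """Gini coefficient as half the relative mean absolute difference."""
    xs = list(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need non-empty data with a positive total")
    # Sum of |x_i - x_j| over all ordered pairs (i, j), including i == j.
    abs_diff_sum = sum(abs(a - b) for a in xs for b in xs)
    # Dividing by 2 * n^2 * mean is the same as dividing by 2 * n * total.
    return abs_diff_sum / (2 * n * total)


print(gini([1, 1, 1, 1]))    # perfectly equal incomes
print(gini([0, 0, 0, 10]))   # one person holds everything
```

With this definition, perfect equality gives 0, and a single person holding everything among n people gives (n - 1) / n.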
A simple summary of a dataset is sometimes given by quoting particular order statistics as approximations to selected percentiles of a distribution.
Common measures of the shape of a distribution are skewness and kurtosis, while alternatives can be based on L-moments. A different measure is the distance skewness, for which a value of zero implies central symmetry.
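The moment-based versions of skewness and kurtosis can be sketched as follows. This assumes the population (biased) definitions, m3 / m2^1.5 and m4 / m2^2 - 3, where m_k is the k-th central moment; the helper names are hypothetical, and sample-corrected estimators would give slightly different numbers.

```python
def central_moment(data, k):
    """k-th central moment of the data, dividing by n (population form)."""
    n = len(data)
    mu = sum(data) / n
    return sum((x - mu) ** k for x in data) / n


def skewness(data):
    """Moment coefficient of skewness: m3 / m2^(3/2)."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    """Excess kurtosis: m4 / m2^2 - 3, which is 0 for a normal distribution."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]      # illustrative data only
right_skewed = [1, 1, 1, 10]     # illustrative data only
print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_skewed))
```

Both functions divide by the second central moment, so they raise `ZeroDivisionError` for constant data, where shape is undefined.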
The common measure of dependence between paired random variables is the Pearson product-moment correlation coefficient, while a common alternative summary statistic is Spearman's rank correlation coefficient. A value of zero for the distance correlation implies independence.
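To make the contrast between the two coefficients concrete, the sketch below computes Pearson's coefficient from its definition and Spearman's coefficient as Pearson's coefficient applied to ranks, with tied values given their average rank. All function names and the data are hypothetical, chosen for illustration.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but non-linear relationship
print(pearson(x, y), spearman(x, y))
```

Because `y` increases monotonically with `x` but not linearly, Spearman's coefficient is 1 (up to rounding) while Pearson's is below 1.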
Humans efficiently use summary statistics to quickly perceive the gist of auditory and visual information.

Related MOOCs (8)

Related publications (71)

Related people (10)

Related concepts (23)

Related courses (32)

Related lectures (179)

Optimization: principles and algorithms - Linear optimization

Introduction to linear optimization, duality and the simplex algorithm.

Optimization: principles and algorithms - Network and discrete optimization

Introduction to network optimization and discrete optimization

Order statistic

In statistics, the kth order statistic of a statistical sample is equal to its kth-smallest value. Together with rank statistics, order statistics are among the most fundamental tools in non-parametric statistics and inference. Important special cases of the order statistics are the minimum and maximum value of a sample, and (with some qualifications discussed below) the sample median and other sample quantiles.
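The definition translates almost directly into code: sort the sample and take the k-th element, counting from 1. The function name `order_statistic` below is hypothetical, and sorting the whole sample is the simplest approach rather than the fastest one.

```python
def order_statistic(sample, k):
    """Return the k-th smallest value of the sample (k is 1-based)."""
    if not 1 <= k <= len(sample):
        raise ValueError("k must be between 1 and the sample size")
    return sorted(sample)[k - 1]


sample = [7, 1, 5, 3, 9]  # illustrative data only
minimum = order_statistic(sample, 1)
median = order_statistic(sample, 3)   # middle value of five observations
maximum = order_statistic(sample, len(sample))
print(minimum, median, maximum)
```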

Quartile

In statistics, a quartile is a type of quantile which divides the number of data points into four parts, or quarters, of more-or-less equal size. The data must be ordered from smallest to largest to compute quartiles; as such, quartiles are a form of order statistic. The three main quartiles are as follows: The first quartile (Q1) is defined as the middle number between the smallest number (minimum) and the median of the data set. It is also known as the lower quartile, as 25% of the data is below this point.
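The description of Q1 as the middle number between the minimum and the median corresponds to one common convention: take the median of the lower half of the ordered data, and likewise the upper half for Q3. The sketch below assumes that the overall median is excluded from both halves when the number of points is odd. Other conventions exist and give slightly different values, and the function name is hypothetical.

```python
import statistics


def quartiles_by_halves(data):
    """Return (Q1, Q2, Q3) using medians of the lower and upper halves.

    Requires at least two data points. For an odd number of points the
    middle value is left out of both halves.
    """
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + n % 2:]
    return (
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
    )


odd_sample = [6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]  # illustrative data
even_sample = [7, 15, 36, 39, 40, 41]                    # illustrative data
print(quartiles_by_halves(odd_sample))
print(quartiles_by_halves(even_sample))
```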

Average absolute deviation

The average absolute deviation (AAD) of a data set is the average of the absolute deviations from a central point. It is a summary statistic of statistical dispersion or variability. In the general form, the central point can be a mean, median, mode, or the result of any other measure of central tendency or any reference value related to the given data set. AAD includes the mean absolute deviation and the median absolute deviation (both abbreviated as MAD). Several measures of statistical dispersion are defined in terms of the absolute deviation.
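The role of the central point can be seen in a short sketch. The same function gives the mean absolute deviation around the mean or around the median depending on the `center` argument, while the median absolute deviation replaces the outer average with a median. The function names and data are hypothetical.

```python
import statistics


def average_absolute_deviation(data, center):
    """Average of |x - center| over the data, for any chosen central point."""
    return sum(abs(x - center) for x in data) / len(data)


def median_absolute_deviation(data):
    """Median of the absolute deviations from the data's median."""
    med = statistics.median(data)
    return statistics.median([abs(x - med) for x in data])


data = [2, 2, 3, 4, 14]  # illustrative data only

around_mean = average_absolute_deviation(data, statistics.mean(data))
around_median = average_absolute_deviation(data, statistics.median(data))
median_abs_dev = median_absolute_deviation(data)
print(around_mean, around_median, median_abs_dev)
```

The single large value 14 inflates the deviation around the mean more than the deviation around the median, and barely affects the median absolute deviation.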

MATH-131: Probability and statistics

The course presents the basic notions of probability theory and statistical inference. Emphasis is placed on the main concepts and on the most widely used methods.

MICRO-428: Metrology

The course deals with the concept of measuring in different domains, particularly in the electrical, optical, and microscale domains. The course will end with a perspective on quantum measurements, wh

MATH-413: Statistics for data science

Statistics lies at the foundation of data science, providing a unifying theoretical and methodological backbone for the diverse tasks encountered in this emerging field. This course rigorously develops

Probability and Statistics

Introduces probability, statistics, distributions, inference, likelihood, and combinatorics for studying random events and network modeling.

Probability and Statistics

Covers fundamental concepts in probability and statistics, including distributions, properties, and expectations of random variables.

Modes of Convergence of Random Variables

Covers the modes of convergence of random variables and the Central Limit Theorem, discussing implications and approximations.

Cheng Zhao, Ginevra Favole, Yu Yu

Context. We present a novel approach to the construction of mock galaxy catalogues for large-scale structure analysis based on the distribution of dark matter halos obtained with effective bias models at the field level. Aims. We aim to produce mock galaxy ...

The presence of competing events, such as death, makes it challenging to define causal effects on recurrent outcomes. In this thesis, I formalize causal inference for recurrent events, with and without competing events. I define several causal estimands an ...

Frédéric Courbin, Gianluca Castignani, Jean-Luc Starck, Austin Chandler Peel, Maurizio Martinelli, Yi Wang, Richard Massey, Fabio Finelli, Marcello Farina

Recent cosmic shear studies have shown that higher-order statistics (HOS) developed by independent teams now outperform standard two-point estimators in terms of statistical precision thanks to their sensitivity to the non-Gaussian features of large-scale ...