Concept

Seven-number summary

In descriptive statistics, the seven-number summary is a collection of seven summary statistics that extends the five-number summary. There are three similar, common forms. As with the five-number summary, it can be represented by a modified box plot, with hatch-marks added on the "whiskers" for two of the additional numbers.

The following percentiles are (approximately) evenly spaced under a normally distributed variable:

the 2nd percentile (better: 2.15%)
the 9th percentile (better: 8.87%)
the 25th percentile or lower quartile or first quartile
the 50th percentile or median (middle value, or second quartile)
the 75th percentile or upper quartile or third quartile
the 91st percentile (better: 91.13%)
the 98th percentile (better: 97.85%)

The middle three values (the lower quartile, median, and upper quartile) are the usual statistics from the five-number summary and are the standard values for the box in a box plot. The two unusual percentiles at either end are used because the locations of all seven values will be approximately equally spaced if the data is normally distributed. Some statistical tests require normally distributed data, so the plotted values provide a convenient visual check on the validity of later tests: simply scan to see whether the marks for those seven percentiles appear to be equal distances apart on the graph.

Whereas the extreme values of the five-number summary depend on the number of samples, this seven-number summary does not. It is also somewhat more stable, because its whisker-ends are the steadier 2nd and 98th percentiles rather than the sample extremes, which can swing wildly from sample to sample.

In the modified box plot, the 2nd and 98th percentiles are represented by the ends of the whiskers, and hatch-marks across the whiskers mark the 9th and 91st percentiles.

Arthur Bowley used a set of non-parametric statistics, called a "seven-figure summary", including the extremes, deciles, and quartiles, along with the median.
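
The percentile-based form described above can be checked numerically. The following is a minimal sketch, not a reference implementation: the function names (normal_positions, percentile, seven_number_summary) are hypothetical, and the linear-interpolation percentile rule is an assumed convention among several in common use. It uses only the Python standard library.

```python
from statistics import NormalDist

# Percentile levels (in percent) quoted in the text above.
LEVELS = (2.15, 8.87, 25.0, 50.0, 75.0, 91.13, 97.85)


def normal_positions(levels=LEVELS):
    """Return the standard-normal z-score at each percentile level."""
    std = NormalDist()  # mean 0, standard deviation 1
    return [std.inv_cdf(p / 100.0) for p in levels]


def percentile(sorted_data, p):
    """Percentile by linear interpolation between order statistics.

    Assumed convention: position p/100 * (n - 1) in the sorted sample.
    Other conventions give slightly different values on small samples.
    """
    pos = p / 100.0 * (len(sorted_data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_data) - 1)
    frac = pos - lo
    return sorted_data[lo] + frac * (sorted_data[hi] - sorted_data[lo])


def seven_number_summary(data, levels=LEVELS):
    """Return the seven percentiles of `data` at the given levels."""
    ordered = sorted(data)
    if not ordered:
        raise ValueError("data must contain at least one value")
    return [percentile(ordered, p) for p in levels]


if __name__ == "__main__":
    z = normal_positions()
    # Gaps between consecutive z-scores; under normality they should be
    # nearly identical, which is the "evenly spaced" property.
    gaps = [b - a for a, b in zip(z, z[1:])]
    print([round(g, 3) for g in gaps])
    print(seven_number_summary(range(101)))
```

If the seven values computed from a sample are far from equally spaced, that is a hint (not a formal test) that the data may not be normally distributed.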

Official source

https://en.wikipedia.org/wiki/Seven-number_summary

Related courses (3)

PHYS-423: Plasma I

Following an introduction of the main plasma properties, the fundamental concepts of the fluid and kinetic theory of plasmas are introduced. Applications concerning laboratory, space, and astrophysica

PHYS-201(d): General physics: electromagnetism

The topics covered by the course are concepts of fluid mechanics, waves, and electromagnetism.

CS-401: Applied data analysis

This course teaches the basic techniques, methodologies, and practical skills required to draw meaningful insights from a variety of data, with the help of the most acclaimed software tools in the dat

Related concepts (5)

Sample maximum and minimum

In statistics, the sample maximum and sample minimum, also called the largest observation and smallest observation, are the values of the greatest and least elements of a sample. They are basic summary statistics, used in descriptive statistics such as the five-number summary and Bowley's seven-figure summary and the associated box plot. The minimum and the maximum value are the first and last order statistics (often denoted X(1) and X(n) respectively, for a sample size of n).

Five-number summary

The five-number summary is a set of descriptive statistics that provides information about a dataset. It consists of the five most important sample percentiles:

the sample minimum (smallest observation)
the lower quartile or first quartile
the median (the middle value)
the upper quartile or third quartile
the sample maximum (largest observation)

In addition to the median of a single set of data there are two related statistics called the upper and lower quartiles.
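
As a small illustration of the five values listed above, the sketch below computes them with the Python standard library. The function name five_number_summary is hypothetical, and the choice of the "inclusive" quartile method is an assumption: quartile conventions differ between textbooks and software, so other tools may report slightly different quartiles for the same data.

```python
from statistics import quantiles


def five_number_summary(data):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    values = list(data)
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # n=4 yields the three quartile cut points; "inclusive" treats the
    # sample minimum and maximum as the 0th and 100th percentiles.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


if __name__ == "__main__":
    print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```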

Box plot

In descriptive statistics, a box plot or boxplot is a method for graphically showing the locality, spread, and skewness of groups of numerical data through their quartiles. In addition to the box on a box plot, there can be lines (called whiskers) extending from the box to indicate variability outside the upper and lower quartiles; for this reason the plot is also called the box-and-whisker plot or the box-and-whisker diagram. Outliers that differ significantly from the rest of the dataset may be plotted as individual points beyond the whiskers on the box plot.

Related lectures (4)

Statistical Analysis: Boxplot and Normal Distribution

Introduces statistical analysis concepts like boxplot and normal distribution using real data examples.

Quantiles, Sampling, Histogram Density

Explores quantiles, sampling, and histogram density for understanding distributions and constructing confidence intervals.

Statistical Measures: Mean, Median, and Dispersion Techniques

Discusses statistical measures of central tendency and dispersion, focusing on mean, median, and their implications in data analysis.

Related publications (12)

DESI mock challenge: Halo and galaxy catalogues with the bias assignment method

Cheng Zhao, Ginevra Favole, Yu Yu

Context. We present a novel approach to the construction of mock galaxy catalogues for large-scale structure analysis based on the distribution of dark matter halos obtained with effective bias models at the field level. Aims. We aim to produce mock galaxy ...

EDP SCIENCES S A, 2023

High spatial resolution dataset of La Mobiliere insurance customers

Claudia Rebeca Binder Signer, Emanuele Massaro

We present the La Mobiliere insurance customers dataset: a 12-year-long longitudinal collection of data on policies of customers of the Swiss insurance company La Mobiliere. To preserve the privacy of La Mobiliere customers, we propose the data aggregated ...

NATURE PORTFOLIO, 2022

Angular systematics-free cosmological analysis of galaxy clustering in configuration space

Cheng Zhao, Anand Stéphane Raichoor

Galaxy redshift surveys are subject to incompleteness and inhomogeneous sampling due to the various constraints inherent to spectroscopic observations. This can introduce systematic errors on the summary statistics of interest, which need to be mitigated i ...

OXFORD UNIV PRESS, 2022
