Descriptive statistics

A descriptive statistic (in the count noun sense) is a summary statistic that quantitatively describes or summarizes features of a collection of information, while descriptive statistics (in the mass noun sense) is the process of using and analysing those statistics. Descriptive statistics is distinguished from inferential statistics (or inductive statistics) by its aim to summarize a sample rather than to use the data to learn about the population that the sample is thought to represent. This generally means that descriptive statistics, unlike inferential statistics, is not developed on the basis of probability theory, and descriptive measures are frequently nonparametric.

Even when a data analysis draws its main conclusions using inferential statistics, descriptive statistics are generally also presented. For example, papers reporting on human subjects typically include a table giving the overall sample size, sample sizes in important subgroups (e.g., for each treatment or exposure group), and demographic or clinical characteristics such as the average age, the proportion of subjects of each sex, and the proportion of subjects with related co-morbidities.

Some measures commonly used to describe a data set are measures of central tendency and measures of variability or dispersion. Measures of central tendency include the mean, median and mode, while measures of variability include the standard deviation (or variance) and the minimum and maximum values of the variables; kurtosis and skewness additionally describe the shape of the distribution.

Descriptive statistics provide simple summaries about the sample and about the observations that have been made. Such summaries may be either quantitative (summary statistics) or visual (simple-to-understand graphs). They may either form the basis of the initial description of the data as part of a more extensive statistical analysis, or be sufficient in and of themselves for a particular investigation.
For example, the shooting percentage in basketball is a descriptive statistic that summarizes the performance of a player or a team.
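
The measures named above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a prescribed method: the `ages` sample is invented for the example, and `shooting_percentage` is a hypothetical helper name introduced here.

```python
import statistics

# Hypothetical sample: ages of seven study participants (illustrative data only).
ages = [23, 25, 25, 31, 38, 40, 56]

# Measures of central tendency and of variability for the sample.
summary = {
    "n": len(ages),
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    "stdev": statistics.stdev(ages),  # sample standard deviation (n - 1 denominator)
    "min": min(ages),
    "max": max(ages),
}


def shooting_percentage(made, attempted):
    """Share of attempted shots that were made, as a percentage.

    Hypothetical helper for the basketball example; not from the source.
    """
    if attempted == 0:
        raise ValueError("no attempts recorded")
    return 100.0 * made / attempted


for name, value in summary.items():
    print(f"{name}: {value}")
print("shooting %:", shooting_percentage(45, 100))
```

Each value summarizes the sample itself; none of them makes a claim about a wider population, which is what separates this from inferential statistics.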

Related categories (32)

Statistical graphics

Statistical graphics, also known as statistical graphical techniques, are graphics used in the field of statistics for data visualization. Whereas statistics and data analysis procedures generally yield their output in numeric or tabular form, graphical techniques allow such results to be displayed in some sort of pictorial form. They include plots such as scatter plots, histograms, probability plots, spaghetti plots, residual plots, box plots, block plots and biplots. Exploratory data analysis (EDA) relies heavily on such techniques.

Mathematical statistics

Mathematical statistics is the application of probability theory, a branch of mathematics, to statistics, as opposed to techniques for collecting statistical data. Specific mathematical techniques which are used for this include mathematical analysis, linear algebra, stochastic analysis, differential equations, and measure theory. Statistical data collection is concerned with the planning of studies, especially with the design of randomized experiments and with the planning of surveys using random sampling.

Probability distribution

In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomenon in terms of its sample space and the probabilities of events (subsets of the sample space). For instance, if X is used to denote the outcome of a coin toss ("the experiment"), then the probability distribution of X would take the value 0.5 (1 in 2 or 1/2) for X = heads, and 0.

Related concepts (28)

Rank–size distribution

Rank–size distribution is the distribution of size by rank, in decreasing order of size. For example, if a data set consists of items of sizes 5, 100, 5, and 8, the rank–size distribution is 100, 8, 5, 5 (ranks 1 through 4). This is also known as the rank–frequency distribution when the source data are from a frequency distribution. Such distributions are of particular interest when the data vary significantly in scale, such as city sizes or word frequencies.
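
The worked example above can be reproduced with a short Python sketch. The function name `rank_size` is an assumption made for illustration; the approach is simply a descending sort with ranks starting at 1.

```python
def rank_size(sizes):
    """Return (rank, size) pairs, largest size first; ranks start at 1."""
    ordered = sorted(sizes, reverse=True)
    return list(enumerate(ordered, start=1))


# Sizes taken from the example in the text.
pairs = rank_size([5, 100, 5, 8])
print(pairs)  # [(1, 100), (2, 8), (3, 5), (4, 5)]
```

Tied sizes (the two 5s here) receive consecutive ranks in this sketch; other tie-handling conventions are possible.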

Algebra of random variables

The algebra of random variables in statistics provides rules for the symbolic manipulation of random variables, while avoiding delving too deeply into the mathematically sophisticated ideas of probability theory. Its symbolism allows the treatment of sums, products, ratios and general functions of random variables, as well as operations such as finding the probability distributions, expectations (or expected values), variances and covariances of such combinations.

Parallel coordinates

Parallel coordinates are a common way of visualizing and analyzing high-dimensional datasets. To show a set of points in an n-dimensional space, a backdrop is drawn consisting of n parallel lines, typically vertical and equally spaced. A point in n-dimensional space is represented as a polyline with vertices on the parallel axes; the position of the vertex on the i-th axis corresponds to the i-th coordinate of the point.
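
The mapping from a point to its polyline can be sketched as follows. This is an illustration under stated assumptions: the function name `polyline`, the `spacing` parameter, and the rescaling of each coordinate to the range [0, 1] using per-axis bounds are choices made here, not details given in the text.

```python
def polyline(point, lows, highs, spacing=1.0):
    """Vertices (x, y) of the polyline representing one n-dimensional point.

    Axis i is a vertical line at x = i * spacing. The y position on that axis
    is the i-th coordinate rescaled to [0, 1] using the axis bounds
    lows[i] and highs[i] (an assumed normalisation, so axes share a height).
    """
    vertices = []
    for i, (value, lo, hi) in enumerate(zip(point, lows, highs)):
        # Place the vertex mid-axis when the axis has zero range.
        y = 0.5 if hi == lo else (value - lo) / (hi - lo)
        vertices.append((i * spacing, y))
    return vertices


# A point in 3-dimensional space, with assumed per-axis bounds.
print(polyline([2, 10, 0], lows=[0, 0, 0], highs=[4, 20, 1]))
# [(0.0, 0.5), (1.0, 0.5), (2.0, 0.0)]
```

Drawing line segments between consecutive vertices gives the polyline; repeating this for every point in the data set produces the full plot.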

Related courses (32)

CS-401: Applied data analysis

This course teaches the basic techniques, methodologies, and practical skills required to draw meaningful insights from a variety of data, with the help of the most acclaimed software tools in the dat

COM-480: Data visualization

Understanding why and how to present complex data interactively in an effective manner has become a crucial skill for any data scientist. In this course, you will learn how to design, judge, build and

ME-474: Numerical flow simulation

This course provides practical experience in the numerical simulation of fluid flows. Numerical methods are presented in the framework of the finite volume method. A simple solver is developed with Ma

Related lectures (32)

Data Visualization: Principles and Practices

Explores data visualization principles, including chart navigation, histograms, scatter plots, box plots, and color usage.

Visualizing Data: Techniques and Applications

Explores techniques and applications of data visualization, emphasizing the importance of effective communication and unconventional examples.

Data Visualization: Principles and Practices

Emphasizes the importance of data visualization techniques and practices for effective data analysis and communication.

Related publications (26)

An extension of the stochastic sewing lemma and applications to fractional stochastic calculus

Toyomu Matsuda

We give an extension of Le's stochastic sewing lemma. The stochastic sewing lemma proves convergence in $L_m$ of Riemann type sums $\sum_{[s,t] \in \pi} A_{s,t}$ for an adapted two-parameter stochastic process $A$, under certain conditions on the moments o ...

Cambridge Univ Press, 2024

Orchestrating chromosome conformation capture analysis with Bioconductor

Cyril Matthey-Doret

Genome-wide chromatin conformation capture assays provide formidable insights into the spatial organization of genomes. However, due to the complexity of the data structure, their integration in multi-omics workflows remains challenging. We present data st ...

Nature Research, 2024

Probing Catalytic Sites and Adsorbate Spillover on Ultrathin FeO2-x Film on Ir(111) during CO Oxidation

Harald Brune, Hao Yin, Wei Fang

The spatially resolved identification of active sites on the heterogeneous catalyst surface is an essential step toward directly visualizing a catalytic reaction with atomic scale. To date, ferrous centers on platinum group metals have shown promising pote ...

Amer Chemical Soc, 2024