Category: Statistical sampling

Summary

In statistics, quality assurance, and survey methodology, sampling is the selection of a subset or a statistical sample (termed sample for short) of individuals from within a statistical population to estimate characteristics of the whole population. Statisticians attempt to collect samples that are representative of the population. Sampling has lower costs and faster data collection compared to recording data from the entire population, and thus, it can provide insights in cases where it is infeasible to measure an entire population.
Each observation measures one or more properties (such as weight, location, colour or mass) of independent objects or individuals. In survey sampling, weights can be applied to the data to adjust for the sample design, particularly in stratified sampling. Results from probability theory and statistical theory are employed to guide the practice. In business and medical research, sampling is widely used for gathering information about a population. Acceptance sampling is used to determine if a production lot of material meets the governing specifications.
Successful statistical practice is based on focused problem definition. In sampling, this includes defining the "population" from which our sample is drawn. A population can be defined as including all people or items with the characteristics one wishes to understand. Because there is very rarely enough time or money to gather information from everyone or everything in a population, the goal becomes finding a representative sample (or subset) of that population.
Sometimes what defines a population is obvious. For example, a manufacturer needs to decide whether a batch of material from production is of high enough quality to be released to the customer or should be scrapped or reworked due to poor quality. In this case, the batch is the population.
Although the population of interest often consists of physical objects, sometimes it is necessary to sample over time, space, or some combination of these dimensions.
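As an illustration of the weighting idea mentioned in the summary, here is a minimal Python sketch of a stratified estimate of a population mean. The function name `stratified_mean`, the two strata, and their values are hypothetical and chosen only so the arithmetic is easy to follow; real survey weights depend on the actual sample design.

```python
import random

def stratified_mean(strata, n_per_stratum, seed=0):
    """Estimate a population mean from a stratified simple random sample.

    strata: dict mapping a stratum name to the list of values in that stratum.
    n_per_stratum: number of units drawn without replacement from each stratum.
    """
    rng = random.Random(seed)
    total_size = sum(len(values) for values in strata.values())
    estimate = 0.0
    for values in strata.values():
        sample = rng.sample(values, n_per_stratum)
        sample_mean = sum(sample) / len(sample)
        # Weight each stratum mean by the stratum's share of the population.
        estimate += (len(values) / total_size) * sample_mean
    return estimate

# Hypothetical population: two strata of very different size.
population = {
    "small_items": [10.0] * 900,
    "large_items": [50.0] * 100,
}

weighted_estimate = stratified_mean(population, n_per_stratum=20)

# Ignoring the design (pooling 20 + 20 sampled units with equal weight)
# over-represents the small stratum of large items.
unweighted_estimate = (20 * 10.0 + 20 * 50.0) / 40

true_mean = sum(sum(v) for v in population.values()) / 1000
print(weighted_estimate, unweighted_estimate, true_mean)
```

Because both strata are sampled with the same number of units even though one is nine times larger, the unweighted pooled mean is far from the population mean, while the weighted estimate recovers it.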


Poisson sampling

In survey methodology, Poisson sampling (sometimes denoted as PO sampling) is a sampling process where each element of the population is subjected to an independent Bernoulli trial which determines whether the element becomes part of the sample. Each element of the population may have a different probability of being included in the sample. The probability that an element $i$ is included in a sample during the drawing of a single sample is called the first-order inclusion probability of that element, commonly written $\pi_i$.
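The process described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `poisson_sample`, the four inclusion probabilities, and the fixed seed are assumptions made for the example, not part of any survey package.

```python
import random

def poisson_sample(inclusion_probs, rng):
    """Return the indices selected by one Poisson sampling draw.

    Each element i gets its own independent Bernoulli trial that succeeds
    with probability inclusion_probs[i].
    """
    return [i for i, p in enumerate(inclusion_probs) if rng.random() < p]

# Hypothetical inclusion probabilities for a population of four elements.
inclusion_probs = [0.0, 1.0, 0.5, 0.2]
rng = random.Random(42)  # fixed seed only to make the run repeatable

sample = poisson_sample(inclusion_probs, rng)

# Repeating the draw shows that the sample size itself is random and that
# each element appears at roughly its own inclusion probability.
n_draws = 5000
counts = [0] * len(inclusion_probs)
for _ in range(n_draws):
    for i in poisson_sample(inclusion_probs, rng):
        counts[i] += 1
frequencies = [c / n_draws for c in counts]
print(sample, frequencies)
```

An element with inclusion probability 0 is never drawn and one with probability 1 is always drawn; everything in between is selected at random, so two draws from the same population can have different sizes.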

Sampling design

In the theory of finite population sampling, a sampling design specifies for every possible sample its probability of being drawn. Mathematically, a sampling design is denoted by the function $P(S)$, which gives the probability of drawing a sample $S$. During Bernoulli sampling, $P(S)$ is given by $P(S) = q^{N_\text{sample}(S)} (1 - q)^{N_\text{pop} - N_\text{sample}(S)}$, where $q$ is the probability of each element being included in the sample, $N_\text{sample}(S)$ is the total number of elements in the sample $S$, and $N_\text{pop}$ is the total number of elements in the population (before sampling commenced).
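A short Python sketch can make the Bernoulli sampling design concrete. It assumes every element is included independently with the same probability `q`, so that a sample's probability depends only on its size; the function name, the four-element population, and the value `q = 0.3` are hypothetical choices for illustration.

```python
from itertools import combinations

def bernoulli_design_probability(sample_size, population_size, q):
    """Probability of drawing one specific sample under Bernoulli sampling.

    Each of the sample_size selected elements contributes a factor q, and each
    of the remaining elements contributes a factor (1 - q).
    """
    return q ** sample_size * (1 - q) ** (population_size - sample_size)

population = ["a", "b", "c", "d"]  # hypothetical population
q = 0.3                            # hypothetical inclusion probability

# A sampling design assigns a probability to every possible sample, so the
# probabilities of all subsets of the population must add up to 1.
total = 0.0
for k in range(len(population) + 1):
    for subset in combinations(population, k):
        total += bernoulli_design_probability(len(subset), len(population), q)

p_empty = bernoulli_design_probability(0, len(population), q)
print(total, p_empty)
```

Enumerating all subsets is only feasible for tiny populations, but it checks that the design is a proper probability distribution over samples, including the empty sample.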

Survivorship bias

Survivorship bias or survival bias is the logical error of concentrating on entities that passed a selection process while overlooking those that did not. This can lead to incorrect conclusions because of incomplete data. Survivorship bias is a form of selection bias that can lead to overly optimistic beliefs because multiple failures are overlooked, such as when companies that no longer exist are excluded from analyses of financial performance.
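The financial-performance example can be illustrated with a small Python calculation. All company names, returns, and the closure threshold below are made up for the sketch; the point is only to show how dropping the failures shifts the average.

```python
from statistics import mean

# Hypothetical annual returns for six companies.
annual_returns = {
    "A": 0.12, "B": 0.08, "C": -0.35,
    "D": 0.05, "E": -0.50, "F": 0.10,
}

# Assumed selection process: companies losing more than 30% cease to exist
# and therefore vanish from later performance databases.
CLOSURE_THRESHOLD = -0.30
survivors = {
    name: r for name, r in annual_returns.items() if r > CLOSURE_THRESHOLD
}

mean_all = mean(list(annual_returns.values()))
mean_survivors = mean(list(survivors.values()))
print(f"all companies: {mean_all:.4f}, survivors only: {mean_survivors:.4f}")
```

Averaging over the survivors alone gives a positive return, while the full set of companies, failures included, has a negative average return.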

Path Integral Methods in Atomistic Modelling

The course provides an introduction to the use of path integral methods in atomistic simulations.
The path integral formalism allows one to introduce quantum mechanical effects on the equilibrium and (ap

Synchrotrons and X-Ray Free Electron Lasers (part 1)

DH-406: Machine learning for DH

This course aims to introduce the basic principles of machine learning in the context of the digital humanities. We will cover both supervised and unsupervised learning techniques, and study and imple

PHYS-467: Machine learning for physicists

Machine learning and data analysis are becoming increasingly central in sciences including physics. In this course, fundamental principles and methods of machine learning will be introduced and practi

EE-205: Signals and systems (for EL)

This course lays the foundations of an essential concept in engineering: the notion of a system. More specifically, the course presents the theory of linear time-invariant (LTI) systems, which are

Mathematical statistics

Mathematical statistics is the application of probability theory, a branch of mathematics, to statistics, as opposed to techniques for collecting statistical data. Specific mathematical techniques which are used for this include mathematical analysis, linear algebra, stochastic analysis, differential equations, and measure theory. Statistical data collection is concerned with the planning of studies, especially with the design of randomized experiments and with the planning of surveys using random sampling.

Social psychology

Social psychology is the scientific study of how thoughts, feelings, and behaviors are influenced by the real or imagined presence of other people or by social norms. Social psychologists typically explain human behavior as a result of the relationship between mental states and social situations, studying the social conditions under which thoughts, feelings, and behaviors occur, and how these variables influence social interactions.

Experiment

An experiment is a procedure carried out to support or refute a hypothesis, or determine the efficacy or likelihood of something previously untried. Experiments provide insight into cause-and-effect by demonstrating what outcome occurs when a particular factor is manipulated. Experiments vary greatly in goal and scale but always rely on repeatable procedure and logical analysis of the results. There also exist natural experimental studies.

Quantum Information

Explores the CHSH operator, self-testing, eigenstates, and quantifying randomness in quantum systems.

Efficient Stochastic Numerical Methods

Explores efficient stochastic numerical methods for modeling and learning, covering topics like the Analytical Engine and kinase inhibitors.

Graph Coloring: Theory and Applications

Covers the theory and applications of graph coloring, focusing on disassortative stochastic block models and planted coloring.

François Maréchal, Jonas Schnidrig, Cédric Terrier

The recent geopolitical conflicts in Europe have underscored the vulnerability of the current energy system to the volatility of energy carrier prices. In the prospect of defining robust energy systems ensuring sustainable energy supply in the future, the ...

Surrogate-based optimization is widely used for aerodynamic shape optimization, and its effectiveness depends on representative sampling of the design space. However, traditional sampling methods are hard-pressed to effectively sample high-dimensional desi ...

2024

Lenka Zdeborová, Giovanni Piccioli, Emanuele Troiani

In this paper, we study sampling from a posterior derived from a neural network. We propose a new probabilistic model consisting of adding noise at every pre- and post-activation in the network, arguing that the resulting posterior can be sampled using an ...