Concept: Sampling bias

Summary

In statistics, sampling bias is a bias in which a sample is collected in such a way that some members of the intended population have a lower or higher sampling probability than others. It results in a biased sample of a population (or non-human factors) in which not all individuals, or instances, were equally likely to have been selected. If this is not accounted for, results can be erroneously attributed to the phenomenon under study rather than to the method of sampling.
Medical sources sometimes refer to sampling bias as ascertainment bias. Ascertainment bias has essentially the same definition, but it is still sometimes classified as a separate type of bias.
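As a minimal illustration of the definition above (all numbers are hypothetical), the sketch below draws a sample in which one group has twice the selection probability of the other, then compares the resulting prevalence estimate with a simple random sample:

```python
import random

random.seed(0)

# Population: 10,000 individuals with a binary trait (30% "positive").
population = [1] * 3000 + [0] * 7000
true_mean = sum(population) / len(population)  # 0.3

# Biased sampling (with replacement): positives are twice as likely to
# be selected (weight 2 vs 1), so the sample over-represents them.
weights = [2 if member == 1 else 1 for member in population]
biased_sample = random.choices(population, weights=weights, k=2000)

# Unbiased simple random sample for comparison.
unbiased_sample = random.sample(population, k=2000)

print(f"true prevalence:   {true_mean:.3f}")
print(f"biased estimate:   {sum(biased_sample) / len(biased_sample):.3f}")
print(f"unbiased estimate: {sum(unbiased_sample) / len(unbiased_sample):.3f}")
```

Under these weights the biased estimate concentrates near 0.6/1.3 ≈ 0.46 rather than the true 0.30: the error comes from the sampling method, not from the trait being studied.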
Distinction from selection bias
Sampling bias is usually classified as a subtype of selection bias, sometimes specifically termed sample selection bias, but some classify it as a separate type of bias.
A distinction, albeit not universally accepted, is that sampling bias undermines the external validity of a test (the ability of its results to be generalized to the rest of the population), while selection bias mainly addresses internal validity for differences or similarities found in the sample at hand.



Related concepts (5)

Statistics

Statistics (from German: Statistik, "description of a state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data.

Sampling (statistics)

In statistics, quality assurance, and survey methodology, sampling is the selection of a subset or a statistical sample (termed sample for short) of individuals from within a statistical population to estimate characteristics of the whole population.

Selection bias

Selection bias is the bias introduced by the selection of individuals, groups, or data for analysis in such a way that proper randomization is not achieved, thereby failing to ensure that the sample obtained is representative of the population intended to be analyzed.

Related courses (4)

MGT-581: Introduction to econometrics

The course provides an introduction to econometrics. The objective is to learn how to make valid (i.e., causal) inferences from economic data. It explains the main estimators and presents methods to deal with endogeneity issues.

DH-500: Computational Social Media

The course integrates concepts from media studies, machine learning, multimedia and network science to characterize social practices and analyze content in sites like Facebook, Twitter and YouTube. Students will learn computational methods to infer individual and networked phenomena in social media.

MICRO-435: Quantum and nanocomputing

The course teaches non von-Neumann architectures. The first part of the course deals with quantum computing, sensing, and communications. The second focuses on field-coupled and conduction-based nanocomputing, in-memory and molecular computing, cellular automata, and spintronic computing.

Related publications (11)

Counter-intuitive associations appear frequently in epidemiology, and these results are often debated. In particular, several scenarios are characterized by a general risk factor that appears protective in particular subpopulations, for example, individuals suffering from a specific disease. However, these associations do not necessarily represent causal effects. Selection bias due to conditioning on a collider may often be involved, and causal graphs are widely used to highlight such biases. These graphs, however, are qualitative, and they do not provide information on the real-life relevance of a spurious association. Quantitative estimates of such associations can be obtained from simple statistical models. In this study, we present several paradoxical associations that occur in epidemiology, and we explore these associations in a causal, frailty framework. By using frailty models, we are able to put numbers on spurious effects that are often neglected in epidemiology. We discuss several counter-intuitive findings that have been reported in real-life analyses, and we present calculations that may expand the understanding of these associations. In particular, we derive novel expressions to explain the magnitude of bias in index-event studies.
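The collider mechanism this abstract describes can be reproduced in a few lines (all prevalences and risks below are hypothetical, chosen only for illustration): two independent risk factors become negatively associated once the analysis conditions on disease status, making one factor look "protective" among the diseased.

```python
import random

random.seed(1)
n = 100_000

# Two independent binary risk factors (hypothetical 30% prevalence each).
x = [random.random() < 0.3 for _ in range(n)]
y = [random.random() < 0.3 for _ in range(n)]
# Disease D is a collider: each factor independently raises its risk.
d = [random.random() < 0.05 + 0.4 * xi + 0.4 * yi for xi, yi in zip(x, y)]

def p_y_given_x(x_val, diseased_only=False):
    """Estimate P(Y=1 | X=x_val), optionally conditioning on D=1."""
    ys = [yi for xi, yi, di in zip(x, y, d)
          if xi == x_val and (di or not diseased_only)]
    return sum(ys) / len(ys)

# In the full population Y is independent of X (both rates near 0.30)...
print(p_y_given_x(True), p_y_given_x(False))
# ...but among the diseased, X=1 predicts a much lower rate of Y:
# the spurious association induced by conditioning on the collider.
print(p_y_given_x(True, diseased_only=True),
      p_y_given_x(False, diseased_only=True))
```

Restricting the analysis to D=1 is exactly the kind of index-event selection the paper quantifies: no causal link between X and Y exists, yet the conditional rates differ sharply.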

Claudio Bruschini, Edoardo Charbon, Utku Karaca, Ekin Kizilkan, Myung Jae Lee, Vladimir Pesic (2017)

This work presents a novel InGaAs/InP SPAD structure fabricated using a selective area growth (SAG) method. The surface topography of the selectively grown film deposited within the 70 μm diffusion apertures is used to engineer the Zn diffusion profile to suppress premature edge breakdown. The device achieves a highly uniform active area without the need for shallow diffused guard ring (GR) regions that are inherent in standard InGaAs/InP SPADs. We have obtained 33% and 43% photon detection probability (PDP) at 1550 nm, with 5 V and 7 V excess bias, respectively. These measurements were performed at 300 K and 225 K. The dark count rate (DCR) per unit area at room temperature and at 5 V excess bias is 430 cps/μm², and it decreases to 5 cps/μm² at 225 K. Timing jitter is measured with passive quenching at 1550 nm as 149 ps at full-width-at-half-maximum (FWHM) (300 K, 5 V excess bias). The proposed technology is suitable for a number of applications, including optical time-domain reflectometry (OTDR), quantum information, and light detection and ranging (LiDAR).

Pal Christie Ryalen, Mats Julius Stensrud (2018)

Time-to-event outcomes are often evaluated on the hazard scale, but interpreting hazards may be difficult. Recently in the causal inference literature concerns have been raised that hazards actually have a built-in selection bias that prevents simple causal interpretations. This is a problem even in randomized controlled trials, where hazard ratios have become a standard measure of treatment effects. Modelling on the hazard scale is nevertheless convenient, for example to adjust for covariates; using hazards for intermediate calculations may therefore be desirable. In this paper we present a generic method for transforming hazard estimates consistently to other scales at which these built-in selection biases are avoided. The method is based on differential equations and generalizes a well-known relation between the Nelson–Aalen and Kaplan–Meier estimators. Using the martingale central limit theorem, we show that covariances can be estimated consistently for a large class of estimators, thus allowing for rapid calculation of confidence intervals. Hence, given cumulative hazard estimates based on, for example, Aalen's additive hazard model, we can obtain many other parameters without much more effort. We give several examples and the associated estimators. Coverage and convergence speed are explored via simulations, and the results suggest that reliable estimates can be obtained in real-life scenarios.
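The "well-known relation" this abstract builds on is that the Kaplan–Meier product-limit estimator is approximately exp(−Â(t)), where Â(t) is the Nelson–Aalen cumulative hazard. A minimal sketch on a small, hypothetical right-censored data set:

```python
import math

# Hypothetical right-censored data: (time, event) pairs, event=1 for an
# observed failure and event=0 for censoring; times chosen distinct.
data = [(2, 1), (3, 1), (4, 0), (5, 1), (7, 0), (8, 1), (9, 1), (11, 0)]

km = 1.0      # Kaplan-Meier product-limit survival estimate S(t)
na = 0.0      # Nelson-Aalen cumulative hazard estimate A(t)
n_at_risk = len(data)

for t, event in sorted(data):
    if event:
        km *= 1 - 1 / n_at_risk   # product-limit step at each failure
        na += 1 / n_at_risk       # cumulative hazard increment
        print(f"t={t}: KM={km:.3f}  exp(-NA)={math.exp(-na):.3f}")
    n_at_risk -= 1                # censored and failed both leave risk set
```

At each failure time exp(−Â) tracks the Kaplan–Meier curve closely; the discrepancy shrinks as risk sets grow. The paper's contribution is to generalize this mapping, transforming cumulative hazard estimates to other scales via differential equations.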