In statistics, the Bonferroni correction is a method to counteract the multiple comparisons problem.
The method is named for its use of the Bonferroni inequalities.
An extension of the method to confidence intervals was proposed by Olive Jean Dunn.
Statistical hypothesis testing is based on rejecting the null hypothesis if the likelihood of the observed data under the null hypothesis is low. If multiple hypotheses are tested, the probability of observing a rare event increases, and therefore, the likelihood of incorrectly rejecting a null hypothesis (i.e., making a Type I error) increases.
The Bonferroni correction compensates for that increase by testing each individual hypothesis at a significance level of $\alpha/m$, where $\alpha$ is the desired overall alpha level and $m$ is the number of hypotheses. For example, if a trial is testing $m$ hypotheses with a desired overall level $\alpha$, then the Bonferroni correction would test each individual hypothesis at level $\alpha/m$. Likewise, the same phenomenon appears when constructing multiple confidence intervals.
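To make the rule concrete, here is a minimal Python sketch of the correction applied to a vector of p-values (the function name bonferroni_reject is illustrative, not taken from any particular library):

```python
import numpy as np

def bonferroni_reject(p_values, alpha=0.05):
    """Return a boolean array: True where the null hypothesis is rejected
    after Bonferroni correction at overall level alpha."""
    p = np.asarray(p_values, dtype=float)
    m = p.size                    # number of hypotheses tested
    return p <= alpha / m         # per-test threshold alpha / m

# Three p-values tested at an overall alpha of 0.05,
# so each individual test uses the threshold 0.05 / 3 ≈ 0.0167.
print(bonferroni_reject([0.01, 0.04, 0.30]))   # [ True False False]
```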
Let $H_1, \ldots, H_m$ be a family of null hypotheses and $p_1, \ldots, p_m$ their corresponding p-values. Let $m$ be the total number of null hypotheses, and let $m_0$ be the number of true null hypotheses (which is presumably unknown to the researcher). The family-wise error rate (FWER) is the probability of rejecting at least one true $H_i$, that is, of making at least one type I error. The Bonferroni correction rejects the null hypothesis for each $p_i \le \frac{\alpha}{m}$, thereby controlling the FWER at level $\le \alpha$. Proof of this control follows from Boole's inequality, as follows:
$$\mathrm{FWER} = P\left\{ \bigcup_{i=1}^{m_0} \left( p_i \le \frac{\alpha}{m} \right) \right\} \le \sum_{i=1}^{m_0} P\left( p_i \le \frac{\alpha}{m} \right) \le m_0\,\frac{\alpha}{m} \le \alpha,$$
since, for each true null hypothesis, $P\left(p_i \le \frac{\alpha}{m}\right) \le \frac{\alpha}{m}$.
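The bound can also be checked empirically. The following small simulation (illustrative only; it draws independent uniform p-values under the null purely for convenience, even though the Bonferroni bound itself requires no such assumption) estimates the FWER with and without the correction:

```python
import numpy as np

rng = np.random.default_rng(0)
m, alpha, n_sims = 20, 0.05, 100_000

# Under a true null hypothesis, a p-value is uniform on [0, 1].
p = rng.uniform(size=(n_sims, m))

fwer_uncorrected = np.mean((p <= alpha).any(axis=1))       # ≈ 1 - 0.95**20 ≈ 0.64
fwer_bonferroni  = np.mean((p <= alpha / m).any(axis=1))   # ≈ 0.05 (at or below alpha)
print(fwer_uncorrected, fwer_bonferroni)
```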
This control does not require any assumptions about dependence among the p-values or about how many of the null hypotheses are true.
Rather than testing each hypothesis at the $\alpha/m$ level, the hypotheses may be tested at any other combination of levels that add up to $\alpha$, provided that the level of each test is decided before looking at the data. For example, for two hypothesis tests, an overall $\alpha$ of 0.05 could be maintained by conducting one test at 0.04 and the other at 0.01.
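A sketch of such an unequal (weighted) split, with the per-test levels fixed in advance (the function name weighted_bonferroni_reject is hypothetical):

```python
import numpy as np

def weighted_bonferroni_reject(p_values, alphas, alpha=0.05):
    """Reject H_i when p_i <= alphas[i]; the per-test levels must sum to alpha."""
    p = np.asarray(p_values, dtype=float)
    a = np.asarray(alphas, dtype=float)
    if not np.isclose(a.sum(), alpha):
        raise ValueError("per-test levels must add up to the overall alpha")
    return p <= a

# Two tests with an overall alpha of 0.05 split as 0.04 and 0.01.
print(weighted_bonferroni_reject([0.03, 0.02], alphas=[0.04, 0.01]))  # [ True False]
```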
The procedure proposed by Dunn can be used to adjust confidence intervals: to obtain $m$ confidence intervals with simultaneous coverage of at least $1-\alpha$, each individual interval is constructed at confidence level $1-\frac{\alpha}{m}$.
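A minimal sketch of such adjusted intervals for several sample means, assuming approximately normal estimates (the function name and setup are illustrative, not a full implementation):

```python
import numpy as np
from scipy import stats

def dunn_adjusted_intervals(samples, alpha=0.05):
    """Each of the m intervals is built at level 1 - alpha/m, so that all m
    intervals cover their true means simultaneously with probability >= 1 - alpha."""
    m = len(samples)
    z = stats.norm.ppf(1 - alpha / (2 * m))   # adjusted two-sided critical value
    intervals = []
    for x in samples:
        x = np.asarray(x, dtype=float)
        se = x.std(ddof=1) / np.sqrt(x.size)  # standard error of the mean
        intervals.append((x.mean() - z * se, x.mean() + z * se))
    return intervals

rng = np.random.default_rng(1)
print(dunn_adjusted_intervals([rng.normal(size=50) for _ in range(3)]))
```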
In statistics, the multiple comparisons, multiplicity or multiple testing problem occurs when one considers a set of statistical inferences simultaneously or infers a subset of parameters selected based on the observed values. The more inferences are made, the more likely erroneous inferences become. Several statistical techniques have been developed to address that problem, typically by requiring a stricter significance threshold for individual comparisons, so as to compensate for the number of inferences being made.
In null-hypothesis significance testing, the p-value is the probability of obtaining test results at least as extreme as the result actually observed, under the assumption that the null hypothesis is correct. A very small p-value means that such an extreme observed outcome would be very unlikely under the null hypothesis. Even though reporting p-values of statistical tests is common practice in academic publications of many quantitative fields, misinterpretation and misuse of p-values is widespread and has been a major topic in mathematics and metascience.
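For instance, for a two-sided test whose statistic has a standard normal distribution under the null hypothesis, the p-value is the probability of a statistic at least as extreme as the observed one (a minimal illustration with assumed numbers):

```python
from scipy import stats

z_observed = 2.1
p_value = 2 * stats.norm.sf(abs(z_observed))   # sf = 1 - cdf, i.e. the upper tail
print(round(p_value, 4))                        # ≈ 0.0357
```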
The harmonic mean p-value (HMP) is a statistical technique for addressing the multiple comparisons problem that controls the strong-sense family-wise error rate (this claim has been disputed). It improves on the power of the Bonferroni correction by performing combined tests, i.e., by testing whether groups of p-values are statistically significant, like Fisher's method. However, unlike Fisher's method, it avoids the restrictive assumption that the p-values are independent.
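A minimal sketch of the statistic itself, the weighted harmonic mean of a group of p-values (the function name is ours; a full implementation with exact thresholds is provided by, e.g., Wilson's harmonicmeanp R package):

```python
import numpy as np

def harmonic_mean_p(p_values, weights=None):
    """Weighted harmonic mean of a group of p-values (equal weights by default)."""
    p = np.asarray(p_values, dtype=float)
    w = np.full(p.size, 1.0 / p.size) if weights is None else np.asarray(weights, float)
    return w.sum() / np.sum(w / p)

# The raw HMP is approximately valid as a combined p-value for small alpha;
# exact thresholds require its asymptotic (Landau-type) distribution.
print(harmonic_mean_p([0.03, 0.20, 0.45]))   # ≈ 0.074
```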
Statistics lies at the foundation of data science, providing a unifying theoretical and methodological backbone for the diverse tasks encountered in this emerging field. This course rigorously develops ...
This course teaches the basic techniques, methodologies, and practical skills required to draw meaningful insights from a variety of data, with the help of the most acclaimed software tools in the dat ...
In diverse fields such as medical imaging, astrophysics, geophysics, or material study, a common challenge exists: reconstructing the internal volume of an object using only physical measurements taken from its exterior or surface. This scientific approach ...
In this paper we study the problem of social learning under multiple true hypotheses and self-interested agents. In this setup, each agent receives data that might be generated from a different hypothesis (or state) than the data other agents receive. In c ...
Objectives To determine and compare the qualitative and quantitative diagnostic performance of a single sagittal fast spin echo (FSE) T2-weighted Dixon sequence in differentiating benign and malignant vertebral compression fractures (VCF), using multiple r ...
Explores statistical hypothesis testing, error types, thresholding, and multiple comparisons in GLM.
Explores constructing confidence regions, inverting hypothesis tests, and the pivotal method, emphasizing the importance of likelihood methods in statistical inference.
Explores the principles and applications of Analysis of Variance (ANOVA), including test hypotheses, models, assumptions, and post-hoc tests.