The look-elsewhere effect is a phenomenon in the statistical analysis of scientific experiments where an apparently statistically significant observation may have actually arisen by chance because of the sheer size of the parameter space to be searched.
Once the possibility of look-elsewhere error in an analysis is acknowledged, it can be compensated for by careful application of standard mathematical techniques.
More generally known in statistics as the problem of multiple comparisons, the term gained some media attention in 2011, in the context of the search for the Higgs boson at the Large Hadron Collider.
Bonferroni correction
Many statistical tests deliver a p-value: the probability, assuming the effect one seeks to demonstrate is in fact absent (the null hypothesis), of obtaining a result at least as extreme as the one observed. When asking "does X affect Y?", it is common to vary X and see if there is significant variation in Y as a result. If this p-value is less than some predetermined statistical significance threshold α, one considers the result "significant".
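This decision rule can be illustrated with a minimal standard-library sketch; the fair-coin scenario and the function name are illustrative assumptions, not part of the article. The test asks whether a coin is biased, given 60 heads in 100 flips, by summing every binomial outcome at least as unlikely as the one observed:

```python
# Illustrative example (assumed scenario): exact two-sided binomial test
# of whether a coin is fair, using only the standard library.
from math import comb

def binom_p_two_sided(k, n, p0=0.5):
    """p-value: probability under the null hypothesis (bias p0) of an
    outcome at least as improbable as k heads in n flips."""
    pmf = [comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(n + 1)]
    return sum(pr for pr in pmf if pr <= pmf[k])

alpha = 0.05
p = binom_p_two_sided(60, 100)   # 60 heads in 100 flips
print(round(p, 4), p < alpha)    # p is just above 0.05: not significant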
However, if one is performing multiple tests ("looking elsewhere" if the first test fails), then a p-value of 1/n is expected to occur about once per n tests. For example, even when there is no real effect, an event with p < 0.05 will still occur, on average, once for every 20 tests performed. To compensate for this, one can divide the threshold α by the number of tests n, so that a result is significant only when p < α/n; or, equivalently, multiply the observed p-value by the number of tests (significant when np < α).
This is a simplified case: strictly, n is the number of degrees of freedom in the tests, i.e. the number of effectively independent tests. If the tests are correlated rather than fully independent, this effective number may be lower than the raw number of tests, and the plain correction is then conservative.
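The effect of the correction can be checked by simulation. The following is a minimal sketch (the sample sizes and seed are arbitrary assumptions): it draws batches of pure-noise p-values, which under the null hypothesis are uniform on [0, 1], and counts how often at least one spurious "discovery" occurs with the raw threshold α versus the Bonferroni-corrected threshold α/n:

```python
# Sketch of the Bonferroni correction on pure-noise data.
import random

random.seed(1)
alpha, n_tests, trials = 0.05, 20, 2000

raw_fp = corrected_fp = 0
for _ in range(trials):
    # Under the null hypothesis, every p-value is uniform on [0, 1].
    p_values = [random.random() for _ in range(n_tests)]
    if min(p_values) < alpha:
        raw_fp += 1                    # at least one spurious "discovery"
    if min(p_values) < alpha / n_tests:
        corrected_fp += 1

print(raw_fp / trials)        # roughly 1 - 0.95**20 ≈ 0.64
print(corrected_fp / trials)  # roughly alpha = 0.05
```

With 20 tests and no real effect, the uncorrected rule produces a false alarm in about two-thirds of experiments, while the corrected rule keeps the family-wise error rate near the intended 5%.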
The look-elsewhere effect is a frequent cause of "significance inflation" when the number of independent tests n is underestimated because failed tests are not published.