
# Binomial test

Summary

In statistics, the binomial test is an exact test of the statistical significance of deviations from a theoretically expected distribution of observations into two categories using sample data.
The binomial test is useful to test hypotheses about the probability of success $\pi$:

$$H_0\colon \pi = \pi_0$$

where $\pi_0$ is a user-defined value between 0 and 1.

If in a sample of size $n$ there are $k$ successes, while we expect $n\pi_0$, the formula of the binomial distribution gives the probability of finding this value:

$$\Pr(X = k) = \binom{n}{k} \pi_0^k (1 - \pi_0)^{n-k}$$
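This probability mass function can be evaluated directly. A minimal sketch in Python, using only the standard library (the function name `binom_pmf` is illustrative, not from any particular package):

```python
from math import comb

# Binomial probability of exactly k successes in n trials,
# given the hypothesised success probability pi0.
def binom_pmf(k, n, pi0):
    return comb(n, k) * pi0**k * (1 - pi0)**(n - k)

# e.g. probability of exactly 5 heads in 10 fair-coin tosses
prob = binom_pmf(5, 10, 0.5)  # 252/1024 = 0.24609375
```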
If the null hypothesis $H_0$ were correct, then the expected number of successes would be $n\pi_0$. We find our $p$-value for this test by considering the probability of seeing an outcome as, or more, extreme. For a one-tailed test, this is straightforward to compute. Suppose that we want to test if $\pi < \pi_0$. Then our $p$-value would be

$$p = \sum_{i=0}^{k} \Pr(X = i) = \sum_{i=0}^{k} \binom{n}{i} \pi_0^i (1 - \pi_0)^{n-i}$$
An analogous computation can be done if we're testing if $\pi > \pi_0$, using the summation of the range from $k$ to $n$ instead.
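Both one-tailed variants amount to summing one tail of the binomial distribution. A stdlib-only Python sketch (names and the `alternative` parameter are illustrative choices, not a fixed API):

```python
from math import comb

def binom_pmf(i, n, pi0):
    return comb(n, i) * pi0**i * (1 - pi0)**(n - i)

def one_tailed_pvalue(k, n, pi0, alternative="less"):
    """Lower-tail p-value P(X <= k) for the alternative pi < pi0;
    upper-tail P(X >= k) for the alternative pi > pi0."""
    if alternative == "less":
        return sum(binom_pmf(i, n, pi0) for i in range(0, k + 1))
    return sum(binom_pmf(i, n, pi0) for i in range(k, n + 1))

# e.g. 2 successes in 10 trials under H0: pi0 = 0.5
p_lower = one_tailed_pvalue(2, 10, 0.5)  # P(X <= 2) = 56/1024
```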
Calculating a $p$-value for a two-tailed test is slightly more complicated, since a binomial distribution isn't symmetric if $\pi_0 \neq 0.5$. This means that we can't simply double the $p$-value from the one-tailed test. Recall that we want to consider events that are as, or more, extreme than the one we've seen, so we should consider the probability that we would see an event that is as, or less, likely than $X = k$. Let $\mathcal{I} = \{\,i \colon \Pr(X = i) \leq \Pr(X = k)\,\}$ denote all such events. Then the two-tailed $p$-value is calculated as

$$p = \sum_{i \in \mathcal{I}} \Pr(X = i)$$
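The "as, or less, likely" criterion translates directly into code: sum the probabilities of all outcomes whose probability does not exceed that of the observed count. A stdlib-only Python sketch (function names are illustrative; a small tolerance guards against floating-point ties):

```python
from math import comb

def binom_pmf(i, n, pi0):
    return comb(n, i) * pi0**i * (1 - pi0)**(n - i)

def two_tailed_pvalue(k, n, pi0):
    """Exact two-tailed binomial test: sum the probabilities of all
    outcomes no more likely than the observed count k."""
    pk = binom_pmf(k, n, pi0)
    return sum(binom_pmf(i, n, pi0) for i in range(n + 1)
               if binom_pmf(i, n, pi0) <= pk * (1 + 1e-12))

# e.g. 3 sixes observed in 24 die rolls, under H0: pi0 = 1/6
pval = two_tailed_pvalue(3, 24, 1/6)
```

For the symmetric case $\pi_0 = 0.5$ this reduces to doubling the one-tailed value, e.g. ten heads in ten fair tosses gives $p = 2/1024$.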
One common use of the binomial test is the case where the null hypothesis is that two categories occur with equal frequency ($\pi_0 = 0.5$), such as a coin toss. Tables are widely available that give the significance of the observed numbers of observations in the two categories for this case. However, the binomial test is not restricted to this case.
When there are more than two categories, and an exact test is required, the multinomial test, based on the multinomial distribution, must be used instead of the binomial test.
For large samples, the binomial distribution is well approximated by convenient continuous distributions, and these are used as the basis for alternative tests that are much quicker to compute, such as Pearson's chi-squared test and the G-test.
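To illustrate how good the large-sample approximation is, the sketch below compares the exact two-tailed binomial $p$-value with the one from the normal approximation (a simple two-sided $z$-test on the success count). Stdlib-only Python; function names and the example numbers are illustrative:

```python
from math import comb, erf, sqrt

def binom_pmf(i, n, pi0):
    return comb(n, i) * pi0**i * (1 - pi0)**(n - i)

def exact_two_tailed(k, n, pi0):
    # Sum the probabilities of all outcomes no more likely than k.
    pk = binom_pmf(k, n, pi0)
    return sum(binom_pmf(i, n, pi0) for i in range(n + 1)
               if binom_pmf(i, n, pi0) <= pk * (1 + 1e-12))

def normal_approx_two_tailed(k, n, pi0):
    # z-statistic from the normal approximation to the binomial
    z = (k - n * pi0) / sqrt(n * pi0 * (1 - pi0))
    phi = lambda x: 0.5 * (1 + erf(x / sqrt(2)))  # standard normal CDF
    return 2 * (1 - phi(abs(z)))

# For large n the two agree closely, e.g. 530 successes in 1000 trials:
p_exact = exact_two_tailed(530, 1000, 0.5)
p_approx = normal_approx_two_tailed(530, 1000, 0.5)
```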


Related concepts (7)

Chi-squared test

A chi-squared test (also chi-square or χ2 test) is a statistical hypothesis test used in the analysis of contingency tables when the sample sizes are large. In simpler terms, this test is primarily used to examine whether two categorical variables (two dimensions of the contingency table) are independent in influencing the test statistic (values within the table). The test is valid when the test statistic is chi-squared distributed under the null hypothesis, specifically Pearson's chi-squared test and variants thereof.

Pearson's chi-squared test

Pearson's chi-squared test () is a statistical test applied to sets of categorical data to evaluate how likely it is that any observed difference between the sets arose by chance. It is the most widely used of many chi-squared tests (e.g., Yates, likelihood ratio, portmanteau test in time series, etc.) – statistical procedures whose results are evaluated by reference to the chi-squared distribution. Its properties were first investigated by Karl Pearson in 1900.

Related courses (5)

The two main topics covered by this course are classical molecular dynamics and the Monte Carlo method.

Discrete mathematics is a discipline with applications to almost all areas of study. It provides a set of indispensable tools to computer science in particular. This course reviews (familiar) topics a

This course aims to introduce the basic principles of machine learning in the context of the digital humanities. We will cover both supervised and unsupervised learning techniques, and study and imple

Related lectures (16)

Probability Basics: Events and Independence

Introduces the basics of probability, covering event probabilities, conditional probabilities, and independence.

Confidence Intervals and T-Test

Explores confidence intervals, T-test, and hypothesis testing, including assumptions and critical regions.

Neural Networks: Training and Activation

Explores neural networks, activation functions, backpropagation, and PyTorch implementation.