G-test

Summary
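In statistics, G-tests are likelihood-ratio or maximum likelihood statistical significance tests that are increasingly being used in situations where chi-squared tests were previously recommended.

The general formula for G is

$$G = 2 \sum_{i} O_i \ln\left(\frac{O_i}{E_i}\right),$$

where $O_i \geq 0$ is the observed count in a cell, $E_i > 0$ is the expected count under the null hypothesis, $\ln$ denotes the natural logarithm, and the sum is taken over all non-empty cells. Furthermore, the total observed count should be equal to the total expected count:

$$\sum_{i} O_i = \sum_{i} E_i = N,$$

where $N$ is the total number of observations.

We can derive the value of the G-test from the log-likelihood ratio test where the underlying model is a multinomial model. Suppose we had a sample $x = (x_1, \ldots, x_m)$ where each $x_i$ is the number of times that an object of type $i$ was observed. Furthermore, let $n = \sum_{i=1}^{m} x_i$ be the total number of objects observed. If we assume that the underlying model is multinomial, then the test statistic is defined by

$$\ln\left(\frac{L(\tilde{\theta} \mid x)}{L(\hat{\theta} \mid x)}\right) = \ln\left(\frac{\prod_{i=1}^{m} \tilde{\theta}_i^{x_i}}{\prod_{i=1}^{m} \hat{\theta}_i^{x_i}}\right),$$

where $\tilde{\theta}$ is the parameter vector specified by the null hypothesis and $\hat{\theta}$ is the maximum likelihood estimate (MLE) of the parameters given the data. The multinomial coefficient is the same in the numerator and the denominator, so it cancels. Recall that for the multinomial model, the MLE of $\theta_i$ given some data is defined by

$$\hat{\theta}_i = \frac{x_i}{n}.$$

Furthermore, we may represent each null hypothesis parameter $\tilde{\theta}_i$ as

$$\tilde{\theta}_i = \frac{e_i}{n}.$$

Thus, by substituting the representations of $\tilde{\theta}$ and $\hat{\theta}$ in the log-likelihood ratio, the equation simplifies to

$$\ln\left(\frac{L(\tilde{\theta} \mid x)}{L(\hat{\theta} \mid x)}\right) = \ln \prod_{i=1}^{m} \left(\frac{e_i}{x_i}\right)^{x_i} = \sum_{i=1}^{m} x_i \ln\left(\frac{e_i}{x_i}\right).$$

Relabel the variables $e_i$ with $E_i$ and $x_i$ with $O_i$. Finally, multiply by a factor of $-2$ (used to make the G-test formula asymptotically equivalent to Pearson's chi-squared test formula) to achieve the form

$$G = -2 \sum_{i=1}^{m} O_i \ln\left(\frac{E_i}{O_i}\right) = 2 \sum_{i=1}^{m} O_i \ln\left(\frac{O_i}{E_i}\right).$$

Heuristically, one can imagine $O_i$ as continuous and approaching zero, in which case $O_i \ln O_i \to 0$, and terms with zero observations can simply be dropped. However, the expected count must be strictly greater than zero in every cell ($E_i > 0$ for all $i$) for the method to apply.

Given the null hypothesis that the observed frequencies result from random sampling from a distribution with the given expected frequencies, the distribution of G is approximately a chi-squared distribution, with the same number of degrees of freedom as in the corresponding chi-squared test.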
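As an illustration of the formula for G, the following is a minimal Python sketch that uses only the standard library. The function name `g_statistic` and the example counts are hypothetical and chosen purely for illustration. The sketch enforces the two conditions stated above (strictly positive expected counts and matching totals) and skips cells with zero observations.

```python
import math


def g_statistic(observed, expected):
    """Return G = 2 * sum(O_i * ln(O_i / E_i)) over cells with O_i > 0."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(o < 0 for o in observed):
        raise ValueError("observed counts must be non-negative")
    if any(e <= 0 for e in expected):
        raise ValueError("every expected count must be strictly positive")
    if not math.isclose(sum(observed), sum(expected)):
        raise ValueError("observed and expected totals must match")
    # Cells with O_i = 0 are dropped, since O ln O tends to 0 as O tends to 0.
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )


# Hypothetical data: 100 observations in three categories,
# tested against a null hypothesis of a 1:2:1 ratio.
observed = [30, 50, 20]
expected = [25.0, 50.0, 25.0]

g = g_statistic(observed, expected)
df = len(observed) - 1  # goodness of fit with fully specified expected counts
print(f"G = {g:.4f} on {df} degrees of freedom")  # G = 2.0136 on 2 degrees of freedom
```

With these illustrative counts, G is about 2.01 on 2 degrees of freedom, which is below the 5% chi-squared critical value of about 5.99, so the null hypothesis would not be rejected at that level.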