In statistics, the logit (ˈloʊdʒɪt ) function is the quantile function associated with the standard logistic distribution. It has many uses in data analysis and machine learning, especially in data transformations.
Mathematically, the logit is the inverse of the standard logistic function , so the logit is defined as
Because of this, the logit is also called the log-odds since it is equal to the logarithm of the odds where p is a probability. Thus, the logit is a type of function that maps probability values from to real numbers in , akin to the probit function.
If p is a probability, then p/(1 − p) is the corresponding odds; the logit of the probability is the logarithm of the odds, i.e.:
The base of the logarithm function used is of little importance in the present article, as long as it is greater than 1, but the natural logarithm with base e is the one most often used. The choice of base corresponds to the choice of logarithmic unit for the value: base 2 corresponds to a shannon, base e to a “nat”, and base 10 to a hartley; these units are particularly used in information-theoretic interpretations. For each choice of base, the logit function takes values between negative and positive infinity.
The “logistic” function of any number is given by the inverse-logit:
The difference between the logits of two probabilities is the logarithm of the odds ratio (R), thus providing a shorthand for writing the correct combination of odds ratios only by adding and subtracting:
There have been several efforts to adapt linear regression methods to a domain where the output is a probability value, , instead of any real number . In many cases, such efforts have focused on modeling this problem by mapping the range to and then running the linear regression on these transformed values. In 1934 Chester Ittner Bliss used the cumulative normal distribution function to perform this mapping and called his model probit an abbreviation for "probability unit";. However, this is computationally more expensive.