In statistics, Poisson regression is a generalized linear model form of regression analysis used to model count data and contingency tables. Poisson regression assumes the response variable $Y$ has a Poisson distribution, and assumes the logarithm of its expected value can be modeled by a linear combination of unknown parameters. A Poisson regression model is sometimes known as a log-linear model, especially when used to model contingency tables.

Negative binomial regression is a popular generalization of Poisson regression because it loosens the highly restrictive assumption, made by the Poisson model, that the variance is equal to the mean. The traditional negative binomial regression model is based on the Poisson-gamma mixture distribution. This model is popular because it models the Poisson heterogeneity with a gamma distribution.

Poisson regression models are generalized linear models with the logarithm as the (canonical) link function, and the Poisson distribution function as the assumed probability distribution of the response. If $\mathbf{x} \in \mathbb{R}^n$ is a vector of independent variables, then the model takes the form $\log(\operatorname{E}(Y \mid \mathbf{x})) = \alpha + \boldsymbol{\beta}' \mathbf{x}$, where $\alpha \in \mathbb{R}$ and $\boldsymbol{\beta} \in \mathbb{R}^n$. Sometimes this is written more compactly as $\log(\operatorname{E}(Y \mid \mathbf{x})) = \boldsymbol{\theta}' \mathbf{x}$, where $\mathbf{x}$ is now an (n + 1)-dimensional vector consisting of n independent variables concatenated to the number one. Here $\boldsymbol{\theta}$ is simply $\alpha$ concatenated to $\boldsymbol{\beta}$. Thus, when given a Poisson regression model $\boldsymbol{\theta}$ and an input vector $\mathbf{x}$, the predicted mean of the associated Poisson distribution is given by $\operatorname{E}(Y \mid \mathbf{x}) = e^{\boldsymbol{\theta}' \mathbf{x}}$.

If $Y_i$ are independent observations with corresponding values $\mathbf{x}_i$ of the predictor variables, then $\boldsymbol{\theta}$ can be estimated by maximum likelihood. The maximum-likelihood estimates lack a closed-form expression and must be found by numerical methods. The log-likelihood surface for Poisson regression is always concave, making Newton–Raphson or other gradient-based methods appropriate estimation techniques.
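The Newton–Raphson estimation mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation. The function names `solve_linear` and `fit_poisson` are hypothetical. Each row of `X` is assumed to already start with the constant 1. The intercept is initialised at the log of the mean count, a choice made here for stability that assumes at least one nonzero count.

```python
import math


def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b], copied so the inputs are left untouched.
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def fit_poisson(X, y, max_iter=50, tol=1e-10):
    """Estimate theta by Newton-Raphson on the Poisson log-likelihood.

    X: list of rows, each beginning with 1.0 (the intercept term).
    y: list of non-negative integer counts (assumed not all zero).
    """
    p = len(X[0])
    theta = [0.0] * p
    # Assumption: start the intercept at log(mean(y)) to avoid overshooting.
    theta[0] = math.log(sum(y) / len(y))
    for _ in range(max_iter):
        # Predicted means mu_i = exp(theta' x_i).
        mu = [math.exp(sum(t * v for t, v in zip(theta, row))) for row in X]
        # Gradient of the log-likelihood: sum_i (y_i - mu_i) x_i.
        grad = [sum((yi - mi) * row[j] for row, yi, mi in zip(X, y, mu))
                for j in range(p)]
        # Negative Hessian: sum_i mu_i x_i x_i'.
        info = [[sum(mi * row[j] * row[k] for row, mi in zip(X, mu))
                 for k in range(p)] for j in range(p)]
        step = solve_linear(info, grad)
        theta = [t + s for t, s in zip(theta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return theta
```

At the fitted `theta`, the gradient is approximately zero. Because of the intercept term, this means the predicted means sum to the observed counts, which gives a quick sanity check on any fit.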
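Given a set of parameters $\boldsymbol{\theta}$ and an input vector $\mathbf{x}$, the mean of the predicted Poisson distribution, as stated above, is given by $\lambda := \operatorname{E}(Y \mid \mathbf{x}) = e^{\boldsymbol{\theta}' \mathbf{x}}$, and thus the Poisson distribution's probability mass function is given by $p(y \mid \mathbf{x}; \boldsymbol{\theta}) = \frac{\lambda^{y}}{y!} e^{-\lambda} = \frac{e^{y \boldsymbol{\theta}' \mathbf{x}} \, e^{-e^{\boldsymbol{\theta}' \mathbf{x}}}}{y!}$. Now suppose we are given a data set consisting of m vectors $\mathbf{x}_i \in \mathbb{R}^{n+1}$, $i = 1, \ldots, m$, along with a set of m values $y_1, \ldots, y_m \in \mathbb{N}$.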
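As a rough illustration of the probability mass function and of how it combines over a data set, the sketch below evaluates the log of the mass function for one observation and sums it over m observations. It assumes the observations are independent, so the log-likelihood is a sum. The names `log_pmf` and `log_likelihood` are hypothetical, and each input vector is assumed to include the leading 1.

```python
import math


def log_pmf(y, theta, x):
    """Log of the Poisson mass function with mean exp(theta' x).

    Computes y * theta'x - exp(theta'x) - log(y!), using lgamma(y + 1)
    for log(y!) so that large counts do not overflow.
    """
    eta = sum(t * v for t, v in zip(theta, x))
    return y * eta - math.exp(eta) - math.lgamma(y + 1)


def log_likelihood(theta, X, y):
    """Sum of log_pmf over the data set, assuming independent observations."""
    return sum(log_pmf(yi, theta, xi) for xi, yi in zip(X, y))
```

Working on the log scale avoids multiplying many small probabilities together. Maximising `log_likelihood` over `theta` gives the maximum-likelihood estimate.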