# Poisson regression

Summary

In statistics, Poisson regression is a generalized linear model form of regression analysis used to model count data and contingency tables. Poisson regression assumes the response variable Y has a Poisson distribution, and assumes the logarithm of its expected value can be modeled by a linear combination of unknown parameters. A Poisson regression model is sometimes known as a log-linear model, especially when used to model contingency tables.
Negative binomial regression is a popular generalization of Poisson regression because it loosens the highly restrictive assumption that the variance is equal to the mean made by the Poisson model. The traditional negative binomial regression model is based on the Poisson-gamma mixture distribution. This model is popular because it models the Poisson heterogeneity with a gamma distribution.
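The equal-mean-and-variance assumption can be illustrated with a small simulation. The sketch below is not from the source: it uses only the Python standard library, the helper names `sample_poisson` and `dispersion_ratio` are hypothetical, and the values `mu = 4` and `shape = 2` are arbitrary choices. It compares the variance-to-mean ratio of plain Poisson counts with that of counts whose rate is itself gamma-distributed (the Poisson-gamma mixture mentioned above).

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    Adequate for small rates; not intended for large lam.
    """
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def dispersion_ratio(counts):
    """Sample variance divided by sample mean (about 1 for Poisson data)."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return var / mean


rng = random.Random(0)            # fixed seed so the run is repeatable
n, mu, shape = 20000, 4.0, 2.0    # illustrative values, not from the source

# Plain Poisson counts with constant rate mu.
poisson_counts = [sample_poisson(mu, rng) for _ in range(n)]

# Poisson-gamma mixture: each count gets its own gamma-distributed rate
# with mean mu (gammavariate takes shape and scale).
mixture_counts = [
    sample_poisson(rng.gammavariate(shape, mu / shape), rng) for _ in range(n)
]

ratio_poisson = dispersion_ratio(poisson_counts)
ratio_mixture = dispersion_ratio(mixture_counts)
print(ratio_poisson, ratio_mixture)
```

Under these assumptions the first ratio should sit close to 1, while the mixture has theoretical variance mu + mu²/shape, so its ratio should be near 1 + mu/shape = 3. A ratio well above 1 in real data is the usual hint that a negative binomial model may fit better than a Poisson one.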
Poisson regression models are generalized linear models with the logarithm as the (canonical) link function, and the Poisson distribution function as the assumed probability distribution of the response.
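If $\mathbf{x} \in \mathbb{R}^n$ is a vector of independent variables, then the model takes the form

$$\log(\operatorname{E}(Y \mid \mathbf{x})) = \alpha + \boldsymbol{\beta}' \mathbf{x},$$

where $\alpha \in \mathbb{R}$ and $\boldsymbol{\beta} \in \mathbb{R}^n$. Sometimes this is written more compactly as

$$\log(\operatorname{E}(Y \mid \mathbf{x})) = \boldsymbol{\theta}' \mathbf{x},$$

where $\mathbf{x}$ is now an (n + 1)-dimensional vector consisting of n independent variables concatenated to the number one. Here $\boldsymbol{\theta}$ is simply $\alpha$ concatenated to $\boldsymbol{\beta}$.
Thus, when given a Poisson regression model $\boldsymbol{\theta}$ and an input vector $\mathbf{x}$, the predicted mean of the associated Poisson distribution is given by

$$\operatorname{E}(Y \mid \mathbf{x}) = e^{\boldsymbol{\theta}' \mathbf{x}}.$$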
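As a minimal sketch of the predicted mean, the function below exponentiates the linear predictor after concatenating the number one to the inputs. The function name `predicted_mean`, the intercept-first ordering of the parameter vector, and the numeric values are assumptions made for illustration.

```python
import math


def predicted_mean(theta, x):
    """Return E(Y | x) = exp(theta' x) for a Poisson regression model.

    theta: [alpha, beta_1, ..., beta_n], intercept first (an assumed ordering)
    x:     [x_1, ..., x_n], the raw covariates without the leading one
    """
    if len(theta) != len(x) + 1:
        raise ValueError("theta must have exactly one more entry than x")
    x_aug = [1.0] + list(x)  # concatenate the number one to the inputs
    linear_predictor = sum(t * v for t, v in zip(theta, x_aug))
    return math.exp(linear_predictor)


theta = [0.5, 0.2, -0.1]  # hypothetical alpha = 0.5, beta = (0.2, -0.1)
x = [1.0, 3.0]
mean = predicted_mean(theta, x)  # exp(0.5 + 0.2 - 0.3) = exp(0.4)
print(mean)
```

Because of the log link, the predicted mean is always positive, and a one-unit increase in a covariate multiplies the mean by the exponential of its coefficient.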
If $Y_i$ are independent observations with corresponding values $\mathbf{x}_i$ of the predictor variables, then $\boldsymbol{\theta}$ can be estimated by maximum likelihood. The maximum-likelihood estimates lack a closed-form expression and must be found by numerical methods. The log-likelihood for Poisson regression is always concave in $\boldsymbol{\theta}$, making Newton–Raphson or other gradient-based methods appropriate estimation techniques.
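The Newton–Raphson idea can be sketched for the simplest case of an intercept and a single covariate, where the 2×2 linear system can be solved by hand. This is an illustrative implementation under stated assumptions, not a reference one: the function name `fit_poisson_newton`, the starting point, the stopping rule, and the toy data are all choices made here. The update uses the standard score, the sum of (y_i − μ_i) x_i, and the negative Hessian, the sum of μ_i x_i x_i′, of the Poisson log-likelihood.

```python
import math


def fit_poisson_newton(xs, ys, max_iter=50, tol=1e-10):
    """Fit log E(Y | x) = alpha + beta * x by Newton-Raphson (one covariate)."""
    # Start from the intercept-only fit; requires at least one positive count.
    alpha = math.log(sum(ys) / len(ys))
    beta = 0.0
    for _ in range(max_iter):
        mus = [math.exp(alpha + beta * x) for x in xs]
        # Gradient of the log-likelihood: sum_i (y_i - mu_i) * (1, x_i)
        g0 = sum(y - m for y, m in zip(ys, mus))
        g1 = sum((y - m) * x for x, y, m in zip(xs, ys, mus))
        # Negative Hessian: sum_i mu_i * (1, x_i)(1, x_i)'
        a00 = sum(mus)
        a01 = sum(m * x for x, m in zip(xs, mus))
        a11 = sum(m * x * x for x, m in zip(xs, mus))
        # Solve the 2x2 system (negative Hessian) * step = gradient.
        det = a00 * a11 - a01 * a01
        d0 = (a11 * g0 - a01 * g1) / det
        d1 = (a00 * g1 - a01 * g0) / det
        alpha += d0
        beta += d1
        if abs(d0) + abs(d1) < tol:
            break
    return alpha, beta


# Toy data invented for this example.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1, 2, 2, 4, 7, 11]
alpha_hat, beta_hat = fit_poisson_newton(xs, ys)
print(alpha_hat, beta_hat)
```

At the returned estimate the gradient is numerically zero, which is the first-order condition for the maximum. In practice a statistics library would be used instead, with safeguards such as step halving that this sketch omits.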
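Given a set of parameters $\boldsymbol{\theta}$ and an input vector $\mathbf{x}$, the mean of the predicted Poisson distribution, as stated above, is given by

$$\lambda := \operatorname{E}(Y \mid \mathbf{x}) = e^{\boldsymbol{\theta}' \mathbf{x}},$$

and thus, the Poisson distribution's probability mass function is given by

$$p(y \mid \mathbf{x}; \boldsymbol{\theta}) = \frac{\lambda^y}{y!} e^{-\lambda} = \frac{e^{y \boldsymbol{\theta}' \mathbf{x}} e^{-e^{\boldsymbol{\theta}' \mathbf{x}}}}{y!}.$$

Now suppose we are given a data set consisting of m vectors $\mathbf{x}_i \in \mathbb{R}^{n+1}$, $i = 1, \ldots, m$, along with a set of m values $y_1, \ldots, y_m \in \mathbb{N}$.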
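A standard next step, given here as a sketch rather than as text from the source, is to write the likelihood of such a data set and take its logarithm. It assumes the observations are independent and writes $\mathbf{x}_i$ for the i-th input vector, $y_i$ for the i-th count, and $\boldsymbol{\theta}$ for the parameter vector.

```latex
\begin{aligned}
L(\boldsymbol{\theta})
  &= \prod_{i=1}^{m}
     \frac{e^{y_i \boldsymbol{\theta}' \mathbf{x}_i}\,
           e^{-e^{\boldsymbol{\theta}' \mathbf{x}_i}}}{y_i!}, \\
\ell(\boldsymbol{\theta}) = \log L(\boldsymbol{\theta})
  &= \sum_{i=1}^{m}
     \left( y_i \boldsymbol{\theta}' \mathbf{x}_i
            - e^{\boldsymbol{\theta}' \mathbf{x}_i}
            - \log(y_i!) \right), \\
\nabla_{\boldsymbol{\theta}}\, \ell(\boldsymbol{\theta})
  &= \sum_{i=1}^{m}
     \left( y_i - e^{\boldsymbol{\theta}' \mathbf{x}_i} \right) \mathbf{x}_i .
\end{aligned}
```

The term $\log(y_i!)$ does not depend on $\boldsymbol{\theta}$, so it can be dropped when maximizing. Setting the gradient to zero gives equations with no closed-form solution, which is why numerical methods are needed.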

Related courses (32)

Machine learning and data analysis are becoming increasingly central in sciences including physics. In this course, fundamental principles and methods of machine learning will be introduced and practi

General graduate course on regression methods

This course aims to introduce the basic principles of machine learning in the context of the digital humanities. We will cover both supervised and unsupervised learning techniques, and study and imple

Related lectures (106)

Related publications (83)

Related concepts (12)

In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression. This term is distinct from multivariate linear regression, where multiple correlated dependent variables are predicted, rather than a single scalar variable.

In statistics, overdispersion is the presence of greater variability (statistical dispersion) in a data set than would be expected based on a given statistical model. A common task in applied statistics is choosing a parametric model to fit a given set of empirical observations. This necessitates an assessment of the fit of the chosen model. It is usually possible to choose the model parameters in such a way that the theoretical population mean of the model is approximately equal to the sample mean.

In statistics, count data is a statistical data type describing countable quantities, data which can take only the counting numbers, non-negative integer values {0, 1, 2, 3, ...}, and where these integers arise from counting rather than ranking. The statistical treatment of count data is distinct from that of binary data, in which the observations can take only two values, usually represented by 0 and 1, and from ordinal data, which may also consist of integers but where the individual values fall on an arbitrary scale and only the relative ranking is important.

Inference: Model Checking

Covers iterative weighted least squares, generalized linear models, and model checking.

Modern Regression: Spring Barley Data

Covers inference, weighted least squares, spring barley data analysis, and smoothing techniques.

Regression Analysis: Disentangling Data

Covers regression analysis for disentangling data using linear regression modeling, transformations, interpretations of coefficients, and generalized linear models.

Mathieu Salzmann, Alexandre Massoud Alahi, Megh Hiren Shukla

Deep heteroscedastic regression involves jointly optimizing the mean and covariance of the predicted distribution using the negative log-likelihood. However, recent works show that this may result in sub-optimal convergence due to the challenges associated ...

2024

Nikolaos Stergiopulos, Georgios Rovas, Vasiliki Bikia, Stamatia Zoi Pagoulatou, Emma Marie Roussel

Stroke volume (SV) is a major biomarker of cardiac function, reflecting ventricular-vascular coupling. Despite this, hemodynamic monitoring and management seldomly includes assessments of SV and remains predominantly guided by brachial cuff blood pressure ...

Florent Gérard Krzakala, Lenka Zdeborová, Hugo Chao Cui

We consider the problem of learning a target function corresponding to a deep, extensive-width, non-linear neural network with random Gaussian weights. We consider the asymptotic limit where the number of samples, the input dimension and the network width ...

2023