Poisson regression is a type of regression used to model count data (i.e. when the dependent variable is a count of some occurrence). It belongs to the family of generalized linear models (GLMs), and its canonical link function is the logarithm.

Poisson Distribution

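Poisson regression assumes that the dependent variable is drawn from a Poisson distribution. The probability mass function for the Poisson distribution with rate parameter $\lambda > 0$ is:

$$P(Y = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \quad k = 0, 1, 2, \ldots$$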

One of the features of the Poisson distribution is that the mean and the variance are both equal to the rate parameter $\lambda$, i.e. they are equal to each other.
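The equality of mean and variance can be checked numerically. The following is a minimal sketch in base R: the value `lambda <- 3` and the truncation of the support at 100 are arbitrary choices made for illustration, not something taken from the text above.

```r
# Rate parameter: an arbitrary value chosen for illustration
lambda <- 3

# Support truncated at 100; the probability mass beyond that is negligible here
k <- 0:100

# PMF written out by hand: lambda^k * exp(-lambda) / k!
pmf_manual <- lambda^k * exp(-lambda) / factorial(k)

# The same PMF from R's built-in density function
pmf_builtin <- dpois(k, lambda)

# The two versions should agree up to floating-point error
pmf_match <- isTRUE(all.equal(pmf_manual, pmf_builtin))

# Mean and variance computed directly from the PMF
pmf_mean <- sum(k * pmf_builtin)
pmf_var  <- sum((k - pmf_mean)^2 * pmf_builtin)

# Both values should be (numerically) equal to lambda
print(c(mean = pmf_mean, variance = pmf_var))
```

Real count data often show a variance larger than the mean, so it is worth comparing the sample mean and sample variance of the response before relying on this assumption.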

Equation

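The equation for a Poisson regression is given by:

$$\log(\lambda) = \boldsymbol{\beta}^\top \mathbf{x}$$

where $\lambda$ is the mean of the Poisson-distributed response, $\boldsymbol{\beta}$ is a vector of coefficients and $\mathbf{x}$ is a vector of predictor variables.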

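There are a few other ways to write this. The idea is the same as with any other linear model: we're estimating the expected value of $y$ conditioned on some $\mathbf{x}$. To make it clear that we're modeling $\mathbb{E}[y \mid \mathbf{x}]$ in a Poisson regression, we could write it as:

$$\log\left(\mathbb{E}[y \mid \mathbf{x}]\right) = \boldsymbol{\beta}^\top \mathbf{x}$$

Or equivalently

$$\mathbb{E}[y \mid \mathbf{x}] = \exp\left(\boldsymbol{\beta}^\top \mathbf{x}\right)$$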
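One consequence of the log link is that coefficients act multiplicatively on the expected count. The sketch below illustrates this with made-up numbers: the coefficient values and predictor values are hypothetical, and the first element of `x` is assumed to be a constant 1 for the intercept.

```r
# Hypothetical coefficients: intercept and one slope
beta <- c(0.5, 0.2)

# Predictor vector with a leading 1 for the intercept (an assumed convention)
x <- c(1, 3)

# Linear predictor and expected count: E[y | x] = exp(beta' x)
eta <- sum(beta * x)
mu  <- exp(eta)

# Increase the predictor by one unit and recompute the expected count
x_plus_one <- c(1, 4)
mu_plus_one <- exp(sum(beta * x_plus_one))

# The ratio of expected counts equals exp(slope), whatever the starting x
rate_ratio <- mu_plus_one / mu
print(c(rate_ratio = rate_ratio, exp_slope = exp(beta[2])))
```

This is why exponentiated Poisson regression coefficients are usually read as rate ratios: a one-unit increase in a predictor multiplies the expected count by the exponential of its coefficient.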

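The parameters $\boldsymbol{\beta}$ can be estimated using maximum likelihood estimation. For $n$ independent observations $(\mathbf{x}_i, y_i)$, the likelihood is given by:

$$L(\boldsymbol{\beta}) = \prod_{i=1}^{n} \frac{\lambda_i^{y_i} e^{-\lambda_i}}{y_i!}, \qquad \lambda_i = \exp\left(\boldsymbol{\beta}^\top \mathbf{x}_i\right)$$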
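To make the estimation step concrete, here is a rough sketch that maximizes the Poisson log-likelihood directly with the general-purpose optimizer `optim()` and compares the result with `glm()`. The simulated data, the true coefficient values and the helper name `neg_log_lik` are all assumptions made for illustration. In practice `glm()` uses iteratively reweighted least squares rather than BFGS, so this shows the idea, not how R fits the model internally.

```r
set.seed(42)

# Simulated data: one predictor plus an intercept (illustrative values only)
n <- 500
x1 <- rnorm(n)
X <- cbind(1, x1)
beta_true <- c(0.5, 0.3)
y <- rpois(n, lambda = as.vector(exp(X %*% beta_true)))

# Negative Poisson log-likelihood:
#   log L(beta) = sum_i [ y_i * eta_i - exp(eta_i) - log(y_i!) ],  eta_i = beta' x_i
neg_log_lik <- function(beta) {
  eta <- as.vector(X %*% beta)
  -sum(y * eta - exp(eta) - lgamma(y + 1))
}

# Minimize the negative log-likelihood, starting from all-zero coefficients
fit_optim <- optim(par = c(0, 0), fn = neg_log_lik, method = "BFGS")

# Reference fit using R's built-in GLM routine
fit_glm <- glm(y ~ x1, family = poisson)

# The two sets of estimates should be close to each other
print(rbind(optim = fit_optim$par, glm = unname(coef(fit_glm))))
```

The term `lgamma(y + 1)` is the log of `y!`. It does not depend on the coefficients, so it could be dropped without changing the estimates.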

Implementation in R

Like other GLMs, we can fit a Poisson regression in R via the glm() function, e.g.

mod <- glm(y ~ x1 + x2, data = my_data, family = poisson)
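For a version that can be run end to end, the sketch below simulates a small data set and then fits and inspects the same model. The contents of `my_data`, the coefficient values used to generate it, and the `new_obs` row are hypothetical. Only the `glm()` call itself mirrors the line above.

```r
set.seed(1)

# Hypothetical data: two predictors and a Poisson-distributed count response
n <- 200
my_data <- data.frame(x1 = rnorm(n), x2 = runif(n))
my_data$y <- rpois(n, lambda = exp(0.3 + 0.5 * my_data$x1 - 0.4 * my_data$x2))

# Fit the Poisson regression; the log link is the default for family = poisson
mod <- glm(y ~ x1 + x2, data = my_data, family = poisson)

# Coefficient table, reported on the log scale
print(summary(mod))

# Exponentiated coefficients: multiplicative effects on the expected count
print(exp(coef(mod)))

# Expected count for a new observation; type = "response" applies exp() for us
new_obs <- data.frame(x1 = 0, x2 = 0.5)
pred <- predict(mod, newdata = new_obs, type = "response")
print(pred)
```

Without `type = "response"`, `predict()` returns values on the scale of the linear predictor, i.e. the log of the expected count.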