In this video, we'll review the Poisson distribution, present the Poisson regression model, and describe when that model is appropriate in real-world settings. Recall from an earlier video that generalized linear models consist of three main components. First, there's a random component, which is the response, and we require it to come from the exponential family of distributions. Second is a systematic component, and third is a link between the two. Once we specify these components, we can use sample data and the model to estimate parameters, explain the data, and maybe predict new measurements. So let's specify those three components for Poisson data. A data frame for Poisson regression might look like the following. In this table, we have a response column, and that's the random component: we believe that the ys are realizations of Poisson random variables. Then we have predictors, or covariate classes, as the systematic component. What characterizes Poisson regression is that the response, again, is assumed to be generated from Poisson random variables, and the Poisson is in the exponential family of distributions. Generally, each measurement of the response yi, conditioned on the predictors, will come from a different Poisson distribution. So if we condition on our predictors, the probability that the random variable capital Yi is equal to little yi, where little yi is our observed response, is e raised to the negative lambda i, times lambda i raised to the yi, over yi factorial. It's important to note here that yi is a count. So it's zero, one, two, etc., and it's unbounded, so yi could potentially be very large. Here lambda i is greater than zero. It's called the rate parameter, and it's potentially different for every measurement, and the difference might depend on the covariate class.
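To make the probability formula above concrete, here is a minimal Python sketch that evaluates it for a given count and rate. The function name `poisson_pmf` and the two example rates are assumptions chosen for illustration; they do not come from the lecture.

```python
import math

def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson random variable with rate lam > 0."""
    if lam <= 0:
        raise ValueError("rate parameter must be positive")
    if y < 0:
        return 0.0
    # Work on the log scale so large counts do not overflow the factorial:
    # log P = -lam + y * log(lam) - log(y!), and lgamma(y + 1) = log(y!).
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

# Hypothetical rates for two different covariate classes.
for lam in (0.5, 4.0):
    probs = [round(poisson_pmf(y, lam), 4) for y in range(4)]
    print(f"lambda = {lam}: P(Y = 0..3) = {probs}")
```

Summing `y * poisson_pmf(y, lam)` over a long range of counts gives a quick numerical check that the mean is the rate parameter.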
So Poisson regression is really a way to model this rate parameter based on other information, and we'll think about an example of this in just a few moments. First, let's think about the random component. Again, it's the response variable, and each component yi of the response is a realization of a Poisson-distributed random variable. Our goal in statistical modeling is really to predict the mean of each response using the covariate class. In the case of Poisson regression, the mean mu i is equal to lambda i. We can show this using a little bit of calculus, and I'll let that be an exercise for you: mu i, which is the expected value of Yi, is just equal to the rate parameter. So if our goal is to estimate the mean, whether to predict new values of the response for a given covariate class or just because it's useful information for the problem we're working on, that's equivalent to estimating this rate parameter. We should also note that the variance of a Poisson random variable is also the rate parameter. So sigma i squared, which is the variance of Yi, is also equal to lambda i. And the canonical parameter from the exponential family, which we defined in a previous lesson, is theta i, and in terms of the mean it's equal to the natural log of mu i. Of course, that's the same as the natural log of lambda i. This will be useful for selecting a link function, because we often like to use this theta i, written as a function of the mean, as the canonical link function. If you think back to binomial regression, the canonical parameter for the binomial distribution was just the logit of the success probability, right? The log odds. Now, as with binomial regression, a linear combination of the predictors in the covariate class is the systematic component, and it can be used to predict or explain the mean of the Poisson random variable, that is, the mean of the response.
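For reference, the mean, the variance, and the canonical parameter mentioned above can all be read off the exponential family form of the Poisson. The sketch below assumes a common parametrization with canonical parameter theta i and a cumulant function written b(theta i); the notation used in the earlier lesson may differ.

```latex
\begin{align*}
P(Y_i = y_i) &= \frac{e^{-\lambda_i}\,\lambda_i^{y_i}}{y_i!}
  = \exp\bigl\{\, y_i \log\lambda_i - \lambda_i - \log(y_i!) \,\bigr\}, \\
\theta_i &= \log\lambda_i, \qquad b(\theta_i) = e^{\theta_i} = \lambda_i, \\
\mathrm{E}(Y_i) &= b'(\theta_i) = e^{\theta_i} = \lambda_i, \qquad
\mathrm{Var}(Y_i) = b''(\theta_i) = e^{\theta_i} = \lambda_i .
\end{align*}
```

The term multiplying y i inside the exponential is the canonical parameter, which is why the log shows up as the natural candidate for the link.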
And as with binomial regression, we can't simply use an estimate of the systematic component to directly predict mu i. Let's think about why that's the case for the Poisson regression model. It turns out that there's no reason to think that this linear predictor, eta i, will be positive, and that's required of a rate, right? You can imagine that you collect some data, you estimate your betas, you plug in your x values, and you end up with something negative. In this Poisson context, that negative number would be a prediction of a rate, and that can't be right. So we must have a way to link the mean of this Poisson distribution, which is positive, to this systematic component, this linear predictor, and that's where the link function comes in. The link function, again, describes how the mean response is linked to the linear predictor. We said the canonical link function is what's often used, and for Poisson regression, that's the log link. So our g takes lambda i and returns the log of lambda i. This is the canonical link function because it sets the linear predictor equal to the canonical parameter theta i from the exponential family formulation of the Poisson. That means we set our linear predictor eta i equal to g of lambda i, which is the log of lambda i, and that is equal to the canonical parameter. This is helpful because if you want to predict the rate parameter, which is the mean, then you can invert g, and that shows that lambda i is equal to e raised to eta i, e raised to that linear predictor. So once you have estimates of the betas that show up in the linear predictor, you can plug those in, plug in your covariate values, exponentiate, and get an estimate of your rate parameter.
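The plug-in-and-exponentiate step described above can be sketched in a few lines of Python. The coefficient values and the helper names `linear_predictor` and `predicted_rate` are hypothetical, chosen only to show that a negative linear predictor still maps to a positive rate.

```python
import math

# Hypothetical coefficient estimates: an intercept and one slope.
beta = [0.3, -0.8]

def linear_predictor(x, beta):
    """eta = beta_0 + beta_1 * x_1 + ... for one row of predictors x."""
    return beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))

def predicted_rate(x, beta):
    """Invert the log link: lambda = exp(eta), which is always positive."""
    return math.exp(linear_predictor(x, beta))

for x in ([0.0], [1.0], [3.0]):
    eta = linear_predictor(x, beta)
    lam = predicted_rate(x, beta)
    print(f"x = {x[0]}: eta = {eta:.2f}, lambda = {lam:.3f}")
```

With these made-up coefficients the linear predictor is negative at x = 3, yet the predicted rate stays above zero, which is exactly what the log link is there to guarantee.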
So now that we've specified the three components of the Poisson regression model, we're in a position to use sample data to estimate the parameters of the model. As with binomial regression, the estimation procedure is maximum likelihood estimation, and we'll discuss that briefly in the next lesson. Once we have the MLEs of the parameters, we can check the fit of our model. We can use the deviance statistic as a measure of how well the current model fits compared to the saturated model, which has as many parameters as measurements. There are other diagnostics that we can look at to see if our model fits well, and we'll discuss those in future lessons. Before we finish up, I want to mention something about rate responses and what's often called an offset term. The Poisson response counts the number of events that occur, and events often occur within what's called an exposure period, say over a particular period of time or within a particular region of space. To understand that, let's consider an example. Suppose we'd like to construct a model that can predict the number of times an individual would be admitted to a hospital. The covariate class, the set of predictors, might include things like age, gender, and other health conditions, for example, a heart condition or whether a person has diabetes. A data frame with such data might look like the one presented here. Now consider the response measurements. Suppose I told you that individual three, this person here, was observed for one month, and individual one here was observed for one year. That information should really matter, right? Individual one had a longer exposure period, a year versus a month, and that means they had a longer period of time for a hospitalization to occur. So we can modify our Poisson model to include this information.
So in rate models, the mean of our response should really account for the exposure, and in the example that we just discussed, the exposure is the length of time each individual was observed. That means we should think of what we're modeling as a rate: a count over some exposure. The count is our response, and we might call the exposure e for short. Now, we can incorporate this information into our model through the link function. Using the log link, we set the log of the rate equal to the linear predictor. The rate is the expected count mu i over ei, and then, just using a law of logs, the log of the rate is the log of mu i minus the log of ei. Since the exposure terms are known, so the ei values are known, because we know how long we'll be observing, for example, different people and their hospitalization counts, the log of ei can be moved to the right-hand side of our model as an offset term, a term whose coefficient is fixed at one rather than estimated. Now, in R, we would use the glm function to fit a Poisson regression model, and we'll get some experience with this in a future lesson. But if you have different exposures for the different rows of your data frame, for the different units that show up, then you should include the exposure wrapped in the offset function. So here, notice you have glm, and then the formula is the response, tilde, then the sum of your predictors, plus offset of the log of whatever the exposure variable is. That will be important, and we will go through examples to help. Just one last thing to mention: in this ellipsis here, you'll have to specify the family. In binomial regression, we set family equal to binomial, and, of course, here we will set family equal to poisson.
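To see what the offset does to a prediction, here is a small sketch in plain Python rather than R's glm, meant only to illustrate the arithmetic. The coefficients, the single age predictor, and the function name `expected_count` are all hypothetical; nothing here is fitted to data.

```python
import math

# Hypothetical coefficients for a log-rate model (admissions per year):
# an intercept, then a slope for age measured in decades.
beta = [-1.2, 0.25]

def expected_count(x, exposure, beta):
    """mu = exposure * exp(eta), i.e. log(mu) = log(exposure) + eta."""
    eta = beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))
    offset = math.log(exposure)  # known quantity, coefficient fixed at 1
    return math.exp(offset + eta)

x = [6.0]  # a 60-year-old; same covariates in both cases below
one_year = expected_count(x, exposure=1.0, beta=beta)
one_month = expected_count(x, exposure=1.0 / 12.0, beta=beta)
print(f"expected admissions over one year:  {one_year:.3f}")
print(f"expected admissions over one month: {one_month:.3f}")
```

Two people with identical covariates get the same rate, but the person observed for a year has twelve times the expected count of the person observed for a month, which is the information the offset carries into the model.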