Okay, so here is our fitted Poisson regression model overlayed onto our data.

And you can see it actually fits pretty closely to the linear model,

though it has some nice curvature to it, which is what we wanted.

We could have accomplished that in our linear model by adding a squared term,

of course, but it's nice to note that a simpler model

with fewer coefficients seems to fit the data better.

A concern, often, is that the variance has to equal the mean.

So the variance, in this case, needs to go up as the mean goes up.

But here if we plot the fitted values versus the residuals,

it's very clear that the variance is higher for lower mean values.

That's the problem, okay.

So we need at least some way to account for

the fact that the variance is not necessarily constant.

There's a lot of ways to do that, and if you read in the book,

one thing that we talk about are the quasi-Poisson models.

This model would look at the variance being a constant multiple of the mean,

rather than being equal to the mean.

But, in this case, that doesn't appear to be the case in this case because it looks

like we have this issue where there's larger variance for

lower fitted values, when the Poisson model assumes the opposite.

So Jeff actually had this code from this model, the sandwich,

which seems like a funny name for a package.

But it comes from the sandwich variance estimator,

made famous by generalized estimating equations, which by the way was

a technique that was invented here at Johns Hopkins Biostatistics by two very

well-known professors here, Scott Zeger and Kung-Yee Liang.

At any rate, Jeff did some code here for getting model agnostic standard errors.

And if you read in the book, there's a little bit more discussion about this.

This is kind of a more advanced topic than we would like to delve into in this class.

However, it's a very important applied topic, as well.

It's not just a theoretical exercise.

So, the main point is to do some residual plots, to understand whether or

not you think your model's assumptions hold.

Try quasi-Poisson model because that's a very easy thing to do in R,

if you think at least it holds at some level, but maybe not just in the sense

of the variance being a constant multiple of the mean.

But if it really fails, like in this case,

then you have to dig in to some other solutions.