0:13

Here glm operates just like the lm function: the outcome is on the left-hand side of a tilde and the predictors are on the right-hand side.

In this case the only difference is that now we have to specify family = binomial, and that's telling the glm function that we have 0/1 data.

If we had bounded count data, that is, true binomial data, then we would also have to give it a sample size.

Also, by default in the binomial and binary case, it's going to assume that the link function is the logit link, which is what we want, so that's fine.
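As a minimal sketch of the call being described, using simulated 0/1 data since the actual Ravens data frame isn't shown here (all variable names below are made up for illustration):

```r
# Simulated stand-in for the Ravens example: a binary win indicator
# modeled against a score covariate (names are hypothetical).
set.seed(42)
score <- rpois(100, lambda = 23)                              # points scored
win   <- rbinom(100, size = 1, prob = plogis(-1.7 + 0.1 * score))

# family = binomial tells glm we have 0/1 data;
# the logit link is the default for this family
fit <- glm(win ~ score, family = binomial)
summary(fit)   # coefficient table on the logit scale
```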

When you do summary, it works just like it did for lm; you see the output below.

You get the coefficients: in this case -1.68 for the intercept, and about 0.1 for the Ravens score coefficient.

On the logit scale, for the Ravens score variable, you want to see whether or not the coefficient is close to zero.

On the exponentiated scale, you want to see whether or not it's close to one.

It also gives us the standard error, a Z value, and a p-value, and we're going to interpret those just like we would in our linear model, acknowledging that they are derived in a different way.

1:34

So here's the fitted curve: these are the predicted responses put back on the probability scale.

What R did is plug in the x values associated with the coefficients.

It took the scores, multiplied them by the score coefficient, added the estimated intercept, and then took e to that value over 1 plus e to that value.

And that gives us the probabilities.
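That computation is just the inverse logit applied to the linear predictor, and it can be checked by hand against what glm returns. A sketch, again on simulated stand-in data (names are made up):

```r
# Inverse logit by hand versus glm's fitted values,
# on simulated 0/1 data (variable names are hypothetical).
set.seed(1)
score <- rpois(50, lambda = 23)
win   <- rbinom(50, size = 1, prob = plogis(-1.7 + 0.1 * score))
fit   <- glm(win ~ score, family = binomial)

eta  <- coef(fit)[1] + coef(fit)[2] * score   # linear predictor, logit scale
prob <- exp(eta) / (1 + exp(eta))             # e^eta / (1 + e^eta)

# These agree with the probabilities glm computes
all.equal(unname(prob), unname(as.vector(fitted(fit))))
```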

Now, this is only part of the fitted S curve, because the range where the data are actually observed is restricted.

The full S curve would go up toward one over here and come back down toward zero, so you'll notice that the displayed curve ends at about 0.4, not zero.

That's the actual curve being fit, though in the fitted values we're only showing part of it.

Â 2:37

If we were to exponentiate the coefficients, we get 0.1864 for the intercept and 1.1125 for the score.

So that suggests an 11% increase in the odds of winning for every additional point that the Ravens score.

Okay, that's how you would interpret this logistic regression coefficient.

You can get confidence intervals for these two coefficients very easily with the confint function.

So on the output from the fitted model, we call confint, and then, because most people, myself included, prefer to look at these things on the exponentiated scale, you just exponentiate the two endpoints with the exp function.

What we see now is that our interval does contain one: it goes from 0.99 to 1.3.

So even though we know for sure that scoring points is what causes the Ravens to win the game, this coefficient turns out not to be significant, okay.
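A sketch of both steps, exponentiating the coefficients and the profile-likelihood interval endpoints, on simulated stand-in data (names are hypothetical):

```r
# Exponentiating coefficients and confidence intervals puts them
# on the odds-ratio scale (simulated data; names are made up).
set.seed(7)
score <- rpois(80, lambda = 23)
win   <- rbinom(80, size = 1, prob = plogis(-1.7 + 0.1 * score))
fit   <- glm(win ~ score, family = binomial)

exp(coef(fit))     # exponentiated slope near 1.11 would mean ~11% higher odds per point
exp(confint(fit))  # if the slope's interval contains 1, the effect is not significant
```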

The anova function works just like it does with lm: you put in the output of the fitted model, and you can put multiple models in there, for example nested models, and anova will give you a series of sequential tests.

And so here, in this case, the variable is the score, just the one variable that we're interested in, so it just has a one degree of freedom test.

This isn't that useful in this particular example, but if you had several models that you were interested in, you'd put them into the anova function.

It is especially useful if you have a factor variable, because sometimes you want to compare the model with the factor variable included against the model with all levels of the factor variable removed.

That is, you might want to test whether or not all of the levels, taken together, are necessary in the model.

Notice that when you do summary and just get the coefficient table out from R, it tests each level of the factor separately and doesn't test them all as a whole.

So something like the anova function is useful for testing a factor in and out of the model.
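A sketch of that kind of whole-factor test, using a made-up three-level factor on simulated data:

```r
# Testing all levels of a factor at once by comparing nested models
# (simulated data; the three-level factor is made up for illustration).
set.seed(3)
grp <- factor(sample(c("a", "b", "c"), 150, replace = TRUE))
x   <- rnorm(150)
y   <- rbinom(150, 1, plogis(0.5 * x + c(0, 1, -1)[as.integer(grp)]))

full    <- glm(y ~ x + grp, family = binomial)
reduced <- glm(y ~ x,       family = binomial)

summary(full)                         # tests each grp level separately
anova(reduced, full, test = "Chisq")  # one 2-df test for the factor as a whole
```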

5:06

So for interpreting our odds ratios, remember that odds ratios are not probabilities; they're functions of probabilities, but they're not probabilities.

An odds ratio of one means no difference, okay?

On the logit scale, a log odds ratio of zero means no difference.

In terms of odds, an odds ratio below 0.5 or above two is what I would consider a fairly strong effect.

But it's very context dependent: if you're working in a field like epidemiology, you often get very small odds ratios, and 1.01 might be significant.

You're looking at giant studies, and the reason these small odds ratios are important is that they're studying rather noisy things, like how nutrition impacts health.

Because of all the various factors that influence such studies, you end up with very small odds ratios that are still meaningful, even though they are small.

On the other hand, you might run a very tightly controlled experimental clinical trial, or something like that, and then the odds ratios you would want in order to declare something meaningful would have to be much larger.

So again, less than 0.5 or bigger than two is a bit of a benchmark, but remember, like all benchmarks, it only has a certain amount of utility.

Really, how strong an odds ratio is relative to the scientific setting is incredibly dependent on the context that you're looking at.

So the relative risk is another entity that's often thought of in the same vein as the odds ratio, and many people like it.

The relative risk is just the ratio of the two probabilities.

Many people like it because they tend to think a little more instinctively in terms of probabilities rather than odds, and a relative probability seems like a reasonable thing to work with.

The problem with the relative risk, unlike the odds ratio, is that it puts in some model constraints that are quite hard to work with, so relative risk regression for binary variables is not a very common thing to do.

There is some software for it; I know R has some packages to do it, and SAS has some packages to do it.
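The distinction between the two quantities can be made concrete with a pair of illustrative probabilities:

```r
# Odds ratio versus relative risk for two groups (illustrative numbers).
p1 <- 0.40   # probability of the outcome in group 1
p2 <- 0.20   # probability of the outcome in group 2

rr <- p1 / p2                              # relative risk: ratio of probabilities
or <- (p1 / (1 - p1)) / (p2 / (1 - p2))    # odds ratio: ratio of odds

rr   # 2
or   # about 2.67: the OR exceeds the RR when the outcome is common
```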

7:55

So I give you a couple more references here on odds ratios, but I think that's enough to get you started on generalized linear models for binomial data.

I'm hoping that you can take all of the knowledge you got from linear models and work it into your use of generalized linear models.

Next lecture, we're going to consider Poisson data.

That's sort of the last real lecture of the series; then we have some bonus fun content in the very last lecture.

So I look forward to seeing you next week, when we're going to talk about Poisson random variables.
