
We've now reached the end of week 5 of this introductory MOOC looking at

probability and statistics.

So what are the key takeaways from week five?

So here we've moved into the realm of decision theory,

specifically hypothesis testing.

So we began with a simple legal analogy, which hopefully everyone can

easily relate to, whereby a jury has to decide on a defendant's guilt or

innocence based on the evidence i.e., the data provided to them in the courtroom.

So of course the jury was making a binary decision, finding the defendant guilty or

not guilty, just as in statistical hypothesis testing

we are deciding whether or not to reject some null hypothesis.

But we also noted that juries don't always get it right, and

sometimes they make mistakes, not ideal, but it's not an ideal world, and

we introduced the concepts of Type I errors and Type II errors.

Effectively, we could think of these as false positives and

false negatives, respectively.
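As an aside (my own illustrative sketch, not from the lecture itself), the four possible outcomes can be written as a small lookup table, taking the null hypothesis in the jury analogy to be "the defendant is innocent":

```python
# The four outcomes of a hypothesis test, mirroring the jury analogy
# (H0: the defendant is innocent). Keys are (truth, decision) pairs.
outcomes = {
    ("H0 true",  "reject H0"):        "Type I error (false positive)",
    ("H0 true",  "do not reject H0"): "correct decision",
    ("H0 false", "reject H0"):        "correct decision",
    ("H0 false", "do not reject H0"): "Type II error (false negative)",
}
```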

Now, as to which of these is worse, which is more problematic,
we are moving into subjective terrain; however, on balance,
we tend to consider Type I errors to be more problematic than Type II errors.

So when we design a hypothesis test we seek to control
the probability of committing a Type I error, and we saw that we achieved this

using our significance level denoted generically by alpha.
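To make this concrete, here is an illustrative simulation (my own sketch, not from the lecture): if we generate many samples for which the null hypothesis really is true and apply a simple two-sided z-test at alpha = 0.05, we should wrongly reject, i.e., commit a Type I error, in roughly 5% of cases.

```python
import math
import random

random.seed(42)  # reproducibility for this illustration

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

alpha = 0.05               # significance level = target Type I error rate
mu0, sigma, n = 0.0, 1.0, 30
trials = 20000
rejections = 0

for _ in range(trials):
    # H0 is true by construction: data really come from mean mu0.
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    p = 2.0 * (1.0 - normal_cdf(abs(z)))
    if p < alpha:
        rejections += 1    # each rejection here is a Type I error

print(rejections / trials)  # close to alpha = 0.05
```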

Now what is an appropriate significance level to choose?

Well remember, we were looking for

the statistical equivalent of being beyond a reasonable doubt.

Now again, very much a subjective decision here, but nonetheless convention

tends to suggest a 5% significance level is often quite appropriate.

So having considered those main mechanics of hypothesis testing, and

the possible errors which could occur, we then went into our "to p, or
not to p" discussion, i.e., the introduction of the p-value.

So this is an instrument which allows us to very easily make the binary

decision of whether or not to reject some null hypothesis,

whereby a p-value is simply a probability.

So a value between zero and one, and

we compare this to our chosen significance level,

which is also a probability i.e., the probability of committing a Type I error,

and we introduced that very simple decision rule, that should the p-value be below

the significance level we would deem this to reflect statistically significant

evidence and hence we would be justified in rejecting our null hypothesis.
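As an illustrative sketch of that decision rule (assuming, purely for demonstration, a one-sample two-sided z-test with a known population standard deviation; the data and parameters are made up):

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value for H0: population mean equals mu0,
    assuming a known population standard deviation sigma."""
    n = len(sample)
    mean = sum(sample) / n
    z = (mean - mu0) / (sigma / math.sqrt(n))
    return 2.0 * (1.0 - normal_cdf(abs(z)))

alpha = 0.05                                          # chosen significance level
sample = [5.1, 4.9, 5.3, 5.2, 4.8, 5.4, 5.0, 5.2]    # hypothetical data
p = z_test_p_value(sample, mu0=5.0, sigma=0.2)

# The decision rule: reject H0 if and only if the p-value falls below alpha.
decision = "reject H0" if p < alpha else "do not reject H0"
```

Here the p-value comes out above 0.05, so under the rule we would not reject the null hypothesis; with more extreme data the same comparison would lead to rejection.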

But be warned, of course: rejection of the null hypothesis
either leads to a correct decision or, alternatively, to a Type I error.

Unfortunately, we won't know which of those two events has occurred.