0:00

In this video, we will virtually play a

game to introduce a Bayesian approach to inference.

Throughout the video, we will be making use of Bayes'

theorem, properties of conditional probabilities,

as well as probability trees.

0:13

So here's the setup. I have a die in each hand.

One of them is a six-sided die, looks something like this.

And the other one is a 12-sided die, looks something like this.

The ultimate goal of the game is to guess which

hand is holding which die, but this is more than just a guessing game.

Before you make a final decision, you will be able

to collect data by asking me to roll the die

in one hand, and I'll tell you whether the outcome

of the roll is greater than or equal to 4.

Before we delve further into the rules of the game, let's pause for a moment

and think about what it means to roll a number greater than or equal to 4

with the two types of dice we have. We're going to ask two questions.

What is the probability of rolling a value greater

than or equal to 4 with a six-sided die?

And what is that probability with a 12-sided die?

With a six-sided die, the sample space is made up of numbers between 1 and 6.

We're interested in an outcome greater than or equal to 4, the probability

of getting such an outcome is then 3 out of 6, or 1 out of 2, or 50%.

With a 12-sided die, the sample space is bigger: numbers between 1 and 12.

And once again, we're interested in outcomes 4 or greater.

The probability of getting such an outcome is 9 out of 12, or 3/4, or 75%.
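These two counting arguments are easy to check with a short Python sketch (the function name here is just for illustration):

```python
def prob_at_least(threshold, sides):
    """Probability of rolling >= threshold with a fair die with `sides` sides."""
    favorable = sides - threshold + 1  # outcomes threshold, threshold+1, ..., sides
    return favorable / sides

print(prob_at_least(4, 6))   # 3 out of 6 -> 0.5
print(prob_at_least(4, 12))  # 9 out of 12 -> 0.75
```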

1:39

Say you're playing a game where the goal is to roll a number

greater than or equal to 4, like the one we're playing right now.

If you could get your pick, which die would you prefer to play this game with?

The six-sided or the 12-sided die?

1:55

Hopefully, your answer is the 12-sided die.

Remember, we already worked through the probabilities here.

The probability of rolling a number greater than or equal

to 4 is much higher, 75%, compared to the 50% with the six-sided die.

So what we're going to call this die is the good die.

This is the ultimate goal.

You're going to try to figure out which hand is

holding the good die or in other words, the 12-sided die.

2:23

Here are the rules. Remember, I have two dice.

One six-sided, the other 12-sided. I keep one die in

the left hand and the other in the right.

But I won't tell you which die I'm holding in which hand.

You pick a hand, left or right. I roll it.

And I tell you if the outcome is greater than or equal to 4 or not.

I won't tell you what the outcome actually is, since

that could actually give away which die is in which hand.

Think about it.

If I tell you that you rolled an 11, you'd know that

you had picked the hand holding the 12-sided die, since it's impossible

to roll an 11 with a six-sided die. Then, based on that piece of information,

you make a decision as to which hand holds the good die, that is, the 12-sided die.

You could also choose to try again, in other words, collect more data.

You could ask me to roll again, and I could tell you one more time if

the hand you picked resulted in a roll that's greater than or equal to 4 or not.

But each round costs you money.

So you don't want to keep trying too many times.

You want to make a call.

This is obviously just a game.

And we're kind of making up some rules to make a point.

But if you think about data collection, it's always costly.

And while we love large sample sizes, it takes

a huge amount of resources to obtain such samples.

So the rules we're imposing aren't haphazardly made up,

and they reflect some reality about conducting scientific studies.

3:56

There are two possibilities for the truth.

Either the good die is in the right hand or the good die is in the left hand.

If you guessed that the right hand is holding the good die, and

the good die is indeed on the right, then you win the game.

However, if the good die is on the left but you picked right, you lose the game.

4:23

To avoid losing the game, you might want to collect

as much data as possible, but remember, we said that's costly.

So at some point before you're entirely sure, you'll

have to just go ahead and make a guess.

4:37

If there are no consequences to losing the game, like in

this scenario, you might not care much whether you win or lose.

But say you had lots of money riding on it,

then you might be conservative about calling the game too early.

4:51

Here we're basically talking about balancing

the cost associated with making the wrong

decision and losing the game, against the

certainty that comes with additional data collection.

5:05

Before we collect any data,

you have no idea if I'm holding the good die,

the 12-sided die, on the right hand or the left hand.

Then what are the probabilities associated with the following hypotheses?

The first hypothesis is that the good die is on the right,

and the second hypothesis is that the good die is on the left.

5:28

Chances are, you answered a 50% chance that

the good die is on the right, and 50% chance that the good die is on the left.

These are your prior probabilities of the

two competing claims, the two competing hypotheses.

That is, these probabilities represent what you believe before seeing any data.

You could have conceivably made up these probabilities, but

instead, you have chosen to make an educated guess.

What would be a situation

where you might not pick this answer but pick something else?

Say, you know that I tend to favor my left.

If you knew this about me then you might put

a higher probability of me holding the good die with my

left hand, but if you don't have any additional information

like that about me, 50-50 is going to be your best bet.

6:17

Now that we have sufficient background information

on the game, we can finally play.

Say you pick the right hand for the first round.

I roll the die in that hand, and voila,

you roll a number greater than or equal to 4.

Remember, I won't tell you which die is on the right hand and I won't tell

you what the outcome is, but at least I'm telling you that you rolled a high number.

6:50

Having observed this data point, how, if at all, do

the probabilities you assign to the same set of hypotheses change?

The first hypothesis was that the good die is on the right,

and the second was that the good die is on the left.

The calculation

of the specific probability will take a few steps,

and we're going to get to that in a minute.

But first, let's try to think about whether the new probability for H1, the first

hypothesis, should still be 0.5, less than 0.5, or more than 0.5.

Hopefully, your answer is more than 0.5.

The probability of the good die being on the right should now be slightly

more than 0.5.

Because we just rolled the die in that hand and got a high valued outcome.

We know that this is more likely to happen with the 12-sided die.

So the probability that the right hand is holding the 12-sided

die should be a little higher than what we had initially assigned.

7:49

Let's actually calculate that probability.

We started with two hypotheses: the good die is on the right, or the bad die

is on the right.

And we said that initially, we're going to give these equal chances of

50% chance of being true before we

actually get started with the data collection.

Remember these were our priors.

Then, we think about the data collection stage.

If it is true that the good die is on the right, the probability

of rolling a number greater than or equal to 4 is going to be 75%.

And the

complement of that, rolling a number less than 4, is going to be 25%.

If, on the other hand, the bad die is on the right and you pick the right hand,

the probability of rolling a number greater than or equal to 4 is

only 50%, and the complement, rolling a number less than 4, is also 50%.

Usually in probability trees, the next

step is to calculate the joint probabilities.

So we multiply across the branches: there's a 37.5% chance that

the good die is on the right and you roll a number greater than or equal to 4.

There is a 12.5% chance that the good die is

on the right and you roll a number less than 4.

There is a 25% chance that the bad die is on the right and you roll a number greater

than or equal to 4.

And there's a 25% chance that the bad die is

on the right and you roll a number less than 4.

Remember, we did indeed roll a number greater than or equal to 4.

So these are the two outcomes that we're

most interested in, the top branch and the

third branch: good die on the right and a roll greater than or equal to 4,

or bad die on the right and a roll greater than or equal to 4.
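As a sketch, the four joint probabilities from this tree can be reproduced by multiplying across the branches (the variable names are ours, not part of the game):

```python
prior_good = 0.5            # prior: 12-sided (good) die on the right
prior_bad = 0.5             # prior: 6-sided (bad) die on the right

p_high_given_good = 9 / 12  # P(roll >= 4 | good die) = 0.75
p_high_given_bad = 3 / 6    # P(roll >= 4 | bad die)  = 0.50

# Multiply across the branches of the probability tree
joint_good_high = prior_good * p_high_given_good        # 0.375
joint_good_low = prior_good * (1 - p_high_given_good)   # 0.125
joint_bad_high = prior_bad * p_high_given_bad           # 0.25
joint_bad_low = prior_bad * (1 - p_high_given_bad)      # 0.25
```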

9:43

Earlier, we asked you to think about how

the probability changes for

the first hypothesis being true.

Well, that probability could formally be written as probability that the good die

is on the right, given you rolled a number greater than or equal to 4 with the die

on the right.

10:04

Since the probability we want to find is a

conditional probability, we can make use of Bayes' theorem,

which says that the probability of A given B equals the

joint probability of A and B divided by the marginal probability of B.

So in the numerator, we have good die is on the right and you

roll a number greater than or equal to 4, divided by simply the probability

of rolling a number greater than or equal to 4

with the die on the right hand.

The joint probability is a probability that we're grabbing from the first branch,

the 37.5%, and the marginal probability of rolling a number greater than or equal to

4 is going to be simply the 0.375 plus the 0.25.

You may be rolling a number greater than or

equal to 4 with the die on the right hand.

10:57

Because it was the good die, or because it was the bad die.

And because we're saying or, for these

two disjoint outcomes, we got the two probabilities.

The result comes out to be 60%.

Earlier we had guessed that the probability of

the hypothesis being true should increase from 50%.

And,

in fact, now with the one data point we have

observed, we can indeed see an increase up to 60%.
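The whole calculation fits in a few lines; as a sketch:

```python
# Bayes' theorem: P(good die on right | roll >= 4)
joint_good_high = 0.5 * 0.75               # prior x likelihood = 0.375
joint_bad_high = 0.5 * 0.50                # prior x likelihood = 0.25
p_high = joint_good_high + joint_bad_high  # marginal probability = 0.625
posterior = joint_good_high / p_high
print(posterior)  # 0.6
```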

11:26

The probability we just calculated is also called the posterior probability.

It's the probability that the good die is on the right, given that

you rolled a number greater than or equal to 4 with the die on the right.

The posterior probability is generally defined as the

probability of the hypothesis given the data, that is, the probability of the

hypothesis we set forth, given the data we just observed.

It depends on both the prior probability we set and the observed data.

This is different from what we calculated at

the end of the randomization test on gender discrimination:

the probability of observed or more extreme

data given the null hypothesis is true, in

other words, the probability of the data given

the hypothesis, which we had called the p-value.

We'll see a lot more of those throughout the rest of the course,

but this time, we're making our decision based on what

we call the posterior probability as opposed to the p-value.

In the Bayesian approach, we evaluate

claims iteratively as we collect more data.

In the next iteration, the next roll, if we were to

play this game one more time, you could ask me

to roll the die in either the right or the left

hand again, and we would calculate the posterior

one more time.

In doing so, we get to take advantage of what we learned from the data.

In other words, we update our prior

with our posterior probability from the previous iteration.

12:55

So, in the next iteration, our updated prior for the first hypothesis

being true is going to be the 60%, the posterior from the previous iteration.

And the complement of that, 40%, is going

to be the probability of the competing hypothesis.
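This update-and-repeat loop can be sketched as a small function (the names are ours; the key step is feeding each posterior back in as the next prior):

```python
def update(prior_good, rolled_high):
    """One Bayesian update after observing whether the roll was >= 4."""
    like_good = 0.75 if rolled_high else 0.25  # 12-sided die
    like_bad = 0.50                            # 6-sided die, either outcome
    joint_good = prior_good * like_good
    joint_bad = (1 - prior_good) * like_bad
    return joint_good / (joint_good + joint_bad)

p = 0.5              # initial 50-50 prior
p = update(p, True)  # first roll >= 4: posterior is 0.6
p = update(p, True)  # that 0.6 is now the prior; posterior rises further
print(round(p, 3))   # 0.692
```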

So to recap, the Bayesian approach allows us to take advantage

of prior information, like a previous published study or a physical model.

It also allows us to naturally integrate data

as you collect it and update your priors.

We also get to avoid the counter-intuitive definition of a p-value, the

probability of observed or more extreme

outcome, given the null hypothesis is true.

And instead, we can base our decisions on the posterior probability,

the probability that the hypothesis is true, given the observed data.

13:46

A good prior helps, but a bad prior hurts.

Remember that when we set our priors, the

50-50 chance for the two hypotheses being true, we

said that we were taking an educated guess, so

we don't want to just make up our prior probabilities.

But, the prior matters less, the more data you have.

So even if you didn't have a great prior to

begin with, as you collect more data, you're going to be able

to converge to the right probabilities.

In this course, the Bayesian inference examples that

we're going to give will be much simpler.

However, they will provide a solid framework, should you decide to

continue your studies with statistics and work with more advanced Bayesian models.