0:00

In this video, we will virtually play a game to introduce a Bayesian approach to inference. Throughout the video, we will be making use of Bayes' theorem, properties of conditional probabilities, as well as probability trees.

0:13

So here's the setup. I have a die in each hand. One of them is a six-sided die, which looks something like this. And the other one is a 12-sided die, which looks something like this. The ultimate goal of the game is to guess which hand is holding which die, but this is more than just a guessing game. Before you make a final decision, you will be able to collect data by asking me to roll the die in one hand, and I'll tell you whether the outcome of the roll is greater than or equal to 4.

Before we delve further into the rules of the game, let's pause for a moment and think about what it means to roll a number greater than or equal to 4 with the two types of dice we have. We're going to ask two questions. What is the probability of rolling a value greater than or equal to 4 with a six-sided die? And what is that probability with a 12-sided die?

With a six-sided die, the sample space is made up of the numbers 1 through 6. We're interested in an outcome greater than or equal to 4, and the probability of getting such an outcome is 3 out of 6, or 1 out of 2, or 50%. With a 12-sided die, the sample space is bigger, the numbers 1 through 12. Once again, we're interested in outcomes of 4 or greater. The probability of getting such an outcome is 9 out of 12, or 3/4, or 75%.
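As a quick check, these two counting arguments can be sketched in a few lines of Python; the helper name `prob_at_least` is just illustrative, not something from the video.

```python
# Probability of rolling >= 4 with a fair die, by counting the
# favorable faces in the sample space.
from fractions import Fraction

def prob_at_least(k, sides):
    """P(roll >= k) for a fair die with faces 1..sides."""
    favorable = sum(1 for face in range(1, sides + 1) if face >= k)
    return Fraction(favorable, sides)

print(prob_at_least(4, 6))   # 1/2, i.e. 50% with the six-sided die
print(prob_at_least(4, 12))  # 3/4, i.e. 75% with the 12-sided die
```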

1:39

Say you're playing a game where the goal is to roll a number greater than or equal to 4, like the one we're playing right now. If you could take your pick, which die would you prefer to play this game with: the six-sided or the 12-sided die?

1:55

Hopefully, your answer is the 12-sided die. Remember, we already worked through the probabilities here. The probability of rolling a number greater than or equal to 4 is much higher, 75%, compared to the 50% with the six-sided die. So we're going to call this die the good die. This is the ultimate goal: you're going to try to figure out which hand is holding the good die, or in other words, the 12-sided die.

2:23

Here are the rules. Remember, I have two dice, one six-sided, the other 12-sided. I keep one die in the left hand and the other in the right, but I won't tell you which die I'm holding in which hand. You pick a hand, left or right. I roll it, and I tell you whether the outcome is greater than or equal to 4 or not. I won't tell you what the outcome actually is, since that could give away which die is in which hand. Think about it: if I tell you that you rolled an 11, you'd know that you had picked the hand holding the 12-sided die, since it's impossible to roll an 11 with a six-sided die.

Then, based on that piece of information, you make a decision as to which hand holds the good die, the 12-sided die. You could also choose to try again, in other words, collect more data. You could ask me to roll again, and I would tell you one more time whether the hand you picked resulted in a roll that's greater than or equal to 4 or not. But each round costs you money, so you don't want to keep trying too many times. You want to make a call.

This is obviously just a game, and we're making up some rules to make a point. But if you think about data collection, it's always costly. And while we love large sample sizes, it takes a huge amount of resources to obtain such samples. So the rules we're imposing aren't haphazardly made up; they reflect some reality about conducting scientific studies.

3:56

There are two possibilities for the truth: either the good die is in the right hand, or the good die is in the left hand. If you guessed that the right hand is holding the good die, and the good die is indeed on the right, then you win the game. However, if the good die is on the left but you picked right, you lose the game.

4:23

To avoid losing the game, you might want to collect as much data as possible, but remember, we said that's costly. So at some point, before you're entirely sure, you'll have to just go ahead and make a guess.

4:37

If there are no consequences to losing the game, like in this scenario, you might not care much whether you win or lose. But say you had lots of money riding on it; then you might be conservative about calling the game too early.

4:51

Here we're basically talking about balancing the cost associated with making the wrong decision and losing the game against the certainty that comes with additional data collection.

5:05

Before we collect any data, you have no idea whether I'm holding the good die, the 12-sided die, in the right hand or the left hand. So what are the probabilities associated with the following hypotheses? The first hypothesis is that the good die is on the right, and the second hypothesis is that the good die is on the left.

5:28

Chances are you answered a 50% chance that the good die is on the right, and a 50% chance that the good die is on the left. These are your prior probabilities for the two competing claims, the two competing hypotheses. That is, these probabilities represent what you believe before seeing any data. You could have conceivably made up these probabilities, but instead, you have chosen to make an educated guess.

What would be a situation where you might not pick this answer but pick something else? Say you know that I tend to favor my left. If you knew this about me, then you might put a higher probability on me holding the good die in my left hand. But if you don't have any additional information like that about me, 50-50 is going to be your best bet.

6:17

Now that we have sufficient background information on the game, we can finally play. Say you pick the right hand for the first round. I roll the die in that hand, and voila, you roll a number greater than or equal to 4. Remember, I won't tell you which die is in the right hand, and I won't tell you what the outcome is, but at least I'm telling you that you rolled a high number.

6:50

Having observed this data point, how, if at all, do the probabilities you assign to the same set of hypotheses change? The first hypothesis was that the good die is on the right, and the second was that the good die is on the left. The calculation of the exact probability will take a few steps, and we're going to get to that in a minute. But first, let's think about whether the new probability for H1, the first hypothesis, should still be 0.5, less than 0.5, or more than 0.5.

Hopefully, your answer is more than 0.5. The probability of the good die being on the right should now be slightly more than 0.5, because we just rolled the die in that hand and got a high-valued outcome, and we know that this is more likely to happen with the 12-sided die. So the probability that the right hand is holding the 12-sided die should be a little higher than what we had initially assigned.

7:49

Let's actually calculate that probability. We started with two hypotheses: the good die is on the right, or the bad die is on the right. And we said that initially we're going to give these equal chances, a 50% chance each of being true, before we actually get started with the data collection. Remember, these were our priors.

Then we think about the data collection stage. If it is true that the good die is on the right, the probability of rolling a number greater than or equal to 4 is going to be 75%, and the complement of that, rolling a number less than 4, is going to be 25%. If, on the other hand, I'm actually holding the bad die in the right hand, and you're picking the right hand, the probability of rolling a number greater than or equal to 4 is only 50%, and the complement, rolling a number less than 4, is also 50%.

Usually in probability trees, the next step is to calculate the joint probabilities, so we multiply across the branches. There's a 37.5% chance that the good die is on the right and you roll a number greater than or equal to 4. There's a 12.5% chance that the good die is on the right and you roll a number less than 4. There's a 25% chance that the bad die is on the right and you roll a number greater than or equal to 4. And there's a 25% chance that the bad die is on the right and you roll a number less than 4.

Remember, we did indeed roll a number greater than or equal to 4. So the two outcomes we're most interested in are the very top branch and the third branch: good die on the right and a roll greater than or equal to 4, or bad die on the right and a roll greater than or equal to 4.
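The multiplication across the branches can be mirrored in a short Python sketch; the dictionary labels here are just illustrative names for the branches.

```python
# Joint probabilities from the probability tree: prior for which die is
# in the right hand, times P(roll >= 4 | that die) or its complement.
priors = {"good on right": 0.5, "bad on right": 0.5}
p_high = {"good on right": 0.75, "bad on right": 0.5}  # P(roll >= 4 | die)

joint = {}
for hyp, prior in priors.items():
    joint[(hyp, "roll >= 4")] = prior * p_high[hyp]
    joint[(hyp, "roll < 4")] = prior * (1 - p_high[hyp])

for branch, p in joint.items():
    print(branch, p)
# ('good on right', 'roll >= 4') 0.375
# ('good on right', 'roll < 4') 0.125
# ('bad on right', 'roll >= 4') 0.25
# ('bad on right', 'roll < 4') 0.25
```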

9:43

We asked earlier how the probability of the first hypothesis being true changes. Well, that probability can formally be written as the probability that the good die is on the right, given that you rolled a number greater than or equal to 4 with the die on the right.

10:04

If we want to find this probability, a conditional probability, we can make use of Bayes' theorem, which basically says: if you're looking for the probability of A given B, take the joint probability of A and B, divided by the marginal probability of B. So in the numerator, we have the probability that the good die is on the right and you roll a number greater than or equal to 4, divided by simply the probability of rolling a number greater than or equal to 4 with the die in the right hand. The joint probability is the probability we're grabbing from the first branch, the 37.5%. And the marginal probability of rolling a number greater than or equal to 4 is simply the 0.375 plus the 0.25. You may be rolling a number greater than or equal to 4 with the die in the right hand

10:57

because it was the good die, or because it was the bad die. And because we're saying or, for these two disjoint outcomes, we add the two probabilities. The result comes out to be 60%.

Earlier, we had guessed that the probability of the hypothesis being true should increase from 50%. And in fact, with the one data point we have observed, we can indeed see an increase, up to 60%.
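Here is that same Bayes' theorem arithmetic as a sketch in Python:

```python
# Bayes' theorem for this game:
# P(good on right | roll >= 4) = P(good on right and roll >= 4) / P(roll >= 4)
prior_good = 0.5            # prior: good die is on the right
p_high_given_good = 0.75    # 12-sided die
p_high_given_bad = 0.5      # six-sided die

joint_good_high = prior_good * p_high_given_good                       # 0.375
marginal_high = joint_good_high + (1 - prior_good) * p_high_given_bad  # 0.625

posterior_good = joint_good_high / marginal_high
print(posterior_good)  # 0.6
```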

11:26

The probability we just calculated is also called the posterior probability. It's the probability that the good die is on the right, given that you rolled a number greater than or equal to 4 with the die on the right. The posterior probability is generally defined as the probability of the hypothesis given the data; in other words, it's the probability of the hypothesis we set forth, given the data we just observed. It depends on both the prior probability we set and the observed data.

This is different from what we calculated at the end of the randomization test on gender discrimination: the probability of observed or more extreme data given that the null hypothesis is true, in other words, the probability of the data given the hypothesis, which we had called a p-value. We'll see a lot more of those throughout the rest of the course, but this time, we're making our decision based on what we call the posterior probability, as opposed to the p-value.

In the Bayesian approach, we evaluate claims iteratively as we collect more data. In the next iteration, the next roll, if we were to play this game one more time and you asked me to roll the die in either the right or the left hand again, we would calculate the posterior one more time, and we would get to take advantage of what we learned from the data. In other words, we update our prior with our posterior probability from the previous iteration.

12:55

So in the next iteration, our updated prior for the first hypothesis being true is going to be the 60%, the posterior from the previous iteration. And the complement of that, 40%, is going to be the probability of the competing hypothesis.
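That updating rule can be wrapped in a small function; `update` is a hypothetical helper name, and the sketch assumes we keep rolling the same (right) hand each time.

```python
# One Bayesian updating step: yesterday's posterior is today's prior.
def update(prior_good, rolled_high):
    """P(good die on right) after observing one roll of the right hand."""
    like_good = 0.75 if rolled_high else 0.25  # 12-sided die
    like_bad = 0.5                             # six-sided die, either outcome
    numerator = prior_good * like_good
    return numerator / (numerator + (1 - prior_good) * like_bad)

p = 0.5               # prior before any data
p = update(p, True)   # first roll was >= 4
print(p)              # 0.6, the posterior we calculated above

p = update(p, True)   # a second high roll pushes it higher still
print(round(p, 3))    # about 0.692
```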

So to recap, the Bayesian approach allows us to take advantage of prior information, like a previously published study or a physical model. It also allows us to naturally integrate data as we collect it and update our priors. We also get to avoid the counterintuitive definition of a p-value, the probability of an observed or more extreme outcome given that the null hypothesis is true. Instead, we can base our decisions on the posterior probability, the probability that the hypothesis is true, given the observed data.

13:46

A good prior helps, but a bad prior hurts. Remember that when we set our priors, the 50-50 chance of the two hypotheses being true, we said that we were taking an educated guess; we don't want to just make up our prior probabilities. But the prior matters less the more data you have. So even if you didn't have a great prior to begin with, as you collect more data, you're going to be able to converge to the right probabilities.
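As a sanity check on that claim, here is a small simulation sketch. It assumes the 12-sided die really is in the chosen hand, reuses the same one-step update formula from the probability tree, and starts from a deliberately terrible prior.

```python
import random

# Even a deliberately bad prior gets washed out by enough data.
def update(prior_good, rolled_high):
    like_good = 0.75 if rolled_high else 0.25  # 12-sided die
    like_bad = 0.5                             # six-sided die
    num = prior_good * like_good
    return num / (num + (1 - prior_good) * like_bad)

random.seed(1)  # fixed seed so the run is reproducible
p = 0.05        # bad prior: almost sure the good die is NOT in this hand
for _ in range(1000):
    roll = random.randint(1, 12)  # the hand really holds the 12-sided die
    p = update(p, roll >= 4)
print(p > 0.99)  # True: the posterior ends up very close to 1
```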

In this course, the Bayesian inference examples we're going to give will be much simpler. However, they will provide a solid framework, should you decide to continue your studies in statistics and work with more advanced Bayesian models.