0:10

Hi, in this video, we will discuss propensity scores and

also talk about the balancing property of propensity scores.

So the propensity score is simply the probability of receiving treatment,

given covariates.

So in particular, we are thinking about the probability of receiving

treatment as opposed to the control condition.

So, we'll define treatment as A=1.

And control, we'll define as A=0 as we've done in the past.

So then formally the propensity score,

which we'll denote by pi i, is just the probability that A=1 given Xi.

So here, pi i is our notation for

the propensity score for person i.

It's really a function of x.

So you can think of the propensity score as a function of X, but

we are indexing it by i, because person i has a unique set of covariates Xi.

So this is the probability of treatment,

given that person's particular set of covariates.

So, that's what we mean by a propensity score.
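
In symbols, the definition just given can be written as follows. The function name \pi(\cdot) is notation added here for convenience, to separate the score as a function of x from its value for person i:

```latex
\pi_i = P(A = 1 \mid X_i), \qquad \pi_i = \pi(X_i) \quad \text{where} \quad \pi(x) = P(A = 1 \mid X = x).
```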

1:46

To give an example, the probability of treatment among people who

are 60 years old would have to be greater than the probability

of treatment among people who are 30 years old, in this case.

Another way you could look at it is that pi i would have to be

greater than pi j if person i is older than person j.
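
To make the age comparison concrete, here is a minimal sketch, assuming age is the only covariate and that the propensity score follows a logistic curve that increases with age. The function name, the intercept and the slope are all invented for illustration:

```python
import math

def propensity(age, intercept=-3.0, slope=0.05):
    """Hypothetical propensity score P(A=1 | age) under a logistic form.

    The intercept and slope are made-up values, not estimates from data.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * age)))

pi_60 = propensity(60)  # person i, 60 years old
pi_30 = propensity(30)  # person j, 30 years old

# With a positive slope, the older person has the larger propensity score.
print(pi_60 > pi_30)
```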

2:32

Next, we're going to think about a balancing score property of

the propensity score.

So to motivate this, we'll think about two subjects that have the same value

of the propensity score, but they might have different covariate values X.

So remember, the propensity score's a function of X.

It's a function of the covariates.

But typically, there's not going to be a unique value

of the propensity score that's associated with just one set of Xs.

In other words, there could be multiple sets of Xs that lead to the same propensity

score, so the same probability of treatment.
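
As a small illustration of that point, the sketch below assumes a hypothetical propensity score that depends on two covariates only through their sum, so two different covariate sets give exactly the same score. The function and its coefficients are made up:

```python
import math

def propensity(x1, x2):
    """Hypothetical propensity score that depends on X only through x1 + x2."""
    return 1.0 / (1.0 + math.exp(-(-1.0 + 0.5 * (x1 + x2))))

score_a = propensity(2, 0)  # covariate set (2, 0)
score_b = propensity(0, 2)  # a different covariate set (0, 2)

# Different Xs, same probability of treatment.
print(score_a == score_b)
```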

3:57

So we could think of it like this: in this example, there are two sets of Xs,

two unique values of X that we're thinking about, and

we'd expect them to appear in the treatment group at about the same rate.

So what this means is that if we were to restrict to a subpopulation of people

that had the same value of the propensity score,

then we should have balance in the two treatment groups.

And so, this is what it would mean for the propensity score to be a balancing score.

4:25

So a balancing score is something where if you condition on it, you'll have balance.

So, the propensity score is an example of a balancing score.

So if we were to consider only people who have the same value of

the propensity score, if we restrict our analysis to that group of people,

and then stratify on actual treatment received, then we should see the same

distribution of covariates in those two treatment groups.
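
A quick simulation can illustrate this. It is only a sketch under made-up assumptions: three covariate patterns labelled "a", "b" and "c", with "a" and "b" sharing a propensity score of 0.3. Within that stratum, we compare how often pattern "a" shows up among the treated and among the controls:

```python
import random

random.seed(0)

# Hypothetical propensity scores: covariate patterns "a" and "b" share the
# value 0.3, while pattern "c" has 0.8. All numbers are made up.
propensity = {"a": 0.3, "b": 0.3, "c": 0.8}

people = []
for _ in range(100_000):
    x = random.choice(["a", "b", "b", "c"])   # "b" is twice as common as "a"
    a = random.random() < propensity[x]       # treated with probability pi(x)
    people.append((x, a))

def share_of_a(group):
    """Fraction of a group whose covariate pattern is 'a'."""
    return sum(1 for x, _ in group if x == "a") / len(group)

# Restrict to people whose propensity score is 0.3, then split by treatment.
stratum = [(x, a) for x, a in people if propensity[x] == 0.3]
treated = [person for person in stratum if person[1]]
control = [person for person in stratum if not person[1]]

print(round(share_of_a(treated), 3), round(share_of_a(control), 3))
```

Because "b" is drawn twice as often as "a", both printed shares should come out near one third, up to simulation noise.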

To make this a little more formal, we could state it as follows.

So here, we have the probability of X,

so the distribution of the covariates themselves,

conditional on the propensity score, which we're just writing as a function of X.

Remember, the propensity score depends on X.

Conditional on that propensity score being equal to some value P.

So, this is just some fixed value and conditional on A=1.

So conditioning on A=1 means we're only looking at treated subjects.

And in fact, we're only going to look at treated subjects that have a particular

propensity score that's equal to little p.

5:32

So what we want to know is: what is the distribution of the covariates among

treated people who have this particular propensity score?

Well, it turns out that that's the same as the distribution of covariates

among controls who have a value of the propensity score equal to p.
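
Written out, the statement being described is the following, with \pi(X) denoting the propensity score as a function of X and p the fixed value we condition on:

```latex
P\big(X = x \mid \pi(X) = p,\ A = 1\big) \;=\; P\big(X = x \mid \pi(X) = p,\ A = 0\big)
```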

5:53

So by conditioning on the propensity score equaling p,

what we're saying is let's think about all possible combinations of X

that would lead to this one propensity score.

So p could be 0.3, for example.

It's just a fixed value.

So, imagine p is 0.3.

We'll say, okay, what is the set of Xs that lead to a propensity score of 0.3?

Now, let's restrict to people who have those Xs.

6:20

Now let's also look at treated versus control,

then we'll see that the distribution of those Xs is the same in the two groups.

You can prove this with an application of Bayes' theorem, and it doesn't take very many steps.

But hopefully, the intuition is clear as to why that would be the case.

The main idea is really from the previous slide, where the propensity score being

the same for different sets of Xs would mean that you would expect to see

either type of X about as often in the treatment group as in the control group.
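
One way the Bayes' theorem argument can go is sketched below. It assumes discrete X for simplicity and is not necessarily the exact derivation intended in the lecture. Take any x with \pi(x) = p:

```latex
\begin{aligned}
P\big(X = x \mid \pi(X) = p,\ A = 1\big)
  &= \frac{P\big(A = 1 \mid X = x,\ \pi(X) = p\big)\, P\big(X = x \mid \pi(X) = p\big)}
          {P\big(A = 1 \mid \pi(X) = p\big)} \\
  &= \frac{p \cdot P\big(X = x \mid \pi(X) = p\big)}{p}
   = P\big(X = x \mid \pi(X) = p\big).
\end{aligned}
```

The numerator uses P(A = 1 | X = x) = \pi(x) = p, and the denominator is p because it averages P(A = 1 | X) over Xs that all have score p. Repeating the steps with A = 0 replaces p by 1 - p in both places and gives the same right-hand side, so the two conditional distributions agree.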

6:54

The implication is then that matching on the propensity

score should achieve balance.

So previously, we had talked about matching on the full set of covariates by

taking a distance between them.

And that would achieve balance if we do that well, but

the same thing would work here if we simply match on the propensity score.

If we do that well, we should have balance.

And this makes sense because we assumed ignorability.
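
As a rough sketch of what matching on the propensity score could look like, the code below does greedy one-to-one nearest-neighbor matching without replacement. The greedy rule, the function name and the toy scores are assumptions chosen for illustration, not a method prescribed in the lecture:

```python
def match_on_propensity(treated, controls):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, controls: dicts mapping a subject id to a propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)
    pairs = []
    for t_id, t_score in treated.items():
        if not available:
            break
        # Pick the unused control whose score is closest to this treated subject.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        pairs.append((t_id, c_id))
        del available[c_id]
    return pairs

# Toy propensity scores, made up for illustration.
treated = {"t1": 0.30, "t2": 0.70}
controls = {"c1": 0.72, "c2": 0.10, "c3": 0.31}

print(match_on_propensity(treated, controls))
```

Here each treated subject is paired with the control whose score is nearest, so the distance is computed on a single number rather than on the full set of covariates.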

So remember ignorability,

essentially means that treatment assignment is random given X.

So what we're really doing by conditioning on the propensity score is

conditioning on an allocation probability.

So, all we're really doing by conditioning on the propensity score is

conditioning on the rate at which treatment should be assigned.

So if we condition on the propensity score being equal to 0.3,

we're really in a randomized trial world where the allocation probability is 0.3,

where you're going to randomly assign treatment with probability 0.3 and

the control with probability 0.7.

And because at that point we're in a randomized trial world,

we expect to have covariate balance.

Next, we'll talk about the propensity score itself and

how we actually need to estimate it.

So in a randomized trial, the propensity score is generally known.

So, people who are planning a randomized trial will typically decide what

the allocation probability is.

In the simplest case, the allocation probability, the probability

of treatment given covariates would actually not depend on covariates.

It would just equal 0.5.

So, the standard kind of randomized trial where each person has a 50% chance of

getting treatment.

The allocation probability would just be 0.5 for everyone.

You could, of course, have some kind of stratified random

sampling where you might be conditioning on X and so on.

But regardless, the allocation probability would be known ahead of time,

which means the propensity score is known ahead of time in a randomized trial.

So, it's known by design.
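
For the simplest design described above, a sketch might look like this. The constant name, the function name and the sample size are invented; the only point is that the allocation probability is written down before any data exist:

```python
import random

random.seed(1)

ALLOCATION_PROBABILITY = 0.5  # fixed by the trial design, same for everyone

def randomize(n):
    """Assign treatment (1) or control (0) to n subjects by coin flip."""
    return [1 if random.random() < ALLOCATION_PROBABILITY else 0 for _ in range(n)]

assignments = randomize(10_000)

# The propensity score is 0.5 for every subject, whatever their covariates;
# the observed treated fraction should land close to that design value.
print(sum(assignments) / len(assignments))
```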

9:18

However, it's important to note that the propensity score just

involves observed data.

So the propensity score, the probability of treatment,

given covariates only involves observed data.

It just involves A and X.

Both of which we observed together on all of our subjects.

What that means is that we should be able to estimate it.

So, we observe which subjects are treated and which are not.

We observe the values of X for each of those subjects,

so we should be able to estimate the relationship between those.

So most of the time when people actually talk about a propensity score,

they're really referring to an estimated propensity score.

So, how do we actually go about estimating the propensity score?

10:21

And treatment is just binary, so you can use any kind of model that you would

use if you want to predict a binary outcome, given a set of variables.

So the most popular approach for doing that is probably logistic regression, but

I do note that you could use whatever you wanted.

So this is really just a classic kind of machine learning problem, as well.

So you could use any sort of machine learning kind of method, as well.

The important thing isn't how you go about it at this point.

It's just that we need to estimate the probability of treatment given Xs for

every subject.

So, let's imagine we're going to use logistic regression.

So we have outcome A, covariates X.

So, we'll just fit that model using standard logistic regression methods.
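
For reference, a logistic regression of A on X is usually written as below. This assumes a main-effects model with k covariates; the coefficients \beta_0, \dots, \beta_k and the number k are notation introduced here:

```latex
\operatorname{logit} P(A = 1 \mid X)
  = \log \frac{P(A = 1 \mid X)}{1 - P(A = 1 \mid X)}
  = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k
```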

11:21

So, our second step after we fit the model is to actually get

predicted probabilities, or fitted values, for each subject.

And usually, that's one line of code in statistical software.

We can actually get a predicted probability for each person.

So for every person, we'll have a predicted probability.

So we'll have a number between zero and one, and

that will be the propensity score.

And in fact, it's an estimated propensity score.

But from here on out, we'll just call it the propensity score.

So, it will be a value between zero and one for every subject.
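
The two steps can be sketched end to end as follows. This is a minimal illustration, not the course's own code: it uses simulated data with a single covariate, a made-up true propensity model, and a small hand-written gradient-ascent fit in place of the one-line call that statistical software would provide:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, treatments, steps=500, lr=1.0):
    """Fit P(A=1 | x) = sigmoid(b0 + b1 * x) by gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, a in zip(xs, treatments):
            resid = a - sigmoid(b0 + b1 * x)  # observed minus predicted
            g0 += resid
            g1 += resid * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Simulated data: x is a standardized covariate, and the true propensity
# sigmoid(-0.5 + 1.0 * x) is made up for this illustration.
random.seed(2)
xs = [random.gauss(0, 1) for _ in range(1000)]
treatments = [1 if random.random() < sigmoid(-0.5 + 1.0 * x) else 0 for x in xs]

# Step 1: fit the model with treatment A as the outcome.
b0, b1 = fit_logistic(xs, treatments)

# Step 2: the predicted probabilities are the estimated propensity scores.
scores = [sigmoid(b0 + b1 * x) for x in xs]

print(min(scores) > 0 and max(scores) < 1)
```

Every subject ends up with one number strictly between zero and one, which is the estimated propensity score for that subject.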