Here, what we've got though is something where that sample is not

really what we use for

the inference because the sample contains cases that we selected that don't respond.

We have, sort of a 3.1 if you will.

A respondent sample.

A set of lower case r respondents from the lower case n sample persons.

It's almost as though we now have another fraction.

It's not as severe a fraction as we had for drawing our sample,

but a fraction of the sample that is retained in the sample.

Our green is what we actually are going to work with.

That's what we're going to see in our data set.

We won't see the full sample of all the elements,

because some of them we didn't obtain data for.

So that respondent sample now, we need to do something with it to compensate for

its selection.

So one way to deal with this, this is not the only way,

but one way to deal with this is to take those respondents and

inflate them back to the sample number, to undo that non-response mechanism.

Now, that non-response mechanism is not a probability mechanism.

So, undoing this will require that we make some assumptions.

In order to do that, we're going to go backwards from our lowercase r to our

lowercase n in a weighted respondent file that compensates for the non-response.

We're concerned that the non-response, if it's

disproportionately allocated across our groups, could have some impact on our results.

So suppose, for example, among our 10th graders, that what we observed

was that the response rate across 10th grade students varied by location.

That we had lower response rates among the urban than we did among the rural

school students.

And so we might see this kind of a situation now.

Our sample of lowercase n of 12,000 happened to have 8,000

of them in metro areas, our urban locations.

I'll use that labeling here.

So in metropolitan locations,

8,000 of our sample children came from those locations after we've done our

oversampling, and 4,000 from the non-metro. But they didn't all respond.

Among the metro students, the 8,000, 5,600 responded.

Among the non metro, 3,400.

The mean scores for these two groups differ.

And if I take the 9,000 responding students out of the 12,000 sampled,

we get about a 75% response rate.

But it doesn't look like that response rate is the same across these two groups.

We'll look at it in more detail in a minute.

But now if our mean scores differ across those groups,

then what happens is that, while the mean score for the sample children

comes out to be about 65, with a 60 for the metro and a 75 for the non-metro,

because of the differential non-response, when I do that averaging

among just the respondents, I don't get the same mean.

Now it's not a big departure in this case.

I didn't want to exaggerate, so I kept the same mean in each group.

But because there's slightly different response rates between the two groups,

I get a different mean.
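The arithmetic behind those two means can be checked with a short sketch. The counts and group means are the ones quoted in the lecture; the dictionary and function names are illustrative only, and the sketch assumes, as the lecture does, that the mean within each group is the same for respondents as for the full sample.

```python
# Counts and group means taken from the lecture's example.
sample = {"metro": 8000, "non_metro": 4000}        # lowercase n per location
respondents = {"metro": 5600, "non_metro": 3400}   # lowercase r per location
mean_score = {"metro": 60.0, "non_metro": 75.0}    # assumed equal for sample and respondents


def combined_mean(counts, means):
    """Average the group means, weighting each group by its count."""
    total = sum(counts.values())
    return sum(counts[g] * means[g] for g in counts) / total


full_sample_mean = combined_mean(sample, mean_score)
respondent_mean = combined_mean(respondents, mean_score)

print(f"full sample mean: {full_sample_mean:.2f}")  # 65.00
print(f"respondent mean:  {respondent_mean:.2f}")   # 65.67
```

The respondent mean drifts upward only because the higher-scoring non-metro group makes up a larger share of the respondents than of the sample.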

I'd like to compensate for that.

I'm going to use the same tool that I did before.

And it's going to involve, then, computing a response rate in each group,

a response rate for the metro and for the non-metro.

And then, for each of those response rates, thinking about them for

a moment and treating them as though they are sampling rates.
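One way to read "treating them as though they are sampling rates" is sketched below: each respondent gets a weight equal to the inverse of the response rate in their group, which inflates the respondents back to the sample counts. This is an assumption about what the "same tool" amounts to, not a statement of the lecturer's exact procedure, and all names here are illustrative.

```python
# Counts and group means taken from the lecture's example.
sample = {"metro": 8000, "non_metro": 4000}        # lowercase n per location
respondents = {"metro": 5600, "non_metro": 3400}   # lowercase r per location
mean_score = {"metro": 60.0, "non_metro": 75.0}    # respondent mean per location

# Response rate per group, treated as if it were a sampling rate.
response_rate = {g: respondents[g] / sample[g] for g in sample}

# Assumed adjustment: weight each respondent by the inverse of that rate.
weight = {g: 1.0 / response_rate[g] for g in sample}

# Weighted respondent counts inflate back to the original sample counts.
inflated = {g: respondents[g] * weight[g] for g in sample}

# Weighted mean over respondents.
weighted_mean = (
    sum(inflated[g] * mean_score[g] for g in sample) / sum(inflated.values())
)

for g in sample:
    print(f"{g}: weight {weight[g]:.3f}, inflated count {inflated[g]:.0f}")
print(f"weighted respondent mean: {weighted_mean:.2f}")
```

Under this assumption the weighted respondent mean returns to the full-sample value of about 65, but only because respondents and non-respondents within a group are taken to have the same mean.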

So here, you see that, again, in our two locations, Metro and

Non-metro, we have the size of our sample, lowercase n.

And our respondent sample, lower case r,

5,600 metro, 3,400 non-metro.

But we can now see we've calculated the response rate.

The 5,600 from the 8,000 is a response rate of 70%,

0.7. That's the fraction of the original sample that responded, for

the metro portion of our sample.

And for the non-metro, 85% responded. And so what we've gotten now into our

sample is an over-representation, if you will, of the non-metro.

Not by any deliberate design, but because

of the way the non-response mechanism worked,

outside of our control. Now we have a disproportionate allocation: an overall

response rate of 75%,

but 70% metro and 85% non-metro.
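As a quick check on those figures, the sketch below recomputes the response rates and compares each location's share of the sample with its share of the respondents. The counts are from the lecture; the variable names are illustrative.

```python
# Counts taken from the lecture's example.
sample = {"metro": 8000, "non_metro": 4000}        # lowercase n per location
respondents = {"metro": 5600, "non_metro": 3400}   # lowercase r per location

n_total = sum(sample.values())
r_total = sum(respondents.values())

# Response rate in each group and overall.
response_rate = {g: respondents[g] / sample[g] for g in sample}
overall_rate = r_total / n_total

# Share of each group in the sample versus among the respondents.
sample_share = {g: sample[g] / n_total for g in sample}
respondent_share = {g: respondents[g] / r_total for g in sample}

for g in sample:
    print(
        f"{g}: response rate {response_rate[g]:.2f}, "
        f"sample share {sample_share[g]:.3f}, "
        f"respondent share {respondent_share[g]:.3f}"
    )
print(f"overall response rate: {overall_rate:.2f}")
```

The non-metro group is one third of the sample but a larger fraction of the respondents, which is the over-representation described above.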

What are we going to do about it?