One problem we run into
with the census is its timing.
Censuses are only done every ten years.
In the US, the census is done in years ending in zero,
going back to 1790, the first census of population.
And that means the data get rapidly out of date.
There are also some problems with non-response.
The US census starts as a process of
self-administered forms being mailed out to people through the postal service.
And then people are supposed to fill out those forms and return them.
But about 25% of the US population,
or more than that, doesn't return those forms.
And then there's a follow-up process in which census workers go door to door
to those households that have not returned the forms, in order to collect the data.
So, it's a complex undertaking because it's so large and vast in its operation.
The cost per completed unit and the total cost are a little misleading.
It may not cost very much per person to do a census, but when you've
got a large population of hundreds of millions of people, the total cost can be
something that a government can only attempt periodically.
And so, it's not feasible to do censuses all the time.
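To make the per-unit versus total cost point concrete, here is a minimal sketch of the arithmetic. Every figure in it is hypothetical (the cost per person, the population size, and the sample size are made up for illustration), and it assumes the same cost per completed unit for the census and the sample, which need not hold in practice.

```python
# All figures below are hypothetical, chosen only to illustrate the arithmetic.
COST_PER_PERSON = 40.0        # assumed dollars per completed unit
POPULATION = 300_000_000      # stands in for "hundreds of millions of people"
SAMPLE_SIZE = 60_000          # assumed sample size


def total_cost(units: int, cost_per_unit: float) -> float:
    """Total cost is the number of completed units times the cost per unit."""
    return units * cost_per_unit


census_cost = total_cost(POPULATION, COST_PER_PERSON)
sample_cost = total_cost(SAMPLE_SIZE, COST_PER_PERSON)

print(f"Census total: ${census_cost:,.0f}")
print(f"Sample total: ${sample_cost:,.0f}")
print(f"The census costs {census_cost / sample_cost:,.0f} times as much")
```

The per-person figure is identical in both cases; only the number of units changes, and that alone drives the gap in total cost.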
In between censuses, people can do samples.
They're going to get snapshots and try to
get as good a representation of the population as possible.
So the census and the sample work together.
Generally, the census represents what the sample aspires to be:
measuring everybody and getting a good, accurate picture of the population.
But the sample is more cost-effective and timely and can be done more often,
filling the gaps in terms of time, filling the gaps in terms of population,
and filling the gaps in terms of variables that are being measured.
The census can only measure a limited number of variables at a time.
So, the sample is far less costly and far less time-consuming to conduct.
It's only dealing with a subset of the population, but we want to make sure that the
sample somehow represents the population, a miniature of it, as I've said before.
And if we can't do the whole and
thereby get perfection, we're going to have some error.
We'd love to have it so there's no error,
but there's going to be some error because we only have a subset of the individuals.
How large is that error?
And just what can we do with it?
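One way to get a feel for how large that error is would be a small simulation. The sketch below is only an illustration under stated assumptions: the population is made up (random household sizes from 1 to 6), the samples are simple random samples, and the helper names `sample_error` and `typical_error` are hypothetical. It draws repeated samples of several sizes and reports how far the sample mean typically lands from the population mean.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 100,000 made-up household sizes between 1 and 6.
population = [random.randint(1, 6) for _ in range(100_000)]
true_mean = sum(population) / len(population)


def sample_error(n: int) -> float:
    """Draw a simple random sample of size n; return its mean minus the true mean."""
    sample = random.sample(population, n)
    return sum(sample) / n - true_mean


def typical_error(n: int, repeats: int = 100) -> float:
    """Root-mean-square error over repeated samples of size n."""
    errors = [sample_error(n) for _ in range(repeats)]
    return (sum(e * e for e in errors) / repeats) ** 0.5


for n in (100, 1_000, 10_000):
    print(f"sample size {n:>6}: typical error {typical_error(n):.4f}")
```

In this setup the typical error shrinks as the sample grows, and it drops to zero when the "sample" is the entire population, which is the census case.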
Now actually we think that in the case of samples,