0:00

Today I

want to

talk to you about how science

answers questions. We all want the right answers.

Should I prepare for rain today? How can I deal with my

boyfriend or girlfriend?

Should I go to the ball game or the concert tonight?

Every day we face a myriad of circumstances for which we need answers, the right answers.

0:39

Well, sorry gang.

But science never can tell you the right answer.

All we can do is give you a probability that an answer is correct or not.

What this means is that we live in a statistical universe. The universe is statistical by nature, and we now have quantum mechanics to back up that claim. It may be that, fundamentally, a 100 percent deterministic answer does not even exist. In this very real sense, the universe

with its constituents is not a machine, but an indeterminate

process. However, even if it is indeterminate,

it is not a free for all, but constrained by certain boundaries.

It is these boundaries that physics wishes to explore, quantify, and refine.

So we have a problem on

our hands.

If there are three kinds of lies, namely lies, damned lies, and statistics, how are we to proceed?

Incidentally, that phrase is often attributed to several people, among whom are Mark Twain and Benjamin Disraeli.

Well, the fact of the matter is that the only way we can find out whether we are dealing with lies or damned lies is through statistics. The way it works is as follows.

2:13

If a theory predicts a phenomenon that is not observed, we can rule out the theory.

But if it does accord with the data we have, all we can say is that the theory is

consistent with the data on hand, not that the theory has been proven correct.

In fact, we can never prove a theory correct.

What this means in practice is that the most important part of any scientific

experiment is the probable error associated with the measurement.

It is more or less the wiggle room that we give a measurement, an estimate of how much our measurement might differ if we did the experiment over and over again. As a concrete example, let's imagine we are trying to measure the length of a dining room table.

We get out our trusty old stone-age tape measure, and do an experiment.

3:13

Measurement one, 260 centimeters. Now,

what is the experimental error that we can estimate from this measurement?

You might think nothing, since we have nothing to compare it to.

But wait.

If we look at our stone-age device, we see that

it is very crudely made with big unmarked segments and

thick lines marking the intervals. So we can estimate something

called a systematic error, at say, ten centimeters.

But we are not satisfied with that. So we make more measurements.

250 centimeters. 260 centimeters.

250, 270, 260, 100. 100 centimeters?

4:09

Whoa, what happened?

Do we really think that the table length is variable?

It might be.

But at first glance it appears we have made what we call a blunder.

If we use that 100 centimeter measurement in our computation

of, say, an average length, we will throw off everything.

But if we suspect a blunder, we need to track down

its source, if possible. So, using the

six supposedly valid measurements, we can obtain a sample average of about 258 centimeters. Now

we can ask, what is the experimental error associated with this

determination? In other words, how close to 258

centimeters would you expect each measurement to be?

Clearly, we must compare our sample average

with the individual measurements we already have.

Also, we sense that a measurement that's smaller than the average should count equally with a measurement that is larger.

So, we had better square things first and then take the square root, in order to avoid the minus signs that would be associated with measurements that are smaller than average.

Thus, we expect that we need to take the following quantity. In other words, for each of our N measurements, we compare the actual measurement, x sub i, with the mean of all the measurements, x bar, and square the difference. Then, add all N results together and take the square root of the whole shebang.

6:02

But clearly something is missing here because larger samples of measurements,

in other words, N larger, should not imply a larger error.

Measurements have to be worth something.

So we sense that our estimate of standard

deviation should include some factor of 1 over N.

In fact, it turns out that this quantity, usually designated as sigma,

is equal to our previous sum, but divided by the square root of N.

In essence, we are taking the square root of the average of the squared deviations from the mean.

Now, remembering that you need a minimum of two measurements to get any average value, this leads to the refinement of our equation as follows.

6:56

It's the same as before, except it has N minus 1, instead of N, under the square root. The value of this quantity,

sigma, associated with a mean value, can be shown to have an astonishing property.

7:15

That 68%, or about two-thirds, of all measurements you can possibly make, even into the future, will fall within plus or minus 1 sigma of the mean.

As long as the properties of the phenomenon haven't been altered.
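The recipe above can be sketched in a few lines of Python. This is a minimal sketch using the six table measurements from the example; it applies the N minus 1 refinement just described.

```python
import math

# The six supposedly valid table measurements, in centimeters
measurements = [260, 250, 260, 250, 270, 260]

n = len(measurements)
mean = sum(measurements) / n  # sample average, about 258 cm

# Sum the squared deviations of each measurement from the mean
sum_sq_dev = sum((x - mean) ** 2 for x in measurements)

# Sample standard deviation: divide by N - 1, then take the square root
sigma = math.sqrt(sum_sq_dev / (n - 1))

print(f"mean = {mean:.0f} cm, sigma = {sigma:.1f} cm")
```

Run on these six numbers, this gives a sigma of roughly 7.5 centimeters, comparable to the crude ten-centimeter systematic error we estimated from the tape measure itself.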

To summarize,

7:49

plus or minus 1 sigma contains 68% of all data measurements, plus or minus 2 sigma contains about 95%, and plus or minus 3 sigma contains 99.7% of all data measurements. That's

it. It doesn't matter what you are measuring.

You could be interested in the height of 25-year-old women in Borneo.

A comparison of daily maximum temperatures of two cities anywhere in the world.

Measurement of returns on investment versus risks in

financial markets. Analysis of statistics of scoring

in sports.

All of these basically use the same ideas presented here.

8:31

However, if the phenomenon has changed, or a new phenomenon is somehow buried in the

data, a smaller standard deviation will enable you to detect it more easily.

Let's get back to our table and fast forward to the 21st century.

New devices now allow us to obtain much better precision in our measurements.

Now, using the same table, we may obtain the following results.

Now, you look at these numbers closely and say, hmm.

Are we seeing something significant in the fact that the numbers seem to cluster

around 258.65 and

258.85? Maybe, maybe not.

But we pay attention to this detail,

and then find out with additional measurements,

that we have obtained the higher numbers

when the temperature of the room is significantly

higher than when we look at the lower values.

We have discovered something.

The table is changing its length in

response to a temperature change in its environment.

Our measurements have revealed the thermal expansion of the table.

Something we had not anticipated, perhaps. And X-ray astronomy, as we

shall see, is filled with surprises of this sort.

And I'm sure your life is filled with them, as well.

When have you thought that you were exploring or answering one question,

when in reality, you were finding out something quite different instead?

Why not discuss this on the forum?

10:30

We now shift gears and look at a hypothetical astronomical example.

In this case, our determination of the standard deviation, or uncertainty in our measurements, is even easier to obtain than our result for the table length.

This is because in certain situations,

which fortunately include most astronomical observations,

a very simple result ensues concerning what we might expect

from a measurement of say, the brightness of a cosmic X-ray source

as a function of time. The idea is as follows.

Let's suppose you have a random process, such

as the emission of light from an object.

We know that when an electron changes its energy within an atom by jumping from one level to another, this is accompanied by the emission or absorption of a photon.

And we know that it is random in

the mathematical sense, because we can never know

exactly when this will happen, but it will

probably happen in a certain given time period.

And if it happens to lots and lots

of electrons, lots and lots of times, we will get lots and

lots of photons into our cameras or detectors.

Let's suppose, to make this concrete, we observe a source for ten

minutes and count 21,262 photons.

We sense that if we were to do this measurement

again and again, we would not get exactly 21,262

photons again and again, even if the source were unchanging.

The randomness of the process ensures this.

12:29

Well, in these circumstances, there is a simple way to

estimate what the probability is of getting another result similar

to but not identical with our first trial, if we were to repeat our measurement.

We simply take the square root of the number of photons observed, and that represents the range, plus and minus from our observation, that we would expect to see two-thirds of the time, if we were to do the observation over and over again. Thus, if we consider our original observation, we would expect to observe 21,262 photons, plus or minus 146, about two-thirds of the time if we were to repeat the experiment over and over.

Why? Because 146 is

approximately the square root of 21,262.
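This square-root rule is easy to check numerically. A minimal sketch in Python, using the photon count from the example:

```python
import math

counts = 21262  # photons observed in the ten-minute exposure

# For a random counting process, the standard deviation is
# estimated as the square root of the number of counts
sigma = math.sqrt(counts)

print(f"{counts} +/- {sigma:.0f} photons")  # 21262 +/- 146
```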

The number 146, once again, is the standard deviation of our observation. In astronomy, however, just raw numbers of photons are not particularly interesting.

We are more interested in rates. How much energy

is emitted per second or how many photons

are detected per second during any given observation?

So, let's see how this plays out in practice.

14:10

Let's imagine that we have 100 photons in ten seconds. Our expected range, or in statistical language, our standard deviation, will then be 100 plus or minus 10 counts, because 10 is the square root of 100.

This translates into a rate of 100 counts over

10 seconds, plus or minus 10 counts

over 10 seconds. Or, 10 plus

or minus 1 count per second.
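The arithmetic for turning counts into a rate with its uncertainty can be sketched as follows, using the numbers from the example:

```python
import math

counts = 100      # photons detected
exposure = 10.0   # observation length in seconds

sigma_counts = math.sqrt(counts)      # square-root rule: 10 counts
rate = counts / exposure              # 10 counts per second
sigma_rate = sigma_counts / exposure  # 1 count per second

print(f"rate = {rate:.0f} +/- {sigma_rate:.0f} counts per second")
```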

15:02

Let's imagine the same source, which is assumed unchanging, but now we

observe it for 1,000 seconds. In other words, our observation

is 100 times as long. Since we get

100 counts in 10 seconds, we would expect to

get 10,000 counts in 1,000 seconds.

And therefore, we would expect to have 10,000,

plus or minus 100 counts, in our observation.

15:41

Our rate then would be 10,000 counts divided by 1,000 seconds, plus or minus 100 counts in 1,000 seconds, or 10 plus or minus 0.1 counts per second. Notice that we

16:12

needed 100 times more data to get our standard

deviation down by only a factor of 10.

What a bummer.
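The general pattern, that the uncertainty on the rate shrinks only as the square root of the observing time, can be sketched by comparing the two observations from the example:

```python
import math

rate = 10.0  # counts per second, source assumed unchanging

# Compare a 10-second and a 1,000-second observation of the same source
for exposure in (10.0, 1000.0):
    counts = rate * exposure
    sigma_rate = math.sqrt(counts) / exposure
    print(f"{exposure:6.0f} s: {rate} +/- {sigma_rate} counts per second")
```

One hundred times the data, but the error bar shrinks only from 1 to 0.1 counts per second, a factor of ten.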

So, as you see, it can be slow going and sometimes very

expensive to get better and better results.

But it is the reason why scientists are always

asking for more data, better detection instruments, and bigger telescopes.

We will explore this important issue in greater depth in

week three, when we talk about clocks in the sky.

But for now, we will just state that

the size of the error bar, or standard deviation,

may play a decisive role in what we can legitimately say

about an astronomical source. Consider the following hypothetical

data points measuring the brightness of an object versus time.

So what we're going to do is plot the brightness of a source versus time. And let's imagine that we have the following points on our graph, something that looks like this.

17:52

Now we ask a simple question, is this source varying?

Well, it depends on the size of the error bars.

With a small sigma, we are more or

less forced to connect the observations with some

kind of variable curve. Let's just look at what

18:18

would happen if we have very small error bars attached to each of these points. You can

see without even drawing anything, that

it's impossible to just fit an ordinary,

non-varying line through the data that

would meet our requirement that two-thirds of our data points be

within one sigma of that particular line. We are almost

forced into drawing something like that.

19:08

Let's imagine that we have exactly the same data points, but now, maybe because we're using a smaller telescope, or our detectors aren't as good, or whatever, each measurement comes with really, really big error bars. Now, you can see that it's quite easy to meet our requirement, more or less, of two-thirds of all of the data being within plus or minus 1 sigma of our mean, by fitting a straight, unvarying line through the data.
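The flat-line test described here can be sketched numerically. The brightness values and the two sigma values below are invented purely for illustration; the idea is to check what fraction of the points falls within one sigma of a constant line at the mean.

```python
# Hypothetical brightness measurements, invented purely for illustration
brightness = [10.2, 9.5, 10.8, 9.1, 10.6, 9.4, 10.9, 9.0]
mean = sum(brightness) / len(brightness)

def fraction_within_one_sigma(data, center, sigma):
    """Fraction of the data points lying within +/- 1 sigma of a flat line."""
    inside = sum(1 for x in data if abs(x - center) <= sigma)
    return inside / len(data)

# Small error bars: almost no points sit within 1 sigma of a constant
# line at the mean, so an unvarying source is ruled out
small = fraction_within_one_sigma(brightness, mean, sigma=0.2)

# Large error bars: essentially all points do, so a flat line is acceptable
large = fraction_within_one_sigma(brightness, mean, sigma=2.0)

print(small, large)
```

With the small sigma, the two-thirds requirement fails badly and we are pushed toward a variable curve; with the large sigma, the straight line passes easily.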

19:59

So you can see that the standard deviation sigma is critical to our observation, and it will determine whether our scientific estimate of variability is a lie, a damned lie, or a legitimate statement of probable fact.