A conceptual and interpretive public health approach to some of the most commonly used methods from basic statistics.

From the course by Johns Hopkins University

Statistical Reasoning for Public Health 1: Estimation, Inference, & Interpretation

207 ratings

From the lesson

Module 3B: Sampling Variability and Confidence Intervals

The concepts from the previous module (3A) will be extended to create 95% CIs for group comparison measures (mean differences, risk differences, etc.) based on the results from a single study.

- John McGready, PhD, MS, Associate Scientist, Biostatistics

Bloomberg School of Public Health

Okay, so this is a very short section.

Just to give a brief update about ratios, a note about ratios, part two.

So, recall from the first brief note about ratios in lecture 8D

that the scaling of ratios is not symmetric around the value of one,

which would indicate equal values in the numerator and denominator.

On the log scale, however, and we use the natural log, sometimes

written as ln, the log to the base e, the values of the log ratios are

symmetric about the value of zero.

So this rescaling to the, if you will, egalitarian log scale also means that the

confidence interval limits are comparable on the

log scale for both positive and negative associations.

The intervals will have the same width on

the log scale, regardless of the direction of comparison.

This may not be the case on the ratio scale.

So let's just look at an example.

Let's go back to our example of HIV transmission where mothers,

pregnant mothers with HIV, were given AZT or placebo during pregnancy.

And so you recall we could estimate the relative

risk of contracting HIV for the infants in either direction.

We could either do what would probably make the

most sense, compare the risk for those whose mothers

got AZT with the risk for those whose mothers got the placebo, and the relative risk is 0.32.

But there's no reason we couldn't do it in the opposite direction:

take the proportion of infants contracting HIV amongst those born to mothers

who were given the placebo and compare that to the proportion amongst those born to mothers given

AZT, and that is a relative risk in the opposite direction, 3.1.

And look at the confidence intervals. Neither includes the null

value, but the width of the confidence interval for

going in the direction of the group with the

higher risk, placebo to AZT, is a lot wider

than the width of the confidence interval when

we compare the lower risk group, those born to mothers given AZT,

to the higher risk group, those born to mothers

given a placebo. So it may look like the second approach,

down here, yields a much less precise estimate

because that confidence interval is wider.

However, that's because of the distorted ranges

of possible values for positive and negative associations.

So, on the log scale, everything gets equalized.

On the log scale, the log of 0.32 is just the opposite

of the log of 3.1. So the effect is the same size on the log

scale, just a different sign. The log of 0.32 is negative 1.14.

The log of 3.1 is positive 1.14.
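As a quick check of that arithmetic, here is a minimal Python sketch. It uses only the 0.32 relative risk quoted above and takes its exact reciprocal as the opposite-direction ratio (about 3.1); the function and variable names are just illustrative.

```python
import math

def log_ratio_pair(ratio):
    """Return the natural log of a ratio and of its reciprocal."""
    return math.log(ratio), math.log(1 / ratio)

rr_azt_vs_placebo = 0.32                       # relative risk quoted in the lecture
rr_placebo_vs_azt = 1 / rr_azt_vs_placebo      # exact reciprocal, roughly 3.1

log_down, log_up = log_ratio_pair(rr_azt_vs_placebo)

# On the ratio scale the two estimates sit at very different distances from 1.
dist_below_one = 1 - rr_azt_vs_placebo
dist_above_one = rr_placebo_vs_azt - 1

print(f"ratio scale: {dist_below_one:.2f} below 1 vs {dist_above_one:.2f} above 1")
print(f"log scale:   {log_down:.2f} vs {log_up:.2f}")
```

The two log values have the same magnitude and opposite signs, while the distances from one on the ratio scale are very different.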

And regardless of the direction of the comparison, the standard error estimate

of the log relative risk is the same. It's exactly the same.

So consequently the confidence intervals we create on

the log scale will be of the same width,

because we'll be adding and subtracting

two times the same standard error.

It's when they get exponentiated back to the ratio scale that things

get distorted, depending on which side of one the values are on.
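To see that distortion in numbers, here is a small Python sketch. The standard error of 0.3 is a made-up value chosen purely for illustration, not the one from the AZT study, and `ratio_ci` is a hypothetical helper name.

```python
import math

def ratio_ci(log_estimate, se, multiplier=2.0):
    """CI built on the log scale (estimate +/- multiplier * SE), then exponentiated."""
    lower_log = log_estimate - multiplier * se
    upper_log = log_estimate + multiplier * se
    return math.exp(lower_log), math.exp(upper_log)

se_log_rr = 0.3                 # hypothetical standard error, for illustration only
log_rr = math.log(0.32)         # AZT versus placebo, as quoted in the lecture

low_a, high_a = ratio_ci(log_rr, se_log_rr)     # AZT versus placebo
low_b, high_b = ratio_ci(-log_rr, se_log_rr)    # placebo versus AZT

# Widths on the log scale are identical (4 * SE in both directions)...
log_width_a = math.log(high_a) - math.log(low_a)
log_width_b = math.log(high_b) - math.log(low_b)

# ...but widths on the ratio scale are not.
ratio_width_a = high_a - low_a
ratio_width_b = high_b - low_b

print(f"log-scale widths:   {log_width_a:.2f} and {log_width_b:.2f}")
print(f"ratio-scale widths: {ratio_width_a:.2f} and {ratio_width_b:.2f}")
```

The interval for the ratio above one comes out much wider on the ratio scale even though both were built from the same standard error.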

And why is this standard error the same?

Well, you can see that we could

arrange the table differently: if we were feeding it to

a computer, we would just switch the columns.

But if you apply that formula where you

take one over the first cell minus one over

the second cell down, plus one over the third cell

minus one over the fourth cell you get the

same result regardless of the ordering of the columns.

That standard error is consistent regardless

of how we compute the relative risk.
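Here is a minimal Python sketch of that invariance. It assumes the usual large-sample formula for the standard error of a log relative risk (one over the events minus one over the group size, summed over both groups, then square-rooted), and the counts are invented for illustration; they are not the counts from the AZT study.

```python
import math

def log_rr_and_se(events_1, n_1, events_2, n_2):
    """Log relative risk of group 1 versus group 2, with its standard error.

    Assumes the standard large-sample formula:
    SE = sqrt(1/events_1 - 1/n_1 + 1/events_2 - 1/n_2)
    """
    log_rr = math.log((events_1 / n_1) / (events_2 / n_2))
    se = math.sqrt(1 / events_1 - 1 / n_1 + 1 / events_2 - 1 / n_2)
    return log_rr, se

# Hypothetical counts: 20 of 200 events in the treated group, 50 of 200 in the placebo group.
log_rr_forward, se_forward = log_rr_and_se(20, 200, 50, 200)   # treated versus placebo
log_rr_reverse, se_reverse = log_rr_and_se(50, 200, 20, 200)   # placebo versus treated

print(f"log RR: {log_rr_forward:.3f} and {log_rr_reverse:.3f}")
print(f"SE:     {se_forward:.3f} and {se_reverse:.3f}")
```

Switching the columns only reorders the terms being added, so the standard error is unchanged while the log relative risk flips sign.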

So here is another example we've looked

at before mortality on dialysis, race, and age.

And in one of the graphics,

they show the relative adjusted

hazard, or incidence rate, of death in black versus white dialysis patients by age.

And what they plotted was the estimated ratio and the confidence limits on the ratio.

And what they're showing is the relative risk,

the incidence rate ratio of mortality for black

versus white patients, for separate age groups.

And you can see the incidence rate ratio comparing blacks

to whites shows higher mortality for blacks, and statistically significantly

so, in the younger ages, but then it changes direction in the older ages.

Being black is protective against mortality amongst those on dialysis.

And this is a phenomenon we'll get to in the second term called interaction.

The relationship between the outcome and one thing, mortality and

race, depends on the level of another thing, which is age.

But what they've done is they've actually

plotted these confidence intervals and points on the log scale, even

though they labeled the axis with ratio values.

And so, for example, it's not quite that evident with

numbers that are so close to one making up the

axis, but if you look at the distance between 0.75

and one, it's equal to the distance between one and 1.33.

I can't draw this very accurately, but those log

distances are the same.
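A short Python sketch of that axis spacing, using the exact reciprocal of 0.75 (4/3, which the axis label rounds to 1.33); `log_distance` is just an illustrative helper name.

```python
import math

def log_distance(a, b):
    """Distance between two positive ratio values on a natural-log axis."""
    return abs(math.log(b) - math.log(a))

below = log_distance(0.75, 1.0)        # from 0.75 up to the null value of 1
above = log_distance(1.0, 4 / 3)       # from 1 up to 4/3, the reciprocal of 0.75
above_rounded = log_distance(1.0, 1.33)  # using the rounded axis label

# On the raw ratio scale the same two gaps are unequal.
raw_below = 1.0 - 0.75
raw_above = 4 / 3 - 1.0

print(f"log distances: {below:.4f} and {above:.4f} (rounded label: {above_rounded:.4f})")
print(f"raw distances: {raw_below:.4f} and {raw_above:.4f}")
```

The log distances match, whereas on the raw ratio scale the gap above one is larger than the gap below it.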

So this helps keep the graphics a little more honest in terms

of the relative precisions of these

estimates. We're not now getting a distortion

and incomparable precision between ratios greater than one and

ratios less than one because of that unequal scaling.

What we see here is that these precision bars

are comparable.

And we have less precision in the younger ages than the older ages.

So in summary, sometimes you'll see ratios and

their confidence limits presented on the log scale.

Ideally they'll do what the authors of the previous paper

did and relabel the axis with the actual corresponding ratio values and

just scale it relative to the log distance, so that you

don't have to exponentiate the axis values to actually

get the ratios.

But this is a nice way to display things such that it keeps the comparability

of both the size of the estimates and the interval widths.

It allows us to compare them across values that are greater than one and less

than one without that distortion occurring because of

the unequal ranges on the ratio scale itself.
