A conceptual and interpretive public health approach to some of the most commonly used methods from basic statistics.

From the course by Johns Hopkins University

Statistical Reasoning for Public Health 1: Estimation, Inference, & Interpretation

136 ratings

From the lesson

Module 2B: Summarization and Measurement

Module 2B includes a single lecture set on summarizing binary outcomes. While at first the summarization of binary outcomes may seem simpler than that of continuous outcomes, things get more complicated with group comparisons. Included in the module are examples of and comparisons between risk differences, relative risks, and odds ratios. Please see the posted learning objectives for this module for more details.

- John McGready, PhD, MS, Associate Scientist, Biostatistics

Bloomberg School of Public Health

OK. In this section, we're going to talk about comparing binary outcomes between two or more populations using results from two or more samples. I alluded to this at the end of the previous section: even though we can simply summarize the distribution of a binary outcome in any single group using the sample proportion, things get a little more nuanced and tricky when we start comparing across samples. So, we're going to talk about two ways to compare them that can sound very different numerically while using the same information: something called the risk difference and something called the relative risk.
So, upon completion of this lecture section, you will be able to compute the risk difference and relative risk for comparing binary outcomes between two samples; interpret the risk difference and relative risk in public health or personal health contexts; understand that the risk difference and relative risk will always agree in terms of direction but can differ greatly in magnitude; and understand that neither the risk difference alone nor the relative risk alone is necessarily sufficient to quantify the association of interest, to give the full story, if you will. That doesn't stop people from reporting only one or the other, though. So, it's something to think about when you hear results quoted, in the press especially.
OK. So, let's go back to the sample of 1,000 HIV-positive patients from a citywide clinical population that we were looking at in section A. You may recall we were looking at the overall response to antiretroviral therapy in this sample. We had 1,000 patients, and 206 of them responded. As such, the proportion responding in our sample was 20.6 percent, or 21 percent as we characterized it. So now, let's break out the sample by the subjects' CD4 counts at the start of therapy.
And, let's make it binary and classify subjects by whether their CD4 count at the start of therapy was less than 250 or greater than or equal to 250. Within each of those two CD4 count groups, let's look at the number who responded to the therapy and the number who didn't. So, nearly half the sample, actually a little over half, had CD4 counts of less than 250 at the start of therapy, and 127 of these 503 persons responded to therapy. Of the 497 persons who had CD4 counts of greater than or equal to 250 at the start of therapy, 79 responded to the therapy.
So, a commonly used tool for displaying data where the outcome of interest, in this case response or not, is binary, and the classifier or predictor is also binary, is a two-by-two table: two rows and two columns. What I put here in the rows is the response, responded or did not respond, and in the columns is the CD4 count. At the bottom of the CD4 count less than 250 column, I put the total number of people in the sample who have CD4 counts of less than 250, which is 503. Where the row corresponding to responding meets the column corresponding to CD4 less than 250, I put the number of persons with CD4 counts of less than 250 who responded to therapy, 127. Similarly, where the row for not responding meets the column for CD4 count less than 250, I put the number who didn't respond in that group, the remaining 376 out of the 503. You can fill out the rest of the table. You will see from the row totals that, across the two CD4 count groups, there were 206 persons who responded and 794 who didn't. And we had already seen that the overall response, ignoring CD4 counts, was 20.6 percent.
So, how can we actually summarize the difference in response between the CD4 count groups? Well, there are several different ways. But to get started, let's look at the sample proportion responding in each group based on the CD4 counts.
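As a rough illustration of the two-by-two table just described, the sketch below lays out the counts in Python and recomputes the margins. The dictionary layout and variable names are illustrative choices, not part of the lecture, and the 418 non-responders in the higher CD4 group is derived here as 497 minus 79 rather than quoted from the lecture.

```python
# Counts from the lecture's HIV example: response to therapy by baseline CD4 count.
# Columns of the table are the CD4 groups; rows are the outcomes.
table = {
    "CD4 < 250": {"responded": 127, "did not respond": 376},
    # 418 is derived (497 - 79); the lecture leaves this cell for the listener.
    "CD4 >= 250": {"responded": 79, "did not respond": 497 - 79},
}

# Column totals: one per CD4 group.
column_totals = {group: sum(cells.values()) for group, cells in table.items()}

# Row totals: one per outcome, summed across the two CD4 groups.
row_totals = {
    outcome: sum(cells[outcome] for cells in table.values())
    for outcome in ("responded", "did not respond")
}

grand_total = sum(column_totals.values())
overall_response = row_totals["responded"] / grand_total

print("Column totals:", column_totals)
print("Row totals:", row_totals)
print(f"Overall proportion responding: {overall_response:.3f}")
```

The margins should reproduce the figures quoted above: 503 and 497 per CD4 group, 206 responders and 794 non-responders overall.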
So, of the 503 persons who had CD4 counts of less than 250 when they started therapy, 127 responded, or about 25 percent. Compare this with 79 of the 497 persons who had CD4 counts of greater than or equal to 250 at the start of therapy, which gives a proportion of about 16 percent. So, one way to characterize the difference in response between these two CD4 count groups in the study is to take what's called the difference in proportions, also called the risk difference or attributable risk. If we take the proportion who responded in the group with CD4 counts of less than 250 at the start of therapy and subtract the proportion who responded in the higher CD4 count group, we get an absolute difference of positive nine percent: 0.25 - 0.16 = 0.09.
So, how can we interpret this 0.09 or nine percent? We could say that there is a nine percent greater response to therapy in the CD4 count less than 250 group as compared to the CD4 count greater than or equal to 250 group. Or there's a nine percent greater absolute risk of response to therapy. And it's proper to say risk of response to therapy: even though we tend to think of risk as being associated with negative outcomes, risk is just a synonym for proportion or probability. So, a nine percent greater absolute risk of response to therapy in the CD4 less than 250 group as compared to the CD4 greater than or equal to 250 group.
Another way to measure this uses the exact same two numbers but gives us something that looks different. Instead of taking the difference in these risks or proportions, we take the ratio. This is sometimes called the ratio of proportions, the relative risk, or the risk ratio. So in this case, if we take the 25 percent who responded in the lower CD4 count group and divide by the 16 percent who responded in the higher CD4 count group, we get a ratio of 1.56. How can we interpret this measure?
Well, we could say those in the lower CD4 count group, the CD4 less than 250 group, have 1.56 times the chance or risk of responding to therapy as compared to the CD4 greater than or equal to 250 group. So, they have a higher chance of responding. Another way to quantify this increase is to say there is a 56 percent greater relative risk of response to therapy in the CD4 less than 250 group as compared to the CD4 greater than or equal to 250 group. How did I get that? Well, you could take (0.25 - 0.16)/0.16, that is, the risk difference of 0.09 divided by 0.16, and that's equal to 0.56. The numerator of the ratio is 56 percent greater than the denominator. Notice this 56 percent relative increase is numerically different from the nine percent absolute increase we saw on the previous slide.
We've laid out how to compute these two numbers. Let's look at another example of how to compute them, and then we'll really get into the subtleties of interpreting them and why they differ numerically.
So, this is that classic maternal-infant HIV transmission study, where pregnant mothers with HIV were randomized to receive either AZT during pregnancy or a placebo. Before, we looked at the overall proportion of infants who contracted HIV across the 363 births whose outcomes were known in terms of HIV; we looked at them all together. Now, we're going to split the results out by group: infants born to mothers given AZT during pregnancy versus those born to mothers given placebo. And the results said: from April 1991 through December 20, 1993, the cutoff date for the first interim analysis of efficacy, 477 pregnant women were enrolled. During the study, 409 gave birth to 415 live-born infants. HIV infection status was known for 363 births, 180 in the AZT group. Again, I'm calling zidovudine AZT; it's the more commonly referenced name and easier to pronounce.
So, 180 in the AZT group and 183 in the placebo group. Thirteen of the 180 infants in the AZT group were HIV positive within 18 months of birth, as compared to 40 in the placebo group. So, I've taken this verbal summary and translated it into a two-by-two table, where the rows represent the outcome, either the child was HIV positive within 18 months or HIV negative, and the columns represent whether the child was born to a mother who was given AZT during pregnancy or a placebo. So, here's that two-by-two table; let's now go ahead and summarize.
If we broke this out and looked at the sample proportions, and I'll leave you to do this and verify, the sample proportion of children who contracted HIV within 18 months among those born to mothers in the AZT group was 13 out of 180, or roughly seven percent. If we do the same computation for the placebo group, there were 183 infants born to mothers who were given the placebo during pregnancy, and 40 of them contracted HIV within 18 months. So, 40 out of 183 was the proportion who contracted HIV in the placebo group, and that is roughly 22 percent.
So, if we take this difference in proportions, this risk difference, it is the seven percent of infants who contracted HIV among mothers given AZT minus the 22 percent who contracted HIV among mothers given the placebo: 0.07 - 0.22 = -0.15, or -15 percent. That number is negative because the proportion with the outcome was smaller in the first group, the AZT group. So, how do we interpret this? We could say there is a 15 percent absolute reduction in HIV transmission to children born to mothers given AZT as compared to children born to mothers given the placebo. Or, we can say there is a 15 percent lower absolute risk of HIV transmission to children born to mothers given AZT compared to those born to mothers given placebo. Those are two ways of phrasing that. Now, let's look at summary measure number two.
This is the ratio of proportions, also called, again, the relative risk or risk ratio. I'll represent that with RR-hat. And we're going to keep the same direction: we're going to compare those in the AZT group to those in the placebo group. But instead of taking the difference of those two numbers, we'll take the ratio. So, we take that 7 percent of infants who contracted HIV in the AZT group and divide it by the 22 percent who contracted it among mothers given the placebo. This relative risk is 0.32.
So, how can we interpret that 0.32? There are a couple of different ways of saying this. We could say the risk of mother-to-child HIV transmission for mothers given AZT is 0.32 times the risk of mother-to-child HIV transmission among mothers given placebo. That's a perfectly legitimate way to say this, but it may not drive the point home that the group whose mothers got AZT had lower risk. So, another way to say this, and to effectively communicate that there was a decrease when explaining this ratio, is to say there was a 68 percent lower relative risk of mother-to-child HIV transmission for mothers given AZT.
How did I get that 68 percent reduction? Well, technically you could sort of intuit it by taking one and subtracting this number that's less than one: 1 - 0.32 gives you 0.68. But literally, this means taking that absolute difference, 0.07 - 0.22, that negative 15 percent, and dividing it by the risk in the group in the denominator, that 22 percent. That -0.15 over the risk in the placebo group of 0.22 is -0.68. It's 68 percent lower: 0.15 is 68 percent of 0.22.
So well, what gives here? We've got two different measures using the same two inputs that look very different numerically. So, let's first concentrate on the difference in interpretation; we'll get that squared away first, and then we'll talk about why they differ numerically. So, the risk difference versus the relative risk.
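The arithmetic in both examples follows the same three steps, which can be sketched in Python as below. The function name `compare_risks` and its return order are illustrative assumptions, not something from the lecture. The inputs are the rounded proportions quoted in the lecture, with one extra call on the raw AZT counts to show the effect of rounding.

```python
def compare_risks(p1, p2):
    """Compare group 1 to group 2, given each group's risk (sample proportion).

    Hypothetical helper for illustration. Returns a tuple of:
      risk difference  = p1 - p2
      relative risk    = p1 / p2
      relative change  = (p1 - p2) / p2, which equals relative risk - 1
    """
    risk_difference = p1 - p2
    relative_risk = p1 / p2
    relative_change = risk_difference / p2
    return risk_difference, relative_risk, relative_change


# CD4 example: lower CD4 group (about 25%) versus higher CD4 group (about 16%).
cd4 = compare_risks(0.25, 0.16)

# AZT example: AZT group (about 7%) versus placebo group (about 22%).
azt = compare_risks(0.07, 0.22)

# Same AZT comparison from the raw counts, without rounding the proportions first.
azt_raw = compare_risks(13 / 180, 40 / 183)

for label, (rd, rr, rel) in [("CD4 (rounded)", cd4),
                             ("AZT (rounded)", azt),
                             ("AZT (raw counts)", azt_raw)]:
    print(f"{label}: risk difference = {rd:+.2f}, "
          f"relative risk = {rr:.2f}, relative change = {rel:+.0%}")
```

With the rounded inputs this reproduces the lecture's figures (0.09, 1.56, and 56 percent for CD4; -0.15, 0.32, and -68 percent for AZT). Working from the raw counts moves the AZT relative risk slightly, to about 0.33, which is only a rounding effect and does not change the interpretation.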
Let's get to the substantive interpretation and the difference between them. Both measures use the exact same inputs, the same two numbers, but give seemingly different results. The risk difference here indicated there was a 15 percent reduction in HIV transmission. The relative risk indicated there was a 68 percent reduction in HIV transmission. How can the reduction be both 15 percent and 68 percent? Seems strange, doesn't it? Well, they are different ways of measuring this reduction. Notice they both agree in terms of the direction: they both show a reduction for those infants born to mothers given AZT compared to the group whose mothers were given placebo. So, they have each other's back in terms of showing that AZT is protective.
Well, the substantive interpretation of the risk difference gives some window into impact, assuming causation, that is, assuming that AZT is the thing that makes this reduction happen. And given that this is a randomized trial, that's a pretty safe conclusion. The risk difference can be interpreted as the impact, assuming causation, at the population level: how much of an impact would this treatment, intervention, or exposure have if it were applied to a fixed number of persons? So, for example, with this risk difference of negative 15 percent, we can say something like this: if we had a population where there were an estimated 1,000 HIV-positive pregnant women per year, we'd expect to see 15 percent fewer transmissions, which translates to 150 fewer mother-to-child transmissions, if these 1,000 women were given AZT in pregnancy as opposed to not. So, this gives us some window. If I am the health minister of a country or the health commissioner of a city or town, and I can measure the burden of HIV amongst pregnant women in any given year, I can use this risk difference to help me understand the impact the treatment could have on my population.
If I were a health commissioner for a larger population, where I would expect to see about 50,000 HIV-positive pregnant women in a given year, we'd expect to see 7,500, which is 15 percent of 50,000, fewer mother-to-child transmissions if these 50,000 women were given AZT as opposed to not treated. That's how we would interpret this. And so this can help us understand resource allocation and the burden of the problem here, mother-to-infant HIV transmission: how much of an impact could we have on the actual number of transmissions?
How about the relative risk? What's the difference in substantive interpretation there? Well, this can be interpreted as the impact, assuming causation, at the personal level, the individual level. So, for example, with this relative risk we've seen of 0.32, we could say something like: the risk that an HIV-positive mother who takes AZT during pregnancy transmits HIV to her child is 0.32 times her risk if she did not take AZT. Or, in other words, the risk that an HIV-positive mother transmits HIV to her child is 68 percent lower if she takes AZT during pregnancy as compared to if she does not. So, we can use this to advise individual women about the potential outcomes for their births given the treatment versus not.
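The population-level reading of the risk difference amounts to multiplying it by the number of people treated. A minimal sketch follows, assuming (as the lecture does) that the trial's risk difference of -0.15 carries over to the population in question. The function name `expected_change_in_events` is a hypothetical helper, not from the lecture.

```python
def expected_change_in_events(risk_difference, population_size):
    """Expected change in the number of events if everyone in the population
    were treated rather than untreated. A negative result means fewer events.

    Hypothetical helper; assumes the risk difference applies to this population.
    """
    return risk_difference * population_size


AZT_RISK_DIFFERENCE = -0.15  # AZT minus placebo, from the lecture's rounded figures

# The two population sizes used as examples in the lecture.
for n_women in (1_000, 50_000):
    change = expected_change_in_events(AZT_RISK_DIFFERENCE, n_women)
    print(f"{n_women:>6,} HIV-positive pregnant women per year: "
          f"about {abs(round(change)):,} fewer mother-to-child transmissions")
```

The relative risk of 0.32 has no counterpart to this calculation on its own: turning it into a number of transmissions avoided also requires the baseline risk in the untreated group, which is one reason neither measure alone gives the full picture.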
