A practical and example-filled tour of simple and multiple regression techniques (linear, logistic, and Cox PH) for estimation, adjustment, and prediction.

From the course by Johns Hopkins University

Statistical Reasoning for Public Health 2: Regression Methods

From the lesson

Module 2B: Effect Modification (Interaction)

Effect modification (Interaction), unlike confounding, is a phenomenon of "nature" and cannot be controlled by study design choice. However, it can be investigated in a manner similar to that of confounding. This set of lectures will define and give examples of effect modification, and compare and contrast it with confounding.

- John McGready, PhD, MS, Associate Scientist, Biostatistics

Bloomberg School of Public Health

Hello, everyone. In this section,

we're just going to do a short overview of what we've just been talking about in

Lecture Sets 4 and 5.

And even in the beginning of Lecture Set 5,

I wanted to talk about distinguishing confounding from effect modification.

And the reason I'm so hip to doing this is because these two

concepts are easily confused because the language around them sounds similar.

And a lot of times people have difficulty keeping the differences straight.

So I am going to take one more step in comparing and contrasting these two

things now that we have seen examples of both and the way to investigate each.

So this short lecture section will reinforce the difference between

confounding and effect modification in terms of each phenomenon,

how to investigate each, and I'll just talk very briefly about implications with

regard to designing studies.

So confounding is a possibility in non-randomized studies as we discussed,

and, quite truthfully, it could occur with small probability in randomized studies,

but generally randomization minimizes the threat of confounding.

And the nice thing about randomization is it minimizes the threat of confounding by

things that we could conceptualize as potential confounders and

things we would never think to

measure if we were not doing a randomized study.

So confounding of a two-variable relationship,

let's say between two variables, Y and X, can occur if a third variable, Z, or

other factors or variables, Z1, Z2, etc., are related to both Y and X.

And as we've seen via example, confounding results in

a distorted estimate, either higher than it should be or

lower than it should be, of the Y/X relationship via the crude association.

The crude association is impacted by imbalances in

the distribution of the confounding factor or factors between the exposure groups.

Effect modification, on the other hand, is a possibility regardless of study design.

The potential for effect modification is not minimized by randomization.

Effect modification for a two-variable relationship, say Y and X again,

can occur through another factor, Z.

And we could extend this to multiple factors, but

it gets very complicated to think of multifactor effect modification.

For purposes of this course, and for most courses, even in advanced statistics,

the focus will be modification by a single other factor.

So effect modification for a two-variable relationship can occur

if a third variable is related to the association between Y and X.

This third variable is not necessarily related to Y, to X, or to both.

But it is related to the association between the two.

So ignoring, or failing to investigate, effect modification may result in

estimating one overall Y/X association when separate,

group-specific estimates may be more appropriate.

For example, we might estimate one overall effect of a drug on a certain outcome for

men and women combined,

when in fact the drug may be more beneficial for one of those groups and

not so effective for the other.

So if we did not investigate the effect modification potential of

sex in that example there, we might have missed a key piece of

the story about the relationship between the drug and the condition.

So how do we assess confounding?

Well, confounding can be controlled for in the estimation process.

An adjusted Y/X association can be estimated,

adjusted for a measured potential confounder or a set of confounders.

And we will soon learn that multiple regression allows for

relatively easy adjustment for one or more potential confounders, and

we'll show how to interpret the results from multiple regressions that do this.

Once we have adjusted estimates, confounding can be

assessed, both whether there is confounding or

not and the degree of confounding, by comparing the crude, overall,

unadjusted Y/X association and its resulting uncertainty, the confidence

interval, to the adjusted Y/X association and its confidence interval.

And ultimately, the decision about whether the adjusted and unadjusted estimates are

different qualitatively has to come from an expert in the subject matter.

But we have the tools to do this and assess confounding if we

have both an estimate of the unadjusted association and of the association adjusted for

the potential confounders of interest.
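
The comparison of a crude and an adjusted estimate can be sketched with simulated data. This is only an illustration under stated assumptions: the data are simulated, the confounder Z is binary, the "true" effect of X on Y is set to 1.0, and adjustment is done by averaging the Z-specific differences (stratification) as a stand-in for the multiple regression adjustment covered later. All function names and numeric values are hypothetical.

```python
import math
import random
from statistics import mean, variance


def simulate(n=4000, seed=1):
    """Simulate (y, x, z) rows where binary Z is related to both X and Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        # Z is related to X: exposure is more common when z == 1 (assumed values).
        x = 1 if rng.random() < (0.7 if z else 0.3) else 0
        # Z is related to Y: it adds 2.0. The assumed true effect of X is 1.0.
        y = 1.0 * x + 2.0 * z + rng.gauss(0, 1)
        rows.append((y, x, z))
    return rows


def mean_difference(rows):
    """Difference in mean Y (exposed minus unexposed) and its standard error."""
    exposed = [y for y, x, _ in rows if x == 1]
    unexposed = [y for y, x, _ in rows if x == 0]
    diff = mean(exposed) - mean(unexposed)
    se = math.sqrt(variance(exposed) / len(exposed)
                   + variance(unexposed) / len(unexposed))
    return diff, se


def adjusted_difference(rows):
    """Average the Z-specific differences, weighting each stratum by its size."""
    total = len(rows)
    estimate, var = 0.0, 0.0
    for level in (0, 1):
        stratum = [row for row in rows if row[2] == level]
        weight = len(stratum) / total
        diff, se = mean_difference(stratum)
        estimate += weight * diff
        var += (weight * se) ** 2  # treats the stratum weights as fixed
    return estimate, math.sqrt(var)


def ci95(estimate, se):
    """Large-sample 95% confidence interval."""
    return estimate - 1.96 * se, estimate + 1.96 * se


rows = simulate()
crude, crude_se = mean_difference(rows)
adjusted, adjusted_se = adjusted_difference(rows)
for label, est, se in (("crude", crude, crude_se), ("adjusted", adjusted, adjusted_se)):
    low, high = ci95(est, se)
    print(f"{label:9s} {est:.2f}  95% CI ({low:.2f}, {high:.2f})")
```

In this setup the crude difference is pulled away from the assumed true effect because the exposed group contains more subjects with Z = 1, while the adjusted estimate should land near 1.0. Whether a gap of that size matters is still a subject-matter judgment.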

Effect modification can also be investigated in the estimation process.

But this requires us not just to adjust for

the potential third factor of interest.

We actually have to look at, or estimate, a separate Y/X association for

separate values of a potential effect modifier.

For example, we might look at the relationship between disease and

treatment separately for males and females,

or separately by age group across three different age groups.

We will also soon learn that multiple regression analysis allows for

this to be done relatively efficiently and easily as well.

Once we've done this, however,

effect modification can be assessed, as well as the degree of effect

modification, by comparing the separate Y/X associations, the estimates and

the 95% confidence intervals, across the values of a potential effect modifier.

So for example, take the relative risk of relapse for patients with a given disease

on a drug versus placebo. We might want to look at that estimate separately for

males and females, along with the resulting confidence intervals, if we wish to

see whether sex modifies the relationship between relapse and the drug.

Multiple regression will actually allow us to do a formal hypothesis test of

interaction as well, to test whether the difference in the association between

subgroups is statistically significant or not.
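
As a rough sketch of the sex-specific comparison described here, the snippet below computes a relative risk and a large-sample 95% confidence interval for each sex, then compares the two on the log scale. The relapse counts are invented for illustration, and the comparison is a simple Wald z-test on two independent log relative risks, used here as a stand-in for the regression-based interaction test rather than the method the course goes on to teach.

```python
import math


def relative_risk(events_drug, n_drug, events_placebo, n_placebo):
    """Relative risk (drug vs placebo), its log-scale standard error, and a 95% CI."""
    rr = (events_drug / n_drug) / (events_placebo / n_placebo)
    se_log = math.sqrt(1 / events_drug - 1 / n_drug
                       + 1 / events_placebo - 1 / n_placebo)
    ci = (math.exp(math.log(rr) - 1.96 * se_log),
          math.exp(math.log(rr) + 1.96 * se_log))
    return rr, se_log, ci


def interaction_z_test(rr_a, se_a, rr_b, se_b):
    """Wald test that two independent log relative risks are equal."""
    z = (math.log(rr_a) - math.log(rr_b)) / math.sqrt(se_a ** 2 + se_b ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p_value


# Hypothetical counts: (relapses on drug, n on drug, relapses on placebo, n on placebo)
counts = {"male": (20, 200, 40, 200), "female": (38, 200, 40, 200)}

results = {sex: relative_risk(*c) for sex, c in counts.items()}
for sex, (rr, _, ci) in results.items():
    print(f"{sex}: RR = {rr:.2f}, 95% CI ({ci[0]:.2f}, {ci[1]:.2f})")

z, p_value = interaction_z_test(results["male"][0], results["male"][1],
                                results["female"][0], results["female"][1])
print(f"test of equal RRs: z = {z:.2f}, p = {p_value:.3f}")
```

Reading the two intervals side by side gives the informal comparison, and the z-test gives a single formal summary of whether the two relative risks differ by more than chance would explain.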

Well, in order to actually deal with confounding or

effect modification, if we are designing the study, there

may be implications for how it is designed.

So I'm just going to talk about this briefly, just something to think about.

This is certainly the case when it comes to confounding.

The potential for confounding can be minimized by designing a randomized study

to investigate a relationship between two things, an outcome and an exposure.

However, as we talked about in the beginning of Statistical Reasoning 1,

much research can only be done observationally.

It is not ethical or otherwise possible in many cases to randomize people

to different exposure groups: smoking and non-smoking, socioeconomic status, etc.

But these are exposures of interest on health outcomes.

But in observational studies confounding is always a possibility.

So it's not possible to minimize the potential for confounding in

an observational study, but the researchers can conceptualize potential

confounders of the key relationship or relationships under study.

If this can be done before the start of the study,

then these can be measured as part of the study.

So this is ideal, because if we've measured these potential confounders,

then the key associations of interest can be adjusted for them and

confounding can be assessed.

So once that occurs the associations can be adjusted for

potential confounders that have been measured.

Another approach has sometimes been used.

We won't talk about it much in this course other than here.

Instead of adjusting for potential confounders after the study is

completed, an observational study can move forward, if it's prospective,

with exposed and unexposed subjects for

the exposure of primary interest matched on similar potential confounders,

so long as these confounders were measured at the start of the study.

These are things that are static and do not change as the study evolves over time,

things like the age at the start of the study, the sex of the person, etc.

And so by matching, the researchers can

reduce the systematic difference between those who are exposed and unexposed.
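
One simple way to picture matching at baseline is exact matching on a couple of static characteristics. The sketch below pairs each exposed subject with one unused unexposed subject who has the same age group and sex. The records, field names, and the one-to-one exact-match rule are all assumptions made for illustration; real matched designs often use more flexible rules.

```python
from collections import defaultdict


def match_pairs(exposed, unexposed, keys=("age_group", "sex")):
    """Pair each exposed subject with one unused unexposed subject sharing the same baseline values."""
    pool = defaultdict(list)
    for subject in unexposed:
        pool[tuple(subject[k] for k in keys)].append(subject)

    pairs, unmatched = [], []
    for subject in exposed:
        candidates = pool[tuple(subject[k] for k in keys)]
        if candidates:
            # Remove the chosen control so it cannot be matched twice.
            pairs.append((subject, candidates.pop()))
        else:
            unmatched.append(subject)
    return pairs, unmatched


# Hypothetical baseline records
exposed = [
    {"id": 1, "age_group": "40-49", "sex": "F"},
    {"id": 2, "age_group": "50-59", "sex": "M"},
    {"id": 3, "age_group": "60-69", "sex": "M"},
]
unexposed = [
    {"id": 101, "age_group": "50-59", "sex": "M"},
    {"id": 102, "age_group": "40-49", "sex": "F"},
    {"id": 103, "age_group": "40-49", "sex": "M"},
]

pairs, unmatched = match_pairs(exposed, unexposed)
for e, u in pairs:
    print(e["id"], "matched with", u["id"])
print("unmatched exposed ids:", [s["id"] for s in unmatched])
```

Within the matched pairs, the exposed and unexposed groups have identical age-group and sex distributions, so those two factors cannot be imbalanced between the groups. Anything not in the matching keys is left unbalanced.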

But whether the researcher is going to adjust when all the data is in,

adjusting for the potential confounders that were measured as part of the study,

or prefers to match at the start of the study on some of

the potential confounders that were measured at baseline,

the difficulty with observational studies is that these things, whether they are used

for adjustment or for matching, have to be measured.

They have to be conceptualized and measured.

And so the nagging difficulty with interpreting the results

from observational studies is that there may be confounding by

factors that were not measured and therefore could not be adjusted for.

That's one of the things that makes randomization ideal, because in theory

randomization balances the exposed and

unexposed groups on confounders that a researcher could think of in advance, and

ones that would never enter their thought process.

What are the study design considerations, potentially, for

effect modification?

Well, at face value it sounds like there wouldn't be any, because

the potential for effect modification is not affected by study design.

However, it is best as a researcher to have a sense of

any potential effect modifiers of interest prior to designing a study.

This will first and foremost,

limit the number of investigations done once the data is collected.

In other words, researchers will not, to be colloquial here,

drive themselves crazy looking at all possibilities for effect modification,

every possible interaction of interest.

But the other reason that is potentially important, and

this does have implications for the design of the study, is that this will allow

the study to be powered to detect an effect modification of interest

with a certain level of power, if there are one or two effect modifiers that

the researchers are really interested in investigating as part of the study.

Just a note on this: when the FDA in the United States

started doing clinical trials, they mainly used men in the research studies.

And obviously that was faulty logic, because men and women are very different

biologically and the results for males may not be generalizable to females.

So at some point somebody raised this issue and

it became the norm to include men and women in clinical drug trials.

But at some point it became important to include enough men and women

in the trials such that not only could the overall association between the outcome of

interest and the drug be estimated with a certain level of precision,

and with a certain amount of power to detect an association between the drug

and the outcome, but this also extended to looking at

the drug-outcome relationship separately for males and

females, and there was a need to have enough men and women,

enough of each sex in the study, to be able to estimate sex-specific

associations with a reasonable level of precision and detect a difference of some

degree in these associations, were one to exist in the population at large.

So it wasn't enough to just have men and

women; these trials have to be designed to have enough men and

women such that sex-specific estimates can be made with a certain level of precision.
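
To see why the number enrolled from each sex matters for precision, here is a minimal sketch that compares two hypothetical ways of enrolling 1,000 people. It uses the usual large-sample standard error of a log relative risk with expected counts, and reports how many times larger the upper 95% confidence limit is than the lower one. The relapse risks (10% on drug, 20% on placebo), the enrollment splits, and the function name are assumptions chosen only for illustration.

```python
import math


def rr_ci_ratio(n_per_arm, risk_drug=0.10, risk_placebo=0.20):
    """Expected ratio of upper to lower 95% CI limits for a relative risk.

    Uses the large-sample standard error of log(RR) with expected event counts.
    """
    se_log = math.sqrt((1 - risk_drug) / (n_per_arm * risk_drug)
                       + (1 - risk_placebo) / (n_per_arm * risk_placebo))
    return math.exp(2 * 1.96 * se_log)


# Two hypothetical designs; within each sex, half get the drug and half get placebo.
designs = {
    "mostly men": {"men": 900, "women": 100},
    "balanced": {"men": 500, "women": 500},
}

for name, enrolled in designs.items():
    for sex, n in enrolled.items():
        ratio = rr_ci_ratio(n // 2)
        print(f"{name:10s} {sex:5s} n={n:3d}  CI upper/lower ratio = {ratio:.1f}")
```

A larger ratio means a wider, less informative interval. Under these assumptions the estimate for women in the mostly-male design is far less precise than in the balanced design, even though the total enrollment is the same.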

So just to give you an example.

Recall, in the first section, section A, we looked at an example of

an observational study done from sort of an environmental health perspective,

where 64 sites in the eastern US were looked at in terms

of the tree damage in those sites, and the elevation of the sites was measured.

We looked at this and we saw that for the overall

unadjusted relationship between percent of damaged trees on the site and elevation,

there was no association.

The slope was very small and not statistically significant.

When this association was adjusted for

regional differences between the sites, then we saw a positive, statistically

significant association between degree of damage and increased elevation.

But upon further investigation, I showed you that the results look

different between the Northern and Southern sites.

That is, the relationship between damage and

elevation did not look to be the same.

And in fact we saw statistical differences: if you

look at the confidence intervals for these two estimated slopes, they do not overlap.

But this study was certainly not designed to look at this type of interaction or

effect modification.

There were only eight sites chosen in the South.

So I don't feel particularly comfortable about the precision of

our ability to quantify the association between damage and

elevation in the South,

even if in this small-sample study we saw a difference between the North and South.

So as a researcher, I might be interested in taking the study to the next

level and designing it not only to estimate more precisely the relationship

between damage and elevation, but to do so within these two regional subgroups, and

have a certain power to detect a difference of a certain magnitude.

So using this preliminary data, were I to be able to get funding and

go forward, I might design a study where I sampled

more sites in both the North and the South,

to achieve a certain power to be able to detect a difference in

the association between damage and

elevation in these regions, were one to exist at the population level.

So again, this study was not designed to precisely estimate the damage-elevation

association separately by region, but as we discussed,

were I or another researcher interested in designing a follow-up study to better and

more precisely quantify regional differences,

we could design a study that had enough observations in both the North and South,

enough sites, to detect a difference in the relationship between damage and elevation

between the North and the South with a certain level of precision or power.
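
The idea of designing for power to detect a regional difference in slopes can be sketched with a small simulation. The 56 Northern and 8 Southern sites follow from the 64 sites and 8 Southern sites mentioned in the lecture; everything else is assumed for illustration: elevation rescaled to run from 0 to 1, a true slope of 0 in the North and 2.5 in the South, noise with standard deviation 1, and a large-sample z-test on the difference in slopes (which is only a rough approximation with as few as 8 sites).

```python
import math
import random


def fit_slope(xs, ys):
    """Least-squares slope and its standard error for a simple linear regression."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    intercept = ybar - slope * xbar
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return slope, math.sqrt(rss / (n - 2) / sxx)


def simulate_region(rng, n_sites, slope, sigma):
    """Simulate elevation (rescaled to 0-1) and damage for one region."""
    xs = [rng.uniform(0, 1) for _ in range(n_sites)]
    ys = [slope * x + rng.gauss(0, sigma) for x in xs]
    return xs, ys


def estimated_power(n_north, n_south, slope_north=0.0, slope_south=2.5,
                    sigma=1.0, n_sims=400, seed=7):
    """Share of simulated studies in which the slopes differ at the 5% level."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        b_n, se_n = fit_slope(*simulate_region(rng, n_north, slope_north, sigma))
        b_s, se_s = fit_slope(*simulate_region(rng, n_south, slope_south, sigma))
        z = (b_n - b_s) / math.sqrt(se_n ** 2 + se_s ** 2)
        if abs(z) > 1.96:
            rejections += 1
    return rejections / n_sims


power_original = estimated_power(n_north=56, n_south=8)
power_follow_up = estimated_power(n_north=56, n_south=40)
print(f"estimated power with 8 Southern sites:  {power_original:.2f}")
print(f"estimated power with 40 Southern sites: {power_follow_up:.2f}")
```

Under these assumptions, adding Southern sites raises the chance of detecting the same underlying difference in slopes. A real design calculation would replace the assumed slopes and noise with values informed by the preliminary data.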

So anyway, I hope this brief summary pulled together some ideas that we

were working on in Lecture Sets 4 and 5,

and we will continue to talk about adjusting for confounding

and assessing confounding by comparison of unadjusted and adjusted estimates.

And we will also show how to test for effect modification while considering the other

factors of interest on the outcome in a multiple regression framework.
