0:01

Now we will be moving on to Section C of the design lecture, and in this section we're going to talk about testing for hypotheses other than superiority. Frequently, when we think of the hypothesis of interest in a clinical trial, we think of the superiority hypothesis: is treatment A better than treatment B, or is treatment B better than treatment A? In this section we'll be talking about designs where we're testing for equivalence or non-inferiority instead of superiority. These are designs that can be used to compare a new intervention to an established intervention. When we use one of these designs, we might think that treatment A is as good as, or the same as, treatment B for treating or preventing a specific condition, but we believe that the use of treatment A might have some other kind of benefit, such as less severe adverse events; or treatment A might be easier to administer than treatment B, or treatment A might be cheaper than treatment B. Another use of these designs is to do head-to-head comparisons of two or more established treatments for a specific condition. This use has been discussed recently quite a bit with respect to comparative effectiveness research.

Â 1:11

First, I'm going to introduce the equivalence design. In the equivalence design, the objective is to show that the intervention response falls sufficiently close to the control response. That is, we are trying to demonstrate the equivalence of the two treatments. We could never show that the two treatments are exactly equivalent, because that would require an infinite sample size. So with the equivalence design, an important question that we have to address very early on in the design process is: how large can the difference be between two treatments for the treatments to be considered equivalent? Usually we want that detectable difference to be extremely small. We want to say that the difference between the treatments is within a certain small margin in order to call the two treatments equivalent. If the difference that we observe is larger than the margin that we've set, we would say that these two treatments are not equivalent. In an equivalence design, we also want to make sure that we have a high probability of detecting a difference if it's larger than the small margin that we've defined. So for both of these reasons, to rule out large differences and to have a large probability of detecting a difference should it exist, we need a large sample size for equivalence designs.

Â 2:21

So as with the superiority design, the comparison that we want to make in an equivalence design is between a null and an alternative hypothesis. However, for an equivalence design, we flip the way we define these two hypotheses. That is, where for a superiority design we are used to saying that the null hypothesis is that there is no difference between the treatments, for an equivalence design we say that the null hypothesis is that there is a difference between the two treatments. And our alternative hypothesis for the equivalence design is that there's no difference between the two treatments. So then, since we've flipped our null and alternative hypotheses, we are also essentially flipping our Type I and Type II errors. So for an equivalence design, the Type I error is to show no difference when there is one, and the Type II error is to show a difference when there isn't one.
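The flipped hypotheses can be sketched with the two one-sided tests (TOST) procedure: we reject the null of "the treatments differ by at least the margin" only if the observed difference is significantly above minus the margin and significantly below plus the margin. This is a minimal illustration, not a production analysis; the margin and the simulated response data below are hypothetical.

```python
import numpy as np
from scipy import stats

def tost_equivalence(x, y, margin, alpha=0.05):
    """Two one-sided tests (TOST) for mean equivalence.

    H0: |mean(x) - mean(y)| >= margin  (the treatments differ)
    H1: |mean(x) - mean(y)| <  margin  (the treatments are equivalent)
    Rejecting H0 requires BOTH one-sided tests to be significant.
    """
    nx, ny = len(x), len(y)
    diff = np.mean(x) - np.mean(y)
    se = np.sqrt(np.var(x, ddof=1) / nx + np.var(y, ddof=1) / ny)
    df = nx + ny - 2  # simple pooled-df approximation
    p_lower = 1 - stats.t.cdf((diff + margin) / se, df)  # H0: diff <= -margin
    p_upper = stats.t.cdf((diff - margin) / se, df)      # H0: diff >= +margin
    p = max(p_lower, p_upper)
    return p, p < alpha

rng = np.random.default_rng(0)
a = rng.normal(10.0, 1.0, 200)  # hypothetical treatment A responses
b = rng.normal(10.0, 1.0, 200)  # hypothetical treatment B responses
p, equivalent = tost_equivalence(a, b, margin=0.5)
```

Note how the burden of proof is reversed relative to a superiority test: a large observed difference, or a wide standard error, makes the p-value large, so we fail to declare equivalence.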

Â 3:13

I pulled this example of an equivalence design from PubMed. In this study, which was coordinated by the Jaeb Center in Tampa, the objective was to compare two treatments for moderate amblyopia in children ages 7 to 12 years old. The two treatments were weekend atropine or patching of the sound eye for two hours a day. The investigators in this study had previously conducted another trial where they tested the combination of patching and atropine, and they found that this combination was effective in treating children with amblyopia. But even after the trial, most health care providers still did not initiate combination therapy for children with amblyopia.

Â 3:51

So the investigators decided to test if the two therapies were equivalent to one another when used individually. The children in the study were seen for follow-up visits at 5 and 17 weeks following enrollment, and the primary outcome was visual acuity, controlling for baseline acuity. The study was designed to test the equivalence of patching and atropine. The equivalence limit was five letters, or one line, on the ETDRS chart. That is, the investigators felt that they should rule out a difference of more than one line on the ETDRS chart between the two groups in order to call the two treatments equivalent to one another.

Â 4:31

The last design that we're going to talk about in this section is the non-inferiority design. This is another example of testing a hypothesis other than superiority. In this case, the objective is to determine whether a new treatment is at least as good as an established treatment. To do this, we test to see if the hypothesis that the new treatment is worse than the established treatment can be rejected. So our null hypothesis is that the new treatment is worse than the established treatment, and to reject this hypothesis, we need evidence to show that the new treatment is at least as good as the established treatment. You'll note that this type of statistical test is, by definition, one-sided. In other words, the observed estimates from which we would reject the null hypothesis are located entirely in one tail of the probability distribution of the outcome. Operationally, we need to show that the new treatment's response, if worse, is still sufficiently close to the established treatment's response, so that we are comfortable with saying that the new treatment is as good as, or not worse than, the established treatment. Again, as with the equivalence design, we're looking for a very small detectable difference. But for the non-inferiority trial the hypothesis is one-sided, whereas with the equivalence design the hypothesis is two-sided. A one-sided test does not require as much evidence to reject the null as a two-sided test at the same error level, which means that a non-inferiority design does not require as large a sample size as the corresponding equivalence design. But you have to keep in mind that the cost of using a one-sided test is that you're rejecting the null with a lower level of evidence.
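The sample-size comparison can be made concrete under a normal approximation, assuming the true difference is zero: the equivalence design splits its Type II error across two tails (using z at 1 − β/2), while the non-inferiority design spends it in one tail (z at 1 − β). The inputs below (σ = 1, margin = 0.25, α = 0.05, 90% power) are purely illustrative, not from any trial discussed here.

```python
import math
from scipy.stats import norm

def n_per_group(sigma, margin, alpha=0.05, power=0.90, equivalence=False):
    """Approximate per-group sample size, assuming the true difference is 0.

    Non-inferiority: n = 2 * sigma^2 * (z_{1-a} + z_{1-b})^2   / margin^2
    Equivalence:     n = 2 * sigma^2 * (z_{1-a} + z_{1-b/2})^2 / margin^2
    """
    beta = 1 - power
    z_alpha = norm.ppf(1 - alpha)  # one-sided alpha at each boundary
    z_beta = norm.ppf(1 - beta / 2) if equivalence else norm.ppf(1 - beta)
    return math.ceil(2 * sigma**2 * (z_alpha + z_beta) ** 2 / margin**2)

n_noninf = n_per_group(sigma=1.0, margin=0.25)                   # one-sided test
n_equiv = n_per_group(sigma=1.0, margin=0.25, equivalence=True)  # two-sided test
```

Running this, the equivalence design requires noticeably more patients per group than the non-inferiority design with the same margin, which is the point made above.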

Â 6:11

An example of a non-inferiority design is the ADVANCE-2 trial, in which Apixaban was compared to Enoxaparin for the prevention of venous thromboembolism after total knee replacement surgery. Enoxaparin is a low-molecular-weight heparin, and is frequently used for the prevention of venous thromboembolism after major joint replacement. However, Enoxaparin increases the risk of bleeding, and it can be cumbersome to use. So the investigators proposed that Apixaban, which is an orally active factor Xa inhibitor, might be as effective as Enoxaparin in preventing venous thromboembolism, but may have a lower bleeding risk and might also be easier to administer than Enoxaparin. In ADVANCE-2, the patients were allocated to receive oral Apixaban twice a day starting 12 to 24 hours after surgery, or subcutaneous injections of Enoxaparin starting 12 hours before surgery. Both treatment groups had placebos or shams. The treatments were continued for 10 to 14 days after surgery. The patients were assessed for the main outcome, which was a composite of asymptomatic and symptomatic DVT, non-fatal pulmonary embolism, and all-cause death, with any of these events having an onset during treatment or within two days of the last dose of treatment. The study was designed to test non-inferiority. The non-inferiority criterion was that the upper 95% confidence limit of the risk ratio of Apixaban versus Enoxaparin not exceed 1.25, and that the risk difference of Apixaban minus Enoxaparin not exceed 5.6 percentage points. So to reiterate, the goal of a non-inferiority trial is to demonstrate that the experimental treatment is not worse than the control treatment by more than a pre-specified small amount. This amount is the non-inferiority margin.
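As a sketch of how a risk-ratio criterion like this would be checked, one can compute the upper 95% confidence limit of the risk ratio using the usual log-normal approximation and compare it to the 1.25 margin. The event counts below are made up for illustration; they are not the published ADVANCE-2 results.

```python
import math

def risk_ratio_upper_ci(events_new, n_new, events_ctrl, n_ctrl, z=1.96):
    """Upper 95% confidence limit of the risk ratio (log-normal approximation)."""
    p_new = events_new / n_new
    p_ctrl = events_ctrl / n_ctrl
    rr = p_new / p_ctrl
    # Standard error of log(RR): sqrt(1/a - 1/n1 + 1/c - 1/n2)
    se_log = math.sqrt((1 - p_new) / events_new + (1 - p_ctrl) / events_ctrl)
    return math.exp(math.log(rr) + z * se_log)

# Hypothetical counts, NOT the published trial data
upper = risk_ratio_upper_ci(events_new=140, n_new=950, events_ctrl=180, n_ctrl=960)
non_inferior = upper < 1.25  # the pre-specified margin from the trial design
```

Non-inferiority is declared only if the entire confidence interval, not just the point estimate, sits below the margin.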

Â 8:07

On this slide we're going to look at how non-inferiority margins are used with confidence intervals. Some of you are probably familiar with the concepts of point estimates and confidence intervals from your biostatistics class, but since not everyone has had biostatistics, I'm just going to take a moment to review these concepts.

Â 8:26

A point estimate is a single value that estimates some population parameter based on our sample data. An example is the sample mean, which is the average of the values in our sample, and it's frequently used to estimate the unknown population mean. An interval estimate specifies a range within which the population parameter is estimated to lie, based on the sample. How likely the interval is to contain the parameter is determined by the confidence level, and that's usually expressed as a percentage. The most commonly used confidence interval is the 95% confidence interval. For a 95% confidence interval, one can expect that if you sample repeatedly from the same population, 95% of the confidence intervals of the sample mean will contain the population mean of interest.
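This repeated-sampling interpretation can be illustrated with a small simulation: draw many samples from a population with a known mean and count how often the 95% confidence interval for the mean covers the true value. The population parameters and sample sizes below are arbitrary choices for the demonstration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
true_mean, true_sd, n, n_trials = 50.0, 10.0, 100, 2000

covered = 0
for _ in range(n_trials):
    sample = rng.normal(true_mean, true_sd, n)
    m = sample.mean()
    # t-based 95% CI for the mean: estimate +/- t * SE
    half_width = stats.t.ppf(0.975, n - 1) * sample.std(ddof=1) / np.sqrt(n)
    if m - half_width <= true_mean <= m + half_width:
        covered += 1

coverage = covered / n_trials  # should be close to 0.95
```

Any single interval either contains the true mean or it doesn't; the 95% refers to the long-run proportion of intervals that do.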

Â 9:14

In this figure, we have several confidence intervals, and these are indicated by the blue and red horizontal lines. The point estimates are designated with the short vertical lines that you see in the middle of the confidence intervals.

Â 9:27

These point estimates represent the sample estimate of the treatment difference between the experimental and the control groups. The solid black vertical line that runs from the top to the bottom, with the zero at the top, is the zero line. Point estimates that are close to the zero line indicate that our best estimate is that there is not much difference between the treatment effects in the two groups.

Â 9:50

Point estimates that fall to the left of the solid zero line favor the experimental treatment, and point estimates that fall to the right of the zero line favor the control treatment.

Â 10:10

You'll notice that there's another long vertical line that is dashed and has a delta at the top. This is our non-inferiority margin. If the confidence interval crosses or falls to the right of the non-inferiority margin, then we cannot reject the null that the experimental treatment is worse than the control.

Â 10:42

So in this figure, the confidence intervals that are shaded in blue fall entirely to the left of the delta line, so in those cases we can say that we have shown non-inferiority. The confidence intervals shaded in red cross the delta line, so we cannot say that we've shown non-inferiority. Only the bottom confidence interval falls to the left of the non-inferiority line and also entirely to the left of the zero line. And so, in that scenario, we can say that we've shown non-inferiority, and we have also shown superiority.
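The way the figure is read can be captured in a small decision rule: given a confidence interval for the treatment difference, on a scale where zero means no difference and values above zero favor the control, classify the result against the margin delta. The intervals below are hypothetical examples matching the three cases in the figure.

```python
def classify(ci_lower, ci_upper, margin):
    """Classify a trial result from the CI of the treatment difference.

    Scale: 0 = no difference; values above 0 favor the control;
    `margin` (> 0) is the non-inferiority margin (the dashed delta line).
    """
    if ci_upper < margin:            # CI entirely to the left of delta
        if ci_upper < 0:             # ...and entirely to the left of zero
            return "non-inferior and superior"
        return "non-inferior"
    return "non-inferiority not shown"   # CI crosses or exceeds delta

results = [classify(-0.2, 0.1, 0.3),   # blue: entirely left of the delta line
           classify(-0.1, 0.5, 0.3),   # red: crosses the delta line
           classify(-0.6, -0.1, 0.3)]  # blue, and entirely left of zero
```

The rule makes explicit that the whole interval, not the point estimate, must clear the margin before non-inferiority is claimed.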

Â 11:17

Trials are sometimes designed with nested non-inferiority and superiority hypotheses. Investigators might design the trial so that if non-inferiority is established when the study is finished, then they can go on and test for superiority.

Â 11:32

The more common situation is when investigators fail to show superiority, but they might then test if they can show non-inferiority. So they can't say that the experimental treatment is better than the control, but they can say that it's at least not inferior by some small amount. In the example of the non-inferiority trial from the previous slide, the ADVANCE-2 trial, the investigators had planned a priori to test for superiority once they had established non-inferiority. So this brings us to the end of the section on designs for hypothesis testing. We've covered superiority, equivalence, and non-inferiority hypotheses, and in the final section we'll cover adaptive designs.
