Cox. If there's one survival analysis

method you need to know, it's Cox's,

created by the British statistician Sir

David Cox during his time here at

Imperial College London in the 1970s. There

are many other survival analysis models,

which I won't cover in this course. So

why is the Cox model so widely used?

What is it, and how does it work? Well,

Kaplan-Meier plots and log-rank tests

are great for exploring the relation

between one predictor and mortality over

time, but they can only manage one

predictor. In contrast, Cox's approach can

handle multiple predictors. It's a type

of regression, so if you've taken the two

previous courses in this series on

regression, you'll be familiar with that

concept. The full name of his method is

the Cox proportional hazards model,

sometimes called Cox regression and

sometimes just the Cox model. As its full

name suggests, it has a couple of

distinctive features that I'll now

explain. Proportional hazards: what does

that mean? It means that the hazards are

assumed by the model to be proportional.

But what are hazards? They sound

dangerous, like a fire hazard or a

biohazard, and indeed they are,

if you misunderstand them. In statistics,

a hazard is the risk of death at a given

moment in time. More generally, it's the

risk at a given moment of having the

outcome of interest, which, as we saw at

the start of this course, doesn't have to

be death if you want to use survival

analysis methods. This hazard can change

over time. So imagine you recruit a set

of patients with cancer, and you give

half of them chemotherapy and the other

half gets surgery. Their risk of death,

that hazard, is unlikely to be constant

over time. It's unlikely to be the same

in the first month as it is in the

second month or five years later. The way

their hazard changes over time is called

the hazard function or hazard rate. So

that's the "hazard" bit of the name

covered,

which is a key concept in survival

analysis. But what about "proportional" in

proportional hazards? Look at this graph,

which shows the hazard function for

death in patients hospitalized for heart

failure. On the day they're admitted,

their hazard is 2%, so 2%

of patients die on the day they

are admitted. The next day their hazard

is 1%, so of those who survive the

first day, 1% die on the second day. On

day 3 the hazard is lower still, and so

it goes on over time. But not all

patients are alike; they don't have the

same hazard function. The very elderly

have a higher hazard than younger

patients, as shown by the blue line,

with the red line showing the hazard for

younger patients. The shape looks pretty

similar to the first line, but the line

for the very elderly is higher. So how

can you summarize these two lines? How

can you say by how much the very

elderly are more likely to die than young

patients? When the two lines are nice and

parallel, like they are here, you can sum

up the relation in a single number. You

can say that the hazard for the very elderly

is, for example, twice the hazard for

young patients. The hazards are

proportional, meaning that for every time

point you can multiply the hazard for

the young patients by 2, in this case, and

get the hazard for the very elderly. It

doesn't matter what shapes the two

hazard curves have; one is just some

multiple of the other. One hazard divided

by another is called a hazard ratio. It's

kind of analogous to the odds ratio in

logistic regression, even though the

underlying maths is totally different. This

assumption that the hazards are

proportional is crucial and can be

tested, and you'll see how to do that

later in the course. The Cox

proportional hazards model was one of

the great advances in statistics. It's

the most commonly used survival analysis

method when you want to look at multiple

predictors at the same time.
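To make the arithmetic behind proportional hazards concrete, here is a minimal Python sketch. It is an illustration only, not part of the course material. It assumes the 2% and 1% daily hazards from the heart-failure example apply to the younger patients, invents the values for later days, and assumes a hazard ratio of 2 for the very elderly. The names young_hazard, elderly_hazard, HAZARD_RATIO and survival_curve are hypothetical.

```python
# Daily hazards for younger patients. The first two values (2% and 1%) echo
# the heart-failure example; the remaining values are made up for illustration.
young_hazard = [0.02, 0.01, 0.008, 0.006, 0.005]

# Assumed hazard ratio: the very elderly have twice the hazard at every time point.
HAZARD_RATIO = 2.0
elderly_hazard = [HAZARD_RATIO * h for h in young_hazard]


def survival_curve(hazards):
    """Return the proportion still alive at the end of each day.

    Each daily hazard is applied only to those who survived the previous days.
    """
    alive = 1.0
    curve = []
    for h in hazards:
        alive *= 1.0 - h
        curve.append(alive)
    return curve


# Under proportional hazards, this ratio is the same on every day.
ratios = [e / y for e, y in zip(elderly_hazard, young_hazard)]

young_survival = survival_curve(young_hazard)
elderly_survival = survival_curve(elderly_hazard)

for day, (r, s_y, s_e) in enumerate(
    zip(ratios, young_survival, elderly_survival), start=1
):
    print(f"day {day}: hazard ratio {r:.1f}, "
          f"alive young {s_y:.4f}, alive elderly {s_e:.4f}")
```

In this sketch the hazard ratio stays at 2 on every day even though the hazards themselves fall over time, which is what "proportional" means here. The survival proportions for the two groups drift apart, but they are not in a fixed ratio; only the hazards are.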

Happily, as

you'll soon see, it's really easy to do in R.
