0:00

In that extended example, we saw the value of doing good analytics with these performance measures. What first appeared to be significant differences in skill turned out to be purely chance. That really emphasizes the importance of persistence: we need to find ways of testing for persistence. In that case we just looked at a split sample, how performance varied in one year and then how it varied in the next year. Finding ways to do that is one of the most fundamental ways you can parse signal from noise.

We're going to focus on four additional issues for the rest of the module: regression to the mean, sample size, signal independence, and process versus outcome. These are all important concepts to have in mind as you dig into your data, and they're also issues that tend to arise when we reason about data only intuitively. They're issues that data and analytics can improve, but analytics aren't a panacea; you can still make these mistakes even with data.

1:06

And I want to start with a very simple model of performance, where you can think of performance as Real Tendency + Luck. We've been talking about this a little bit; we can formalize it. Don't be too put off by the baby math here, but in formal terms you can think of performance y as a function of x, true ability, and e, some error randomly distributed around 0.
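As a minimal sketch of this model (assuming, purely for illustration, normally distributed noise and a fixed ability level), we can simulate y = x + e directly and see that the error contributes nothing on average:

```python
import random

random.seed(1)

def performance(true_ability, noise_sd=1.0):
    """y = x + e: observed performance is real tendency plus luck,
    where luck is random error distributed around zero."""
    return true_ability + random.gauss(0, noise_sd)

# The same person observed many times: ability is fixed, luck varies.
ability = 0.5
observations = [performance(ability) for _ in range(100_000)]
mean_observed = sum(observations) / len(observations)
# mean_observed lands very close to ability (0.5): the error averages out.
```

Any single observation, though, can sit well above or below 0.5, which is exactly what makes extreme observations ambiguous.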

Now, what does that mean when we sample on extreme performance? What underlies extreme success and failure? If that's the model of the world, and everything we've been saying so far says it is, that there's some noise in these performance measures, what does it mean when we sample on extreme performance? Well, it means that extreme success suggests that the person might in fact have superior ability, or tried very hard, but also that they got lucky: the error was positive. Conversely, extreme failure perhaps means inferior ability, or that they did not try very hard, but also negative error: they got unlucky. We can be sure that when we sample very extremely on a performance measure, a noisy performance measure, and they're all noisy, we get extreme error as well.

What are the consequences of that? There's one very important consequence: in subsequent periods, error won't be that extreme again; it will regress to the mean. You'd expect it to be zero, because error is, by definition, zero on average. And if you got very positive error in one period, you would expect less error in the following period. This is a notion called regression to the mean, and it's one of the most important notions in performance evaluation.

An example: there was a study a few years ago of mutual fund performance in the 1990s. The study divided the 1990s into two halves, 1990-1994 and 1995-1999, and looked at the top 10 performing funds from the first half of the decade. Here, I'll show them to you; we anonymized them. These are just funds A through J and their performance in the early 1990s. There were 283 funds in this study; these were only the top 10 performers.

Then they did two things. They went and asked how these funds performed in subsequent years. And they did an interesting thing in between: they asked people to predict what happened in the next few years. What did they think performance would be in the second half of the decade? Here are the predictions, the estimates from the people they asked. They didn't think the top-performing fund A would again be the top performer, but they thought maybe tenth, and so on down the list. Fund E, the fifth-best performer, they thought maybe 44th. You can see that they didn't expect the funds to be as good; they expected some regression to the mean.

Then they looked at what actually happened. What actually happened? It ranged: 129th, 21st, 54th. The interesting thing is how the funds performed on average: their average rank was 142.5. What is the significance of 142.5? It's half the total number of funds in the study. In other words, the average performance of the top 10 funds in the second period, the second half of the 90s, was perfectly average for this sample. They had regressed entirely.

The top 10 mutual funds of the early 90s regressed entirely to the mean in the second half of the 90s. If that's the case, what does that say about how much skill versus luck was involved in how those funds did in the first half of the 90s? If they regress all the way to the mean in the second period, it suggests that there was no skill. The differences that we saw, and there are huge consequences to those differences, because we know that new money flows to successful funds, were in fact entirely based on luck.
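This result can be illustrated with a small simulation. In a hypothetical world where fund performance is pure luck and nothing else, the first-half top 10 end up, on average, right in the middle of the pack in the second half. The 283-fund count comes from the study; the rest of the setup here is an illustrative assumption:

```python
import random

random.seed(0)
N_FUNDS, N_TRIALS = 283, 1000

def second_half_rank_of_top10():
    # Pure-luck world: each half-decade is an independent draw of noise.
    half1 = [random.gauss(0, 1) for _ in range(N_FUNDS)]
    half2 = [random.gauss(0, 1) for _ in range(N_FUNDS)]
    # Top 10 funds by first-half performance.
    top10 = sorted(range(N_FUNDS), key=half1.__getitem__, reverse=True)[:10]
    # Their ranks (1 = best) by second-half performance.
    order2 = sorted(range(N_FUNDS), key=half2.__getitem__, reverse=True)
    rank2 = {fund: r + 1 for r, fund in enumerate(order2)}
    return sum(rank2[f] for f in top10) / 10

avg = sum(second_half_rank_of_top10() for _ in range(N_TRIALS)) / N_TRIALS
# avg comes out near (1 + 283) / 2 = 142: the former top 10 are,
# on average, perfectly average in the second half.
```

If skill mattered, the top 10 would regress only partially; full regression to the middle is the signature of pure luck.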

There are many other examples. Nobel Prize winner Danny Kahneman gives a famous example of being an officer in the Israeli Air Force.

5:56

After a good flight, if there's some chance involved, you would expect that the following flight wouldn't be as good; and conversely, after a bad flight, if there's some chance involved, you would predict that the next flight would, on average, be better. This is exactly why we have to be so careful about regression to the mean. We have the wrong model of the world if we don't appreciate it. We walk around like the Israeli Air Force officer who believed that it's all about praise and punishment, as opposed to merely statistical regression to the mean.

There's another example, and we're not going to pick on Israeli Air Force officers. This one comes from Tom Peters, one of the original business book authors. Peters and Waterman were McKinsey consultants, no less. They did a study that began as an internal study and was eventually published as a hugely best-selling book on what determines excellence in companies. They selected 43 high-performing firms and tried to learn what they could about business practices from these top 43 firms.

6:59

But subsequently, a few folks evaluated the performance of those 43 firms, and what did they find? Five years later, there were still some excellent companies. There were some that were solid but not exactly at the top of their industries. And then there were quite a few in weakened positions, and there were even some among these supposed 43 excellent companies that were fully troubled. Now, this is exactly what you'd expect from regression to the mean. And that suggests that the sample Peters and Waterman had grabbed, as supposedly excellent firms, was perhaps, on average, a little bit better. But to make it into that sample, to be called the most successful 43 firms in the world, they were necessarily lucky, and in subsequent periods, luck is not going to break their direction again.

7:49

This is something that you'll see anytime you sample on extreme values: if you sample on one attribute, any other attribute that's not perfectly related to it will tend to be closer to the mean value. We've been talking about performance at points in time: if you sample on extreme performance in one time period, the subsequent time period won't be as extreme, whether you sample the extremely good or the extremely bad. But it can also be attributes within an individual or within an organization. If you sample, say, on a person's running speed and then look at their language ability, these things are imperfectly related. If you looked only at the fastest runners, how would you expect them to perform on some language ability test? The fastest runners will, almost by definition, not be the people with the best language ability. But that's not because there's some inverse relationship between running and language ability; it's that these two traits are simply imperfectly correlated. And so when you sample on the extreme, you have to expect regression to the mean on any other attribute.
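The same logic can be sketched for two imperfectly correlated traits. The traits and the correlation value below are hypothetical; the point is only that a group selected on the extreme of one trait sits much closer to average on the other:

```python
import random

random.seed(2)
r = 0.3  # hypothetical correlation between the two standardized traits

# Simulate standardized "running speed" and "language score" per person.
people = []
for _ in range(100_000):
    speed = random.gauss(0, 1)
    language = r * speed + (1 - r ** 2) ** 0.5 * random.gauss(0, 1)
    people.append((speed, language))

# Select the 1,000 fastest runners: an extreme sample on one attribute.
fastest = sorted(people, reverse=True)[:1000]
mean_speed = sum(s for s, _ in fastest) / 1000
mean_language = sum(l for _, l in fastest) / 1000
# mean_speed is far above average; mean_language is above average too
# (the traits are positively related) but much closer to the mean,
# roughly r * mean_speed.
```

Nothing here requires an inverse relationship between the traits; imperfect correlation alone produces the regression.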

8:51

We could spend a day on regression to the mean. In fact, there aren't many concepts more important for understanding the world than regression to the mean; we could spend hours on this. And I would be very happy if you walked away from this course with only two or three ideas, if this was one of them, because it's going to help your reasoning about the world. Why is this so hard? Why is this such a hard concept to keep in mind, to live by?

Well, there are a few things that get in the way. Among others, we have outcome bias. I mentioned it with Baron and Hershey, who came up with the original study. We tend to believe that good things happen to people who work hard and bad things happen to people who work badly, and we draw too strong an inference based on this. We tend to judge decisions and people by outcomes and not by process. This is a real problem, and it gets in the way of our understanding this regression-to-the-mean framework for the world.

Two others: one is hindsight bias. Once we've seen something occur, we have a hard time appreciating that we didn't anticipate it occurring; in fact, we often misbelieve that we anticipated exactly what would happen. We show hindsight bias. And again, if that's the way we reason about what happens, then we're not going to appreciate that what happens next could just possibly be regression to the mean.

And then finally, narrative seeking: we want to make sense of the world, we want to connect the dots. We tend to believe things better if we can tell a causal story between what took place at time one and what took place at time two. And if we can tell a causal story, then we actually have great confidence in our ability to predict what happens next. We seek these stories, as opposed to what I've been telling you, which is the dry statistical reason for why things happen. We seek stories. And that again gets in the way of our understanding the statistical processes that actually drive what's going on.

In short, we make sense of the past. We are sense-making animals, and there's not a lot of sense to be had in mere regression to the mean. But that sense-making is going to get in the way of predicting what happens next. We try to find stories that connect all the dots, and by doing that we give chance a smaller role in those stories.

There was an Internet meme a year or two ago that captures this well. If this is knowledge distributed across your past experience, then with that knowledge perhaps you can add some experience, start connecting the dots, drawing simple lines, and create something from that knowledge. That's good; that's what we want experience to do. But then sometimes we're inclined to get a little too creative and overfit those lines. We turn what should be a pretty straight grid, pretty parsimonious connections, into something that is unlikely to replicate in the future. It might be a very satisfying interpretation of the past, but it's overfit, and an overfit interpretation of the past is going to make very bad predictions about the future.
