0:33

True and false refer to the true state of the world. So, true means that you actually belong to the class we're trying to identify, and false means that you actually don't belong to that class. As an example, the true part of true positive means that the truth is there actually was something to identify, and the positive part means that we actually identified you as belonging to that class. Similarly for a false positive: the positive part again refers to the fact that we identified you as being part of the positive class, and false refers to the fact that we were wrong; we didn't actually classify you to the correct class.

To make this a little more concrete, consider a medical testing example. In this case, we're trying to identify people that are sick using, say, a screening test; a very common example would be mammograms used to try to identify whether women have breast cancer. Here, the true part is your status as to whether you're sick or not. So if we truly identified you, then you were truly sick, and if we falsely identified you, then you were actually healthy; you were not truly sick.

Â 1:41

So in this case, a true positive is somebody who is truly sick, and positive means that we correctly diagnosed those people as being sick. A false positive means that, false, you are a healthy person, but positive, you are still somebody that we identified as being sick, even though you weren't.

Â 2:02

Similarly with a true negative: this is somebody who is truly negative, truly healthy, and we identified them as being negative. And a false negative would be somebody who is sick, but we incorrectly identified them; the negative part is that we identified them as healthy. You can learn more about sensitivity and specificity by going to the Wikipedia link below.

You can also see these in a 2 by 2 table. It's called a 2 by 2 table because it has two rows and two columns. The columns correspond to what your disease status is: in this particular example, positive means that you have the disease, and negative means that you don't have the disease. That's the real truth about your disease status. The rows correspond to the test, our prediction from the machine learning algorithm: a positive means we predict that you have the disease, and a negative means that we predict that you don't have the disease.

Some of the key quantities that people talk about are the sensitivity, which is the probability that we predict you are diseased given that you really are diseased; in other words, if you're really diseased, what's the probability we get that right? And the specificity is the reverse: if you are really healthy, what's the probability we get it right?

Â 3:11

The positive predictive value is the probability that you are diseased, given that we call you diseased. So it's a little bit different from the sensitivity, in the sense that now we're looking at all the people we called diseased and asking what fraction of them actually are diseased. Similarly for the negative predictive value. And the accuracy is just the probability that we classified you to the correct outcome. In this table, it's the terms on the diagonal, the true positives and the true negatives, added up and divided by the total.

Â 3:41

So you can write these as fractions. For example, the sensitivity: that's the probability, given that you are diseased, that we called you diseased. So we look at the first column, which is all the people that are diseased, and ask what fraction of them we actually got right. That's the true positives divided by the true positives plus the false negatives; that gives you the sensitivity. You can similarly write the same sort of fractions for the specificity, the positive predictive value, the negative predictive value, and so forth. For the positive predictive value, it's the true positives divided by the true positives plus the false positives, because we're looking at only the positive tests and asking what fraction of the positive tests we got right. The true positives are the ones that we got right, and the true positives plus the false positives are the total of the positive tests.
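As a rough sketch (the function and argument names here are illustrative, not from the lecture), all of these fractions can be computed directly from the four counts in the 2 by 2 table:

```python
# Illustrative sketch: the key quantities computed from the four cells
# of a 2 x 2 table (counts of true/false positives/negatives).

def screening_metrics(tp, fp, fn, tn):
    """Return the common error measures as fractions of the table counts."""
    return {
        "sensitivity": tp / (tp + fn),            # P(test + | truly diseased)
        "specificity": tn / (tn + fp),            # P(test - | truly healthy)
        "ppv": tp / (tp + fp),                    # P(diseased | test +)
        "npv": tn / (tn + fn),                    # P(healthy | test -)
        "accuracy": (tp + tn) / (tp + fp + fn + tn),  # diagonal / total
    }

# Example with made-up counts:
m = screening_metrics(tp=90, fp=30, fn=10, tn=870)
print(m["sensitivity"])  # 0.9
print(m["ppv"])          # 0.75
```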

Â 4:32

This is important because, in many prediction problems, one of the classes will be more rare than the other. For example, in medical studies it's very common that only a very small percentage of people will be sick. In this case, suppose that there's a disease where only 0.1% of the people in the population are sick, and suppose we have a really good machine learning algorithm, a really good testing kit, that is 99% sensitive and 99% specific. In other words, the probability that we'll get it right if you're diseased is 99%, and the probability we'll get it right if you're healthy is 99%. Now suppose that you get a positive test.

Â 5:29

Remember, in the general population only 0.1% of the people have the disease. So out of 100,000 people, there are only 100 people in this column that have the disease, but there are a lot more people, 99,900, that are healthy. We have 99% sensitivity if you have the disease, so 99 out of these 100 are correctly called diseased. Similarly, among the people that are healthy we get 99% right, so 98,901 we call healthy when they really are healthy; that's 99% of the time, and the remaining 999 healthy people get a positive test. But suppose that we wanted to know: if you got a positive test, what's the probability that you actually have the disease? Let's look at this for a second. Suppose you actually got a positive test; that's the first row here. What's the probability that you actually have the disease? That's the number of people that actually have the disease among the total number of people who had a positive test: 99 divided by 99 plus 999, so it's only about a 9% positive predictive value. In other words, if you got a positive test, there's only about a 9% chance that you actually have the disease. What's the reason for that? The reason is that 99% of a small number, the 99 true positives out of 100 sick people, is still smaller than 1% of a much bigger number, the 999 false positives out of the much larger group of 99,900 people who are actually healthy.

If instead we consider the case where 10% of people are actually sick, then you have a much larger number of people that are actually sick. We'll get it right 99% of the time, so 9,900 of the people that actually are sick we'll call sick, and only 900 of the people that are healthy will be called sick. And then things work out how you'd expect them to: 9,900 out of 9,900 plus 900, the number in the top left-hand corner divided by the total for that row, is about 92%, and so you have a high positive predictive value. What does this mean? If you're predicting a rare event, you have to be aware of how rare that event is. This goes back to the idea of knowing what population you're sampling from when you're building a predictive model.
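The arithmetic above can be checked with a small sketch (the function name is mine, not the lecture's) that computes the positive predictive value for a 99% sensitive, 99% specific test at different prevalences:

```python
# Illustrative sketch: how PPV depends on prevalence for a test with
# 99% sensitivity and 99% specificity.

def ppv(prevalence, sensitivity=0.99, specificity=0.99, population=100_000):
    """PPV = true positives / all positive tests."""
    sick = prevalence * population
    healthy = population - sick
    true_pos = sensitivity * sick
    false_pos = (1 - specificity) * healthy
    return true_pos / (true_pos + false_pos)

print(round(ppv(0.001), 3))  # 0.09  -> rare disease, low PPV
print(round(ppv(0.10), 3))   # 0.917 -> common disease, high PPV
```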

Â 7:39

This is actually a key public health issue. You've probably seen in the news that there have been questions about the value of mammograms in detecting disease, in other words, detecting life-threatening disease versus detecting cases that aren't necessarily life threatening. Similarly, you've probably heard about it for prostate cancer screening. In both of these cases you have a fairly rare disease, and even though the screening mechanisms are relatively good, it's very hard to know how many false positives you're getting as a fraction of the total number of positives.

For continuous data, you don't have quite so simple a scenario, where you only have one of two cases and one of two types of errors that you can possibly make. The goal here is to see how close you are to the truth, and one common way to do that is with something called mean squared error. The idea is, you have a prediction from your model or your machine learning algorithm for every single sample that you're trying to predict, and you also know the truth for those samples, say in a test set. So what you do is calculate the difference between the prediction and the truth, square it so the numbers are all positive, and then average those squared distances between the prediction and the truth. The one thing that's a little bit difficult about interpreting this number is that you squared the distance, so it's a little bit hard to interpret on the same scale as the predictions or the truth. What people often do is take the square root of that quantity. Underneath the square root sign is the same number, the average squared distance between the prediction and the truth, and then you take the square root of that number, which gives you the root mean squared error. This is probably the most common error measure that's used for continuous data.
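As a rough sketch of the two quantities just described (plain Python, names illustrative):

```python
# Illustrative sketch of mean squared error and root mean squared error.
import math

def mse(truth, predictions):
    """Average of the squared differences between prediction and truth."""
    return sum((p - t) ** 2 for p, t in zip(predictions, truth)) / len(truth)

def rmse(truth, predictions):
    """Square root of the MSE, back on the scale of the original values."""
    return math.sqrt(mse(truth, predictions))

truth = [3.0, 5.0, 2.0, 7.0]
pred = [2.5, 5.0, 3.0, 6.0]
print(rmse(truth, pred))  # 0.75
```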

So for continuous data, people often use either the mean squared error or the root mean squared error. But it often doesn't work well when there are a lot of outliers, or when the values of the variables can have very different scales, because it will be sensitive to those outliers. For example, if you have one really, really large value, it might really raise the mean. Instead, what people often use is the median absolute deviation. In that case, you take the median of the distances between the observed value and the predicted value, using the absolute value instead of the squared value. Again, that makes all of the distances positive, but it's a little bit more robust to the size of those errors.
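A small sketch (names illustrative) contrasting the two measures on data with one outlier:

```python
# Illustrative sketch: RMSE versus the median of absolute errors when
# one observation is a large outlier.
import math
from statistics import median

def rmse(truth, predictions):
    return math.sqrt(
        sum((p - t) ** 2 for p, t in zip(predictions, truth)) / len(truth)
    )

def median_abs_dev(truth, predictions):
    """Median of the absolute prediction errors."""
    return median(abs(p - t) for p, t in zip(predictions, truth))

truth = [1.0, 2.0, 3.0, 4.0, 100.0]  # last value is an outlier
pred = [1.1, 2.1, 2.9, 4.1, 4.0]     # the model misses the outlier badly

print(round(rmse(truth, pred), 2))            # dominated by the one big error
print(round(median_abs_dev(truth, pred), 2))  # 0.1 -> robust to the outlier
```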

Â 9:56

Sensitivity and specificity are very commonly used, particularly when talking about medical tests, but they are also widely used whenever you care about one type of error more than the other type of error. Then there's accuracy, which weights false positives and false negatives equally; this is an important point if, again, you have a very large discrepancy in the number of times that you're a positive or a negative. For multiclass cases, you might have something like concordance, and here I've linked to one particular measure, kappa. But there is a whole large class of distance measures, all with different properties, that can be used when you have multiclass data.
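As a rough sketch of one such agreement measure, here is Cohen's kappa, which compares the observed agreement between truth and prediction with the agreement you'd expect by chance (a minimal pure-Python version; names are illustrative):

```python
# Illustrative sketch of Cohen's kappa for multiclass predictions:
# observed agreement corrected for chance agreement.
from collections import Counter

def cohens_kappa(truth, predictions):
    n = len(truth)
    observed = sum(t == p for t, p in zip(truth, predictions)) / n
    # Chance agreement: sum over classes of the product of marginal counts.
    t_counts, p_counts = Counter(truth), Counter(predictions)
    expected = sum(t_counts[c] * p_counts.get(c, 0) for c in t_counts) / n**2
    return (observed - expected) / (1 - expected)

truth = ["a", "a", "b", "b", "c", "c"]
pred = ["a", "a", "b", "c", "c", "c"]
print(round(cohens_kappa(truth, pred), 2))  # 0.75
```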

So, those are some of the common error measures that are used when doing prediction algorithms.

Â