0:00

[MUSIC]

The American Cancer Society estimates that about 1.7% of women have breast cancer. The Susan G. Komen for the Cure Foundation states that mammography correctly identifies about 78% of women who truly have breast cancer. An article published in 2003 suggests that up to 10% of all mammograms are false positives. These probabilities are, of course, estimates, as they're very difficult to calculate precisely. But we're going to take these as givens for this example.

As usual, let's first parse through the percentages we're given. The probability of having breast cancer is estimated to be 0.017. The probability of testing positive given that somebody has breast cancer is 0.78, and the probability of testing positive even though somebody does not have breast cancer is 0.10.
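In probability notation, the three givens can be written as follows. The symbols are chosen here for illustration and are not from the lecture: BC stands for the event that the patient has breast cancer, and + stands for a positive mammogram.

```latex
\begin{align*}
P(\text{BC}) &= 0.017 \\
P(+ \mid \text{BC}) &= 0.78 \\
P(+ \mid \text{no BC}) &= 0.10
\end{align*}
```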

1:05

Prior to any testing and any information exchange between the patient and the doctor, what probability should a doctor assign to a female patient having breast cancer? Since we don't know anything about this patient's medical history, our best bet is to treat them like a randomly chosen person from the population. Hence, we would set this probability at 0.017. This is the prior probability we assign to a patient having breast cancer before we collect data on them. In other words, before we test them.

1:39

When a patient goes through breast cancer screening, there are two competing claims: the patient has cancer, and the patient doesn't have cancer. If a mammogram yields a positive result, what is the probability that the patient has cancer? If we think about this in probability notation, we're being asked to find the probability of breast cancer given that the patient tested positive. And earlier on, we were provided the reverse of this probability: positive given breast cancer. And when we have the conditions reversed, we know that a probability tree might be useful in our calculations. So, let's start building that.

There are two competing claims: the patient has breast cancer, or the patient does not have breast cancer. The probability of having breast cancer a priori is 0.017, so the probability of not having breast cancer is going to be the complement of that, 1 minus 0.017, which is 0.983.

If a patient has breast cancer, there are two possible outcomes: they might test positive, or they might test negative. The probability of testing positive when the patient has breast cancer is 0.78. The probability, then, of testing negative when the patient has breast cancer is going to be the complement of 0.78, which is 0.22.

Similarly, when a patient does not have breast cancer, there are two possible outcomes: testing positive or negative. The probability of testing positive even though the patient does not have breast cancer is the 0.10 we were given earlier. So the accuracy rate of the test when the patient does not have breast cancer, or in other words the probability of testing negative given no breast cancer, is going to be the complement of 0.10, which is 0.90.
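The tree just described can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the variable names are assumptions made here, not something shown in the lecture.

```python
# Given estimates from the example
p_bc = 0.017               # P(breast cancer): the prior
p_pos_given_bc = 0.78      # P(positive | breast cancer)
p_pos_given_no_bc = 0.10   # P(positive | no breast cancer)

# Each first-level branch stores its own probability ("p") plus the
# conditional probabilities of the two second-level branches under it.
# Complements are computed rather than typed in, as in the lecture.
tree = {
    "bc": {
        "p": p_bc,
        "pos": p_pos_given_bc,
        "neg": 1 - p_pos_given_bc,
    },
    "no_bc": {
        "p": 1 - p_bc,
        "pos": p_pos_given_no_bc,
        "neg": 1 - p_pos_given_no_bc,
    },
}

for claim, branch in tree.items():
    print(claim, round(branch["p"], 3), round(branch["pos"], 2), round(branch["neg"], 2))
```

Within each level the branch probabilities sum to 1, which is a quick sanity check on any probability tree.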

We're given that the patient has tested positive, so really, we're only interested in the first and the third branches here. We don't even have to worry about the other branches, because we know the patient we're interested in doesn't come from the segment of the population that tested negative.

3:49

First, we're going to want to find the joint probabilities. The probability of breast cancer and positive is the product of 0.017 and 0.78, which is 0.01326. And the probability of no breast cancer and positive is the product of the probabilities in the two branches leading up to that, 0.983 and 0.10, which is 0.0983.
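As a quick check of the arithmetic, the two joint probabilities can be computed by multiplying along each positive branch. The variable names below are illustrative assumptions.

```python
p_bc = 0.017               # P(breast cancer)
p_pos_given_bc = 0.78      # P(positive | breast cancer)
p_pos_given_no_bc = 0.10   # P(positive | no breast cancer)

# Multiply along each branch that ends in a positive test.
joint_bc_pos = p_bc * p_pos_given_bc
joint_no_bc_pos = (1 - p_bc) * p_pos_given_no_bc

print(round(joint_bc_pos, 5))     # 0.01326
print(round(joint_no_bc_pos, 4))  # 0.0983
```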

We are asked for the probability of breast cancer given positive. Using Bayes' theorem, this is going to be the probability of breast cancer and positive, divided by the probability of testing positive. The numerator simply comes from our top branch, 0.01326. And the denominator, the probability of testing positive, is going to be the sum of the probabilities of breast cancer and positive, or no breast cancer and positive. Since these are two disjoint outcomes, we add the two probabilities when we're saying "or": 0.01326 plus 0.0983 is 0.11156. This gives us about 0.12, or 12%.
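Written out as a worked equation, the calculation looks like this. The notation is an assumption made here for illustration: BC is the event that the patient has breast cancer, and + is a positive mammogram.

```latex
\begin{align*}
P(\text{BC} \mid +)
  &= \frac{P(\text{BC and } +)}{P(\text{BC and } +) + P(\text{no BC and } +)} \\
  &= \frac{0.017 \times 0.78}{0.017 \times 0.78 + 0.983 \times 0.10} \\
  &= \frac{0.01326}{0.01326 + 0.0983}
   = \frac{0.01326}{0.11156} \approx 0.12
\end{align*}
```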

This, remember, is what we call our posterior probability. Initially, we had given a 1.7% chance to this patient having breast cancer because we knew nothing about them. Then we tested them, and they tested positive. So now we have this additional information about the patient. Now, after data collection, the probability that we're assigning to this patient having breast cancer is higher. It's at 12%.

5:39

Once again, we run through our probability tree: two competing claims, breast cancer and no breast cancer. However, what has changed now is that this patient is no longer a nobody from the population. We've tested them once, and they tested positive. So we have some additional information about them, and we should update our prior with this additional information. In other words, we plug in the posterior from the previous iteration, the previous test, to be our new prior, 0.12. And therefore, the probability of not having breast cancer is updated to be the complement of this, 88%. Next, we can run through our tree again.

Remember, nothing about the test has changed. So the probability of testing positive given a patient has breast cancer is still 78%, and the probability of testing negative given the patient has breast cancer is still 22%. Similarly with the lower branch, nothing about the test has changed, so the conditional probabilities of testing positive or negative given the patient does not have breast cancer are still 10% and 90%, respectively.

Once again, we're only interested in the branches where the patient has tested positive. Because we're saying that the second mammogram also yielded a positive result, we can multiply through the branches to find our joint probabilities. And these are going to change a little bit, because our starting probabilities, the probabilities in the first branch, have changed. This time, our probability of having breast cancer and testing positive is higher, at 0.0936. And our probability of no breast cancer and positive is 0.088.
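The transcript stops at the two joint probabilities and does not state the resulting posterior. A minimal sketch of the remaining step, assuming (as the lecture does) that the rounded value 0.12 is used as the new prior, might look like this; the variable names are illustrative.

```python
# Second positive mammogram: the rounded posterior from the first test
# (0.12) serves as the new prior.
prior = 0.12
p_pos_given_bc = 0.78      # unchanged: P(positive | breast cancer)
p_pos_given_no_bc = 0.10   # unchanged: P(positive | no breast cancer)

joint_bc_pos = prior * p_pos_given_bc
joint_no_bc_pos = (1 - prior) * p_pos_given_no_bc

# Bayes' theorem: the joint probability on the breast cancer branch divided
# by the total probability of testing positive. This final figure is not
# given in the transcript; it is computed here for illustration.
posterior = joint_bc_pos / (joint_bc_pos + joint_no_bc_pos)

print(round(joint_bc_pos, 4))     # 0.0936
print(round(joint_no_bc_pos, 4))  # 0.088
print(round(posterior, 2))        # 0.52
```

Under these inputs, a second positive result moves the probability of breast cancer from about 12% to roughly 52%.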

In this example, we have reviewed a Bayesian approach to statistical inference, which involves setting a prior, collecting data, obtaining a posterior, and updating the prior with the posterior from the previous step. In addition, we got some practice working with conditional probabilities, probability trees, and Bayes' theorem in general.
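The cycle of prior, data, posterior, and new prior can be sketched as a small loop. This is an illustration under stated assumptions rather than anything shown in the lecture: the function name `update` and the list `history` are hypothetical, and the sketch assumes each mammogram behaves identically and independently given the patient's true status.

```python
def update(prior, p_pos_given_bc=0.78, p_pos_given_no_bc=0.10):
    """Return P(breast cancer | positive) after one positive test."""
    joint_bc_pos = prior * p_pos_given_bc
    joint_no_bc_pos = (1 - prior) * p_pos_given_no_bc
    return joint_bc_pos / (joint_bc_pos + joint_no_bc_pos)


belief = 0.017          # initial prior from the population estimate
history = [belief]

# Two positive mammograms in a row: each posterior becomes the next prior.
for _ in range(2):
    belief = update(belief)
    history.append(belief)

print([round(p, 3) for p in history])
```

Because this sketch carries the unrounded posterior forward, the value after the second test comes out near 0.51, slightly different from what rounding the first posterior to 0.12 would give.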

Â [MUSIC]
