How accurate can we be when making a medical diagnosis?

This BBC article (actually about the famous Monty Hall problem, which is also very interesting) refers to Simpson’s paradox. In fact, the issue at hand is a slightly different one, the false positive paradox. It relates to the fact that a diagnostic test with a seemingly high probability of being correct can, in practice, be wrong more often than not. I thought it’d be worth explaining the maths.

We suppose that 1% of the population has a disease, and a diagnostic test is developed.

Research shows that among those who have the disease, the test is accurate 99% of the time. Among those who do not have the disease, it is accurate 98% of the time. The table below summarises this:

|              | Positive | Negative |
|--------------|----------|----------|
| Diseased     | 99%      | 1%       |
| Not diseased | 2%       | 98%      |

We suppose there are 10 000 people in our population, and that therefore 100 of them have the disease. How many of the people who test positive actually have the disease? The answer is that only 1/3 of those who give a positive test do. Why so few? It’s mostly because the disease is rare in the population to begin with.

We have 10 000 people, of whom 9 900 are disease-free and 100 have the disease.

2% of the disease-free population test positive, that is 198 people, while 99% of the diseased population test positive, that is 99 people.

Therefore, 99 + 198 = 297 people test positive, while 9 703 test negative. If I choose a person at random from the positive group, the chance of picking a truly diseased individual is 99/297 = 1/3, as claimed above. Compare that to the negative group, where only 1 person in 9 703 was wrongly diagnosed as negative (about 0.01%).
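The counts above are easy to sanity-check in a few lines of Python, using exactly the numbers assumed in this example:

```python
# Sanity check of the counts above: 10 000 people, 1% prevalence,
# 99% sensitivity (accuracy on the diseased), 98% specificity (on the healthy).
population = 10_000
diseased = round(0.01 * population)       # 100 people have the disease
healthy = population - diseased           # 9 900 do not

true_positives = round(0.99 * diseased)   # 99 sufferers test positive
false_positives = round(0.02 * healthy)   # 198 healthy people test positive
total_positives = true_positives + false_positives

# Probability that a randomly chosen positive result is a true case:
ppv = true_positives / total_positives
print(total_positives, ppv)               # 297 positives; ppv = 99/297 = 1/3
```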

So, if you are diagnosed as “positive” there’s only a 1/3 chance you have the disease: this is because the disease’s 1% incidence rate is lower than the test’s 2% false positive rate, so the false positives outnumber the true positives.

Is the test therefore rubbish? That is a question that goes beyond the scope of mathematics. Essentially, the problem is weighing whether the 198 people wrongly given a positive result are more of an issue than the 1 person wrongly given a negative one. Certainly, the test would be more useful as a tool to “diagnose” that you didn’t have the disease; but as any doctor will tell you, this is an almost impossible thing to gauge, as there are so many factors to consider, not least the emotions of the patients and their families.

But, then again, 99 times out of 100, the test gets a sufferer right, so it’s (much) better than nothing. Assuming incidences of the disease don’t change, the best way of improving the test is probably research into reducing the false positive rate, so that doctors can be clearer about the outcome. If the proportion of healthy people wrongly given a positive outcome can be reduced to 1%, a positive result will be wrong exactly 50% of the time (99 false positives against 99 true positives).
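To check that figure, here is the same sketch with the false positive rate halved to 1% and everything else unchanged:

```python
# Same population as before, but the false positive rate drops from 2% to 1%.
diseased = 100
healthy = 9_900

true_positives = round(0.99 * diseased)   # still 99 sufferers caught
false_positives = round(0.01 * healthy)   # now only 99 false alarms
ppv = true_positives / (true_positives + false_positives)
print(ppv)                                # 0.5: a positive result is now a coin flip
```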

But then, of course, this is not real. I wonder if there are any real-world instances of this?

The other interesting thing about this problem is that if we change it slightly, so that the test is 100% correct in telling when a patient has the disease but still has a 2% chance of a false positive, the maths changes little. There’s now a 100/298 chance that someone diagnosed positive actually has the disease.
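That variant is one line’s difference in the same sketch: every sufferer is now caught, but the 2% false positive rate stays.

```python
# Variant: 100% sensitivity, false positive rate still 2%.
diseased = 100
healthy = 9_900

true_positives = diseased                 # all 100 sufferers now test positive
false_positives = round(0.02 * healthy)   # still 198 false alarms
ppv = true_positives / (true_positives + false_positives)
print(ppv)                                # 100/298, about 0.336
```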

What this problem illustrates beautifully is the huge issue of false positives and the importance of eliminating them from a test. In fact, it can be shown that if we reduced the false positive rate to 1%, then any accuracy above 50% (strictly, above 49.5%) on the patients who have the disease would make a positive result more informative than under the 99%/2% figures cited above.
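That threshold can be verified numerically. With the false positive rate at 1%, the accuracy on sufferers (the sensitivity) only has to clear roughly 50% for a positive result to carry a higher predictive value than the 1/3 of the original test:

```python
# With a 1% false positive rate, how accurate on sufferers must the test be
# before a positive result beats the original test's 1-in-3 predictive value?
diseased = 100
healthy = 9_900
false_positives = round(0.01 * healthy)   # 99, with the improved 1% rate

def ppv(sensitivity):
    """Positive predictive value for a given sensitivity on the diseased."""
    tp = sensitivity * diseased
    return tp / (tp + false_positives)

print(ppv(0.50))   # just above 1/3, the original test's value
print(ppv(0.49))   # just below 1/3
```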

Yes, you’re right, in fact the false positive issue is a significant proportion of what I deal with in my work.