By William M. London
In Part 1, I presented this problem and explained why you should care about its answers:
In a hypothetical community, 100 out of every 10,000 women actually have breast cancer at a detectable preclinical phase of the disease and the rest do not have breast cancer. (Let’s assume that we know all this to be true.) All 10,000 women in this community receive screening mammography.
For women who actually have breast cancer, assume that mammography will correctly detect evidence of breast cancer 80% of the time (which means it will fail to detect evidence of breast cancer 20% of the time).
For women who don’t actually have breast cancer, assume that mammography will correctly identify them as not having breast cancer 90% of the time, but incorrectly identify them as having breast cancer 10% of the time.
A mammography result suggestive of cancer is called a positive result. A mammography result suggesting that cancer is not present is called a negative result.
Given the information above, what percent of the women who receive a positive test result actually have breast cancer?
Given the information above, what percent of the women who receive a negative test result are truly cancer free?
I offered as a hint these caveats inspired by the analogous, heteronormative signal interpretation efforts of the white female characters in the movie “He’s Just Not That Into You”:
- The probability that guys will call you if they love you is not the same as the probability that guys love you if they call you.
- The probability that guys will not call you if they don’t love you is not the same as the probability that guys don’t love you if they don’t call you.
Here’s another way to think about the hint:
- Let’s assume that, when guys love you, most of them will call. But that doesn’t mean that when guys call, the probability is high that they love you.
- Let’s assume that, when guys don’t love you, most of them won’t call. But that doesn’t mean that when guys don’t call, the probability is high that they don’t love you. (Perhaps they’re busy with their blogging.)
The hint was supposed to help you recognize:
- It’s important not to confuse (1) the probability of getting a mammography screening result of positive when testing women with asymptomatic breast cancer with (2) the probability of having asymptomatic breast cancer for women who have received positive findings from screening mammography.
- It’s also important not to confuse (1) the probability of getting a mammography screening result of negative when testing women who don’t have asymptomatic breast cancer with (2) the probability of not having asymptomatic breast cancer for women who have received negative findings from screening mammography.
Many people (including many of my epidemiology students) frequently confuse these various conditional probabilities and consider the answers to at least one of the two questions posed at the end of the problem (especially the first question) to be counterintuitive. Here are the answers:
- The percentage of women who receive a positive test result that actually have breast cancer is only 7.5%. In this scenario, positive test results are poor signals of actually having breast cancer.
- The percentage of women who receive a negative test result that actually don’t have breast cancer is 99.8%. In this scenario, negative test results are strong signals of not actually having breast cancer.
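Both answers can be cross-checked in a few lines. The sketch below is not the table method worked through later in this post; it applies the standard Bayes' rule directly to the three numbers given in the problem. The variable names are my own, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

# Values taken from the problem statement.
prevalence = Fraction(100, 10_000)   # P(cancer)
sensitivity = Fraction(80, 100)      # P(positive | cancer)
specificity = Fraction(90, 100)      # P(negative | no cancer)

# Overall probability of each test result (law of total probability).
p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
p_negative = 1 - p_positive

# Bayes' rule reverses the direction of the conditioning.
p_cancer_given_positive = sensitivity * prevalence / p_positive
p_no_cancer_given_negative = specificity * (1 - prevalence) / p_negative

print(p_cancer_given_positive, f"{float(p_cancer_given_positive):.1%}")
print(p_no_cancer_given_negative, f"{float(p_no_cancer_given_negative):.1%}")
```

The point of the sketch is the asymmetry: the 80% sensitivity goes in as P(positive | cancer), and what comes out as P(cancer | positive) is a very different number.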
I’ve had arguments in online forums about the answer to this type of problem with smart people who simply could not acknowledge they were wrong even after I explained their probability confusions. These arguments reminded me of how even some mathematicians could not acknowledge that Marilyn vos Savant was correct in her analysis of the famous “Monty Hall” probability problem.
Most people have a hard time with these types of conditional probability problems even though the only math ability required to solve them is simple arithmetic. The challenge is to recognize how to arrange the arithmetic and have the patience to do so without getting mixed up. If you don’t trust me about the answers to the screening problem and you’re curious enough to see how I got them, here’s a step-by-step approach to arranging the arithmetic followed by a discussion of the practical implications for anyone thinking about getting screened for any disease (i.e., being transformed from a normal human being into a patient):
Step 1: Create a 2 by 2 Table with Margins for Column and Row Totals
Create a 2 by 2 table (as shown below this paragraph), which has an additional column and an additional row for keeping track of column and row totals. The first two columns will represent actual breast cancer status (disease present–at an asymptomatic stage–or disease absent). The first two rows will represent the two possible test results (positive versus negative). The total number of women screened is given in the problem and is equal to 10,000. The number 10,000 is labeled in the table as the grand total of four subgroups of women: (A) women with true positive results, (B) women with false positive results, (C) women with false negative results, and (D) women with true negative results. The grand total will equal the sum of all positive and negative test results; it will also equal the number of women with asymptomatic breast cancer plus the number of women without asymptomatic breast cancer.
| | Disease Present | Disease Absent | Row Totals |
| --- | --- | --- | --- |
| Positive Test Result | A (True +) | B (False +) | (A+B) = all women with positive test results |
| Negative Test Result | C (False -) | D (True -) | (C+D) = all women with negative test results |
| Column Totals | (A+C) = all women with asymptomatic breast cancer | (B+D) = all women who don’t have breast cancer | Grand total = (A+B+C+D) = 10,000 |
When a woman with asymptomatic breast cancer gets a mammography result of ‘positive,’ she is a true positive (counted in cell A); when her result is ‘negative,’ the result is a false negative (counted in cell C).
When a woman who does not have breast cancer gets a mammography result of ‘negative,’ she is a true negative (counted in cell D); when her result is ‘positive,’ the result is a false positive (counted in cell B).
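For readers who prefer code to tables, here is one way the layout from Step 1 might be represented. This is only a sketch: the class name `ScreeningTable` and its property names are hypothetical labels of mine, and the counts in the usage lines are placeholders, not the problem's numbers.

```python
from dataclasses import dataclass


@dataclass
class ScreeningTable:
    """2 by 2 table of screening results; fields follow cells A to D."""
    a: int  # true positives: disease present, test positive
    b: int  # false positives: disease absent, test positive
    c: int  # false negatives: disease present, test negative
    d: int  # true negatives: disease absent, test negative

    @property
    def positives(self) -> int:      # row total (A+B)
        return self.a + self.b

    @property
    def negatives(self) -> int:      # row total (C+D)
        return self.c + self.d

    @property
    def diseased(self) -> int:       # column total (A+C)
        return self.a + self.c

    @property
    def disease_free(self) -> int:   # column total (B+D)
        return self.b + self.d

    @property
    def grand_total(self) -> int:    # A+B+C+D
        return self.a + self.b + self.c + self.d


# Placeholder counts, chosen only to show that the margins agree.
table = ScreeningTable(a=1, b=2, c=3, d=4)
print(table.positives + table.negatives == table.grand_total)
print(table.diseased + table.disease_free == table.grand_total)
```

Storing only the four cells and deriving every margin from them means the row totals, column totals, and grand total cannot drift out of agreement.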
Step 2: Use Information Given About Prevalence to Complete the Column Totals
Part of the problem statement indicates that 100 out of 10,000 women actually have asymptomatic breast cancer. That’s equivalent to 10 out of 1,000 or 1 out of 100 or 1% of women having asymptomatic breast cancer. This 1% figure represents the prevalence of asymptomatic (also known as preclinical) breast cancer. Prevalence means how common a disease is (at either a specified point in time or during a specified period of time). If everyone has the disease of interest, prevalence is 100%. If no one has the disease of interest, prevalence is 0%.
If 100 of the women have breast cancer, then 9,900 (or 10,000 minus 100) of the women don’t have breast cancer. Let’s enter these numbers in the appropriate cells of the table:
| | Disease Present | Disease Absent | Row Totals |
| --- | --- | --- | --- |
| Positive Test Result | A (True +) | B (False +) | (A+B) |
| Negative Test Result | C (False -) | D (True -) | (C+D) |
| Column Totals | (A+C) = 100 | (B+D) = 9,900 | Grand total = (A+B+C+D) = 10,000 |
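The arithmetic of Step 2 is small enough to check by hand, but a short sketch makes the roles of the numbers explicit. The variable names are my own.

```python
grand_total = 10_000   # all women screened, given in the problem
with_cancer = 100      # column total (A+C), given in the problem

without_cancer = grand_total - with_cancer   # column total (B+D)
prevalence = with_cancer / grand_total       # share of women with the disease

print(without_cancer)        # 9900
print(f"{prevalence:.0%}")   # 1%
```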
Step 3: Use Information Given About Test Sensitivity to Complete the “Disease Present” Column
The problem as stated includes this information: For women who actually have breast cancer, assume that mammography will correctly detect evidence of breast cancer 80% of the time and not detect evidence of breast cancer 20% of the time. In other words, the proportion of women with the disease who receive a positive test result is in this case 80%. This figure is called the sensitivity of the test. Sensitivity tells us how valid the test is when used with people who actually have the disease of interest. The ideal screening test has 100% sensitivity. Sensitivity is equal to the number of true positive test results divided by the total number of screened people who actually have the disease of interest. Sensitivity is represented in the table as cell A divided by the sum of cells A and C. To be more concise: sensitivity = A/(A+C). We know that the number in cell A must be 80% of the number already inserted into cell (A+C). Since A+C is conveniently equal to 100, A in this problem must be equal to 80 true positive test results. If A+C equals 100 and A equals 80, then C must be 100-80=20. I now insert the numbers 80 and 20 into the table:
| | Disease Present | Disease Absent | Row Totals |
| --- | --- | --- | --- |
| Positive Test Result | A (True +) = 80 | B (False +) | (A+B) |
| Negative Test Result | C (False -) = 20 | D (True -) | (C+D) |
| Column Totals | (A+C) = 100 | (B+D) = 9,900 | Grand total = (A+B+C+D) = 10,000 |
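Step 3 can be sketched the same way. Percentages are kept as whole numbers here so that the cell counts stay exact integers; that choice, like the variable names, is mine.

```python
diseased = 100         # column total (A+C) from Step 2
sensitivity_pct = 80   # given in the problem statement

true_positives = diseased * sensitivity_pct // 100   # cell A
false_negatives = diseased - true_positives          # cell C

# Sanity check against the definition: sensitivity = A/(A+C).
assert true_positives / (true_positives + false_negatives) == 0.8
print(true_positives, false_negatives)   # 80 20
```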
Step 4: Use Information Given About Test Specificity to Complete the “Disease Absent” Column
The problem as stated includes this information: For women who don’t actually have breast cancer, mammography will correctly identify them as not having breast cancer 90% of the time, but incorrectly identify them as having breast cancer 10% of the time. In other words, the proportion of women without the disease who test negative is in this case 90%. This figure is called the specificity of the test. The ideal screening test has 100% specificity. Specificity is equal to the number of true negative test results divided by the total number of people who don’t have the disease of interest. Specificity is represented in the table as cell D divided by the sum of cells B and D. To be more concise: specificity = D/(B+D). We know that the number in cell D must be 90% of the number in (B+D). Since B+D is 9,900 (a less convenient number to work with than 100), D must be 90% of 9,900, which equals 8,910. If D is 8,910 and B+D is 9,900, then B must be 9,900-8,910, which is 990. (Also note that since the specificity is 90%, the false positives as a percentage of all the women without breast cancer must be 10%, as noted in the problem statement. Note that 10% of 9,900 is 990.) Thus, we insert 8,910 into cell D and 990 into cell B. So here’s what we now have:
| | Disease Present | Disease Absent | Row Totals |
| --- | --- | --- | --- |
| Positive Test Result | A (True +) = 80 | B (False +) = 990 | (A+B) |
| Negative Test Result | C (False -) = 20 | D (True -) = 8,910 | (C+D) |
| Column Totals | (A+C) = 100 | (B+D) = 9,900 | Grand total = (A+B+C+D) = 10,000 |
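Step 4 reaches cell B by two routes, subtraction and the 10% false positive rate. A small sketch (my own variable names, whole-number percentages to keep the counts exact) confirms that the routes agree:

```python
disease_free = 9_900    # column total (B+D) from Step 2
specificity_pct = 90    # given in the problem statement

true_negatives = disease_free * specificity_pct // 100   # cell D
false_positives = disease_free - true_negatives          # cell B, by subtraction

# Second route to cell B: apply the 10% false positive rate directly.
assert false_positives == disease_free * (100 - specificity_pct) // 100
print(true_negatives, false_positives)   # 8910 990
```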
Step 5: Complete the Table by Summing A+B and Then Summing C+D
We find that (A+B), which represents the total number of positive test results, equals 1,070 and that (C+D), which represents the total number of negative test results, equals 8,930. We insert 1,070 and 8,930 as shown below.
| | Disease Present | Disease Absent | Row Totals |
| --- | --- | --- | --- |
| Positive Test Result | A (True +) = 80 | B (False +) = 990 | (A+B) = 1,070 |
| Negative Test Result | C (False -) = 20 | D (True -) = 8,910 | (C+D) = 8,930 |
| Column Totals | (A+C) = 100 | (B+D) = 9,900 | Grand total = (A+B+C+D) = 10,000 |
Step 6: Check the Arithmetic for the Completed Table
(A+B) + (C+D) should sum together to equal (A+B+C+D), which we already know is 10,000. Since 1,070 + 8,930 = 10,000, we have confirmed that we completed the table correctly.
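The row totals from Step 5 and the check from Step 6 can be sketched together. The extra check on the column totals is my addition; the text above checks only the rows.

```python
a, b, c, d = 80, 990, 20, 8_910   # cells A to D from Steps 3 and 4

positives = a + b   # row total (A+B)
negatives = c + d   # row total (C+D)

# Step 6 check: the row totals must add up to the grand total.
assert positives + negatives == 10_000
# Extra check (not in the text): so must the column totals.
assert (a + c) + (b + d) == 10_000

print(positives, negatives)   # 1070 8930
```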
Step 7: Use the Numbers in the Completed Table to Calculate the Answer to the First of the Two Questions in the Problem Statement
The first question was: What percent of the women who receive a positive test result actually have breast cancer?
We know that 1,070 is the total for the “Positive Test Result” row. That represents all women who received a test result of positive. We know that 80 of these women actually had asymptomatic breast cancer (true positives). Therefore, 80 out of 1,070 or 80/1,070 or 7.5% of the women with positive test results actually have breast cancer. That means 92.5% of the women with positive test results in this scenario were false positives.
The 7.5% answer gives us what is called the predictive value of a positive test result, also known as the positive predictive value. The positive predictive value is what I want to know whenever I receive a positive test result from a screening test because it indicates the probability of having the disease among people with positive test results. In the scenario presented with 1% prevalence, 80% sensitivity, and 90% specificity, the vast majority of positive test results will be false positive test results.
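As a sketch of the Step 7 arithmetic, with variable names of my own choosing:

```python
true_positives = 80     # cell A
false_positives = 990   # cell B

# Positive predictive value = A/(A+B).
ppv = true_positives / (true_positives + false_positives)

print(f"{ppv:.1%}")       # 7.5%
print(f"{1 - ppv:.1%}")   # 92.5% of positive results are false positives
```

Note that the denominator is the row total (A+B), not the column total (A+C) used for sensitivity. Swapping the two is exactly the confusion this post is about.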
Step 8: Use the Numbers in the Completed Table to Calculate the Answer to the Second of the Two Questions in the Problem Statement
The second question was: What percent of the women who receive a negative test result actually do not have breast cancer?
We know that 8,930 is the total for the “Negative Test Result” row. That represents all women who received a test result of negative. We know that 8,910 of these women do not have breast cancer. Therefore, 8,910 out of 8,930 or 8,910/8,930 or 99.8% of the women with negative test results do not have breast cancer. Only 0.2% of the women with negative test results actually have the disease.
The 99.8% answer gives us what is called the predictive value of a negative test result, also known as the negative predictive value. The negative predictive value is what I want to know whenever I receive a negative test result from a screening test because it indicates the probability of not having the disease among people with negative test results. In the scenario presented with 1% prevalence, 80% sensitivity, and 90% specificity, the vast majority of negative test results will be true negative test results.
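The Step 8 arithmetic follows the same pattern along the bottom row of the table. Again, this is a sketch with my own variable names.

```python
true_negatives = 8_910   # cell D
false_negatives = 20     # cell C

# Negative predictive value = D/(C+D).
npv = true_negatives / (true_negatives + false_negatives)

print(f"{npv:.1%}")       # 99.8%
print(f"{1 - npv:.1%}")   # 0.2% of negative results are false negatives
```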
Some Practical Implications
Isn’t it interesting that the sensitivity and specificity were so high, but the positive predictive value was so low? It’s important to keep in mind that predictive value isn’t determined just by test validity (sensitivity and specificity). The third determining factor is the prevalence of the asymptomatic disease in the population screened. When the prevalence is low and sensitivity and specificity are reasonably high, we can expect low positive predictive value and high negative predictive value. Screening programs tend to be problematic when the prevalence of the disease of interest at a detectable, treatable asymptomatic stage is low. Aesop’s fable “The Boy Who Cried Wolf” shows how too many false alarms can be problematic.
Anyone who is familiar with screening for weapons at airports already knows this intuitively. The screening imaging at airport security checkpoints has high sensitivity and high specificity, but the prevalence of weapons carried by passengers is extremely low. Thus, many of us have had positive test results at airport security screenings leading to follow-up testing that almost always shows that the initial positive test results were false. Airport security procedures may protect us, but they probably inconvenience us even more. The costs of airport security are enormous and include significant opportunity costs.
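To see how prevalence drives predictive value, the table arithmetic can be wrapped in a function and rerun at different prevalences. This is a sketch under stated assumptions: `predictive_values` is a hypothetical helper of mine, and every prevalence below other than 1% is an arbitrary illustrative value, not a figure from any real screening program.

```python
def predictive_values(prevalence, sensitivity=0.80, specificity=0.90):
    """Return (PPV, NPV) as fractions of 1 for a screened population."""
    true_pos = sensitivity * prevalence                # cell A, as a share
    false_neg = (1 - sensitivity) * prevalence         # cell C
    true_neg = specificity * (1 - prevalence)          # cell D
    false_pos = (1 - specificity) * (1 - prevalence)   # cell B
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# 1% is the problem's prevalence; the others are illustrative only.
for prevalence in (0.001, 0.01, 0.25, 0.50):
    ppv, npv = predictive_values(prevalence)
    print(f"prevalence {prevalence:6.1%}  PPV {ppv:6.1%}  NPV {npv:6.1%}")
```

With sensitivity and specificity held fixed, the positive predictive value rises and the negative predictive value falls as prevalence increases.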
What about using phone calls as a screening test for determining whether guys love you? Well, I don’t know what the sensitivity or specificity of the phone call-test might be, but let’s focus on the prevalence of guys in our population of interest who fall in love with a particular woman. The higher that prevalence is, the greater the positive predictive value will be. The more common it is for guys to love you, the more likely it is that a phone call will be a true signal of love. The less common it is for guys to love you, the more likely it is that a phone call will be a false signal of love.
Preview to Part 3
In Part 3, I’ll discuss in more detail the impact of prevalence of asymptomatic disease on predictive value. In the meantime, you may want to test your understanding of predictive value by calculating the positive predictive value and negative predictive value with 10% instead of 1% prevalence, but with no change to the sensitivity (80%) and specificity (90%). I’ll present that calculation in Part 3.
I’ll also discuss issues the U.S. Preventive Services Task Force will consider in updating its breast cancer screening recommendations. Among those issues are questions about mammography sensitivity and specificity, and, of course, predictive value. But there are other issues to consider as well, as discussed in a recent op-ed piece in The New York Times by H. Gilbert Welch, a professor of medicine at the Dartmouth Institute for Health Policy and Clinical Practice and an author of “Overdiagnosed: Making People Sick in the Pursuit of Health.” I conclude with this description of the complexity of evaluating screening mammography made by Dr. Welch:
Among a thousand 50-year-old American women screened annually for a decade, 3.2 to 0.3 will avoid a breast cancer death, 490 to 670 will have at least one false alarm and 3 to 14 will be overdiagnosed and treated needlessly. That may help some women choose whether to be screened or not. But it’s still not very precise, and it doesn’t answer the fundamental question: Now that treatment is so much better, how much benefit does screening actually provide? What we need is a clinical trial in the current treatment era.
____________________________________________________________________________________________
- William M. London is a specialist in the study of health-related superstition, pseudoscience, sensationalism, schemes, scams, frauds, deception, and misperception, who likes to use the politically incorrect word: quackery. He is a professor in the Department of Public Health at California State University, Los Angeles; a co-author of the college textbook Consumer Health: A Guide to Intelligent Decisions (ninth edition copyright 2013); the associate editor (since 2002) of Consumer Health Digest, the free weekly e-newsletter of Quackwatch; one of two North American editors of the journal Focus on Alternative and Complementary Therapies; co-host of the Quackwatch network’s Credential Watch website; a consultant to the Committee for Skeptical Inquiry. He earned his doctorate & master’s in health education, master’s in educational psychology, baccalaureate in biological science, and baccalaureate in geography at the University at Buffalo (SUNY), and his master of public health degree from Loma Linda University. He successfully completed all required coursework toward a Master of Science in Clinical Research from Charles R. Drew University of Medicine and Science, but he has taken way too much time writing up his thesis project: an investigation of therapeutic claims and modalities promoted by chiropractors in the City of Los Angeles.