Mathematics LibreTexts

4.4: Two-Way Tables


    4.4 Learning Objectives

    • Use two-way tables to find conditional probabilities

    In this section we concentrate on the more complex conditional probability problems we began to look at in the last section.

    Example 1

    Suppose a certain disease has an incidence rate of 0.1% (that is, it afflicts 0.1% of the population). A test has been devised to detect this disease. The test does not produce false negatives (that is, anyone who has the disease will test positive for it), but the false positive rate is 5% (that is, about 5% of people who do not have the disease will test positive anyway). Suppose a randomly selected person takes the test and tests positive. What is the probability that this person actually has the disease?

    Solution

    Let's break down the information in the problem piece by piece, using a two-way table to record our results.

    Suppose a certain disease has an incidence rate of 0.1% (that is, it afflicts 0.1% of the population). The percentage 0.1% can be converted to a decimal number by moving the decimal point two places to the left, to get 0.001. In turn, 0.001 can be rewritten as a fraction: 1/1000. This tells us that about 1 in every 1000 people has the disease. (If we wanted, we could write \(P\)(disease) = 0.001.)

    A test has been devised to detect this disease. The test does not produce false negatives (that is, anyone who has the disease will test positive for it). This part is fairly straightforward: everyone who has the disease will test positive, or alternatively everyone who tests negative does not have the disease. (We could also say \(P\)(positive | disease) = 1.)

    The false positive rate is 5% (that is, about 5% of people who do not have the disease will test positive anyway). Put another way, of every 100 people who are tested and do not have the disease, about 5 will test positive. (We could also say that \(P\)(positive | no disease) = 0.05.)

    Suppose a randomly selected person takes the test and tests positive. What is the probability that this person actually has the disease? Here we want to compute \(P\)(disease | positive). We already know that \(P\)(positive | disease) = 1, but remember that two conditional probabilities are generally not equal when the event and the condition are switched.

    Rather than thinking in terms of all these probabilities we have developed, let's create a hypothetical situation and apply the facts as set out above. First, suppose we randomly select 1000 people and administer the test. How many do we expect to have the disease? Since about 1/1000 of all people are afflicted with the disease, \(\frac{1}{1000}\) of 1000 people is 1. (Now you know why we chose 1000.) Only 1 of 1000 test subjects actually has the disease; the other 999 do not.

    \(\begin{array}{|l|l|l|l|}
    \hline & \begin{array}{l}
    \text { Has } \\
    \text { Disease}
    \end{array} & \begin{array}{l}
    \text { Does Not } \\
    \text { Have Disease}
    \end{array} & \text { Total } \\
    \hline \text { Tests Positive} &   &   &  \\
    \hline \text { Tests Negative} &  &   &   \\
    \hline \text { Total } & 1 & 999 & 1000\\
    \hline
    \end{array}\)

    We also know that 5% of all people who do not have the disease will test positive. There are 999 disease-free people, so we would expect \((0.05)(999)=49.95\) (so, about 50) people to test positive who do not have the disease.

    \(\begin{array}{|l|l|l|l|}
    \hline & \begin{array}{l}
    \text { Has } \\
    \text { Disease}
    \end{array} & \begin{array}{l}
    \text { Does Not } \\
    \text { Have Disease}
    \end{array} & \text { Total } \\
    \hline \text { Tests Positive} & 1 & 50 & 51 \\
    \hline \text { Tests Negative} & 0 & 949 & 949 \\
    \hline \text { Total } & 1 & 999 & 1000\\
    \hline
    \end{array}\)
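The table above can also be filled in mechanically. The following Python sketch is only an illustration: the function name `two_way_table` and its parameter names are made up for this example, and it uses exact fractions so that the unrounded cell values (such as 49.95) are visible.

```python
from fractions import Fraction

def two_way_table(population, incidence, false_negative_rate, false_positive_rate):
    """Expected count in each cell of the two-way table, without rounding.

    All names here are hypothetical; they are chosen only for this sketch.
    """
    sick = population * incidence
    healthy = population - sick
    sick_positive = sick * (1 - false_negative_rate)
    healthy_positive = healthy * false_positive_rate
    return {
        "positive": {"disease": sick_positive,
                     "no_disease": healthy_positive,
                     "total": sick_positive + healthy_positive},
        "negative": {"disease": sick - sick_positive,
                     "no_disease": healthy - healthy_positive,
                     "total": (sick - sick_positive) + (healthy - healthy_positive)},
        "total": {"disease": sick, "no_disease": healthy, "total": population},
    }

# Example 1: 1000 people, 0.1% incidence, no false negatives, 5% false positives.
table = two_way_table(1000, Fraction("0.001"), Fraction(0), Fraction("0.05"))
for row, cells in table.items():
    print(row, {name: float(value) for name, value in cells.items()})
```

The unrounded entries (49.95 false positives and 949.05 true negatives) round to the 50 and 949 used in the table.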

    Now back to the original question, computing \(P\)(disease | positive). There are 51 people who test positive in our example (the one unfortunate person who actually has the disease, plus the 50 people who tested positive but don't). Only one of these people has the disease, so

    P(disease | positive) \(\approx \frac{1}{51} \approx 0.0196\)

    or less than 2%. This means that of all people who test positive, over 98% do not have the disease. Does this surprise you?

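    The answer we got is an approximation, since we rounded 49.95 to 50. Using the unrounded value gives \(\frac{1}{1+49.95}=\frac{1}{50.95} \approx 0.0196\), which agrees with our answer to four decimal places.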
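The effect of rounding 49.95 to 50 can be checked directly. The sketch below works with probabilities instead of a hypothetical group of people; the function name `prob_disease_given_positive` and its parameters are assumptions made for this illustration, not notation from the text.

```python
from fractions import Fraction

def prob_disease_given_positive(incidence, false_negative_rate, false_positive_rate):
    """P(disease | positive) as (true positives) / (all positives)."""
    true_positive = incidence * (1 - false_negative_rate)
    false_positive = (1 - incidence) * false_positive_rate
    return true_positive / (true_positive + false_positive)

# Example 1: 0.1% incidence, no false negatives, 5% false positive rate.
exact = prob_disease_given_positive(Fraction("0.001"), Fraction(0), Fraction("0.05"))
rounded = Fraction(1, 51)  # the answer obtained after rounding 49.95 to 50

print("exact:  ", exact, "=", float(exact))
print("rounded:", rounded, "=", float(rounded))
```

The exact value is 20/1019, and both values are a little under 2%, so the rounding does not change the conclusion.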

    But back to the surprising result: of all people who test positive, over 98% do not have the disease. If your guess for the probability that a person who tests positive has the disease was wildly different from the right answer (about 2%), don't feel bad. The exact same problem was posed to doctors and medical students at the Harvard Medical School, and the results were reported in a 1978 New England Journal of Medicine article. Only about 18% of the participants got the right answer. Most of the rest thought the answer was closer to 95% (perhaps they were misled by the false positive rate of 5%).

    The significance of this finding, and of similar results from other studies in the years since, lies in the possibly catastrophic consequences it might have for patient care. If a doctor thinks that a positive test result nearly guarantees that a patient has a disease, they might begin an unnecessary and possibly harmful treatment regimen on a healthy patient. Or worse, as in the early days of the AIDS crisis when being HIV-positive was often equated with a death sentence, the patient might take a drastic action and commit suicide.

    As we have seen in this hypothetical example, the most responsible course of action for treating a patient who tests positive would be to counsel the patient that they most likely do not have the disease and to order further, more reliable, tests to verify the diagnosis.

    Example 2

    A certain disease has an incidence rate of 2%. If the false negative rate is 10% and the false positive rate is 1%, compute the probability that a person who tests positive actually has the disease.

    Solution

    Imagine 10,000 people who are tested. Of these 10,000, 200 will have the disease, since the disease has an incidence rate of 2%. The other 9,800 will not have it.

    \(\begin{array}{|l|l|l|l|}
    \hline & \begin{array}{l}
    \text { Has } \\
    \text { Disease}
    \end{array} & \begin{array}{l}
    \text { Does Not } \\
    \text { Have Disease}
    \end{array} & \text { Total } \\
    \hline \text { Tests Positive} &   &   &  \\
    \hline \text { Tests Negative} &  &   &   \\
    \hline \text { Total } & 200 & 9800 & 10000\\
    \hline
    \end{array}\)

    Of those who have the disease, 10% of them, or 20, will test negative and the remaining 180 will test positive. Of the 9800 who do not have the disease, 1% of them, or 98, will test positive. So of the 278 total people who test positive, 180 will have the disease.

    \(\begin{array}{|l|l|l|l|}
    \hline & \begin{array}{l}
    \text { Has } \\
    \text { Disease}
    \end{array} & \begin{array}{l}
    \text { Does Not } \\
    \text { Have Disease}
    \end{array} & \text { Total } \\
    \hline \text { Tests Positive} & 180  &  98 & 278 \\
    \hline \text { Tests Negative} & 20 & 9702 & 9722 \\
    \hline \text { Total } & 200 & 9800 & 10000\\
    \hline
    \end{array}\)

     

    Thus, \(P(\text { disease } | \text { positive })=\frac{180}{278} \approx 0.647\).

    So, about 65% of the people who test positive will have the disease.
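One way to build confidence in this result is to simulate the testing process. The Python sketch below is an illustration only: the function `simulate`, its parameters, the number of trials, and the random seed are all choices made for this example. It draws a large number of imaginary people, decides at random who has the disease and who tests positive using the rates from Example 2, and reports the fraction of positive testers who actually have the disease.

```python
import random

def simulate(trials, incidence, false_negative_rate, false_positive_rate, seed=0):
    """Estimate P(disease | positive) by simulating `trials` tested people."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    positives = 0
    sick_positives = 0
    for _ in range(trials):
        sick = rng.random() < incidence
        if sick:
            # A sick person tests positive unless the test gives a false negative.
            positive = rng.random() >= false_negative_rate
        else:
            # A healthy person tests positive only through a false positive.
            positive = rng.random() < false_positive_rate
        if positive:
            positives += 1
            if sick:
                sick_positives += 1
    return sick_positives / positives

# Example 2: 2% incidence, 10% false negative rate, 1% false positive rate.
estimate = simulate(200_000, 0.02, 0.10, 0.01)
print("simulated:", round(estimate, 3))
print("two-way table:", round(180 / 278, 3))
```

Because the simulation is random, the estimate will not match 180/278 exactly, but with this many trials it should land close to it.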

    Try it Now 1

    A certain disease has an incidence rate of 0.5%. If there are no false negatives and if the false positive rate is 3%, compute the probability that a person who tests positive actually has the disease.

    Answer

    Out of 100,000 people, 500 would have the disease, since the disease has an incidence rate of 0.5%.

    \(\begin{array}{|l|l|l|l|}
    \hline & \begin{array}{l}
    \text { Has } \\
    \text { Disease}
    \end{array} & \begin{array}{l}
    \text { Does Not } \\
    \text { Have Disease}
    \end{array} & \text { Total } \\
    \hline \text { Tests Positive} &   &   &  \\
    \hline \text { Tests Negative} &  &   &   \\
    \hline \text { Total } & 500 & 99500 & 100000\\
    \hline
    \end{array}\)

    Of those, all 500 would test positive, since there are no false negatives. Of the 99,500 without the disease, 3%, or 2,985, would falsely test positive, and the other 96,515 would test negative.

    \(\begin{array}{|l|l|l|l|}
    \hline & \begin{array}{l}
    \text { Has } \\
    \text { Disease}
    \end{array} & \begin{array}{l}
    \text { Does Not } \\
    \text { Have Disease}
    \end{array} & \text { Total } \\
    \hline \text { Tests Positive} &  500 & 2985  &3485  \\
    \hline \text { Tests Negative} & 0 &   96515&  96515 \\
    \hline \text { Total } & 500 & 99500 & 100000\\
    \hline
    \end{array}\)

    \(\mathrm{P}(\text { disease } | \text { positive })=\frac{500}{500+2985}=\frac{500}{3485} \approx 14.3 \%\)
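The choice of 100,000 people is a convenience that keeps the table entries whole numbers; the final probability does not depend on it. The Python sketch below repeats the calculation for several group sizes. The function name `positive_counts` and the particular group sizes are assumptions made for this illustration.

```python
from fractions import Fraction

def positive_counts(population, incidence, false_positive_rate):
    """Return (true positives, false positives), assuming no false negatives."""
    sick = population * incidence
    false_positives = (population - sick) * false_positive_rate
    return sick, false_positives

incidence = Fraction("0.005")           # 0.5% incidence rate
false_positive_rate = Fraction("0.03")  # 3% false positive rate

results = {}
for population in (1_000, 100_000, 10_000_000):
    true_pos, false_pos = positive_counts(population, incidence, false_positive_rate)
    results[population] = true_pos / (true_pos + false_pos)
    print(population, float(true_pos), float(false_pos), float(results[population]))
```

Every group size gives the same ratio, 100/697, or about 14.3%. Smaller groups simply produce fractional table entries, such as 29.85 false positives for 1,000 people.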


    This page titled 4.4: Two-Way Tables is shared under a CC BY-SA license and was authored, remixed, and/or curated by Leah Griffith, Veronica Holbrook, Johnny Johnson & Nancy Garcia.