Mathematics LibreTexts

4.4: Probabilities from Frequency Tables


    Section 1: Using Relative Frequency Tables

    In this section, we learn how to make probability statements using relative frequency distribution tables for univariate data. Assume that we have access to a large database with records of past students, and that we requested data on the number of years each student spent in college. We received the following frequency distribution table for a random sample of 100 students.

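    | Number of years | Frequency | Relative Frequency |
    |-----------------|-----------|--------------------|
    | 0-2             | 27        | 0.27               |
    | 2-4             | 42        | 0.42               |
    | 4-6             | 20        | 0.20               |
    | 6+              | 11        | 0.11               |
    | Total           | 100       | 1                  |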

    Raw frequencies cannot be interpreted as probabilities, so we convert them into relative frequencies. Intuitively, the relative frequencies can and should be interpreted as probabilities, but to find a probability we need a well-defined event, defined for some experiment with its sample space. So we must first answer: what are the experiment, the simple outcomes, and the sample space, and what kinds of events can we find probabilities for when such a frequency distribution table is given?

    First, let's define the experiment. In this case, the experiment consists of randomly selecting one of the 100 students in the sample. Based on the layout of the table, there are 4 simple outcomes: 0-2, 2-4, 4-6, and 6+. We label them \(Y_1\), \(Y_2\), \(Y_3\), and \(Y_4\); these 4 outcomes form the sample space of the experiment:

    \(\{Y_1, Y_2, Y_3, Y_4\}\)

    We describe the outcome \(Y_1\) as the outcome in which the randomly selected student spent between 0 and 2 years in college, and the outcome \(Y_2\) as the outcome in which the student spent between 2 and 4 years in college. The outcomes \(Y_3\) and \(Y_4\) are defined in the same way.

    Now that we have defined the simple outcomes, we can interpret the relative frequencies as probabilities. For example, 0.27 is \(P(Y_1)\), 0.42 is \(P(Y_2)\), and so on.

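    | Outcome   | Number of years | Frequency | Relative Frequency | Probability   |
    |-----------|-----------------|-----------|--------------------|---------------|
    | \(Y_1\)   | 0-2             | 27        | 0.27               | \(=P(Y_1)\)   |
    | \(Y_2\)   | 2-4             | 42        | 0.42               | \(=P(Y_2)\)   |
    | \(Y_3\)   | 4-6             | 20        | 0.20               | \(=P(Y_3)\)   |
    | \(Y_4\)   | 6+              | 11        | 0.11               | \(=P(Y_4)\)   |
    |           | Total           | 100       | 1                  |               |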
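The conversion from frequencies to probabilities can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the table; the variable names (`frequencies`, `probabilities`) are our own and are not part of the original text.

```python
# Frequencies from the table: number of students in each enrollment-length class.
frequencies = {"0-2": 27, "2-4": 42, "4-6": 20, "6+": 11}

# Total number of students in the sample.
total = sum(frequencies.values())

# Relative frequency = class frequency / total; we read it as P(Y_i).
probabilities = {label: count / total for label, count in frequencies.items()}

for i, (label, p) in enumerate(probabilities.items(), start=1):
    print(f"P(Y_{i}) = {p:.2f}  ({label} years)")

# The probabilities of all simple outcomes should add up to 1.
total_probability = sum(probabilities.values())
print(f"Sum of probabilities = {total_probability:.2f}")
```

The "Total" row is not an outcome of the experiment; it only confirms that the four probabilities add up to 1.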

    It is worth noting that the outcomes of this experiment are not equally likely.

    Now that we understand the experiment and the simple outcomes that form its sample space, we can start defining events.

    Let \(A\) be an event in which a randomly selected student spent 0-4 years in college. The event \(A\) is made of the two outcomes \(Y_1\) and \(Y_2\) so its probability is

    \(P(A)=0.27+0.42=0.69\)

    Let \(B\) be an event in which a randomly selected student spent 2-6 years in college. The event \(B\) is made of the two outcomes \(Y_2\) and \(Y_3\) so its probability is

    \(P(B)=0.42+0.20=0.62\)

    Let \(C\) be the complement of \(A\). The event \(C\) is made of the two outcomes \(Y_3\) and \(Y_4\) so its probability is

    \(P(C)=0.20+0.11=0.31\)

    We can also obtain the same result by using the complementary rule: \(P(C)=1-P(A)=1-0.69=0.31\).

    Let \(D\) be the intersection of \(A\) and \(B\). The event \(D\) is made of only one outcome \(Y_2\) so its probability is

    \(P(D)=0.42\)

    Let \(E\) be the union of \(A\) and \(B\). The event \(E\) is made of three outcomes \(Y_1\), \(Y_2\), and \(Y_3\) so its probability is

    \(P(E)=0.27+0.42+0.20=0.89\)

    We can also obtain the same result by using the general addition rule: \(P(E)=P(A)+P(B)-P(D)=0.69+0.62-0.42=0.89\).
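The events \(A\) through \(E\) can also be checked with a short Python sketch that treats each event as a set of simple outcomes. The helper `prob` and the string labels are illustrative choices of ours, not notation from the text.

```python
# Probabilities of the simple outcomes, taken from the relative frequency table.
P = {"Y1": 0.27, "Y2": 0.42, "Y3": 0.20, "Y4": 0.11}
sample_space = set(P)


def prob(event):
    """Probability of an event: the sum of the probabilities of its outcomes."""
    return sum(P[outcome] for outcome in event)


A = {"Y1", "Y2"}       # student spent 0-4 years in college
B = {"Y2", "Y3"}       # student spent 2-6 years in college
C = sample_space - A   # complement of A
D = A & B              # intersection of A and B
E = A | B              # union of A and B

# Complementary rule: P(C) = 1 - P(A)
p_C_rule = 1 - prob(A)

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_E_rule = prob(A) + prob(B) - prob(D)

print(f"P(C) directly = {prob(C):.2f}, by complementary rule = {p_C_rule:.2f}")
print(f"P(E) directly = {prob(E):.2f}, by addition rule     = {p_E_rule:.2f}")
```

Summing the outcomes directly and applying the rules give the same values, which is a useful consistency check whenever an event is built from other events.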

    We have discussed how to use frequency distribution tables to make probability statements about events and compound events that are properly defined in the context of the given table.

    Section 2: Contingency Tables

    In this section, we learn how to make probability statements using contingency tables for bivariate data. As before, assume that we have access to a large database with records of past students. Suppose that, along with the enrollment length, we also requested whether each student dropped out, transferred, or graduated, and received the following contingency table for a random sample of 100 students.

     

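    |       | Dropped | Transferred | Graduated | Total |
    |-------|---------|-------------|-----------|-------|
    | 0-2   | 8       | 16          | 3         | 27    |
    | 2-4   | 14      | 13          | 15        | 42    |
    | 4-6   | 10      | 4           | 6         | 20    |
    | 6+    | 8       | 2           | 1         | 11    |
    | Total | 40      | 35          | 25        | 100   |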

    Frequencies cannot be interpreted as probabilities, so we need to convert them into relative frequencies. With the row and column totals in place, we divide every entry by the grand total (100) to obtain the relative frequency contingency table.

     

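    |       | Dropped | Transferred | Graduated | Total |
    |-------|---------|-------------|-----------|-------|
    | 0-2   | 0.08    | 0.16        | 0.03      | 0.27  |
    | 2-4   | 0.14    | 0.13        | 0.15      | 0.42  |
    | 4-6   | 0.10    | 0.04        | 0.06      | 0.20  |
    | 6+    | 0.08    | 0.02        | 0.01      | 0.11  |
    | Total | 0.40    | 0.35        | 0.25      | 1.00  |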
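As a rough sketch, the same conversion can be done in Python: compute the row and column totals from the counts, then divide everything by the grand total. The list-of-lists layout and the names `counts`, `relative`, `row_totals`, and `col_totals` are assumptions made for this illustration.

```python
rows = ["0-2", "2-4", "4-6", "6+"]
cols = ["Dropped", "Transferred", "Graduated"]

# Counts from the contingency table: one inner list per row (enrollment length).
counts = [
    [8, 16, 3],
    [14, 13, 15],
    [10, 4, 6],
    [8, 2, 1],
]

# Grand total: the number of students in the sample.
grand_total = sum(sum(row) for row in counts)

# Divide every cell by the grand total to get the relative frequencies.
relative = [[cell / grand_total for cell in row] for row in counts]

# Row and column totals, also expressed as relative frequencies.
row_totals = [sum(row) / grand_total for row in counts]
col_totals = [sum(col) / grand_total for col in zip(*counts)]

for label, row, total in zip(rows, relative, row_totals):
    cells = "  ".join(f"{cell:.2f}" for cell in row)
    print(f"{label:>5}  {cells}  | {total:.2f}")
print("Total  " + "  ".join(f"{t:.2f}" for t in col_totals))
```

Working from the integer counts and dividing once at the end keeps rounding error out of the totals.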

    Now, to find a probability, we need a well-defined event, defined for some experiment with its sample space. So we must first answer: what are the experiment, the simple outcomes, and the sample space, and what kinds of events can we find probabilities for when such a contingency table is given?

    First, let's define the experiment. As before, the experiment consists of randomly selecting one of the 100 students in the sample. For bivariate data, we will not start by defining the simple outcomes; instead, we highlight a number of special events. There are 4 special events, one for each row of the table, which we label \(Y_1\), \(Y_2\), \(Y_3\), and \(Y_4\). There are also 3 special events, one for each column of the table, which we label \(R_1\), \(R_2\), and \(R_3\). We describe the event \(Y_1\) as the event in which the randomly selected student spent between 0 and 2 years in college, and the event \(R_1\) as the event in which the randomly selected student dropped out of college. The other special events are defined in the same way.

     

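    |               | Dropped (\(R_1\)) | Transferred (\(R_2\)) | Graduated (\(R_3\)) | Total |
    |---------------|-------------------|-----------------------|---------------------|-------|
    | 0-2 (\(Y_1\)) | 0.08              | 0.16                  | 0.03                | 0.27  |
    | 2-4 (\(Y_2\)) | 0.14              | 0.13                  | 0.15                | 0.42  |
    | 4-6 (\(Y_3\)) | 0.10              | 0.04                  | 0.06                | 0.20  |
    | 6+ (\(Y_4\))  | 0.08              | 0.02                  | 0.01                | 0.11  |
    | Total         | 0.40              | 0.35                  | 0.25                | 1.00  |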

    The relative frequencies in the right-most column and the bottom row are called marginal probabilities, as they can be interpreted directly as the probabilities of the 7 special events. For example, the probability of \(Y_3\) is 0.20 and the probability of \(R_2\) is 0.35.

    The cells inside the table are the simple outcomes, and together they form the sample space. The cell where the row of \(Y_3\) and the column of \(R_2\) meet represents the intersection of the two events, and the number in that cell, 0.04, is the probability of the intersection. Similarly, the probability of \(Y_4\) and \(R_1\) is 0.08, the probability of \(Y_1\) and \(R_3\) is 0.03, and the probability of \(Y_3\) and \(R_1\) is 0.10. The relative frequencies inside the contingency table are called joint probabilities.

    All the cells in the row of \(Y_3\) together with all the cells in the column of \(R_2\) represent the union of the two events. To find its probability, we can add the numbers in that region one by one, or we can use the general addition rule, which requires 2 marginal probabilities and 1 joint probability. We are more interested in applying the rules, so let's find the probability of \(Y_2\) or \(R_1\) using the general addition rule:

    \(P(Y_2\text{ or }R_1)=P(Y_2)+P(R_1)-P(Y_2\text{ and }R_1)=0.42+0.40-0.14=0.68\)
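The two ways of finding the probability of a union can be compared in a minimal Python sketch. The dictionary `joint`, keyed by (row event, column event) pairs, is a representation we chose for illustration; the numbers are the joint probabilities from the table.

```python
# Joint probabilities from the relative frequency contingency table.
joint = {
    ("Y1", "R1"): 0.08, ("Y1", "R2"): 0.16, ("Y1", "R3"): 0.03,
    ("Y2", "R1"): 0.14, ("Y2", "R2"): 0.13, ("Y2", "R3"): 0.15,
    ("Y3", "R1"): 0.10, ("Y3", "R2"): 0.04, ("Y3", "R3"): 0.06,
    ("Y4", "R1"): 0.08, ("Y4", "R2"): 0.02, ("Y4", "R3"): 0.01,
}

# Marginal probabilities: sum the joint probabilities along a row or a column.
p_Y2 = sum(p for (y, r), p in joint.items() if y == "Y2")
p_R1 = sum(p for (y, r), p in joint.items() if r == "R1")

# Joint probability of the intersection: a single cell of the table.
p_Y2_and_R1 = joint[("Y2", "R1")]

# General addition rule: 2 marginal probabilities and 1 joint probability.
p_or_rule = p_Y2 + p_R1 - p_Y2_and_R1

# Direct approach: add every cell that lies in row Y2 or in column R1.
p_or_direct = sum(p for (y, r), p in joint.items() if y == "Y2" or r == "R1")

print(f"Addition rule: {p_or_rule:.2f}")
print(f"Cell by cell:  {p_or_direct:.2f}")
```

The subtraction in the addition rule removes the cell (\(Y_2\), \(R_1\)), which would otherwise be counted twice: once in the row total and once in the column total.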

    We have discussed how to use contingency tables to make probability statements about events and compound events that are properly defined in the context of the given contingency table.


    4.4: Probabilities from Frequency Tables is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
