
8.4: Approximating Binomial Random Variables


    There is roughly an \(80\%\) chance that a person of age 20 will be alive at age 65. Suppose that ten people of age 20 are selected at random among those who purchased life insurance. The insurance company wants to know the probability that the number of policyholders who are alive at age 65 is

    1. exactly six.
    2. at most four.
    3. at least eight.

    Let \(X\) represent the number of people alive at age 65 among the ten people that were selected, then

    \(X \sim B(n=10, p=0.8)\)

    and the following probability distribution table can be constructed using the formula

    \(P(X=k)=C_k^{10}\cdot0.8^k\cdot0.2^{10-k}\)

    \(x_i\)      \(P(X=x_i)\)
    0            1.024E-07
    1            4.096E-06
    2            7.3728E-05
    3            0.000786432
    4            0.005505024
    5            0.026424115
    6            0.088080384
    7            0.201326592
    8            0.301989888
    9            0.268435456
    10           0.107374182

    Now, using the probability distribution table we can answer the questions:

    1. \(P(X=6)=0.0881\)
    2. \(P(X\leq4)=0.0064\)
    3. \(P(X\geq8)=0.6778\)
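    The three answers above can be checked with a short sketch using only Python's standard library (`math.comb`; the variable names are my own):

```python
from math import comb

# Binomial pmf for X ~ B(n=10, p=0.8)
def binom_pmf(k, n=10, p=0.8):
    return comb(n, k) * p**k * (1 - p)**(n - k)

p6 = binom_pmf(6)                                       # exactly six
p_at_most_4 = sum(binom_pmf(k) for k in range(5))       # at most four: k = 0..4
p_at_least_8 = sum(binom_pmf(k) for k in range(8, 11))  # at least eight: k = 8, 9, 10

print(round(p6, 4), round(p_at_most_4, 4), round(p_at_least_8, 4))  # → 0.0881 0.0064 0.6778
```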

    For most real-world problems, the number of people under investigation is much larger than ten. Consider the same problem but let’s suppose that \(100\) people of age 20 years are selected at random. The insurance company may want to know the probability that

    1. exactly seventy-five of them will be alive at age 65.
    2. at least eighty-three of them will be alive at age 65.

    Let \(X\) represent the number of people alive at age 65 among the hundred people that were selected, then

    \(X \sim B(n=100, p=0.8)\)

    and the following probability distribution table can be constructed using the formula

    \(P(X=k)=C_k^{100}\cdot0.8^k\cdot0.2^{100-k}\)

    \(x_i\)         \(P(X=x_i)\)
    0               \(C_0^{100}\cdot0.8^0\cdot0.2^{100}\)
    1               \(C_1^{100}\cdot0.8^1\cdot0.2^{99}\)
    2               \(C_2^{100}\cdot0.8^2\cdot0.2^{98}\)
    3               \(C_3^{100}\cdot0.8^3\cdot0.2^{97}\)
    \(\vdots\)      \(\vdots\)
    75              \(C_{75}^{100}\cdot0.8^{75}\cdot0.2^{25}\)
    \(\vdots\)      \(\vdots\)
    83              \(C_{83}^{100}\cdot0.8^{83}\cdot0.2^{17}\)
    \(\vdots\)      \(\vdots\)
    99              \(C_{99}^{100}\cdot0.8^{99}\cdot0.2^{1}\)
    100             \(C_{100}^{100}\cdot0.8^{100}\cdot0.2^{0}\)

    Now, using the probability distribution table we can answer the questions:

    1. \(P(X=75)=C_{75}^{100}\cdot0.8^{75}\cdot0.2^{25}\)
    2. \(P(X\geq83)=P(X=83)+P(X=84)+\cdots+P(X=100)=C_{83}^{100}\cdot0.8^{83}\cdot0.2^{17}+C_{84}^{100}\cdot0.8^{84}\cdot0.2^{16}+\cdots+C_{100}^{100}\cdot0.8^{100}\cdot0.2^{0}\)

    However, we quickly run into a problem! Finding the probability that \(X=75\) requires enough computational power to handle \(\frac{100!}{75!\,25!}\), and computing the probability that \(X\geq83\) requires summing eighteen such terms. So we have an interesting situation: we know how to compute the answer, yet cannot compute it without a powerful calculator.
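    (As an aside: software with arbitrary-precision integers handles these quantities directly. A minimal Python sketch, assuming nothing beyond the standard library:)

```python
from math import comb

n, p = 100, 0.8
# comb(100, 75) is a 24-digit integer; Python computes it exactly.
p75 = comb(n, 75) * p**75 * (1 - p)**25
p_ge_83 = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(83, n + 1))
print(round(p75, 4), round(p_ge_83, 4))
```

    The exact values (roughly 0.044 and 0.26) can later be compared with the normal approximations derived below.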

    Is there another way to find the desired probability without a powerful calculator? The short answer is yes.

    Let’s scale back to the problem with the sample of size 10 and consider the probability histogram alongside the probability distribution table.

    [Figure: probability distribution table and probability histogram for \(X \sim B(n=10, p=0.8)\)]

    Recall that the probability histogram serves exactly the same purpose as the probability distribution table: we can read the probabilities from the histogram itself. For example, the bar that corresponds to \(X=6\) has area equal to \(P(X=6)\). Likewise, finding \(P(X\geq8)\) amounts to computing the total area of the bars at 8, 9, and 10, and finding \(P(X\leq4)\) amounts to computing the total area of the bars at 0, 1, 2, 3, and 4, which is virtually zero. This principle holds for any discrete random variable of any shape, and in particular for a binomial random variable with any parameters \(n\) and \(p\).

    Does the overall shape of the histogram remind you of anything? Yes, it does look like a normal probability density curve!

    [Figure: probability histogram of \(X\) with an overlaid normal probability density curve]

    But which values of \(\mu\) and \(\sigma\) give us the best approximation? Recall that a binomial random variable with parameters \(n\) and \(p\) has mean and standard deviation given by the formulas:

    \(\mu_X=np\)

    \(\sigma_X=\sqrt{np(1-p)}\)

    Now it is not surprising that a normal variable \(Y\) with the same \(\mu\) and \(\sigma\) as the binomial random variable \(X\) has a probability density curve with the same shape as the histogram of \(X\); in other words, the probability density curve of \(Y\) can be used to approximate areas in the histogram of \(X\)!

    In our case, we have a binomial random variable \(X\) with parameters \(n=10\) and \(p=0.8\) whose mean is \(\mu_X=np=10\cdot0.8=8\) and standard deviation is \(\sigma_X=\sqrt{np(1-p)}=\sqrt{10\cdot0.8\cdot0.2}=\sqrt{1.6}\approx1.26\). Therefore, if we let \(Y\) be a normal random variable with parameters \(\mu_Y=\mu_X=8\) and \(\sigma_Y=\sigma_X=1.26\), then \(Y\) can be used to approximate \(X\).
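    The arithmetic above can be confirmed in a couple of lines of Python:

```python
from math import sqrt

n, p = 10, 0.8
mu = n * p                     # mean of B(10, 0.8)
sigma = sqrt(n * p * (1 - p))  # standard deviation: sqrt(1.6)
print(mu, round(sigma, 2))     # → 8.0 1.26
```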

    So now that we know the parameters of the normal variable that approximates the original binomial variable \(X\), we can focus on using it to answer the original questions about \(X\)!

    While it is natural to approximate \(P(X\geq8)\) with \(P(Y>8)\), doing so ignores the area of the left half of the bar at \(X=8\). A better estimate is obtained by finding the area to the right of \(7.5\), which gives \(P(X\geq8) \approx P(Y>7.5)=0.6543\) and is very close to the actual probability that we found earlier.

    Definition: Correction for continuity

    The adjustment of adding or subtracting ½ at the boundary is called the correction for continuity and must be done to obtain the best estimate!

    Similarly, while it is natural to approximate \(P(X=6)\) with \(P(Y=6)\), doing so ignores the fact that \(P(Y=6)=0\) for a continuous variable. A better estimate is obtained by finding the area between \(5.5\) and \(6.5\), which gives \(P(X=6) \approx P(5.5<Y<6.5)=0.0933\) and is very close to the actual probability that we found earlier.

    In summary, for \(X \sim B(n, p)\) and \(Y \sim N(np, \sqrt{np(1-p)})\), the following approximations can be made:

    \(P(X=c) \approx P(c-\frac{1}{2}<Y<c+\frac{1}{2})\)

    \(P(X\leq{b}) \approx P(Y<b+\frac{1}{2})\)

    \(P(X<b) \approx P(Y<b-\frac{1}{2})\)

    \(P(X>a) \approx P(Y>a+\frac{1}{2})\)

    \(P(X\geq{a}) \approx P(Y>a-\frac{1}{2})\)

    \(P(a\leq X\leq b) \approx P(a-\frac{1}{2}<Y<b+\frac{1}{2})\)

    \(P(a<X\leq b) \approx P(a+\frac{1}{2}<Y<b+\frac{1}{2})\)

    \(P(a\leq X<b) \approx P(a-\frac{1}{2}<Y<b-\frac{1}{2})\)

    \(P(a<X<b) \approx P(a+\frac{1}{2}<Y<b-\frac{1}{2})\)

    Again, the adjustment of adding or subtracting ½ is called the correction for continuity.

    Note that the approximation becomes less accurate for skewed binomial distributions, so it should be used only when both \(np\geq5\) and \(n(1-p)\geq5\).
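    The summary rules can be wrapped in a small helper function. This is a sketch, not a library routine: the names `norm_cdf` and `binom_normal_approx` are my own, and the standard normal CDF is computed via `math.erf`. (The text's values \(0.0933\) and \(0.6543\) used \(\sigma\) rounded to \(1.26\), so the unrounded results here differ slightly in the fourth decimal.)

```python
from math import erf, sqrt

def norm_cdf(x, mu, sigma):
    # P(Y < x) for Y ~ N(mu, sigma), via the error function
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def binom_normal_approx(n, p, a=None, b=None):
    """Continuity-corrected estimate of P(a <= X <= b) for X ~ B(n, p).
    Leave a=None for P(X <= b); leave b=None for P(X >= a)."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    lo = norm_cdf(a - 0.5, mu, sigma) if a is not None else 0.0
    hi = norm_cdf(b + 0.5, mu, sigma) if b is not None else 1.0
    return hi - lo

approx_eq6 = binom_normal_approx(10, 0.8, a=6, b=6)  # estimates P(X = 6)
approx_ge8 = binom_normal_approx(10, 0.8, a=8)       # estimates P(X >= 8)
```

    Both estimates land close to the exact table values \(0.0881\) and \(0.6778\) found earlier.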

    Let’s go back to our original problem with 100 people instead of 10.

    Example \(\PageIndex{1}\)

    Suppose that \(100\) people of age 20 years are selected at random. Find the probability that

    1. exactly 75 of them will be alive at age 65.
    2. at least 83 of them will be alive at age 65.
    Solution

    Let \(X\) represent the number of people alive at age 65 among the hundred people that were selected, then

    \(X \sim B(n=100, p=0.8)\)

    and

    \(\mu_X=np=100\cdot0.8=80\) and \(\sigma_X=\sqrt{np(1-p)}=\sqrt{100\cdot0.8\cdot0.2}=4\)

    Let’s introduce a new random variable

    \(Y \sim N(\mu_Y=\mu_X=80, \sigma_Y=\sigma_X=4)\)

    Now, using the normal random variable \(Y\) we can answer the questions:

    1. \(P(X=75) \approx P(74.5<Y<75.5)=0.0457\)
    2. \(P(X\geq83) \approx P(Y>82.5)=0.2660\)

    These approximations are expected to be very close to the true values because both \(np=80\) and \(n(1-p)=20\) are greater than \(5\).
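    To double-check the example's arithmetic, here is a short sketch using Python's `math.erf` (the function name `norm_cdf` is my own):

```python
from math import erf, sqrt

def norm_cdf(x, mu=80.0, sigma=4.0):
    # P(Y < x) for Y ~ N(80, 4)
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

p_exactly_75 = norm_cdf(75.5) - norm_cdf(74.5)  # continuity-corrected P(X = 75)
p_at_least_83 = 1 - norm_cdf(82.5)              # continuity-corrected P(X >= 83)
print(round(p_exactly_75, 4), round(p_at_least_83, 4))  # → 0.0457 0.266
```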

    Conclusion: we discussed how to use normal distributions to approximate probabilities for binomial distributions. While this result is less significant now than it was before technology became accessible, it still highlights the importance of the normal distribution in nature and applications!


    8.4: Approximating Binomial Random Variables is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
