
6.3: Binomial Random Variables


    Section 1: Introduction to Binomial Random Variables

    Definition: Binomial experiment

A binomial experiment is an experiment that has exactly two outcomes.

    Example \(\PageIndex{1.1.1}\)

    Tossing a coin is a binomial experiment with two outcomes: Heads and Tails.

    However, rolling a die and observing an outcome is not a binomial experiment because there are six possible outcomes.

    Example \(\PageIndex{1.1.2}\)

Going to work and observing whether you arrive late is a binomial experiment with two outcomes: Late and On-Time.

However, asking a randomly selected person how many credit cards they have is not a binomial experiment because a person may have any number of cards: 0, 1, 2, 3, etc.

Why are binomial experiments so important? Because any experiment with any number of outcomes can be thought of as a binomial experiment: fix an event \(E\), and the two outcomes of the experiment are \(E\) and \((\text{not }E)\).

    Example \(\PageIndex{1.2}\)

Rolling a die can be thought of as a binomial experiment by considering a pair of complementary events: (Even) and (Odd), (Six) and (Other than Six), etc.

    Example \(\PageIndex{1.3}\)

Asking how many credit cards a randomly selected person has can be thought of as a binomial experiment by considering two complementary events: (None) and (At least one).

    Example \(\PageIndex{1.4}\)

    More examples of binomial experiments:

    • A randomly selected patient dies or survives;
    • A randomly selected store visitor makes a purchase or not;
    • A randomly selected passenger arrives on-time or late;
    • A randomly selected applicant has a master's degree or not;
    • A randomly selected student graduates in four years or more, etc.

For a binomial experiment we usually label one outcome a success and the other, its complement, a failure. The probability of success is labeled \(p\), and the probability of failure is labeled \(q\); by the complement rule, \(q\) can always be computed as \(q=1-p\). Note that a success may not be a literal success, but in any case the failure is the opposite, or complementary, event of the success.

    Example \(\PageIndex{1.5}\)

    For a randomly selected patient, we can label \(S\) the event in which the patient dies, then \(F\) is the event in which the patient survives.

    For a randomly selected store visitor, we can label \(S\) the event in which a purchase is made, then \(F\) is the event in which the purchase was not made.

    For a randomly selected passenger, we can label \(S\) the event in which the passenger arrives on-time, then \(F\) is the event in which the passenger is late.

If the same binomial experiment is performed as a sequence of \(n\) independent experiments, then the sequence is called \(n\) Bernoulli trials. The key word here is independent: the outcome of one instance of the experiment must not affect the outcome of any other.

    Example \(\PageIndex{1.6}\)

    If we consider a group of 10 patients and track whether they survive or not, we have 10 Bernoulli trials.

    If we consider a group of 100 store visitors and track whether they make a purchase or not, we have 100 Bernoulli trials.

    If we consider 200 passengers and track whether they show up on-time or late, we have 200 Bernoulli trials.

    If we consider 1000 students and track whether they graduate in four years or more, we have 1000 Bernoulli trials.

Next, we are going to establish the following two facts.

Fact #1: A binomial experiment can be simulated by a toss of an unfair coin for which \(S\) is the event in which the toss results in Heads and \(F\) is the event in which the toss results in Tails, and

    \(P(S)=P(H)=p\) and \(P(F)=P(T)=1-p=q\)

Fact #2: A sequence of \(n\) Bernoulli trials can be simulated by \(n\) tosses of an unfair coin, for which we have established the following formula:

    \(P(kH/n)=C_k^n p^k q^{n-k}\)

    for finding the probability of observing \(k\) heads among \(n\) tosses.

Next, let’s consider \(n\) Bernoulli trials of a binomial experiment with \(P(S)=p\) and \(P(F)=1-p=q\). Before performing the binomial experiment \(n\) times, it is impossible to know the number of successes, so the number of successes depends on chance. If we let \(X\) be the number of successes among the \(n\) Bernoulli trials, then \(X\) satisfies the definition of a discrete random variable. Why discrete? Because the possible number of successes is \(0, 1, 2, \dots\), any whole number up to \(n\). Just like for any discrete random variable, we can construct the probability distribution table for \(X\) by listing all of its possible values side-by-side with their probabilities:

\(X\)

\(k\)          \(P(X=k)\)
0              \(P(X=0)\)
1              \(P(X=1)\)
2              \(P(X=2)\)
\(\dots\)      \(\dots\)
\(n\)          \(P(X=n)\)

If a binomial experiment can be simulated by a toss of an unfair coin with \(P(S)=P(H)=p\) and \(P(F)=P(T)=1-p=q\), then \(P(X=k)=P(kH/n)=C_k^n p^k q^{n-k}\). The significance of the formula is that we can now compute the probability of each outcome by simply evaluating the formula for different values of \(k\). So the entire distribution table can be computed from just two values, \(n\) and \(p\).

\(X\)

\(k\)          \(P(X=k)=C_k^n p^k q^{n-k}\)
0              \(P(X=0)=C_0^n p^0 q^{n-0}\)
1              \(P(X=1)=C_1^n p^1 q^{n-1}\)
2              \(P(X=2)=C_2^n p^2 q^{n-2}\)
\(\dots\)      \(\dots\)
\(n\)          \(P(X=n)=C_n^n p^n q^{n-n}\)

    This variable \(X\) is called a Binomial Random Variable with parameters \(n\) and \(p\) and is denoted in the following way:

    \(X \sim B(n,p)\)

    Example \(\PageIndex{1.7}\)

For a binomial random variable with parameters \(n=3\) and \(p=0.6\), we construct the probability distribution table by replacing \(n\) with 3, \(p\) with 0.6, and \(k\) with every value from 0 to 3.

\(X\)

\(k\)          \(P(X=k)=C_k^3 p^k q^{3-k}\)
0              \(P(X=0)=C_0^3 0.6^0 0.4^{3-0}=0.064\)
1              \(P(X=1)=C_1^3 0.6^1 0.4^{3-1}=0.288\)
2              \(P(X=2)=C_2^3 0.6^2 0.4^{3-2}=0.432\)
3              \(P(X=3)=C_3^3 0.6^3 0.4^{3-3}=0.216\)

    The following relation can be observed between the shape of the histogram and the value of \(p\), one of the parameters of the variable.

[Three histograms of binomial distributions: Right Skewed, \(p=0.25\); Symmetric, \(p=0.5\); Left Skewed, \(p=0.75\)]

    Now we can use the probability distribution table and the histogram to find probabilities of a variety of events.

    Example \(\PageIndex{1.8}\)

Let’s find the probability that \(X\) from the previous example is greater than or equal to 2.

    Solution

To find the probability, we first identify the event in the probability distribution table and/or the histogram and then add the corresponding probabilities: \(P(X\geq2)=0.432+0.216=0.648\).

\(X\)

\(k\)          \(P(X=k)=C_k^3 p^k q^{3-k}\)
0              \(P(X=0)=C_0^3 0.6^0 0.4^{3-0}=0.064\)
1              \(P(X=1)=C_1^3 0.6^1 0.4^{3-1}=0.288\)
2              \(P(X=2)=C_2^3 0.6^2 0.4^{3-2}=0.432\)   ← in the event \(X\geq2\)
3              \(P(X=3)=C_3^3 0.6^3 0.4^{3-3}=0.216\)   ← in the event \(X\geq2\)

The good news is that one rarely has to compute these probabilities by hand.

Another task we can perform for a binomial random variable is computing its mean, variance, and standard deviation. We could compute these values using the same approach as previously developed; luckily for us, however, there is an easier way: the following formulas give the mean, variance, and standard deviation directly.

    For \(X \sim B(n, p)\):

    \(\mu_X=E[X]=np\)

    \(\sigma_X^2=VAR[X]=npq\)

    \(\sigma_X=SD[X]=\sqrt{npq}\)

    Example \(\PageIndex{1.9}\)

    In the above example,

    \(E[X]=np=3\cdot0.6=1.8\)

\(VAR[X]=npq=3\cdot0.6\cdot0.4=0.72\)

\(SD[X]=\sqrt{npq}=\sqrt{0.72}\approx0.85\)

    Interpretation: we expect 1.8 successes among 3 Bernoulli trials with \(p=0.6\).

    Section 2: Applications of Binomial Random Variables

What makes binomial random variables quite useful is the following fact: if a certain proportion of the population possesses a certain property and a sample of size less than 5% of the population is obtained, then the number of people in the sample that possess the property is a binomial random variable. Note that when the sample size is more than 5% of the population, the same holds only if the sample is obtained by sampling with replacement.

    Example \(\PageIndex{2.1}\)

    Given that 86% of the population have a HS diploma, find the probability that among a group of 10 applicants all have a HS diploma.

    Solution

To start the solution, we introduce a variable. Let \(X\) be the number of people out of 10 that have a HS diploma. Since the sample size of 10 is much less than 5% of the US population, \(X\) is a binomial random variable with parameters \(n=10\) and \(p=0.86\).

    \(X \sim B(n=10,p=0.86)\)

    The probability that among a group of 10 applicants all have a HS diploma can be found by using the formula for \(k=10\).

    \(P(X=10)=C_{10}^{10}0.86^{10}0.14^0=0.2213=22.13\%\)

    So the probability that all 10 applicants have a HS diploma is roughly 22%.

With enough effort we can create the probability distribution table for \(X\):

\(X\)

\(k\)          \(P(X=k)=C_{k}^{n}p^{k}q^{n-k}\)
0              \(P(X=0)=C_{0}^{10}0.86^{0}0.14^{10}=2.89255\cdot10^{-9}\)
1              \(P(X=1)=C_{1}^{10}0.86^{1}0.14^{9}=1.77685\cdot10^{-7}\)
2              \(P(X=2)=C_{2}^{10}0.86^{2}0.14^{8}=4.91172\cdot10^{-6}\)
3              \(P(X=3)=C_{3}^{10}0.86^{3}0.14^{7}=8.04587\cdot10^{-5}\)
4              \(P(X=4)=C_{4}^{10}0.86^{4}0.14^{6}=0.000864931\)
5              \(P(X=5)=C_{5}^{10}0.86^{5}0.14^{5}=0.006375775\)
6              \(P(X=6)=C_{6}^{10}0.86^{6}0.14^{4}=0.032637895\)
7              \(P(X=7)=C_{7}^{10}0.86^{7}0.14^{3}=0.114565673\)
8              \(P(X=8)=C_{8}^{10}0.86^{8}0.14^{2}=0.263910212\)
9              \(P(X=9)=C_{9}^{10}0.86^{9}0.14^{1}=0.360258384\)
10             \(P(X=10)=C_{10}^{10}0.86^{10}0.14^{0}=0.221301579\)

    And use the table to find the following probabilities:

    • the probability that more than half have a HS diploma, \(P(X>5)=0.9927\).
    • the probability that at most 8 have a HS diploma, \(P(X\leq8)=0.4184\).
    • the probability that between 4 and 7 have a HS diploma, \(P(4\leq{X}\leq7)=0.1544\).

    In addition to finding the probabilities we can also find and interpret the expected value and the standard deviation.

    \(\mu_X=np=10\cdot0.86=8.6\)

    \(\sigma_X=\sqrt{npq}=\sqrt{10\cdot0.86\cdot0.14}=1.1\)

    In this example, we expect 8 or 9 applicants out of 10 to have a HS diploma.

    Section 3: Overbooking

    Example \(\PageIndex{3.1}\)

    An airline sold all tickets on a 150-seat plane. Given that only 98% of the people show up for a flight, find the probability that 150 people will show up.

    Solution

    To start the solution, we introduce a variable. Let \(X\) be the number of people out of 150 that will show up, then

    \(X \sim B(n=150,p=0.98)\)

    The probability that all 150 people that purchased a ticket will show up for the flight can be expressed as

    \(P(X=150)=C_{150}^{150}0.98^{150}0.02^{0}=0.0483\approx4.8\%\)

    In addition to finding this probability we can also find and interpret the expected value which is 147:

    \(E[X]=np=150\cdot0.98=147\)

    In this context, we expect only 147 passengers to show up for the flight. That means that the airline can sell more than 150 tickets for a flight and since not everyone is going to show up, the airline can make more money by selling more tickets. This practice is called overbooking.

    Example \(\PageIndex{3.2}\)

    An airline sold an extra 3 tickets on a 150-seat plane. Given that only 98% of the people show up for a flight, find the probability that at most 150 people will show up. This is the probability of the event in which the airline will have a seat for everyone who shows up. This is the best possible outcome for the airline.

    Solution

    To start the solution, we again introduce a variable. Let \(Y\) be the number of people out of 153 that purchased a ticket and showed up for the flight. Then \(Y\) is a binomial random variable with parameters \(n=153\) and \(p=0.98\).

    \(Y \sim B(n=153,p=0.98)\)

The probability that at most 150 passengers will show up for the flight can be found by using technology.

\(P(Y\leq150)=0.5925=59.25\%\) (computed via technology)

    Thus, the probability that the airline will have a seat for everyone who shows up for the flight is roughly 60%.

So, what happens when too many passengers show up for a flight? Some of them will be offered compensation for their trouble. Every airline decides how many tickets to sell for each flight so as to maximize profit while keeping the probability of getting in trouble low!


    6.3: Binomial Random Variables is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
