9.4: The Geometric and Negative Binomial Random Variables
In this section, we discuss two more important random variables: the Geometric random variable and the Negative Binomial random variable. As usual, we follow our framework for introducing random variables: we first define the pmf or cdf, then describe what the random variable models, and finally prove that our interpretation is correct.
Definition: A random variable \(X\) whose probability mass function is given by
\[f(x) = P(X = x) =
\begin{cases}
p(1-p)^{x-1} & \text{if} ~ x = 1, 2, 3, \ldots \\
0 & \text{otherwise} \\ \end{cases} \]
is called a Geometric random variable with parameter \(p\) and we write \( X \sim Geo(p) \).
Suppose an experiment is to be performed with the following characteristics:
- The experiment can be thought of as repeating independent trials until we reach our first success at which point the experiment stops.
- The trial may result in one of two outcomes where we think of one outcome as being a "success" while the other outcome is labeled as a "failure".
- The probability of success on every trial is \(p\) (and hence the probability of failure is \(1-p\) ).
- \(X\) models the number of trials it takes to obtain our first success.
Then \(X \sim Geo(p)\).
To summarize, the Geometric random variable models the number of trials it takes to obtain our first success.
Note: There are actually two different ways to define the Geometric random variable. Some sources define the Geometric random variable as modeling the number of trials until the first success which is what we have done. Other sources may define the Geometric random variable to be the number of failures until the first success. Thus, if you look at other resources, make sure you know which definition of a Geometric random variable they are using! (As you can see, Wikipedia lists both: https://en.Wikipedia.org/wiki/Geometric_distribution ).
We now argue that the conditions outlined above yield the Geometric random variable.
Theorem: Suppose an experiment is to be performed with the following characteristics:
- The experiment can be thought of as repeating independent trials until we reach our first success at which point the experiment stops.
- The trial may result in one of two outcomes where we think of one outcome as being a "success" while the other outcome is labeled as a "failure".
- The probability of success on every trial is \(p\) (and hence the probability of failure is \(1-p\) ).
- \(X\) models the number of trials it takes to obtain our first success.
Then \(X \sim Geo(p)\).
- Proof:
-
We now derive the probability mass function of \(X\). First consider the possible values of \(X\). Remember that \(X\) denotes the number of trials until our first success. Perhaps on our first trial we obtained our first success and so \(X = 1\). Or perhaps on our second trial we obtained our first success. This means we must have failed first and then succeeded. In this case, \(X = 2 \). Or perhaps on our third trial we obtained our first success. In terms of the sequence of trials, this means we failed, then failed, then succeeded and so \(X=3\) and so on and so forth. Thus, the possible values of \(X\) are 1, 2, 3, ... .
Allow us to now find the probability that \(X\) takes each of these values. Since there are infinitely many values, we hope to find some sort of general pattern. For convenience, let \(S_i = \{ \text{we have a success on trial} ~ i \} \) and let \(F_i = \{ \text{we have a failure on trial} ~ i \} \). Then, since the trials are independent,
1. \( P(X=1) = P(S_1) = p \)
2. \( P(X=2) = P(F_1 \cap S_2) = P(F_1)P(S_2) = (1-p)p = p(1-p) \)
3. \( P(X=3) = P(F_1 \cap F_2 \cap S_3) = P(F_1) P(F_2)P(S_3) = (1-p)(1-p)p = p(1-p)^2 \)
4. \( P(X=4) = P(F_1 \cap F_2 \cap F_3 \cap S_4) = P(F_1) P(F_2)P(F_3)P(S_4) = (1-p)(1-p)(1-p)p = p(1-p)^3 \)
And more generally, we see for any \(x\) in the support of \(X\), we have the following:
\( P(X = x) = P( \text{the first success occurs on the} ~ x^{th} ~ \text{trial} ) = P(F_1 \cap F_2 \cap \ldots \cap F_{x-1} \cap S_x) = P(F_1) P(F_2) \ldots P(F_{x-1}) P(S_x) = (1-p)^{x-1} p = p(1-p)^{x-1} \)
Hence,
\[f(x) = P(X = x) =
\begin{cases}
p(1-p)^{x-1} & \text{if} ~ x = 1, 2, 3, \ldots \\
0 & \text{otherwise} \\ \end{cases} \nonumber\ \] which completes the derivation.
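As an informal check on this derivation, the following Python sketch simulates the experiment and compares the observed frequencies with \(p(1-p)^{x-1}\). The function names, the seed, and the value \(p = 0.3\) are illustrative assumptions and do not come from the text.

```python
import random

def geometric_pmf(x, p):
    """P(X = x) for X ~ Geo(p): x - 1 failures followed by one success."""
    if isinstance(x, int) and x >= 1:
        return p * (1 - p) ** (x - 1)
    return 0.0

def trials_until_first_success(p, rng):
    """Repeat independent trials with success probability p; return the trial count."""
    count = 1
    while rng.random() >= p:  # a failure occurs with probability 1 - p
        count += 1
    return count

rng = random.Random(0)  # fixed seed so the run is reproducible
p = 0.3                 # illustrative value, not from the text
n = 100_000
samples = [trials_until_first_success(p, rng) for _ in range(n)]

# Compare the empirical frequency of each value with the derived pmf.
for x in range(1, 6):
    empirical = samples.count(x) / n
    print(x, round(empirical, 4), round(geometric_pmf(x, p), 4))
```

With this many repetitions, the two columns should agree to roughly two decimal places, although a simulation only suggests the result and does not replace the proof.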
For the next example, recall from calculus that
\begin{align*}
a + ar + ar^2 + ar^3 + \ldots = \sum_{n=1}^{\infty} ar^{n-1} = \frac{a}{1-r}, ~~ |r| < 1
\end{align*}
Reconsider the following example: Suppose an experiment consists of repeatedly flipping a fair coin until a heads appears for the first time at which point the experiment stops. Assuming that the flips of the coin are independent of each other, find the probability that the experiment will eventually terminate. That is, find the probability that a heads is eventually obtained.
- Answer
-
Let \(X\) denote the number of trials it takes to obtain our first heads. Notice that the experiment can be thought of as repeatedly performing independent flips of a fair coin until we obtain a heads, each trial (flip) results either in a heads (success) or failure (tails), the probability of success on each trial is 0.5, and \(X\) models the number of trials until our first success. Hence, \(X \sim Geo(0.5) \). That is, \[f(x) = P(X = x) =
\begin{cases}
0.5(1-0.5)^{x-1} & \text{if} ~ x = 1, 2, 3, \ldots \\
0 & \text{otherwise} \\ \end{cases} \nonumber\ \]We can simplify the above to
\[f(x) = P(X = x) =
\begin{cases}
(0.5)^{x} & \text{if} ~ x = 1, 2, 3, \ldots \\
0 & \text{otherwise} \\ \end{cases} \nonumber\ \]We wish to find the probability that a heads is eventually obtained. That is, we could have obtained our first heads on trial 1 or trial 2 or trial 3 and so on and so forth. Hence, we obtain the following:
\begin{align*} P(\text{we eventually obtain a heads}) &= P(\text{our first heads appears on trial 1 or trial 2 or trial 3 or} \ldots) \\ &= P( \{X=1\} \cup \{X=2\} \cup \{ X = 3 \} \cup \ldots ) \\ &= P(X=1) + P(X=2) + P(X=3) + \ldots \\ &= 0.5^1 + 0.5^2 + 0.5^3 + \ldots \\ &= \frac{0.5}{1-0.5} \\ &= 1 \end{align*}
Of course, this answer agrees with our computation from Section 6.2. Further note that the computation is essentially \( \displaystyle \sum_{\text{all} ~ x} f(x) \) and by the properties discussed of the probability mass function, this summation is equal to 1.
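The partial sums of this series can also be examined numerically. The short Python sketch below (the variable names are our own) accumulates \(f(1) + f(2) + \ldots + f(n)\) for the fair coin and shows the total approaching 1.

```python
# Partial sums of f(x) = 0.5^x for x = 1, 2, ..., n (fair coin, p = 0.5).
p = 0.5
partial_sums = {}
running_total = 0.0
for x in range(1, 31):
    running_total += p * (1 - p) ** (x - 1)
    partial_sums[x] = running_total

for n in (1, 2, 5, 10, 30):
    # Each partial sum equals 1 - 0.5^n, so the gap to 1 halves with every trial.
    print(n, partial_sums[n], 1 - partial_sums[n])
```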
Theorem: If \( X \sim \text{Geo}(p) \), then for any integer \(k\) in the support of \(X\), \( P(X > k) = (1-p)^{k} \).
- Proof:
-
\begin{align*}
P( X > k) &= P(X = k + 1 ) + P(X = k + 2 ) + P(X = k + 3 ) + \ldots \\
&= p(1-p)^k + p(1-p)^{k+1} + p(1-p)^{k+2} + \ldots \\
&= \frac{p(1-p)^{k}}{1-(1-p)} \\
&= \frac{p(1-p)^k}{p} \\
&= (1-p)^k
\end{align*} Alternatively, notice the interpretation. The event \(X > k\) means that the number of trials to reach our first success is greater than \(k\). This happens if and only if the first \(k\) trials are all failures and so \(P(X>k) = P(\text{first} ~ k ~ \text{trials are all failures} ) = P(F_1 \cap F_2 \cap \ldots \cap F_k) = P(F_1)P(F_2) \ldots P(F_k) = (1-p)^k. \)
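The two arguments in the proof can be compared numerically. The following Python sketch adds up the pmf beyond \(k\) (truncated after many terms) and compares the total with \((1-p)^k\). The helper names and the value \(p = 0.2\) are assumptions made for illustration.

```python
def tail_by_summation(k, p, terms=2000):
    """Approximate P(X > k) by adding P(X = k+1), P(X = k+2), ... for `terms` terms."""
    return sum(p * (1 - p) ** (x - 1) for x in range(k + 1, k + 1 + terms))

def tail_closed_form(k, p):
    """P(X > k) = (1 - p)^k: the first k trials are all failures."""
    return (1 - p) ** k

p = 0.2  # illustrative value, not from the text
for k in (1, 4, 10):
    print(k, tail_by_summation(k, p), tail_closed_form(k, p))
```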
We now present the expected value, variance, and cumulative distribution function for the Geometric random variable.
Theorem: If \(X \sim Geo(p)\) then
\begin{align*} \mathbb{E}[X] &= \frac{1}{p} \\ \mathbb{V}ar[X] &= \frac{1-p}{p^2} \end{align*}
Additionally, the cumulative distribution function is given by \[F(x) = P(X \leq x) =
\begin{cases}
0 & \text{if} ~ x < 1, \\
1-(1-p)^{\lfloor x \rfloor} & \text{if} ~ x \geq 1
\end{cases} \]
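A numerical sanity check of these formulas is sketched below in Python. The sums are truncated at \(x = 2000\), which we assume is far enough for the omitted terms to be negligible, and \(p = 0.25\) is an arbitrary illustrative value.

```python
import math

def geometric_cdf(x, p):
    """F(x) = P(X <= x) for X ~ Geo(p), valid for any real x."""
    if x < 1:
        return 0.0
    return 1 - (1 - p) ** math.floor(x)

p = 0.25  # illustrative value, not from the text
xs = range(1, 2001)
pmf = [p * (1 - p) ** (x - 1) for x in xs]

# Expected value and variance computed directly from the pmf.
mean = sum(x * f for x, f in zip(xs, pmf))
variance = sum((x - mean) ** 2 * f for x, f in zip(xs, pmf))

print(mean, 1 / p)                     # both close to 4
print(variance, (1 - p) / p ** 2)      # both close to 12
print(geometric_cdf(3.7, p), sum(pmf[:3]))  # F(3.7) = P(X <= 3)
```

The floor in the cdf reflects that \(X\) only takes integer values, so \(F(3.7) = F(3)\).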
An experiment consists of administering a drug to a patient and then seeing if the drug is a success (meaning beneficial) for that patient or if the drug is a failure (meaning non-beneficial) for that patient. Let us suppose that a sequence of independent trials is to be performed until a person is found for whom the drug is successful. It is known that the probability the drug is successful for any patient is 0.75. Find the probability that it takes more than five patients to obtain our first success.
- Answer
-
We are interested in modeling the number of patients it takes to obtain our first success and so allow us to define our random variable. Let \(X\) denote the number of patients it takes to obtain our first success. Notice that:
1) The experiment can be thought of as repeating independent trials until we reach our first success at which point the experiment stops.
2) The trial may result in one of two outcomes where we think of one outcome as being a "success" while the other outcome is labeled as a "failure".
3) The probability of success on every trial is 0.75 (and hence the probability of failure is 0.25).
4) \(X\) models the number of trials it takes to obtain our first success.
Hence, \(X \sim Geo(0.75) \). Thus, by Theorem 9.4.3 we have the following:
\[ P(\text{it takes more than five patients to obtain our first success} ) = P(X > 5) = (1-0.75)^5 = 0.25^5 \nonumber\ \]
Alternatively, by Theorem 9.4.4 we have the following:
\[ P(\text{it takes more than five patients to obtain our first success} ) = P(X > 5) = 1 - P(X \leq 5) = 1 - [ 1 - (1-0.75)^5] = 0.25^5 \nonumber\ \]
Finally, in terms of a calculator command, we may also write the following:
\[ P(\text{it takes more than five patients to obtain our first success} ) = P(X > 5) = 1 - P(X \leq 5) = 1 - geometcdf(0.75, 5) \nonumber\ \]
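For readers without a calculator at hand, the three routes above can be reproduced in a few lines of Python. This is only a sketch with variable names of our own choosing; the last route sums the pmf directly in place of the calculator command.

```python
p = 0.75

# Route 1: the tail formula P(X > 5) = (1 - p)^5.
tail_direct = (1 - p) ** 5

# Route 2: the complement of the cdf, 1 - F(5).
cdf_at_5 = 1 - (1 - p) ** 5
tail_from_cdf = 1 - cdf_at_5

# Route 3: the complement of P(X = 1) + ... + P(X = 5).
tail_from_pmf = 1 - sum(p * (1 - p) ** (x - 1) for x in range(1, 6))

print(tail_direct)  # 0.0009765625
print(tail_from_cdf, tail_from_pmf)
```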
An experiment consists of administering a drug to a patient and then seeing if the drug is a success (meaning beneficial) for that patient or if the drug is a failure (meaning non-beneficial) for that patient. Let us suppose that a sequence of independent trials is to be performed until a person is found for whom the drug is successful. It is known that the probability the drug is successful for any patient is 0.75. Find the probability that it takes at most three patients to obtain our first success.
- Answer
-
Similarly, let \(X\) denote the number of patients it takes to obtain our first success. As argued above, \(X \sim Geo(0.75) \). Thus, by Theorem 9.4.4 we have the following:
\[ P(\text{it takes at most three patients to obtain our first success} ) = P(X \leq 3) = [ 1 - (1-0.75)^3] = 1 - 0.25^3 \nonumber\ \]
Alternatively, we may use the following calculator command:
\[ P(\text{it takes at most three patients to obtain our first success} ) = P(X \leq 3) = geometcdf(0.75, 3) \nonumber\ \]
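The same value can be obtained by adding the pmf over \(x = 1, 2, 3\). The Python function below is a hypothetical stand-in of our own, written to take its arguments in the same order as the calculator command; it is not the calculator's implementation.

```python
def geometric_cdf(p, k):
    """P(X <= k) for X ~ Geo(p) and a positive integer k, summed from the pmf."""
    return sum(p * (1 - p) ** (x - 1) for x in range(1, k + 1))

p = 0.75
by_summation = geometric_cdf(p, 3)   # P(X=1) + P(X=2) + P(X=3)
by_formula = 1 - (1 - p) ** 3        # closed form from the cdf
print(by_summation, by_formula)      # both 0.984375
```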
It is known that light bulbs produced by a certain company will be defective with probability 0.01 independently of one another. The company sells the light bulbs in packs of 15 and states that at most one of the fifteen bulbs will be defective. If more than 1 bulb is defective, then the company offers a money back guarantee. Suppose an experiment consists of a customer who continually buys packs until they come across a pack that is eligible for return. What is the probability that the customer will have to buy more than 20 packs?
- Answer
-
Let \(X\) model the number of packs the customer buys until they come across a pack that is eligible for return. Since bulbs are defective independently of one another, the packs are independent trials, and so \( X \sim Geo(p) \), where \(p\) is the probability that a pack is eligible for return. Can we find \( p \)?
Whether a pack is eligible for return depends on the number of defective light bulbs it contains. Let \(Y\) denote the number of defective light bulbs in a pack. Then \(Y \sim Bin(15, 0.01) \), and so
\begin{align*}
p = P( \text{a pack is eligible for return}) = P(Y > 1) = 1 - P(Y \leq 1) = 1 - binomcdf(15, 0.01, 1) \approx 0.0096297734
\end{align*}Hence \(X \sim Geo(0.0096297734)\).
\begin{align*}
P( \text{the customer has to buy more than 20 packs until they find a pack eligible for return}) &= P(X > 20) \\
&= (1-0.0096297734)^{20} \\
&\approx 0.8240461109 \\
\end{align*}Or alternatively, \begin{align*} P( \text{the customer has to buy more than 20 packs until they find a pack eligible for return}) &= P(X > 20) \\
&= 1 - P(X \leq 20) \\
&= 1 - geometcdf(0.0096297734, 20) \\
&\approx 0.8240461109
\end{align*}
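The two-stage computation above (a Binomial probability feeding a Geometric one) can be reproduced with the following Python sketch. The helper `binomial_cdf` is our own stand-in for the calculator command and uses the same argument order.

```python
from math import comb

def binomial_cdf(n, q, k):
    """P(Y <= k) for Y ~ Bin(n, q), summed from the binomial pmf."""
    return sum(comb(n, j) * q ** j * (1 - q) ** (n - j) for j in range(k + 1))

# Stage 1: probability that a pack of 15 contains more than one defective bulb.
p_return = 1 - binomial_cdf(15, 0.01, 1)

# Stage 2: X ~ Geo(p_return), so P(X > 20) = (1 - p_return)^20.
prob_more_than_20 = (1 - p_return) ** 20

print(p_return)           # approximately 0.0096298
print(prob_more_than_20)  # approximately 0.82405
```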
Allow us to modify Example 9.3.5. What if instead of modeling the number of trials until our first success, we wish to model the number of trials until our second success? What would this look like? More generally, suppose that instead of repeating trials until our first success, we repeat trials until we obtain a total of \( r\) successes, and we wish to model the number of trials until our \( r^{th} \) success by a random variable \(X\). In such a case, what would the probability mass function of \(X\) look like? Clearly \(X\) is no longer Geometric (provided that \( r > 1\)).
It would be nice to have a random variable which models the number of trials it takes to obtain our \(r^{th} \) success. The above scenario calls on us to use another brand name random variable which is an extension of the Geometric. This random variable is the Negative Binomial random variable.
Definition: A random variable \(X\) whose probability mass function is given by
\[f(x) = P(X = x) = \begin{cases}
\binom{x-1}{r-1} p^r (1-p)^{x-r} & \text{if} ~ x = r, r+1, r+2, \ldots \\
0 & \text{otherwise} \\
\end{cases} \]
is called a Negative Binomial random variable with parameters \(r\) and \(p\) and we write \( X \sim NegBin(r,p) \).
Notice that \( NegBin(1,p) = Geo(p) \).
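To see the reduction to the Geometric case concretely, the Python sketch below evaluates both probability mass functions for \(r = 1\). The function names and the value \(p = 0.3\) are illustrative assumptions.

```python
from math import comb

def negbin_pmf(x, r, p):
    """P(X = x) for X ~ NegBin(r, p), as in the definition above."""
    if isinstance(x, int) and x >= r:
        return comb(x - 1, r - 1) * p ** r * (1 - p) ** (x - r)
    return 0.0

def geometric_pmf(x, p):
    """P(X = x) for X ~ Geo(p)."""
    if isinstance(x, int) and x >= 1:
        return p * (1 - p) ** (x - 1)
    return 0.0

p = 0.3  # illustrative value, not from the text
for x in range(0, 6):
    # With r = 1 the binomial coefficient is 1 and the two pmfs coincide.
    print(x, negbin_pmf(x, 1, p), geometric_pmf(x, p))
```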
Suppose an experiment is to be performed with the following characteristics:
- The experiment can be thought of as repeating independent trials until we reach our \( r^{th} \) success at which point the experiment stops.
- The trial may result in one of two outcomes where we think of one outcome as being a "success" while the other outcome is labeled as a "failure".
- The probability of success on every trial is \(p\) (and hence the probability of failure is \(1-p\) ).
- \(X\) models the number of trials it takes to obtain our \( r^{th} \) success.
Then \( X \sim NegBin(r,p) \).
To summarize, the Negative Binomial random variable models the total number of independent trials it takes to obtain \(r\) successes.
Theorem: Suppose an experiment is to be performed with the following characteristics:
- The experiment can be thought of as repeating independent trials until we reach our \( r^{th} \) success at which point the experiment stops.
- The trial may result in one of two outcomes where we think of one outcome as being a "success" while the other outcome is labeled as a "failure".
- The probability of success on every trial is \(p\) (and hence the probability of failure is \(1-p\) ).
- \(X\) models the number of trials it takes to obtain our \( r^{th} \) success.
Then \( X \sim NegBin(r,p) \).
- Proof:
-
Left as a homework exercise. Hint: If the \( r^{th} \) success occurs on the \( x^{th} \) trial, then what can we say occurred on the previous \(x-1\) trials?
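While the proof is left as an exercise, the claim can be explored empirically. The Python sketch below simulates the experiment and compares observed frequencies with the pmf from the definition. The values \(r = 3\) and \(p = 0.5\), the seed, and the function names are assumptions chosen only for illustration, and a simulation is of course not a proof.

```python
import random
from math import comb

def trials_until_rth_success(r, p, rng):
    """Run independent trials with success probability p until r successes occur."""
    trials = 0
    successes = 0
    while successes < r:
        trials += 1
        if rng.random() < p:
            successes += 1
    return trials

def negbin_pmf(x, r, p):
    """The pmf from the definition of NegBin(r, p)."""
    if x < r:
        return 0.0
    return comb(x - 1, r - 1) * p ** r * (1 - p) ** (x - r)

rng = random.Random(1)  # fixed seed so the run is reproducible
r, p, n = 3, 0.5, 100_000
samples = [trials_until_rth_success(r, p, rng) for _ in range(n)]

for x in range(3, 8):
    print(x, samples.count(x) / n, round(negbin_pmf(x, r, p), 4))
```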
Suppose an experiment consists of administering a drug to a patient and then seeing if the drug is a success (meaning beneficial) for that patient or if the drug is a failure (meaning non-beneficial) for that patient. Let us suppose that a sequence of independent trials is to be performed until ten people are found for whom the drug is successful. It is known that the probability the drug is successful for any patient is 0.75. Find the probability that the tenth success occurs on the thirtieth patient.
- Answer
-
Let \(X\) model the number of patients it takes to obtain our tenth success. Notice that
1) The experiment can be thought of as repeating independent trials until we reach our tenth success at which point the experiment stops.
2) The trial may result in one of two outcomes where we think of one outcome as being a "success" while the other outcome is labeled as a "failure".
3) The probability of success on every trial is 0.75 (and hence the probability of failure is 0.25).
4) \(X\) models the number of trials it takes to obtain our tenth success.
Hence, \(X \sim NegBin(10,0.75)\). That is,
\[f(x) = P(X = x) = \begin{cases}
\binom{x-1}{9} 0.75^{10} (0.25)^{x-10} & \text{if} ~ x = 10, 11, 12, \ldots \\
0 & \text{otherwise} \\
\end{cases} \nonumber\ \]Hence \(P(\text{the tenth success occurs on the thirtieth patient}) = P(X=30) = f(30) = \binom{29}{9} 0.75^{10} (0.25)^{20} \) or alternatively, we may write \(P(\text{the tenth success occurs on the thirtieth patient}) = P(X=30) = f(30) = NegBinpdf(10, 0.75, 30) \).
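To obtain a decimal value, the pmf can be evaluated directly. The following Python lines are a minimal sketch using only the standard library; the variable names are our own.

```python
from math import comb

r, p, x = 10, 0.75, 30

# Evaluate the NegBin(10, 0.75) pmf from the definition at x = 30.
prob = comb(x - 1, r - 1) * p ** r * (1 - p) ** (x - r)

print(comb(x - 1, r - 1))  # 10015005
print(prob)                # roughly 5.1e-07
```

The probability is tiny because, with a success probability of 0.75, we expect the tenth success long before the thirtieth patient.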
Suppose an experiment consists of administering a drug to a patient and then seeing if the drug is a success (meaning beneficial) for that patient or if the drug is a failure (meaning non-beneficial) for that patient. Let us suppose that a sequence of independent trials is to be performed until ten people are found for whom the drug is successful. It is known that the probability the drug is successful for any patient is 0.75. Find the probability that it takes more than thirty patients to obtain our tenth success.
- Answer
-
Let \(X\) model the number of patients it takes to obtain our tenth success. As argued above, \(X \sim NegBin(10,0.75)\). That is,
\[f(x) = P(X = x) = \begin{cases}
\binom{x-1}{9} 0.75^{10} (0.25)^{x-10} & \text{if} ~ x = 10, 11, 12, \ldots \\
0 & \text{otherwise} \\
\end{cases} \nonumber\ \]Hence \(P(\text{it takes more than thirty patients to obtain our tenth success}) = P(X > 30) = 1 - P(X \leq 30) = 1 - NegBincdf(10, 0.75, 30). \)
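A numerical value can be found by summing the pmf from \(x = 10\) to \(x = 30\). The Python sketch below does this with a hypothetical helper `negbin_cdf` of our own. It also includes a cross-check based on the observation that \(X > 30\) happens exactly when the first thirty patients contain at most nine successes, which is a Binomial probability.

```python
from math import comb

def negbin_pmf(x, r, p):
    """P(X = x) for X ~ NegBin(r, p), as in the definition."""
    if x < r:
        return 0.0
    return comb(x - 1, r - 1) * p ** r * (1 - p) ** (x - r)

def negbin_cdf(r, p, k):
    """P(X <= k): add the pmf over x = r, r+1, ..., k."""
    return sum(negbin_pmf(x, r, p) for x in range(r, k + 1))

r, p = 10, 0.75
tail = 1 - negbin_cdf(r, p, 30)

# Cross-check: at most 9 successes among the first 30 independent trials.
cross_check = sum(comb(30, j) * p ** j * (1 - p) ** (30 - j) for j in range(10))

print(tail, cross_check)  # the two values should agree closely
```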
We now present the expected value and variance for the Negative Binomial random variable which concludes this section.
Theorem: If \( X \sim NegBin(r,p) \) then
\begin{align*} \mathbb{E}[X] &= \frac{r}{p} \\ \mathbb{V}ar[X] &=\frac{r(1-p)}{p^2} \end{align*}
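These formulas can be checked numerically for the drug example, where \(r = 10\) and \(p = 0.75\). The Python sketch below computes the mean and variance directly from the pmf, truncating the sums at \(x = 400\) on the assumption that the remaining terms are negligible.

```python
from math import comb

r, p = 10, 0.75
xs = range(r, 401)  # the pmf beyond x = 400 is negligible for these parameters
pmf = [comb(x - 1, r - 1) * p ** r * (1 - p) ** (x - r) for x in xs]

total = sum(pmf)
mean = sum(x * f for x, f in zip(xs, pmf))
variance = sum((x - mean) ** 2 * f for x, f in zip(xs, pmf))

print(total)                           # close to 1
print(mean, r / p)                     # both close to 13.333
print(variance, r * (1 - p) / p ** 2)  # both close to 4.444
```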