9.5: The Poisson Random Variable
In this section, we will discuss our last important discrete random variable - the Poisson random variable. As usual, we will use our framework for introducing random variables: first defining the pmf or cdf, then understanding what the random variable models, and then proving our interpretation is correct (when feasible).
Definition: Let \( \lambda \) denote a positive real number. A random variable \(X\) whose probability mass function is given by
\[ \displaystyle f(x) = P(X = x) =
\begin{cases}
\displaystyle e^{- \lambda} \times \frac{\lambda^{x}}{x!} & \text{if} ~ x =0, 1, 2, 3, \ldots \\
0 & \text{otherwise} \\ \end{cases} \]
is called a Poisson random variable with parameter \(\lambda\) and we write \( X \sim Poisson(\lambda) \).
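As a quick sanity check that this defines a valid pmf, here is a short Python sketch (the helper `poisson_pmf` is our own, written with only the standard library) that computes \( f(x) \) and confirms the probabilities sum to approximately 1:

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    if x < 0:
        return 0.0
    return math.exp(-lam) * lam**x / math.factorial(x)

lam = 2.0
# Sum f(x) over x = 0, 1, ..., 100; the tail beyond 100 is negligible here.
total = sum(poisson_pmf(x, lam) for x in range(101))
print(total)  # ≈ 1.0
```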
We offer two interpretations of a Poisson random variable.
Interpretation 1: Suppose an experiment is to be performed with the following characteristics:
- The random variable \(X\) can be thought of as a Binomial random variable with parameters \(n\) and \(p\).
- \(n\) is sufficiently large.
- \(p\) is sufficiently small.
Then \( X \sim Poisson(\lambda) \) where \( \lambda = np \).
Interpretation 2: Suppose an experiment is to be performed with the following characteristics:
- \(X(t) \) counts or models the number of events which occur by time \(t\).
- \(X(0) = 0 \).
- \( \{ X(t), t \geq 0 \} \) has independent increments, meaning the numbers of events that occur in disjoint (nonoverlapping) time intervals are independent.
- \( P \bigg( X(t+h) - X(t) = 1 \bigg) = \lambda h + o(h) \) where the function \(f\) is said to be \(o(h) \) if \( \displaystyle \lim_{h \rightarrow 0} \frac{f(h)}{h} = 0 \).
- \( P \bigg( X(t+h) - X(t) \geq 2 \bigg) = o(h) \).
Then \( X(t) \sim Poisson(\lambda t) \).
Essentially, Interpretation 1 is positing that the Poisson random variable can be used as an approximation for a Binomial random variable when \(n\) is large and \(p\) is small enough. Meanwhile, Interpretation 2 is stating that the Poisson random variable models the number of times a particular event occurs under certain conditions.
Our focus will be on Interpretation 1, so allow us to give an example of when we have a Poisson distribution. For instance, if \(X\) models the number of customers arriving at a store, then we can argue that \(X\) is a Poisson random variable. To argue this, suppose we know that customers arrive at the store at an average rate of 10 customers per hour. What we can do is divide the hour into minutes or seconds or milliseconds and consider each time unit as a trial. If we were to divide the hour into seconds, then we have 3,600 seconds and hence 3,600 trials. On any given second/trial, the probability that a customer will arrive at the store is \( \frac{10}{3600} \), and hence we have a Binomial process with a large value of \(n = 3600\) and a small value of \( p = \frac{10}{3600} \). We will now prove the validity of Interpretation 1.
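To see the approximation at work numerically, here is a short Python sketch (helper functions are our own, written with only the standard library) comparing the Binomial pmf from the customer example with the Poisson pmf it approximates:

```python
import math

n, p = 3600, 10 / 3600   # seconds in an hour, chance of an arrival each second
lam = n * p              # = 10 customers per hour

def binom_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    return math.exp(-lam) * lam**x / math.factorial(x)

# The two pmfs should nearly agree for small x.
for x in range(5):
    print(x, binom_pmf(x, n, p), poisson_pmf(x, lam))
```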
Theorem: Suppose an experiment is to be performed with the following characteristics:
- The random variable \(X\) can be thought of as a Binomial random variable with parameters \(n\) and \(p\).
- \(n\) is sufficiently large.
- \(p\) is sufficiently small.
Then \( X \sim Poisson(\lambda) \) where \( \lambda = np \).
Proof:
We now derive the probability mass function of \(X\). First consider the possible values of \(X\). Since \(X\) denotes the number of occurrences of an event, the possible values of \(X\) are \(0, 1, 2, 3, \ldots \).
We will now find the probability that \(X\) takes each of these values. Allow us to argue this generally by finding an approximation for \( P(X = x) \) where \(x\) is some value in the support of \(X\). Since \(X\) is a Binomial random variable, \( \displaystyle P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} \). Define \( \lambda = np \) so that \( \displaystyle p = \frac{\lambda}{n} \). Doing so, we obtain the following:
\begin{align*}
P(X = x) &= \binom{n}{x} p^x (1-p)^{n-x} & & \text{Substituting in} ~ p = \frac{\lambda}{n} \\ \\
&= \binom{n}{x} \bigg(\frac{\lambda}{n}\bigg)^x \bigg(1 - \frac{\lambda}{n} \bigg)^{n-x} & & \text{Expanding the binomial} \\ \\&= \frac{n!}{x!(n-x)!} \bigg(\frac{\lambda}{n}\bigg)^x \bigg(1 - \frac{\lambda}{n} \bigg)^{n-x} & & \text{Simplifying the} ~ \frac{n!}{(n-x)!} ~ \text{yields} \\ \\ &= \frac{n(n-1)(n-2) \ldots (n-x+1)}{x!} \bigg(\frac{\lambda}{n}\bigg)^x \bigg(1 - \frac{\lambda}{n} \bigg)^{n-x} & & \text{Rewriting the last two terms yields} \\ \\ &= \frac{n(n-1)(n-2) \ldots (n-x+1)}{x!} \frac{\lambda^{x}}{n^x} \frac{(1 - \frac{\lambda}{n})^n}{(1- \frac{\lambda}{n})^x} & & \text{Rearranging some of the terms yields} \\ \\ &= \frac{n(n-1)(n-2) \ldots (n-x+1)}{n^x} \frac{\lambda^{x}}{x!} \frac{(1 - \frac{\lambda}{n})^n}{(1- \frac{\lambda}{n})^x} & &
\end{align*}
For a sufficiently large \(n\), notice that
\begin{align*} \frac{n(n-1)(n-2) \ldots (n-x+1)}{n^x} &= \frac{n(n-1)(n-2) \ldots (n-x+1)}{n \times n \times \ldots \times n} \\ \\ &= \frac{n}{n} \times \frac{n-1}{n} \times \frac{n-2}{n} \times \ldots \times \frac{n-x+1}{n} \\ \\ &= 1 \times \bigg( 1 - \frac{1}{n} \bigg) \times \bigg( 1 - \frac{2}{n} \bigg) \times \ldots \times \bigg( 1 - \frac{x-1}{n} \bigg) \rightarrow 1 \times (1 - 0) \times (1 - 0) \times \ldots \times (1 - 0) = 1 \end{align*}
Also recall from calculus that as \( n \rightarrow \infty \), we have \( \bigg( 1 - \frac{\lambda}{n} \bigg)^n \rightarrow e^{- \lambda} \) and, since \(x\) is fixed, \( \bigg( 1 - \frac{\lambda}{n} \bigg)^x \rightarrow 1 \).
Putting everything together, we see that
\begin{align*} P(X = x) &= \underbrace{\frac{n(n-1)(n-2) \ldots (n-x+1)}{n^x}}_{1} \frac{\lambda^{x}}{x!} \underbrace{\frac{(1 - \frac{\lambda}{n})^n}{(1- \frac{\lambda}{n})^x} }_{e^{- \lambda} } \\ & \rightarrow \frac{\lambda^{x}}{x!} e^{- \lambda} \\ &= e^{- \lambda} \frac{\lambda^{x}}{x!} \end{align*}
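This convergence can be observed numerically. The short Python sketch below (with \( \lambda = 2 \) and \( x = 3 \) chosen purely for illustration) compares the Binomial pmf with its Poisson limit as \(n\) grows:

```python
import math

lam, x = 2.0, 3
# The limiting Poisson probability e^{-lam} * lam^x / x!
poisson = math.exp(-lam) * lam**x / math.factorial(x)

# Binomial(n, lam/n) probabilities approach the Poisson value as n grows.
for n in (10, 100, 1000, 10000):
    p = lam / n
    binom = math.comb(n, x) * p**x * (1 - p)**(n - x)
    print(n, binom, abs(binom - poisson))
```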
Allow us to consider an example which shows the relationship between the Binomial and the Poisson.
A manufacturer of light bulbs knows that 2% of its bulbs are defective. The manufacturer sells the bulbs in packs of 100. Assuming that the bulbs are independent of each other, find the probability that the pack contains at most three defective bulbs.
Answer:
Let \(X\) denote the random variable which models the number of defective bulbs in a pack. Then \(X \sim Bin(100, 0.02) \). Hence,
\( P(\text{we obtain at most three defective bulbs}) = P(X \leq 3) = binomcdf(100, 0.02, 3) \approx 0.85896 \)
A manufacturer of light bulbs knows that 2% of its bulbs are defective. The manufacturer sells the bulbs in packs of 100. Assuming that the bulbs are independent of each other, find the probability that the pack contains at most three defective bulbs.
Answer:
Let \(X\) denote the random variable which models the number of defective bulbs in a pack. Then \(X \sim Bin(100, 0.02) \). However, we may argue that \(n = 100 \) is sufficiently large and that \( p = 0.02 \) is sufficiently small. In doing so, we are saying that \(X \sim Poisson(\lambda) \) where \( \lambda = np = 100( 0.02) = 2 \) and so \(X\) is approximately distributed as a Poisson(2) random variable. Hence,
\( P(\text{we obtain at most three defective bulbs}) = P(X \leq 3) \approx poissoncdf(2, 3) \approx 0.85712 \)
Notice that our estimate is off by only approximately 0.2%.
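Both calculator values above can be reproduced in Python. The cdf helpers below are our own standard-library implementations, not a calculator or library API:

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Bin(n, p)."""
    return sum(math.comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam**x / math.factorial(x) for x in range(k + 1))

exact = binom_cdf(3, 100, 0.02)   # ≈ 0.85896 (the Binomial answer)
approx = poisson_cdf(3, 2.0)      # ≈ 0.85712 (the Poisson approximation)
print(exact, approx)
```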
In all of the preceding random variables, it was clear what the parameters represent. For instance, in the Binomial random variable, \(n\) represents the number of trials and \(p\) represents the probability of success on any trial. And so, perhaps we are wondering what the parameter \( \lambda \) represents for the Poisson random variable. Allow us to now interpret what the value of \( \lambda \) means.
Looking back at Interpretation 1, we see that \( \lambda = np \). In the context of the previous example, we said that \( \lambda = np = 100(0.02) = 2 \), but allow us to now be more specific and consider the units: \( \lambda = np = (100 ~ \text{bulbs})(2 \% ~ \text{defective rate}) = 2 ~ \text{defects per box} \). Hence, we can think of \( \lambda \) as a rate.
Additionally, the following theorem which presents the expected value and variance for a Poisson random variable will give us another way to view \( \lambda \).
Theorem: If \(X \sim Poisson(\lambda)\) then
\begin{align*} \mathbb{E}[X] &= \lambda \\ \mathbb{V}ar[X] &= \lambda \end{align*}
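A sketch of why \( \mathbb{E}[X] = \lambda \), using the series expansion of \( e^{\lambda} \) (the variance computation proceeds similarly via \( \mathbb{E}[X(X-1)] \)):
\begin{align*}
\mathbb{E}[X] &= \sum_{x=0}^{\infty} x \, e^{-\lambda} \frac{\lambda^{x}}{x!} = \sum_{x=1}^{\infty} e^{-\lambda} \frac{\lambda^{x}}{(x-1)!} = \lambda e^{-\lambda} \sum_{x=1}^{\infty} \frac{\lambda^{x-1}}{(x-1)!} = \lambda e^{-\lambda} e^{\lambda} = \lambda
\end{align*}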
Hence, \( \lambda \) is not only a rate, but is also the average and variance of the Poisson distribution.
Customers arrive at a restaurant at a rate of 10 customers per hour. Assuming that the number of arrivals follows a Poisson distribution, find the probability that more than 14 customers arrive within a particular hour.
Answer:
Let \(X\) model the number of customers that arrive per hour. The question tells us that the rate is 10 customers per hour and so \(\lambda = 10 \). Thus, \(X \sim Poisson(10) \). Hence,
\[ P(\text{more than 14 customers arrive within a particular hour}) = P(X > 14) = 1 - P(X \leq 14) = 1 - poissoncdf(10,14) \approx 0.0834584728 \]
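The same value can be computed in Python with a self-written cdf helper (our own function, not a calculator built-in):

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam**x / math.factorial(x) for x in range(k + 1))

lam = 10  # customers per hour
prob = 1 - poisson_cdf(14, lam)   # P(X > 14)
print(prob)  # ≈ 0.0835
```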
Theorem: Suppose there is some event which occurs according to a Poisson distribution. That is, its occurrence can be modeled by \( X \sim Poisson(\lambda) \), which means the event occurs at a mean rate of \( \lambda \) occurrences per unit. Then the random variable \(Y\) which models the number of events occurring in an interval of length \(t\) is given by \( Y \sim Poisson(\lambda t) \).
The above theorem is saying the following: suppose the number of earthquakes that occur every year in California follows a Poisson distribution with parameter 12. That is, if \(X\) models the number of earthquakes per year in California, then \(X \sim Poisson(12) \). If we were interested in modeling the number of earthquakes that occur every two years, then this would be \(Poisson(24)\); every three years, \(Poisson(36)\); every four years, \(Poisson(48)\); and every half a year, \(Poisson(6)\).
With the above theorem in mind, allow us to consider the following example.
Customers arrive at a restaurant at a rate of 10 customers per hour. Assuming that the number of arrivals follows a Poisson distribution, find the probability that more than 35 customers arrive within the span of three hours.
Answer:
Let \(X\) model the number of customers that arrive per hour. We wish to have a random variable that models the number of customers which arrive in a span of three hours. Let \(Y\) denote the random variable which models the number of customers which arrive in a span of three hours. By the above theorem, \( Y \sim Poisson(\lambda t) = Poisson(10 \times 3) = Poisson(30) \). Hence,
\[ P(\text{more than 35 customers arrive within the span of three hours}) = P(Y > 35 ) = 1 - P(Y \leq 35) = 1 - poissoncdf(30, 35) \approx 0.1573834736 \]
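Here is the same computation in Python, making the \( \lambda t \) scaling from the theorem explicit (the cdf helper is our own standard-library implementation):

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam**x / math.factorial(x) for x in range(k + 1))

rate, t = 10, 3          # 10 customers per hour, over a 3-hour window
lam = rate * t           # Y ~ Poisson(lambda * t) = Poisson(30)
prob = 1 - poisson_cdf(35, lam)   # P(Y > 35)
print(prob)  # ≈ 0.1574
```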
We finally end this section off with an example of how conditional probabilities and random variables can be related.
The number of earthquakes in California and Alaska follows a Poisson distribution with an average of 4 earthquakes per week in California and 3 earthquakes per week in Alaska. One of these states is chosen at random, and it is learned that 6 earthquakes occurred in one week. What is the probability that the state of Alaska was chosen?
Answer 1:
The difficulty in this problem arises from the fact that we are unsure as to which state was picked, and so allow us to introduce a partition. Let \(B_1 = \{ \text{California was chosen} \} \) and \(B_2 = \{ \text{Alaska was chosen} \} \). We have learned some event which we will denote by \(A\), so \(A = \{ \text{6 earthquakes occurred in one week} \} \). We are asked to find \( P(B_2 | A) \) and so we have the following by Bayes' Theorem:
\begin{align*}
P(B_2 | A) &=\frac{P(B_2)P(A|B_2)}{P(B_1)P(A|B_1) + P(B_2)P(A|B_2) } & & \text{since both states are equally likely to be chosen} \\ \\
&= \frac{0.5 P(A|B_2)}{0.5 P(A|B_1) + 0.5 P(A|B_2) } & & \text{Let X and Y model the number of earthquakes in California and Alaska respectively} \\ \\ &= \frac{0.5 P(Y=6)}{0.5 P(X=6) + 0.5 P(Y=6) } & & \text{Note that} ~ X \sim Poisson(4) ~ \text{and} ~ Y \sim Poisson(3) \\ \\ &= \frac{0.5 poissonpdf(3,6)}{0.5 poissonpdf(4,6) + 0.5 poissonpdf(3,6) } \\ \\ & \approx 0.3260528007 \end{align*}
Answer 2:
Constructing a tree diagram for this experiment, branching first on which state is chosen and then on the number of earthquakes, yields the following:
We are asked to find \( P( \text{Alaska was chosen} | \text{6 earthquakes occurred}) \). By Bayes' Theorem, we obtain the following:
\begin{align*} P( \text{Alaska was chosen} | \text{6 earthquakes occurred}) &= \frac{P(\text{Alaska was chosen AND 6 earthquakes occurred})}{P(\text{6 earthquakes occurred})} \\ \\ &= \frac{0.5 poissonpdf(3,6)}{0.5 poissonpdf(4,6) + 0.5 poissonpdf(3,6) } \\ \\ & \approx 0.3260528007 \end{align*}
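The Bayes computation above can be checked in Python (the pmf helper is our own standard-library implementation):

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**x / math.factorial(x)

# P(Alaska | 6 quakes) via Bayes' Theorem; each state is chosen with prob 0.5.
num = 0.5 * poisson_pmf(6, 3)                           # Alaska branch
den = 0.5 * poisson_pmf(6, 4) + 0.5 * poisson_pmf(6, 3) # both branches
print(num / den)  # ≈ 0.3261
```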