6.2: Independence
In the previous section, we saw how information about some event \(B\) can affect the probability of some other event, \(A\). In the initial lottery example, we saw that learning some information \(B\) decreased the probability of \(A\). In the dice example, we saw that learning some information \(B\) increased the probability of \(A\). But then we saw the following interesting example:
Suppose a card is selected at random from a regular deck of cards. Let \( E = \{ \text{the selected card is an ace} \} \) and let \( F = \{ \text{the selected card is a spade} \} \). Find \(P(E), P(F), P(E|F) \) and \(P(F|E) \).
- Answer
-
Since the card is selected at random, we are in a simple sample space. Hence \(P(E) = \frac{|E|}{|S|} = \frac{4}{52} = \frac{1}{13} \). By a similar argument, \(P(F) = \frac{|F|}{|S|} = \frac{13}{52} = \frac{1}{4} \).
To find \( P(E|F) \), we simply note that
\[ P( E | F ) = \frac{P(E \cap F)}{P(F)} = \frac{ \frac{|E \cap F|}{|S|} }{ \frac{|F|}{|S|} }= \frac{ \frac{1}{52} }{\frac{13}{52}} = \frac{1}{13} \nonumber \]
Notice that \( P(E) = P(E|F) \).
Meanwhile, to find \( P(F|E) \), we simply note that
\[ P( F | E ) = \frac{P(F \cap E)}{P(E)} = \frac{ \frac{|F \cap E|}{|S|} }{ \frac{|E|}{|S|} }= \frac{ \frac{1}{52} }{\frac{4}{52}} = \frac{1}{4} \nonumber \]
Notice that \( P(F) = P(F|E) \).
We noted that \( P(E) = P(E|F) \) and \( P(F) = P(F|E) \). Allow us to now interpret this.
When we say \( P(E) = P(E|F) \), we are saying that the occurrence of \(F\) has not affected the probability of \(E\).
Similarly, when we say \( P(F) = P(F|E) \), we are saying that the occurrence of \(E\) has not affected the probability of \(F\).
Hence, we might be inclined to say that the probability of the event \(E\) does not depend upon the knowledge of \(F\) occurring and the probability of the event \(F\) does not depend upon the knowledge of \(E\) occurring.
And finally, the last leap for us to make is that a simple way of saying "does not depend" is to say that these events are independent. That is, \(E\) and \(F\) are independent events since the occurrence of one event does not affect the probability of the other event, and vice-versa.
When we use the word "independent" in the above paragraph, we do so colloquially. Allow us to mathematically define what it means for two events to be independent.
Definition: Two events, \(E\) and \(F\), are said to be independent provided that
\[ P(E \cap F) = P(E) P(F) \]
If \( P(E \cap F) \neq P(E) P(F) \), then we say that \(E\) and \(F\) are dependent.
In words, the formal definition says that two events are independent when the probability of their intersection equals the product of their individual probabilities. Note that the formal definition of independence recovers our intuitive idea of what independence should be. To see this, suppose \(P(E) > 0\) and \(P(F) > 0\) so that the conditional probabilities are defined. Applying the definition of conditional probability and then the definition of independence, we obtain the following two equalities:
\[ P(E | F) \underbrace{=}_{def.} \frac{ P(E \cap F) }{ P(F) } \underbrace{=}_{indep.} \frac{ P(E ) P(F) }{ P(F) } = P(E) \nonumber \]
\[ P(F | E) \underbrace{=}_{def.} \frac{ P(F \cap E) }{ P(E) } \underbrace{=}_{indep.} \frac{ P(F ) P(E) }{ P(E) } = P(F) \nonumber \]
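As a quick check of the definition against the card example, here is a minimal Python sketch. It enumerates a standard 52-card deck and compares \(P(E \cap F)\) with \(P(E)P(F)\) using exact fractions. The names `deck`, `prob`, `E` and `F` are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from itertools import product

# Build a standard 52-card deck as (rank, suit) pairs.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = set(product(ranks, suits))

def prob(event):
    """Probability of an event in a simple (equally likely) sample space."""
    return Fraction(len(event), len(deck))

E = {card for card in deck if card[0] == "A"}       # the card is an ace
F = {card for card in deck if card[1] == "spades"}  # the card is a spade

p_E, p_F, p_EF = prob(E), prob(F), prob(E & F)
print(p_E, p_F, p_EF)      # 1/13 1/4 1/52
print(p_EF == p_E * p_F)   # True: the formal definition holds
print(p_EF / p_F == p_E)   # True: P(E|F) = P(E)
print(p_EF / p_E == p_F)   # True: P(F|E) = P(F)
```

Using `Fraction` rather than floating-point numbers keeps the equality tests exact.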
Suppose that two printers are operated independently of each other. Further suppose that the probability that Printer 1 will malfunction in a 24-hour period is \( \frac{1}{3} \) and the probability that Printer 2 will malfunction in the same 24-hour period is \( \frac{1}{4} \). Find the probability that at least one printer will malfunction during the same 24-hour period.
- Answer
-
Let \(A = \{ \text{at least one printer will malfunction during the same 24-hour period} \} \). Notice that we can rewrite \(A\) as
\begin{align*} A &= \{ \text{at least one printer will malfunction during the same 24-hour period} \} \\ &= \{ \text{Printer 1 will malfunction OR Printer 2 will malfunction} \} \\ &= \{ \text{Printer 1 will malfunction} \} \cup \{ \text{Printer 2 will malfunction} \} \\ &= M_1 \cup M_2 \end{align*}
where \(M_i = \{ \text{Printer} ~ i ~ \text{will malfunction during the 24-hour period} \} \). Hence we obtain the following:
\begin{align*} P(A) &= P(M_1 \cup M_2 ) \\ &= P(M_1) + P(M_2) - \underbrace{P(M_1 \cap M_2)}_{independence} \\ &= P(M_1) + P(M_2) - P(M_1)P(M_2) \\ &= \frac{1}{3} + \frac{1}{4} - \bigg(\frac{1}{3} \times \frac{1}{4} \bigg) \\ &= \frac{1}{2} = 0.5\end{align*}
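The arithmetic above can be reproduced with exact fractions. The sketch below is only a check under the stated assumption that the two printers malfunction independently. The variable names are illustrative. The second computation goes through the complement and relies on the fact, stated as a theorem later in this section, that complements of independent events are also independent.

```python
from fractions import Fraction

p_m1 = Fraction(1, 3)  # P(M_1): Printer 1 malfunctions
p_m2 = Fraction(1, 4)  # P(M_2): Printer 2 malfunctions

# Inclusion-exclusion, with P(M_1 and M_2) = P(M_1) P(M_2) by independence.
p_union = p_m1 + p_m2 - p_m1 * p_m2

# Same answer via the complement: 1 - P(neither printer malfunctions).
p_via_complement = 1 - (1 - p_m1) * (1 - p_m2)

print(p_union, p_via_complement)  # 1/2 1/2
```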
When asked to explain what it means for two events to be independent, some students incorrectly paraphrase the idea as "one event does not affect the other event." With this type of thinking, they are often led to believe that independent events must be separate, or mutually exclusive. However, this is incorrect! Can you explain why?
- Answer
-
Independence means that one event does not affect the probability of the other event and vice-versa. However, it is possible that two events are independent and one event does affect the outcomes in the other event.
To see this, let us think back to the card example considered in the beginning of this section. Notice that \( P(E) = P(E|F) \) and \( P(F) = P(F|E) \) and so one event did not affect the probability of the other event and vice-versa. However, notice that the knowledge of \(E\) does affect the outcomes which can happen in \(F\) and vice-versa. You see, before we learn anything, the possible outcomes in \(E\) are \( A \heartsuit, A \clubsuit, A \diamondsuit, A \spadesuit \). After learning that \(F\) has occurred, the only way for \(E\) to occur is to obtain \( A \spadesuit \). That is, the other outcomes in \(E\) can no longer occur and so learning that the event \(F\) has occurred has affected which outcomes in \(E\) can occur. Despite this, the overall probability of \(E\) remains unchanged.
In summary, if two events are independent, then neither event affects the probability of the other, but it is possible that each event affects which outcomes can occur in the other.
Additionally, there is an important distinction to be made. Notice that if \(E\) and \(F\) are two events with nonzero probabilities, then it is impossible for \(E\) and \(F\) to be both mutually exclusive and independent. To see why, let us assume that it is possible to have two events \(E\) and \(F\) where \(P(E)>0\) and \( P(F) > 0\) such that \(E\) and \(F\) are both independent and disjoint. Independence guarantees that
\[ \underbrace{P(E \cap F) = P(E) P(F)}_{independence} > 0 \nonumber \]
while disjointness guarantees
\[ \underbrace{P(E \cap F) = P(\emptyset)}_{mutually ~ exclusive} = 0 \nonumber \]
Hence \(P(E \cap F) > 0 \) and \(P(E \cap F) = 0 \). Contradiction!
We summarize the above discussion in the following remark.
In summary, the independence of two events means that the occurrence of one event does not affect the probability of the other event and vice-versa.*
Additionally, if you have two independent events, both of which have a non-zero probability of occurring, then it is impossible for the events to also be disjoint.
*Note that we will be revising this statement in a few moments.
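To make the distinction between "disjoint" and "independent" concrete, here is a small sketch on a single roll of a fair die. The die and the particular events are assumptions chosen for illustration; they are not taken from the examples above.

```python
from fractions import Fraction

space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die (illustrative example)

def prob(event):
    """Probability of an event when all six faces are equally likely."""
    return Fraction(len(event), len(space))

def is_independent(a, b):
    """Check the formal definition: P(A and B) == P(A) P(B)."""
    return prob(a & b) == prob(a) * prob(b)

E = {1, 2}
F_disjoint = {3, 4}    # shares no outcomes with E
F_overlap = {2, 4, 6}  # shares the outcome 2 with E

# Disjoint events with positive probability are not independent.
print(E & F_disjoint == set(), is_independent(E, F_disjoint))  # True False

# Independent events with positive probability must overlap.
print(E & F_overlap, is_independent(E, F_overlap))             # {2} True
```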
Our next theorem will be especially important for later examples.
Theorem: If \(E\) and \(F\) are independent events, then the following events are also independent:
- \(E^c\) and \(F\)
- \(E\) and \(F^c\)
- \(E^c\) and \(F^c\)
- Answer
-
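We discuss the proof of the first statement and leave the other two proofs as an exercise for homework.
Let us assume that \(E\) and \(F\) are independent events. We must show that \(E^c\) and \(F\) are independent. That is, we must show \(P(E^c \cap F) = P(E^c) P(F) \). Since \(F\) is the disjoint union of \(E \cap F\) and \(E^c \cap F\), we have \(P(F) = P(E \cap F) + P(E^c \cap F)\), and so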
\begin{align*} P(E^c \cap F) &= P(F) - \underbrace{P(E \cap F)}_{independence} \\ &= P(F) - P(E) P(F) \\ &= P(F) \big[ 1 - P(E) \big] \\ &= P(F) P(E^c) \\ &= P(E^c) P(F) \end{align*}
By virtue of the above theorem, we see that if \(E\) and \(F\) are independent then:
- \( P(E) = P(E|F) = P(E| F^c) \) and
- \( P(F) = P(F|E) = P(F| E^c) \).
Thus, we can say that intuitively, independence means that the occurrence or nonoccurrence of one event does not affect the probability of the other event and vice-versa.
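The theorem can be spot-checked on the card example from the beginning of this section. The sketch below is a numerical check rather than a proof. The encoding of the deck (rank 1 for an ace, `"S"` for spades) and the helper names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

# Deck as (rank, suit) pairs; rank 1 stands for an ace, "S" for spades.
deck = set(product(range(1, 14), "SHDC"))

def prob(event):
    return Fraction(len(event), len(deck))

def is_independent(a, b):
    return prob(a & b) == prob(a) * prob(b)

E = {card for card in deck if card[0] == 1}    # the card is an ace
F = {card for card in deck if card[1] == "S"}  # the card is a spade
E_c = deck - E                                 # complement of E
F_c = deck - F                                 # complement of F

for name, (a, b) in {"E, F": (E, F), "E^c, F": (E_c, F),
                     "E, F^c": (E, F_c), "E^c, F^c": (E_c, F_c)}.items():
    print(name, is_independent(a, b))  # every pair prints True

# P(E) = P(E|F) = P(E|F^c)
print(prob(E), prob(E & F) / prob(F), prob(E & F_c) / prob(F_c))  # 1/13 1/13 1/13
```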
Our next step is to generalize the definition of independence for more than two events. Most students reason that since for two events to be independent, it means that the probability of their intersection is equal to the product of the individual probabilities, then the same must be true for three events to be independent. That is, most students will guess that three events, \(E, F\) and \(G\) are independent provided that \( P(E \cap F \cap G) = P(E) P(F) P(G) \).
However, there is more to it! If \(E, F\) and \(G\) are to be independent, then every subcollection of these events should also be independent. That is, \(E\) and \(F\) should be independent, \(E\) and \(G\) should be independent, and \(F\) and \(G\) should be independent. Hence we have the following definition:
Definition: Three events, \(E, F\) and \(G\), are said to be mutually independent provided that all of the following equations are satisfied:
\begin{align*} P(E \cap F) &= P(E) P(F) \\ P(E \cap G) &= P(E) P(G) \\ P(F \cap G) &= P(F) P(G) \\ P(E \cap F \cap G) &= P(E) P(F) P(G) \end{align*}
The following definition generalizes the above idea in order to extend independence to \(k\) events.
Definition: The \(k\) events, \(A_1, A_2, \ldots, A_k \) are mutually independent provided that for every subset \( A_{i_{1}}, A_{i_{2}}, \ldots, A_{i_{j}} \) of \(j\) of these events, \( (j = 2, 3, 4, \ldots, k ) \),
\[ P(A_{i_{1}} \cap A_{i_{2}} \cap \ldots \cap A_{i_{j}}) = P(A_{i_{1}}) \times P(A_{i_{2}}) \times\ldots \times P(A_{i_{j}}) \nonumber\ \]
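The definition asks for one product equation per subcollection of size \(2, 3, \ldots, k\). The sketch below checks all of them by brute force. The function `mutually_independent` and the two-coin example are illustrative assumptions, not part of the text. In this example every pair of events satisfies the product rule, yet the three events together do not, which is why the pairwise equations alone are not enough.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

# Sample space: two flips of a fair coin (illustrative example).
space = set(product("HT", repeat=2))

def prob(event):
    return Fraction(len(event), len(space))

def mutually_independent(events):
    """Check the product rule for every subcollection of size 2, ..., k."""
    for j in range(2, len(events) + 1):
        for subset in combinations(events, j):
            intersection = set.intersection(*subset)
            if prob(intersection) != prod(prob(a) for a in subset):
                return False
    return True

A = {w for w in space if w[0] == "H"}   # first flip is heads
B = {w for w in space if w[1] == "H"}   # second flip is heads
C = {w for w in space if w[0] == w[1]}  # both flips agree

print([mutually_independent(list(pair)) for pair in combinations([A, B, C], 2)])
# [True, True, True]
print(mutually_independent([A, B, C]))  # False: P(A∩B∩C) = 1/4, not 1/8
```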
In reliability theory, a parallel system refers to a system which will function as long as at least one of its components is working. Suppose the batteries in the Mars Rover form a parallel system that has three components. Further suppose that the probability that any given component is working is 0.99. If the components operate independently of each other, then what is the probability that the system will function?
- Solution 1
-
We will first answer this question directly. Let \(E = \{ \text{the system will function} \} \). As usual, we pause for a moment and ask if we can re-write \(E\). We notice we can write
\begin{align*} E &= \{ \text{the system will function} \} \\ &= \{ \text{at least one component is working} \} \\ &= \{ \text{Component 1 works OR Component 2 works OR Component 3 works} \} \\ &= C_1 \cup C_2 \cup C_3 \end{align*}
where \(C_i = \{ \text{Component} ~ i ~ \text{is working} \} \). By Equation 1.3.13, we obtain the following:
\begin{align*} P(E) &= P(C_1 \cup C_2 \cup C_3) \\ &= P(C_1) + P(C_2) + P(C_3) - \underbrace{P(C_1 \cap C_2)}_{indep.} - \underbrace{P(C_1 \cap C_3)}_{indep.} - \underbrace{P(C_2 \cap C_3)}_{indep.} + \underbrace{P(C_1 \cap C_2 \cap C_3)}_{indep.} \\ &= P(C_1) + P(C_2) + P(C_3) - P(C_1) P(C_2) - P(C_1)P(C_3) - P(C_2)P(C_3) + P(C_1)P(C_2)P(C_3) \\ &= 0.99 + 0.99 + 0.99 - 0.99(0.99) - 0.99(0.99) - 0.99(0.99) + 0.99(0.99)(0.99) \\ &= 0.999999 \end{align*}
Notice that if the Rover had only one battery component, then the Rover would function with probability 0.99. However, by adding two more components as part of a parallel system, the probability that the Rover functions rises to 0.999999.
In reliability theory, a parallel system refers to a system which will function as long as at least one of its components is working. Suppose the batteries in the Mars Rover form a parallel system that has three components. Further suppose that the probability that any given component is working is 0.99. If the components operate independently of each other, then what is the probability that the system will function?
- Solution 2
-
We will now answer this question indirectly via the complement. Since \( E = \{ \text{the system will function} \} = \{ \text{at least one component is working} \} \), then \begin{align*} E^c &= \{ \text{none of the components are working} \} \\& = \{ \text{Component 1 is not working AND Component 2 is not working AND Component 3 is not working} \} \\ &= \{ \text{Component 1 is not working} \} \cap \{ \text{Component 2 is not working} \} \cap \{ \text{Component 3 is not working} \} \\ &= C_1^c \cap C_2^c \cap C_3^c \end{align*}
Putting everything together yields
\begin{align*} P(E) = 1 - P(E^c) &= 1 - \underbrace{ P(C_1^c \cap C_2^c \cap C_3^c)}_{indep.} \\ &= 1 - P(C_1^c) P(C_2^c) P(C_3^c) \\ &= 1 - 0.01(0.01)(0.01) \\ &= 0.999999 \end{align*}
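Both solutions can be written as short functions and compared. The following is a sketch under the assumption that the components are mutually independent. The function names `parallel_direct` and `parallel_complement` are hypothetical, and the code accepts any number of components rather than just three.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def parallel_direct(p_working):
    """P(at least one component works) by inclusion-exclusion.

    Each intersection probability is replaced by a product,
    which is valid only for mutually independent components.
    """
    total = 0
    for j in range(1, len(p_working) + 1):
        sign = 1 if j % 2 == 1 else -1
        for subset in combinations(p_working, j):
            total += sign * prod(subset)
    return total

def parallel_complement(p_working):
    """P(at least one component works) = 1 - P(every component fails)."""
    return 1 - prod(1 - p for p in p_working)

components = [Fraction(99, 100)] * 3
print(parallel_direct(components))      # 999999/1000000
print(parallel_complement(components))  # 999999/1000000
```

The complement version needs a single product, while the direct version sums over every nonempty subcollection, so the gap in effort grows quickly with the number of components.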
Suppose an experiment consists of repeatedly flipping a fair coin until a heads appears for the first time at which point the experiment stops. Assuming that the flips of the coin are independent of each other, find the probability that the experiment will eventually terminate. That is, find the probability that a heads is eventually obtained.
- Answer
-
Before we answer this, allow us to guess the solution. If we are given a fair coin and we are asked what is the probability that we will eventually obtain a heads, our intuition might tell us that the probability should be 1. That is, we are saying that given an infinite amount of coin flips, we are certain that eventually we will obtain a heads. Thus, we expect the answer to be 1. Allow us to now formally answer the problem.
We may answer this question directly as in Example 3, but we will opt for an indirect approach as in Example 4. (The direct approach is left as an exercise for homework.) Let \( E = \{ \text{we eventually obtain our first heads} \} \). Then \begin{align*} E^c &= \{ \text{it is not the case we eventually obtain our first heads} \} \\ &= \{ \text{we never obtain our first heads} \} \\ &= \{ \text{we always obtain a tails} \} \\ &= \{ \text{flip 1 results in tails AND flip 2 results in tails AND flip 3 results in tails AND} \ldots \} \\ &= T_1 \cap T_2 \cap T_3 \cap \ldots \end{align*}
where \(T_i = \{ \text{we obtain a tails on flip} ~ i \} \). Putting everything together yields
\begin{align*} P(E) = 1 - P(E^c) &= 1 - \underbrace{ P(T_1 \cap T_2 \cap T_3 \cap \ldots)}_{indep.} \\ &= 1 - P(T_1) P(T_2) P(T_3) \ldots \\ &= 1 - 0.5(0.5)(0.5) \ldots \\ &= 1 - \lim_{n \rightarrow \infty} 0.5^n \\ &= 1 - 0 = 1 \end{align*}
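To see the limit in the last step numerically, the sketch below tabulates \(1 - 0.5^n\), the probability that a heads appears within the first \(n\) flips. The function name `prob_heads_within` and the chosen values of \(n\) are illustrative assumptions.

```python
from fractions import Fraction

def prob_heads_within(n, p_heads=Fraction(1, 2)):
    """P(at least one heads in the first n independent flips)."""
    return 1 - (1 - p_heads) ** n

# The values climb toward 1 as n grows.
for n in (1, 2, 5, 10, 20):
    print(n, float(prob_heads_within(n)))
```

For instance, `prob_heads_within(3)` is exactly 7/8, and by \(n = 20\) the value already differs from 1 by less than one in a million.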
A sequence of five independent trials is to be performed. Each trial consists of administering a drug to a patient and then seeing if the drug is a success (meaning beneficial) for that patient or if the drug is a failure (meaning non-beneficial) for that patient. The probability the drug is successful for any patient is 0.75. Find the probability that the drug is successful for at least one patient.
- Answer
-
We may answer this question directly as in Example 3, but we will opt for an indirect approach as in Examples 4 and 5. (The direct approach is left as an exercise for homework.) Let \( E = \{ \text{the drug is successful for at least one patient} \} \). Then \begin{align*} E^c &= \{ \text{the drug is successful for no patients} \} \\ &= \{ \text{the drug is a failure for all patients} \} \\ &= \{ \text{the drug is a failure for patient 1 AND the drug is a failure for patient 2 AND ... AND the drug is a failure for patient 5} \} \\ &= F_1 \cap F_2 \cap F_3 \cap F_4 \cap F_5 \end{align*} where \(F_i = \{ \text{the drug is a failure for patient} ~ i \} \). Putting everything together yields \begin{align*} P(E) = 1 - P(E^c) &= 1 - \underbrace{ P(F_1 \cap F_2 \cap F_3 \cap F_4 \cap F_5) }_{indep.} \\ &= 1 - P(F_1) P(F_2) P(F_3)P(F_4)P(F_5) \\ &= 1 - 0.25(0.25)(0.25)(0.25)(0.25) \\ &= 1 - 0.25^5 \\ &\approx 0.9990 \end{align*}
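The complement calculation can be cross-checked by brute force. The sketch below enumerates all \(2^5\) success/failure sequences, assigns each a probability using independence, and adds up those with at least one success. The names `sequence_prob`, `direct` and `indirect` are illustrative, and the enumeration is a numerical check, not the written direct solution.

```python
from fractions import Fraction
from itertools import product

p = Fraction(3, 4)  # probability of success for any one patient
n = 5               # number of independent trials

def sequence_prob(seq):
    """Probability of one specific sequence such as ('S','F','S','S','F')."""
    result = Fraction(1)
    for outcome in seq:
        result *= p if outcome == "S" else 1 - p
    return result

# Add up every sequence that contains at least one success.
direct = sum(sequence_prob(s) for s in product("SF", repeat=n) if "S" in s)

# Complement: 1 - P(all five trials fail).
indirect = 1 - (1 - p) ** n

print(direct, indirect)  # 1023/1024 1023/1024
print(float(indirect))   # 0.9990234375
```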
Our last example for this section is particularly important as its generalization will yield an important result.
A sequence of five independent trials is to be performed. Each trial consists of administering a drug to a patient and then seeing if the drug is a success (meaning beneficial) for that patient or if the drug is a failure (meaning non-beneficial) for that patient. The probability the drug is successful for any patient is 0.75. Find the probability that exactly three successes occur.
- Answer
-
Let \( E = \{ \text{we obtain exactly three successes} \} \). Notice that the complement of \(E\) is that we obtain zero or one or two or four or five successes. Since the complement is more tedious than \(E\) itself, we will stick to answering this question directly.
As usual, we ask ourselves whether it is possible to rewrite \(E\). If we let \(F_i = \{ \text{the drug is a failure for patient} ~ i \} \) and \(S_i = \{ \text{the drug is a success for patient} ~ i \} \) then \(E\) is the union of all possible sequences of five trials where exactly three of the trials are a success. That is,
\[ E = (S_1 \cap S_2 \cap S_3 \cap F_4 \cap F_5) \cup (S_1 \cap S_2 \cap F_3 \cap F_4 \cap S_5) \cup \ldots \cup (F_1 \cap S_2 \cap S_3 \cap S_4 \cap F_5) \nonumber \]
Notice the events \( (S_1 \cap S_2 \cap S_3 \cap F_4 \cap F_5), (S_1 \cap S_2 \cap F_3 \cap F_4 \cap S_5), \ldots, (F_1 \cap S_2 \cap S_3 \cap S_4 \cap F_5) \) are all disjoint and so
\[ P(E) = P(S_1 \cap S_2 \cap S_3 \cap F_4 \cap F_5) + P(S_1 \cap S_2 \cap F_3 \cap F_4 \cap S_5) + \ldots + P(F_1 \cap S_2 \cap S_3 \cap S_4 \cap F_5) \nonumber \]
Notice that by independence, \( P(S_1 \cap S_2 \cap S_3 \cap F_4 \cap F_5) = P(S_1) P(S_2) P(S_3) P(F_4) P(F_5) = 0.75^3 \times 0.25^2 \) and similarly, \( P(S_1 \cap S_2 \cap F_3 \cap F_4 \cap S_5) = 0.75^3 \times 0.25^2 \) and \( P(F_1 \cap S_2 \cap S_3 \cap S_4 \cap F_5) = 0.75^3 \times 0.25^2 \). That is, since each of the events in the union has three successes and two failures, each event occurs with probability \(0.75^3 \times 0.25^2\).
Hence, \begin{align*} P(E) &= P(S_1 \cap S_2 \cap S_3 \cap F_4 \cap F_5) + P(S_1 \cap S_2 \cap F_3 \cap F_4 \cap S_5) + \ldots + P(F_1 \cap S_2 \cap S_3 \cap S_4 \cap F_5) \\ &= \underbrace{0.75^3 \times 0.25^2 + 0.75^3 \times 0.25^2 + \ldots + 0.75^3 \times 0.25^2}_{\text{how many times are these added together?}} \end{align*}
The number of times \(0.75^3 \times 0.25^2\) appears in the sum is equal to the number of ways to place three successes among the five placeholders. That is, there are \( \binom{5}{3} = 10 \) possible arrangements. Putting everything together yields \begin{align*} P(E) &= P(S_1 \cap S_2 \cap S_3 \cap F_4 \cap F_5) + P(S_1 \cap S_2 \cap F_3 \cap F_4 \cap S_5) + \ldots + P(F_1 \cap S_2 \cap S_3 \cap S_4 \cap F_5) \\ &= \underbrace{0.75^3 \times 0.25^2 + 0.75^3 \times 0.25^2 + \ldots + 0.75^3 \times 0.25^2}_{ \binom{5}{3} } \\ &= \binom{5}{3} 0.75^3 0.25^2 \\ &\approx 0.2637 \end{align*}
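The counting step can be made explicit by listing which three of the five patients are the successes. The sketch below does this with `itertools.combinations` and then multiplies the number of arrangements by the common probability of each arrangement. The variable names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

p = Fraction(3, 4)  # probability of success for any one patient
n, x = 5, 3         # five trials, exactly three successes

# Each arrangement is the set of patient numbers for which the drug succeeds.
arrangements = list(combinations(range(1, n + 1), x))
print(len(arrangements), comb(n, x))      # 10 10
print(arrangements[0], arrangements[-1])  # (1, 2, 3) (3, 4, 5)

# By independence, every arrangement has the same probability.
each = p**x * (1 - p) ** (n - x)
total = len(arrangements) * each
print(each, total, float(total))          # 27/1024 135/512 0.263671875
```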
The above example can be generalized to show the following result:
Theorem: Suppose a sequence of \(n\) independent trials is to be performed where each trial results in either a success with probability \(p\) or a failure with probability \(1-p\). The probability that we obtain exactly \(x\) successes (where \(x\) is one of \( 0, 1, 2, \ldots, n \) ) is given by \[ \binom{n}{x}p^x (1-p)^{n-x} \nonumber \]
- Proof:
-
The proof is left as a homework exercise. Hint: Generalize the above problem!
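The formula in the theorem translates directly into code. The following minimal sketch evaluates it for the drug example (\(n = 5\), \(p = 0.75\)) and checks two facts that any correct answer must satisfy: the probabilities over \(x = 0, 1, \ldots, n\) add up to 1, and the values agree with the earlier examples. The function name `binomial_pmf` is a hypothetical choice, and the checks are numerical evidence rather than a proof.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(exactly x successes in n independent trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

p = Fraction(3, 4)
pmf = [binomial_pmf(x, 5, p) for x in range(6)]

print(pmf[3])      # 135/512, exactly three successes
print(1 - pmf[0])  # 1023/1024, at least one success
print(sum(pmf))    # 1
```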