1.3: Basic Propositions in Probability
In this section, we will develop a small list of theorems that we will use for the entire duration of the semester. For this reason, it is important that we understand the proof of each theorem, both formally and informally.
\[ P(A^c) = 1 - P(A) \]
Proof: Recall The Complement Laws from Section 1.1. Since \( A \cup A^c = S \), it follows that \[ P(A \cup A^c ) = P(S) \label{comp1} \] Again by The Complement Laws, \( A \cap A^c = \emptyset \) and so \(A\) and \( A^c \) are mutually exclusive. Since we have a union of mutually exclusive events, Axiom 3 tells us that \[ P(A \cup A^c ) = P(A) + P(A^c) \label{comp2} \] Substituting Equation \ref{comp2} into the left hand side of Equation \ref{comp1}, and recalling from Axiom 2 that \( P(S) = 1 \), yields \[ P(A) + P(A^c) = 1 \nonumber \]
Hence, \[ P(A^c) = 1 - P(A) \nonumber \]
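As a quick numerical illustration of the complement rule, the following Python sketch assumes a fair six-sided die with equally likely outcomes. The die, the event \(A\), and the helper `prob` are assumptions made only for this example.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
S = frozenset(range(1, 7))

def prob(event):
    """Probability of an event, assuming equally likely outcomes in S."""
    return Fraction(len(event), len(S))

A = frozenset({6})   # event: roll a six
A_c = S - A          # complement of A within S

print(prob(A_c))     # 5/6
print(1 - prob(A))   # 5/6
```

Both lines print the same value, as the theorem predicts.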
\[ P( \emptyset ) = 0 \]
Proof: Let \(A\) be any event and recall The Domination Laws from Section 1.1. Since \( A \cup \emptyset = A \), it follows that \[ P(A \cup \emptyset ) = P(A) \label{null1} \] Again by The Domination Laws, \( A \cap \emptyset = \emptyset \) and so \(A\) and \( \emptyset \) are mutually exclusive. Since we have a union of mutually exclusive events, Axiom 3 tells us that \[ P(A \cup \emptyset ) = P(A) + P( \emptyset ) \label{null2} \] Substituting Equation \ref{null2} into the left hand side of Equation \ref{null1} yields \[ P(A) + P(\emptyset ) = P(A) \nonumber \]
Subtracting \( P(A) \) from both sides, we conclude that \[ P( \emptyset) = 0 \nonumber \]
If \( E \subseteq F \) then \( P(E) \leq P(F) \).
Proof: Recall from Section 1.1 that if \( E \subseteq F \) then \( F = E \cup (E^c \cap F) \). Hence, \[ P(F) = P \big( E \cup (E^c \cap F) \big) \label{sub1} \]
Notice that \(E \) and \( E^c \cap F \) are mutually exclusive, so \( E \cup (E^c \cap F) \) is a union of mutually exclusive events. Axiom 3 then tells us that \[ P \big( E \cup (E^c \cap F) \big) = P(E) + P(E^c \cap F) \label{sub2} \]
Substituting Equation \ref{sub2} into the right hand side of Equation \ref{sub1} yields \[ P( F ) = P(E) + P(E^c \cap F) \nonumber \]
By Axiom 1, we know that \( P(E^c \cap F) \geq 0 \) and so it follows that \[ P(E) \leq P(F) \nonumber \]
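The decomposition used in this proof can be checked on a small example. The sketch below again assumes a fair six-sided die with equally likely outcomes; the events `E` and `F` and the helper `prob` are hypothetical choices for illustration.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
S = frozenset(range(1, 7))

def prob(event):
    """Probability of an event, assuming equally likely outcomes in S."""
    return Fraction(len(event), len(S))

E = frozenset({2, 4})       # event: roll a 2 or a 4
F = frozenset({2, 4, 6})    # event: roll an even number, so E is a subset of F
E_c_and_F = (S - E) & F     # the part of F that lies outside E

print(E <= F)                                  # True: E is a subset of F
print(prob(F) == prob(E) + prob(E_c_and_F))    # True: P(F) = P(E) + P(E^c ∩ F)
print(prob(E) <= prob(F))                      # True: P(E) <= P(F)
```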
For each event \( A \) of a sample space \( S \), \( P(A) \leq 1 \).
Proof: (Try this on your own before reading the solution below!)
Since \( A \subseteq S \), then by the above theorem, \( P(A) \leq P(S) \). By Axiom 2 , we know that \( P(S) = 1 \) and so \( P(A) \leq 1 \).
Our next theorem is particularly important since it gives the probability of the union of any two events, whether or not they are disjoint. Currently, by Axiom 3, we know that if \(A\) and \(B\) are disjoint, then \( P(A \cup B) = P(A) + P(B) \). However, what if \(A\) and \(B\) were not disjoint? In such a case, how can we find \( P(A \cup B) \)? The answer is given to us in the following theorem.
For any two events, \(A\) and \(B\) of a sample space \(S\), \( P(A \cup B) = P(A) + P(B) - P(A \cap B) \).
Proof:
Observe that \( A \cup B = A \cup (A^c \cap B) \). Thus, \[ P(A \cup B) = P \big(A \cup (A^c \cap B) \big) \label{union1} \]
Since \( A \) and \( A^c \cap B \) are mutually exclusive, then by Axiom 3 , \[ P \big(A \cup (A^c \cap B) \big) = P(A) + P(A^c \cap B) \label{union2} \]
Thus,
\[ P(A \cup B) = P(A) + P(A^c \cap B) \label{union4} \]
Similarly, since \( B = (A^c \cap B) \cup (A \cap B) \) where \( (A^c \cap B) \) and \( (A \cap B) \) are mutually exclusive, then by Axiom 3 ,
\[ P(B) = P(A^c \cap B) + P(A \cap B) \nonumber \]
Rearranging the above equation yields
\[P(A^c \cap B) = P(B) - P(A \cap B) \label{union3} \]
Substituting Equation \ref{union3} into the right hand side of Equation \ref{union4} yields \[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \nonumber \]
The above theorem establishes that \( P(A \cup B) = P(A) + P(B) - P(A \cap B) \). A visual way of understanding this is to consider the number of times each region is counted or shaded in. Consider the Venn Diagram for \(A \cup B \) when \(A\) and \(B\) are not disjoint and let us study \( P(A \cup B) \). Notice that there are three regions and each region is counted or shaded once. Draw it out! One region is \(A \cap B^c \), the other region is \( A \cap B \), and the last region is \( A^c \cap B \).
Similarly, let us again consider the Venn Diagram for \(A \cup B \) and let us study \( P(A) + P(B) - P(A \cap B) \) by counting the number of times each region is shaded. We first shade in \( A \), and so there are two regions that get counted once. (Draw it out!) We then shade in \( B \). At this point the region \(A \cap B^c \) is counted once, the region \( A \cap B \) is counted twice, and the region \( A^c \cap B \) is counted once. In order to get this picture to match the picture described in the above paragraph, we simply subtract out \( P(A \cap B) \) exactly once.
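The counting argument above can be seen numerically. The following sketch assumes a fair six-sided die with equally likely outcomes; the events `A` and `B` and the helper `prob` are hypothetical and chosen only so that the two events overlap.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
S = frozenset(range(1, 7))

def prob(event):
    """Probability of an event, assuming equally likely outcomes in S."""
    return Fraction(len(event), len(S))

A = frozenset({2, 4, 6})    # event: roll an even number
B = frozenset({4, 5, 6})    # event: roll a number greater than 3

# Adding P(A) and P(B) counts the overlap {4, 6} twice.
print(prob(A) + prob(B))                   # 1
# Subtracting P(A ∩ B) once corrects the double count.
print(prob(A) + prob(B) - prob(A & B))     # 2/3
print(prob(A | B))                         # 2/3
```

The naive sum overshoots by exactly \( P(A \cap B) = 1/3 \), which is the region counted twice.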
Clearly the above theorem holds in the special case that \(A\) and \(B\) are disjoint. Why?
We now extend the above formula for the union of any three events.
For any three events, \(E_1, E_2 \) and \(E_3\) of a sample space \(S\), \[ P( E_1 \cup E_2 \cup E_3) = P(E_1) + P(E_2) + P(E_3) - P(E_1 \cap E_2) - P(E_1 \cap E_3) - P(E_2 \cap E_3) + P(E_1 \cap E_2 \cap E_3) \]
Proof: This is left as an exercise to the reader.
Hint: We can rewrite the union of these three events as the union of two events via the associative property of the union: \( P( E_1 \cup E_2 \cup E_3) = P \big( (E_1 \cup E_2 ) \cup E_3 \big) \). We can now apply the formula for the union of two events and proceed from there.
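Before attempting the proof, it can help to confirm the three-event formula on a concrete case. The sketch below is a numerical check, not a proof. It assumes twelve equally likely outcomes labelled 1 through 12; the events `E1`, `E2`, `E3` and the helper `prob` are hypothetical.

```python
from fractions import Fraction

# Assumed sample space: twelve equally likely outcomes labelled 1..12.
S = frozenset(range(1, 13))

def prob(event):
    """Probability of an event, assuming equally likely outcomes in S."""
    return Fraction(len(event), len(S))

E1 = frozenset(x for x in S if x % 2 == 0)   # multiples of 2
E2 = frozenset(x for x in S if x % 3 == 0)   # multiples of 3
E3 = frozenset(x for x in S if x > 8)        # outcomes greater than 8

lhs = prob(E1 | E2 | E3)
rhs = (prob(E1) + prob(E2) + prob(E3)
       - prob(E1 & E2) - prob(E1 & E3) - prob(E2 & E3)
       + prob(E1 & E2 & E3))

print(lhs, rhs)   # 3/4 3/4
```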
The above theorem can also be understood by considering the Venn diagram. To end this section, we present the general inclusion-exclusion identity which can be proved by mathematical induction.
For every \( n\) events \( E_1, E_2, \ldots, E_n \) of a sample space \( S \),
\[ P(E_1 \cup E_2 \cup \ldots \cup E_n) = \sum_{i=1}^{n} P(E_i) - \sum_{i_1 < i_2} P(E_{i_{1}} \cap E_{i_{2}} ) + \ldots + (-1)^{k+1} \sum_{i_1 < i_2 < \ldots < i_k} P(E_{i_{1}} \cap E_{i_{2}} \cap \ldots \cap E_{i_{k}}) + \ldots + (-1)^{n+1} P(E_1 \cap E_2 \cap \ldots \cap E_n) \]
Simply put, the above theorem states that the probability of the union of any \(n\) events is equal to the sum of the individual probabilities, minus the sum of the probabilities of every possible intersection of two events, plus the sum of the probabilities of every possible intersection of three events, and so on with alternating signs until the intersection of all \(n\) events is reached.
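The alternating sum can be written as a short loop over the number \(k\) of events being intersected. The following is a minimal sketch, assuming a finite sample space with equally likely outcomes; the function names `prob` and `union_prob_inclusion_exclusion` and the example events are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def prob(event, sample_space):
    """Probability of an event, assuming equally likely outcomes."""
    return Fraction(len(event), len(sample_space))

def union_prob_inclusion_exclusion(events, sample_space):
    """Evaluate the right hand side of the inclusion-exclusion identity."""
    n = len(events)
    total = Fraction(0)
    for k in range(1, n + 1):
        sign = (-1) ** (k + 1)                    # +, -, +, ... as k grows
        for group in combinations(events, k):     # every choice i_1 < ... < i_k
            intersection = frozenset.intersection(*group)
            total += sign * prob(intersection, sample_space)
    return total

# Assumed example: outcomes 1..30, events "divisible by d" for d = 2, 3, 5, 7.
S = frozenset(range(1, 31))
events = [frozenset(x for x in S if x % d == 0) for d in (2, 3, 5, 7)]

direct = prob(frozenset.union(*events), S)               # left hand side
via_formula = union_prob_inclusion_exclusion(events, S)  # right hand side

print(direct, via_formula)   # 23/30 23/30
```

The loop visits \( 2^n - 1 \) intersections, so this direct evaluation is only practical for a small number of events.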