1.2: The Three Axioms of Probability
In the last section, we saw that our informal definition of probability has some holes in it, and this is problematic! In order to study probability, we must first agree on what exactly a probability is. We then noted that the formal definition of probability is rooted in the language of sets, and so we studied set theory. Although we are now in a position to formally define a probability, there is one last topic to discuss, which will hopefully grant us some insight into probability and, more generally, mathematics.
Mathematics is built from several kinds of statements: definitions, theorems, axioms, lemmas, corollaries, and conjectures, to name a few. Perhaps some of these terms are familiar to us, but what I would like to focus on specifically are axioms and theorems. What do we mean by an axiom?
Simply put, an axiom is a starting point in mathematics. More precisely, an axiom is a statement which we assume to be true. That is, there is no proving an axiom. As an analogy, axioms are our building blocks. They give us the underlying structure of the branch of mathematics we are working in, and the entire branch is developed from those axioms. For us, our entire theory of probability and statistics rests upon the following three axioms:
Probability is a real-valued function \( P \) that assigns to each event \(A\) in a sample space \(S\) a number called the probability of the event \(A\), denoted by \( P(A) \), such that the following three properties are satisfied:
- \( P(A) \geq 0 \).
- \( P(S) = 1 \).
- If \( A_1, A_2, A_3, \ldots \) are events and \( A_i \cap A_j = \emptyset \) for all \( i \neq j \) then
\[ P(A_1 \cup A_2 \cup A_3 \cup \ldots ) = P(A_1) + P(A_2) + P(A_3) + \ldots .\]
A similar result holds for every finite collection of mutually exclusive events. That is, if \( A_1, A_2, A_3, \ldots , A_k \) are events and \( A_i \cap A_j = \emptyset \) for all \( i \neq j \) then
\[ P(A_1 \cup A_2 \cup A_3 \cup \ldots \cup A_k ) = P(A_1) + P(A_2) + P(A_3) + \ldots + P(A_k) \]
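As a quick illustration, consider a hypothetical example that is not part of the definition itself: one roll of a fair die, so that \( S = \{1, 2, 3, 4, 5, 6\} \), with every outcome assumed equally likely so that \( P(A) = |A|/6 \). The events \( A_1 = \{1, 2\} \) and \( A_2 = \{5\} \) are mutually exclusive, so the third axiom gives
\[ P(A_1 \cup A_2) = P(A_1) + P(A_2) = \frac{2}{6} + \frac{1}{6} = \frac{1}{2}, \]
which agrees with counting directly: \( A_1 \cup A_2 = \{1, 2, 5\} \) contains 3 of the 6 outcomes.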
Simply put, here are the important aspects of the above definition in words:
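- When we say probability is a real-valued function that assigns a number to each event \(A\) in a sample space \(S\), we mean that \( P \) takes events, which are subsets of \(S\), as its inputs and returns real numbers. In other words, the domain of \( P \) is a collection of subsets of \(S\) (for a finite sample space, typically all of its subsets), not \(S\) itself.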
- The first axiom states that a probability is nonnegative.
- The second axiom states that the probability of the sample space is equal to 1.
- The third axiom states that for every finite or countably infinite collection of mutually exclusive events, the probability of their union is the sum of the individual probabilities.
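The points above can be checked by brute force on a small example. The sketch below is a hypothetical illustration, not part of the formal definition: it assumes one roll of a fair six-sided die with equally likely outcomes, so that \( P(A) = |A|/|S| \), and the names `S`, `P`, and `all_events` are made up for this example. It treats \( P \) as a function of events (subsets of \(S\)) and checks the three axioms, with the third checked only in its finite form for pairs of mutually exclusive events.

```python
from fractions import Fraction
from itertools import chain, combinations

# Assumed sample space: one roll of a fair six-sided die.
S = frozenset(range(1, 7))


def P(event):
    """Probability of an event (a subset of S), assuming equally likely outcomes."""
    return Fraction(len(event), len(S))


def all_events(sample_space):
    """Return every subset of the sample space, i.e. every possible event."""
    items = sorted(sample_space)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


events = all_events(S)

# Axiom 1: every event has nonnegative probability.
axiom1 = all(P(A) >= 0 for A in events)

# Axiom 2: the sample space has probability 1.
axiom2 = P(S) == 1

# Axiom 3 (finite form, pairs only): for mutually exclusive events A and B,
# P(A union B) = P(A) + P(B). An empty intersection is falsy in Python.
axiom3 = all(
    P(A | B) == P(A) + P(B)
    for A in events
    for B in events
    if not (A & B)
)

print(axiom1, axiom2, axiom3)  # prints: True True True
```

Using `Fraction` rather than floating-point numbers keeps the equality checks exact. Note that `P` is only ever called on sets of outcomes, never on a single outcome such as `3`.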
Looking back at the above definition, we see that the problems we highlighted in the last section with the intuitive definition of probability are no longer present. More generally, it seems rather difficult to poke any holes in the above definition. Furthermore, we will see in the next section that the above definition of probability yields sensible results that agree with our intuition.
_____________________________________________________________________________________________________________________________________
- Note that the first part of Axiom 3 is what we refer to as countable additivity, while the second part of Axiom 3 is called finite additivity. Moreover, if we assume countable additivity then we can prove that finite additivity holds. That is, finite additivity is actually a theorem. However, for simplicity, we will take both countable additivity and finite additivity as axioms, and we leave the proof that countable additivity implies finite additivity as an exercise.