8.1: Discrete Random Variables
As a brief recap of our journey thus far, we first laid down the foundations of probability and then developed a set of theorems which stem from our axiomatic foundation. Afterwards, we related counting techniques to probability and then related those counting techniques to conditional probabilities. We are now going to make an important jump: we are going to discuss what is meant by a random variable. Before we formally define the term random variable, allow us to make a note: there are three classes of random variables.
We may have a discrete random variable, a continuous random variable, or a mixed random variable. Once we know how to deal with one class of random variables, the theory concerning the other two classes is very similar. Hence, we will devote a substantial amount of time to discussing discrete random variables and then we will see how the theory for discrete random variables can be generalized to continuous random variables and mixed random variables.
With that said, allow us to formally define what is meant by a random variable.
Definition: Let \(S\) denote the sample space of an experiment. A random variable is a real-valued function defined on \(S\). Often, we will denote a random variable by a capital letter such as \(X\).
In terms of our usual notation for functions, note that \(X: S \rightarrow \mathbb{R} \). That is, \(X\) takes each element of \(S\) and sends it to an element of \(\mathbb{R} \). We typically denote an element in \(S\) by \(s\) and we will denote an element in \( \mathbb{R} \) by \(x\). Hence, \( X(s) = x \). Often, \(x\) is called a realization of \(X\), and this will be further explained in a moment.
In order to illustrate what is going on, allow us to consider a concrete example.
Suppose an experiment consists of flipping a fair coin ten times.
- What does the sample space for this experiment look like?
- Write down a few outcomes in \(S\).
- Allow us to define \( X \) to be the random variable which models the number of heads obtained. Determine the value of \(X\) when \(X\) is evaluated at the points in \(S\) from Part 2.
- What are all the possible values of \(X\)?
- Does \(X\) always take the same value?
- Answer
- Let \(S\) denote the set of all outcomes, where an outcome is an ordered 10-tuple in which each entry is either an \(H\) for Heads or a \(T\) for Tails.
- Here are some outcomes in \(S: (H, H, H, H, H, H, H, H, H, H), (H, T, H, T, H, T, H, T, H, T), (H, H, H, H, H, T, T, T, T, T), (T, T, T, T, T, T, T, T, T, T) \).
- Observe that \begin{align*} X \bigg( (H, H, H, H, H, H, H, H, H, H) \bigg) &= 10 \\ X \bigg( (H, T, H, T, H, T, H, T, H, T) \bigg) &= 5 \\ X \bigg( (H, H, H, H, H, T, T, T, T, T) \bigg) &= 5 \\ X \bigg( (T, T, T, T, T, T, T, T, T, T) \bigg) &= 0 \end{align*}
- Since we are flipping the coin ten times, the possible numbers of heads are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10. Hence \(X: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 \).
- Clearly, as demonstrated above, \(X\) does not always take on the same value. It is possible that after flipping the coin ten times, we obtain 0 heads while other times we may obtain 1 head.
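To make the idea that \(X\) is a function on \(S\) concrete, the answers above can be checked with a short Python sketch. The names `S`, `X`, `examples`, and `values_of_X` are illustrative choices made here, and representing an outcome as a tuple of the characters `'H'` and `'T'` is an assumption of this sketch, not notation from the text.

```python
from itertools import product

# Sample space S: all ordered 10-tuples whose entries are 'H' or 'T'.
S = list(product("HT", repeat=10))

def X(s):
    """The random variable: the number of heads in the outcome s."""
    return s.count("H")

# The four outcomes written down in Part 2.
examples = [
    ("H",) * 10,
    ("H", "T") * 5,
    ("H",) * 5 + ("T",) * 5,
    ("T",) * 10,
]
for s in examples:
    print("".join(s), "->", X(s))

# Every value that X takes somewhere on S.
values_of_X = sorted({X(s) for s in S})
print(len(S))        # 1024, i.e. 2**10 outcomes
print(values_of_X)   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Note that two different outcomes, such as the second and third examples, are sent to the same real number 5. A random variable need not be one-to-one.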
We are now going to take an important leap: notice in Question 5 that \(X\) does not always take the same value. Sometimes \(X(s) = 0 \), sometimes \(X(s) = 1 \), and so on up to \(X(s) = 10 \). Since the value of \(X\) is not a single, fixed number, we may assign a probability to the event that our random variable \(X\) takes each value. That is, we can find the probability of the event that \(X(s) = 0 \), the probability of the event that \(X(s) = 1 \), and so on.
In the example above, find the probability that \(X\) takes each value in \( \mathbb{R} \).
- Answer
The key to answering this question is to re-phrase the question in terms of events and use our previous methods. Doing so yields the following:
\( P( X(s) = 0 ) = P( \text{we obtain 0 heads} ) = \frac{\text{|we obtain 0 heads|}}{|S|} = \frac{\binom{10}{0}}{2^{10}} \).
\(P( X(s) = 1 ) = P( \text{we obtain 1 head} ) = \frac{\text{|we obtain 1 head|}}{|S|} = \frac{\binom{10}{1}}{2^{10}} \).
\(P( X(s) = 2 ) = P( \text{we obtain 2 heads} ) = \frac{\text{|we obtain 2 heads|}}{|S|} = \frac{\binom{10}{2}}{2^{10}} \).
\(P( X(s) = 3 ) = P( \text{we obtain 3 heads} ) = \frac{\text{|we obtain 3 heads|}}{|S|} = \frac{\binom{10}{3}}{2^{10}} \).
\(P( X(s) = 4 ) = P( \text{we obtain 4 heads} ) =\frac{\text{|we obtain 4 heads|}}{|S|} = \frac{\binom{10}{4}}{2^{10}} \).
\(P( X(s) = 5 ) = P( \text{we obtain 5 heads} ) = \frac{\text{|we obtain 5 heads|}}{|S|} = \frac{\binom{10}{5}}{2^{10}} \).
\(P( X(s) = 6 ) = P( \text{we obtain 6 heads} ) = \frac{\text{|we obtain 6 heads|}}{|S|} = \frac{\binom{10}{6}}{2^{10}} \).
\(P( X(s) = 7 ) = P( \text{we obtain 7 heads} ) = \frac{\text{|we obtain 7 heads|}}{|S|} = \frac{\binom{10}{7}}{2^{10}} \).
\(P( X(s) = 8 ) = P( \text{we obtain 8 heads} ) = \frac{\text{|we obtain 8 heads|}}{|S|} = \frac{\binom{10}{8}}{2^{10}} \).
\(P( X(s) = 9 ) = P( \text{we obtain 9 heads} ) = \frac{\text{|we obtain 9 heads|}}{|S|}= \frac{\binom{10}{9}}{2^{10}} \).
\(P( X(s) = 10 ) = P( \text{we obtain 10 heads} ) = \frac{\text{|we obtain 10 heads|}}{|S|} = \frac{\binom{10}{10}}{2^{10}} \).
If we had to summarize the above information, we may compactly write the following:
\begin{align*} P( X(s) = x ) = \frac{ \binom{10}{x} }{ 2^{10} } ~~ \text{if} ~~ x = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 \end{align*}
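As a sanity check on the compact formula, one could count outcomes directly and compare with \( \binom{10}{x} / 2^{10} \). The following is a minimal sketch that assumes equally likely outcomes (the coin is fair) and uses exact fractions to avoid rounding; the dictionary names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb

# All 2**10 equally likely outcomes of ten flips.
S = list(product("HT", repeat=10))

# |{we obtain x heads}| for each x, found by brute-force counting.
counts = Counter(s.count("H") for s in S)

# P(X = x) computed two ways: by counting and by the binomial coefficient.
prob_by_counting = {x: Fraction(counts[x], len(S)) for x in range(11)}
prob_by_formula = {x: Fraction(comb(10, x), 2**10) for x in range(11)}

for x in range(11):
    print(x, prob_by_counting[x], prob_by_formula[x])
```

The two columns agree for every \(x\) from 0 to 10, which is what the counting argument predicts.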
Notice, however, that we have not yet answered the question. The question asks for the probability that \(X\) takes each possible value in \( \mathbb{R} \). We have only specified the probability that \(X\) takes the values 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10. Since the probability that \(X\) takes any other value is 0, we can further generalize the function to read:
\begin{align*} P( X(s) = x ) &= \frac{ \binom{10}{x} }{ 2^{10} } ~~ \text{if} ~~ x = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 \\
&=\begin{cases}
\frac{ \binom{10}{x} }{ 2^{10} }, & \mbox{if } x = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 \\
0, & \mbox{otherwise }
\end{cases}\end{align*}
Finally, we make one last change to the above function. Recall in calculus, we do not always write \(f(x)\). Instead, we often simply say or write \(f\). That is, we do not always have to specify the independent variable since the domain is understood. We adopt the same convention here and instead of writing \(X(s) = x\), we simply write \(X = x\).
\begin{align*} P( X = x ) &= \frac{ \binom{10}{x} }{ 2^{10} } ~~ \text{if} ~~ x = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 \\
&=\begin{cases}
\frac{ \binom{10}{x} }{ 2^{10} }, & \mbox{if } x = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 \\
0, & \mbox{otherwise }
\end{cases}\end{align*}
Allow us to interpret the above function. The expression \( P( X = x ) \) is the probability that our random variable \(X\) takes a particular value, which we denote by \(x\). We see that there are only finitely many possible values for our random variable \(X\), and this is partly what is meant by a discrete random variable. Moreover, we give the function \( P( X = x ) \) a special name: we call it the probability mass function of \(X\) and we denote it by \(f(x)\). That is,
\begin{align*} f(x) = P( X = x ) &= \frac{ \binom{10}{x} }{ 2^{10} } ~~ \text{if} ~~ x = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 \\
&=\begin{cases}
\frac{ \binom{10}{x} }{ 2^{10} }, & \mbox{if } x = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 \\
0, & \mbox{otherwise }
\end{cases}\end{align*}
And so, allow us to formalize the above terms.
Definition: A random variable \(X\) is said to be a discrete random variable if \(X\) takes on either a finite number of possible values or a countably infinite number of possible values.
1) If \(X: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 \), then \(X\) is a discrete random variable since there are a finite number of possibilities for \(X\).
2) If \(X: 0, 1, 2, 3, \ldots \), then \(X\) is a discrete random variable since there are a countably infinite number of possible values for \(X\).
3) If \(X\) can take any value in the interval \( [0,1] \), then \(X\) is not a discrete random variable since there are an uncountable number of elements in the interval \( [0,1] \).
Definition: For a discrete random variable \(X\), we define the probability mass function (p.m.f.) of \(X\), often denoted by \(f\), to be:
\[ f(x) = P(X = x) \nonumber \]
Definition: For a discrete random variable \(X\) with probability mass function \(f\), we define the support of \(X\), sometimes referred to as the space of \(X\), to be \( \{ x \in \mathbb{R} : f(x) > 0 \} \).
Again, the job of the probability mass function, often denoted by \(f\), is to tell you the probability that your random variable \(X\) takes a specific value, \(x\). We sometimes refer to \(x\) as a realization of the random variable. The collection of all possible values of \(X\) that occur with nonzero probability is called the support or space of \(X\). For instance, in the above coin toss question, the support of \(X\) is \( \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 \} \).
Note that the graph of the probability mass function looks like the following:
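One rough way to see the shape of this graph is to evaluate \(f\) and draw a text bar for each point of the support. The sketch below is only an illustration: the function name `f`, the scanning window `range(-3, 14)`, and the scale of 100 marks per unit of probability are all assumptions made here. Since a program cannot test every real number, the support is recovered only within that window.

```python
from fractions import Fraction
from math import comb

def f(x):
    """p.m.f. of X = number of heads in ten flips of a fair coin."""
    # Nonzero only when x is one of the integers 0, 1, ..., 10.
    if 0 <= x <= 10 and x == int(x):
        return Fraction(comb(10, int(x)), 2**10)
    return Fraction(0)   # every other real number gets probability 0

# Points of the (assumed) window where f is strictly positive.
support = [x for x in range(-3, 14) if f(x) > 0]
print(support)

# Text rendering: bar length is f(x) * 100, rounded to a whole mark.
for x in support:
    bar = "#" * round(float(f(x)) * 100)
    print(f"{x:2d} | {bar} {float(f(x)):.4f}")
```

The bars rise to a peak at \(x = 5\) and fall off symmetrically, and a value such as `f(2.5)` or `f(11)` is 0, matching the "otherwise" branch of the function.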
For us, a common question will involve an experiment and a random variable being described. From this description, we will have to find the probability mass function. In order to do so, we apply the following recipe:
1) Ask yourself, what are the possible values of \(X\)? (This gives you the support or space of \(X\)).
2) Find the probabilities that \(X\) takes these values.
3) Summarize these probabilities in a function.
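For an experiment with finitely many equally likely outcomes, the three steps can be sketched in code. This is an illustration under that assumption, not a general method: the helper `pmf_from_experiment` is a hypothetical name introduced here, and the three-flip experiment is a smaller variant of the coin example chosen to keep the output short.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def pmf_from_experiment(sample_space, X):
    """Build the p.m.f. of X, assuming all outcomes are equally likely."""
    outcomes = list(sample_space)
    # Step 1: the keys of `counts` are the possible values of X.
    counts = Counter(X(s) for s in outcomes)
    # Step 2: the probability of each value is a ratio of counts.
    table = {x: Fraction(c, len(outcomes)) for x, c in counts.items()}
    # Step 3: summarize as a function that is 0 off the support.
    def f(x):
        return table.get(x, Fraction(0))
    return f, sorted(table)

# Hypothetical smaller experiment: flip a fair coin three times,
# and let X be the number of heads.
S3 = product("HT", repeat=3)
f3, support3 = pmf_from_experiment(S3, lambda s: s.count("H"))

print(support3)   # [0, 1, 2, 3]
for x in support3:
    print(x, f3(x))
```

When outcomes are not equally likely, Step 2 would need the actual probability of each outcome rather than a ratio of counts.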
We end this section with a theorem concerning the probability mass function.
Theorem: Given a discrete random variable \(X\), the probability mass function, \(f\), must satisfy the following three properties:
1) \(f(x) \geq 0\) for all \(x\).
2) \( \displaystyle \sum_{\text{all} ~ x \in X(S) } f(x) = 1 \).
3) For any set \(A\) of real numbers, \( \displaystyle P(X \in A) = \sum_{\text{all} ~ x \in A} f(x) \), where only the values \(x\) in the support of \(X\) contribute to the sum.
For instance, in the above example with the coin, we see that:
1) \(f(x) \geq 0\) for all \(x\)
2) \( f(0) + f(1) + \ldots + f(10) = 1\)
3) If \(A = \{ 0, 1, 2, 3, 4, 5 \} \), so that \( \{ X \in A \} = \{ X \leq 5 \} \) is the event that we obtain 5 heads or fewer, then \( \displaystyle P(X \in A) = \sum_{\text{all} ~ x \in A} f(x) = f(0) + f(1) + f(2) + f(3) + f(4) + f(5) \).
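These three checks can also be carried out numerically. The sketch below stores the p.m.f. on its support as a dictionary of exact fractions; the variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

# f(x) = C(10, x) / 2**10 on the support {0, 1, ..., 10}.
f = {x: Fraction(comb(10, x), 2**10) for x in range(11)}

# Property 1: every value of f is nonnegative.
nonnegative = all(p >= 0 for p in f.values())

# Property 2: the values of f over the support sum to 1.
total = sum(f.values())

# Property 3: P(X <= 5) = f(0) + f(1) + ... + f(5).
p_at_most_5 = sum(f[x] for x in range(6))

print(nonnegative)   # True
print(total)         # 1
print(p_at_most_5)   # 319/512
```

Here \( 1 + 10 + 45 + 120 + 210 + 252 = 638 \), so \( P(X \leq 5) = 638/1024 = 319/512 \), which is slightly more than one half because the value \(x = 5\) is included.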