7.5: Basic Concepts of Probability
- Define probability including impossible and certain events.
- Calculate basic theoretical probabilities.
- Calculate basic empirical probabilities.
- Distinguish among theoretical, empirical, and subjective probability.
- Calculate the probability of the complement of an event.
It all comes down to this. The game of Monopoly that started hours ago is in the home stretch. Your sister has the dice, and if she rolls a 4, 5, or 7 she’ll land on one of your best spaces and the game will be over. How likely is it that the game will end on the next turn? Is it more likely than not? How can we measure that likelihood? This section addresses this question by introducing a way to measure uncertainty.
Introducing Probability
Uncertainty is, almost by definition, a nebulous concept. In order to put enough constraints on it that we can mathematically study it, we will focus on uncertainty strictly in the context of experiments. Recall that experiments are processes whose outcomes are unknown; the sample space for the experiment is the collection of all those possible outcomes. When we want to talk about the likelihood of particular outcomes, we sometimes group outcomes together; for example, in the Monopoly example at the beginning of this section, we were interested in a roll of 2 dice whose sum is 4, 5, or 7. A grouping of outcomes that we’re interested in is called an event. In other words, an event is a subset of the sample space of an experiment; it often consists of the outcomes of interest to the experimenter.
Once we have defined the event that interests us, we can try to assess the likelihood of that event. We do that by assigning a number to each event (\(E\)) called the probability of that event (\(P(E)\)). The probability of an event is a number between 0 and 1 (inclusive). If the probability of an event is 0, then the event is impossible. On the other hand, an event with probability 1 is certain to occur. In general, the higher the probability of an event, the more likely it is that the event will occur.
Consider an experiment that consists of rolling a single standard 6-sided die (with faces numbered 1-6). Decide if these probabilities are equal to zero, equal to one, or somewhere in between.
- Answer
Let's start by identifying the sample space. For one roll of this die, the possible outcomes are {1, 2, 3, 4, 5, 6}. We can use that to assess these probabilities:
- We see that 4 is in the sample space, so it’s possible that it will be the outcome. It’s not certain to be the outcome, though. So, \(0 < P(\text{roll a 4}) < 1\).
- Notice that 7 is not in the sample space. So, \(P(\text{roll a 7}) = 0\).
- Every outcome in the sample space is a positive number, so this event is certain. Thus, \(P(\text{roll a positive number}) = 1\).
- Since this value is not in the sample space, its probability is 0.
- Some outcomes in the sample space are even numbers (2, 4, and 6), but the others aren’t. So, \(0 < P(\text{roll an even number}) < 1\).
- Every outcome in the sample space is a single-digit number, so \(P(\text{roll a single-digit number}) = 1\).
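The reasoning in this example can be sketched in a few lines of Python. This is only an illustration: the function name `classify` and the use of predicates (functions that return True or False for each outcome) are choices made here, not part of the text. An event containing no outcomes of the sample space is impossible, one containing all of them is certain, and anything else falls in between.

```python
# Sample space for one roll of a standard 6-sided die.
SAMPLE_SPACE = [1, 2, 3, 4, 5, 6]


def classify(predicate):
    """Classify an event, given as a True/False test applied to each outcome."""
    favorable = [outcome for outcome in SAMPLE_SPACE if predicate(outcome)]
    if len(favorable) == 0:
        return "impossible (probability 0)"
    if len(favorable) == len(SAMPLE_SPACE):
        return "certain (probability 1)"
    return "in between"


print("roll a 4:", classify(lambda x: x == 4))
print("roll a 7:", classify(lambda x: x == 7))
print("roll a positive number:", classify(lambda x: x > 0))
print("roll an even number:", classify(lambda x: x % 2 == 0))
print("roll a single-digit number:", classify(lambda x: 0 <= x <= 9))
```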
Jorge is about to conduct an experiment that consists of flipping a coin 4 times and recording the number of heads \(\text{(H)}\). Decide if these probabilities are equal to zero, equal to one, or somewhere in between.
\(P(\text{H} < 5)\)
\(P(\text{H} < 4)\)
\(P(\text{H} \geq 5)\)
Three Ways to Assign Probabilities
The probabilities of events that are certain or impossible are easy to assign; they’re just 1 or 0, respectively. What do we do about those in-between cases, for events that might or might not occur? There are three methods to assign probabilities that we can choose from. We’ll discuss them here, in order of reliability.
Method 1: Theoretical Probability
The theoretical method gives the most reliable results, but it cannot always be used. If the sample space of an experiment consists of equally likely outcomes, then the theoretical probability of an event is defined to be the ratio of the number of outcomes in the event to the number of outcomes in the sample space.
For an experiment whose sample space \(S\) consists of equally likely outcomes, the theoretical probability of the event \(E\) is the ratio
\(P(E) = \frac{n(E)}{n(S)}\),
where \(n(E)\) and \(n(S)\) denote the number of outcomes in the event and in the sample space, respectively.
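As a minimal sketch of this definition, the ratio can be computed with Python's `fractions.Fraction`, which keeps the result exact. The function name `theoretical_probability` is a label chosen for this illustration, and the code assumes, as the definition requires, that every outcome in the sample space is equally likely.

```python
from fractions import Fraction


def theoretical_probability(event, sample_space):
    """Return (outcomes in event) / (outcomes in sample space).

    Only meaningful when all outcomes in sample_space are equally likely.
    """
    event = set(event)
    sample_space = set(sample_space)
    if not sample_space:
        raise ValueError("sample space must not be empty")
    if not event <= sample_space:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))


die = range(1, 7)  # faces of a fair 6-sided die
print(theoretical_probability({2, 4, 6}, die))  # even number: prints 1/2
```

Using exact fractions rather than decimals makes it easy to compare the result with hand calculations.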
Recall that a standard deck of cards consists of 52 unique cards which are labeled with a rank (the whole numbers from 2 to 10, plus J, Q, K, and A) and a suit (\(\heartsuit\), \(\diamondsuit\), \(\clubsuit\), or \(\spadesuit\)). A standard deck is thoroughly shuffled, and you draw one card at random (so every card has an equal chance of being drawn). Find the theoretical probability of each of these events:
- The card is .
- The card is a .
- The card is a king (K).
- Answer
There are 52 cards in the deck, so the sample space for each of these experiments has 52 elements. That will be the denominator for each of our probabilities.
- There is only one such card in the deck, so this event has only one outcome in it. Thus, its probability is \(\frac{1}{52}\).
- There are 13 cards of each suit in the deck, so the probability is \(\frac{13}{52} = \frac{1}{4}\).
- There are 4 cards of each rank in the deck, so \(P(\text{K}) = \frac{4}{52} = \frac{1}{13}\).
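The counting in this example can be checked by building the deck explicitly. In the sketch below, the suit names, the helper `probability`, and the particular card and suit tested (the 10 of spades and hearts) are assumptions picked for illustration; any single card gives the same probability, as does any single suit.

```python
from fractions import Fraction
from itertools import product

RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]

# Every (rank, suit) pair is one card; after shuffling, all 52 are equally likely.
DECK = list(product(RANKS, SUITS))


def probability(predicate):
    """Theoretical probability that a randomly drawn card satisfies predicate."""
    favorable = sum(1 for card in DECK if predicate(card))
    return Fraction(favorable, len(DECK))


p_single_card = probability(lambda card: card == ("10", "spades"))  # illustrative card
p_one_suit = probability(lambda card: card[1] == "hearts")          # illustrative suit
p_king = probability(lambda card: card[0] == "K")

print(p_single_card, p_one_suit, p_king)  # prints 1/52 1/4 1/13
```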
You are about to roll a fair (meaning that each face has an equal chance of landing up) 6-sided die, whose faces are labeled with the numbers 1 through 6. Find the theoretical probabilities of each outcome.
You roll a 4.
You roll a number greater than 2.
You roll an odd number.
It is critical that you make sure that every outcome in a sample space is equally likely before you compute theoretical probabilities!
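To see why this matters, consider flipping a fair coin 4 times and recording the number of heads. The sample space {0, 1, 2, 3, 4} has 5 outcomes, but they are not equally likely, so the probability of each is not \(\frac{1}{5}\). The sketch below (variable names are illustrative) counts how many of the 16 equally likely flip sequences lead to each number of heads.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 16 sequences of 4 fair coin flips are equally likely.
sequences = list(product("HT", repeat=4))

# Group the sequences by how many heads they contain.
heads_counts = Counter(seq.count("H") for seq in sequences)

for heads in sorted(heads_counts):
    p = Fraction(heads_counts[heads], len(sequences))
    print(f"{heads} heads: {heads_counts[heads]} of {len(sequences)} sequences, P = {p}")
```

Getting exactly 2 heads corresponds to 6 sequences while getting 0 heads corresponds to only 1, so the theoretical formula has to be applied to the 16 sequences rather than to the 5 possible counts.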
At the beginning of this section, we were considering a Monopoly game where, if your sister rolled a sum of 4, 5, or 7 with 2 standard dice, you would win the game. What is the probability of this event? Use tables to determine your answer.
- Answer
We should think of this experiment as occurring in two stages: (1) one die roll, then (2) another die roll. Even though these two stages will usually occur simultaneously in practice, since they’re independent, it’s okay to treat them separately.
Step 1: Since we have two independent stages, let’s create a table (Figure ), which is probably the most efficient method for determining the sample space. Each of the 36 ordered pairs in the table represents an equally likely outcome.
Step 2: To make our analysis easier, let’s replace each ordered pair with the sum (Figure ).
Step 3: The event we’re interested in consists of rolls of 4, 5, or 7, so let’s shade those in (Figure ).
Our event contains 13 outcomes (3 ways to roll a 4, 4 ways to roll a 5, and 6 ways to roll a 7), so the probability that your sister rolls a losing number is \(\frac{13}{36}\).
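The three steps can also be carried out by enumeration instead of a hand-drawn table. The following is a small sketch; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Step 1: the 36 equally likely ordered pairs (first die, second die).
pairs = list(product(range(1, 7), repeat=2))

# Step 2: replace each ordered pair with its sum.
sums = [first + second for first, second in pairs]

# Step 3: keep only the sums that belong to the event of interest.
event_sums = {4, 5, 7}
favorable = [s for s in sums if s in event_sums]

probability = Fraction(len(favorable), len(sums))
print(len(favorable), probability)  # prints 13 13/36
```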
If you roll a pair of 4-sided dice with faces labeled with the numbers 1 through 4, what is the probability of rolling a sum of 6 or 7?
If you flip a fair coin 3 times, what is the probability of each event? Use a tree diagram to determine your answer.
- You flip exactly 2 heads.
- You flip 2 consecutive heads at some point in the 3 flips.
- All 3 flips show the same result.
- Answer
Let’s build a tree to identify the sample space (Figure ).
The sample space is {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}, which has 8 elements.
- Flipping exactly 2 heads occurs three times (HHT, HTH, THH), so the probability is \(\frac{3}{8}\).
- Flipping 2 consecutive heads at some point in the experiment happens 3 times: HHH, HHT, THH. So, the probability is \(\frac{3}{8}\).
- There are 2 outcomes that all show the same result: HHH and TTT. So, the probability is \(\frac{2}{8} = \frac{1}{4}\).
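Each path through the tree corresponds to one string of three letters, so the sample space can be generated and searched directly. In this sketch the helper name `probability` is an illustrative choice.

```python
from fractions import Fraction
from itertools import product

# Each root-to-leaf path in the tree is one of the 8 equally likely outcomes.
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]


def probability(predicate):
    """Fraction of the equally likely outcomes that satisfy predicate."""
    favorable = [o for o in outcomes if predicate(o)]
    return Fraction(len(favorable), len(outcomes))


p_exactly_two_heads = probability(lambda o: o.count("H") == 2)
p_consecutive_heads = probability(lambda o: "HH" in o)
p_all_same = probability(lambda o: len(set(o)) == 1)

print(outcomes)
print(p_exactly_two_heads, p_consecutive_heads, p_all_same)  # prints 3/8 3/8 1/4
```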
You have a modified deck of cards containing only 3\(\heartsuit\), 4\(\heartsuit\), and 5\(\heartsuit\). You draw 2 cards without replacing them (where order matters). What is the probability of each event?
The first card drawn is 3\(\heartsuit\).
The first card drawn has a lower number than the second card.
One of the cards drawn is 4\(\heartsuit\).
The first known text that provided a systematic approach to probabilities was written in 1564 by Gerolamo Cardano (1501–1576). Cardano was a physician whose illegitimate birth closed many doors that would have otherwise been open to someone with a medical degree in 16th-century Italy. As a result, Cardano often turned to gambling to help make ends meet. He was a remarkable mathematician, and he used his knowledge to gain an edge when playing at cards or dice. His 1564 work, titled Liber de ludo aleae (which translates as Book on Games of Chance), summarized everything he knew about probability. Of course, if that book fell into the hands of those he played against, his advantage would disappear. That’s why he never allowed it to be published in his lifetime (it was eventually published in 1663). Cardano made other contributions to mathematics; he was the first person to publish the third-degree analogue of the Quadratic Formula (though he didn’t discover it himself), and he popularized the use of negative numbers.
Method 2: Empirical Probability
Theoretical probabilities are precise, but they can’t be found in every situation. If the outcomes in the sample space are not equally likely, then we’re out of luck. Suppose you’re watching a baseball game, and your favorite player is about to step up to the plate. What is the probability that he will get a hit?
In this case, the sample space is {hit, not a hit}. That doesn’t mean that the probability of a hit is \(\frac{1}{2}\), since those outcomes aren’t equally likely. The theoretical method simply can’t be used in this situation. Instead, we might look at the player’s statistics up to this point in the season, and see that he has 122 hits in 531 opportunities. So, we might think that the probability of a hit in the next plate appearance would be about \(\frac{122}{531} \approx 0.230\). When we use the outcomes of previous replications of an experiment to assign a probability to the next replication, we’re defining an empirical probability. Empirical probability is assigned using the outcomes of previous replications of an experiment by finding the ratio of the number of times in the previous replications the event occurred to the total number of previous replications.
Empirical probabilities aren’t exact, but when the number of previous replications is large, we expect them to be close to the actual probability. Also, if the previous runs of the experiment were not conducted under the same circumstances as the one we’re interested in, the empirical probability is less reliable. For instance, in the case of our favorite baseball player, we might try to get a better estimate of the probability of a hit by looking only at his history against left- or right-handed pitchers (depending on the handedness of the pitcher he’s about to face).
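A simulation can illustrate how the number of replications affects an empirical probability. The sketch below rolls a simulated fair die and estimates the probability of rolling a 4, whose theoretical value is \(\frac{1}{6}\). The function names and the seed value are arbitrary choices for this illustration, and the printed estimates depend on the seed.

```python
import random


def empirical_probability(occurrences, replications):
    """Ratio of the number of times the event occurred to the number of replications."""
    if replications <= 0:
        raise ValueError("need at least one replication")
    return occurrences / replications


def simulate_rolls(replications, rng):
    """Roll a fair die repeatedly and estimate P(roll a 4) empirically."""
    fours = sum(1 for _ in range(replications) if rng.randint(1, 6) == 4)
    return empirical_probability(fours, replications)


rng = random.Random(2024)  # fixed, arbitrary seed so the run is repeatable
for replications in (10, 1_000, 100_000):
    print(replications, round(simulate_rolls(replications, rng), 4))
print("theoretical:", round(1 / 6, 4))
```

With only 10 rolls the estimate can land far from \(\frac{1}{6}\); with 100,000 rolls it is typically within a few thousandths.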
One of the broad uses of statistics is called statistical inference, where statisticians use collected data to make a guess (or inference) about the population the data were collected from. Nearly every tool that statisticians use for inference is based on probability. Not only is the method we just described for finding empirical probabilities one type of statistical inference, but some more advanced techniques in the field will give us an idea of how close that empirical probability might be to the actual probability!
Assign an empirical probability to the following events:
- Jose is on the basketball court practicing his shots from the free throw line. He made 47 out of his last 80 attempts. What is the probability he makes his next shot?
- Amy is about to begin her morning commute. Over her last 60 commutes, she arrived at work 12 times in under half an hour. What is the probability that she arrives at work in 30 minutes or less?
- Felix is playing Yahtzee with his sister. Felix won 14 of the last 20 games he played against her. How likely is he to win this game?
- Answer
- Since Jose made 47 out of his last 80 attempts, assign this event an empirical probability of \(\frac{47}{80} = 58.75\%\).
- Amy completed the commute in under 30 minutes in 12 of the last 60 commutes, so we can estimate her probability of making it in under 30 minutes this time at \(\frac{12}{60} = \frac{1}{5} = 20\%\).
- Since Felix has won 14 of the last 20 games, assign a probability for a win this time of \(\frac{14}{20} = \frac{7}{10} = 70\%\).
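The same ratio is used in all three parts, so the arithmetic can be checked in one loop. This is a small sketch; the `records` list simply restates the counts given in the example.

```python
from fractions import Fraction

# (description, times the event occurred, number of previous replications)
records = [
    ("Jose makes his next free throw", 47, 80),
    ("Amy arrives in under 30 minutes", 12, 60),
    ("Felix wins this game", 14, 20),
]

for description, occurred, replications in records:
    p = Fraction(occurred, replications)  # exact ratio, reduced automatically
    print(f"{description}: {p} = {float(p):.2%}")
```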
Jessie is in charge of quality control at a factory manufacturing SUVs. Today, she’s checking the placement of the taillight housing. Of the last thousand units off the line, 13 had faulty placement. What empirical probability might Jessie assign to the next vehicle coming off the line having bad placement?
A famous early question about probability (posed by Georges-Louis Leclerc, Comte de Buffon in the 18th century) had to do with the probability that a needle dropped on a floor finished with wooden slats would lie across one of the seams. If the distance between the seams is exactly the same as the length of the needle, then it can be shown using calculus that the probability that the needle crosses a seam is \(\frac{2}{\pi}\). Using toothpicks or matchsticks (or other uniformly long and narrow objects), assign an empirical probability to this experiment by drawing parallel lines on a large sheet of paper where the distance between the lines is equal to the length of your dropping object, then repeatedly dropping the objects and noting whether the object touches one of the lines. Once you have your empirical probability, take its reciprocal and multiply by 2. Is the result close to \(\pi\)?
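If no toothpicks are at hand, the experiment can be imitated on a computer. The sketch below is one possible simulation under stated assumptions: the needle and the line spacing both have length 1, the needle's center lands at a uniformly random distance from the nearest line, and its direction is uniformly random. The function names and the seed are illustrative. To avoid using \(\pi\) while estimating \(\pi\), the direction is chosen by picking a random point in a quarter of the unit disk.

```python
import math
import random


def needle_crosses(rng):
    """Drop one needle of length 1 on lines spaced 1 apart; True if it crosses a line."""
    # Distance from the needle's center to the nearest line: uniform on [0, 0.5].
    distance = rng.uniform(0.0, 0.5)
    # Random direction without using pi: rejection-sample a point in the
    # quarter of the unit disk and use its direction from the origin.
    while True:
        x, y = rng.random(), rng.random()
        r_squared = x * x + y * y
        if 0.0 < r_squared <= 1.0:
            break
    sin_angle = y / math.sqrt(r_squared)  # sine of the angle between needle and lines
    # The needle reaches 0.5 * sin(angle) from its center toward the nearest line.
    return distance <= 0.5 * sin_angle


def estimate_pi(drops, seed=0):
    """Empirical probability of a crossing, then reciprocal times 2."""
    rng = random.Random(seed)
    crossings = sum(needle_crosses(rng) for _ in range(drops))
    if crossings == 0:
        raise ValueError("no crossings observed; use more drops")
    empirical = crossings / drops
    return 2 / empirical


print(estimate_pi(200_000))
```

The printed value depends on the seed and the number of drops, but with a couple hundred thousand drops it is usually within a few hundredths of \(\pi\).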
Method 3: Subjective Probability
In cases where theoretical probability can’t be used and we don’t have prior experience to inform an empirical probability, we’re left with one option: using our instincts to guess at a subjective probability. A subjective probability is an assignment of a probability to an event using only one’s instincts.
Subjective probabilities are used in cases where an experiment can only be run once, or it hasn’t been run before. Because subjective probabilities may vary widely from person to person and they’re not based on any mathematical theory, we won’t give any examples. However, it’s important that we be able to identify a subjective probability when we see it; they will in general be far less accurate than empirical or theoretical probabilities.
Classify each of the following probabilities as theoretical, empirical, or subjective.
- An eccentric billionaire is testing a brand new rocket system. He says there is a 15% chance of failure.
- With 4 seconds to go in a close basketball playoff game, the home team needs 3 points to tie up the game and send it to overtime. A TV commentator says that the team captain should take the final 3-point shot, because he has a 38% chance of making it (greater than every other player on the team).
- Felix is losing his Yahtzee game against his sister. He has one more chance to roll 2 dice; he’ll win the game if they both come up 4. The probability of this is about 2.8%.
- Answer
- This experiment has never been run before, so the given probability is subjective.
- Presumably, the commentator has access to each player’s performance statistics over the entire season. So, the given probability is likely empirical.
- Rolling 2 dice results in a sample space with equally likely outcomes. This probability is theoretical. (We’ll learn how to calculate that probability later in this chapter.)
Classify each of the following probabilities as theoretical, empirical, or subjective.
You have entered a raffle with 500 entrants. Your probability of winning is 0.2%.
Your little brother takes the bus to school each morning. On the first day of school, you believe that the probability that the bus arrives between 7:15 AM and 7:30 AM is about 80%.
Your little brother takes the bus to school each morning. On the last day of school, you believe that the probability that the bus arrives between 7:15 AM and 7:30 AM is about 73%.
In 1938, Frank Benford published a paper (“The law of anomalous numbers,” in Proceedings of the American Philosophical Society) with a surprising result about probabilities. If you have a list of numbers that spans at least a couple of orders of magnitude (meaning that if you divide the largest by the smallest, the result is at least 100), then the digits 1–9 are not equally likely to appear as the first digit of those numbers, as you might expect. Benford arrived at this conclusion using empirical probabilities; he found that 1 was about 6 times as likely to be the initial digit as 9 was!
New Probabilities from Old: Complements
One of the goals of the rest of this chapter is learning how to break down complicated probability calculations into easier probability calculations. We’ll look at the first of the tools we can use to accomplish this goal in this section; the rest will come later.
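Given an event \(E\), the complement of \(E\) (denoted \(E^{\prime}\)) is the collection of all of the outcomes in the sample space \(S\) that are not in \(E\). (This is language that is taken from set theory, which you can learn more about elsewhere in this text.) Since every outcome in the sample space either is or is not in \(E\), it follows that the outcome counts satisfy \(n(E) + n(E^{\prime}) = n(S)\). So, if the outcomes in \(S\) are equally likely, we can compute theoretical probabilities \(P(E) = \frac{n(E)}{n(S)}\) and \(P(E^{\prime}) = \frac{n(E^{\prime})}{n(S)}\). Then, adding these last two equations, we get
\(P(E) + P(E^{\prime}) = \frac{n(E)}{n(S)} + \frac{n(E^{\prime})}{n(S)} = \frac{n(E) + n(E^{\prime})}{n(S)} = \frac{n(S)}{n(S)} = 1\).
Thus, if we subtract \(P(E^{\prime})\) from both sides, we can conclude that \(P(E) = 1 - P(E^{\prime})\). Though we performed this calculation under the assumption that the outcomes in \(S\) are all equally likely, the last equation is true in every situation.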
How is this helpful? Sometimes it is easier to compute the probability that an event won’t happen than it is to compute the probability that it will. To apply this principle, it’s helpful to review some tricks for dealing with inequalities. If an event is defined in terms of an inequality, the complement will be defined in terms of the opposite inequality: both the direction and the inclusivity will be reversed, as shown in the table below.
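| If \(E\) is defined with: | then \(E^{\prime}\) is defined with: |
|---|---|
| \(<\) | \(\geq\) |
| \(\leq\) | \(>\) |
| \(>\) | \(\leq\) |
| \(\geq\) | \(<\) |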
- If you roll a standard 6-sided die, what is the probability that the result will be a number greater than one?
- If you roll two standard 6-sided dice, what is the probability that the sum will be 10 or less?
- If you flip a fair coin 3 times, what is the probability that at least one flip will come up tails?
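- Answer
- Here, the sample space is {1, 2, 3, 4, 5, 6}. It’s easy enough to see that the probability in question is \(\frac{5}{6}\), because there are 5 outcomes that fall into the event “roll a number greater than 1.” Let’s also apply our new formula to find that probability. Since \(E\) is defined using the inequality \(\text{roll} > 1\), then \(E^{\prime}\) is defined using \(\text{roll} \leq 1\). Since there’s only one outcome (1) in \(E^{\prime}\), we have \(P(E^{\prime}) = \frac{1}{6}\). Thus, \(P(E) = 1 - P(E^{\prime}) = 1 - \frac{1}{6} = \frac{5}{6}\).
- In Example 7.18, we found the following table of equally likely outcomes for rolling 2 dice (Figure ):
Here, the event \(E\) is defined by the inequality \(\text{sum} \leq 10\). Thus, \(E^{\prime}\) is defined by \(\text{sum} > 10\). There are three outcomes in \(E^{\prime}\): two 11s and one 12. Thus, \(P(E) = 1 - P(E^{\prime}) = 1 - \frac{3}{36} = \frac{33}{36} = \frac{11}{12}\).
- In Example 7.15, we found the sample space for this experiment consisted of these equally likely outcomes: {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}. Our event \(E\) is defined by \(\text{tails} \geq 1\), so \(E^{\prime}\) is defined by \(\text{tails} < 1\). The only outcome in \(E^{\prime}\) is the first one on the list, where zero tails are flipped. So, \(P(E) = 1 - P(E^{\prime}) = 1 - \frac{1}{8} = \frac{7}{8}\).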
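As a check on these three answers, the sketch below computes each probability twice: once by counting the event directly and once by counting its complement and subtracting from 1. The helper names `probability` and `via_complement` are illustrative choices, and every sample space used here consists of equally likely outcomes.

```python
from fractions import Fraction
from itertools import product


def probability(outcomes, predicate):
    """Theoretical probability for a sample space of equally likely outcomes."""
    favorable = sum(1 for o in outcomes if predicate(o))
    return Fraction(favorable, len(outcomes))


def via_complement(outcomes, predicate):
    """Compute P(E) as 1 - P(E'), where E' holds the outcomes not satisfying predicate."""
    return 1 - probability(outcomes, lambda o: not predicate(o))


die = list(range(1, 7))
two_dice = list(product(range(1, 7), repeat=2))
three_flips = ["".join(flips) for flips in product("HT", repeat=3)]

checks = [
    ("roll > 1", die, lambda o: o > 1),
    ("sum <= 10", two_dice, lambda o: sum(o) <= 10),
    ("at least one tail", three_flips, lambda o: o.count("T") >= 1),
]

for label, outcomes, event in checks:
    direct = probability(outcomes, event)
    complement = via_complement(outcomes, event)
    print(label, direct, complement, direct == complement)
```

For the two-dice case, the complement route only requires counting 3 outcomes instead of 33, which is the practical appeal of the formula.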
If you roll a pair of 4-sided dice with faces labeled 1 through 4, what is the probability that the sum of the resulting numbers will be greater than 3? Hint: You found this sample space in an earlier Your Turn.
Check Your Understanding
- You have two coins: a nickel and a quarter. You flip them both. Find the probabilities of these events:
  - Both come up heads.
  - The quarter comes up heads.
  - You get one heads and one tails.
  - You get three tails.
- Decide whether the given probabilities were most likely derived theoretically, empirically, or subjectively.
  - A poker player has a 16% chance of making a hand called a flush on the next card.
  - Your friend Jacob tells you that there’s a 20% chance he’ll get married in the next 5 years.
  - Ashley has a coin that they think might not be fair, so they flip it 100 times and note that the result was heads 58 times. So, Ashley says the probability of flipping heads on that coin is about 58%.
- If you flip a fair coin 50 times, the probability of getting 20 or fewer heads is about 10.1% (a fact we’ll learn how to verify later).
  - If \(E\) is the event “number of heads is 20 or fewer”, describe the event \(E^{\prime}\) using an inequality.
  - Find \(P(E^{\prime})\).