6.3: Normal Distributions
By the end of this chapter, the student should be able to:
- Recognize and use the normal probability distribution.
- Recognize and use the standard normal probability distribution.
A special continuous distribution, called normal, is the most common and therefore most important of all the distributions. It is widely used and even more widely abused. Its graph is bell-shaped. You see the bell curve in almost all disciplines. Some of these include psychology, business, economics, the sciences, nursing, and, of course, mathematics. Some of your instructors may use the normal distribution to help determine your grade. Most IQ scores are normally distributed. Often real-estate prices fit a normal distribution. The normal distribution is extremely important, but it cannot be applied to everything in the real world.
In this section, you will study the normal distribution, the standard normal distribution, and applications associated with them. To study normal distributions, we must introduce a normal probability density function and its graph. Normal distributions have two parameters, the mean (\(\mu\)) and the standard deviation (\(\sigma\)). If \(X\) is a quantity to be measured that has a normal distribution with mean (\(\mu\)) and standard deviation (\(\sigma\)), we denote it as \(X\sim N(\mu,\sigma)\) and the probability density function of \(X\) is
\(f(x) = \dfrac{1}{\sigma \cdot \sqrt{2 \cdot \pi}} \cdot e^{\left(-\dfrac{1}{2}\right) \cdot \left(\dfrac{x-\mu}{\sigma}\right)^{2}}\)
As you can see, the probability density function is rather complicated. You do not need to memorize it. Instead, let's focus on its graph, which varies depending on the values of \(\mu\) and \(\sigma\).
Note that the two parameters uniquely determine the shape of the probability density curve, i.e. the center and the spread. The curve is symmetrical about a vertical line drawn through the mean, \(\mu\). In theory, the mean is the same as the median, because the graph is symmetric about \(\mu\). Since the area under the curve must equal one, a change in the standard deviation, \(\sigma\), causes a change in the shape of the curve; the curve becomes fatter or skinnier depending on the change of \(\sigma\). A change in \(\mu\) causes the graph to shift to the left or right. This means there are an infinite number of normal probability distributions. One of special interest is called the standard normal distribution, \(Z \sim N(0, 1)\).
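To see how \(\sigma\) changes the shape of the curve, the density formula can be evaluated directly. The following is a minimal sketch in Python; the helper name `normal_pdf` and the values \(\mu = 10\) and \(\sigma = 1, 2, 4\) are illustrative assumptions, not values from the text.

```python
import math

def normal_pdf(x, mu, sigma):
    """Evaluate the normal density f(x) for mean mu and standard deviation sigma."""
    coefficient = 1 / (sigma * math.sqrt(2 * math.pi))
    exponent = -0.5 * ((x - mu) / sigma) ** 2
    return coefficient * math.exp(exponent)

# Illustrative values only (assumed for this sketch).
mu = 10
for sigma in (1, 2, 4):
    peak = normal_pdf(mu, mu, sigma)          # height of the curve at the mean
    left = normal_pdf(mu - 1.5, mu, sigma)    # height 1.5 units below the mean
    right = normal_pdf(mu + 1.5, mu, sigma)   # height 1.5 units above the mean
    print(f"sigma={sigma}: peak={peak:.4f}, left={left:.4f}, right={right:.4f}")
```

In each row the left and right heights agree, which reflects the symmetry about \(\mu\), and the peak gets lower as \(\sigma\) grows, because the same total area of one is spread over a wider curve.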
Since normal random variables are continuous random variables, everything that we learned about continuous random variables in the previous section holds for normal variables as well. For example, one of the most important facts is that the area under the probability density curve over a region is the probability of the event that can be expressed by the corresponding inequality.
The shaded area in the following graph indicates the area to the left of \(x\). This area is represented by the probability \(P(X < x)\).
How would you represent the area to the left of one in a probability statement?
Answer
\(P(X < 1)\)
Since \(P(X=x)=0\) for any continuous random variable \(X\) and any value \(x\), we conclude that \(P(X < x)\) is the same as \(P(X \leq x)\) and \(P(X > x)\) is the same as \(P(X \geq x)\) for normal distributions as well.
Is \(P(X < 1)\) equal to \(P(X \leq 1)\)? Why?
Answer
Yes, because they are the same in a continuous distribution: \(P(X = 1) = 0\)
Since the total area under the probability density curve is 1, the area to the right is then \(P(X > x) = 1 - P(X < x)\). Remember, \(P(X < x) =\) Area to the left of the vertical line through \(x\). \(P(X > x) = 1 - P(X < x) =\) Area to the right of the vertical line through \(x\).
Express symbolically the area to the right of three:
Answer
\(1 - P(X < 3)\) or \(P(X > 3)\)
If the area to the right of \(x\) in a normal distribution is 0.543, what is the area to the left of \(x\)?
Answer
\(1 - 0.543 = 0.457\)
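The relationship between the area to the left and the area to the right can also be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module; the distribution \(N(0, 1)\) and the cutoff \(x = 1\) are assumptions chosen only for illustration.

```python
from statistics import NormalDist

# Hypothetical distribution and cutoff, chosen only for illustration.
X = NormalDist(mu=0, sigma=1)
x = 1

area_left = X.cdf(x)         # P(X < x): area to the left of x
area_right = 1 - area_left   # P(X > x): area to the right of x

print(f"P(X < {x}) = {area_left:.4f}")
print(f"P(X > {x}) = {area_right:.4f}")
print(f"total area = {area_left + area_right:.4f}")
```

Here `cdf(x)` returns the area to the left of \(x\), so subtracting it from one gives the area to the right.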
Calculations of Probabilities
Normal tables, computers, and calculators are used to look up or calculate probabilities such as \(P(a < X < b)\), \(P(X < c)\), or \(P(X > c)\) for any normal random variable \(X\) and any values \(a, b, c\). Technology has made the tables virtually obsolete. For that reason, as well as the fact that there are various table formats, we are not including table instructions in this textbook.
There are many online calculators that can be used to compute the probabilities that involve normal and other random variables. For example, you can use this one :
Also, you are encouraged to ask your instructor about which calculator is allowed/recommended for this course.
Use the calculator provided above to verify the following probability statements for \(X \sim N(6, 2)\):
\(P(X<3.5)=0.1056\)
\(P(4.1<X<6.3)=0.3885\)
\(P(X>8.2)=0.1357\)
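If you prefer a programming language to an online calculator, the same three statements can be checked with a short script. This is only one possible approach, sketched with `NormalDist` from Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import NormalDist

X = NormalDist(mu=6, sigma=2)   # X ~ N(6, 2)

p_less = X.cdf(3.5)                  # P(X < 3.5)
p_between = X.cdf(6.3) - X.cdf(4.1)  # P(4.1 < X < 6.3)
p_greater = 1 - X.cdf(8.2)           # P(X > 8.2)

print(f"P(X < 3.5)       = {p_less:.4f}")
print(f"P(4.1 < X < 6.3) = {p_between:.4f}")
print(f"P(X > 8.2)       = {p_greater:.4f}")
```

Different tools round intermediate values differently, so a result may differ from the values above by one unit in the fourth decimal place.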
\(X \sim N(-3, 4)\)
Find the probability that \(X\) is between one and four, i.e. \(P(1<X<4)\).
Answer
0.1186
Recall that the k-th percentile of a dataset is the value such that k% of the data are below this value. In the context of a random variable, the k-th percentile of \(X\) is the value \(x\) such that the probability \(P(X<x)=k/100\).
For \(X\sim N(35,5)\), \(P(X<36)\approx 0.58\). Therefore \(36\) is approximately the 58th percentile of \(X\). Conversely, if we are asked to find the 58th percentile of \(X\), we are essentially asked to find the value \(x\) such that \(P(X<x)=0.58\). Many online calculators let you set up such an equation by entering the probability and leaving the field for \(x\) blank.
Find the 42nd percentile for a normal random variable \(X\sim N(\mu=93.8, \sigma=5.8)\).
Answer
92.63
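Finding a percentile is the reverse of finding a probability: we start from the area to the left and look for the value \(x\). As a sketch, Python's `NormalDist` offers `inv_cdf` for this purpose; the variable names below are illustrative.

```python
from statistics import NormalDist

# Probability first: the area to the left of 36 for X ~ N(35, 5).
X = NormalDist(mu=35, sigma=5)
p = X.cdf(36)
print(f"P(X < 36) = {p:.4f}")

# Percentile second: the value x with P(X < x) = 0.58.
x58 = X.inv_cdf(0.58)
print(f"58th percentile = {x58:.2f}")

# The 42nd percentile of N(93.8, 5.8).
Y = NormalDist(mu=93.8, sigma=5.8)
x42 = Y.inv_cdf(0.42)
print(f"42nd percentile = {x42:.2f}")
```

Note that `inv_cdf` takes the percentile as a proportion between 0 and 1, so the 42nd percentile is entered as 0.42.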
As we indicated above, normal distributions have a myriad of applications.
The final exam scores in a statistics class were normally distributed with a mean of 63 and a standard deviation of five.
a. Find the probability that a randomly selected student scored more than 65 on the exam.
b. Find the probability that a randomly selected student scored less than 85.
c. Find the 90th percentile (that is, find the score \(k\) that has 90% of the scores below \(k\) and 10% of the scores above \(k\)).
d. Find the 70th percentile (that is, find the score \(k\) such that 70% of scores are below \(k\) and 30% of the scores are above \(k\)).
Answer
a. Let \(X\) be a score on the final exam. \(X \sim N(63, 5)\), where \(\mu = 63\) and \(\sigma = 5\). Draw a graph. Then, find \(P(X > 65)\).
\(P(X > 65) = 0.3446\)
The probability that any student selected at random scores more than 65 is 0.3446.
Answer
b. Draw a graph. Then find \(P(X < 85)\), and shade the graph. Using a computer or calculator, find \(P(X < 85) \approx 1\). The probability that one student scores less than 85 is approximately one (or 100%).
Answer
c. Find the 90th percentile. For each problem or part of a problem, draw a new graph. Draw the \(x\)-axis. Shade the area that corresponds to the 90th percentile.
Let \(k\) be the 90th percentile. The variable \(k\) is located on the \(x\)-axis. \(P(X < k)\) is the area to the left of \(k\). The 90th percentile \(k\) separates the exam scores into those that are the same or lower than \(k\) and those that are the same or higher. Ninety percent of the test scores are the same or lower than \(k\), and ten percent are the same or higher. The variable \(k\) is often called a critical value. Using a computer or calculator, \(k = 69.4\).
The 90th percentile is 69.4. This means that 90% of the test scores fall at or below 69.4 and 10% fall at or above.
Answer
d. Find the 70th percentile.
Draw a new graph and label it appropriately. The 70th percentile is 65.6. This means that 70% of the test scores fall at or below 65.6 and 30% fall at or above.
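All four parts of the exam example can be reproduced in a few lines. The following is a minimal sketch using Python's standard `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

scores = NormalDist(mu=63, sigma=5)   # final exam scores, X ~ N(63, 5)

p_more_than_65 = 1 - scores.cdf(65)   # part a: P(X > 65)
p_less_than_85 = scores.cdf(85)       # part b: P(X < 85)
pct_90 = scores.inv_cdf(0.90)         # part c: 90th percentile
pct_70 = scores.inv_cdf(0.70)         # part d: 70th percentile

print(f"a. P(X > 65) = {p_more_than_65:.4f}")
print(f"b. P(X < 85) = {p_less_than_85:.6f}")
print(f"c. 90th percentile = {pct_90:.1f}")
print(f"d. 70th percentile = {pct_70:.1f}")
```

Part b prints a value slightly below one rather than exactly one: a score of 85 lies more than four standard deviations above the mean, so almost all, but not literally all, of the area is to its left.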
A personal computer is used for office work at home, research, communication, personal finances, education, entertainment, social networking, and a myriad of other things. Suppose that the average number of hours a household personal computer is used for entertainment is two hours per day. Assume the times for entertainment are normally distributed and the standard deviation for the times is half an hour.
- Find the probability that a household personal computer is used for entertainment between 1.8 and 2.75 hours per day.
- Find the maximum number of hours per day that the bottom quartile of households uses a personal computer for entertainment.
Answer
a. Let \(X =\) the amount of time (in hours) a household personal computer is used for entertainment. \(X \sim N(2, 0.5)\) where \(\mu = 2\) and \(\sigma = 0.5\).
Find \(P(1.8 < X < 2.75)\).
The probability for which you are looking is the area between \(x = 1.8\) and \(x = 2.75\). \(P(1.8 < X < 2.75) = 0.5886\)
The probability that a household personal computer is used between 1.8 and 2.75 hours per day for entertainment is 0.5886.
b. To find the maximum number of hours per day that the bottom quartile of households uses a personal computer for entertainment, find the 25th percentile, \(k\), where \(P(X < k) = 0.25\).
The maximum number of hours per day that the bottom quartile of households uses a personal computer for entertainment is 1.66 hours.
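As a check on both parts, the sketch below wraps the "area between two values" calculation in a small helper. The function name `prob_between` is hypothetical and introduced only for this illustration; `NormalDist` is from Python's standard `statistics` module.

```python
from statistics import NormalDist

def prob_between(dist, a, b):
    """Return P(a < X < b) as the area to the left of b minus the area to the left of a."""
    return dist.cdf(b) - dist.cdf(a)

hours = NormalDist(mu=2, sigma=0.5)   # entertainment hours per day, X ~ N(2, 0.5)

p_a = prob_between(hours, 1.8, 2.75)  # part a
q1 = hours.inv_cdf(0.25)              # part b: 25th percentile (bottom quartile cutoff)

print(f"a. P(1.8 < X < 2.75) = {p_a:.4f}")
print(f"b. 25th percentile = {q1:.2f} hours")
```

Writing the interval probability as a difference of two left-hand areas mirrors how you would shade the graph: shade everything left of 2.75, then remove everything left of 1.8.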
There are approximately one billion smartphone users in the world today. In the United States, the ages of smartphone users aged 13 to 55+ approximately follow a normal distribution with a mean of 36.9 years and a standard deviation of 13.9 years.
a. Determine the probability that a random smartphone user in the age range 13 to 55+ is between 23 and 64.7 years old.
b. Determine the probability that a randomly selected smartphone user in the age range 13 to 55+ is at most 50.8 years old.
c. Find the 80th percentile of this distribution, and interpret it in a complete sentence.
Answer
a. \(0.8186\)
b. \(0.8413\)
c. The 80th percentile is 48.6 years, meaning 80% of the smartphone users in the age range 13 to 55+ are 48.6 years old or less.
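These answers can be checked with the sketch below. It also illustrates a connection to the standard normal distribution \(Z \sim N(0, 1)\): in this exercise the ages 23, 50.8, and 64.7 happen to lie exactly one standard deviation below, one above, and two above the mean, so the same areas can be read from \(Z\). That comparison is an added observation for illustration, not part of the original exercise.

```python
from statistics import NormalDist

ages = NormalDist(mu=36.9, sigma=13.9)   # smartphone user ages
Z = NormalDist(mu=0, sigma=1)            # standard normal distribution

p_between = ages.cdf(64.7) - ages.cdf(23)   # part a
p_at_most = ages.cdf(50.8)                  # part b
pct_80 = ages.inv_cdf(0.80)                 # part c

# The same areas expressed in standard deviations from the mean.
p_between_z = Z.cdf(2) - Z.cdf(-1)
p_at_most_z = Z.cdf(1)

print(f"a. {p_between:.4f} (from Z: {p_between_z:.4f})")
print(f"b. {p_at_most:.4f} (from Z: {p_at_most_z:.4f})")
print(f"c. {pct_80:.1f} years")
```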
The life of Sunshine CD players is normally distributed with a mean of 4.1 years and a standard deviation of 1.3 years. We are interested in the length of time a CD player lasts.
a. A CD player is guaranteed for three years. Find the probability that a CD player will break down during the guarantee period.
b. Find the probability that a CD player will last between 2.8 and six years.
c. Find the 70th percentile of the distribution for the time a CD player lasts.
Answer
a. 19.87%
b. 76.94%
c. 4.78 years
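The answers above are stated as percentages, so a check needs to convert each probability from a proportion. A minimal sketch, again assuming Python's standard `statistics.NormalDist` and illustrative variable names:

```python
from statistics import NormalDist

life = NormalDist(mu=4.1, sigma=1.3)   # CD player lifetime in years

p_breaks_in_guarantee = life.cdf(3)        # part a: P(X < 3)
p_between = life.cdf(6) - life.cdf(2.8)    # part b: P(2.8 < X < 6)
pct_70 = life.inv_cdf(0.70)                # part c: 70th percentile

# The :.2% format multiplies by 100 and appends a percent sign.
print(f"a. {p_breaks_in_guarantee:.2%}")
print(f"b. {p_between:.2%}")
print(f"c. {pct_70:.2f} years")
```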