9.3: Measures of Central Tendency


Let’s begin by trying to find the most “typical” value of a data set.

Note that we just used the word “typical” although in many cases you might think of using the word “average.” We need to be careful with the word “average” as it means different things to different people in different contexts. One of the most common uses of the word “average” is what mathematicians and statisticians call the arithmetic mean, or just plain old mean for short. “Arithmetic mean” sounds rather fancy, but it is likely you have calculated a mean many times without realizing it; the mean is what most people think of when they use the word “average”.

Definition: Mean

The mean of a set of data is the sum of the data values divided by the number of values.
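
In symbols, the definition can be written as follows. The names $x_1, x_2, \ldots, x_N$ for the data values are notation assumed here for illustration; $N$ is the number of data values, as used later in this section.

\[\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_N}{N} \nonumber\]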

Example $\PageIndex{1}$

Marci’s exam scores for her last math class were: 79, 86, 82, 94. The mean of these values would be:

$ \dfrac{79 + 86 + 82 + 94}{4} = 85.25 $

Typically, we round the mean to one more decimal place than the original data. In this case, we would round $85.25$ to $85.3$. Thus, we can say Marci’s average score on her math exams was $85.25$ or about $85.3$.

Example $\PageIndex{2}$

The numbers of touchdown (TD) passes thrown by each of the 31 teams in the National Football League in the 2000 season are shown below.

37 33 33 32 29 28 28 23 22 22 22 21 21 21 20

20 19 19 18 18 18 18 16 15 14 14 14 12 12 9 6

Adding these values, we get a sum total of 634 TDs. Dividing by 31, the total number of data values, we get $\dfrac{634}{31} = 20.4516$. It would be appropriate to round this to 20.5.

It would be most correct for us to report that “The mean number of touchdown passes thrown in the NFL in the 2000 season was 20.5 passes,” but it is not uncommon to see the more casual word “average” used in place of “mean.”
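
As a quick check of the arithmetic above, the same calculation can be sketched in Python. The variable names are chosen here for illustration; the data values are the 31 touchdown counts listed in the example.

```python
# Touchdown passes per team, copied from the example above.
td_passes = [
    37, 33, 33, 32, 29, 28, 28, 23, 22, 22, 22, 21, 21, 21, 20,
    20, 19, 19, 18, 18, 18, 18, 16, 15, 14, 14, 14, 12, 12, 9, 6,
]

total = sum(td_passes)     # sum of the data values
count = len(td_passes)     # number of data values
mean_td = total / count    # the mean: sum divided by count

print(total, count)        # 634 31
print(round(mean_td, 1))   # 20.5
```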

Try It Now 4

The price of a jar of peanut butter at 5 stores was: $3.29, $3.59, $3.79, $3.75, and $3.99. Find the mean price.

Let’s look at an example for calculating the mean given a frequency table.

Example $\PageIndex{3}$

The one hundred families in a particular neighborhood are asked their annual household income, to the nearest 5 thousand dollars. The results are summarized in the frequency table below.

Income (thousands of dollars)	Frequency
15	6
20	8
25	11
30	17
35	19
40	20
45	12
50	7

Calculating the mean by hand could get tricky if we try to type in all 100 values:

$\dfrac{15+15+15+15+15+15+20+20+\cdots+50+50+50+50+50+50+50}{100}$

That’s one long numerator! We could calculate this more efficiently by noticing that adding $15$ to itself six times is the same as $15 \cdot 6 = 90$. Using this simplification, we get

$\dfrac{(15 \cdot 6) + (20 \cdot 8) + (25 \cdot 11) + (30 \cdot 17) + (35 \cdot 19) + (40 \cdot 20) + (45 \cdot 12) + (50 \cdot 7)}{100} = \dfrac{3390}{100} = 33.9$

The mean household income of our sample is $33.9$ thousand dollars ($33,900).
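
The shortcut of multiplying each value by its frequency can be sketched in Python as below. The dictionary layout and the function name `mean_from_frequencies` are assumptions made for this illustration; the numbers come from the frequency table in the example.

```python
# Income (thousands of dollars) -> number of families, from the table above.
income_freq = {15: 6, 20: 8, 25: 11, 30: 17, 35: 19, 40: 20, 45: 12, 50: 7}

def mean_from_frequencies(freq):
    """Return the mean of data summarized as {value: frequency}."""
    # Each value contributes value * frequency to the sum of all data values.
    total = sum(value * count for value, count in freq.items())
    # The number of data values is the sum of the frequencies.
    n = sum(freq.values())
    return total / n

print(sum(income_freq.values()))           # 100
print(mean_from_frequencies(income_freq))  # 33.9
```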

Example $\PageIndex{4}$

Extending the last example, suppose a new family with a household income of $5 million ($5000 thousand) moves into the neighborhood. Adding this to our sample, our mean is now:

$\dfrac{(15 \cdot 6) + (20 \cdot 8) + (25 \cdot 11) + (30 \cdot 17) + (35 \cdot 19) + (40 \cdot 20) + (45 \cdot 12) + (50 \cdot 7) + (5000 \cdot 1)}{101} = \dfrac{8390}{101} = 83.069$

While about $83.1$ thousand dollars ($83,069) is the correct mean household income, it no longer represents a “typical” value: 100 of the 101 households have incomes of $50 thousand or less.

Imagine the data values on a see-saw or balance scale. The mean is the value that keeps the data in balance, like in the picture below.

[Figure: data values on a balance scale, with the mean at the balance point]

If we graph our household data, the $5 million data value is so far out to the right that the mean has to adjust up to keep things in balance.

[Figure: household income data with the $5 million value far to the right, pulling the mean upward]

For this reason, when working with data that have outliers – values far outside the primary grouping – it is common to use a different measure of center, the median.

Definition: Median

The median of a set of data is the value in the middle when the data is in order.

To find the median, begin by listing the data in order from smallest to largest, or largest to smallest.

If the number of data values, $N$, is odd, then the median is the middle data value. This value can be found by rounding $\dfrac{N}{2}$ up to the next whole number.

If the number of data values is even, there is no one middle value, so we find the mean of the two middle values (values $\dfrac{N}{2}$ and $\dfrac{N}{2} + 1$).
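
The odd and even cases above can be sketched as a small Python function. The function name `median` and the two short test lists are made up for this illustration; positions in the text are counted from 1, while Python lists are indexed from 0, which is why 1 is subtracted.

```python
import math

def median(values):
    """Median following the odd/even rule described above."""
    ordered = sorted(values)                 # list the data in order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    if n % 2 == 1:
        # Odd N: round N/2 up to get the 1-based position of the middle value.
        position = math.ceil(n / 2)
        return ordered[position - 1]
    # Even N: average value number N/2 and value number N/2 + 1 (1-based).
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2

print(median([3, 1, 2]))     # 2
print(median([4, 1, 3, 2]))  # 2.5
```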

We can interpret the median as “half of the data is less than the median and the other half is more than the median.” Of course, we can rewrite this in context of the problem.

Example $\PageIndex{5}$

Returning to the football touchdown data, we would start by listing the data in order. Luckily, it was already in decreasing order, so we can work with it without needing to reorder it first.

37 33 33 32 29 28 28 23 22 22 22 21 21 21 20

20 19 19 18 18 18 18 16 15 14 14 14 12 12 9 6

Since there are 31 data values, an odd number, the median will be the middle number, the 16th data value ($\dfrac{31}{2} = 15.5$, round up to 16, leaving 15 values below and 15 above). The 16th data value is 20, so the median number of touchdown passes in the 2000 season was 20 passes. This means that about half of the teams threw 20 or fewer touchdown passes and about half threw 20 or more. Notice that for this data, the median is fairly close to the mean we calculated earlier, 20.5.

Example $\PageIndex{6}$

Find the median of these quiz scores: 5 10 8 6 4 8 2 5 7 7

Solution

We start by listing the data in order: 2 4 5 5 6 7 7 8 8 10

Since there are 10 data values, an even number, there is no one middle number. So, we find the mean of the two middle numbers, 6 and 7, and get

\[\dfrac{(6+7)}{2} = 6.5. \nonumber\]

The median quiz score was 6.5. We can say, half of the quiz scores were lower than 6.5 and the other half were higher than 6.5.
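
Both median examples so far can be checked with Python's standard `statistics` module, whose `median` function sorts the data and averages the two middle values when the count is even. The variable names below are chosen for illustration; the data come from the two examples.

```python
import statistics

# Touchdown data from the football example (31 values, an odd count).
td_passes = [
    37, 33, 33, 32, 29, 28, 28, 23, 22, 22, 22, 21, 21, 21, 20,
    20, 19, 19, 18, 18, 18, 18, 16, 15, 14, 14, 14, 12, 12, 9, 6,
]

# Quiz scores from the example above (10 values, an even count).
quiz_scores = [5, 10, 8, 6, 4, 8, 2, 5, 7, 7]

td_median = statistics.median(td_passes)
quiz_median = statistics.median(quiz_scores)

print(td_median)    # 20
print(quiz_median)  # 6.5
```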

Try It Now 5

The price of a jar of peanut butter at 5 stores was: $3.29, $3.59, $3.79, $3.75, and $3.99. Find the median price.

Example $\PageIndex{7}$

Let us return now to our original household income data.

Income (thousands of dollars)	Frequency
15	6
20	8
25	11
30	17
35	19
40	20
45	12
50	7

Here we have 100 data values. If we didn’t already know that, we could find it by adding the frequencies. Since 100 is an even number, we need to find the mean of the middle two data values: the 50th and 51st data values. To find these, we count up from the smallest income:

There are 6 data values of $15 thousand, so values 1 to 6 are $15 thousand
The next 8 data values are $20 thousand, so values 7 to (6+8)=14 are $20 thousand
The next 11 data values are $25 thousand, so values 15 to (14+11)=25 are $25 thousand
The next 17 data values are $30 thousand, so values 26 to (25+17)=42 are $30 thousand
The next 19 data values are $35 thousand, so values 43 to (42+19)=61 are $35 thousand

From this we can tell that values 50 and 51 will both be $35 thousand, and the mean of these two values is $35 thousand. The median income in this neighborhood is $35 thousand. Thus, about half of the households earned $35,000 or less and about half earned $35,000 or more.
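
The counting procedure above, which keeps a running total of the frequencies until it reaches the position we want, can be sketched in Python. The function names `value_at_position` and `median_from_frequencies` are hypothetical helpers written for this illustration.

```python
# Income (thousands of dollars) -> number of families, from the table above.
income_freq = {15: 6, 20: 8, 25: 11, 30: 17, 35: 19, 40: 20, 45: 12, 50: 7}

def value_at_position(freq, position):
    """Return the data value at a 1-based position in sorted order."""
    running = 0
    for value in sorted(freq):
        running += freq[value]        # running total of the frequencies
        if position <= running:
            return value
    raise ValueError("position is larger than the number of data values")

def median_from_frequencies(freq):
    """Median of data summarized as {value: frequency}."""
    n = sum(freq.values())
    if n % 2 == 1:
        return value_at_position(freq, (n + 1) // 2)
    lower = value_at_position(freq, n // 2)
    upper = value_at_position(freq, n // 2 + 1)
    return (lower + upper) / 2

print(value_at_position(income_freq, 50))    # 35
print(value_at_position(income_freq, 51))    # 35
print(median_from_frequencies(income_freq))  # 35.0
```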

Example $\PageIndex{8}$

If we add in the new neighbor with a $5 million household income, then there will be 101 data values, and the 51st value will be the median. As we discovered in the last example, the 51st value is $35 thousand. Notice that the new neighbor did not affect the median in this case. The median is not swayed as much by outliers as the mean.

Let’s think about the previous example. When we added the 101st family’s income, the mean rose from $33,900 to $83,069. That’s a big difference in the average household income. The mean depends on the size of every data value, so a single extreme value can pull it far up or down. The median, however, did not change at all when the 101st family’s income was included, because it depends only on which values fall in the middle of the ordered list. For this reason, the median is generally considered a better summary of household income, where a few very large incomes can skew the mean. In short, extreme values strongly influence the mean but have little or no effect on the median.
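
The comparison can be reproduced with a short Python sketch. It expands the frequency table into the full list of 100 incomes, then adds the $5 million household (5000 in thousands of dollars). The helper name `summarize` is made up for this illustration.

```python
import statistics

# Income (thousands of dollars) -> number of families, from the earlier table.
income_freq = {15: 6, 20: 8, 25: 11, 30: 17, 35: 19, 40: 20, 45: 12, 50: 7}

# Expand the table into the full list of 100 incomes.
incomes = [value for value, count in income_freq.items() for _ in range(count)]
with_outlier = incomes + [5000]   # add the new $5 million household

def summarize(data):
    """Return (mean, median) for a list of numbers."""
    return sum(data) / len(data), statistics.median(data)

mean_before, median_before = summarize(incomes)
mean_after, median_after = summarize(with_outlier)

print(f"before: mean {mean_before:.1f}, median {median_before:.0f}")
print(f"after:  mean {mean_after:.1f}, median {median_after:.0f}")
```

One extra value out of 101 moves the mean by almost 50 thousand dollars, while the median stays at 35 thousand.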

In addition to the mean and the median, there is one other common measurement of the “typical” value of a data set: the mode.

Definition: Mode

The mode is the observed value of the data set that occurs most frequently.

The mode is most commonly used for categorical data, for which the median and mean cannot be computed. It is also the only measure of central tendency that can be used for both categorical and quantitative data; the mean and median are used only with quantitative data.

Example $\PageIndex{9}$

In our vehicle color survey, we collected the following data.

Color	Frequency
Blue	3
Green	5
Red	4
White	3
Black	2
Grey	3

For this data, Green is the mode, since it is the data value that occurred the most frequently.

It is possible for a data set to have more than one mode if several categories have the same frequency, or no modes if every category occurs only once.
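
Finding the mode from a frequency table can be sketched in Python as below. The function name `modes` is hypothetical, and the choice to return an empty list when every value occurs only once follows the convention stated in the text above; other references may treat that case differently. The second table is made-up data used only to show a tie.

```python
# Vehicle color -> frequency, from the survey table above.
color_freq = {"Blue": 3, "Green": 5, "Red": 4, "White": 3, "Black": 2, "Grey": 3}

def modes(freq):
    """Return every value tied for the highest frequency.

    Returns an empty list when every value occurs only once,
    following the convention used in this section.
    """
    if not freq:
        return []
    highest = max(freq.values())
    if highest == 1:
        return []                      # every value occurs only once
    return [value for value, count in freq.items() if count == highest]

print(modes(color_freq))                    # ['Green']
print(modes({"A": 2, "B": 2, "C": 1}))      # ['A', 'B'] (made-up data, two modes)
print(modes({"A": 1, "B": 1}))              # [] (made-up data, no mode)
```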

Try It Now 6

Reviewers were asked to rate a product on a scale of 1 to 5. Find:

The mean rating
The median rating
The mode rating

Rating	Frequency
1	4
2	8
3	7
4	3
5	1
