Mathematics LibreTexts

8.3.3: What Makes a Good Sample?

    Lesson

    Let's see what makes a good sample.

    Exercise \(\PageIndex{1}\): Number Talk: Division by Powers of 10

    Find the value of each quotient mentally.

    \(34,000\div 10\)

    \(340\div 100\)

    \(34\div 10\)

    \(3.4\div 100\)

    Exercise \(\PageIndex{2}\): Selling Paintings

    Your teacher will assign you to work with either means or medians.

    1. A young artist has sold 10 paintings. Calculate the measure of center you were assigned for each of these samples:
      1. The first two paintings she sold were for $50 and $350.
      2. At a gallery show, she sold three paintings for $250, $400, and $1,200.
      3. Her oil paintings have sold for $410, $400, and $375.
    2. Here are the selling prices for all 10 of her paintings:
      \(\$50\quad \$200\quad \$250\quad \$275\quad \$280\quad \$350\quad \$375\quad \$400\quad \$410\quad \$1{,}200\)
      Calculate the measure of center you were assigned for all of the selling prices.
    3. Compare your answers with your partner. Were the measures of center for any of the samples close to the same measure of center for the population?

    Exercise \(\PageIndex{3}\): Sampling the Fish Market

    The price per pound of catfish at a fish market was recorded for 100 weeks.

    1. Here are dot plots showing the population and three different samples from that population. What do you notice? What do you wonder?
    2. If the goal is to have the sample represent the population, which of the samples would work best? Which wouldn't work so well? Explain your reasoning.

    Are you ready for more?

    When doing a statistical study, it is important to keep the goal of the study in mind. Representative samples give us the best information about the distribution of the population as a whole, but sometimes a representative sample won’t work for the goal of a study!

    For example, suppose you want to study how discrimination affects people in your town. Surveying a representative sample of people in your town would give information about how the population generally feels, but might miss some smaller groups. Describe a way you might choose a sample of people to address this question.

    Exercise \(\PageIndex{4}\): Auditing Sales

    An online shopping company tracks how many items it sells in different categories during each month for a year. Three different auditors each take samples from that data. Use the samples to draw dot plots of what the population data might look like for the furniture and electronics categories.

    Auditor 1’s sample

    clipboard_e2cb5ed860ea25678f5a02fac08eab989.png
    Figure \(\PageIndex{1}\): A dot plot for monthly sales of furniture online in hundreds. The numbers 66 through 74 are indicated. The data, titled "Auditor 1's sample," are as follows: 67 hundred, 1 dot. 70 hundred, 1 dot. 73 hundred, 1 dot.

    Auditor 2's sample

    clipboard_e030ef0bd709d1ba62d99f0ec0218fc2b.png
    Figure \(\PageIndex{2}\)

    Auditor 3’s sample

    clipboard_edb726f90fc2c90d4d3597d9e99c5a4f3.png
    Figure \(\PageIndex{3}\): A dot plot for monthly sales of furniture online in hundreds. The numbers 66 through 74 are indicated. The data, titled "Auditor 3's sample," are as follows: 71 hundred, 2 dots. 73 hundred, 1 dot.

    Population

    clipboard_e544c13669f9f24a7a45a2baeb4acf696.png
    Figure \(\PageIndex{4}\)

    Auditor 1’s sample

    clipboard_e0fbfa0503c4844f8b6da678abcd29639.png
    Figure \(\PageIndex{5}\): A dot plot for monthly sales of electronics online in thousands. The numbers 38 through 43 are indicated. The data titled Auditor one's sample are as follows: 39 thousand, 1 dot. 41 thousand, 1 dot. 43 thousand, 1 dot.

    Auditor 2's sample

    clipboard_e860a385befd8aabdfed79bbaca0abcc7.png
    Figure \(\PageIndex{6}\): A dot plot for monthly sales of electronics online in thousands. The numbers 38 through 43 are indicated. The data titled Auditor two's sample are as follows: 41 thousand, 1 dot. 43 thousand, 2 dots.

    Auditor 3's sample

    clipboard_e8cbe8495fa79b035e58dcb35e0b3f6bb.png
    Figure \(\PageIndex{7}\): A dot plot for monthly sales of electronics online in thousands. The numbers 38 through 43 are indicated. The data titled Auditor three's sample are as follows: 40 thousand, 1 dot. 41 thousand, 1 dot. 43 thousand, 1 dot.

    Population

    clipboard_e7de6c28285507d98ede33242957a346f.png
    Figure \(\PageIndex{8}\)

    Summary

    A sample that is representative of a population has a distribution that closely resembles the distribution of the population in shape, center, and spread.

    For example, consider the distribution of plant heights, in cm, for a population of plants shown in this dot plot. The mean for this population is 4.9 cm, and the MAD is 2.6 cm.

    clipboard_e0d81d877868e1092ef04364663f376dc.png
    Figure \(\PageIndex{9}\): A dot plot for height in centimeters. The numbers 1 through 11 are indicated. The data are as follows: 1 centimeter, 5 dots; 2 centimeters, 7 dots; 3 centimeters, 8 dots; 4 centimeters, 8 dots; 5 centimeters, 5 dots; 6 centimeters, 3 dots; 7 centimeters, 2 dots; 8 centimeters, 2 dots; 9 centimeters, 1 dot; 10 centimeters, 3 dots; 11 centimeters, 5 dots.

    A representative sample of this population should have a larger peak on the left and a smaller one on the right, like this one. The mean for this sample is 4.9 cm, and the MAD is 2.3 cm.

    clipboard_ebc554a4e1f2d364d8f4443cd43274511.png
    Figure \(\PageIndex{10}\): A dot plot for height in centimeters. The numbers 1 through 11 are indicated. The data are as follows: 1 centimeter, 1 dot; 2 centimeters, 2 dots; 3 centimeters, 4 dots; 4 centimeters, 4 dots; 5 centimeters, 2 dots; 6 centimeters, 1 dot; 7 centimeters, 1 dot; 10 centimeters, 1 dot; 11 centimeters, 2 dots.

    Here is the distribution for another sample from the same population. This sample has a mean of 5.7 cm and a MAD of 1.5 cm. Both differ noticeably from the population's values (mean 4.9 cm, MAD 2.6 cm), and the distribution has a very different shape, so it is not a representative sample.

    clipboard_e37beff9cacda459681fccd55b3ca09c5.png
    Figure \(\PageIndex{11}\): A dot plot for height in centimeters. The numbers 1 through 11 are indicated. The data are as follows: 3 centimeters, 1 dot; 4 centimeters, 3 dots; 5 centimeters, 3 dots; 6 centimeters, 2 dots; 7 centimeters, 1 dot; 8 centimeters, 2 dots; 9 centimeters, 1 dot.
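
    The means and MADs quoted in this summary can be checked directly from the dot plot counts. The sketch below is one way to do that in Python. The helper names (`expand`, `mean`, `mad`) and the labels `sample_a` and `sample_b` are illustrative choices, not part of the lesson; the counts are copied from the three figure descriptions above.

```python
def expand(counts):
    """Turn a {value: number_of_dots} dot plot into a flat list of values."""
    return [value for value, dots in counts.items() for _ in range(dots)]


def mean(values):
    """Add up the values and divide by how many there are."""
    return sum(values) / len(values)


def mad(values):
    """Mean absolute deviation: average distance from the mean."""
    m = mean(values)
    return sum(abs(v - m) for v in values) / len(values)


# Plant heights in cm, read off the dot plots in the summary.
population = expand({1: 5, 2: 7, 3: 8, 4: 8, 5: 5, 6: 3,
                     7: 2, 8: 2, 9: 1, 10: 3, 11: 5})
sample_a = expand({1: 1, 2: 2, 3: 4, 4: 4, 5: 2, 6: 1,
                   7: 1, 10: 1, 11: 2})          # described as representative
sample_b = expand({3: 1, 4: 3, 5: 3, 6: 2, 7: 1,
                   8: 2, 9: 1})                  # described as not representative

for name, data in [("population", population),
                   ("sample_a", sample_a),
                   ("sample_b", sample_b)]:
    print(f"{name}: mean = {mean(data):.1f} cm, MAD = {mad(data):.1f} cm")
```

    Rounded to one decimal place, the printed values match the ones stated in the text: the first sample stays close to the population in both center and spread, while the second does not. Comparing shape still requires looking at the dot plots themselves.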

    Glossary Entries

    Definition: Mean

    The mean is one way to measure the center of a data set. We can think of it as a balance point. For example, for the data set 7, 9, 12, 13, 14, the mean is 11.

    clipboard_e0af9b462a4649223d9101bc04ae54726.png
    Figure \(\PageIndex{12}\)

    To find the mean, add up all the numbers in the data set. Then, divide by how many numbers there are. \(7+9+12+13+14=55\) and \(55\div 5=11\).
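
    One way to see why the mean acts as a balance point is to look at how far each value sits from it. The short Python sketch below uses the same data set; the variable names are illustrative only.

```python
data = [7, 9, 12, 13, 14]

# Add up all the numbers, then divide by how many there are.
data_mean = sum(data) / len(data)

# Signed distance of each value from the mean (negative = left of the mean).
deviations = [x - data_mean for x in data]

print(data_mean)        # 11.0
print(deviations)       # [-4.0, -2.0, 1.0, 2.0, 3.0]
print(sum(deviations))  # 0.0
```

    The distances to the left of the mean (4 and 2) total 6, and so do the distances to the right (1, 2, and 3), which is why the data "balances" at 11.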

    Definition: Mean Absolute Deviation (MAD)

    The mean absolute deviation is one way to measure how spread out a data set is. Sometimes we call this the MAD. For example, for the data set 7, 9, 12, 13, 14, the MAD is 2.4. If these values are travel times in minutes, this tells us that the travel times are typically 2.4 minutes away from the mean, which is 11.

    clipboard_e0af9b462a4649223d9101bc04ae54726.png
    Figure \(\PageIndex{13}\)

    To find the MAD, add up the distances between each data point and the mean. Then, divide by how many numbers there are.

    \(4+2+1+2+3=12\) and \(12\div 5=2.4\)
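
    The same two steps can be written as a small Python sketch. The variable names here are illustrative, and the data set is the one used in the definition.

```python
data = [7, 9, 12, 13, 14]
data_mean = sum(data) / len(data)              # 11.0

# Step 1: distance between each data point and the mean.
distances = [abs(x - data_mean) for x in data]

# Step 2: add up the distances and divide by how many numbers there are.
data_mad = sum(distances) / len(distances)

print(distances)  # [4.0, 2.0, 1.0, 2.0, 3.0]
print(data_mad)   # 2.4
```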

    Definition: Median

    The median is one way to measure the center of a data set. It is the middle number when the data set is listed in order.

    For the data set 7, 9, 12, 13, 14, the median is 12.

    For the data set 3, 5, 6, 8, 11, 12, there are two numbers in the middle. The median is the average of these two numbers. \(6+8=14\) and \(14\div 2=7\).
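
    Both cases, an odd and an even number of values, can be sketched in Python as follows. The function name `median_of` is a hypothetical helper written for this illustration; `statistics.median` from the Python standard library is used as a cross-check.

```python
import statistics


def median_of(values):
    """Middle value of the sorted data, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]                          # one middle number
    return (ordered[middle - 1] + ordered[middle]) / 2  # average the two in the middle


odd_set = [7, 9, 12, 13, 14]
even_set = [3, 5, 6, 8, 11, 12]

print(median_of(odd_set))   # 12
print(median_of(even_set))  # 7.0

# The standard library gives the same results.
print(statistics.median(odd_set), statistics.median(even_set))
```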

    Definition: Population

    A population is a set of people or things that we want to study.

    For example, if we want to study the heights of people on different sports teams, the population would be all the people on the teams.

    Definition: Representative

    A sample is representative of a population if its distribution resembles the population's distribution in center, shape, and spread.

    For example, this dot plot represents a population.

    clipboard_e34cf530bb1c8f1472b0bd8d103d407c3.png
    Figure \(\PageIndex{14}\)

    This dot plot shows a sample that is representative of the population.

    clipboard_eecb2f6d5c6d39e7ffee42e5a79b10df0.png
    Figure \(\PageIndex{15}\)

    Definition: Sample

    A sample is part of a population. For example, a population could be all the seventh grade students at one school. One sample of that population is all the seventh grade students who are in band.

    Practice

    Exercise \(\PageIndex{5}\)

    Suppose 45% of all the students at Andre’s school brought in a can of food to contribute to a canned food drive. Andre picks a representative sample of 25 students from the school and determines the sample’s percentage.

    He expects the percentage for this sample will be 45%. Do you agree? Explain your reasoning.

    Exercise \(\PageIndex{6}\)

    This is a dot plot of the scores on a video game for a population of 50 teenagers.

    clipboard_e03994fb1ce090e9a002f2b0827de185f.png
    Figure \(\PageIndex{16}\): A dot plot for score on a video game. The numbers 40 through 200, in increments of 10, are indicated. The data are as follows: Score of 40, 1 dot. Score of 45, 1 dot. Score of 60, 1 dot. Score of 65, 2 dots. Score of 70, 2 dots. Score of 75, 2 dots. Score of 80, 2 dots. Score of 85, 2 dots. Score of 90, 2 dots. Score of 95, 2 dots. Score of 100, 2 dots. Score of 105, 1 dot. Score of 110, 2 dots. Score of 115, 2 dots. Score of 120, 3 dots. Score of 125, 3 dots. Score of 130, 5 dots. Score of 135, 2 dots. Score of 145, 1 dot. Score of 150, 1 dot. Score of 155, 1 dot. Score of 160, 1 dot. Score of 170, 2 dots. Score of 175, 2 dots. Score of 180, 1 dot. Score of 190, 2 dots. Score of 195, 1 dot. Score of 200, 1 dot.

    The three dot plots below show the scores of teenagers in three samples from this population. Which of the three samples is most representative of the population? Explain how you know.

    clipboard_e689f4a4f2c4dc8c1f96942d80fef70c2.png
    Figure \(\PageIndex{17}\): Three dot plots for score on a video game are labeled sample 1, sample 2, and sample 3. The numbers 40 through 200, in increments of 10, are indicated. The data are as follows: Sample 1: Score of 75, 2 dots. Score of 100, 1 dot. Score of 110, 1 dot. Score of 130, 1 dot. Score of 160, 1 dot. Score of 170, 2 dots. Score of 180, 1 dot. Score of 195, 1 dot. Sample 2: Score of 160, 1 dot. Score of 170, 2 dots. Score of 175, 2 dots. Score of 180, 1 dot. Score of 190, 2 dots. Score of 195, 1 dot. Score of 200, 1 dot. Sample 3: Score of 40, 1 dot. Score of 45, 1 dot. Score of 60, 1 dot. Score of 70, 2 dots. Score of 80, 1 dot. Score of 100, 2 dots. Score of 105, 1 dot. Score of 115, 1 dot.

    Exercise \(\PageIndex{7}\)

    This is a dot plot of the number of text messages sent one day for a sample of the students at a local high school. The sample consisted of 30 students and was selected to be representative of the population.

    clipboard_efe7ccf3db1d675594ce47abc0d0d6290.png
    Figure \(\PageIndex{18}\): A dot plot for number of text messages sent. The numbers 0 through 90, in increments of 5, are indicated. The data are as follows: 0 text messages, 6 dots. 2 text messages, 2 dots. 8 text messages, 3 dots. 10 text messages, 2 dots. 11 text messages, 1 dot. 13 text messages, 1 dot. 14 text messages, 1 dot. 16 text messages, 1 dot. 17 text messages, 1 dot. 20 text messages, 1 dot. 23 text messages, 1 dot. 24 text messages, 1 dot. 26 text messages, 1 dot. 30 text messages, 1 dot. 31 text messages, 2 dots. 32 text messages, 1 dot. 35 text messages, 1 dot. 41 text messages, 1 dot. 75 text messages, 1 dot. 90 text messages, 1 dot.
    1. What do the six values of 0 in the dot plot represent?
    2. Since this sample is representative of the population, describe what you think a dot plot for the entire population might look like.

    Exercise \(\PageIndex{8}\)

    A doctor suspects you might have a certain strain of flu and wants to test your blood for the presence of markers for this strain of virus. Why would it be good for the doctor to take a sample of your blood rather than use the population?

    (From Unit 8.3.2)

    Exercise \(\PageIndex{9}\)

    How many different outcomes are in each sample space? Explain your reasoning. (You do not need to write out the actual options, just provide the number and your reasoning.)

    1. A letter of the English alphabet is followed by a digit from 0 to 9.
    2. A baseball team’s cap is selected from 3 different colors, 2 different clasps, and 4 different locations for the team logo. A decision is made to include or not to include reflective piping.
    3. A locker combination like 7-23-11 uses three numbers, each from 1 to 40. Numbers can be used more than once, like 7-23-7.

    (From Unit 8.2.2)


    This page titled 8.3.3: What Makes a Good Sample? is shared under a CC BY license and was authored, remixed, and/or curated by Illustrative Mathematics.
