4.4: Other Summaries

    Learning Objectives

    • Create and interpret stemplots, line graphs, and time series graphs for univariate data.
    • Create and interpret contingency tables and scatter plots for bivariate data.

    Stem-and-Leaf Diagrams or Stemplots

    One simple graph, the stem-and-leaf graph or stemplot, comes from the field of exploratory data analysis. It is a good choice when the data set is small. To create the plot, divide each observation into a stem and a leaf. The leaf is the final significant digit, and the stem is everything that comes before it. For example, 23 has stem two and leaf three. The number 432 has stem 43 and leaf two. Likewise, the number 5,432 has stem 543 and leaf two. The decimal 9.3 has stem nine and leaf three. Write the stems in a vertical line from smallest to largest. Draw a vertical line to the right of the stems. Then write the leaves in increasing order next to their corresponding stem.

    Example \(\PageIndex{1}\)

    For Susan Dean's spring pre-calculus class, scores for the first exam were as follows (smallest to largest):

    33; 42; 49; 49; 53; 55; 55; 61; 63; 67; 68; 68; 69; 69; 72; 73; 74; 78; 80; 83; 88; 88; 88; 90; 92; 94; 94; 94; 94; 96; 100

    Stem-and-Leaf Graph
    Stem Leaf
    3 3
    4 2 9 9
    5 3 5 5
    6 1 3 7 8 8 9 9
    7 2 3 4 8
    8 0 3 8 8 8
    9 0 2 4 4 4 4 6
    10 0

    The stemplot shows that most scores fell in the 60s, 70s, 80s, and 90s. Eight out of the 31 scores or approximately 26% \(\left(\frac{8}{31}\right)\) were in the 90s or 100, a fairly high number of As.
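
    The construction steps described above can be sketched in Python. This is a minimal illustration rather than a standard routine: the function name stemplot and its leaf_unit parameter are assumptions made for this example, the "|" character stands in for the vertical line drawn to the right of the stems, and the sketch assumes non-negative data.

```python
from collections import defaultdict

def stemplot(values, leaf_unit=1):
    """Return the rows of a stem-and-leaf plot as strings.

    leaf_unit is the place value of the leaf digit: 1 for whole
    numbers such as 23, or 0.1 for one-decimal values such as 9.3.
    Assumes all values are non-negative.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        scaled = round(v / leaf_unit)      # e.g. 9.3 becomes 93 when leaf_unit is 0.1
        stem, leaf = divmod(scaled, 10)    # the last digit is the leaf
        rows[stem].append(leaf)
    lines = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(str(d) for d in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

# Exam scores from the example above.
scores = [33, 42, 49, 49, 53, 55, 55, 61, 63, 67, 68, 68, 69, 69, 72, 73,
          74, 78, 80, 83, 88, 88, 88, 90, 92, 94, 94, 94, 94, 96, 100]
print("\n".join(stemplot(scores)))
```

    For data with one decimal place, calling stemplot(data, leaf_unit=0.1) treats the digit to the right of the decimal point as the leaf.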

    Exercise \(\PageIndex{1}\)

    For the Park City basketball team, scores for the last 30 games were as follows (smallest to largest):

    32; 32; 33; 34; 38; 40; 42; 42; 43; 44; 46; 47; 47; 48; 48; 48; 49; 50; 50; 51; 52; 52; 52; 53; 54; 56; 57; 57; 60; 61

    Construct a stem plot for the data.

    Answer
    Stem Leaf
    3 2 2 3 4 8
    4 0 2 2 3 4 6 7 7 8 8 8 9
    5 0 0 1 2 2 2 3 4 6 7 7
    6 0 1

    The stemplot is a quick way to graph data and gives an exact picture of the data. You want to look for an overall pattern and any outliers. An outlier is an observation of data that does not fit the rest of the data. It is sometimes called an extreme value. When you graph an outlier, it will appear not to fit the pattern of the graph. Some outliers are due to mistakes (for example, writing down 50 instead of 500) while others may indicate that something unusual is happening. It takes some background information to explain outliers, so we will cover them in more detail later.

    Example \(\PageIndex{2}\)

    The data are the distances (in kilometers) from a home to local supermarkets. Create a stemplot using the data:

    1.1; 1.5; 2.3; 2.5; 2.7; 3.2; 3.3; 3.3; 3.5; 3.8; 4.0; 4.2; 4.5; 4.5; 4.7; 4.8; 5.5; 5.6; 6.5; 6.7; 12.3

    Do the data seem to have any concentration of values?

    HINT: The leaves are to the right of the decimal.

    Answer

    The value 12.3 may be an outlier. Values appear to concentrate at three and four kilometers.

    Stem Leaf
    1 1 5
    2 3 5 7
    3 2 3 3 5 8
    4 0 2 5 5 7 8
    5 5 6
    6 5 7
    7  
    8  
    9  
    10  
    11  
    12 3
    Exercise \(\PageIndex{2}\)

    The following data show the distances (in miles) from the homes of off-campus statistics students to the college. Create a stem plot using the data and identify any outliers:

    0.5; 0.7; 1.1; 1.2; 1.2; 1.3; 1.3; 1.5; 1.5; 1.7; 1.7; 1.8; 1.9; 2.0; 2.2; 2.5; 2.6; 2.8; 2.8; 2.8; 3.5; 3.8; 4.4; 4.8; 4.9; 5.2; 5.5; 5.7; 5.8; 8.0

    Answer
    Stem Leaf
    0 5 7
    1 1 2 2 3 3 5 5 7 7 8 9
    2 0 2 5 6 8 8 8
    3 5 8
    4 4 8 9
    5 2 5 7 8
    6  
    7  
    8 0

    The value 8.0 may be an outlier. Values appear to concentrate at one and two miles.

    Example \(\PageIndex{3}\): Side-by-Side Stem-and-Leaf plot

    A side-by-side stem-and-leaf plot allows a comparison of the two data sets in two columns. In a side-by-side stem-and-leaf plot, two sets of leaves share the same stem. The leaves are to the left and the right of the stems. Tables \(\PageIndex{1}\) and \(\PageIndex{2}\) show the ages of presidents at their inauguration and at their death. Construct a side-by-side stem-and-leaf plot using this data.

    Table \(\PageIndex{1}\): Presidential Ages at Inauguration
    President Age President Age President Age
    Pierce 48 Harding 55 Obama 47
    Polk 49 T. Roosevelt 42 G.H.W. Bush 64
    Fillmore 50 Wilson 56 G. W. Bush 54
    Tyler 51 McKinley 54 Reagan 69
    Van Buren 54 B. Harrison 55 Ford 61
    Washington 57 Lincoln 52 Hoover 54
    Jefferson 57 Grant 46 Truman 60
    Madison 57 Hayes 54 Eisenhower 62
    J. Q. Adams 57 Arthur 51 L. Johnson 55
    Monroe 58 Garfield 49 Kennedy 43
    J. Adams 61 A. Johnson 56 F. Roosevelt 51
    Jackson 61 Cleveland 47 Nixon 56
    Taylor 64 Taft 51 Clinton 47
    Buchanan 65 Coolidge 51 Trump 70
    W. H. Harrison 68 Cleveland 55 Carter 52
    Table \(\PageIndex{2}\): Presidential Ages at Death
    President Age President Age President Age
    Washington 67 Lincoln 56 Hoover 90
    J. Adams 90 A. Johnson 66 F. Roosevelt 63
    Jefferson 83 Grant 63 Truman 88
    Madison 85 Hayes 70 Eisenhower 78
    Monroe 73 Garfield 49 Kennedy 46
    J. Q. Adams 80 Arthur 56 L. Johnson 64
    Jackson 78 Cleveland 71 Nixon 81
    Van Buren 79 B. Harrison 67 Ford 93
    W. H. Harrison 68 Cleveland 71 Reagan 93
    Tyler 71 McKinley 58    
    Polk 53 T. Roosevelt 60    
    Taylor 65 Taft 72    
    Fillmore 74 Wilson 67    
    Pierce 64 Harding 57    
    Buchanan 77 Coolidge 60

    Answer

    Ages at Inauguration | Stem | Ages at Death
    9 9 8 7 7 7 6 3 2 | 4 | 6 9
    8 7 7 7 7 6 6 6 5 5 5 5 4 4 4 4 4 2 2 1 1 1 1 1 0 | 5 | 3 6 6 7 8
    9 8 5 4 4 2 1 1 1 0 | 6 | 0 0 3 3 4 4 5 6 7 7 7 8
    0 | 7 | 0 1 1 1 2 3 4 7 8 8 9
     | 8 | 0 1 3 5 8
     | 9 | 0 0 3 3
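
    A side-by-side plot can be sketched the same way in code. The example below is a rough illustration, not a standard routine: the helper names split_leaves and side_by_side are assumptions, and to keep the output short it uses only the nine presidents from the third column of each table (Hoover through Reagan), so it reproduces a subset of the full plot.

```python
def split_leaves(values):
    """Group the last digit (leaf) of each value under its stem."""
    groups = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        groups.setdefault(stem, []).append(leaf)
    return groups

def side_by_side(left_values, right_values):
    """Return (left leaves, stem, right leaves) rows sharing one stem column."""
    left, right = split_leaves(left_values), split_leaves(right_values)
    low = min(min(left), min(right))
    high = max(max(left), max(right))
    rows = []
    for stem in range(low, high + 1):
        # Left leaves are reversed so that they increase outward from the stem.
        left_leaves = " ".join(str(d) for d in reversed(left.get(stem, [])))
        right_leaves = " ".join(str(d) for d in right.get(stem, []))
        rows.append((left_leaves, stem, right_leaves))
    return rows

# Hoover, F. Roosevelt, Truman, Eisenhower, Kennedy, L. Johnson, Nixon, Ford, Reagan
inauguration = [54, 51, 60, 62, 43, 55, 56, 61, 69]
death = [90, 63, 88, 78, 46, 64, 81, 93, 93]

rows = side_by_side(inauguration, death)
for left_leaves, stem, right_leaves in rows:
    print(f"{left_leaves:>10} | {stem} | {right_leaves}")
```

    Because both sets of leaves hang off the same stems, the shift toward larger stems on the right-hand side is visible at a glance.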
    Exercise \(\PageIndex{3}\)

    The table shows the number of wins and losses the Atlanta Hawks have had in 42 seasons. Create a side-by-side stem-and-leaf plot of these wins and losses.

    Losses Wins Year Losses Wins Year
    34 48 1968–1969 41 41 1989–1990
    34 48 1969–1970 39 43 1990–1991
    46 36 1970–1971 44 38 1991–1992
    46 36 1971–1972 39 43 1992–1993
    36 46 1972–1973 25 57 1993–1994
    47 35 1973–1974 40 42 1994–1995
    51 31 1974–1975 36 46 1995–1996
    53 29 1975–1976 26 56 1996–1997
    51 31 1976–1977 32 50 1997–1998
    41 41 1977–1978 19 31 1998–1999
    36 46 1978–1979 54 28 1999–2000
    32 50 1979–1980 57 25 2000–2001
    51 31 1980–1981 49 33 2001–2002
    40 42 1981–1982 47 35 2002–2003
    39 43 1982–1983 54 28 2003–2004
    42 40 1983–1984 69 13 2004–2005
    48 34 1984–1985 56 26 2005–2006
    32 50 1985–1986 52 30 2006–2007
    25 57 1986–1987 45 37 2007–2008
    32 50 1987–1988 35 47 2008–2009
    30 52 1988–1989 29 53 2009–2010
    Answer
    Table \(\PageIndex{A}\): Atlanta Hawks Wins and Losses
    Number of Wins | Stem | Number of Losses
    3 | 1 | 9
    9 8 8 6 5 | 2 | 5 5 6 9
    8 7 6 6 5 5 4 3 1 1 1 1 0 | 3 | 0 2 2 2 2 4 4 5 6 6 6 9 9 9
    8 8 7 6 6 6 3 3 3 2 2 1 1 0 | 4 | 0 0 1 1 2 4 5 6 6 7 7 8 9
    7 7 6 3 2 0 0 0 0 | 5 | 1 1 1 2 3 4 4 6 7
     | 6 | 9

    Line Graphs or Frequency Polygons

    Another type of graph that is useful for specific data values is a line graph or a frequency polygon. In the particular line graph shown in the example below, the x-axis (horizontal axis) consists of data values and the y-axis (vertical axis) consists of frequency points. The frequency points are connected using line segments.

    Example \(\PageIndex{4}\)

    In a survey, 40 mothers were asked how many times per week a teenager must be reminded to do his or her chores. The results are shown in the table and in Figure \(\PageIndex{1}\) below.

    Number of times teenager is reminded Frequency
    0 2
    1 5
    2 8
    3 14
    4 7
    5 4
    A line graph showing the number of times a teenager needs to be reminded to do chores on the x-axis and  frequency on the y-axis.
    Figure \(\PageIndex{1}\)
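
    The points and segments of this line graph can be listed with a short Python sketch. It only prepares the coordinates (it does not draw anything), and the variable names are assumptions chosen for this illustration.

```python
# Frequency table from the example: value -> frequency.
reminders = {0: 2, 1: 5, 2: 8, 3: 14, 4: 7, 5: 4}

# Vertices of the line graph, ordered along the x-axis.
points = sorted(reminders.items())

# Consecutive vertices are joined by line segments.
segments = list(zip(points, points[1:]))

total = sum(freq for _, freq in points)                  # number of mothers surveyed
peak_value, peak_freq = max(points, key=lambda p: p[1])  # highest point on the graph

for (x0, y0), (x1, y1) in segments:
    print(f"({x0}, {y0}) -> ({x1}, {y1})")
print("total responses:", total)
print("highest point at x =", peak_value)
```

    Checking that the frequencies add up to the number of people surveyed is a quick way to catch a data-entry error before drawing the graph.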
    Exercise \(\PageIndex{4}\)

    In a survey, 40 people were asked how many times per year they had their car in the shop for repairs. The results are shown in the table below. Construct a line graph.

    Number of times in shop Frequency
    0 7
    1 10
    2 14
    3 9

    Answer

    Figure \(\PageIndex{2}\).

    Frequency polygons can also be used for comparing distributions. This is achieved by overlaying the frequency polygons drawn for different data sets.

    Example \(\PageIndex{5}\)

    We will construct an overlay frequency polygon comparing the final exam scores with the students’ final numeric grade.

    Frequency Distribution for Calculus Final Test Scores
    Range Midpoint Frequency Cumulative Frequency
    49.5-59.5 54.5 5 5
    59.5-69.5 64.5 10 15
    69.5-79.5 74.5 30 45
    79.5-89.5 84.5 40 85
    89.5-99.5 94.5 15 100
    Frequency Distribution for Calculus Final Grades
    Range Midpoint Frequency Cumulative Frequency
    49.5-59.5 54.5 10 10
    59.5-69.5 64.5 10 20
    69.5-79.5 74.5 30 50
    79.5-89.5 84.5 45 95
    89.5-99.5 94.5 5 100
    This is an overlay frequency polygon that matches the supplied data. The x-axis shows the grades, and the y-axis shows the frequency.
    Figure \(\PageIndex{6}\).
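
    The midpoint and cumulative frequency columns in the two tables, and the vertices of the two overlaid polygons, can be reproduced with a few lines of Python. This is a sketch under the assumption that both distributions share the same class boundaries, as they do here; the variable names are illustrative.

```python
from itertools import accumulate

boundaries = [49.5, 59.5, 69.5, 79.5, 89.5, 99.5]   # shared class boundaries
test_freq = [5, 10, 30, 40, 15]                     # final test scores
grade_freq = [10, 10, 30, 45, 5]                    # final grades

# Each class is plotted at its midpoint.
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]

# Running totals give the cumulative frequency column.
test_cum = list(accumulate(test_freq))
grade_cum = list(accumulate(grade_freq))

# Each polygon is a list of (midpoint, frequency) vertices on the same x-axis.
test_polygon = list(zip(midpoints, test_freq))
grade_polygon = list(zip(midpoints, grade_freq))

for m, t, g in zip(midpoints, test_freq, grade_freq):
    print(f"{m:5.1f}  test={t:3d}  grade={g:3d}")
```

    Because both polygons use the same midpoints, they can be drawn on a single set of axes and compared class by class.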

    Time Series Graphs

    Suppose that we want to study the temperature range of a region for an entire month. Every day at noon we note the temperature and write this down in a log. A variety of statistical studies could be done with this data. We could find the mean or the median temperature for the month. We could construct a histogram displaying the number of days that temperatures reach a certain range of values. However, all of these methods ignore a portion of the data that we have collected.

    One feature of the data that we may want to consider is that of time. Since each date is paired with the temperature reading for the day, we don't have to think of the data as being random. We can instead use the times given to impose a chronological order on the data. A graph that recognizes this ordering and displays the changing temperature as the month progresses is called a time series graph.

    To construct a time series graph, we must look at both pieces of our paired data set. We start with a standard Cartesian coordinate system. The horizontal axis is used to plot the date or time increments, and the vertical axis is used to plot the values of the variable that we are measuring. By doing this, we make each point on the graph correspond to a date and a measured quantity. The points on the graph are typically connected by straight lines in the order in which they occur.

    Example \(\PageIndex{6}\)

    The following table shows the Consumer Price Index for each month, together with the annual value, for ten years. Construct a time series graph for the Annual Consumer Price Index data only.

    Year Jan Feb Mar Apr May Jun Jul
    2003 181.7 183.1 184.2 183.8 183.5 183.7 183.9
    2004 185.2 186.2 187.4 188.0 189.1 189.7 189.4
    2005 190.7 191.8 193.3 194.6 194.4 194.5 195.4
    2006 198.3 198.7 199.8 201.5 202.5 202.9 203.5
    2007 202.416 203.499 205.352 206.686 207.949 208.352 208.299
    2008 211.080 211.693 213.528 214.823 216.632 218.815 219.964
    2009 211.143 212.193 212.709 213.240 213.856 215.693 215.351
    2010 216.687 216.741 217.631 218.009 218.178 217.965 218.011
    2011 220.223 221.309 223.467 224.906 225.964 225.722 225.922
    2012 226.665 227.663 229.392 230.085 229.815 229.478 229.104
    Year Aug Sep Oct Nov Dec Annual
    2003 184.6 185.2 185.0 184.5 184.3 184.0
    2004 189.5 189.9 190.9 191.0 190.3 188.9
    2005 196.4 198.8 199.2 197.6 196.8 195.3
    2006 203.9 202.9 201.8 201.5 201.8 201.6
    2007 207.917 208.490 208.936 210.177 210.036 207.342
    2008 219.086 218.783 216.573 212.425 210.228 215.303
    2009 215.834 215.969 216.177 216.330 215.949 214.537
    2010 218.312 218.439 218.711 218.803 219.179 218.056
    2011 226.545 226.889 226.421 226.230 225.672 224.939
    2012 230.379 231.407 231.317 230.221 229.601 229.594

    Answer

    This is a time series graph that matches the supplied data. The x-axis shows years from 2003 to 2012, and the y-axis shows the annual CPI.
    Figure \(\PageIndex{7}\).
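
    The same construction can be sketched in Python using the Annual column from the table. The code only orders the points and measures the change along each segment; it does not draw the graph, and the variable names are assumptions for this illustration.

```python
# Annual CPI values from the table: year -> value.
annual_cpi = {
    2003: 184.0, 2004: 188.9, 2005: 195.3, 2006: 201.6, 2007: 207.342,
    2008: 215.303, 2009: 214.537, 2010: 218.056, 2011: 224.939, 2012: 229.594,
}

# Impose chronological order: sort the (year, value) pairs by year.
series = sorted(annual_cpi.items())

# Join consecutive points, as the line segments of the graph would,
# and record the change along each segment.
changes = [(y1, round(v1 - v0, 3))
           for (y0, v0), (y1, v1) in zip(series, series[1:])]

declines = [year for year, change in changes if change < 0]

for year, change in changes:
    print(f"{year}: {change:+.3f}")
print("years with a decline:", declines)
```

    A downward segment on the graph corresponds to a negative change in this list, which is how a dip such as the one ending in 2009 stands out.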
    Exercise \(\PageIndex{6}\)

    The following table is a portion of a data set from www.worldbank.org. Use the table to construct a time series graph for CO2 emissions for the United States.

    CO2 Emissions
    Year Ukraine United Kingdom United States
    2003 352,259 540,640 5,681,664
    2004 343,121 540,409 5,790,761
    2005 339,029 541,990 5,826,394
    2006 327,797 542,045 5,737,615
    2007 328,357 528,631 5,828,697
    2008 323,657 522,247 5,656,839
    2009 272,176 474,579 5,299,563
    This is a time series graph that matches the supplied data. The x-axis shows years from 2003 to 2009, and the y-axis shows the annual CO2 emissions for the United States.
    Figure \(\PageIndex{8}\).

    Time series graphs are important tools in various applications of statistics. When recording values of the same variable over an extended period of time, sometimes it is difficult to discern any trend or pattern. However, once the same data points are displayed graphically, some features jump out. Time series graphs make trends easy to spot.

    Bivariate Data

    Earlier we discussed methods of summarizing and organizing data obtained from observing one variable. Such data are called univariate data. When we are interested in studying the relationship between a pair of variables, we must collect and organize the data on two variables at the same time. Such data are called bivariate data, and there are several ways to organize them depending on the types of the variables.

    Univariate Data: The mileage of 5 cars
    Car Mileage (mi)
    1 78524
    2 12574
    3 24914
    4 65813
    5 39824

    Bivariate Data: The ages and mileage of 5 cars
    Car Age (years) Mileage (mi)
    1 7 78524
    2 2 12574
    3 1 24914
    4 5 65813
    5 3 39824

    Scatter Plots

    When both variables are quantitative, we can construct a scatter plot to organize the data. To construct a scatter plot, we use a horizontal axis for the observations of one variable and a vertical axis for the observations of the other. When choosing an axis for each variable, consider whether you suspect that one variable depends on the other. The independent variable goes on the x-axis and the dependent variable goes on the y-axis. Each pair of observations is then plotted as a point.

    Example \(\PageIndex{7}\)

    The summary of the ages and the mileages of a sample of 5 cars is shown below:

    Table \(\PageIndex{A}\): Table showing the ages and the mileages of a sample of 5 cars.
    Car \(x\) (age in years) \(y\) (mileage in miles)
    1 7 78524
    2 2 12574
    3 1 24914
    4 5 65813
    5 3 39824

    To construct the scatterplot, we use a horizontal axis for the ages of cars and a vertical axis for the mileage.

    Figure \(\PageIndex{B}\): Scatter plot showing the ages and the mileages of a sample of 5 cars.
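
    As a rough illustration of how each pair of observations becomes one point, the sketch below pairs the ages with the mileages and draws a coarse text version of the scatter plot. The function text_scatter and its parameters are assumptions made for this example; it expects whole-number, non-negative x values and groups the y values into bands of 10,000 miles.

```python
ages = [7, 2, 1, 5, 3]
mileages = [78524, 12574, 24914, 65813, 39824]

# Each car contributes one (x, y) point.
points = list(zip(ages, mileages))

def text_scatter(points, rows=8, y_step=10000):
    """Draw a coarse scatter plot; each row covers y_step units of y."""
    max_x = max(x for x, _ in points)
    grid = [[" "] * (max_x + 1) for _ in range(rows)]
    for x, y in points:
        row = min(int(y // y_step), rows - 1)   # band that this y value falls in
        grid[row][x] = "*"
    lines = []
    for r in range(rows - 1, -1, -1):           # highest band is printed first
        lines.append(f"{r * y_step:>6} |" + " ".join(grid[r]))
    lines.append(" " * 7 + "+" + "-" * (2 * max_x + 1))
    lines.append(" " * 8 + " ".join(str(x) for x in range(max_x + 1)))
    return lines

print("\n".join(text_scatter(points)))
```

    Age is placed on the horizontal axis because mileage is suspected to depend on age, not the other way around.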

    Exercise \(\PageIndex{7}\)

    Amelia plays basketball for her high school. She wants to improve to play at the college level. She notices that the number of points she scores in a game goes up in response to the number of hours she practices her jump shot each week. She records the following data:

    \(X\) (hours practicing jump shot) \(Y\) (points scored in a game)
    5 15
    7 22
    9 28
    10 31
    11 33
    12 36

    Construct a scatter plot and state if what Amelia thinks appears to be true.

    Answer

    This is a scatter plot for the data provided. The x-axis is labeled in increments of 2 from 0 - 16. The y-axis is labeled in increments of 5 from 0 - 35.

    Figure \(\PageIndex{2}\)

    Yes, Amelia’s assumption appears to be correct. The number of points Amelia scores per game goes up when she practices her jump shot more.

    Contingency Tables

    When we are interested in studying the relationship between a pair of variables, we must collect and organize the data on two variables at the same time. Recall that such data are called bivariate data, and that there are several ways to organize them depending on the types of the variables. Below are examples of univariate and bivariate qualitative data.

    Univariate Data: The colors of 5 cars
    Car Color
    1 Light
    2 Dark
    3 Dark
    4 Light
    5 Dark

    Bivariate Data: The color and condition of 5 cars
    Car Color Condition
    1 Light Like new
    2 Dark Used
    3 Dark Like new
    4 Light Used
    5 Dark Used

    A contingency table can be used to organize bivariate data in which one or both variables are qualitative.

    Example \(\PageIndex{8}\)

    To construct the table, we arrange the observed frequencies into rows and columns. The intersection of a row and a column of a contingency table is called a cell. Each cell shows the frequency of observations that fit the description in the corresponding row and column.

    Color\Condition Like new Used
    Light 1 1
    Dark 1 2

    It often makes sense to add a row and a column of totals, showing the total for each row, for each column, and for the entire table.

    Color\Condition Like new Used Total
    Light 1 1 2
    Dark 1 2 3
    Total 2 3 5

    Alternatively, the contingency table can be referred to as a two-way frequency table. Why? In the example above, cover the second and third columns, and what is left is the frequency table for the colors of the cars. Cover the second and third rows instead, and what is left is the frequency table for the conditions of the cars. Also note that every contingency table can be broken down into two frequency tables, but two frequency tables cannot be combined into a contingency table once the original paired data are lost.
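
    The counting behind this table can be sketched in Python with collections.Counter. The list cars holds the paired observations for the five cars above; the variable names are assumptions for this illustration.

```python
from collections import Counter

# One (color, condition) pair per car, in the order listed above.
cars = [("Light", "Like new"), ("Dark", "Used"), ("Dark", "Like new"),
        ("Light", "Used"), ("Dark", "Used")]

cells = Counter(cars)                                # one count per cell
row_totals = Counter(color for color, _ in cars)     # frequency table for color
col_totals = Counter(cond for _, cond in cars)       # frequency table for condition

colors = ["Light", "Dark"]
conditions = ["Like new", "Used"]

print(f"{'':8}" + "".join(f"{c:>10}" for c in conditions) + f"{'Total':>10}")
for color in colors:
    counts = "".join(f"{cells[(color, c)]:>10}" for c in conditions)
    print(f"{color:8}{counts}{row_totals[color]:>10}")
print(f"{'Total':8}" + "".join(f"{col_totals[c]:>10}" for c in conditions)
      + f"{len(cars):>10}")
```

    The cell counts are computed from the paired observations, while the two sets of totals need only one variable each. This is why the one-way frequency tables can be recovered from the contingency table but not the reverse.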

    Exercise \(\PageIndex{8}\)

    Destiny surveyed the students in her neighborhood and obtained the following data. Construct the contingency table that summarizes the school enrollment by level and type.

    Student School level School type
    Student 1 Middle School Public
    Student 2 Middle School Private
    Student 3 Elementary School Public
    Student 4 Middle School Public
    Student 5 Elementary School Public
    Student 6 High School Public
    Student 7 High School Private
    Student 8 Middle School Public
    Student 9 High School Public
    Student 10 Elementary School Public
    Student 11 Middle School Private
    Student 12 Elementary School Private
    Student 13 Elementary School Private
    Student 14 Elementary School Private
    Student 15 Elementary School Private
    Student 16 Middle School Private
    Student 17 Middle School Private
    Student 18 High School Private
    Student 19 Elementary School Private
    Answer

    The following contingency table summarizes the school enrollment by level and type:

      Public Private Total
    Elementary School 3 5 8
    Middle School 3 4 7
    High School 2 2 4
    Total 8 11 19

    This page titled 4.4: Other Summaries is shared under a CC BY license and was authored, remixed, and/or curated by OpenStax.