3.2: Fitting Linear Models to Data

Last updated

Jul 13, 2024

Learning Objectives

Create and interpret scatter plots.
Fit a regression line to a set of data and use the linear model to make predictions.

Prerequisite Skills

Before you get started, take this prerequisite quiz.

1. If $y=-3.215x-78.2$, solve for $y$ when $x=-21$.

Click here to check your answer

$y=-10.685$

If you missed this problem, review here. (Note that this will open a different textbook in a new window.)

2. If $y=-3.215x-78.2$, solve for $x$ when $y=-46.05$.

Click here to check your answer

$x=-10$

If you missed this problem, review Section 1.1. (Note that this will open in a new window.)
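
As a quick check of the two quiz answers above, here is a minimal Python sketch. The function names `evaluate` and `solve_for_x` are illustrative choices and are not part of the text.

```python
# Check the prerequisite answers for the line y = -3.215x - 78.2.
SLOPE = -3.215
INTERCEPT = -78.2

def evaluate(x):
    """Return y for a given x on the line y = SLOPE * x + INTERCEPT."""
    return SLOPE * x + INTERCEPT

def solve_for_x(y):
    """Return the x that produces the given y (undo the intercept, then the slope)."""
    return (y - INTERCEPT) / SLOPE

print(round(evaluate(-21), 3))        # problem 1
print(round(solve_for_x(-46.05), 3))  # problem 2
```

The two printed values should agree with the answers given for problems 1 and 2.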

A professor is attempting to identify trends among final exam scores. His class has a mixture of students, so he wonders if there is any relationship between age and final exam scores. One way for him to analyze the scores is by creating a diagram that relates the age of each student to the exam score received. In this section, we will examine one such diagram known as a scatter plot.

Drawing and Interpreting Scatter Plots

A scatter plot is a graph of plotted points that may show a relationship between two sets of data. If the relationship is from a linear model, or a model that is nearly linear, the professor can draw conclusions using his knowledge of linear functions. Figure $\PageIndex{1}$ shows a sample scatter plot.

Figure $\PageIndex{1}$ : A scatter plot of age and final exam score variables, titled 'Final Exam Score VS Age'. The x-axis is the age, ranging from the 20s to the 50s, and the y-axis is the final exam score, ranging from the upper 50s to the 90s.

Notice this scatter plot does not indicate a linear relationship. The points do not appear to follow a trend. In other words, there does not appear to be a relationship between the age of the student and the score on the final exam.

To create a scatter plot on Desmos

Open your Desmos app, or go to https://www.desmos.com/calculator.
Above the equation list, click the + icon.
Click "table."
Enter the x-values in the first column and the y-values in the second column.
Click the magnifying glass to the left of the table to automatically adjust the viewing window to see the points. You can also manually adjust the window as necessary.

Example $\PageIndex{1}$ : Using a Scatter Plot to Investigate Cricket Chirps

The table below shows the number of cricket chirps in 15 seconds, for several different air temperatures, in degrees Fahrenheit. Plot this data and determine whether the data appears to be linearly related.

Table $\PageIndex{1}$
Chirps	44	35	20.4	33	31	35	18.5	37	26
Temperature	80.5	70.5	57	66	68	72	52	73.5	53

Solution

Plotting this data, as depicted in Figure $\PageIndex{2}$, suggests that there may be a trend. We can see from the trend in the data that the number of chirps increases as the temperature increases. The trend appears to be roughly linear, though certainly not perfectly so.

Figure $\PageIndex{2}$ : Scatter plot titled 'Cricket Chirps Vs Air Temperature'. The x-axis is the number of cricket chirps in 15 seconds, and the y-axis is the temperature (°F). The trend in the data is generally positive.

Finding the Line of Best Fit

Once we recognize a need for a linear function to model that data, the natural follow-up question is “what is that linear function?” One way to approximate our linear function is to sketch the line that seems to best fit the data. Then we can extend the line until we can verify the y-intercept. We can approximate the slope of the line by extending it until we can estimate the $\frac{\text{rise}}{\text{run}}$ .

Example $\PageIndex{2}$ : "Eyeballing" a Line of Best Fit

Find a linear function that fits the data in Table $\PageIndex{1}$ by “eyeballing” a line that seems to fit.

Solution

On a graph, we could try sketching a line.

Using the starting and ending points of our hand-drawn line, points $(0, 30)$ and $(50, 90)$, this graph has a slope of

$m=\dfrac{60}{50}=1.2\nonumber$

and a y-intercept at 30. This gives an equation of

$T(c)=1.2c+30\nonumber$

where $c$ is the number of chirps in 15 seconds, and $T(c)$ is the temperature in degrees Fahrenheit. The resulting equation is represented in Figure $\PageIndex{3}$ .

Figure $\PageIndex{3}$ : Scatter plot titled 'Cricket Chirps Vs Air Temperature', showing the line of best fit. The x-axis is 'c, Number of Chirps', and the y-axis is 'T(c), Temperature (F)'.

Analysis

This linear equation can then be used to approximate answers to various questions we might ask about the trend.
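
To make the eyeballed model concrete, the following Python sketch rebuilds $T(c)=1.2c+30$ from the two endpoints of the hand-drawn line and compares it with the data in Table $\PageIndex{1}$. The helper name `temperature` and the sample input of 40 chirps are illustrative assumptions, not part of the original example.

```python
# Endpoints read off the hand-drawn line in the example above.
x1, y1 = 0, 30
x2, y2 = 50, 90

slope = (y2 - y1) / (x2 - x1)   # rise over run
intercept = y1                  # the line crosses the T-axis where c = 0

def temperature(chirps):
    """Eyeballed model: estimated temperature (deg F) from chirps in 15 seconds."""
    return slope * chirps + intercept

# Cricket data from Table 1.
chirps = [44, 35, 20.4, 33, 31, 35, 18.5, 37, 26]
temps = [80.5, 70.5, 57, 66, 68, 72, 52, 73.5, 53]

# Residual = observed temperature minus the model's estimate.
for c, t in zip(chirps, temps):
    estimate = temperature(c)
    print(f"chirps={c:>5}  observed={t:>5}  model={estimate:6.2f}  residual={t - estimate:+6.2f}")

# Hypothetical count (not in the table): 40 chirps in 15 seconds.
print(temperature(40))
```

Small residuals of mixed sign would suggest the hand-drawn line tracks the data reasonably well; large or one-sided residuals would suggest redrawing it.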

Exercise $\PageIndex{1}$

According to the data from Table $\PageIndex{1}$ , what temperature can we predict it is if we counted 29 chirps in 15 seconds?

Solution

According to the line $T(c)=1.2c+30$, the temperature would be in the mid 60s: $T(29)=1.2(29)+30=64.8$, or approximately 65°F.

Finding the Line of Best Fit Using a Graphing Program

While eyeballing a line works reasonably well, some statistical techniques exist for fitting a line to data that minimize the differences between the line and data values[2]. One such technique is called least squares regression and can be computed by many graphing calculators, spreadsheet software, statistical software, and many web-based calculators[3]. Least squares regression is one means to determine the line that best fits the data, and here we will refer to this method as linear regression.
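
For reference, the standard closed-form result for least squares regression with one input variable can be written as follows. The notation here ($n$ data points $(x_i, y_i)$ with means $\bar{x}$ and $\bar{y}$) is introduced for illustration and does not appear elsewhere in this section.

$m=\dfrac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2}, \qquad b=\bar{y}-m\bar{x} \nonumber$

These values of $m$ and $b$ minimize the sum of squared vertical distances $\sum_{i=1}^{n}\left(y_i-(mx_i+b)\right)^2$ between the data points and the line $y=mx+b$.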

To find a line of best fit using Desmos

Open your Desmos app, or go to https://www.desmos.com/calculator.
Create a scatter plot, as listed above.
After the xy table, in the space for the next equation, write y1~mx1+b.
- The ~ sign tells the program to approximate the equation, since a linear equation won't fit the data precisely.
- The $x_1$ and $y_1$ tell the program to use the values of $x_1$ and $y_1$ from the table above.
- The $m$ and $b$ represent slope and y-intercept of the line, respectively.
Press Enter.
The resulting information will display the appropriate slope (m) and y-intercept (b) values of the line of best fit.
- The information will also display the statistics $r^2$ and $r$. An r-value of 1 or $-1$ indicates an exact fit. The closer the r-value is to 1 or $-1$, the more closely the data fit the line of best fit.
For a video explanation of this process, click HERE. The first 53 seconds show how to enter a scatter plot. The line of best fit begins at the 0:53 mark.
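
Outside Desmos, the same kind of fit can be sketched in a few lines of Python using the standard least squares formulas. This is a minimal illustration, not a description of how Desmos computes its result; the function name `fit_line` and the small data sets are made up for the example.

```python
from math import sqrt

def fit_line(xs, ys):
    """Least squares fit of y = m*x + b; returns (m, b, r)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    m = sxy / sxx                 # slope
    b = y_bar - m * x_bar         # y-intercept
    r = sxy / sqrt(sxx * syy)     # correlation coefficient
    return m, b, r

# Made-up points lying exactly on y = 2x + 1: the fit is exact, so r = 1.
print(fit_line([1, 2, 3, 4], [3, 5, 7, 9]))

# Made-up points lying exactly on y = -x + 10: also exact, but r = -1.
print(fit_line([1, 2, 3, 4], [9, 8, 7, 6]))

# Made-up points with some scatter: r falls strictly between -1 and 1.
print(fit_line([1, 2, 3, 4], [2, 5, 4, 8]))
```

The returned `m` and `b` play the same role as the slope and y-intercept that Desmos reports, and `r` is the correlation coefficient.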

Example $\PageIndex{3}$ : Finding a Linear Regression Equation

Find the line of best fit using the cricket-chirp data in Table $\PageIndex{1}$ .

Solution

Table $\PageIndex{1}$
Chirps	44	35	20.4	33	31	35	18.5	37	26
Temperature	80.5	70.5	57	66	68	72	52	73.5	53

Enter the input (chirps) in $x_1$ . Enter the output (temperature) in $y_1$ .

Enter $y_1 \sim mx_1+b$ in the next equation space.

Note the m and b values for the data.

Therefore we obtain the equation:

$y=1.143x+30.281 \nonumber$

$T=1.143c+30.281 \nonumber$

Q&A: Will there ever be a case where two different lines will serve as the best fit for the data?

No. There is only one best fit line.

Predicting with a Regression Line

Once we determine that a set of data is linear, we can use the regression line to make predictions. As we learned above, a regression line is a line that is closest to the data in the scatter plot, which means that only one such line is a best fit for the data.

Example $\PageIndex{4}$ : Using a Regression Line to Make Predictions

Gasoline consumption in the United States has been steadily increasing. Consumption data from 1994 to 2004 is shown in Table $\PageIndex{3}$ .

a. Graph the data and determine whether the trend is linear. If so, find a model for the data.

b. Use the model to predict the consumption in 2008.

c. Use the model to predict when gasoline consumption would reach 160 billion gallons.

Table $\PageIndex{3}$
Year	'94	'95	'96	'97	'98	'99	'00	'01	'02	'03	'04
Consumption (billions of gallons)	113	116	118	119	123	125	126	128	131	133	136

a. The scatter plot of the data, including the least squares regression line, is shown in Figure $\PageIndex{5}$ .

Figure $\PageIndex{5}$ : Scatter plot titled 'Gas Consumption VS Year', showing the line of best fit. The x-axis is 'Year After 1994', and the y-axis is 'Gas Consumption (billions of gallons)'.

We can introduce a new input variable, $t$, representing years since 1994.

The linear regression equation is:

$C=113.318+2.209t \nonumber$

b. Using this to predict consumption in 2008, set $t=14$ .

$\begin{align*} C&=113.318+2.209(14) \\ &=144.244 \end{align*}$

The model predicts 144.244 billion gallons of gasoline consumption in 2008.

c. Using the model to predict the time when consumption is 160 billion gallons, set $C=160$.

$\begin{align*} 160&=113.318+2.209(t) \\ 46.682&=2.209(t) \\ 21.133 &=t \end{align*}$

21 years after 1994 is the year 2015, so gasoline consumption is expected to reach 160 billion gallons in 2015.
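
Parts (b) and (c) can be reproduced with a short Python sketch that evaluates the regression equation and then inverts it. The function names are illustrative, and the coefficients are simply the rounded values quoted in the regression equation above.

```python
BASE_YEAR = 1994
INTERCEPT = 113.318   # predicted consumption (billions of gallons) when t = 0
SLOPE = 2.209         # predicted increase per year (billions of gallons)

def consumption(year):
    """Predicted gasoline consumption, in billions of gallons, for a calendar year."""
    t = year - BASE_YEAR
    return INTERCEPT + SLOPE * t

def years_until(target):
    """Years after BASE_YEAR at which the model reaches the target consumption."""
    return (target - INTERCEPT) / SLOPE

print(round(consumption(2008), 3))        # part (b)
t = years_until(160)
print(round(t, 3), BASE_YEAR + int(t))    # part (c): years after 1994, calendar year
```

The printed values should agree with the hand calculations in parts (b) and (c).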

Exercise $\PageIndex{2}$

Use the model we created using technology in Example $\PageIndex{4}$ to predict the gas consumption in 2011.

Answer: 150.871 billion gallons
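
One way to arrive at this answer, shown here as a worked check: 2011 is 17 years after 1994, so set $t=17$ in the model from Example $\PageIndex{4}$.

$\begin{align*} C&=113.318+2.209(17) \\ &=113.318+37.553 \\ &=150.871 \end{align*}$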

Key Concepts

Scatter plots show the relationship between two sets of data.
Scatter plots may represent linear or non-linear models.
The line of best fit may be estimated or calculated, using a calculator or statistical software.
The correlation coefficient, $r$ , indicates the degree of linear relationship between data.
A regression line best fits the data.
The least squares regression line may be used to make predictions regarding either of the variables.

Contributors and Attributions

Jay Abramson (Arizona State University) with contributing authors. Textbook content produced by OpenStax College is licensed under a Creative Commons Attribution 4.0 License. Download for free at https://openstax.org/details/books/precalculus.
