4.1: Fitting a Straight Line
Suppose the fitting curve is a line. We write the fitting curve as
\[y(x)=\alpha x+\beta . \nonumber \]
The signed vertical distance \(r_{i}\) (the residual) between the data point \(\left(x_{i}, y_{i}\right)\) and the fitting curve is given by
\[\begin{aligned} r_{i} &=y_{i}-y\left(x_{i}\right) \\ &=y_{i}-\left(\alpha x_{i}+\beta\right) . \end{aligned} \nonumber \]
A least-squares fit minimizes the sum of the squares of the \(r_{i}\). If the errors in the \(y_{i}\) are assumed to be independent and normally distributed with equal variance, this minimum can be shown to yield the most probable values of \(\alpha\) and \(\beta\).
We define
\[\begin{aligned} \rho &=\sum_{i=1}^{n} r_{i}^{2} \\ &=\sum_{i=1}^{n}\left(y_{i}-\left(\alpha x_{i}+\beta\right)\right)^{2} \end{aligned} \nonumber \]
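As a concrete illustration of this definition, the following minimal Python sketch evaluates \(\rho\) for two trial lines. The data points and the function name `sum_squared_residuals` are hypothetical, chosen only for illustration.

```python
def sum_squared_residuals(xs, ys, alpha, beta):
    """Return rho = sum over i of (y_i - (alpha*x_i + beta))**2."""
    return sum((y - (alpha * x + beta)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 8.0]

rho_a = sum_squared_residuals(xs, ys, 2.0, 1.0)  # trial line y = 2x + 1,     rho = 1.0
rho_b = sum_squared_residuals(xs, ys, 2.5, 0.5)  # trial line y = 2.5x + 0.5, rho = 0.5
```

Of these two trial lines, the second gives the smaller \(\rho\). A least-squares fit seeks the values of \(\alpha\) and \(\beta\) for which \(\rho\) is as small as possible.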
To minimize \(\rho\) with respect to \(\alpha\) and \(\beta\), we solve
\[\frac{\partial \rho}{\partial \alpha}=0, \quad \frac{\partial \rho}{\partial \beta}=0 \nonumber \]
Taking the partial derivatives, we have
\[\begin{aligned} &\frac{\partial \rho}{\partial \alpha}=\sum_{i=1}^{n} 2\left(-x_{i}\right)\left(y_{i}-\left(\alpha x_{i}+\beta\right)\right)=0 \\ &\frac{\partial \rho}{\partial \beta}=\sum_{i=1}^{n} 2(-1)\left(y_{i}-\left(\alpha x_{i}+\beta\right)\right)=0 \end{aligned} \nonumber \]
These equations form a system of two linear equations in the two unknowns \(\alpha\) and \(\beta\), which becomes evident after dividing each equation by \(-2\) and collecting the terms in \(\alpha\) and \(\beta\):
\[\begin{aligned} \alpha \sum_{i=1}^{n} x_{i}^{2}+\beta \sum_{i=1}^{n} x_{i} &=\sum_{i=1}^{n} x_{i} y_{i} \\ \alpha \sum_{i=1}^{n} x_{i}+\beta n &=\sum_{i=1}^{n} y_{i} \end{aligned} \nonumber \]
These equations can be solved either analytically, or numerically in MATLAB, where the matrix form is
\[\left(\begin{array}{cc} \sum_{i=1}^{n} x_{i}^{2} & \sum_{i=1}^{n} x_{i} \\ \sum_{i=1}^{n} x_{i} & n \end{array}\right)\left(\begin{array}{c} \alpha \\ \beta \end{array}\right)=\left(\begin{array}{c} \sum_{i=1}^{n} x_{i} y_{i} \\ \sum_{i=1}^{n} y_{i} \end{array}\right) . \nonumber \]
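For the analytic route, applying Cramer's rule to the linear system above gives the following closed form, which is valid provided the \(x_{i}\) are not all equal (so that the determinant in the denominator is nonzero):
\[\alpha=\frac{n \sum_{i=1}^{n} x_{i} y_{i}-\sum_{i=1}^{n} x_{i} \sum_{i=1}^{n} y_{i}}{n \sum_{i=1}^{n} x_{i}^{2}-\left(\sum_{i=1}^{n} x_{i}\right)^{2}}, \quad \beta=\frac{\sum_{i=1}^{n} x_{i}^{2} \sum_{i=1}^{n} y_{i}-\sum_{i=1}^{n} x_{i} \sum_{i=1}^{n} x_{i} y_{i}}{n \sum_{i=1}^{n} x_{i}^{2}-\left(\sum_{i=1}^{n} x_{i}\right)^{2}} . \nonumber \]
The sketch below evaluates these formulas in plain Python rather than MATLAB. It is one possible implementation under stated assumptions, not a definitive one: the function name `fit_line` and the data are hypothetical, and the exact test for a zero determinant is a simplification that a tolerance-based check might replace for noisy floating-point data.

```python
def fit_line(xs, ys):
    """Least-squares line y = alpha*x + beta from the 2x2 normal equations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    # Sums that appear in the normal equations.
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Determinant of the coefficient matrix [[sxx, sx], [sx, n]].
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are equal; the line is not determined")
    # Cramer's rule for the two unknowns.
    alpha = (n * sxy - sx * sy) / det
    beta = (sxx * sy - sx * sxy) / det
    return alpha, beta

# Hypothetical data, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 8.0]
alpha, beta = fit_line(xs, ys)  # approximately alpha = 2.3, beta = 0.8
```

In MATLAB, the same system could instead be solved numerically by forming the coefficient matrix `A` and right-hand side `b` and using the backslash operator, `A\b`.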
A proper statistical treatment of this problem should also consider an estimate of the errors in \(\alpha\) and \(\beta\) as well as an estimate of the goodness-of-fit of the data to the model. We leave these further considerations to a statistics class.