4.2: Fitting to a Linear Combination of Functions
Consider the general fitting function
\[y(x)=\sum_{j=1}^{m} c_{j} f_{j}(x) \nonumber \]
where we assume \(m\) functions \(f_{j}(x)\). For example, if we want to fit a cubic polynomial to the data, then we would have \(m=4\) and take \(f_{1}=1, f_{2}=x, f_{3}=x^{2}\) and \(f_{4}=x^{3}\). Typically, the number of functions \(f_{j}\) is less than the number of data points; that is, \(m<n\), so that a direct attempt to solve for the \(c_{j}\)'s such that the fitting function exactly passes through the \(n\) data points would result in \(n\) equations and \(m\) unknowns. This would be an over-determined linear system that in general has no solution.
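Written out for the cubic example, the fitting function is
\[y(x)=c_{1}+c_{2} x+c_{3} x^{2}+c_{4} x^{3}. \nonumber \]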
We now define the vectors
\[\mathbf{y}=\left(\begin{array}{c} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{array}\right), \quad \mathbf{c}=\left(\begin{array}{c} c_{1} \\ c_{2} \\ \vdots \\ c_{m} \end{array}\right) \nonumber \]
and the \(n\)-by-\(m\) matrix
\[\mathrm{A}=\left(\begin{array}{cccc} f_{1}\left(x_{1}\right) & f_{2}\left(x_{1}\right) & \cdots & f_{m}\left(x_{1}\right) \\ f_{1}\left(x_{2}\right) & f_{2}\left(x_{2}\right) & \cdots & f_{m}\left(x_{2}\right) \\ \vdots & \vdots & \ddots & \vdots \\ f_{1}\left(x_{n}\right) & f_{2}\left(x_{n}\right) & \cdots & f_{m}\left(x_{n}\right) \end{array}\right) \nonumber \]
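In MATLAB, for instance, the matrix \(\mathrm{A}\) for the cubic example can be assembled in a single line. The sketch below assumes the data abscissas are stored in a column vector x of length \(n\):

% design matrix for f1 = 1, f2 = x, f3 = x^2, f4 = x^3,
% assuming x is a column vector of the n data points x_i
A = [ones(size(x)), x, x.^2, x.^3];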
With \(m<n\), the equation \(A \mathbf{c}=\mathbf{y}\) is over-determined. We let
\[\mathbf{r}=\mathbf{y}-\mathrm{A} \mathbf{c} \nonumber \]
be the residual vector, and let
\[\rho=\sum_{i=1}^{n} r_{i}^{2} \nonumber \]
be the sum of the squares of the residuals.
The method of least squares minimizes \(\rho\) with respect to the components of \(\mathbf{c}\). Now, using \(T\) to signify the transpose of a matrix, we have
\[\begin{aligned} \rho &=\mathbf{r}^{T} \mathbf{r} \\ &=(\mathbf{y}-\mathrm{A} \mathbf{c})^{T}(\mathbf{y}-\mathrm{A} \mathbf{c}) \\ &=\mathbf{y}^{T} \mathbf{y}-\mathbf{c}^{T} \mathrm{~A}^{T} \mathbf{y}-\mathbf{y}^{T} \mathrm{~A} \mathbf{c}+\mathbf{c}^{T} \mathrm{~A}^{T} \mathrm{~A} \mathbf{c} . \end{aligned} \nonumber \]
Since \(\rho\) is a scalar, each term in the above expression must be a scalar, and since the transpose of a scalar is equal to the scalar, we have
\[\mathbf{c}^{T} \mathrm{~A}^{T} \mathbf{y}=\left(\mathbf{c}^{T} \mathrm{~A}^{T} \mathbf{y}\right)^{T}=\mathbf{y}^{T} \mathrm{~A} \mathbf{c} . \nonumber \]
Therefore,
\[\rho=\mathbf{y}^{T} \mathbf{y}-2 \mathbf{y}^{T} \mathrm{~A} \mathbf{c}+\mathbf{c}^{T} \mathrm{~A}^{T} \mathrm{~A} \mathbf{c} . \nonumber \]
To find the minimum of \(\rho\), we will need to solve \(\partial \rho / \partial c_{j}=0\) for \(j=1, \ldots, m\). To take the derivative of \(\rho\), we switch to a tensor notation, using the Einstein summation convention, where repeated indices are summed over their allowable range; for example, \(y_{i} y_{i}\) stands for \(\sum_{i=1}^{n} y_{i}^{2}\). We can write
\[\rho=y_{i} y_{i}-2 y_{i} \mathrm{~A}_{i k} c_{k}+c_{i} \mathrm{~A}_{i k}^{T} \mathrm{~A}_{k l} c_{l} \nonumber \]
Taking the partial derivative, we have
\[\frac{\partial \rho}{\partial c_{j}}=-2 y_{i} \mathrm{~A}_{i k} \frac{\partial c_{k}}{\partial c_{j}}+\frac{\partial c_{i}}{\partial c_{j}} \mathrm{~A}_{i k}^{T} \mathrm{~A}_{k l} c_{l}+c_{i} \mathrm{~A}_{i k}^{T} \mathrm{~A}_{k l} \frac{\partial c_{l}}{\partial c_{j}} \nonumber \]
Now,
\[\frac{\partial c_{i}}{\partial c_{j}}= \begin{cases}1, & \text { if } i=j \\ 0, & \text { otherwise }\end{cases} \nonumber \]
Therefore,
\[\frac{\partial \rho}{\partial c_{j}}=-2 y_{i} \mathrm{~A}_{i j}+\mathrm{A}_{j k}^{T} \mathrm{~A}_{k l} c_{l}+c_{i} \mathrm{~A}_{i k}^{T} \mathrm{~A}_{k j} \nonumber \]
Now,
\[\begin{aligned} c_{i} \mathrm{~A}_{i k}^{T} \mathrm{~A}_{k j} &=c_{i} \mathrm{~A}_{k i} \mathrm{~A}_{k j} \\ &=\mathrm{A}_{k j} \mathrm{~A}_{k i} c_{i} \\ &=\mathrm{A}_{j k}^{T} \mathrm{~A}_{k i} c_{i} \\ &=\mathrm{A}_{j k}^{T} \mathrm{~A}_{k l} c_{l} \end{aligned} \nonumber \]
Therefore,
\[\frac{\partial \rho}{\partial c_{j}}=-2 y_{i} \mathrm{~A}_{i j}+2 \mathrm{~A}_{j k}^{T} \mathrm{~A}_{k l} c_{l} \nonumber \]
With the partials set equal to zero, we have
\[\mathrm{A}_{j k}^{T} \mathrm{~A}_{k l} c_{l}=y_{i} \mathrm{~A}_{i j} \nonumber \]
or
\[\mathrm{A}_{j k}^{T} \mathrm{~A}_{k l} c_{l}=\mathrm{A}_{j i}^{T} y_{i} \nonumber \]
In vector notation, we have
\[\mathrm{A}^{T} \mathrm{~A} \mathbf{c}=\mathrm{A}^{T} \mathbf{y} . \nonumber \]
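For example, for the straight-line fit with \(m=2\), \(f_{1}=1\) and \(f_{2}=x\), this system reduces to the familiar two-by-two system
\[\left(\begin{array}{cc} n & \sum_{i=1}^{n} x_{i} \\ \sum_{i=1}^{n} x_{i} & \sum_{i=1}^{n} x_{i}^{2} \end{array}\right)\left(\begin{array}{l} c_{1} \\ c_{2} \end{array}\right)=\left(\begin{array}{c} \sum_{i=1}^{n} y_{i} \\ \sum_{i=1}^{n} x_{i} y_{i} \end{array}\right). \nonumber \]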
The equation \(\mathrm{A}^{T} \mathrm{A} \mathbf{c}=\mathrm{A}^{T} \mathbf{y}\) is the so-called normal equation, and can be solved for \(\mathbf{c}\) by Gaussian elimination using the MATLAB backslash operator. After constructing the matrix \(\mathrm{A}\) and the vector \(\mathbf{y}\) from the data, one can code in MATLAB
c = (A'*A)\(A'*y);
But in fact the MATLAB backslash operator will automatically compute the least-squares solution when the matrix \(\mathrm{A}\) is not square, so that the MATLAB code
c = A\y;
yields the same result.
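As an end-to-end illustration (the data and variable names below are invented for this sketch, not taken from the text), one might fit a cubic to noisy samples and check that the two MATLAB solution methods agree:

% generate some illustrative data: noisy samples of a cubic
n = 50;
x = linspace(0, 2, n)';                    % column vector of abscissas
y = 1 - 2*x + 0.5*x.^3 + 0.1*randn(n, 1);  % ordinates with added noise

% design matrix for f1 = 1, f2 = x, f3 = x^2, f4 = x^3
A = [ones(n, 1), x, x.^2, x.^3];

% solve the normal equations directly ...
c_normal = (A'*A)\(A'*y);

% ... or let backslash compute the least-squares solution
c_backslash = A\y;

% the two coefficient vectors should agree to within rounding error
disp([c_normal, c_backslash])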