1.5: Linear and Affine Functions

Last updated: Sep 2, 2021


One of the central themes of calculus is the approximation of nonlinear functions by linear functions, with the fundamental concept being the derivative of a function. This section introduces linear and affine functions, which will be key to understanding derivatives in the chapters ahead.

Linear functions

In the following, we will use the notation $f: \mathbb{R}^{m} \rightarrow \mathbb{R}^{n}$ to indicate a function whose domain is a subset of $\mathbb{R}^{m}$ and whose range is a subset of $\mathbb{R}^{n}$ . In other words, $f$ takes a vector with $m$ coordinates for input and returns a vector with $n$ coordinates. For example, the function

$f(x, y, z)=\left(\sin (x+y), 2 x^{2}+z\right) \nonumber$

is a function from $\mathbb{R}^{3}$ to $\mathbb{R}^{2}$ .

Definition $\PageIndex{1}$

We say a function $L: \mathbb{R}^{m} \rightarrow \mathbb{R}^{n}$ is linear if (1) for any vectors $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^{m}$ ,

$L(\mathbf{x}+\mathbf{y})=L(\mathbf{x})+L(\mathbf{y}),$

and (2) for any vector $\mathbf{x}$ in $\mathbb{R}^{m}$ and scalar $a$ ,

$L(a \mathbf{x})=a L(\mathbf{x}).$

Example $\PageIndex{1}$

Suppose $f: \mathbb{R} \rightarrow \mathbb{R}$ is defined by $f(x)=3 x$ . Then for any $x$ and $y$ in $\mathbb{R}$ ,

$f(x+y)=3(x+y)=3 x+3 y=f(x)+f(y), \nonumber$

and for any scalar $a$ ,

$f(a x)=3 a x=a f(x). \nonumber$

Thus $f$ is linear.

Example $\PageIndex{2}$

Suppose $L: \mathbb{R}^{2} \rightarrow \mathbb{R}^{3}$ is defined by

$L\left(x_{1}, x_{2}\right)=\left(2 x_{1}+3 x_{2}, x_{1}-x_{2}, 4 x_{2}\right). \nonumber$

Then if $\mathbf{x}=\left(x_{1}, x_{2}\right)$ and $\mathbf{y}=\left(y_{1}, y_{2}\right)$ are vectors in $\mathbb{R}^2$ ,

$\begin{aligned} L(\mathbf{x}+\mathbf{y}) &=L\left(x_{1}+y_{1}, x_{2}+y_{2}\right) \\ &=\left(2\left(x_{1}+y_{1}\right)+3\left(x_{2}+y_{2}\right), x_{1}+y_{1}-\left(x_{2}+y_{2}\right), 4\left(x_{2}+y_{2}\right)\right) \\ &=\left(2 x_{1}+3 x_{2}, x_{1}-x_{2}, 4 x_{2}\right)+\left(2 y_{1}+3 y_{2}, y_{1}-y_{2}, 4 y_{2}\right) \\ &=L\left(x_{1}, x_{2}\right)+L\left(y_{1}, y_{2}\right) \\ &=L(\mathbf{x})+L(\mathbf{y}). \end{aligned}$

Also, for $\mathbf{x}=\left(x_{1}, x_{2}\right)$ and any scalar $a$ , we have

$\begin{aligned} L(a \mathbf{x}) &=L\left(a x_{1}, a x_{2}\right) \\ &=\left(2 a x_{1}+3 a x_{2}, a x_{1}-a x_{2}, 4 a x_{2}\right) \\ &=a\left(2 x_{1}+3 x_{2}, x_{1}-x_{2}, 4 x_{2}\right) \\ &=a L(\mathbf{x}). \end{aligned}$

Thus $L$ is linear.
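The two conditions in the definition can also be spot-checked numerically. The following Python sketch evaluates both sides of each condition for the function $L$ of this example on one pair of integer vectors. The helper names `add` and `scale` and the sample values are our own choices, not part of the text, and agreement on samples is a sanity check, not a proof.

```python
def L(x):
    """The function of the example: L(x1, x2) = (2*x1 + 3*x2, x1 - x2, 4*x2)."""
    x1, x2 = x
    return (2 * x1 + 3 * x2, x1 - x2, 4 * x2)

def add(u, v):
    """Componentwise sum of two vectors of equal length (hypothetical helper)."""
    return tuple(p + q for p, q in zip(u, v))

def scale(a, u):
    """Scalar multiple a*u (hypothetical helper)."""
    return tuple(a * c for c in u)

# Arbitrary sample vectors and scalar, chosen only for illustration.
x, y, a = (1, 2), (-3, 5), 7

additive = L(add(x, y)) == add(L(x), L(y))       # condition (1)
homogeneous = L(scale(a, x)) == scale(a, L(x))   # condition (2)
print(additive, homogeneous)  # True True
```

Integer inputs are used so that the comparisons are exact; with floating-point inputs a tolerance-based comparison would be more appropriate.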

Now suppose $L: \mathbb{R} \rightarrow \mathbb{R}$ is a linear function and let $a=L(1)$ . Then for any real number $x$ ,

$L(x)=L(1 x)=x L(1)=a x.$

Since any function $L: \mathbb{R} \rightarrow \mathbb{R}$ defined by $L(x)=a x$ , where $a$ is a scalar, is linear (see Exercise 1), it follows that the only functions $L: \mathbb{R} \rightarrow \mathbb{R}$ which are linear are those of the form $L(x)=a x$ for some real number $a$ . For example, $f(x)=5 x$ is a linear function, but $g(x)=\sin (x)$ is not.

Next, suppose $L: \mathbb{R}^{m} \rightarrow \mathbb{R}$ is linear and let $a_{1}=L\left(\mathbf{e}_{1}\right), a_{2}=L\left(\mathbf{e}_{2}\right), \ldots, a_{m}=L\left(\mathbf{e}_{m}\right)$ . If $\mathbf{x}=\left(x_{1}, x_{2}, \ldots, x_{m}\right)$ is a vector in $\mathbb{R}^{m}$ , then we know that

$\mathbf{x}=x_{1} \mathbf{e}_{1}+x_{2} \mathbf{e}_{2}+\cdots+x_{m} \mathbf{e}_{m}. \nonumber$

Thus

$\begin{align} L(\mathbf{x}) &=L\left(x_{1} \mathbf{e}_{1}+x_{2} \mathbf{e}_{2}+\cdots+x_{m} \mathbf{e}_{m}\right) \nonumber \\ &=L\left(x_{1} \mathbf{e}_{1}\right)+L\left(x_{2} \mathbf{e}_{2}\right)+\cdots+L\left(x_{m} \mathbf{e}_{m}\right) \nonumber\\ &=x_{1} L\left(\mathbf{e}_{1}\right)+x_{2} L\left(\mathbf{e}_{2}\right)+\cdots+x_{m} L\left(\mathbf{e}_{m}\right) \label{} \\ &=x_{1} a_{1}+x_{2} a_{2}+\cdots+x_{m} a_{m} \nonumber\\ &=\mathbf{a} \cdot \mathbf{x}, \nonumber \end{align}$

where $\mathbf{a}=\left(a_{1}, a_{2}, \ldots, a_{m}\right)$ . Since for any vector $\mathbf{a}$ in $\mathbb{R}^m$ , the function $L(\mathbf{x})=\mathbf{a} \cdot \mathbf{x}$ is linear (see Exercise 1), it follows that the only functions $L: \mathbb{R}^{m} \rightarrow \mathbb{R}$ which are linear are those of the form $L(\mathbf{x})=\mathbf{a} \cdot \mathbf{x}$ for some fixed vector $\mathbf{a}$ in $\mathbb{R}^m$ . For example,

$f(x, y)=(2,-3) \cdot(x, y)=2 x-3 y \nonumber$

is a linear function from $\mathbb{R}^2$ to $\mathbb{R}$ , but

$f(x, y, z)=x^{2} y+\sin (z) \nonumber$

is not a linear function from $\mathbb{R}^3$ to $\mathbb{R}$ .
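The derivation above says that the coefficient vector $\mathbf{a}$ of a linear function $L: \mathbb{R}^{m} \rightarrow \mathbb{R}$ can be read off by evaluating $L$ on the standard basis vectors. As a small illustration, the Python sketch below does this for $f(x, y)=2 x-3 y$. The helper names `standard_basis` and `dot` and the sample point are our own, introduced only for this sketch.

```python
def standard_basis(m):
    """Return e_1, ..., e_m as tuples (hypothetical helper)."""
    return [tuple(1 if i == j else 0 for i in range(m)) for j in range(m)]

def dot(u, v):
    """Dot product of two vectors of equal length (hypothetical helper)."""
    return sum(p * q for p, q in zip(u, v))

def f(v):
    """The linear function from the text: f(x, y) = 2x - 3y."""
    x, y = v
    return 2 * x - 3 * y

# a_j = f(e_j) recovers the coefficient vector a.
a = tuple(f(e) for e in standard_basis(2))
print(a)  # (2, -3)

# f(x) and a . x agree at a sample point chosen for illustration.
x = (4, 5)
print(f(x), dot(a, x))  # -7 -7
```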

Now consider the general case where $L: \mathbb{R}^{m} \rightarrow \mathbb{R}^{n}$ is a linear function. Given a vector $\mathbf{x}$ in $\mathbb{R}^{m}$ , let $L_{k}(\mathbf{x})$ be the $k$ th coordinate of $L(\mathbf{x}), k=1,2, \ldots, n$ . That is,

$L(\mathbf{x})=\left(L_{1}(\mathbf{x}), L_{2}(\mathbf{x}), \ldots, L_{n}(\mathbf{x})\right). \nonumber$

Since $L$ is linear, for any $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^m$ we have

$L(\mathbf{x}+\mathbf{y})=L(\mathbf{x})+L(\mathbf{y}), \nonumber$

or, in terms of the coordinate functions,

$\begin{aligned} \left(L_{1}(\mathbf{x}+\mathbf{y}), L_{2}(\mathbf{x}+\mathbf{y}), \ldots, L_{n}(\mathbf{x}+\mathbf{y})\right) &=\left(L_{1}(\mathbf{x}), L_{2}(\mathbf{x}), \ldots, L_{n}(\mathbf{x})\right)+\left(L_{1}(\mathbf{y}), L_{2}(\mathbf{y}), \ldots, L_{n}(\mathbf{y})\right) \\ &=\left(L_{1}(\mathbf{x})+L_{1}(\mathbf{y}), L_{2}(\mathbf{x})+L_{2}(\mathbf{y}), \ldots, L_{n}(\mathbf{x})+L_{n}(\mathbf{y})\right). \end{aligned}$

Hence $L_{k}(\mathbf{x}+\mathbf{y})=L_{k}(\mathbf{x})+L_{k}(\mathbf{y})$ for $k=1,2, \ldots, n$ . Similarly, if $\mathbf{x}$ is in $\mathbb{R}^{m}$ and $a$ is a scalar, then $L(a \mathbf{x})=a L(\mathbf{x})$ , so

$\begin{aligned} \left(L_{1}(a \mathbf{x}), L_{2}(a \mathbf{x}), \ldots, L_{n}(a \mathbf{x})\right) &=a\left(L_{1}(\mathbf{x}), L_{2}(\mathbf{x}), \ldots, L_{n}(\mathbf{x})\right) \\ &=\left(a L_{1}(\mathbf{x}), a L_{2}(\mathbf{x}), \ldots, a L_{n}(\mathbf{x})\right). \end{aligned}$

Hence $L_{k}(a \mathbf{x})=a L_{k}(\mathbf{x})$ for $k=1,2, \ldots, n$ . Thus for each $k=1,2, \ldots, n, L_{k}: \mathbb{R}^{m} \rightarrow \mathbb{R}$ is a linear function. It follows from our work above that, for each $k=1,2, \ldots, n$ , there is a fixed vector $\mathbf{a}_{k}$ in $\mathbb{R}^{m}$ such that $L_{k}(\mathbf{x})=\mathbf{a}_{k} \cdot \mathbf{x}$ for all $\mathbf{x}$ in $\mathbb{R}^{m}$ . Hence we have

$L(\mathbf{x})=\left(\mathbf{a}_{1} \cdot \mathbf{x}, \mathbf{a}_{2} \cdot \mathbf{x}, \ldots, \mathbf{a}_{n} \cdot \mathbf{x}\right) \label{1.5.5}$

for all $\mathbf{x}$ in $\mathbb{R}^m$ . Since any function defined as in ( $\ref{1.5.5}$ ) is linear (see Exercise 1 again), it follows that the only linear functions from $\mathbb{R}^m$ to $\mathbb{R}^n$ must be of this form.

Theorem $\PageIndex{1}$

If $L: \mathbb{R}^{m} \rightarrow \mathbb{R}^{n}$ is linear, then there exist vectors $\mathbf{a}_{1}, \mathbf{a}_{2}, \ldots, \mathbf{a}_{n}$ in $\mathbb{R}^{m}$ such that

$L(\mathbf{x})=\left(\mathbf{a}_{1} \cdot \mathbf{x}, \mathbf{a}_{2} \cdot \mathbf{x}, \ldots, \mathbf{a}_{n} \cdot \mathbf{x}\right) \label{1.5.6}$

for all $\mathbf{x}$ in $\mathbb{R}^{m}$ .

Example $\PageIndex{3}$

In a previous example, we showed that the function $L: \mathbb{R}^{2} \rightarrow \mathbb{R}^{3}$ defined by

$L\left(x_{1}, x_{2}\right)=\left(2 x_{1}+3 x_{2}, x_{1}-x_{2}, 4 x_{2}\right) \nonumber$

is linear. We can see this more easily now by noting that

$L\left(x_{1}, x_{2}\right)=\left((2,3) \cdot\left(x_{1}, x_{2}\right),(1,-1) \cdot\left(x_{1}, x_{2}\right),(0,4) \cdot\left(x_{1}, x_{2}\right)\right). \nonumber$

Example $\PageIndex{4}$

The function

$f(x, y, z)=(x+y, \sin (x+y+z)) \nonumber$

is not linear since it cannot be written in the form of ( $\ref{1.5.6}$ ). In particular, the function $f_{2}(x, y, z)=\sin (x+y+z)$ is not linear; from our work above, it follows that $f$ is not linear.
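To see the failure concretely, it is enough to find one vector and one scalar for which $f(a \mathbf{x}) \neq a f(\mathbf{x})$. The Python sketch below tries $\mathbf{x}=\left(\frac{\pi}{2}, 0,0\right)$ and $a=2$. This particular choice is ours, and any choice at which $\sin$ fails to scale would serve equally well.

```python
import math

def f(v):
    """The function of the example: f(x, y, z) = (x + y, sin(x + y + z))."""
    x, y, z = v
    return (x + y, math.sin(x + y + z))

# Sample vector and scalar, chosen only for illustration.
v = (math.pi / 2, 0.0, 0.0)
a = 2.0

lhs = f(tuple(a * c for c in v))   # f(a v)
rhs = tuple(a * c for c in f(v))   # a f(v)

# The first coordinates agree (both are pi), but the second coordinates
# are approximately 0 and 2 respectively, so f(a v) != a f(v).
print(lhs[1], rhs[1])
```

The first coordinate function $f_{1}(x, y, z)=x+y$ passes the same test, which matches the observation that only $f_{2}$ is responsible for the failure.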

Matrix Notation

We will now develop some notation to simplify working with expressions such as ( $\ref{1.5.6}$ ). First, we define an $n \times m$ matrix to be an array of real numbers with $n$ rows and $m$ columns. For example,

$M=\left[\begin{array}{rr} 2 & 3 \\ 1 & -1 \\ 0 & 4 \end{array}\right] \nonumber$

is a $3 \times 2$ matrix. Next, we will identify a vector $\mathbf{x}=\left(x_{1}, x_{2}, \ldots, x_{m}\right)$ in $\mathbb{R}^{m}$ with the $m \times 1$ matrix

$\mathbf{x}=\left[\begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{m} \end{array}\right], \nonumber$

which is called a column vector. Now define the product $M \mathbf{x}$ of an $n \times m$ matrix $M$ with an $m \times 1$ column vector $\mathbf{x}$ to be the $n \times 1$ column vector whose $k$ th entry, $k=1,2, \ldots, n$ , is the dot product of the $k$ th row of $M$ with $\mathbf{x}$ . For example,

$\left[\begin{array}{rr} 2 & 3 \\ 1 & -1 \\ 0 & 4 \end{array}\right]\left[\begin{array}{l} 2 \\ 1 \end{array}\right]=\left[\begin{array}{l} 4+3 \\ 2-1 \\ 0+4 \end{array}\right]=\left[\begin{array}{l} 7 \\ 1 \\ 4 \end{array}\right]. \nonumber$

In fact, for any vector $\mathbf{x}=\left(x_{1}, x_{2}\right)$ in $\mathbb{R}^{2}$ ,

$\left[\begin{array}{rr} 2 & 3 \\ 1 & -1 \\ 0 & 4 \end{array}\right]\left[\begin{array}{l} x_{1} \\ x_{2} \end{array}\right]=\left[\begin{array}{c} 2 x_{1}+3 x_{2} \\ x_{1}-x_{2} \\ 4 x_{2} \end{array}\right]. \nonumber$

In other words, if we let

$L\left(x_{1}, x_{2}\right)=\left(2 x_{1}+3 x_{2}, x_{1}-x_{2}, 4 x_{2}\right), \nonumber$

as in a previous example, then, using column vectors, we could write

$L\left(x_{1}, x_{2}\right)=\left[\begin{array}{cc} 2 & 3 \\ 1 & -1 \\ 0 & 4 \end{array}\right]\left[\begin{array}{l} x_{1} \\ x_{2} \end{array}\right]. \nonumber$
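The definition of the product $M \mathbf{x}$ translates directly into code: each entry of the result is the dot product of one row of $M$ with $\mathbf{x}$. The following Python sketch stores a matrix as a list of rows. That representation and the names `dot` and `matvec` are assumptions made for this illustration.

```python
def dot(u, v):
    """Dot product of two vectors of equal length (hypothetical helper)."""
    return sum(p * q for p, q in zip(u, v))

def matvec(M, x):
    """Product of an n x m matrix M (a list of n rows) with a vector x of length m.

    The k-th entry of the result is the dot product of the k-th row of M with x.
    """
    if any(len(row) != len(x) for row in M):
        raise ValueError("each row of M must have as many entries as x")
    return [dot(row, x) for row in M]

M = [[2, 3],
     [1, -1],
     [0, 4]]

print(matvec(M, [2, 1]))  # [7, 1, 4]
```

The size check reflects the requirement that an $n \times m$ matrix can only multiply a column vector with $m$ entries.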

In general, consider a linear function $L: \mathbb{R}^{m} \rightarrow \mathbb{R}^{n}$ defined by

$L(\mathbf{x})=\left(\mathbf{a}_{1} \cdot \mathbf{x}, \mathbf{a}_{2} \cdot \mathbf{x}, \ldots, \mathbf{a}_{n} \cdot \mathbf{x}\right)$

for some vectors $\mathbf{a}_{1}, \mathbf{a}_{2}, \ldots, \mathbf{a}_{n}$ in $\mathbb{R}^{m}$ . If we let $M$ be the $n \times m$ matrix whose $k$ th row is $\mathbf{a}_{k}, k=1,2, \ldots, n$ , then

$L(\mathbf{x})=M \mathbf{x}$

for any $\mathbf{x}$ in $\mathbb{R}^m$ . Now, from our work above,

$\mathbf{a}_{k}=\left(L_{k}\left(\mathbf{e}_{1}\right), L_{k}\left(\mathbf{e}_{2}\right), \ldots, L_{k}\left(\mathbf{e}_{m}\right)\right),$

which means that the $j$ th column of $M$ is

$\left[\begin{array}{c} L_{1}\left(\mathbf{e}_{j}\right) \\ L_{2}\left(\mathbf{e}_{j}\right) \\ \vdots \\ L_{n}\left(\mathbf{e}_{j}\right) \end{array}\right], \label{1.5.10}$

$j=1,2, \ldots, m$ . But ( $\ref{1.5.10}$ ) is just $L\left(\mathbf{e}_{j}\right)$ written as a column vector. Hence $M$ is the matrix whose columns are given by the column vectors $L\left(\mathbf{e}_{1}\right), L\left(\mathbf{e}_{2}\right), \ldots, L\left(\mathbf{e}_{m}\right)$ .

Theorem $\PageIndex{2}$

Suppose $L: \mathbb{R}^{m} \rightarrow \mathbb{R}^{n}$ is a linear function and $M$ is the $n \times m$ matrix whose $j$ th column is $L\left(\mathbf{e}_{j}\right), j=1,2, \ldots, m$ . Then for any vector $\mathbf{x}$ in $\mathbb{R}^m$ ,

$L(\mathbf{x})=M \mathbf{x}.$

Example $\PageIndex{5}$

Suppose $L: \mathbb{R}^{3} \rightarrow \mathbb{R}^{2}$ is defined by

$L(x, y, z)=(3 x-2 y+z, 4 x+y). \nonumber$

Then

$\begin{aligned} &L\left(\mathbf{e}_{1}\right)=L(1,0,0)=(3,4), \\ &L\left(\mathbf{e}_{2}\right)=L(0,1,0)=(-2,1), \end{aligned}$

and

$L\left(\mathbf{e}_{3}\right)=L(0,0,1)=(1,0). \nonumber$

So if we let

$M=\left[\begin{array}{rrr} 3 & -2 & 1 \\ 4 & 1 & 0 \end{array}\right], \nonumber$

then

$L(x, y, z)=\left[\begin{array}{rrr} 3 & -2 & 1 \\ 4 & 1 & 0 \end{array}\right]\left[\begin{array}{l} x \\ y \\ z \end{array}\right]. \nonumber$

For example,

$L(1,-1,3)=\left[\begin{array}{rrr} 3 & -2 & 1 \\ 4 & 1 & 0 \end{array}\right]\left[\begin{array}{r} 1 \\ -1 \\ 3 \end{array}\right]=\left[\begin{array}{l} 3+2+3 \\ 4-1+0 \end{array}\right]=\left[\begin{array}{l} 8 \\ 3 \end{array}\right]. \nonumber$
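The recipe in the theorem, namely taking $L\left(\mathbf{e}_{j}\right)$ as the $j$ th column of $M$, can be carried out mechanically. The Python sketch below does so for the function $L$ of this example and then compares $M \mathbf{x}$ with $L(\mathbf{x})$ at $(1,-1,3)$. The names `matrix_of` and `matvec` are hypothetical helpers written for this sketch, and the matrix is stored as a list of rows.

```python
def L(v):
    """The function of the example: L(x, y, z) = (3x - 2y + z, 4x + y)."""
    x, y, z = v
    return (3 * x - 2 * y + z, 4 * x + y)

def matrix_of(func, m):
    """Rows of the matrix whose j-th column is func(e_j), j = 1, ..., m."""
    columns = [func(tuple(1 if i == j else 0 for i in range(m))) for j in range(m)]
    # zip(*columns) transposes the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]

def matvec(M, x):
    """Matrix-vector product: dot product of each row of M with x."""
    return tuple(sum(p * q for p, q in zip(row, x)) for row in M)

M = matrix_of(L, 3)
print(M)  # [[3, -2, 1], [4, 1, 0]]
print(matvec(M, (1, -1, 3)), L((1, -1, 3)))  # (8, 3) (8, 3)
```

Note that `matrix_of` reproduces the function only when the function passed in is linear; applied to a nonlinear function it still returns a matrix, but that matrix need not agree with the function away from the basis vectors.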

Example $\PageIndex{6}$

Let $R_{\theta}: \mathbb{R}^{2} \rightarrow \mathbb{R}^{2}$ be the function that rotates a vector $\mathbf{x}$ in $\mathbb{R}^2$ counterclockwise through an angle $\theta$ , as shown in Figure 1.5.1. Geometrically, it seems reasonable that $R_\theta$ is a linear function; that is, rotating the vector $\mathbf{x}+\mathbf{y}$ through an angle $\theta$ should give the same result as first rotating $\mathbf{x}$ and $\mathbf{y}$ separately through an angle $\theta$ and then adding, and rotating a vector $a \mathbf{x}$ through an angle $\theta$ should give the same result as first rotating $\mathbf{x}$ through an angle $\theta$ and then multiplying by $a$ . Now, from the definition of $\cos(\theta)$ and $\sin(\theta)$ ,

$R_{\theta}\left(\mathbf{e}_{1}\right)=R_{\theta}(1,0)=(\cos (\theta), \sin (\theta)) \nonumber$

(see Figure 1.5.2), and, since $\mathbf{e}_{2}$ is $\mathbf{e}_{1}$ rotated, counterclockwise, through an angle $\frac{\pi}{2}$ ,

$R_{\theta}\left(\mathbf{e}_{2}\right)=R_{\theta+\frac{\pi}{2}}\left(\mathbf{e}_{1}\right)=\left(\cos \left(\theta+\frac{\pi}{2}\right), \sin \left(\theta+\frac{\pi}{2}\right)\right)=(-\sin (\theta), \cos (\theta)). \nonumber$

Hence

$R_{\theta}(x, y)=\left[\begin{array}{rr} \cos (\theta) & -\sin (\theta) \\ \sin (\theta) & \cos (\theta) \end{array}\right]\left[\begin{array}{l} x \\ y \end{array}\right]. \label{1.5.12}$

Figure $\PageIndex{1}$ : Rotating a vector in the plane

Figure $\PageIndex{2}$ : Rotating $\mathbf{e}_{1}$ through an angle $\theta$

You are asked in Exercise 9 to verify that the linear function defined in ( $\ref{1.5.12}$ ) does in fact rotate vectors through an angle $\theta$ in the counterclockwise direction. Note that, for example, when $\theta=\frac{\pi}{2}$ , we have

$R_{\frac{\pi}{2}}(x, y)=\left[\begin{array}{rr} 0 & -1 \\ 1 & 0 \end{array}\right]\left[\begin{array}{l} x \\ y \end{array}\right]. \nonumber$

In particular, note that $R_{\frac{\pi}{2}}(1,0)=(0,1)$ and $R_{\frac{\pi}{2}}(0,1)=(-1,0)$ ; that is, $R_{\frac{\pi}{2}}$ takes $\mathbf{e}_{1}$ to $\mathbf{e}_{2}$ and $\mathbf{e}_{2}$ to $-\mathbf{e}_{1}$ . For another example, if $\theta=\frac{\pi}{6}$ , then

$R_{\frac{\pi}{6}}(x, y)=\left[\begin{array}{cc} \frac{\sqrt{3}}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{\sqrt{3}}{2} \end{array}\right]\left[\begin{array}{l} x \\ y \end{array}\right]. \nonumber$

In particular,

$R_{\frac{\pi}{6}}(1,2)=\left[\begin{array}{cc} \frac{\sqrt{3}}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{\sqrt{3}}{2} \end{array}\right]\left[\begin{array}{l} 1 \\ 2 \end{array}\right]=\left[\begin{array}{c} \frac{\sqrt{3}}{2}-1 \\ \frac{1}{2}+\sqrt{3} \end{array}\right]=\left[\begin{array}{c} \frac{\sqrt{3}-2}{2} \\ \frac{1+2 \sqrt{3}}{2} \end{array}\right]. \nonumber$
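The rotation matrix can be checked numerically against the values computed in this example. The Python sketch below writes out the matrix product by hand. The function name `rotate` is our own, and because the computation uses floating-point arithmetic the results are compared with a tolerance, not for exact equality.

```python
import math

def rotate(theta, v):
    """Apply the rotation matrix [[cos, -sin], [sin, cos]] for angle theta to v = (x, y)."""
    x, y = v
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)

# Rotating e_1 through pi/2 gives approximately e_2 = (0, 1).
print(rotate(math.pi / 2, (1.0, 0.0)))

# Compare R_{pi/6}(1, 2) with the closed form from the example.
p = rotate(math.pi / 6, (1.0, 2.0))
expected = ((math.sqrt(3) - 2) / 2, (1 + 2 * math.sqrt(3)) / 2)
print(all(math.isclose(u, w) for u, w in zip(p, expected)))  # True
```

A rotation should also preserve length, which is easy to confirm here: `math.hypot(*p)` agrees with `math.hypot(1.0, 2.0)` up to rounding.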

Affine functions

Definition $\PageIndex{2}$

We say a function $A: \mathbb{R}^{m} \rightarrow \mathbb{R}^{n}$ is affine if there is a linear function $L : \mathbb{R}^{m} \rightarrow \mathbb{R}^{n}$ and a vector $\mathbf{b}$ in $\mathbb{R}^n$ such that

$A(\mathbf{x})=L(\mathbf{x})+\mathbf{b}$

for all $\mathbf{x}$ in $\mathbb{R}^m$ .

An affine function is just a linear function plus a translation. From our knowledge of linear functions, it follows that if $A: \mathbb{R}^{m} \rightarrow \mathbb{R}^{n}$ is affine, then there is an $n \times m$ matrix $M$ and a vector $\mathbf{b}$ in $\mathbb{R}^n$ such that

$A(\mathbf{x})=M \mathbf{x}+\mathbf{b}$

for all $\mathbf{x}$ in $\mathbb{R}^m$ . In particular, if $f: \mathbb{R} \rightarrow \mathbb{R}$ is affine, then there are real numbers $m$ and $b$ such that

$f(x)=m x+b$

for all real numbers $x$ .

Example $\PageIndex{7}$

The function

$A(x, y)=(2 x+3, y-4 x+1) \nonumber$

is an affine function from $\mathbb{R}^{2}$ to $\mathbb{R}^{2}$ since we may write it in the form

$A(x, y)=L(x, y)+(3,1), \nonumber$

where $L$ is the linear function

$L(x, y)=(2 x, y-4 x). \nonumber$

Note that $L(1,0)=(2,-4)$ and $L(0,1)=(0,1)$ , so we may also write $A$ in the form

$A(x, y)=\left[\begin{array}{rr} 2 & 0 \\ -4 & 1 \end{array}\right]\left[\begin{array}{l} x \\ y \end{array}\right]+\left[\begin{array}{l} 3 \\ 1 \end{array}\right] . \nonumber$
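The matrix $M$ and the vector $\mathbf{b}$ can be recovered from $A$ alone. A linear function satisfies $L(\mathbf{0})=L(0 \mathbf{x})=0 L(\mathbf{x})=\mathbf{0}$, so $\mathbf{b}=A(\mathbf{0})$, and the $j$ th column of $M$ is $A\left(\mathbf{e}_{j}\right)-\mathbf{b}$. The Python sketch below applies this to the function $A$ of this example. The variable names are ours, and the matrix is stored as a list of rows.

```python
def A(v):
    """The affine function of the example: A(x, y) = (2x + 3, y - 4x + 1)."""
    x, y = v
    return (2 * x + 3, y - 4 * x + 1)

# Translation part: A(0) = L(0) + b = b.
b = A((0, 0))

# Linear part: the j-th column of M is L(e_j) = A(e_j) - b.
columns = [tuple(p - q for p, q in zip(A(e), b)) for e in [(1, 0), (0, 1)]]
M = [list(row) for row in zip(*columns)]  # transpose columns into rows

print(b)  # (3, 1)
print(M)  # [[2, 0], [-4, 1]]
```

Since $A(\mathbf{0})=(3,1) \neq \mathbf{0}$, this also shows that $A$ itself is affine but not linear.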

Example $\PageIndex{8}$

The affine function

$A(x, y)=\left[\begin{array}{cc} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{array}\right]\left[\begin{array}{l} x \\ y \end{array}\right]+\left[\begin{array}{l} 1 \\ 2 \end{array}\right] \nonumber$

first rotates a vector, counterclockwise, in $\mathbb{R}^{2}$ through an angle of $\frac{\pi}{4}$ and then translates it by the vector $(1,2)$ .
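The two steps, rotation followed by translation, can be written out separately. The Python sketch below evaluates the function $A$ of this example at two sample points of our own choosing: the origin, which a rotation fixes and which is therefore sent to $(1,2)$, and $(1,1)$, which the rotation sends to $(0, \sqrt{2})$.

```python
import math

def A(v):
    """The affine function of the example: rotate through pi/4, then translate by (1, 2)."""
    x, y = v
    r = 1 / math.sqrt(2)
    rotated = (r * x - r * y, r * x + r * y)   # rotation through pi/4
    return (rotated[0] + 1, rotated[1] + 2)    # translation by (1, 2)

print(A((0.0, 0.0)))  # (1.0, 2.0)
print(A((1.0, 1.0)))  # approximately (1, 2 + sqrt(2))
```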
