3.1 Linear Transformations
Introduction
Until now we have used matrices in the context of linear systems. The equation \[ A\mathbf{x} = \mathbf{b}, \nonumber\] where \(A\) is an \(m \times n\) matrix, is just a concise way to write down a system of \(m\) linear equations in \(n\) unknowns. A different way to look at this matrix equation is to consider it as an input-output system: the left-hand side \(A\mathbf{x}\) can be seen as a mapping that sends an 'input' \(\mathbf{x}\) to an 'output' \(\mathbf{y} = A\mathbf{x}\). For instance, in computer graphics, points describing a 3D object typically have to be converted to points in 2D in order to visualize them on a screen. Or, in a dynamical system, a matrix \(A\) may describe how a system evolves from a 'state' \(\mathbf{x}_{k}\) at time \(k\) to a state \(\mathbf{x}_{k+1}\) at time \(k+1\) via \[ \mathbf{x}_{k+1} = A\mathbf{x}_{k}. \nonumber\] A 'state' may be anything ranging from a set of particles at certain positions, a set of pixels describing a minion, or concentrations of chemical substances in a reactor tank, to population sizes of different species. Mathematically we describe such an input-output interpretation as a transformation (or: function, map, mapping, operator) \[ T: \mathbb{R}^n \to \mathbb{R}^m. \nonumber\] We will see that these matrix transformations have two characteristic properties, which make them the protagonists of the more general linear algebra concept of a linear transformation.

Matrix transformations
Let \(A\) be an \(m\times n\) matrix. We can in a natural way associate a transformation \(T_A:\mathbb{R}^n \to \mathbb{R}^m\) with the matrix \(A\).

The transformation \(T_A\) corresponding to the \(m\times n\) matrix \(A\) is the mapping defined by \[ T_A(\mathbf{x}) = A\mathbf{x} \quad \text{or} \quad T_A:\mathbf{x} \mapsto A\mathbf{x}, \nonumber\] where \(\mathbf{x} \in \mathbb{R}^n\). We call such a mapping a matrix transformation. Conversely, we say that the matrix \(A\) represents the transformation \(T_A\).
The transformation corresponding to the matrix \(A = \left[\begin{array}{rrr} 1 & 2 & 0\\ 1 & 2 & 1 \end{array}\right]\) is defined by \[ T_A(\mathbf{x}) = \left[\begin{array}{rrr} 1 & 2 & 0\\ 1 & 2 & 1 \end{array}\right]\mathbf{x}. \nonumber\] We have, for instance, \[ \left[\begin{array}{rrr} 1 & 2 & 0\\ 1 & 2 & 1 \end{array}\right] \left[\begin{array}{r} 1\\1\\1 \end{array}\right] = \left[\begin{array}{r} 3 \\ 4 \end{array}\right] \quad \text{and} \quad \left[\begin{array}{rrr} 1 & 2 & 0\\ 1 & 2 & 1 \end{array}\right] \left[\begin{array}{r} 2\\-1\\0 \end{array}\right] = \left[\begin{array}{r} 0\\ 0 \end{array}\right]. \nonumber\] According to the definition of the matrix-vector product we can also write \[ A\mathbf{x} = \left[\begin{array}{rrr} 1 & 2 & 0\\ 1 & 2 & 1 \end{array}\right] \left[\begin{array}{r} x_1\\x_2\\x_3 \end{array}\right] = x_1 \left[\begin{array}{r} 1\\ 1 \end{array}\right]+ x_2 \left[\begin{array}{r} 2 \\ 2 \end{array}\right]+ x_3 \left[\begin{array}{r} 0\\ 1 \end{array}\right]. \label{Eq:LinTrafo:AxIsLinearCombination} \]
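These products are easy to verify numerically. The following is a minimal Python sketch (the helper `mat_vec` is illustrative, not part of the text), computing \(A\mathbf{x}\) row by row:

```python
def mat_vec(A, x):
    """Matrix-vector product A x, with A given as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

A = [[1, 2, 0],
     [1, 2, 1]]

print(mat_vec(A, [1, 1, 1]))   # [3, 4]
print(mat_vec(A, [2, -1, 0]))  # [0, 0]
```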
From Equation \eqref{Eq:LinTrafo:AxIsLinearCombination} it is clear that the range of the matrix transformation in Example 2 consists of all linear combinations of the three columns of \(A\): \[ \text{Range}(T_A) = \operatorname{Span}\left\{\left[\begin{array}{r} 1\\ 1 \end{array}\right], \left[\begin{array}{r} 2 \\ 2 \end{array}\right], \left[\begin{array}{r} 0\\ 1 \end{array}\right]\right\}. \nonumber\] In a later chapter (on subspaces of \(\mathbb{R}^n\)) we will call this the column space of the matrix \(A\).
Suppose \[ A = \left[\begin{array}{cccc} \mathbf{a}_1 & \mathbf{a}_2 & \ldots & \mathbf{a}_n \end{array}\right] \nonumber\] is an \(m\times n\) matrix. Then the range of the matrix transformation corresponding to \(A\) is the span of the columns of \(A\): \[ \text{Range}(T_A) = \operatorname{Span}\{\mathbf{a}_1, \mathbf{a}_2,\ldots,\mathbf{a}_n\}. \nonumber\]
The matrix \[ A = \left[\begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{array}\right] \nonumber\] leads to the transformation \[ T_A: \mathbb{R}^2 \to \mathbb{R}^3, \quad T_A\left(\left[\begin{array}{r} x \\ y \end{array}\right]\right) = \left[\begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{array}\right] \left[\begin{array}{r} x \\ y \end{array}\right] = \left[\begin{array}{r} x \\ y \\0 \end{array}\right]. \nonumber\] This transformation 'embeds' the plane \(\mathbb{R}^2\) into the space \(\mathbb{R}^3\), as depicted in Figure 1.
Figure 1. The embedding of \(\mathbb{R}^2 \) into \(\mathbb{R}^3 \)
The range of this transformation is the span of the two vectors \[ \mathbf{e}_1 = \left[\begin{array}{r} 1\\ 0 \\ 0 \end{array}\right] \quad \text{and} \quad \mathbf{e}_2 = \left[\begin{array}{r} 0\\ 1 \\ 0 \end{array}\right], \nonumber\] which is the \(xy\)-plane in \(\mathbb{R}^3\).
The transformation corresponding to the matrix \[ A = \left[\begin{array}{rr} 1 & 1 \\ 0 & 0 \end{array}\right] \nonumber\] is the mapping \[ T_A: \mathbb{R}^2 \to \mathbb{R}^2, \quad T_A\left( \left[\begin{array}{r} x \\ y \end{array}\right]\right) = \left[\begin{array}{r} x +y \\ 0 \end{array}\right]. \nonumber\] First we observe that the range of this transformation consists of all multiples of the vector \(\left[\begin{array}{r} 1 \\ 0 \end{array}\right]\), i.e., the \(x\)-axis in the plane. Second, let us find the set of points/vectors that are mapped to an arbitrary point \(\left[\begin{array}{r} c \\ 0 \end{array}\right]\) in the range. For this we solve \[ A\mathbf{x} = \left[\begin{array}{rr} 1 & 1 \\ 0 & 0 \end{array}\right] \left[\begin{array}{r} x \\ y \end{array}\right] = \left[\begin{array}{r} c \\ 0 \end{array}\right] \quad \Longleftrightarrow \quad \left[\begin{array}{r} x+y \\ 0 \end{array}\right] = \left[\begin{array}{r} c \\ 0 \end{array}\right]. \nonumber\] The points whose coordinates satisfy this equation all lie on the line described by the equation \[ x + y = c. \nonumber\] See Figure 2.
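The preimage computation can be checked with a small Python sketch (the helper `T` is illustrative): every point on the line \(x + y = c\) is indeed sent to \((c, 0)\).

```python
def T(v):
    """The map (x, y) -> (x + y, 0) of this example."""
    x, y = v
    return [x + y, 0]

c = 4
for x in range(-3, 4):
    # every point (x, c - x) lies on the line x + y = c ...
    assert T([x, c - x]) == [c, 0]   # ... and is mapped to (c, 0)
```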
Figure 2. The transformation of Example 6
Find out whether the vectors \[ \mathbf{y}_1 = \left[\begin{array}{r} 2 \\ 1 \\ 0\end{array}\right] \quad \text{and} \quad \mathbf{y}_2 = \left[\begin{array}{r} 2 \\ 0 \\ 1\end{array}\right] \nonumber\] are in the range of the matrix transformation \[ T(\mathbf{x}) = A\mathbf{x} = \left[\begin{array}{rrr} 1 &1&1 \\ 1 &-1&3 \\ -1&2&-4 \end{array}\right]\mathbf{x}. \nonumber\]
Consider a model with two cities between which migrations take place over a fixed period of time. Say in a period of ten years 90% of the inhabitants of city \(A\) stay in city \(A\) and 10% move to city \(B\), while 80% of the inhabitants of city \(B\) stay and 20% move to city \(A\). The following table contains the relevant statistics: \[ \begin{array}{c | cc} & A & B \\ \hline A & 0.9 & 0.2 \\ B & 0.1 & 0.8 \\ \hline \end{array} \nonumber\] For instance, if at time 0 the population in city \(A\) amounts to 50 (thousand) and in city \(B\) live 100 (thousand) people, then at the end of one period the population in city \(A\) amounts to \[ 0.9 \times 50 + 0.2 \times 100 = 65. \nonumber\] Likewise for city \(B\). If we denote the population sizes after \(k\) periods by a vector \[ \mathbf{x}_k =\left[\begin{array}{r} x_k \\ y_k \end{array}\right] \nonumber\] it follows that \[ \left[\begin{array}{r} x_{k+1} \\ y_{k+1} \end{array}\right] = \left[\begin{array}{c} 0.9x_k + 0.2y_k \\ 0.1x_k + 0.8y_k \end{array}\right], \quad \text{i.e., } \mathbf{x}_{k+1} = \left[\begin{array}{rr} 0.9 & 0.2 \\ 0.1 & 0.8 \end{array}\right]\left[\begin{array}{r} x_k \\ y_k \end{array}\right] = M \mathbf{x}_{k}. \nonumber\] The \(M\) stands for migration matrix. Obviously this model can be generalized to a 'world' with any number of cities.
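One period of this migration model can be sketched in a few lines of Python (the function name `migrate` is illustrative, not from the text); applying the matrix \(M\) to a state vector gives the populations one period later, and the total population is conserved:

```python
def migrate(state):
    """One period of the model: x_{k+1} = M x_k, M = [[0.9, 0.2], [0.1, 0.8]]."""
    x, y = state
    return [0.9 * x + 0.2 * y, 0.1 * x + 0.8 * y]

state = [50.0, 100.0]    # thousands of inhabitants of A and B at time 0
state = migrate(state)   # populations after one ten-year period
print(state)
```

Iterating `migrate` simulates the long-run behaviour of the two populations.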
Linear Transformations
In the previous section we saw that the matrix transformation \(\mathbf{y}=A\mathbf{x}\) can also be seen as a mapping \(T(\mathbf{x}) = A\mathbf{x}\). This mapping has special properties that are defined in this section.

A linear transformation is a function \(T\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) that has the following properties:
- For all vectors \(\mathbf{v}_1, \mathbf{v}_2\) in \(\mathbb{R}^n\): \[ T(\mathbf{v}_1+\mathbf{v}_2) = T(\mathbf{v}_1) + T(\mathbf{v}_2). \nonumber\]
- For all vectors \(\mathbf{v}\) in \(\mathbb{R}^n\) and all scalars \(c\) in \(\mathbb{R}\): \[ T(c\mathbf{v}) = c T(\mathbf{v}). \nonumber\]
Show that a linear transformation from \(\mathbb{R}^n \) to \(\mathbb{R}^m \) always sends the zero vector to the zero vector, i.e. \[ \text{if } T:\mathbb{R}^n \to\mathbb{R}^m \text{ is a linear transformation, then } T(\mathbf{0 }_n) = \mathbf{0 }_m. \nonumber \nonumber\]
Consider the map \(T:\mathbb{R}^2\rightarrow\mathbb{R}^3\) that sends each vector \(\left[\begin{array}{r} x \\ y \end{array}\right]\) in \(\mathbb{R}^2\) to the vector \(\left[\begin{array}{r} x \\ y \\ 0 \end{array}\right]\) in \(\mathbb{R}^3\). Let us check that this is a linear map. For that, we need to check the two properties in the definition. For property (i) we take two vectors \[ \left[\begin{array}{r} x_1 \\ y_1 \end{array}\right] \quad \text{and} \quad \left[\begin{array}{r} x_2 \\ y_2 \end{array}\right] \quad \text{in } \mathbb{R}^2, \nonumber\] and see: \[ T\left( \left[\begin{array}{r} x_1 \\ y_1 \end{array}\right] + \left[\begin{array}{r} x_2 \\ y_2 \end{array}\right] \right) = T\left(\left[\begin{array}{r} x_1+x_2 \\ y_1+y_2 \end{array}\right]\right) = \left[\begin{array}{r} x_1 + x_2 \\ y_1 + y_2 \\ 0 \end{array}\right] = \left[\begin{array}{r} x_1 \\ y_1 \\ 0 \end{array}\right] + \left[\begin{array}{r} x_2 \\ y_2 \\ 0 \end{array}\right]. \nonumber\] This last vector indeed equals \[ T\left( \left[\begin{array}{r} x_1 \\ y_1 \end{array}\right]\right) + T\left( \left[\begin{array}{r} x_2 \\ y_2 \end{array}\right]\right). \nonumber\] Similarly, for the second property, given any scalar \(c\), \[ T\left(c\left[\begin{array}{r} x_1 \\ y_1 \end{array}\right]\right) = T\left(\left[\begin{array}{r} c x_1 \\ cy_1 \end{array}\right]\right) = \left[\begin{array}{r} c x_1 \\ c y_1 \\ 0 \end{array}\right] = c\left[\begin{array}{r} x_1 \\ y_1 \\ 0 \end{array}\right]= cT\left(\left[\begin{array}{r} x_1 \\ y_1 \end{array}\right]\right). \nonumber\] So indeed \(T\) has the two properties of a linear transformation.
Consider the mapping \(T:\mathbb{R}^2\rightarrow\mathbb{R}^2\) that sends each vector \(\left[\begin{array}{r} x \\ y \end{array}\right]\) in \(\mathbb{R}^2\) to the vector \(\left[\begin{array}{r} x+y \\ xy \end{array}\right]\): \[ T\!: \left[\begin{array}{r} x \\ y \end{array}\right] \mapsto \left[\begin{array}{r} x+y \\ xy \end{array}\right]. \nonumber\] This mapping is not a linear transformation. \[ T\left(\left[\begin{array}{r} 1 \\ 1 \end{array}\right] + \left[\begin{array}{r} 1 \\ 2 \end{array}\right]\right) = T\left(\left[\begin{array}{r}2 \\ 3 \end{array}\right]\right) = \left[\begin{array}{r} 5 \\ 6\end{array}\right], \nonumber\] whereas \[ T\left(\left[\begin{array}{r} 1 \\ 1 \end{array}\right]\right) + T\left(\left[\begin{array}{r} 1 \\ 2 \end{array}\right]\right) = \left[\begin{array}{r} 2 \\ 1 \end{array}\right] + \left[\begin{array}{r} 3 \\ 2 \end{array}\right] = \left[\begin{array}{r} 5 \\ 3 \end{array}\right] \neq \left[\begin{array}{r} 5 \\ 6\end{array}\right]. \nonumber\] The second requirement of a linear transformation is violated as well: \[ T\left(3\left[\begin{array}{r} 1 \\ 1 \end{array}\right]\right) = T\left(\left[\begin{array}{r} 3 \\ 3 \end{array}\right]\right) = \left[\begin{array}{r} 6 \\ 9 \end{array}\right] \quad\neq\quad 3 T\left(\left[\begin{array}{r} 1 \\ 1 \end{array}\right] \right) = 3 \left[\begin{array}{r} 2 \\ 1 \end{array}\right] = \left[\begin{array}{r} 6 \\ 3 \end{array}\right]. \nonumber\]
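The failed checks can be reproduced directly (a small Python sketch; the helper names are illustrative):

```python
def T(v):
    """The non-linear map (x, y) -> (x + y, x*y)."""
    x, y = v
    return [x + y, x * y]

def add(u, v):
    """Componentwise vector addition."""
    return [a + b for a, b in zip(u, v)]

u, v = [1, 1], [1, 2]
print(T(add(u, v)))            # [5, 6]
print(add(T(u), T(v)))         # [5, 3] -- additivity fails
print(T([3 * c for c in u]))   # [6, 9]
print([3 * c for c in T(u)])   # [6, 3] -- homogeneity fails too
```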
Let \(\mathbf{p } \) be a nonzero vector in \(\mathbb{R}^2 \). Is the translation \[ T\!:\mathbb{R}^2 \to \mathbb{R}^2, \quad \mathbf{x } \mapsto \mathbf{x } + \mathbf{p } \nonumber \nonumber\] a linear transformation?
Each matrix transformation is a linear transformation.
Proof.
This is a direct consequence of two properties of the matrix-vector product (see the proposition on the linearity of the matrix-vector product): \[ A (\mathbf{x}+\mathbf{y}) = A\mathbf{x} + A\mathbf{y} \quad \text{and} \quad A (c\mathbf{x}) = c A\mathbf{x}. \nonumber\]
Suppose \(T: \mathbb{R}^n\to\mathbb{R}^m \) and \(S:\mathbb{R}^m\to\mathbb{R}^p \) are linear transformations. Then the transformation \(ST:\mathbb{R}^n\to\mathbb{R}^p \) defined by \[ ST(\mathbf{x }) = S(T(\mathbf{x })) \nonumber \nonumber\] is a linear transformation from \(\mathbb{R}^n \) to \(\mathbb{R}^p \).
Proof.
Suppose that \[ T(\mathbf{x }+\mathbf{y }) = T(\mathbf{x })+T(\mathbf{y })\quad \text{and} \quad T(c\mathbf{x }) = cT(\mathbf{x }) \quad \text{ for } \mathbf{x }, \mathbf{y } \text{ in } \mathbb{R}^n, c \text{ in } \mathbb{R} \nonumber\] and likewise for \(S \). Then \[ \begin{array}{rl} ST(\mathbf{x }+\mathbf{y }) = S(T(\mathbf{x }+\mathbf{y })) = S( T(\mathbf{x })+T(\mathbf{y })) & = S(T(\mathbf{x })) + S(T(\mathbf{y })) \\ & = ST(\mathbf{x }) + ST(\mathbf{y }) \end{array} \nonumber\] and \[ ST(c\mathbf{x }) = S(T(c\mathbf{x })) = S(cT(\mathbf{x })) = cS(T(\mathbf{x })) = c ST(\mathbf{x }). \nonumber\] Hence \(ST \) satisfies the two requirements of a linear transformation.
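As a concrete sanity check of this proposition, here is a minimal Python sketch with two specific linear maps (chosen for illustration, not taken from the text):

```python
def T(v):
    """A linear map R^2 -> R^2: (x, y) -> (x + 2y, 3y)."""
    x, y = v
    return [x + 2 * y, 3 * y]

def S(v):
    """A linear map R^2 -> R^2: (x, y) -> (x - y, 2x)."""
    x, y = v
    return [x - y, 2 * x]

def ST(v):
    """The composition S after T."""
    return S(T(v))

def add(u, w):
    """Componentwise vector addition."""
    return [a + b for a, b in zip(u, w)]

u, w = [1, 2], [3, -1]
assert ST(add(u, w)) == add(ST(u), ST(w))                 # additivity
assert ST([5 * c for c in u]) == [5 * c for c in ST(u)]   # homogeneity
```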
There are other ways to combine linear transformations. The sum \(S = T_1 + T_2\) of two linear transformations \(T_1,T_2: \mathbb{R}^n \to \mathbb{R}^m\) is defined as follows: \[ S: \mathbb{R}^n \to \mathbb{R}^m, \quad S(\mathbf{x}) = T_1(\mathbf{x}) + T_2(\mathbf{x}). \nonumber\] And the (scalar) multiple \(T_3 = cT_1\) is the transformation \[ T_3: \mathbb{R}^n \to \mathbb{R}^m, \quad T_3(\mathbf{x}) = cT_1(\mathbf{x}). \nonumber\] Show that \(S\) and \(T_3\) are again linear transformations.
Standard Matrix for a Linear Transformation
We have seen that every matrix transformation is a linear transformation. In this subsection we will show that, conversely, every linear transformation \(T:\mathbb{R}^n \to \mathbb{R}^m\) can be represented as a matrix transformation. The key to constructing a matrix that represents a given linear transformation lies in the following proposition.

Suppose \(T:\mathbb{R}^n\rightarrow\mathbb{R}^m\) is a linear transformation. Then the following property holds: for each set of vectors \(\mathbf{x}_1, \ldots, \mathbf{x}_k\) in \(\mathbb{R}^n\) and each set of numbers \(c_1,\ldots,c_k\) in \(\mathbb{R}\): \[ T(c_1\mathbf{x}_1+c_2 \mathbf{x}_2+\ldots +c_k \mathbf{x}_k) = c_1T(\mathbf{x}_1)+c_2T(\mathbf{x}_2)+\ldots +c_kT(\mathbf{x}_k). \nonumber\] In words: for any linear transformation, the image of a linear combination of vectors is equal to the linear combination of their images.
Proof.
Suppose \(T:\mathbb{R}^n\rightarrow\mathbb{R}^m \) is a linear transformation. So we have \[ \text{(i) }T(\mathbf{x }+\mathbf{y }) = T(\mathbf{x })+T(\mathbf{y }) \quad\text{and} \quad \text{(ii) } T(c\mathbf{x }) = c T(\mathbf{x }). \nonumber \nonumber\] First apply rule (i) to split the term on the left into \(k \) terms: \[ \begin{array}{ccl} T(c_1\mathbf{x }_1+c_2 \mathbf{x }_2+\ldots +c_k \mathbf{x }_k) &=& T(c_1\mathbf{x }_1)+T(c_2 \mathbf{x }_2+\ldots +c_k \mathbf{x }_k) \\ &=& \quad \ldots \\ &=& T(c_1\mathbf{x }_1)+T(c_2 \mathbf{x }_2)+\ldots + T(c_k \mathbf{x }_k) \end{array} \nonumber \nonumber\] and then apply rule (ii) to each term.
Suppose \(T: \mathbb{R}^3 \to \mathbb{R}^2\) is a linear transformation, and we know that for \[ \mathbf{a}_1 = \left[\begin{array}{r} 1 \\ 0 \\ 0 \end{array}\right], \quad \mathbf{a}_2 = \left[\begin{array}{r} 1 \\ 1 \\ 0 \end{array}\right], \quad \mathbf{a}_3 = \left[\begin{array}{r} 1 \\ 1 \\ 1 \end{array}\right] \nonumber\] the images under \(T\) are given by \[ T(\mathbf{a}_1) = \mathbf{b}_1 = \left[\begin{array}{r} 1 \\ 2 \end{array}\right], \quad T(\mathbf{a}_2) = \mathbf{b}_2 = \left[\begin{array}{r} 3 \\ -1 \end{array}\right], \quad \text{and} \quad T(\mathbf{a}_3) = \mathbf{b}_3 = \left[\begin{array}{r} 2 \\ -2 \end{array}\right]. \nonumber\] Then for the vector \[ \mathbf{v} = \left[\begin{array}{r} 4 \\ 1 \\ -1 \end{array}\right] = 3 \mathbf{a}_1 + 2 \mathbf{a}_2 - \mathbf{a}_3 \nonumber\] it follows that \[ T(\mathbf{v}) = 3 \mathbf{b}_1 + 2 \mathbf{b}_2 + (-1) \mathbf{b}_3 = 3 \left[\begin{array}{r} 1 \\ 2 \end{array}\right] + 2 \left[\begin{array}{r} 3 \\ -1 \end{array}\right] + (-1) \left[\begin{array}{r} 2 \\ -2 \end{array}\right]= \left[\begin{array}{r} 7 \\ 6 \end{array}\right]. \nonumber\]
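The computation of \(T(\mathbf{v})\) from the given images can be replayed in a few lines of Python (a sketch; the variable names are illustrative):

```python
# images of a1, a2, a3 under T, as given in the example
b = [[1, 2], [3, -1], [2, -2]]
coeffs = [3, 2, -1]            # v = 3 a1 + 2 a2 - a3

# T(v) = 3 b1 + 2 b2 - b3, computed componentwise
Tv = [sum(c * bi[i] for c, bi in zip(coeffs, b)) for i in range(2)]
print(Tv)  # [7, 6]
```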
Suppose \(T\) is a linear transformation from \(\mathbb{R}^2\) to \(\mathbb{R}^2\) for which \[ T(\mathbf{e}_1) = \mathbf{a}_1 = \left[\begin{array}{r}1 \\2 \end{array}\right], \quad T(\mathbf{e}_2) = \mathbf{a}_2 = \left[\begin{array}{r}4 \\3 \end{array}\right]. \nonumber\] Then for an arbitrary vector \[ \mathbf{x} = \left[\begin{array}{r}x_1\\x_2\end{array}\right] = x_1 \left[\begin{array}{r}1\\0 \end{array}\right] + x_2 \left[\begin{array}{r}0\\1 \end{array}\right] = x_1\mathbf{e}_1 + x_2\mathbf{e}_2, \nonumber\] it follows that \[ \begin{array}{rcl} T(\mathbf{x}) &=& x_1T(\mathbf{e}_1) + x_2T(\mathbf{e}_2) \\ &=& x_1\left[\begin{array}{r}1 \\2 \end{array}\right] + x_2\left[\begin{array}{r}4 \\3 \end{array}\right] = \left[\begin{array}{rr}1 &4 \\2 &3\end{array}\right]\mathbf{x}. \end{array} \nonumber\] So we see that \[ T(\mathbf{x}) = A \mathbf{x}, \quad\text{where} \quad A = \left[\begin{array}{rr} T(\mathbf{e}_1) & T(\mathbf{e}_2) \end{array}\right]. \nonumber\]
Each linear transformation \(T\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) is a matrix transformation. More specifically: if \(T: \mathbb{R}^n \to \mathbb{R}^m\) is linear, then for each \(\mathbf{x}\) in \(\mathbb{R}^n\) \[ T(\mathbf{x}) = A\mathbf{x}, \quad \text{where} \quad A = \left[\begin{array}{cccc} T(\mathbf{e}_1) & T(\mathbf{e}_2) & \ldots & T(\mathbf{e}_n) \end{array}\right]. \nonumber\]
Proof.
We can more or less copy the derivation in Example 19. First of all, any vector \(\mathbf{x}\) is a linear combination of the standard basis: \[ \mathbf{x} = \left[\begin{array}{r}x_1\\x_2\\ \vdots \\ x_n \end{array}\right] = x_1 \left[\begin{array}{r}1 \\ 0 \\ \vdots \\ 0 \end{array}\right] + x_2 \left[\begin{array}{r}0 \\ 1 \\ \vdots \\ 0 \end{array}\right] + \ldots + x_n \left[\begin{array}{r}0 \\ 0 \\ \vdots \\ 1 \end{array}\right], \nonumber\] i.e., \[ \mathbf{x} = x_1 \mathbf{e}_1 + x_2 \mathbf{e}_2 + \ldots + x_n \mathbf{e}_n. \nonumber\] From Proposition 17 it follows that \[ T(\mathbf{x}) = x_1 T(\mathbf{e}_1) + x_2 T(\mathbf{e}_2) + \ldots + x_n T(\mathbf{e}_n). \nonumber\] The last expression is a linear combination of \(n\) vectors in \(\mathbb{R}^m\), and this can be written as a matrix-vector product: \[ x_1 T(\mathbf{e}_1) + x_2 T(\mathbf{e}_2) + \ldots + x_n T(\mathbf{e}_n) = \left[\begin{array}{cccc} T(\mathbf{e}_1) & T(\mathbf{e}_2) & \ldots & T(\mathbf{e}_n) \end{array}\right] \mathbf{x}. \nonumber\]
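The construction in the theorem, with column \(j\) of \(A\) equal to \(T(\mathbf{e}_j)\), can be sketched in Python (the helper name `standard_matrix` is illustrative, not from the text):

```python
def standard_matrix(T, n):
    """Matrix of a linear map T : R^n -> R^m; column j is T(e_j)."""
    cols = []
    for j in range(n):
        e = [0] * n
        e[j] = 1               # the standard basis vector e_{j+1}
        cols.append(T(e))
    m = len(cols[0])
    # reassemble rows from the columns T(e_1), ..., T(e_n)
    return [[cols[j][i] for j in range(n)] for i in range(m)]

# the embedding (x, y) -> (x, y, 0) from an earlier example
A = standard_matrix(lambda v: [v[0], v[1], 0], 2)
print(A)  # [[1, 0], [0, 1], [0, 0]]
```

Applied to the map with \(T(\mathbf{e}_1) = (1,2)\) and \(T(\mathbf{e}_2) = (4,3)\) it recovers the matrix \(\left[\begin{array}{rr}1 & 4\\ 2 & 3\end{array}\right]\) found earlier.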