Mathematics LibreTexts

3.1 Linear Transformations


    Introduction

    Until now we have used matrices in the context of linear systems. The equation \[ A\mathbf{x} = \mathbf{b}, \nonumber\] where \(A\) is an \(m \times n\) matrix, is just a concise way to write down a system of \(m\) linear equations in \(n\) unknowns. A different way to look at this matrix equation is to consider it as an input-output system: the left-hand side \(A\mathbf{x}\) can be seen as a mapping that sends an 'input' \(\mathbf{x}\) to an 'output' \(\mathbf{y} = A\mathbf{x}\). For instance, in computer graphics, points describing a 3D object typically have to be converted to points in 2D, to be able to visualize them on a screen. Or, in a dynamical system, a matrix \(A\) may describe how a system evolves from a 'state' \(\mathbf{x}_{k}\) at time \(k\) to a state \(\mathbf{x}_{k+1}\) at time \(k+1\) via \[ \mathbf{x}_{k+1} = A\mathbf{x}_{k}. \nonumber\] A 'state' may be anything ranging from a set of particles at certain positions, a set of pixels describing a minion, or concentrations of chemical substances in a reactor tank, to population sizes of different species. Thinking mathematically, we would describe such an input-output interpretation as a transformation (or: function, map, mapping, operator) \[ T: \mathbb{R}^n \to \mathbb{R}^m. \nonumber\] We will see that these matrix transformations have two characteristic properties which make them the protagonists of the more general linear algebra concept of a linear transformation.

    Matrix transformations

    Let \(A \) be an \(m\times n \) matrix. We can in a natural way associate a transformation \(T_A:\mathbb{R}^n \to \mathbb{R}^m \) to the matrix \(A \).
    Definition
    The transformation \(T_A\) corresponding to the \(m\times n\) matrix \(A\) is the mapping defined by \[ T_A(\mathbf{x}) = A\mathbf{x} \quad \text{or} \quad T_A:\mathbf{x} \mapsto A\mathbf{x}, \nonumber\] where \(\mathbf{x} \in \mathbb{R}^n\). We call such a mapping a matrix transformation. Conversely, we say that the matrix \(A\) represents the transformation \(T_A\).
    As a first example consider the following.
    Example
    The transformation corresponding to the matrix \(A = \left[\begin{array}{rrr} 1 & 2 & 0\\ 1 & 2 & 1 \end{array}\right]\) is defined by \[ T_A(\mathbf{x}) = \left[\begin{array}{rrr} 1 & 2 & 0\\ 1 & 2 & 1 \end{array}\right]\mathbf{x}. \nonumber\] We have, for instance, \[ \left[\begin{array}{rrr} 1 & 2 & 0\\ 1 & 2 & 1 \end{array}\right] \left[\begin{array}{r} 1\\1\\1 \end{array}\right] = \left[\begin{array}{r} 3 \\ 4 \end{array}\right] \quad \text{and} \quad \left[\begin{array}{rrr} 1 & 2 & 0\\ 1 & 2 & 1 \end{array}\right] \left[\begin{array}{r} 2\\-1\\0 \end{array}\right] = \left[\begin{array}{r} 0\\ 0 \end{array}\right]. \nonumber\] According to the definition of the matrix-vector product we can also write \[ A\mathbf{x} = \left[\begin{array}{rrr} 1 & 2 & 0\\ 1 & 2 & 1 \end{array}\right] \left[\begin{array}{r} x_1\\x_2\\x_3 \end{array}\right] = x_1 \left[\begin{array}{r} 1\\ 1 \end{array}\right] + x_2 \left[\begin{array}{r} 2 \\ 2 \end{array}\right] + x_3 \left[\begin{array}{r} 0\\ 1 \end{array}\right]. \label{Eq:LinTrafo:AxIsLinearCombination} \]
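    The matrix-vector products in this example are easy to verify numerically. A minimal sketch using NumPy (the matrix and input vectors are those of the example above):

```python
import numpy as np

# The 2x3 matrix A from the example.
A = np.array([[1, 2, 0],
              [1, 2, 1]])

# T_A sends a vector x in R^3 to A x in R^2.
x1 = np.array([1, 1, 1])
x2 = np.array([2, -1, 0])

print(A @ x1)  # [3 4]
print(A @ x2)  # [0 0]

# A x is the linear combination x_1*(column 1) + x_2*(column 2) + x_3*(column 3).
combo = 1 * A[:, 0] + 1 * A[:, 1] + 1 * A[:, 2]
print(combo)   # [3 4]
```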
    We recall that for a transformation \(T \) from a domain \(D \) to a codomain \(E \) the range \(R= R_T \) is defined as the set of all images of elements of \(D \) in \(E \): \[ R_T = \{\text{ all images } T(x) \text{ for } x \text{ in }D\}. \nonumber \nonumber\]
    Note
    From Equation \eqref{Eq:LinTrafo:AxIsLinearCombination} it is clear that the range of the matrix transformation in Example 2 consists of all linear combinations of the three columns of \(A\): \[ \text{Range}(T_A) = \text{Span}\left\{\left[\begin{array}{r} 1\\ 1 \end{array}\right], \left[\begin{array}{r} 2 \\ 2 \end{array}\right], \left[\begin{array}{r} 0\\ 1 \end{array}\right]\right\}. \nonumber\] In a later chapter, on subspaces of \(\mathbb{R}^n\), we will call this the column space of the matrix \(A\).
    The first example leads to a first property of matrix transformations:
    Proposition
    Suppose \[ A = \left[\begin{array}{cccc} \mathbf{a}_1 & \mathbf{a}_2 & \ldots & \mathbf{a}_n \end{array}\right] \nonumber\] is an \(m\times n\) matrix. Then the range of the matrix transformation corresponding to \(A\) is the span of the columns of \(A\): \[ \text{Range}(T_A) = \text{Span}\left\{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\right\}. \nonumber\]
    Example
    The matrix \[ A = \left[\begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{array}\right] \nonumber\] leads to the transformation \[ T_A: \mathbb{R}^2 \to \mathbb{R}^3, \quad T_A\left(\left[\begin{array}{r} x \\ y \end{array}\right]\right) = \left[\begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{array}\right] \left[\begin{array}{r} x \\ y \end{array}\right] = \left[\begin{array}{r} x \\ y \\ 0 \end{array}\right]. \nonumber\] This transformation 'embeds' the plane \(\mathbb{R}^2\) into the space \(\mathbb{R}^3\), as depicted in Figure 1.
    Figure 1. The embedding of \(\mathbb{R}^2 \) into \(\mathbb{R}^3 \)

    The range of this transformation is the span of the two vectors \[ \mathbf{e}_1 = \left[\begin{array}{r} 1\\ 0 \\ 0 \end{array}\right] \quad \text{and} \quad \mathbf{e}_2 = \left[\begin{array}{r} 0\\ 1 \\ 0 \end{array}\right], \nonumber\] which is the \(xy\)-plane in \(\mathbb{R}^3\).
    For \(2\times2 \) and \(3\times3 \) matrices the transformations often have a geometric interpretation, as the following example illustrates.
    Example
    The transformation corresponding to the matrix \[ A = \left[\begin{array}{rr} 1 & 1 \\ 0 & 0 \end{array}\right] \nonumber\] is the mapping \[ T_A: \mathbb{R}^2 \to \mathbb{R}^2, \quad T_A\left( \left[\begin{array}{r} x \\ y \end{array}\right]\right) = \left[\begin{array}{r} x+y \\ 0 \end{array}\right]. \nonumber\] First we observe that the range of this transformation consists of all multiples of the vector \(\left[\begin{array}{r} 1 \\ 0 \end{array}\right]\), i.e., the \(x\)-axis in the plane. Second, let us find the set of points/vectors that are mapped to an arbitrary point \(\left[\begin{array}{r} c \\ 0 \end{array}\right]\) in the range. For this we solve \[ A\mathbf{x} = \left[\begin{array}{rr} 1 & 1 \\ 0 & 0 \end{array}\right] \left[\begin{array}{r} x \\ y \end{array}\right] = \left[\begin{array}{r} c \\ 0 \end{array}\right] \quad \Longleftrightarrow \quad \left[\begin{array}{r} x+y \\ 0 \end{array}\right] = \left[\begin{array}{r} c \\ 0 \end{array}\right]. \nonumber\] The points whose coordinates satisfy this equation all lie on the line described by the equation \[ x + y = c. \nonumber\] See Figure 2.
    Figure 2. The transformation of Example 6

    Exercise
    Find out whether the vectors \[ \mathbf{y}_1 = \left[\begin{array}{r} 2 \\ 1 \\ 0 \end{array}\right] \quad \text{and} \quad \mathbf{y}_2 = \left[\begin{array}{r} 2 \\ 0 \\ 1 \end{array}\right] \nonumber\] are in the range of the matrix transformation \[ T(\mathbf{x}) = A\mathbf{x} = \left[\begin{array}{rrr} 1 & 1 & 1 \\ 1 & -1 & 3 \\ -1 & 2 & -4 \end{array}\right]\mathbf{x}. \nonumber\]
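    One way to check this numerically (a sketch, not the intended pen-and-paper solution; the helper `in_range` is ours) is to compare the rank of \(A\) with the rank of the augmented matrix: \(\mathbf{y}\) lies in the range exactly when \(A\mathbf{x} = \mathbf{y}\) is consistent, i.e. when appending \(\mathbf{y}\) as an extra column does not raise the rank.

```python
import numpy as np

A = np.array([[ 1,  1,  1],
              [ 1, -1,  3],
              [-1,  2, -4]])
y1 = np.array([2, 1, 0])
y2 = np.array([2, 0, 1])

def in_range(A, y):
    # y is in Range(T_A) iff A x = y has a solution, i.e. iff
    # appending y as a column does not increase the rank.
    return np.linalg.matrix_rank(np.column_stack([A, y])) == np.linalg.matrix_rank(A)

print(in_range(A, y1))  # False
print(in_range(A, y2))  # True
```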
    We close this subsection with an example of a matrix transformation representing a very elementary dynamical system.
    Example
    Consider a model with two cities between which migrations take place over a fixed period of time. Say in a period of ten years 90% of the inhabitants of city \(A\) stay in \(A\) and 10% move to city \(B\), whereas 20% of the inhabitants of city \(B\) move to \(A\) and 80% stay in \(B\). The following table contains the relevant statistics: \[ \begin{array}{c | cc} & A & B \\ \hline A & 0.9 & 0.2 \\ B & 0.1 & 0.8 \\ \hline \end{array} \nonumber\] For instance, if at time 0 the population in city \(A\) amounts to 50 (thousand) and in city \(B\) live 100 (thousand) people, then at the end of one period the population in city \(A\) amounts to \[ 0.9 \times 50 + 0.2 \times 100 = 65. \nonumber\] Likewise for city \(B\). If we denote the population sizes after \(k\) periods by a vector \[ \mathbf{x}_k = \left[\begin{array}{r} x_k \\ y_k \end{array}\right] \nonumber\] it follows that \[ \left[\begin{array}{r} x_{k+1} \\ y_{k+1} \end{array}\right] = \left[\begin{array}{c} 0.9x_k + 0.2y_k \\ 0.1x_k + 0.8y_k \end{array}\right], \quad \text{i.e., } \mathbf{x}_{k+1} = \left[\begin{array}{rr} 0.9 & 0.2 \\ 0.1 & 0.8 \end{array}\right]\left[\begin{array}{r} x_k \\ y_k \end{array}\right] = M \mathbf{x}_{k}. \nonumber\] The \(M\) stands for migration matrix. Obviously this model can be generalized to a 'world' with any number of cities.
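    The migration model above is easy to simulate by repeatedly applying the matrix transformation. A minimal sketch, with the migration matrix and starting populations taken from the example:

```python
import numpy as np

M = np.array([[0.9, 0.2],
              [0.1, 0.8]])     # migration matrix

x = np.array([50.0, 100.0])    # populations (thousands) of cities A and B at time 0

for k in range(5):
    x = M @ x                  # one ten-year period: x_{k+1} = M x_k
    print(k + 1, x)            # after period 1, x is [65. 85.]
```

Note that each step redistributes people but never creates or destroys them: the total population stays at 150 thousand, because the columns of \(M\) each sum to 1.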

    Linear Transformations

    In the previous subsection we saw that the matrix transformation \(\mathbf{y} = A\mathbf{x}\) can also be seen as a mapping \(T(\mathbf{x}) = A\mathbf{x}\). This mapping has special properties that are defined in this section.
    Definition
    A linear transformation is a function \(T\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) that has the following properties:
    1. For all vectors \(\mathbf{v }_1, \mathbf{v }_2 \) in \(\mathbb{R}^n \): \[ T(\mathbf{v }_1+\mathbf{v }_2) = T(\mathbf{v }_1) + T(\mathbf{v }_2). \nonumber \nonumber\]
    2. For all vectors \(\mathbf{v } \) in \(\mathbb{R}^n \) and all scalars \(c \) in \(\mathbb{R} \): \[ T(c\mathbf{v }) = c T(\mathbf{v }). \nonumber \nonumber\]
    Exercise
    Show that a linear transformation from \(\mathbb{R}^n \) to \(\mathbb{R}^m \) always sends the zero vector to the zero vector, i.e. \[ \text{if } T:\mathbb{R}^n \to\mathbb{R}^m \text{ is a linear transformation, then } T(\mathbf{0 }_n) = \mathbf{0 }_m. \nonumber \nonumber\]
    Example
    Consider the map \(T:\mathbb{R}^2\rightarrow\mathbb{R}^3 \) that sends each vector \(\left[\begin{array}{r} x \\ y \end{array}\right] \) in \(\mathbb{R}^2 \) to the vector \(\left[\begin{array}{r} x \\ y \\ 0 \end{array}\right] \) in \(\mathbb{R}^3 \). Let us check that this is a linear map. For that, we need to check the two properties in the definition. For property (i) we take two vectors \[ \left[\begin{array} {r} x_1 \\ y_1 \end{array}\right] \quad \text{ and }\quad \left[\begin{array}{r} x_2 \\ y_2 \end{array}\right] \quad \text{in } \mathbb{R}^2, \nonumber\] and see: \[ T\left( \left[\begin{array} {r} x_1 \\ y_1 \end{array}\right] + \left[\begin{array} {r} x_2 \\ y_2 \end{array}\right] \right) = T\left(\left[\begin{array} {r} x_1+x_2 \\ y_1+y_2 \end{array}\right]\right) = \left[\begin{array} {r} x_1 + x_2 \\ y_1 + y_2 \\ 0 \end{array}\right] = \left[\begin{array} {r} x_1 \\ y_1 \\ 0 \end{array}\right] + \left[\begin{array} {r} x_2 \\ y_2 \\ 0 \end{array}\right]. \nonumber\] This last vector indeed equals \[ T\left( \left[\begin{array} {r} x_1 \\ y_1 \end{array}\right]\right) + T\left( \left[\begin{array} {r} x_2 \\ y_2 \end{array}\right]\right). \nonumber\] Similarly, for the second property, given any scalar \(c \), \[ T\left(c\left[\begin{array} {r} x_1 \\ y_1 \end{array}\right]\right) = T\left(\left[\begin{array} {r} c x_1 \\ cy_1 \end{array}\right]\right) = \left[\begin{array} {r} c x_1 \\ c y_1 \\ 0 \end{array}\right] = c\left[\begin{array} {r} x_1 \\ y_1 \\ 0 \end{array}\right]= cT\left(\left[\begin{array} {r} x_1 \\ y_1 \end{array}\right]\right). \nonumber\] So indeed \(T \) has the two properties of a linear transformation.
    Example
    Consider the mapping \(T:\mathbb{R}^2\rightarrow\mathbb{R}^2\) that sends each vector \(\left[\begin{array}{r} x \\ y \end{array}\right]\) in \(\mathbb{R}^2\) to the vector \(\left[\begin{array}{r} x+y \\ xy \end{array}\right]\): \[ T\!: \left[\begin{array}{r} x \\ y \end{array}\right] \mapsto \left[\begin{array}{r} x+y \\ xy \end{array}\right]. \nonumber\] This mapping is not a linear transformation. For instance, \[ T\left(\left[\begin{array}{r} 1 \\ 1 \end{array}\right] + \left[\begin{array}{r} 1 \\ 2 \end{array}\right]\right) = T\left(\left[\begin{array}{r}2 \\ 3 \end{array}\right]\right) = \left[\begin{array}{r} 5 \\ 6\end{array}\right], \nonumber\] whereas \[ T\left(\left[\begin{array}{r} 1 \\ 1 \end{array}\right]\right) + T\left(\left[\begin{array}{r} 1 \\ 2 \end{array}\right]\right) = \left[\begin{array}{r} 2 \\ 1 \end{array}\right] + \left[\begin{array}{r} 3 \\ 2 \end{array}\right] = \left[\begin{array}{r} 5 \\ 3 \end{array}\right] \neq \left[\begin{array}{r} 5 \\ 6\end{array}\right]. \nonumber\] The second requirement of a linear transformation is violated as well: \[ T\left(3\left[\begin{array}{r} 1 \\ 1 \end{array}\right]\right) = T\left(\left[\begin{array}{r} 3 \\ 3 \end{array}\right]\right) = \left[\begin{array}{r} 6 \\ 9 \end{array}\right] \quad\neq\quad 3 T\left(\left[\begin{array}{r} 1 \\ 1 \end{array}\right] \right) = 3 \left[\begin{array}{r} 2 \\ 1 \end{array}\right] = \left[\begin{array}{r} 6 \\ 3 \end{array}\right]. \nonumber\]
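    The counterexample can also be checked numerically. A small sketch (the function `T` below encodes the map of this example):

```python
import numpy as np

def T(v):
    # The nonlinear map (x, y) -> (x + y, x*y) from the example.
    x, y = v
    return np.array([x + y, x * y])

u = np.array([1, 1])
v = np.array([1, 2])

print(T(u + v))     # [5 6]
print(T(u) + T(v))  # [5 3]  -- additivity fails

print(T(3 * u))     # [6 9]
print(3 * T(u))     # [6 3]  -- homogeneity fails
```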
    Exercise
    Let \(\mathbf{p } \) be a nonzero vector in \(\mathbb{R}^2 \). Is the translation \[ T\!:\mathbb{R}^2 \to \mathbb{R}^2, \quad \mathbf{x } \mapsto \mathbf{x } + \mathbf{p } \nonumber \nonumber\] a linear transformation?
    Note that Example 11 was in fact the first example of a matrix transformation in Subsection 1: \[ \left[\begin{array}{r} x \\ y \end{array}\right] \mapsto \left[\begin{array}{r} x \\ y \\ 0 \end{array}\right] = \left[\begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{array}\right] \left[\begin{array}{r} x \\ y \end{array}\right]. \nonumber\] As we will see, any linear transformation from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) is a matrix transformation. The converse is true as well; this is the content of the next proposition.
    Proposition
    Each matrix transformation is a linear transformation.
    Proof
    This is a direct consequence of two properties of the matrix-vector product established earlier: \[ A(\mathbf{x}+\mathbf{y}) = A\mathbf{x} + A\mathbf{y} \quad \text{and} \quad A(c\mathbf{x}) = cA\mathbf{x}. \nonumber\]
    Proposition
    Suppose \(T: \mathbb{R}^n\to\mathbb{R}^m \) and \(S:\mathbb{R}^m\to\mathbb{R}^p \) are linear transformations. Then the transformation \(ST:\mathbb{R}^n\to\mathbb{R}^p \) defined by \[ ST(\mathbf{x }) = S(T(\mathbf{x })) \nonumber \nonumber\] is a linear transformation from \(\mathbb{R}^n \) to \(\mathbb{R}^p \).
    Proof
    Suppose that \[ T(\mathbf{x }+\mathbf{y }) = T(\mathbf{x })+T(\mathbf{y })\quad \text{and} \quad T(c\mathbf{x }) = cT(\mathbf{x }) \quad \text{ for } \mathbf{x }, \mathbf{y } \text{ in } \mathbb{R}^n, c \text{ in } \mathbb{R} \nonumber\] and likewise for \(S \). Then \[ \begin{array}{rl} ST(\mathbf{x }+\mathbf{y }) = S(T(\mathbf{x }+\mathbf{y })) = S( T(\mathbf{x })+T(\mathbf{y })) & = S(T(\mathbf{x })) + S(T(\mathbf{y })) \\ & = ST(\mathbf{x }) + ST(\mathbf{y }) \end{array} \nonumber\] and \[ ST(c\mathbf{x }) = S(T(c\mathbf{x })) = S(cT(\mathbf{x })) = cS(T(\mathbf{x })) = c ST(\mathbf{x }). \nonumber\] Hence \(ST \) satisfies the two requirements of a linear transformation.
    In words: the composition/concatenation of two linear transformations is itself a linear transformation.
    Exercise
    There are other ways to combine linear transformations. The sum \(S = T_1 + T_2\) of two linear transformations \(T_1, T_2: \mathbb{R}^n \to \mathbb{R}^m\) is defined as follows: \[ S: \mathbb{R}^n \to \mathbb{R}^m, \quad S(\mathbf{x}) = T_1(\mathbf{x}) + T_2(\mathbf{x}). \nonumber\] And the (scalar) multiple \(T_3 = cT_1\) is the transformation \[ T_3: \mathbb{R}^n \to \mathbb{R}^m, \quad T_3(\mathbf{x}) = cT_1(\mathbf{x}). \nonumber\] Show that \(S\) and \(T_3\) are again linear transformations.
    And now, let us return to matrix transformations.

    Standard Matrix for a Linear Transformation

    We have seen that every matrix transformation is a linear transformation. In this subsection we will show that conversely every linear transformation \(T:\mathbb{R}^n \to \mathbb{R}^m\) can be represented as a matrix transformation. The key to constructing a matrix that represents a given linear transformation lies in the following proposition.
    Proposition
    Suppose \(T:\mathbb{R}^n\rightarrow\mathbb{R}^m \) is a linear transformation. Then the following property holds: for each set of vectors \(\mathbf{x }_1, \ldots, \mathbf{x }_k \) in \(\mathbb{R}^n \) and each set of numbers \(c_1,\ldots,c_k \) in \(\mathbb{R} \): \[ T(c_1\mathbf{x }_1+c_2 \mathbf{x }_2+\ldots +c_k \mathbf{x }_k) = c_1T(\mathbf{x }_1)+c_2T(\mathbf{x }_2)+\ldots +c_kT( \mathbf{x }_k). \nonumber \nonumber\] In words: for any linear transformation \[ \begin{array}{l} \text{the image of a linear combination of vectors is equal to }\\ \text{the linear combination of their images.} \end{array} \nonumber \nonumber\]
    Proof
    Suppose \(T:\mathbb{R}^n\rightarrow\mathbb{R}^m \) is a linear transformation. So we have \[ \text{(i) }T(\mathbf{x }+\mathbf{y }) = T(\mathbf{x })+T(\mathbf{y }) \quad\text{and} \quad \text{(ii) } T(c\mathbf{x }) = c T(\mathbf{x }). \nonumber \nonumber\] First apply rule (i) to split the term on the left into \(k \) terms: \[ \begin{array}{ccl} T(c_1\mathbf{x }_1+c_2 \mathbf{x }_2+\ldots +c_k \mathbf{x }_k) &=& T(c_1\mathbf{x }_1)+T(c_2 \mathbf{x }_2+\ldots +c_k \mathbf{x }_k) \\ &=& \quad \ldots \\ &=& T(c_1\mathbf{x }_1)+T(c_2 \mathbf{x }_2)+\ldots + T(c_k \mathbf{x }_k) \end{array} \nonumber \nonumber\] and then apply rule (ii) to each term.
    Example
    Suppose \(T: \mathbb{R}^3 \to \mathbb{R}^2\) is a linear transformation, and we know that for \[ \mathbf{a}_1 = \left[\begin{array}{r} 1 \\ 0 \\ 0 \end{array}\right], \quad \mathbf{a}_2 = \left[\begin{array}{r} 1 \\ 1 \\ 0 \end{array}\right], \quad \mathbf{a}_3 = \left[\begin{array}{r} 1 \\ 1 \\ 1 \end{array}\right] \nonumber\] the images under \(T\) are given by \[ T(\mathbf{a}_1) = \mathbf{b}_1 = \left[\begin{array}{r} 1 \\ 2 \end{array}\right], \quad T(\mathbf{a}_2) = \mathbf{b}_2 = \left[\begin{array}{r} 3 \\ -1 \end{array}\right], \quad \text{and} \quad T(\mathbf{a}_3) = \mathbf{b}_3 = \left[\begin{array}{r} 2 \\ -2 \end{array}\right]. \nonumber\] Then for the vector \[ \mathbf{v} = \left[\begin{array}{r} 4 \\ 1 \\ -1 \end{array}\right] = 3 \mathbf{a}_1 + 2 \mathbf{a}_2 - \mathbf{a}_3 \nonumber\] it follows that \[ T(\mathbf{v}) = 3 \mathbf{b}_1 + 2 \mathbf{b}_2 + (-1) \mathbf{b}_3 = 3 \left[\begin{array}{r} 1 \\ 2 \end{array}\right] + 2 \left[\begin{array}{r} 3 \\ -1 \end{array}\right] + (-1) \left[\begin{array}{r} 2 \\ -2 \end{array}\right] = \left[\begin{array}{r} 7 \\ 6 \end{array}\right]. \nonumber\]
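    The computation in this example can be replayed numerically: first find the coefficients that express \(\mathbf{v}\) in terms of \(\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3\) by solving a linear system, then take the same combination of the images. A sketch:

```python
import numpy as np

# The vectors a1, a2, a3 and their known images b1, b2, b3 under T.
a1, a2, a3 = np.array([1, 0, 0]), np.array([1, 1, 0]), np.array([1, 1, 1])
b1, b2, b3 = np.array([1, 2]), np.array([3, -1]), np.array([2, -2])

# Express v as c1*a1 + c2*a2 + c3*a3 by solving [a1 a2 a3] c = v.
v = np.array([4, 1, -1])
c = np.linalg.solve(np.column_stack([a1, a2, a3]), v)
print(c)   # the coefficients 3, 2, -1

# By linearity, T(v) is the same combination of the images.
Tv = c[0] * b1 + c[1] * b2 + c[2] * b3
print(Tv)  # the vector (7, 6)
```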
    The central idea illustrated in Example 18, which is in fact a direct consequence of Proposition 17, is the following: a linear transformation \(T\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) is completely specified by the images \(T(\mathbf{a}_1), T(\mathbf{a}_2), \ldots, T(\mathbf{a}_n)\) of a set of vectors \(\{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\}\) that spans \(\mathbb{R}^n\). The simplest set of vectors that spans the whole space \(\mathbb{R}^n\) is the standard basis for \(\mathbb{R}^n\), which was introduced earlier. Recall that this is the set of vectors \[ \label{Eq:LinTrafo:StandardBasis} \left(\mathbf{e}_1,\mathbf{e}_2, \ldots, \mathbf{e}_n\right) = \left(\left[\begin{array}{r} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{array}\right], \left[\begin{array}{r} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{array}\right], \quad \cdots \quad, \left[\begin{array}{r} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{array}\right]\right). \] The next example gives an illustration of the above, and it also leads the way to the construction of a matrix for an arbitrary linear transformation.
    Example
    Suppose \(T\) is a linear transformation from \(\mathbb{R}^2\) to \(\mathbb{R}^2\) for which \[ T(\mathbf{e}_1) = \mathbf{a}_1 = \left[\begin{array}{r} 1 \\ 2 \end{array}\right], \quad T(\mathbf{e}_2) = \mathbf{a}_2 = \left[\begin{array}{r} 4 \\ 3 \end{array}\right]. \nonumber\] Then for an arbitrary vector \[ \mathbf{x} = \left[\begin{array}{r} x_1 \\ x_2 \end{array}\right] = x_1 \left[\begin{array}{r} 1 \\ 0 \end{array}\right] + x_2 \left[\begin{array}{r} 0 \\ 1 \end{array}\right] = x_1\mathbf{e}_1 + x_2\mathbf{e}_2, \nonumber\] it follows that \[ \begin{array}{rcl} T(\mathbf{x}) &=& x_1T(\mathbf{e}_1) + x_2T(\mathbf{e}_2) \\ &=& x_1\left[\begin{array}{r} 1 \\ 2 \end{array}\right] + x_2\left[\begin{array}{r} 4 \\ 3 \end{array}\right] = \left[\begin{array}{rr} 1 & 4 \\ 2 & 3 \end{array}\right]\mathbf{x}. \end{array} \nonumber\] So we see that \[ T(\mathbf{x}) = A\mathbf{x}, \quad \text{where} \quad A = \left[\begin{array}{cc} T(\mathbf{e}_1) & T(\mathbf{e}_2) \end{array}\right]. \nonumber\]
    Exercise
    Show that the procedure of Example 19 applied to the linear transformation of Example 11 indeed yields the matrix \[ A = \left[\begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{array}\right]. \nonumber\]
    The reasoning of Example 19 can be generalized. This is the content of the next theorem.
    Theorem
    Each linear transformation \(T\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) is a matrix transformation. More specifically: if \(T: \mathbb{R}^n \to \mathbb{R}^m\) is linear, then for each \(\mathbf{x}\) in \(\mathbb{R}^n\) \[ T(\mathbf{x}) = A\mathbf{x}, \quad \text{where} \quad A = \left[\begin{array}{cccc} T(\mathbf{e}_1) & T(\mathbf{e}_2) & \ldots & T(\mathbf{e}_n) \end{array}\right]. \nonumber\]
    Proof
    We can more or less copy the derivation in Example 19. First of all, any vector \(\mathbf{x}\) is a linear combination of the standard basis: \[ \mathbf{x} = \left[\begin{array}{r} x_1 \\ x_2 \\ \vdots \\ x_n \end{array}\right] = x_1 \left[\begin{array}{r} 1 \\ 0 \\ \vdots \\ 0 \end{array}\right] + x_2 \left[\begin{array}{r} 0 \\ 1 \\ \vdots \\ 0 \end{array}\right] + \ldots + x_n \left[\begin{array}{r} 0 \\ 0 \\ \vdots \\ 1 \end{array}\right], \nonumber\] i.e., \[ \mathbf{x} = x_1 \mathbf{e}_1 + x_2 \mathbf{e}_2 + \ldots + x_n \mathbf{e}_n. \nonumber\] From Proposition 17 it follows that \[ T(\mathbf{x}) = x_1 T(\mathbf{e}_1) + x_2 T(\mathbf{e}_2) + \ldots + x_n T(\mathbf{e}_n). \nonumber\] The last expression is a linear combination of \(n\) vectors in \(\mathbb{R}^m\), and this can be written as a matrix-vector product: \[ x_1 T(\mathbf{e}_1) + x_2 T(\mathbf{e}_2) + \ldots + x_n T(\mathbf{e}_n) = \left[\begin{array}{cccc} T(\mathbf{e}_1) & T(\mathbf{e}_2) & \ldots & T(\mathbf{e}_n) \end{array}\right] \mathbf{x}. \nonumber\]
    Definition
    For a linear transformation \(T:\mathbb{R}^n \to \mathbb{R}^m\), the matrix \[ \label{Eq:LinTrafo:StandardMatrix} \left[\begin{array}{cccc} T(\mathbf{e}_1) & T(\mathbf{e}_2) & \ldots & T(\mathbf{e}_n) \end{array}\right] \] is called the standard matrix of \(T\).
    In a later subsection (The Geometry of Linear Transformations) you will learn how to build standard matrices for rotations, reflections and other geometrical mappings. For now let us look at a more 'algebraic' example.
    Example
    Consider the transformation \[ T: \left[\begin{array}{r} x \\ y \\ z \end{array}\right] \mapsto \left[\begin{array}{r} x-y \\ 2y+3z \\ x+y-z \end{array}\right]. \nonumber\] It can be checked that the transformation has the two properties of a linear transformation according to the definition. Note that \[ T(\mathbf{e}_1) = \left[\begin{array}{r} 1 \\ 0 \\ 1 \end{array}\right], \quad T(\mathbf{e}_2) = \left[\begin{array}{r} -1 \\ 2 \\ 1 \end{array}\right], \quad \text{and} \quad T(\mathbf{e}_3) = \left[\begin{array}{r} 0 \\ 3 \\ -1 \end{array}\right]. \nonumber\] So we find that \[ [T] = \left[\begin{array}{rrr} 1 & -1 & 0 \\ 0 & 2 & 3 \\ 1 & 1 & -1 \end{array}\right] \nonumber\] is the standard matrix of \(T\).
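    The standard matrix of this \(T\) can be assembled exactly as the theorem prescribes: apply \(T\) to each standard basis vector and use the images as columns. A sketch (the helper `standard_matrix` is our own name for this procedure):

```python
import numpy as np

def T(v):
    # The transformation of the example: (x, y, z) -> (x - y, 2y + 3z, x + y - z).
    x, y, z = v
    return np.array([x - y, 2 * y + 3 * z, x + y - z])

def standard_matrix(T, n):
    # Columns are the images T(e_1), ..., T(e_n) of the standard basis of R^n.
    return np.column_stack([T(e) for e in np.eye(n, dtype=int)])

A = standard_matrix(T, 3)
print(A)  # columns are T(e1), T(e2), T(e3)

# The matrix reproduces T on any vector.
v = np.array([1, 2, 3])
print(A @ v, T(v))
```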
    Exercise
    In the previous example we could have found the matrix just by inspection. Can you fill in the blanks in the following equation? \[ \left[\begin{array}{r} x-y \\ 2y+3z \\ x+y-z \end{array}\right] = \left[\begin{array}{rrr} .. & .. & .. \\ .. & .. & .. \\ .. & .. & .. \end{array}\right] \left[\begin{array}{r} x \\ y \\ z \end{array}\right]. \nonumber\] If you can, you will have shown that \(T\) is a matrix transformation, and as a direct consequence \(T\) is a linear transformation.
    To conclude we consider an example that refers back to Proposition 15, and which will to some extent pave the way for the product of two matrices.
    Example
    Suppose \(T:\mathbb{R}^2 \to \mathbb{R}^3\) and \(S:\mathbb{R}^3 \to \mathbb{R}^3\) are the matrix transformations given by \[ T(\mathbf{x}) = A\mathbf{x} = \left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \\ 1 & 0 \end{array}\right] \mathbf{x} \quad \text{and} \quad S(\mathbf{y}) = B\mathbf{y} = \left[\begin{array}{rrr} 1 & 0 & 1 \\ 1 & -1 & 2 \\ -1 & 1 & -3 \end{array}\right] \mathbf{y}. \nonumber\] From Proposition 15 we know that the composition \(ST: \mathbb{R}^2 \to \mathbb{R}^3\) is also a linear transformation. What is the (standard) matrix of \(ST\)? For this we need the images of the unit vectors \(\mathbf{e}_1\) and \(\mathbf{e}_2\) in \(\mathbb{R}^2\). For each vector we first apply \(T\) and then \(S\). For \(\mathbf{e}_1\) this gives \[ T(\mathbf{e}_1) = \left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \\ 1 & 0 \end{array}\right] \left[\begin{array}{r} 1 \\ 0 \end{array}\right] = \left[\begin{array}{r} 1 \\ 3 \\ 1 \end{array}\right], \nonumber\] and then \[ S(T(\mathbf{e}_1)) = \left[\begin{array}{rrr} 1 & 0 & 1 \\ 1 & -1 & 2 \\ -1 & 1 & -3 \end{array}\right] \left[\begin{array}{r} 1 \\ 3 \\ 1 \end{array}\right] = \left[\begin{array}{r} 2 \\ 0 \\ -1 \end{array}\right]. \nonumber\] Likewise for \(\mathbf{e}_2\): \[ T(\mathbf{e}_2) = \left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \\ 1 & 0 \end{array}\right] \left[\begin{array}{r} 0 \\ 1 \end{array}\right] = \left[\begin{array}{r} 2 \\ 4 \\ 0 \end{array}\right] \quad \Longrightarrow \quad S(T(\mathbf{e}_2)) = \left[\begin{array}{rrr} 1 & 0 & 1 \\ 1 & -1 & 2 \\ -1 & 1 & -3 \end{array}\right] \left[\begin{array}{r} 2 \\ 4 \\ 0 \end{array}\right] = \left[\begin{array}{r} 2 \\ -2 \\ 2 \end{array}\right]. \nonumber\] So the matrix of \(ST\) becomes \[ [ST] = \left[\begin{array}{cc} ST(\mathbf{e}_1) & ST(\mathbf{e}_2) \end{array}\right] = \left[\begin{array}{rr} 2 & 2 \\ 0 & -2 \\ -1 & 2 \end{array}\right]. \nonumber\] In the section 'Matrix Operations' we will define the product of two matrices in such a way that \[ \left[\begin{array}{rrr} 1 & 0 & 1 \\ 1 & -1 & 2 \\ -1 & 1 & -3 \end{array}\right] \left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \\ 1 & 0 \end{array}\right] = \left[\begin{array}{rr} 2 & 2 \\ 0 & -2 \\ -1 & 2 \end{array}\right]. \nonumber\]
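    This can be checked numerically: building the matrix of \(ST\) column by column from the images \(S(T(\mathbf{e}_i))\) gives the same result as the matrix product of \(B\) and \(A\). A sketch:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [1, 0]])          # matrix of T: R^2 -> R^3
B = np.array([[ 1,  0,  1],
              [ 1, -1,  2],
              [-1,  1, -3]])    # matrix of S: R^3 -> R^3

# Matrix of the composition ST, built column by column from S(T(e_i)).
ST = np.column_stack([B @ (A @ e) for e in np.eye(2, dtype=int)])
print(ST)

# It agrees with the matrix product B A.
print(np.array_equal(ST, B @ A))  # True
```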

    3.1 Linear Transformations is shared under a CC BY license and was authored, remixed, and/or curated by LibreTexts.
