2.6: Linear Transformations


    If \(A\) is an \(m \times n\) matrix, recall that the transformation \(T_{A} : \mathbb{R}^n \to \mathbb{R}^m\) defined by

    \[T_{A}(\mathbf{x}) = A\mathbf{x} \quad \mbox{ for all } \mathbf{x} \mbox{ in } \mathbb{R}^n \nonumber \]

    is called the matrix transformation induced by \(A\). In Section [sec:2_2], we saw that many important geometric transformations were in fact matrix transformations. These transformations can be characterized in a different way. The new idea is that of a linear transformation, one of the basic notions in linear algebra. We define these transformations in this section, and show that they are really just the matrix transformations looked at in another way. Having these two ways to view them turns out to be useful because, in a given situation, one perspective or the other may be preferable.

    Linear Transformations

Definition: Linear Transformations \(\mathbb{R}^n \to \mathbb{R}^m\)

A transformation \(T : \mathbb{R}^n \to \mathbb{R}^m\) is called a linear transformation if it satisfies the following two conditions for all vectors \(\mathbf{x}\) and \(\mathbf{y}\) in \(\mathbb{R}^n\) and all scalars \(a\):

    1. \(T(\mathbf{x} + \mathbf{y}) = T(\mathbf{x}) + T(\mathbf{y})\)
    2. \(T(a\mathbf{x}) = aT(\mathbf{x})\)

    Of course, \(\mathbf{x} + \mathbf{y}\) and \(a\mathbf{x}\) here are computed in \(\mathbb{R}^n\), while \(T(\mathbf{x}) + T(\mathbf{y})\) and \(aT(\mathbf{x})\) are in \(\mathbb{R}^m\). We say that \(T\) preserves addition if T1 holds, and that \(T\) preserves scalar multiplication if T2 holds. Moreover, taking \(a = 0\) and \(a = -1\) in T2 gives

    \[T(\mathbf{0}) = \mathbf{0} \quad \mbox{ and } \quad T(-\mathbf{x}) = -T(\mathbf{x}) \quad \mbox{ for all } \mathbf{x} \nonumber \]

    Hence \(T\) preserves the zero vector and the negative of a vector. Even more is true.

    Recall that a vector \(\mathbf{y}\) in \(\mathbb{R}^n\) is called a linear combination of vectors \(\mathbf{x}_{1}, \mathbf{x}_{2}, \dots, \mathbf{x}_{k}\) if \(\mathbf{y}\) has the form

    \[\mathbf{y} = a_{1}\mathbf{x}_{1} + a_{2}\mathbf{x}_{2} + \cdots + a_{k}\mathbf{x}_{k} \nonumber \]

    for some scalars \(a_{1}, a_{2}, \dots, a_{k}\). Conditions T1 and T2 combine to show that every linear transformation \(T\) preserves linear combinations in the sense of the following theorem. This result is used repeatedly in linear algebra.

Linearity Theorem

If \(T : \mathbb{R}^n \to \mathbb{R}^m\) is a linear transformation, then for each \(k = 1, 2, \dots\)

    \[T(a_{1}\mathbf{x}_{1} + a_{2}\mathbf{x}_{2} + \cdots + a_{k}\mathbf{x}_{k}) = a_{1}T(\mathbf{x}_{1}) + a_{2}T(\mathbf{x}_{2}) + \cdots + a_{k}T(\mathbf{x}_{k}) \nonumber \]

    for all scalars \(a_{i}\) and all vectors \(\mathbf{x}_{i}\) in \(\mathbb{R}^n\).

If \(k = 1\), it reads \(T(a_{1}\mathbf{x}_{1}) = a_{1}T(\mathbf{x}_{1})\), which is Condition T2. If \(k = 2\), we have

    \[\begin{array}{llllr} T(a_{1}\mathbf{x}_{1} + a_{2}\mathbf{x}_{2}) & = & T(a_{1}\mathbf{x}_{1}) + T(a_{2}\mathbf{x}_{2}) & & \mbox{ by Condition T1} \\ & = & a_{1}T(\mathbf{x}_{1}) + a_{2}T(\mathbf{x}_{2}) & & \mbox{by Condition T2} \end{array} \nonumber \]

    If \(k = 3\), we use the case \(k = 2\) to obtain

    \[\begin{array}{lllll} T(a_{1}\mathbf{x}_{1} + a_{2}\mathbf{x}_{2} + a_{3}\mathbf{x}_{3}) & = & T \left[(a_{1}\mathbf{x}_{1} + a_{2}\mathbf{x}_{2}) + a_{3}\mathbf{x}_{3} \right] & & \mbox{collect terms} \\ & = & T(a_{1}\mathbf{x}_{1} + a_{2}\mathbf{x}_{2}) + T(a_{3}\mathbf{x}_{3}) & & \mbox{by Condition T1} \\ & = & \left[a_{1}T(\mathbf{x}_{1}) + a_{2}T(\mathbf{x}_{2})\right] + T(a_{3}\mathbf{x}_{3}) & & \mbox{by the case } k = 2 \\ & = & \left[a_{1}T(\mathbf{x}_{1}) + a_{2}T(\mathbf{x}_{2})\right] + a_{3}T(\mathbf{x}_{3}) & & \mbox{by Condition T2} \end{array} \nonumber \]

    The proof for any \(k\) is similar, using the previous case \(k - 1\) and Conditions T1 and T2.

    The method of proof in Theorem [thm:005709] is called mathematical induction (Appendix [chap:appcinduction]).

    Theorem [thm:005709] shows that if \(T\) is a linear transformation and \(T(\mathbf{x}_{1}), T(\mathbf{x}_{2}), \dots, T(\mathbf{x}_{k})\) are all known, then \(T(\mathbf{y})\) can be easily computed for any linear combination \(\mathbf{y}\) of \(\mathbf{x}_{1}, \mathbf{x}_{2}, \dots, \mathbf{x}_{k}\). This is a very useful property of linear transformations, and is illustrated in the next example.

Example

If \(T : \mathbb{R}^{2} \to \mathbb{R}^{2}\) is a linear transformation, \(T \left[ \begin{array}{r} 1 \\ 1 \end{array} \right] = \left[ \begin{array}{r} 2 \\ -3 \end{array} \right]\) and \(T \left[ \begin{array}{r} 1 \\ -2 \end{array} \right] = \left[ \begin{array}{r} 5 \\ 1 \end{array} \right]\), find \(T \left[ \begin{array}{r} 4 \\ 3 \end{array} \right]\).

Write \(\mathbf{z} = \left[ \begin{array}{r} 4 \\ 3 \end{array} \right]\), \(\mathbf{x} = \left[ \begin{array}{r} 1 \\ 1 \end{array} \right]\), and \(\mathbf{y} = \left[ \begin{array}{r} 1 \\ -2 \end{array} \right]\) for convenience. Then we know \(T(\mathbf{x})\) and \(T(\mathbf{y})\) and we want \(T(\mathbf{z})\), so it is enough by Theorem [thm:005709] to express \(\mathbf{z}\) as a linear combination of \(\mathbf{x}\) and \(\mathbf{y}\). That is, we want to find numbers \(a\) and \(b\) such that \(\mathbf{z} = a\mathbf{x} + b\mathbf{y}\). Equating entries gives two equations: \(4 = a + b\) and \(3 = a - 2b\). The solution is \(a = \frac{11}{3}\) and \(b = \frac{1}{3}\), so \(\mathbf{z} = \frac{11}{3}\mathbf{x} + \frac{1}{3}\mathbf{y}\). Thus Theorem [thm:005709] gives

    \[T(\mathbf{z}) = \frac{11}{3}T(\mathbf{x}) + \frac{1}{3}T(\mathbf{y}) = \frac{11}{3} \left[ \begin{array}{r} 2 \\ -3 \end{array} \right] + \frac{1}{3} \left[ \begin{array}{r} 5 \\ 1 \end{array} \right] = \frac{1}{3} \left[ \begin{array}{r} 27 \\ -32 \end{array} \right] \nonumber \]

    This is what we wanted.
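The arithmetic in this example is easy to confirm with software. Here is a minimal sketch in Python (assuming NumPy is available; the variable names are purely illustrative) that solves for \(a\) and \(b\) and then forms the same linear combination of \(T(\mathbf{x})\) and \(T(\mathbf{y})\).

```python
import numpy as np

# Data from the example: x = (1, 1), y = (1, -2), and their images under T.
x, y = np.array([1.0, 1.0]), np.array([1.0, -2.0])
Tx, Ty = np.array([2.0, -3.0]), np.array([5.0, 1.0])
z = np.array([4.0, 3.0])

# Solve z = a*x + b*y for the coefficients a and b.
a, b = np.linalg.solve(np.column_stack([x, y]), z)
print(a, b)              # 11/3 and 1/3 (approximately 3.6667 and 0.3333)

# By the Linearity Theorem, T(z) = a*T(x) + b*T(y).
print(a * Tx + b * Ty)   # approximately [9.0, -10.6667], i.e. (1/3)[27, -32]
```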

Example

If \(A\) is \(m \times n\), the matrix transformation \(T_{A} : \mathbb{R}^n \to \mathbb{R}^m\) is a linear transformation.

    We have \(T_{A}(\mathbf{x}) = A\mathbf{x}\) for all \(\mathbf{x}\) in \(\mathbb{R}^n\), so Theorem [thm:002811] gives

    \[T_{A}(\mathbf{x} + \mathbf{y}) = A(\mathbf{x} + \mathbf{y}) = A\mathbf{x} + A\mathbf{y} = T_{A}(\mathbf{x}) + T_{A}(\mathbf{y}) \nonumber \]

    and

    \[T_{A}(a\mathbf{x}) = A(a\mathbf{x}) = a(A\mathbf{x}) = aT_{A}(\mathbf{x}) \nonumber \]

    hold for all \(\mathbf{x}\) and \(\mathbf{y}\) in \(\mathbb{R}^n\) and all scalars \(a\). Hence \(T_{A}\) satisfies T1 and T2, and so is linear.
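The two displayed equations are ordinary matrix arithmetic, so they can be spot-checked numerically. The following sketch (assuming NumPy, with an arbitrary matrix and vectors chosen only for illustration) confirms Conditions T1 and T2 for a matrix transformation.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 6, size=(3, 4))                  # an arbitrary 3 x 4 matrix
x = rng.integers(-5, 6, size=4)
y = rng.integers(-5, 6, size=4)
a = 7                                                 # an arbitrary scalar

# T1: T_A(x + y) = T_A(x) + T_A(y)
print(np.array_equal(A @ (x + y), A @ x + A @ y))     # True
# T2: T_A(a x) = a T_A(x)
print(np.array_equal(A @ (a * x), a * (A @ x)))       # True
```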

    The remarkable thing is that the converse of Example [exa:005754] is true: Every linear transformation
    \(T : \mathbb{R}^n \to \mathbb{R}^m\) is actually a matrix transformation. To see why, we define the standard basis of \(\mathbb{R}^n\) to be the set of columns

    \[\{\mathbf{e}_{1}, \mathbf{e}_{2}, \dots, \mathbf{e}_{n}\} \nonumber \]

    of the identity matrix \(I_{n}\). Then each \(\mathbf{e}_{i}\) is in \(\mathbb{R}^n\) and every vector \(\mathbf{x} = \left[ \begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array} \right]\) in \(\mathbb{R}^n\) is a linear combination of the \(\mathbf{e}_{i}\). In fact:

    \[\mathbf{x} = x_{1}\mathbf{e}_{1} + x_{2}\mathbf{e}_{2} + \cdots + x_{n}\mathbf{e}_{n} \nonumber \]

    as the reader can verify. Hence Theorem [thm:005709] shows that

    \[T(\mathbf{x}) = T(x_{1}\mathbf{e}_{1} + x_{2}\mathbf{e}_{2} + \cdots + x_{n}\mathbf{e}_{n}) = x_{1}T(\mathbf{e}_{1}) + x_{2}T(\mathbf{e}_{2}) + \cdots + x_{n}T(\mathbf{e}_{n}) \nonumber \]

    Now observe that each \(T(\mathbf{e}_{i})\) is a column in \(\mathbb{R}^m\), so

    \[A = \left[ \begin{array}{cccc} T(\mathbf{e}_{1}) & T(\mathbf{e}_{2}) & \cdots & T(\mathbf{e}_{n}) \end{array} \right] \nonumber \]

    is an \(m \times n\) matrix. Hence we can apply Definition [def:002668] to get

    \[T(\mathbf{x}) = x_{1}T(\mathbf{e}_{1}) + x_{2}T(\mathbf{e}_{2}) + \cdots + x_{n}T(\mathbf{e}_{n}) = \left[ \begin{array}{cccc} T(\mathbf{e}_{1}) & T(\mathbf{e}_{2}) & \cdots & T(\mathbf{e}_{n}) \end{array} \right] \left[ \begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array} \right] = A\mathbf{x} \nonumber \]

    Since this holds for every \(\mathbf{x}\) in \(\mathbb{R}^n\), it shows that \(T\) is the matrix transformation induced by \(A\), and so proves most of the following theorem.

Theorem

Let \(T : \mathbb{R}^n \to \mathbb{R}^m\) be a transformation.

    1. \(T\) is linear if and only if it is a matrix transformation.
    2. In this case \(T = T_{A}\) is the matrix transformation induced by a unique \(m \times n\) matrix \(A\), given in terms of its columns by

      \[A = \left[ \begin{array}{cccc} T(\mathbf{e}_{1}) & T(\mathbf{e}_{2}) & \cdots & T(\mathbf{e}_{n}) \end{array} \right] \nonumber \]

    It remains to verify that the matrix \(A\) is unique. Suppose that \(T\) is induced by another matrix \(B\). Then \(T(\mathbf{x}) = B\mathbf{x}\) for all \(\mathbf{x}\) in \(\mathbb{R}^n\). But \(T(\mathbf{x}) = A\mathbf{x}\) for each \(\mathbf{x}\), so \(B\mathbf{x} = A\mathbf{x}\) for every \(\mathbf{x}\). Hence \(A = B\) by Theorem [thm:002985].

    Hence we can speak of the matrix of a linear transformation. Because of Theorem [thm:005789] we may (and shall) use the phrases “linear transformation” and “matrix transformation” interchangeably.
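Theorem [thm:005789] also gives a practical recipe for computing the matrix of a linear transformation: apply \(T\) to each standard basis vector and use the results as columns. A minimal Python sketch of this recipe follows (the map \(T\) below is a hypothetical example chosen only for illustration, and NumPy is assumed).

```python
import numpy as np

def T(v):
    """A hypothetical linear map R^3 -> R^2 used only for illustration:
    T(x1, x2, x3) = (x1 + 2*x3, 3*x2 - x1)."""
    x1, x2, x3 = v
    return np.array([x1 + 2 * x3, 3 * x2 - x1])

n = 3
# Columns of A are T(e_1), ..., T(e_n), where the e_i are the columns of I_n.
A = np.column_stack([T(e) for e in np.eye(n)])
print(A)                                 # [[ 1.  0.  2.]
                                         #  [-1.  3.  0.]]

# Sanity check: T(x) = A x for a sample vector x.
x = np.array([2.0, -1.0, 4.0])
print(np.allclose(T(x), A @ x))          # True
```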

Example

Define \(T : \mathbb{R}^3 \to \mathbb{R}^2\) by \(T \left[ \begin{array}{c} x_{1} \\ x_{2} \\ x_{3} \end{array} \right] = \left[ \begin{array}{c} x_{1} \\ x_{2} \end{array} \right]\) for all \(\left[ \begin{array}{c} x_{1} \\ x_{2} \\ x_{3} \end{array} \right]\) in \(\mathbb{R}^3\). Show that \(T\) is a linear transformation and use Theorem [thm:005789] to find its matrix.

    Write \(\mathbf{x} = \left[ \begin{array}{c} x_{1} \\ x_{2} \\ x_{3} \end{array} \right]\) and \(\mathbf{y} = \left[ \begin{array}{c} y_{1} \\ y_{2} \\ y_{3} \end{array} \right]\), so that \(\mathbf{x} + \mathbf{y} = \left[ \begin{array}{c} x_{1} + y_{1} \\ x_{2} + y_{2} \\ x_{3} + y_{3} \end{array} \right]\). Hence

    \[T(\mathbf{x} + \mathbf{y}) = \left[ \begin{array}{c} x_{1} + y_{1} \\ x_{2} + y_{2} \end{array} \right] = \left[ \begin{array}{c} x_{1} \\ x_{2} \end{array} \right] + \left[ \begin{array}{c} y_{1} \\ y_{2} \end{array} \right] = T(\mathbf{x}) + T(\mathbf{y}) \nonumber \]

    Similarly, the reader can verify that \(T(a\mathbf{x}) = aT(\mathbf{x})\) for all \(a\) in \(\mathbb{R}\), so \(T\) is a linear transformation. Now the standard basis of \(\mathbb{R}^3\) is

    \[\mathbf{e}_{1} = \left[ \begin{array}{c} 1 \\ 0 \\ 0 \end{array} \right], \quad \mathbf{e}_{2} = \left[ \begin{array}{c} 0 \\ 1 \\ 0 \end{array} \right], \quad \mbox{ and } \quad \mathbf{e}_{3} = \left[ \begin{array}{c} 0 \\ 0 \\ 1 \end{array} \right] \nonumber \]

    so, by Theorem [thm:005789], the matrix of \(T\) is

    \[A = \left[ \begin{array}{ccc} T(\mathbf{e}_{1}) & T(\mathbf{e}_{2}) & T(\mathbf{e}_{3}) \end{array} \right] = \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \end{array} \right] \nonumber \]

    Of course, the fact that \(T \left[ \begin{array}{c} x_{1} \\ x_{2} \\ x_{3} \end{array} \right] = \left[ \begin{array}{c} x_{1} \\ x_{2} \end{array} \right] = \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \end{array} \right] \left[ \begin{array}{c} x_{1} \\ x_{2} \\ x_{3} \end{array} \right]\) shows directly that \(T\) is a matrix transformation (hence linear) and reveals the matrix.

    To illustrate how Theorem [thm:005789] is used, we rederive the matrices of the transformations in Examples [exa:003028] and [exa:003088].

Example

Let \(Q_{0} : \mathbb{R}^2 \to \mathbb{R}^2\) denote reflection in the \(x\) axis (as in Example [exa:003028]) and let \(R_{\frac{\pi}{2}} : \mathbb{R}^2 \to \mathbb{R}^2\) denote counterclockwise rotation through \(\frac{\pi}{2}\) about the origin (as in Example [exa:003088]). Use Theorem [thm:005789] to find the matrices of \(Q_{0}\) and \(R_{\frac{\pi}{2}}\).

    Observe that \(Q_{0}\) and \(R_{\frac{\pi}{2}}\) are linear by Example [exa:005754] (they are matrix transformations), so Theorem [thm:005789] applies to them. The standard basis of \(\mathbb{R}^2\) is \(\{\mathbf{e}_{1}, \mathbf{e}_{2}\}\) where \(\mathbf{e}_{1} = \left[ \begin{array}{c} 1 \\ 0 \end{array} \right]\) points along the positive \(x\) axis, and \(\mathbf{e}_{2} = \left[ \begin{array}{c} 0 \\ 1 \end{array} \right]\) points along the positive \(y\) axis (see Figure [fig:005854]).

    The reflection of \(\mathbf{e}_{1}\) in the \(x\) axis is \(\mathbf{e}_{1}\) itself because \(\mathbf{e}_{1}\) points along the \(x\) axis, and the reflection of \(\mathbf{e}_{2}\) in the \(x\) axis is \(-\mathbf{e}_{2}\) because \(\mathbf{e}_{2}\) is perpendicular to the \(x\) axis. In other words, \(Q_{0}(\mathbf{e}_{1}) = \mathbf{e}_{1}\) and \(Q_{0}(\mathbf{e}_{2}) = -\mathbf{e}_{2}\). Hence Theorem [thm:005789] shows that the matrix of \(Q_{0}\) is

    \[\left[ \begin{array}{cc} Q_{0}(\mathbf{e}_{1}) & Q_{0}(\mathbf{e}_{2}) \end{array} \right] = \left[ \begin{array}{rr} \mathbf{e}_{1} & -\mathbf{e}_{2} \end{array} \right] = \left[ \begin{array}{rr} 1 & 0 \\ 0 & -1 \end{array} \right] \nonumber \]

    which agrees with Example [exa:003028].

Similarly, rotating \(\mathbf{e}_{1}\) through \(\frac{\pi}{2}\) counterclockwise about the origin produces \(\mathbf{e}_{2}\), and rotating \(\mathbf{e}_{2}\) through \(\frac{\pi}{2}\) counterclockwise about the origin gives \(-\mathbf{e}_{1}\). That is, \(R_{\frac{\pi}{2}}(\mathbf{e}_{1}) = \mathbf{e}_{2}\) and \(R_{\frac{\pi}{2}}(\mathbf{e}_{2}) = -\mathbf{e}_{1}\). Hence, again by Theorem [thm:005789], the matrix of \(R_{\frac{\pi}{2}}\) is

    \[\left[ \begin{array}{cc} R_{\frac{\pi}{2}}(\mathbf{e}_{1}) & R_{\frac{\pi}{2}}(\mathbf{e}_{2}) \end{array} \right] = \left[ \begin{array}{rr} \mathbf{e}_{2} & -\mathbf{e}_{1} \end{array} \right] = \left[ \begin{array}{rr} 0 & -1 \\ 1 & 0 \end{array} \right] \nonumber \]

    agreeing with Example [exa:003088].

Example

Let \(Q_{1} : \mathbb{R}^2 \to \mathbb{R}^2\) denote reflection in the line \(y = x\). Show that \(Q_{1}\) is a matrix transformation, find its matrix, and use it to illustrate Theorem [thm:005789].

Figure [fig:005912] shows that \(Q_{1} \left[ \begin{array}{c} x \\ y \end{array} \right] = \left[ \begin{array}{c} y \\ x \end{array} \right]\). Hence \(Q_{1} \left[ \begin{array}{c} x \\ y \end{array} \right] = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \end{array} \right] \left[ \begin{array}{c} x \\ y \end{array} \right]\), so \(Q_{1}\) is the matrix transformation induced by the matrix \(A = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \end{array} \right]\). Hence \(Q_{1}\) is linear (by Example [exa:005754]) and so Theorem [thm:005789] applies. If \(\mathbf{e}_{1} = \left[ \begin{array}{r} 1 \\ 0 \end{array} \right]\) and \(\mathbf{e}_{2} = \left[ \begin{array}{r} 0 \\ 1 \end{array} \right]\) are the standard basis of \(\mathbb{R}^2\), then it is clear geometrically that \(Q_{1}(\mathbf{e}_{1}) = \mathbf{e}_{2}\) and \(Q_{1}(\mathbf{e}_{2}) = \mathbf{e}_{1}\). Thus (by Theorem [thm:005789]) the matrix of \(Q_{1}\) is \(\left[ \begin{array}{cc} Q_{1}(\mathbf{e}_{1}) & Q_{1}(\mathbf{e}_{2}) \end{array} \right] = \left[ \begin{array}{cc} \mathbf{e}_{2} & \mathbf{e}_{1} \end{array} \right] = A\), as before.

    Recall that, given two “linked” transformations

    \[\mathbb{R}^k \xrightarrow{T} \mathbb{R}^n \xrightarrow{S} \mathbb{R}^m \nonumber \]

    we can apply \(T\) first and then apply \(S\), and so obtain a new transformation

    \[S \circ T : \mathbb{R}^k \to \mathbb{R}^m \nonumber \]

    called the composite of \(S\) and \(T\), defined by

    \[(S \circ T)(\mathbf{x}) = S\left[T(\mathbf{x})\right] \mbox{ for all } \mathbf{x} \mbox{ in } \mathbb{R}^k \nonumber \]

    If \(S\) and \(T\) are linear, the action of \(S \circ T\) can be computed by multiplying their matrices.

Theorem

Let \(\mathbb{R}^k \xrightarrow{T} \mathbb{R}^n \xrightarrow{S} \mathbb{R}^m\) be linear transformations, and let \(A\) and \(B\) be the matrices of \(S\) and \(T\) respectively. Then \(S \circ T\) is linear with matrix \(AB\).

    \((S \circ T)(\mathbf{x}) = S\left[T(\mathbf{x})\right] = A\left[B\mathbf{x}\right] = (AB)\mathbf{x}\) for all \(\mathbf{x}\) in \(\mathbb{R}^k\).

    Theorem [thm:005918] shows that the action of the composite \(S \circ T\) is determined by the matrices of \(S\) and \(T\). But it also provides a very useful interpretation of matrix multiplication. If \(A\) and \(B\) are matrices, the product matrix \(AB\) induces the transformation resulting from first applying \(B\) and then applying \(A\). Thus the study of matrices can cast light on geometrical transformations and vice-versa. Here is an example.

Example

Show that reflection in the \(x\) axis followed by rotation through \(\frac{\pi}{2}\) is reflection in the line \(y = x\).

    The composite in question is \(R_{\frac{\pi}{2}} \circ Q_{0}\) where \(Q_{0}\) is reflection in the \(x\) axis and \(R_{\frac{\pi}{2}}\) is rotation through \(\frac{\pi}{2}\). By Example [exa:005834], \(R_{\frac{\pi}{2}}\) has matrix \(A = \left[ \begin{array}{rr} 0 & -1 \\ 1 & 0 \end{array} \right]\) and \(Q_{0}\) has matrix \(B = \left[ \begin{array}{rr} 1 & 0 \\ 0 & -1 \end{array} \right]\). Hence Theorem [thm:005918] shows that the matrix of \(R_{\frac{\pi}{2}} \circ Q_{0}\) is \(AB = \left[ \begin{array}{rr} 0 & -1 \\ 1 & 0 \end{array} \right] \left[ \begin{array}{rr} 1 & 0 \\ 0 & -1 \end{array} \right] = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \end{array} \right]\), which is the matrix of reflection in the line \(y = x\) by Example [exa:005811].

    This conclusion can also be seen geometrically. Let \(\mathbf{x}\) be a typical point in \(\mathbb{R}^2\), and assume that \(\mathbf{x}\) makes an angle \(\alpha\) with the positive \(x\) axis. The effect of first applying \(Q_{0}\) and then applying \(R_{\frac{\pi}{2}}\) is shown in Figure [fig:005950]. The fact that \(R_{\frac{\pi}{2}}\left[Q_{0}(\mathbf{x})\right]\) makes the angle \(\alpha\) with the positive \(y\) axis shows that \(R_{\frac{\pi}{2}}\left[Q_{0}(\mathbf{x})\right]\) is the reflection of \(\mathbf{x}\) in the line \(y = x\).
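The matrix computation in this example is also easy to verify with software; here is a short sketch (assuming NumPy) that multiplies the two matrices and compares the product with the matrix of reflection in the line \(y = x\).

```python
import numpy as np

R = np.array([[0, -1],
              [1,  0]])      # rotation through pi/2
Q0 = np.array([[1,  0],
               [0, -1]])     # reflection in the x axis
Q1 = np.array([[0, 1],
               [1, 0]])      # reflection in the line y = x

# Apply Q0 first, then R: the composite has matrix R @ Q0.
print(np.array_equal(R @ Q0, Q1))   # True
```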

    In Theorem [thm:005918], we saw that the matrix of the composite of two linear transformations is the product of their matrices (in fact, matrix products were defined so that this is the case). We are going to apply this fact to rotations, reflections, and projections in the plane. Before proceeding, we pause to present useful geometrical descriptions of vector addition and scalar multiplication in the plane, and to give a short review of angles and the trigonometric functions.

    Some Geometry

    As we have seen, it is convenient to view a vector \(\mathbf{x}\) in \(\mathbb{R}^2\) as an arrow from the origin to the point \(\mathbf{x}\) (see Section [sec:2_2]). This enables us to visualize what sums and scalar multiples mean geometrically. For example consider \(\mathbf{x} = \left[ \begin{array}{rr} 1 \\ 2 \end{array} \right]\) in \(\mathbb{R}^2\). Then \(2\mathbf{x} = \left[ \begin{array}{rr} 2 \\ 4 \end{array} \right]\), \(\frac{1}{2}\mathbf{x} = \left[ \begin{array}{rr} \frac{1}{2} \\ 1 \end{array} \right]\) and \(-\frac{1}{2}\mathbf{x} = \left[ \begin{array}{rr} -\frac{1}{2} \\ -1 \end{array} \right]\), and these are shown as arrows in Figure [fig:005961].

Observe that the arrow for \(2\mathbf{x}\) is twice as long as the arrow for \(\mathbf{x}\) and in the same direction, and that the arrow for \(\frac{1}{2}\mathbf{x}\) is also in the same direction as the arrow for \(\mathbf{x}\), but only half as long. On the other hand, the arrow for \(-\frac{1}{2}\mathbf{x}\) is half as long as the arrow for \(\mathbf{x}\), but in the opposite direction. More generally, we have the following geometrical description of scalar multiplication in \(\mathbb{R}^2\):

Scalar Multiple Law

Let \(\mathbf{x}\) be a vector in \(\mathbb{R}^2\). The arrow for \(k\mathbf{x}\) is \(|k|\) times as long as the arrow for \(\mathbf{x}\), and is in the same direction as the arrow for \(\mathbf{x}\) if \(k > 0\), and in the opposite direction if \(k < 0\).

Now consider two vectors \(\mathbf{x} = \left[ \begin{array}{r} 2 \\ 1 \end{array} \right]\) and \(\mathbf{y} = \left[ \begin{array}{r} 1 \\ 3 \end{array} \right]\) in \(\mathbb{R}^2\). They are plotted in Figure [fig:005971] along with their sum \(\mathbf{x} + \mathbf{y} = \left[ \begin{array}{r} 3 \\ 4 \end{array} \right]\). It is a routine matter to verify that the four points \(\mathbf{0}\), \(\mathbf{x}\), \(\mathbf{y}\), and \(\mathbf{x} + \mathbf{y}\) form the vertices of a parallelogram; that is, opposite sides are parallel and of the same length. (The reader should verify that the side from \(\mathbf{0}\) to \(\mathbf{x}\) has slope \(\frac{1}{2}\), as does the side from \(\mathbf{y}\) to \(\mathbf{x} + \mathbf{y}\), so these sides are parallel.) We state this as follows:

Parallelogram Law

Consider vectors \(\mathbf{x}\) and \(\mathbf{y}\) in \(\mathbb{R}^2\). If the arrows for \(\mathbf{x}\) and \(\mathbf{y}\) are drawn (see Figure [fig:005976]), the arrow for \(\mathbf{x} + \mathbf{y}\) corresponds to the fourth vertex of the parallelogram determined by the points \(\mathbf{x}\), \(\mathbf{y}\), and \(\mathbf{0}\).

    We will have more to say about this in Chapter [chap:4].

    Before proceeding we turn to a brief review of angles and the trigonometric functions. Recall that an angle \(\theta\) is said to be in standard position if it is measured counterclockwise from the positive \(x\) axis (as in Figure [fig:005980]). Then \(\theta\) uniquely determines a point \(\mathbf{p}\) on the unit circle (radius \(1\), centre at the origin). The radian measure of \(\theta\) is the length of the arc on the unit circle from the positive \(x\) axis to \(\mathbf{p}\). Thus \(360^\circ = 2\pi\) radians, \(180^\circ = \pi\), \(90^\circ = \frac{\pi}{2}\), and so on.

The point \(\mathbf{p}\) in Figure [fig:005980] is also closely linked to the trigonometric functions cosine and sine, written \(\cos \theta\) and \(\sin \theta\) respectively. In fact these functions are defined to be the \(x\) and \(y\) coordinates of \(\mathbf{p}\); that is, \(\mathbf{p} = \left[ \begin{array}{r} \cos \theta \\ \sin \theta \end{array} \right]\). This defines \(\cos \theta\) and \(\sin \theta\) for an arbitrary angle \(\theta\) (possibly negative), and agrees with the usual values when \(\theta\) is an acute angle \(\left(0 \leq \theta \leq \frac{\pi}{2}\right)\), as the reader should verify. For more discussion of this, see Appendix [chap:appacomplexnumbers].

    Rotations

    We can now describe rotations in the plane. Given an angle \(\theta\), let

    \[R_{\theta} : \mathbb{R}^2 \to \mathbb{R}^2 \nonumber \]

    denote counterclockwise rotation of \(\mathbb{R}^2\) about the origin through the angle \(\theta\). The action of \(R_{\theta}\) is depicted in Figure [fig:005993]. We have already looked at \(R_{\frac{\pi}{2}}\) (in Example [exa:003088]) and found it to be a matrix transformation. It turns out that \(R_{\theta}\) is a matrix transformation for every angle \(\theta\) (with a simple formula for the matrix), but it is not clear how to find the matrix. Our approach is to first establish the (somewhat surprising) fact that \(R_{\theta}\) is linear, and then obtain the matrix from Theorem [thm:005789].

    Let \(\mathbf{x}\) and \(\mathbf{y}\) be two vectors in \(\mathbb{R}^2\). Then \(\mathbf{x} + \mathbf{y}\) is the diagonal of the parallelogram determined by \(\mathbf{x}\) and \(\mathbf{y}\) as in Figure [fig:006003].

    The effect of \(R_{\theta}\) is to rotate the entire parallelogram to obtain the new parallelogram determined by \(R_{\theta}(\mathbf{x})\) and \(R_{\theta}(\mathbf{y})\), with diagonal \(R_{\theta}(\mathbf{x} + \mathbf{y})\). But this diagonal is \(R_{\theta}(\mathbf{x}) + R_{\theta}(\mathbf{y})\) by the parallelogram law (applied to the new parallelogram). It follows that

    \[R_{\theta}(\mathbf{x} + \mathbf{y}) = R_{\theta}(\mathbf{x}) + R_{\theta}(\mathbf{y}) \nonumber \]

    A similar argument shows that \(R_{\theta}(a\mathbf{x}) = aR_{\theta}(\mathbf{x})\) for any scalar \(a\), so \(R_{\theta} : \mathbb{R}^2 \to \mathbb{R}^2\) is indeed a linear transformation.

    With linearity established we can find the matrix of \(R_{\theta}\). Let \(\mathbf{e}_{1} = \left[ \begin{array}{c} 1 \\ 0 \end{array} \right]\) and \(\mathbf{e}_{2} = \left[ \begin{array}{c} 0 \\ 1 \end{array} \right]\) denote the standard basis of \(\mathbb{R}^2\). By Figure [fig:006016] we see that

    \[R_{\theta}(\mathbf{e}_{1}) = \left[ \begin{array}{r} \cos \theta \\ \sin \theta \end{array} \right] \quad \mbox{ and } \quad R_{\theta}(\mathbf{e}_{2}) = \left[ \begin{array}{r} -\sin \theta \\ \cos \theta \end{array} \right] \nonumber \]

    Hence Theorem [thm:005789] shows that \(R_{\theta}\) is induced by the matrix

    \[\left[ \begin{array}{cc} R_{\theta}(\mathbf{e}_{1}) & R_{\theta}(\mathbf{e}_{2}) \end{array} \right] = \left[ \begin{array}{rr} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{array} \right] \nonumber \]

    We record this as

Theorem

The rotation \(R_{\theta} : \mathbb{R}^2 \to \mathbb{R}^2\) is the linear transformation with matrix \(\left[ \begin{array}{rr} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{array} \right]\).

    For example, \(R_{\frac{\pi}{2}}\) and \(R_{\pi}\) have matrices \(\left[ \begin{array}{rr} 0 & -1 \\ 1 & 0 \end{array} \right]\) and \(\left[ \begin{array}{rr} -1 & 0 \\ 0 & -1 \end{array} \right]\), respectively, by Theorem [thm:006021]. The first of these confirms the result in Example [exa:003088]. The second shows that rotating a vector \(\mathbf{x} = \left[ \begin{array}{c} x \\ y \end{array} \right]\) through the angle \(\pi\) results in \(R_{\pi}(\mathbf{x}) = \left[ \begin{array}{rr} -1 & 0 \\ 0 & -1 \end{array} \right] \left[ \begin{array}{c} x \\ y \end{array} \right] = \left[ \begin{array}{c} -x \\ -y \end{array} \right] = -\mathbf{x}\). Thus applying \(R_{\pi}\) is the same as negating \(\mathbf{x}\), a fact that is evident without Theorem [thm:006021].
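A short computational sketch (assuming NumPy; the helper name rotation is purely illustrative) packages the matrix of Theorem [thm:006021] as a function and confirms the two special cases just mentioned.

```python
import numpy as np

def rotation(theta):
    """Matrix of R_theta: counterclockwise rotation through theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

print(np.round(rotation(np.pi / 2), 12))   # approximately [[0, -1], [1, 0]]
print(np.round(rotation(np.pi), 12))       # approximately [[-1, 0], [0, -1]]

# Rotating through pi negates every vector.
x = np.array([3.0, -2.0])
print(np.allclose(rotation(np.pi) @ x, -x))   # True
```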

Example

Let \(\theta\) and \(\phi\) be angles. By finding the matrix of the composite \(R_{\theta} \circ R_{\phi}\), obtain expressions for \(\cos(\theta + \phi)\) and \(\sin(\theta + \phi)\).

    Consider the transformations \(\mathbb{R}^2 \xrightarrow{R_{\phi}} \mathbb{R}^2 \xrightarrow{R_{\theta}} \mathbb{R}^2\). Their composite \(R_{\theta} \circ R_{\phi}\) is the transformation that first rotates the plane through \(\phi\) and then rotates it through \(\theta\), and so is the rotation through the angle \(\theta + \phi\) (see Figure [fig:006048]).

    In other words

    \[R_{\theta + \phi} = R_{\theta} \circ R_{\phi} \nonumber \]

    Theorem [thm:005918] shows that the corresponding equation holds for the matrices of these transformations, so Theorem [thm:006021] gives:

    \[\left[ \begin{array}{rr} \cos(\theta + \phi) & -\sin(\theta + \phi) \\ \sin(\theta + \phi) & \cos(\theta + \phi) \end{array} \right] = \left[ \begin{array}{rr} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{array} \right] \left[ \begin{array}{rr} \cos \phi & -\sin \phi \\ \sin \phi & \cos \phi \end{array} \right] \nonumber \]

    If we perform the matrix multiplication on the right, and then compare first column entries, we obtain

    \[\begin{aligned} \cos(\theta + \phi) &= \cos \theta \cos \phi - \sin \theta \sin \phi \\ \sin(\theta + \phi) &= \sin \theta \cos \phi + \cos \theta \sin \phi\end{aligned} \nonumber \]

    These are the two basic identities from which most of trigonometry can be derived.
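These identities can be spot-checked numerically for particular angles. The sketch below (assuming NumPy, and redefining the same rotation helper as in the earlier sketch so it is self-contained) compares both sides of the matrix equation and of the two addition formulas.

```python
import numpy as np

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

theta, phi = 0.7, -1.3   # arbitrary test angles

# Matrix form: R_(theta+phi) = R_theta R_phi.
print(np.allclose(rotation(theta + phi), rotation(theta) @ rotation(phi)))   # True

# The (1,1) and (2,1) entries give the two addition formulas.
print(np.isclose(np.cos(theta + phi),
                 np.cos(theta) * np.cos(phi) - np.sin(theta) * np.sin(phi)))  # True
print(np.isclose(np.sin(theta + phi),
                 np.sin(theta) * np.cos(phi) + np.cos(theta) * np.sin(phi)))  # True
```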

    Reflections

    The line through the origin with slope \(m\) has equation \(y = mx\), and we let \(Q_{m} : \mathbb{R}^2 \to \mathbb{R}^2\) denote reflection in the line \(y = mx\).

    This transformation is described geometrically in Figure [fig:006067]. In words, \(Q_{m}(\mathbf{x})\) is the “mirror image” of \(\mathbf{x}\) in the line \(y = mx\). If \(m = 0\) then \(Q_{0}\) is reflection in the \(x\) axis, so we already know \(Q_{0}\) is linear. While we could show directly that \(Q_{m}\) is linear (with an argument like that for \(R_{\theta}\)), we prefer to do it another way that is instructive and derives the matrix of \(Q_{m}\) directly without using Theorem [thm:005789].

    Let \(\theta\) denote the angle between the positive \(x\) axis and the line \(y = mx\). The key observation is that the transformation \(Q_{m}\) can be accomplished in three steps: First rotate through \(-\theta\) (so our line coincides with the \(x\) axis), then reflect in the \(x\) axis, and finally rotate back through \(\theta\). In other words:

    \[Q_{m} = R_{\theta} \circ Q_{0} \circ R_{-\theta} \nonumber \]

    Since \(R_{-\theta}\), \(Q_{0}\), and \(R_{\theta}\) are all linear, this (with Theorem [thm:005918]) shows that \(Q_{m}\) is linear and that its matrix is the product of the matrices of \(R_{\theta}\), \(Q_{0}\), and \(R_{-\theta}\). If we write \(c = \cos \theta\) and \(s = \sin \theta\) for simplicity, then the matrices of \(R_{\theta}\), \(R_{-\theta}\), and \(Q_{0}\) are

\[\left[ \begin{array}{rr} c & -s \\ s & c \end{array} \right], \quad \left[ \begin{array}{rr} c & s \\ -s & c \end{array} \right], \quad \mbox{ and } \quad \left[ \begin{array}{rr} 1 & 0 \\ 0 & -1 \end{array} \right] \mbox{ respectively.} \nonumber \]

(The matrix of \(R_{-\theta}\) comes from the matrix of \(R_{\theta}\) using the fact that, for all angles \(\theta\), \(\cos(-\theta) = \cos \theta\) and \(\sin(-\theta) = -\sin \theta\).)

    Hence, by Theorem [thm:005918], the matrix of \(Q_{m} = R_{\theta} \circ Q_{0} \circ R_{-\theta}\) is

    \[\left[ \begin{array}{rr} c & -s \\ s & c \end{array} \right] \left[ \begin{array}{rr} 1 & 0 \\ 0 & -1 \end{array} \right] \left[ \begin{array}{rr} c & s \\ -s & c \end{array} \right] = \left[ \begin{array}{cc} c^2 - s^2 & 2sc \\ 2sc & s^2 - c^2 \end{array} \right] \nonumber \]


    We can obtain this matrix in terms of \(m\) alone. Figure [fig:006095] shows that

    \[\cos \theta = \frac{1}{\sqrt{1 + m^2}} \mbox{ and } \sin \theta = \frac{m}{\sqrt{1 + m^2}} \nonumber \]

    so the matrix \(\left[ \begin{array}{cc} c^2 - s^2 & 2sc \\ 2sc & s^2 - c^2 \end{array} \right]\) of \(Q_{m}\) becomes \(\frac{1}{1 + m^2}\left[ \begin{array}{cc} 1 - m^2 & 2m \\ 2m & m^2 - 1 \end{array} \right]\).

Theorem

Let \(Q_{m}\) denote reflection in the line \(y = mx\). Then \(Q_{m}\) is a linear transformation with matrix \(\frac{1}{1 + m^2}\left[ \begin{array}{cc} 1 - m^2 & 2m \\ 2m & m^2 - 1 \end{array} \right]\).

Note that if \(m = 0\), the matrix in Theorem [thm:006096] becomes \(\left[ \begin{array}{rr} 1 & 0 \\ 0 & -1 \end{array}\right]\), as expected. Of course, this analysis fails for reflection in the \(y\) axis because vertical lines have no slope. However, it is an easy exercise to verify directly that reflection in the \(y\) axis is indeed linear with matrix \(\left[ \begin{array}{rr} -1 & 0 \\ 0 & 1 \end{array}\right]\).1
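The three-step factorization \(Q_{m} = R_{\theta} \circ Q_{0} \circ R_{-\theta}\) can also be checked numerically against the closed-form matrix of Theorem [thm:006096]; a minimal sketch (assuming NumPy, with an arbitrary slope chosen only for illustration) follows.

```python
import numpy as np

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

Q0 = np.array([[1, 0],
               [0, -1]])          # reflection in the x axis

m = 0.75                          # arbitrary slope of the line y = mx
theta = np.arctan(m)              # angle the line makes with the positive x axis

# Three steps: rotate by -theta, reflect in the x axis, rotate back by theta.
Qm_steps = rotation(theta) @ Q0 @ rotation(-theta)

# Closed-form matrix from the theorem.
Qm_formula = np.array([[1 - m**2, 2 * m],
                       [2 * m, m**2 - 1]]) / (1 + m**2)

print(np.allclose(Qm_steps, Qm_formula))   # True
```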

Example

Let \(T : \mathbb{R}^2 \to \mathbb{R}^2\) be rotation through \(-\frac{\pi}{2}\) followed by reflection in the \(y\) axis. Show that \(T\) is a reflection in a line through the origin and find the line.

    The matrix of \(R_{-\frac{\pi}{2}}\) is \(\left[ \def\arraystretch{1.5} \begin{array}{rr} \cos(-\frac{\pi}{2}) & -\sin(-\frac{\pi}{2}) \\ \sin(-\frac{\pi}{2}) & \cos(-\frac{\pi}{2}) \end{array} \right] = \left[ \begin{array}{rr} 0 & 1 \\ -1 & 0 \end{array} \right]\) and the matrix of reflection in the \(y\) axis is \(\left[ \begin{array}{rr} -1 & 0 \\ 0 & 1 \end{array} \right]\). Hence the matrix of \(T\) is \(\left[ \begin{array}{rr} -1 & 0 \\ 0 & 1 \end{array} \right] \left[ \begin{array}{rr} 0 & 1 \\ -1 & 0 \end{array} \right] = \left[ \begin{array}{rr} 0 & -1 \\ -1 & 0 \end{array} \right]\) and this is reflection in the line \(y = -x\) (take \(m = -1\) in Theorem [thm:006096]).
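The same kind of check works here: multiply the two matrices and compare with the \(m = -1\) case of Theorem [thm:006096] (a short sketch, assuming NumPy).

```python
import numpy as np

R_neg = np.array([[0, 1],
                  [-1, 0]])     # rotation through -pi/2
Qy = np.array([[-1, 0],
               [0, 1]])         # reflection in the y axis

m = -1
Qm = np.array([[1 - m**2, 2 * m],
               [2 * m, m**2 - 1]]) / (1 + m**2)   # reflection in y = -x

# Rotate first, then reflect: the composite has matrix Qy @ R_neg.
print(np.allclose(Qy @ R_neg, Qm))   # True
```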

    Projections

    The method in the proof of Theorem [thm:006096] works more generally. Let \(P_{m} : \mathbb{R}^2 \to \mathbb{R}^2\) denote projection on the line \(y = mx\). This transformation is described geometrically in Figure [fig:006136].

    If \(m = 0\), then \(P_{0} \left[ \begin{array}{c} x \\ y \end{array} \right] = \left[ \begin{array}{c} x \\ 0 \end{array} \right]\) for all \(\left[ \begin{array}{c} x \\ y \end{array} \right]\) in \(\mathbb{R}^2\), so \(P_{0}\) is linear with matrix \(\left[ \begin{array}{rr} 1 & 0 \\ 0 & 0 \end{array} \right]\). Hence the argument above for \(Q_{m}\) goes through for \(P_{m}\). First observe that

    \[P_{m} = R_{\theta} \circ P_{0} \circ R_{-\theta} \nonumber \]

    as before. So, \(P_{m}\) is linear with matrix

    \[\left[ \begin{array}{rr} c & -s \\ s & c \end{array} \right] \left[ \begin{array}{rr} 1 & 0 \\ 0 & 0 \end{array} \right] \left[ \begin{array}{rr} c & s \\ -s & c \end{array} \right] = \left[ \begin{array}{rr} c^2 & sc \\ sc & s^2 \end{array} \right] \nonumber \]

    where \(c = \cos \theta = \frac{1}{\sqrt{1+ m^2}}\) and \(s = \sin \theta = \frac{m}{\sqrt{1+ m^2}}\).

    This gives:

Theorem

Let \(P_{m} : \mathbb{R}^2 \to \mathbb{R}^2\) be projection on the line \(y = mx\). Then \(P_{m}\) is a linear transformation with matrix \(\frac{1}{1 + m^2} \left[ \begin{array}{cc} 1 & m \\ m & m^2 \end{array} \right]\).

    Again, if \(m = 0\), then the matrix in Theorem [thm:006137] reduces to \(\left[ \begin{array}{rr} 1 & 0 \\ 0 & 0 \end{array} \right]\) as expected. As the \(y\) axis has no slope, the analysis fails for projection on the \(y\) axis, but this transformation is indeed linear with matrix \(\left[ \begin{array}{rr} 0 & 0 \\ 0 & 1 \end{array} \right]\) as is easily verified directly.

Note that the formula for the matrix of \(Q_{m}\) in Theorem [thm:006096] can be derived from the above formula for the matrix of \(P_{m}\). Using Figure [fig:006067], observe that \(Q_{m}(\mathbf{x}) = \mathbf{x} + 2[P_{m}(\mathbf{x}) - \mathbf{x}]\), so \(Q_{m}(\mathbf{x}) = 2P_{m}(\mathbf{x}) - \mathbf{x}\). Substituting the matrices of \(P_{m}\) and of the identity transformation \(1_{\mathbb{R}^2}\) gives the desired formula.
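At the level of matrices, the relation \(Q_{m}(\mathbf{x}) = 2P_{m}(\mathbf{x}) - \mathbf{x}\) says that the matrix of \(Q_{m}\) equals twice the matrix of \(P_{m}\) minus the identity matrix; a quick numerical check (assuming NumPy, with an arbitrary slope) is sketched below.

```python
import numpy as np

m = 1.5   # arbitrary slope
Pm = np.array([[1, m],
               [m, m**2]]) / (1 + m**2)           # projection on y = mx
Qm = np.array([[1 - m**2, 2 * m],
               [2 * m, m**2 - 1]]) / (1 + m**2)   # reflection in y = mx

# Q_m = 2 P_m - I as matrices.
print(np.allclose(Qm, 2 * Pm - np.eye(2)))   # True
```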

Example

Given \(\mathbf{x}\) in \(\mathbb{R}^2\), write \(\mathbf{y} = P_{m}(\mathbf{x})\). The fact that \(\mathbf{y}\) lies on the line \(y = mx\) means that \(P_{m}(\mathbf{y}) = \mathbf{y}\). But then

    \[(P_{m} \circ P_{m})(\mathbf{x}) = P_{m}(\mathbf{y}) = \mathbf{y} = P_{m}(\mathbf{x}) \mbox{ for all } \mathbf{x} \mbox{ in } \mathbb{R}^2, \mbox{ that is, } P_{m} \circ P_{m} = P_{m}. \nonumber \]

    In particular, if we write the matrix of \(P_{m}\) as \(A = \frac{1}{1 + m^2} \left[ \begin{array}{cc} 1 & m \\ m & m^2 \end{array} \right]\), then \(A^{2} = A\). The reader should verify this directly.
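The verification suggested to the reader can also be carried out symbolically; the sketch below (assuming SymPy is available) checks that \(A^{2} = A\) for every slope \(m\).

```python
import sympy as sp

m = sp.symbols('m', real=True)
A = sp.Matrix([[1, m],
               [m, m**2]]) / (1 + m**2)   # matrix of the projection P_m

# Idempotence: A^2 - A simplifies to the zero matrix for every m.
print((A * A - A).applyfunc(sp.simplify) == sp.zeros(2, 2))   # True
```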


    1. Note that \(\left[ \begin{array}{rr} -1 & 0 \\ 0 & 1 \end{array}\right] = \lim\limits_{m \to \infty}\frac{1}{1 + m^2} \left[ \begin{array}{cc} 1 - m^2 & 2m \\ 2m & m^2 - 1 \end{array} \right]\).↩

    This page titled 2.6: Linear Transformations is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by W. Keith Nicholson (Lyryx Learning Inc.) via source content that was edited to the style and standards of the LibreTexts platform.