12.4: Matrices and linear maps

Last updated
Save as PDF

Page ID: 297

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

\( \newcommand{\dsum}{\displaystyle\sum\limits} \)

\( \newcommand{\dint}{\displaystyle\int\limits} \)

\( \newcommand{\dlim}{\displaystyle\lim\limits} \)

\( \newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\)

( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\)

\( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

\( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\)

\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

\( \newcommand{\Span}{\mathrm{span}}\)

\( \newcommand{\id}{\mathrm{id}}\)

\( \newcommand{\Span}{\mathrm{span}}\)

\( \newcommand{\kernel}{\mathrm{null}\,}\)

\( \newcommand{\range}{\mathrm{range}\,}\)

\( \newcommand{\RealPart}{\mathrm{Re}}\)

\( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

\( \newcommand{\Argument}{\mathrm{Arg}}\)

\( \newcommand{\norm}[1]{\| #1 \|}\)

\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

\( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\AA}{\unicode[.8,0]{x212B}}\)

\( \newcommand{\vectorA}[1]{\vec{#1}} % arrow\)

\( \newcommand{\vectorAt}[1]{\vec{\text{#1}}} % arrow\)

\( \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vectorC}[1]{\textbf{#1}} \)

\( \newcommand{\vectorD}[1]{\overrightarrow{#1}} \)

\( \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} \)

\( \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} \)

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\(\newcommand{\longvect}{\overrightarrow}\)

\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

\(\newcommand{\avec}{\mathbf a}\) \(\newcommand{\bvec}{\mathbf b}\) \(\newcommand{\cvec}{\mathbf c}\) \(\newcommand{\dvec}{\mathbf d}\) \(\newcommand{\dtil}{\widetilde{\mathbf d}}\) \(\newcommand{\evec}{\mathbf e}\) \(\newcommand{\fvec}{\mathbf f}\) \(\newcommand{\nvec}{\mathbf n}\) \(\newcommand{\pvec}{\mathbf p}\) \(\newcommand{\qvec}{\mathbf q}\) \(\newcommand{\svec}{\mathbf s}\) \(\newcommand{\tvec}{\mathbf t}\) \(\newcommand{\uvec}{\mathbf u}\) \(\newcommand{\vvec}{\mathbf v}\) \(\newcommand{\wvec}{\mathbf w}\) \(\newcommand{\xvec}{\mathbf x}\) \(\newcommand{\yvec}{\mathbf y}\) \(\newcommand{\zvec}{\mathbf z}\) \(\newcommand{\rvec}{\mathbf r}\) \(\newcommand{\mvec}{\mathbf m}\) \(\newcommand{\zerovec}{\mathbf 0}\) \(\newcommand{\onevec}{\mathbf 1}\) \(\newcommand{\real}{\mathbb R}\) \(\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}\) \(\newcommand{\laspan}[1]{\text{Span}\{#1\}}\) \(\newcommand{\bcal}{\cal B}\) \(\newcommand{\ccal}{\cal C}\) \(\newcommand{\scal}{\cal S}\) \(\newcommand{\wcal}{\cal W}\) \(\newcommand{\ecal}{\cal E}\) \(\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}\) \(\newcommand{\gray}[1]{\color{gray}{#1}}\) \(\newcommand{\lgray}[1]{\color{lightgray}{#1}}\) \(\newcommand{\rank}{\operatorname{rank}}\) \(\newcommand{\row}{\text{Row}}\) \(\newcommand{\col}{\text{Col}}\) \(\renewcommand{\row}{\text{Row}}\) \(\newcommand{\nul}{\text{Nul}}\) \(\newcommand{\var}{\text{Var}}\) \(\newcommand{\corr}{\text{corr}}\) \(\newcommand{\len}[1]{\left|#1\right|}\) \(\newcommand{\bbar}{\overline{\bvec}}\) \(\newcommand{\bhat}{\widehat{\bvec}}\) \(\newcommand{\bperp}{\bvec^\perp}\) \(\newcommand{\xhat}{\widehat{\xvec}}\) \(\newcommand{\vhat}{\widehat{\vvec}}\) \(\newcommand{\uhat}{\widehat{\uvec}}\) \(\newcommand{\what}{\widehat{\wvec}}\) \(\newcommand{\Sighat}{\widehat{\Sigma}}\) \(\newcommand{\lt}{<}\) \(\newcommand{\gt}{>}\) \(\newcommand{\amp}{&}\) \(\definecolor{fillinmathshade}{gray}{0.9}\)

As discussed in Chapter 1, the machinery of Linear Algebra can be used to solve systems of linear equations involving a ﬁnite number of unknowns. This section is devoted to illustrating how linear maps are one of the most fundamental tools for gaining insight into the solutions to such systems. Moreover, by involving these special functions, we are able to work in an arbitrarily high
number of dimensions with little more work than that required for two dimensions.

A.4.1 The canonical matrix of a linear map

Let \(m, n \in \mathbb{Z}_+\) be positive integers. Then, given a choice of bases for the vector spaces \(\mathbb{F}^n\) and \(\mathbb{F}^m\) , there is a duality between matrices and linear maps. In other words, as discussed in Section 6.6, every linear map in the set \(\cal{L} (\mathbb{F}^n , \mathbb{F}m ) \) uniquely corresponds to exactly one \(m \times n\) matrix in \(\mathbb{F}^{m \times n} \). However, you should not take this to mean that matrices and linear maps are interchangeable or indistinguishable ideas. By itself, a matrix in the set \(\mathbb{F}^{m \times n} \) is nothing more than a collection of mn scalars that have been arranged in a rectangular shape. It is only when a matrix appears as part of some larger context that the theory of linear maps becomes applicable. In particular, one can gain insight into the solutions of matrix equations when the coeﬃcient matrix is viewed as the matrix associated to a linear map under a convenient choice of bases for \(\mathbb{F}^n\) and \(\mathbb{F}^m .\)

Given a positive integer, \(k \in\mathbb{Z}_+\) , one particularly convenient choice of basis for \(\mathbb{F}^k\) is the so-called standard basis (a.k.a. the canonical basis) \(e_1 , e_2 , \ldots , e_k ,\) where each \(e_i\) is the \(k\)-tuple having zeros for each of its component other than in the \(i^{\rm{th}}\) position:

\[e_i = (0, 0, \ldots , 0, 1, 0, \ldots , 0). \\ ~~~~~ ~~~\uparrow \\ ~~~ ~~~~~i \]

Then, taking the vector spaces \(\mathbb{F}^n\) and \(\mathbb{F}^m\) under their canonical bases, we say that the matrix \(A \in \mathbb{F}^{m \times n}\) associated to the linear map \(T \in \cal{L} (\mathbb{F}^n , \mathbb{F}^m )\) is the canonical matrix for \(T.\) One reason for this choice of basis is that it gives us the particularly nice formula

\[T (x) = Ax, \forall x \in \mathbb{F}^n . \tag{A.4.1} \]

In other words, one can compute the action of the linear map upon any vector in \(\mathbb{F}^n\) by simply multiplying the vector by the associated canonical matrix \(A.\) There are many circumstances in which one might wish to use non-standard bases for either \(\mathbb{F}^n\) or \(\mathbb{F}^m\) , but the trade-oﬀ is that Equation (A.4.1) will no longer hold as stated. (To modify Equation (A.4.1) for use with non-standard bases, one needs to use coordinate vectors as described in Chapter 10.)

The utility of Equation (A.4.1) cannot be overly emphasized. To get a sense of this, consider once again the generic matrix equation (Equation (A.1.3))

\[Ax = b,\]

which involves a given matrix \(A = (a_{ij} ) \in \mathbb{F}^{m \times n}\) , a given vector \(b \in \mathbb{F}^m\) , and the \(n\)-tuple of unknowns \(x\). To provide a solution to this equation means to provide a vector \(x \in \mathbb{F}^n\) for which the matrix product \(Ax\) is exactly the vector \(b\). In light of Equation (A.4.1), the question of whether such a vector \(x \in \mathbb{F}^n\) exists is equivalent to asking whether or not the vector \(b\) is in the range of the linear map \(T\) .

While the encoding of System (A.1.1) into Equation (A.1.3) might be considered a matter of mere notational equivocation, the above reinterpretation of Equation (A.1.3) using linear maps is a genuine change of viewpoint. Solving System (12.3) (and thus Equation (12.5)) essentially amounts to understanding how m distinct objects interact in an ambient space having \(n\)-dimensions. (In particular, solutions to System (A.1.1) correspond to the points of intersect of \(m\) hyperplanes in \(\mathbb{F}^n\) .) On the other hand, questions about a linear map genuinely involve understanding a single object, i.e., the linear map itself. Such a point of view is both extremely ﬂexible and extremely fruitful, as we illustrate in the next section.

A.4.2 Using linear maps to solve linear systems

Encoding a linear system as a matrix equation is more than just a notational trick. Perhaps most fundamentally, the resulting linear map viewpoint can then be used to provide unparalleled insight into the exact structure of solutions to the original linear system. (In general, the more that can be said with absolute certainty when solving a problem, the better.) We illustrate this in the following series of revisited examples.

Example A.4.1. Consider the following inhomogeneous linear system from Example 1.2.1:

\[ \left. \begin{array}{ccccc} 2x_1 &+ &x_2& =& 0 \\ x_1& -& x_2 &=& 1 \end{array} \right\}, \]

where \(x_1\) and \(x_2\) are unknown real numbers. To solve this system, we can ﬁrst form the matrix \(A \in \mathbb{R}^{2 \times 2}\) and the column vector \(b \in \mathbb{R}^2 \) such that

\[ A\left[ \begin{array}{c} x_1 \\ x_2 \end{array} \right]= \left[ \begin{array}{cc} 2 & 1 \\ 1 & -1 \end{array} \right] \left[ \begin{array}{c} x_1 \\ x_2 \end{array} \right]=\left[ \begin{array}{c} 0 \\ 1 \end{array} \right]= b, \]

In other words, we have reinterpreted solving the original linear system as asking when the column vector

\[ \left[ \begin{array}{cc} 2 & 1 \\ 1 & -1 \end{array} \right] \left[ \begin{array}{c} x_1 \\ x_2 \end{array} \right]= \left[ \begin{array}{c} 2x_1+x_2 \\ x_1-x_2 \end{array} \right] \]

is equal to the column vector \(b\). Equivalently, this corresponds to asking what input vector results in \(b\) being an element of the range of the linear map \(T : \mathbb{R}^2 \rightarrow \mathbb{R}^2\) deﬁned by

\[ T\left( \left[ \begin{array}{c} x_1 \\ x_2 \end{array} \right]\right )= \left[ \begin{array}{c} 2x_1+x_2 \\ x_1-x_2 \end{array} \right]. \]

More precisely, \(T\) is the linear map having canonical matrix \(A.\)

It should be clear that \(b\) is in the range of \(T\) , since, from Example 1.2.1,

\[ T\left( \left[ \begin{array}{c} 1/3 \\ -2/3 \end{array} \right]\right )= \left[ \begin{array}{c} 0 \\ 1 \end{array} \right]. \]

In addition, note that \(T\) is a a bijective function. (This can be proven, for example, by noting that the canonical matrix \(A\) for \(T\) is invertible.) Since \(T\) is bijective, this means that

\[ x= \left[ \begin{array}{c} x_1 \\ x_2 \end{array} \right]= \left[ \begin{array}{c} 1/3 \\ -2/3 \end{array} \right] \]

is the only possible input vector that can result in the output vector \(b\), and so we have veriﬁed that \(x\) is the unique solution to the original linear system. Moreover, this technique can be trivially generalized to any number of equations.

Example A.4.2. Consider the matrix \(A\) and the column vectors \(x\) and \(b\) from Example A.3.5:

\[ A= \left[ \begin{array}{ccc} 2 & 5 & 3 \\ 1 & 2 & 3 \\ 1 & 0 & 8 \end{array} \right],x= \left[ \begin{array}{c} x_1 \\ x_2 \\ x_3 \end{array} \right], \rm{~and~} b= \left[ \begin{array}{c} 4 \\ 5 \\ 9 \end{array} \right]. \]

Here, asking if the matrix equation \(Ax = b\) has a solution corresponds is equivalent to asking if \(b\) is an element of the range of the linear map \(T : \mathbb{F}^3 \rightarrow \mathbb{F}^3\) deﬁned by

\[ T\left( \left[ \begin{array}{c} x_1 \\ x_2 \\ x_3\end{array} \right]\right )= \left[ \begin{array}{c} 2x_1+5x_2+3x_3 \\ x_1+2x_2 +3x_3\\ 2x_1+8x_3\end{array} \right]. \]

In order to answer this corresponding question regarding the range of \(T\) , we take a closer look at the following expression obtained in Example A.3.5:

\[A = E_0^{-1} E_1^{-1} \cdots E_7^{-1} ,\]

Here, we have factored \(A\) into the product of eight elementary matrices. From the linear map point of view, this means that can apply the results of Section 6.6 in order to obtain the factorization

\[T = S_0 \circ S_1 \circ \cdots \circ S_7 ,\]

where \(S_i\) is the (invertible) linear map having canonical matrix \(E_i^{-1}\) for \(i = 0, \ldots , 7.\)

This factorization of the linear map \(T\) into a composition of invertible linear maps furthermore implies that \(T\) itself is invertible. In particular, \(T\) is surjective, and so b must be an element of the range of \(T\) . Moreover, \(T\) is also injective, and so b has exactly one pre-image. Thus, the solution that was found for \(Ax = b\) in Example A.3.5 is unique.

In the above examples, we used the bijectivity of a linear map in order to prove the uniqueness of solutions to linear systems. As discussed in Section A.3, though, many linearsystems that do not have unique solutions. Instead, there are exactly two other possibilities:
if a linear system does not have a unique solution, then it will either have no solution or it will have inﬁnitely many solutions. Fundamentally, this is because ﬁnding solutions to a linear system is equivalent to describing the pre-image (a.k.a. pullback) of an element in the codomain of a linear map.

In particular, based upon the discussion in Section A.3.2, it should be clear that solving a homogeneous linear system corresponds to describing the null space of some corresponding linear map. In other words, given any matrix \(A \in \mathbb{F}^{m \times n}\) , ﬁnding the solution space \(N\) to the matrix equation \(Ax = 0\) (as deﬁned in Section A.3.2) is the same thing as ﬁnding \(null(T ),\) where \(T \in \cal{L} (\mathbb{F}^n , \mathbb{F}^m )\) is the linear map having canonical matrix \(A\). (Recall from Section 6.2 that \(null(T )\) is a subspace of \(\mathbb{F}^n .)\) Thus, the fact that every homogeneous linear system has the trivial solution then is equivalent to the fact that the image of the zero vector under any linear map always results in the zero vector, and determining whether or not the trivial solution is unique can be viewed as a dimensionality question about the null space of a corresponding linear map.

We close this section by illustrating this, along with the case for inhomogeneous systems, in the following examples.

Example A.4.3. The following examples use the same matrices as in Example A.3.10.

1. Consider the matrix equation \(Ax = b,\) where \(A\) is the matrix given by

\[ A= \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{array} \right] \]

and \(b \in \mathbb{F}^4\) is a column vector. Here, asking if this matrix equation has a solution corresponds to asking if \(b\) is an element of the range of the linear map \(T : \mathbb{F}^3 \rightarrow \mathbb{F}^4\) deﬁned by

\[ T\left( \left[ \begin{array}{c} x_1 \\ x_2 \\ x_3\end{array} \right]\right )= \left[ \begin{array}{c} x_1 \\ x_2 \\ x_3 \\ 0\end{array} \right]. \]

From the linear map point of view, it should be extremely clear that \(Ax = b\) has a solution if and only if the fourth component of \(b\) is zero. In particular, \(T\) is not surjective, so \(Ax = b\) cannot have a solution for every possible choice of \(b\).

However, it should also be clear that \(T\) is injective, from which \(null(T ) = \{0\}.\) Thus, when \(b = 0,\) the homogeneous matrix equation \(Ax = 0\) has only the trivial solution, and so we can apply Theorem A.3.12 in order to verify that \(Ax = b\) has a unique solution for any \(b\) having fourth component equal to zero.

2. Consider the matrix equation \(Ax = b,\) where \(A\) is the matrix given by

\[ A= \left[ \begin{array}{ccc} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{array} \right] \]

and \(b \in \mathbb{F}^4\) is a column vector. Here, asking if this matrix equation has a solution corresponds to asking if b is an element of the range of the linear map \(T : \mathbb{F}^3 \rightarrow \mathbb{F}^4\) defined by

\[ T\left( \left[ \begin{array}{c} x_1 \\ x_2 \\ x_3\end{array} \right]\right )= \left[ \begin{array}{c} x_1+x_3 \\ x_2+x_3 \\ 0 \\ 0\end{array} \right]. \]

From the linear map point of view, it should be extremely clear that \(Ax = b\) has a solution if and only if the third and fourth components of \(b\) are zero. In particular, \(2 = dim(range (T )) < dim(\mathbb{F}^4 ) = 4\) so that \(T\) cannot be surjective, and so \(Ax = b\) cannot have a solution for every possible choice of \(b.\)

In addition, it should also be clear that \(T\) is not injective. E.g.,

\[ T\left( \left[ \begin{array}{c} -1 \\ -1 \\ 1\end{array} \right]\right )= \left[ \begin{array}{c} 0 \\ 0 \\ 0 \\ 0\end{array} \right]. \]

Thus, \(\{0\} \subsetneq null(T ),\) and so the homogeneous matrix equation \(Ax = 0\) will necessarily have inﬁnitely many solution since \(dim(null(T )) > 0.\) Using the Dimension Formula,

\[dim(null(T )) = dim(\mathbb{F}^3 ) - dim(range (T )) = 3 - 2 = 1,\]

and so the solution space for \(Ax = 0\) is a one-dimensional subspace of \(\mathbb{F}^3 \). Moreover, by applying Theorem A.3.12, we see that that \(Ax = b\) must then also have inﬁnitely many solutions for any \(b\) having third and fourth components equal to zero.

3. Consider the matrix equation \(Ax = b,\) where \(A\) is the matrix given by

\[ A= \left[ \begin{array}{ccc} 1 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{array} \right]. \]

and \(b \in \mathbb{F}^3\) is a column vector. Here, asking if this matrix equation has a solution
corresponds to asking if \(b\) is an element of the range of the linear map \(T : \mathbb{F}^3 \rightarrow \mathbb{F}^3\) deﬁned by

\[ T\left( \left[ \begin{array}{c} x_1 \\ x_2 \\ x_3\end{array} \right]\right )= \left[ \begin{array}{c} x_1+x_2+x_3 \\ 0 \\ 0 \end{array} \right]. \]

From the linear map point of view, it should be extremely clear that \(Ax = b\) has a solution if and only if the second and third components of \(b\) are zero. In particular, \(1 = dim(range (T )) < dim(\mathbb{F}^3 ) = 3\) so that \(T\) cannot be surjective, and so \(Ax = b\) cannot have a solution for every possible choice of \(b.\)

In addition, it should also be clear that \(T\) is not injective. E.g.,

\[ T\left( \left[ \begin{array}{c} 1/2 \\ 1/2 \\ -1\end{array} \right]\right )= \left[ \begin{array}{c} 0 \\ 0 \\ 0 \end{array} \right]. \]

Thus, \(\{0\} \subsetneq ( null(T),\) and so the homogeneous matrix equation \(Ax = 0\) will necessarily have inﬁnitely many solution since \(dim(null(T )) > 0.\) Using the Dimension Formula,

\[dim(null(T )) = dim(\mathbb{F}^3 ) - dim(range (T )) = 3 - 1 = 2,\]

and so the solution space for \(Ax = 0\) is a two-dimensional subspace of \(\mathbb{F}^3 .\) Moreover, by applying Theorem A.3.12, we see that that \(Ax = b\) must then also have inﬁnitely many solutions for any \(b\) having second and third components equal to zero.

Contributors

Both hardbound and softbound versions of this textbook are available online at WorldScientific.com.

Search

Text Color

Text Size

Margin Size

Font Type