6.6: The matrix of a linear map


Now we will see that every linear map \(T \in \mathcal{L}(V, W) \), with \(V \) and \(W \) finite-dimensional vector spaces, can be encoded by a matrix, and, vice versa, every matrix defines such a linear map.

Let \(V \) and \(W \) be finite-dimensional vector spaces, and let \(T:V\to W \) be a linear map. Suppose that \((v_1,\ldots,v_n) \) is a basis of \(V \) and that \((w_1,\ldots,w_m) \) is a basis for \(W \). We have seen in Theorem 6.1.3 that \(T \) is uniquely determined by specifying the vectors \(Tv_1,\ldots, Tv_n\in W \). Since \((w_1,\ldots,w_m) \) is a basis of \(W \), there exist unique scalars \(a_{ij}\in\mathbb{F} \) such that
\begin{equation}\label{eq:Tv}
Tv_j = a_{1j} w_1 + \cdots + a_{mj} w_m \quad \text{for \(1\le j\le n \).} \tag{6.6.1}
\end{equation}
We can arrange these scalars in an \(m\times n \) matrix as follows:
\begin{equation*}
M(T) = \begin{bmatrix}
a_{11} & \ldots & a_{1n}\\
\vdots && \vdots\\
a_{m1} & \ldots & a_{mn}
\end{bmatrix}.
\end{equation*}
Often, this is also written as \(A=(a_{ij})_{1\le i\le m,1\le j\le n} \). As in Section A.1.1, the set of all \(m\times n \) matrices with entries in \(\mathbb{F} \) is denoted by \(\mathbb{F}^{m\times n} \).

Remark 6.6.1. It is important to remember that \(M(T) \) depends not only on the linear map \(T \) but also on the choice of basis \((v_1,\ldots,v_n) \) for \(V \) and the choice of basis \((w_1,\ldots,w_m) \) for \(W \). The \(j^{\text{th}} \) column of \(M(T) \) contains the coefficients of \(Tv_j \), the image of the \(j^{\text{th}} \) basis vector \(v_j \), when expanded in terms of the basis \((w_1,\ldots,w_m) \), as in Equation 6.6.1.

Example 6.6.2. Let \(T:\mathbb{R}^2\to \mathbb{R}^2 \) be the linear map given by \(T(x,y)=(ax+by,cx+dy) \) for some \(a,b,c,d\in\mathbb{R} \). Then, with respect to the canonical basis of \(\mathbb{R}^2 \) given by \(((1,0),(0,1)) \), the corresponding matrix is
\begin{equation*}
M(T) = \begin{bmatrix} a&b\\ c&d \end{bmatrix}
\end{equation*}
since \(T(1,0) = (a,c) \) gives the first column and \(T(0,1)=(b,d) \) gives the second column.

More generally, suppose that \(V=\mathbb{F}^n \) and \(W=\mathbb{F}^m \), and denote the standard basis for \(V \) by \((e_1,\ldots,e_n) \) and the standard basis for \(W \) by \((f_1,\ldots,f_m) \). Here, \(e_i \) (resp. \(f_i\)) is the \(n\)-tuple (resp. \(m\)-tuple) with a one in position \(i \) and zeroes everywhere else. Then the matrix \(M(T)=(a_{ij}) \) is given by

\begin{equation*}
a_{ij} = (Te_j)_i,
\end{equation*}
where \((Te_j)_i \) denotes the \(i^{\text{th}} \) component of the vector \(Te_j \).

Example 6.6.3. Let \(T:\mathbb{R}^2\to\mathbb{R}^3 \) be the linear map defined by \(T(x,y)=(y,x+2y,x+y) \). Then, with respect to the standard basis, we have \(T(1,0)=(0,1,1) \) and \(T(0,1)=(1,2,1) \) so that
\begin{equation*}
M(T) = \begin{bmatrix} 0&1\\ 1& 2 \\ 1&1 \end{bmatrix}.
\end{equation*}
However, if alternatively we take the bases \(((1,2),(0,1)) \) for \(\mathbb{R}^2 \) and
\(((1,0,0),(0,1,0),(0,0,1)) \) for \(\mathbb{R}^3 \), then \(T(1,2)=(2,5,3) \) and \(T(0,1)=(1,2,1) \) so that
\begin{equation*}
M(T) = \begin{bmatrix} 2&1\\ 5&2 \\ 3&1 \end{bmatrix}.
\end{equation*}
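
The two matrices in Example 6.6.3 can be reproduced mechanically. The following is a minimal Python sketch rather than part of the original text. The helper name `matrix_wrt_standard_codomain` is hypothetical, and the sketch assumes that the codomain carries the standard basis, so that the coordinates of \(Tv_j \) are simply its components.

```python
def T(x, y):
    """The linear map T(x, y) = (y, x + 2y, x + y) from Example 6.6.3."""
    return (y, x + 2 * y, x + y)


def matrix_wrt_standard_codomain(linear_map, domain_basis):
    """Return M(T) as a list of rows.

    Assumption: the codomain uses the standard basis, so column j is
    just the tuple linear_map(*v_j) for the j-th domain basis vector.
    """
    columns = [linear_map(*v) for v in domain_basis]
    m = len(columns[0])
    # Entry (i, j) is the i-th component of T(v_j).
    return [[col[i] for col in columns] for i in range(m)]


standard = matrix_wrt_standard_codomain(T, [(1, 0), (0, 1)])
alternate = matrix_wrt_standard_codomain(T, [(1, 2), (0, 1)])
print(standard)   # [[0, 1], [1, 2], [1, 1]]
print(alternate)  # [[2, 1], [5, 2], [3, 1]]
```

Changing only the basis of the domain changes the first column, because the first basis vector changed from \((1,0) \) to \((1,2) \), while the second column is unchanged.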

Example 6.6.4. Let \(S:\mathbb{R}^2\to \mathbb{R}^2 \) be the linear map \(S(x,y)=(y,x) \). With respect to the basis \(((1,2),(0,1)) \) for \(\mathbb{R}^2 \), we have
\begin{equation*}
S(1,2) = (2,1) = 2(1,2) -3(0,1) \quad \text{and} \quad
S(0,1) = (1,0) = 1(1,2)-2(0,1),
\end{equation*}
and so
\[ M(S) = \begin{bmatrix} 2&1\\- 3& -2 \end{bmatrix}. \]
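
When the codomain basis is not the standard one, finding each column requires solving a small linear system for the coefficients. The sketch below illustrates this for Example 6.6.4 under stated assumptions: the helper names `coords_2d` and `matrix_in_basis` are hypothetical, the code handles only the \(2\times 2 \) case, and it uses Cramer's rule with exact fractions as one convenient way to solve the system.

```python
from fractions import Fraction


def coords_2d(vector, basis):
    """Coordinates (c1, c2) with vector = c1 * basis[0] + c2 * basis[1].

    Solves the 2x2 linear system by Cramer's rule with exact arithmetic.
    """
    (p, q), (r, s) = basis          # basis[0] = (p, q), basis[1] = (r, s)
    x, y = vector
    det = Fraction(p * s - q * r)   # nonzero exactly for a basis of R^2
    if det == 0:
        raise ValueError("the given vectors do not form a basis")
    c1 = (x * s - y * r) / det
    c2 = (p * y - q * x) / det
    return (c1, c2)


def S(x, y):
    """The linear map S(x, y) = (y, x) from Example 6.6.4."""
    return (y, x)


def matrix_in_basis(linear_map, basis):
    """M(S) with the same basis used for the domain and the codomain."""
    # Column j holds the coordinates of S(v_j) in the given basis.
    columns = [coords_2d(linear_map(*v), basis) for v in basis]
    return [[col[i] for col in columns] for i in range(2)]


M_S = matrix_in_basis(S, [(1, 2), (0, 1)])
```

The entries of `M_S` are `Fraction` objects that compare equal to the integers \(2, 1, -3, -2 \) of the matrix \(M(S) \) above.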

Given vector spaces \(V \) and \(W \) of dimensions \(n \) and \(m \), respectively, and given a fixed choice of bases, note that there is a one-to-one correspondence between linear maps in \(\mathcal{L}(V,W)\) and matrices in \(\mathbb{F}^{m\times n} \). If we start with the linear map \(T \), then the matrix \(M(T)=A=(a_{ij})\) is defined via Equation 6.6.1. Conversely, given the matrix \(A=(a_{ij})\in \mathbb{F}^{m\times n} \), we can define a linear map \(T:V\to W \) by setting

\[ Tv_j = \sum_{i=1}^m a_{ij} w_i \quad \text{for \(1\le j\le n \).} \]

By Theorem 6.1.3, these values on the basis vectors determine \(T \) uniquely.

Recall that the set of linear maps \(\mathcal{L}(V,W) \) is a vector space. Since we have a one-to-one correspondence between linear maps and matrices, we can also make the set of matrices \(\mathbb{F}^{m\times n} \) into a vector space. Given two matrices \(A=(a_{ij}) \) and \(B=(b_{ij}) \) in \(\mathbb{F}^{m\times n} \) and given a scalar \(\alpha\in \mathbb{F} \), we define the matrix addition and scalar multiplication component-wise:

\begin{equation*}
\begin{split}
A+B &= (a_{ij}+b_{ij}),\\
\alpha A &= (\alpha a_{ij}).
\end{split}
\end{equation*}
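
For instance, with entries chosen here purely for illustration,
\begin{equation*}
\begin{bmatrix} 1&2\\ 3&4 \end{bmatrix} + \begin{bmatrix} 0&-1\\ 5&2 \end{bmatrix}
= \begin{bmatrix} 1&1\\ 8&6 \end{bmatrix}
\quad \text{and} \quad
3 \begin{bmatrix} 1&2\\ 3&4 \end{bmatrix}
= \begin{bmatrix} 3&6\\ 9&12 \end{bmatrix}.
\end{equation*}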

Next, we show that the composition of linear maps induces a product on matrices, called matrix multiplication. Suppose \(U,V,W \) are vector spaces over \(\mathbb{F} \) with bases \((u_1,\ldots,u_p) \), \((v_1,\ldots,v_n) \) and \((w_1,\ldots,w_m) \), respectively. Let \(S:U\to V \) and \(T:V\to W \) be linear maps. Then the composition \(T\circ S:U\to W \) is again a linear map.

Each of these linear maps has a corresponding matrix: \(M(T)=A \), \(M(S)=B \) and \(M(TS)=C \), where \(TS \) denotes \(T\circ S \). The question is whether \(C \) is determined by \(A \) and \(B \). We have, for each \(j\in \{1,2,\ldots,p\} \), that

\begin{equation*}
\begin{split}
(T\circ S) u_j &= T(b_{1j}v_1 + \cdots + b_{nj} v_n) = b_{1j} Tv_1 + \cdots + b_{nj} Tv_n\\
&= \sum_{k=1}^n b_{kj} Tv_k
= \sum_{k=1}^n b_{kj} \bigl( \sum_{i=1}^m a_{ik} w_i \bigr)\\
&= \sum_{i=1}^m \bigl(\sum_{k=1}^n a_{ik} b_{kj} \bigr) w_i.
\end{split}
\end{equation*}

Hence, the matrix \(C=(c_{ij}) \) is given by
\begin{equation} \label{eq:c}
c_{ij} = \sum_{k=1}^n a_{ik} b_{kj}. \tag{6.6.2}
\end{equation}

Equation 6.6.2 can be used to define the \(m\times p \) matrix \(C\) as the product of an \(m\times n \) matrix \(A\) and an \(n\times p \) matrix \(B \), i.e.,
\begin{equation}
C = AB. \tag{6.6.3}
\end{equation}

Our derivation implies that the correspondence between linear maps and matrices respects the product structure.

Proposition 6.6.5. Let \(S:U\to V \) and \(T:V\to W \) be linear maps. Then

\[ M(TS) = M(T)M(S).\]

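Example 6.6.6. With notation as in Examples 6.6.3 and 6.6.4, using the basis \(((1,2),(0,1)) \) for \(\mathbb{R}^2 \) and the standard basis for \(\mathbb{R}^3 \), you should be able to verify that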
\begin{equation*}
M(TS) = M(T)M(S) = \begin{bmatrix} 2&1\\ 5&2 \\ 3&1 \end{bmatrix}
\begin{bmatrix} 2&1\\- 3& -2 \end{bmatrix}
= \begin{bmatrix} 1&0\\ 4&1\\ 3&1 \end{bmatrix}.
\end{equation*}
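
The verification in Example 6.6.6 can also be carried out numerically. The following Python sketch implements Equation 6.6.2 directly; the function name `mat_mul` is hypothetical, and matrices are assumed to be stored as lists of rows. As an independent check, it also computes the columns of \(M(TS) \) by applying \(T\circ S \) to the basis vectors \((1,2) \) and \((0,1) \), which works here because the codomain \(\mathbb{R}^3 \) carries the standard basis.

```python
def mat_mul(A, B):
    """Product C = AB with c_ij = sum over k of a_ik * b_kj (Eq. 6.6.2)."""
    m, n, p = len(A), len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


def T(x, y):
    """T(x, y) = (y, x + 2y, x + y) from Example 6.6.3."""
    return (y, x + 2 * y, x + y)


def S(x, y):
    """S(x, y) = (y, x) from Example 6.6.4."""
    return (y, x)


M_T = [[2, 1], [5, 2], [3, 1]]   # basis ((1,2),(0,1)) -> standard basis
M_S = [[2, 1], [-3, -2]]         # basis ((1,2),(0,1)) on both sides
M_TS = mat_mul(M_T, M_S)
print(M_TS)  # [[1, 0], [4, 1], [3, 1]]

# Independent check: column j of M(TS) is (T o S)(v_j) in standard coordinates.
direct_columns = [T(*S(*v)) for v in [(1, 2), (0, 1)]]
print(direct_columns)  # [(1, 4, 3), (0, 1, 1)]
```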

Given a vector \(v\in V \), we can also associate a matrix \(M(v) \) to \(v \) as follows. Let \((v_1,\ldots,v_n) \) be a basis of \(V \). Then there are unique scalars \(b_1,\ldots,b_n\) such that

\[ v= b_1 v_1 + \cdots + b_n v_n. \]

The matrix of \(v \) is then defined to be the \(n\times 1 \) matrix

\[ M(v) = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}. \]

Example 6.6.7. The matrix of a vector \(x=(x_1,\ldots,x_n) \in \mathbb{F}^n \) in the standard basis \((e_1,\ldots,e_n)\) is the column vector or \(n \times 1 \) matrix
\begin{equation*}
M(x) = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}
\end{equation*}
since \(x=(x_1,\ldots,x_n) = x_1 e_1 + \cdots + x_n e_n \).

The next result shows how the notion of a matrix of a linear map \(T:V\to W \) and the matrix of a vector \(v\in V \) fit together.

Proposition 6.6.8. Let \(T:V\to W \) be a linear map. Then, for every \(v\in V \),
\begin{equation*}
M(Tv) = M(T) M(v).
\end{equation*}

Proof.

Let \((v_1,\ldots,v_n) \) be a basis of \(V \) and \((w_1,\ldots,w_m) \) be a basis for \(W \). Suppose that, with respect to these bases, the matrix of \(T \) is \(M(T)=(a_{ij})_{1\le i\le m, 1\le j\le n} \). This means that, for all \(j\in \{1,2,\ldots,n\} \),

\begin{equation*}
Tv_j = \sum_{k=1}^m a_{kj} w_k.
\end{equation*}

The vector \(v\in V \) can be written uniquely as a linear combination of the basis vectors as

\[ v = b_1 v_1 + \cdots + b_n v_n. \]

Hence,

\begin{equation*}
\begin{split}
Tv &= b_1 T v_1 + \cdots + b_n T v_n\\
&= b_1 \sum_{k=1}^m a_{k1} w_k + \cdots + b_n \sum_{k=1}^m a_{kn} w_k\\
&= \sum_{k=1}^m (a_{k1} b_1 + \cdots + a_{kn} b_n) w_k.
\end{split}
\end{equation*}

This shows that \(M(Tv) \) is the \(m\times 1 \) matrix

\begin{equation*}
M(Tv) = \begin{bmatrix} a_{11}b_1 + \cdots + a_{1n} b_n \\ \vdots \\
a_{m1}b_1 + \cdots + a_{mn} b_n \end{bmatrix}.
\end{equation*}

It is not hard to check, using the formula for matrix multiplication, that \(M(T)M(v)\) gives the same result.

Example 6.6.9. Take the linear map \(S \) from Example 6.6.4 with basis \(((1,2),(0,1)) \) of \(\mathbb{R}^2 \). To determine the action on the vector \(v=(1,4)\in \mathbb{R}^2 \), note that \(v=(1,4)=1(1,2)+2(0,1) \). Hence,
\begin{equation*}
M(Sv) = M(S)M(v) = \begin{bmatrix} 2&1\\-3&-2 \end{bmatrix}
\begin{bmatrix} 1\\2 \end{bmatrix}
= \begin{bmatrix} 4\\ -7 \end{bmatrix}.
\end{equation*}

This means that

\[ Sv= 4(1,2)-7(0,1)=(4,1), \]

which is indeed true.
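
The same check can be sketched in Python. The helper names `mat_vec` and `from_coords` are hypothetical; the code multiplies \(M(S) \) by \(M(v) \), converts the resulting coordinates back to a vector of \(\mathbb{R}^2 \), and compares the result with \(S \) applied directly to \(v \).

```python
def mat_vec(A, b):
    """Product of a matrix A (list of rows) with a column vector b (list)."""
    return [sum(a * x for a, x in zip(row, b)) for row in A]


def from_coords(coords, basis):
    """Rebuild the vector coords[0] * basis[0] + ... + coords[-1] * basis[-1]."""
    dim = len(basis[0])
    return tuple(sum(c * v[i] for c, v in zip(coords, basis))
                 for i in range(dim))


def S(x, y):
    """S(x, y) = (y, x) from Example 6.6.4."""
    return (y, x)


basis = [(1, 2), (0, 1)]
M_S = [[2, 1], [-3, -2]]
M_v = [1, 2]                      # v = 1*(1,2) + 2*(0,1) = (1,4)

M_Sv = mat_vec(M_S, M_v)
Sv = from_coords(M_Sv, basis)
print(M_Sv)  # [4, -7]
print(Sv)    # (4, 1)
```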

Contributors

Both hardbound and softbound versions of this textbook are available online at WorldScientific.com.
