
12.2: Matrix arithmetic


    In this section, we examine algebraic properties of the set \(\mathbb{F}^{m \times n}\) (where \(m, n \in \mathbb{Z}_+\)). Specifically, \(\mathbb{F}^{m \times n}\) forms a vector space under the operations of component-wise addition and scalar multiplication, and it is isomorphic to \(\mathbb{F}^{mn}\) as a vector space.

    We also define a multiplication operation between matrices of compatible size and show that this multiplication operation interacts with the vector space structure on \(\mathbb{F}^{m \times n}\) in a natural way. In particular, \(\mathbb{F}^{n \times n}\) forms an algebra over \(\mathbb{F}\) with respect to these operations. (See Section C.3 for the definition of an algebra.)

    A.2.1 Addition and scalar multiplication

Let \(A = (a_{ij})\) and \(B = (b_{ij})\) be \(m \times n\) matrices over \(\mathbb{F}\) (where \(m, n \in \mathbb{Z}_+\)), and let \(\alpha \in \mathbb{F}\). Then matrix addition \(A + B = ((a+b)_{ij})_{m \times n}\) and scalar multiplication \(\alpha A = ((\alpha a)_{ij})_{m \times n}\) are both defined component-wise, meaning

    \[(a + b)_{ij} = a_{ij} + b_{ij} \rm{~and~} (\alpha a)_{ij} = \alpha a_{ij}.\]

    Equivalently, \(A + B\) is the \(m \times n\) matrix given by

    \[ \left[ \begin{array}{ccc} a_{1 1} & \cdots & a_{1 n} \\ \vdots & \ddots & \vdots \\ a_{m 1} & \cdots & a_{m n} \end{array} \right] + \left[ \begin{array}{ccc} b_{1 1} & \cdots & b_{1 n} \\ \vdots & \ddots & \vdots \\ b_{m 1} & \cdots & b_{m n} \end{array} \right] = \left[ \begin{array}{ccc} a_{1 1}+b_{1 1} & \cdots & a_{1 n}+b_{1 n} \\ \vdots & \ddots & \vdots \\ a_{m 1}+b_{m 1} & \cdots & a_{m n}+b_{m n} \end{array} \right],\]

    and \(\alpha A\) is the \(m \times n\) matrix given by

    \[ \alpha \left[ \begin{array}{ccc} a_{1 1} & \cdots & a_{1 n} \\ \vdots & \ddots & \vdots \\ a_{m 1} & \cdots & a_{m n} \end{array} \right]= \left[ \begin{array}{ccc} \alpha a_{1 1} & \cdots & \alpha a_{1 n} \\ \vdots & \ddots & \vdots \\ \alpha a_{m 1} & \cdots & \alpha a_{m n} \end{array} \right].\]

    Example A.2.1. With notation as in Example A.1.3,

    \[ D+E= \left[ \begin{array}{ccc} 7 & 6 & 5 \\ -2 & 1 & 3 \\ 7 & 3 & 7 \end{array} \right],\]

    and no two other matrices from Example A.1.3 can be added since their sizes are not compatible. Similarly, we can make calculations like

    \[ D-E=D+(-1)E= \left[ \begin{array}{ccc} -5 & 4 & -1 \\ 0 & -1 & -1 \\ -1 & 1 & 1 \end{array} \right] \rm{~and~} 0D=0E=\left[ \begin{array}{ccc} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{array} \right]=0_{3\times3}. \]
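    Although the text carries out all such computations by hand, the component-wise definitions translate directly into a few lines of code. The following Python sketch is purely illustrative (the function names are ours, and the entries of \(D\) and \(E\) are reconstructed from the sums displayed above):

```python
# A minimal sketch: an m x n matrix stored as a list of m rows.
def mat_add(A, B):
    """Component-wise sum: (A + B)_{ij} = a_{ij} + b_{ij}."""
    return [[a + b for a, b in zip(rowA, rowB)] for rowA, rowB in zip(A, B)]

def mat_scale(alpha, A):
    """Component-wise scalar multiple: (alpha A)_{ij} = alpha * a_{ij}."""
    return [[alpha * a for a in row] for row in A]

D = [[1, 5, 2], [-1, 0, 1], [3, 2, 4]]
E = [[6, 1, 3], [-1, 1, 2], [4, 1, 3]]
print(mat_add(D, E))                 # [[7, 6, 5], [-2, 1, 3], [7, 3, 7]]
print(mat_add(D, mat_scale(-1, E)))  # D - E, matching the display above
```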

It is important to note that, while these are not the only ways of defining addition and scalar multiplication operations on \(\mathbb{F}^{m \times n}\), the above operations have the advantage of endowing \(\mathbb{F}^{m \times n}\) with a reasonably natural vector space structure. As a vector space, \(\mathbb{F}^{m \times n}\) is seen to have dimension \(mn\) since we can build the standard basis matrices

    \[E_{1 1} , E_{1 2} , \ldots, E_{1 n} , E_{2 1}, E_{2 2} ,\ldots, E_{2 n} , \ldots, E_{m 1}, E_{m 2} , \ldots, E_{m n}\]

    by analogy to the standard basis for \(\mathbb{F}^{mn}\) . That is, each \(E_{kl} = ((e^{(k,l)} )_{ij} )\) satisfies

    \[ (e^{(k,l)} )_{ij} = \left\{ \begin{array}{cc} 1, & \rm{~if~} i = k \rm{~and~} j = l \\ 0, & \rm{~otherwise~} \end{array} \right. . \]

    This allows us to build a vector space isomorphism \(\mathbb{F}^{m \times n} \rightarrow \mathbb{F}^{mn}\) using a bijection that simply “lays each matrix out flat”. In other words, given \(A = (a_{ij} ) \in \mathbb{F}^{m \times n} ,\)

    \[ \left[ \begin{array}{ccc} a_{1 1} & \cdots & a_{1 n} \\ \vdots & \ddots & \vdots \\ a_{m 1} & \cdots & a_{m n} \end{array} \right] \mapsto (a_{1 1} , a_{1 2} , \ldots, a_{1 n} , a_{2 1}, a_{2 2} ,\ldots, a_{2 n} , \ldots, a_{m 1}, a_{m 2} , \ldots, a_{m n}) \in \mathbb{F}^{mn} .\]
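    The following sketch (ours, not from the text) makes this bijection concrete for matrices stored row by row; `flatten` is the map displayed above, and `unflatten` is its inverse:

```python
def flatten(A):
    """Lay an m x n matrix out flat, row by row, as a vector in F^{mn}."""
    return [a for row in A for a in row]

def unflatten(v, m, n):
    """Inverse map: rebuild the m x n matrix from a vector of length mn."""
    return [v[i * n:(i + 1) * n] for i in range(m)]

A = [[1, 2, 3], [4, 5, 6]]
assert flatten(A) == [1, 2, 3, 4, 5, 6]
assert unflatten(flatten(A), 2, 3) == A   # a bijection, as claimed
```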

    Example A.2.2. The vector space \(\mathbb{R}^{2 \times 3}\) of \(2 \times 3\) matrices over \(\mathbb{R}\) has standard basis

    \[ \left\{ E_{1 1} = \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 0 & 0 \end{array} \right] , E_{1 2}= \left[ \begin{array}{ccc} 0 & 1 & 0 \\ 0 & 0 & 0 \end{array} \right], E_{1 3} = \left[ \begin{array}{ccc} 0 & 0 & 1 \\ 0 & 0 & 0 \end{array} \right], E_{2 1} = \left[ \begin{array}{ccc} 0 & 0 & 0 \\ 1 & 0 & 0 \end{array} \right], E_{2 2}= \left[ \begin{array}{ccc} 0 & 0 & 0 \\ 0 & 1 & 0 \end{array} \right], E_{2 3}= \left[ \begin{array}{ccc} 0 & 0 & 0 \\ 0 & 0 & 1 \end{array} \right] \right\}, \]

which is seen to naturally correspond with the standard basis \(\{e_1 , \ldots, e_6 \}\) for \(\mathbb{R}^6\), where

    \[ e_1 = (1, 0, 0, 0, 0, 0), e_2 = (0, 1, 0, 0, 0, 0), \ldots , e_6 = (0, 0, 0, 0, 0, 1). \]

Of course, it is not enough to just assert that \(\mathbb{F}^{m \times n}\) is a vector space since we have yet to verify that the operations of addition and scalar multiplication defined above satisfy the vector space axioms. The proof of the following theorem is straightforward and something that you should work through for practice with matrix notation.

    Theorem A.2.3. Given positive integers \(m, n \in \mathbb{Z}_+\) and the operations of matrix addition and scalar multiplication as defined above, the set \(\mathbb{F}^{m \times n}\) of all \(m \times n\) matrices satisfies each of the following properties.

    1. (associativity of matrix addition) Given any three matrices \(A, B, C \in \mathbb{F}^{m \times n} ,\) \[ (A + B) + C = A + (B + C).\]
    2. (additive identity for matrix addition) Given any matrix \(A \in \mathbb{F}^{m \times n},\) \[A + 0_{m \times n} = 0_{m \times n} + A = A.\]
3. (additive inverses for matrix addition) Given any matrix \(A \in \mathbb{F}^{m \times n}\), there exists a matrix \(-A \in \mathbb{F}^{m \times n}\) such that \[A + (-A) = (-A) + A = 0_{m \times n}.\]
    4. (commutativity of matrix addition) Given any two matrices \(A, B \in \mathbb{F}^{m \times n} ,\) \[A + B = B + A.\]
    5. (associativity of scalar multiplication) Given any matrix \(A \in \mathbb{F}^{m \times n}\) and any two scalars \(\alpha,\beta \in \mathbb{F},\) \[ (\alpha \beta)A = \alpha(\beta A).\]
    6. (multiplicative identity for scalar multiplication) Given any matrix \(A \in \mathbb{F}^{m \times n}\) and denoting by \(1\) the multiplicative identity of \(\mathbb{F},\) \[1A = A.\]
    7. (distributivity of scalar multiplication) Given any two matrices \(A, B \in \mathbb{F}^{m \times n}\) and any two scalars \(\alpha , \beta \in \mathbb{F},\) \[ (\alpha + \beta)A = \alpha A + \beta A \rm{~and~} \alpha (A + B) = \alpha A + \alpha B.\]

    In other words, \( \mathbb{F}^{m \times n}\) forms a vector space under the operations of matrix addition and scalar multiplication.
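    While no finite computation can substitute for the proof, it can be instructive to spot-check a few of these properties on particular matrices. A brief sketch (ours, purely illustrative):

```python
def mat_add(A, B):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(A, B)]

def mat_scale(alpha, A):
    return [[alpha * a for a in row] for row in A]

A = [[1, 2], [3, 4]]; B = [[0, -1], [5, 2]]; C = [[2, 2], [-3, 1]]
assert mat_add(mat_add(A, B), C) == mat_add(A, mat_add(B, C))            # Property 1
assert mat_add(A, B) == mat_add(B, A)                                    # Property 4
assert mat_scale(2 + 3, A) == mat_add(mat_scale(2, A), mat_scale(3, A))  # Property 7
```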

    As a consequence of Theorem A.2.3, every property that holds for an arbitrary vector space can be taken as a property of \(\mathbb{F}^{m \times n}\) specifically. We highlight some of these properties in the following corollary to Theorem A.2.3.

    Corollary A.2.4. Given positive integers \(m, n \in \mathbb{Z}_+\) and the operations of matrix addition and scalar multiplication as defined above, the set \(\mathbb{F}^{m \times n}\) of all \(m \times n\) matrices satisfies each of the following properties:

1. Given any matrix \(A \in \mathbb{F}^{m \times n}\), given any scalar \(\alpha \in \mathbb{F}\), and denoting by \(0\) the additive identity of \(\mathbb{F},\) \[ 0A = 0_{m \times n} \rm{~and~} \alpha 0_{m \times n} = 0_{m \times n}.\]
2. Given any matrix \(A \in \mathbb{F}^{m \times n}\) and any scalar \(\alpha \in \mathbb{F},\) \[ \alpha A = 0_{m \times n} \Longrightarrow \rm{~either~} \alpha = 0 \rm{~or~} A = 0_{m \times n}.\]
3. Given any matrix \(A \in \mathbb{F}^{m \times n}\) and any scalar \(\alpha \in \mathbb{F},\) \[ -(\alpha A) = (-\alpha)A = \alpha(-A).\]

In particular, the additive inverse \(-A\) of \(A\) is given by \(-A = (-1)A,\) where \(-1\) denotes the additive inverse of the multiplicative identity of \(\mathbb{F}.\)

While one could prove Corollary A.2.4 directly from the definitions, the point of recognizing \(\mathbb{F}^{m \times n}\) as a vector space is that you get to use these results without worrying about their proofs. Moreover, there is no need for a separate proof for each of \(\mathbb{R}^{m \times n}\) and \(\mathbb{C}^{m \times n}\).

A.2.2 Multiplying matrices

    Let \(r, s, t \in \mathbb{Z}_+\) be positive integers, \(A = (a_{ij} ) \in \mathbb{F}^{r \times s}\) be an \(r \times s\) matrix, and \(B = (b_{ij} ) \in \mathbb{F}^{s \times t}\) be an \(s \times t\) matrix. Then matrix multiplication \(AB = ((ab)_{ij} )_{r \times t}\) is defined by

    \[(ab)_{ij} = \sum_{k=1}^{s} a_{ik} b_{kj}.\]

In particular, note that the “\(i, j\) entry” of the matrix product \(AB\) involves a summation over the index \(k = 1, \ldots, s\), where \(s\) is both the number of columns in \(A\) and the number of rows in \(B\). Thus, this multiplication is only defined when the “middle” dimension of each matrix is the same:

    \[(a_{ij} )_{r \times s} (b_{ij} )_{s \times t} =r\left\{ \underset{s}{\underbrace{\left[ \begin{array}{ccc} a_{1 1} & \cdots & a_{1 s} \\ \vdots & \ddots & \vdots \\ a_{r 1} & \cdots & a_{r s} \end{array} \right]}} \underset{t}{\underbrace{\left[ \begin{array}{ccc} b_{1 1} & \cdots & b_{1 t} \\ \vdots & \ddots & \vdots \\ b_{s 1} & \cdots & b_{s t} \end{array} \right]}} \right\} s \\ = \underset{t}{\underbrace{\left.\left[ \begin{array}{ccc} \sum_{k=1}^{s}a_{1k}b_{k1} & \cdots & \sum_{k=1}^{s}a_{1k}b_{kt} \\ \vdots & \ddots & \vdots \\ \sum_{k=1}^{s}a_{rk}b_{k1} & \cdots & \sum_{k=1}^{s}a_{rk}b_{kt} \end{array} \right] \right\}}} r\]
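    The defining formula for \((ab)_{ij}\) transcribes directly into code. The following Python sketch (ours, purely illustrative) computes the product by the summation above and checks the required size compatibility:

```python
def mat_mul(A, B):
    """Product of an r x s matrix A and an s x t matrix B."""
    r, s, t = len(A), len(B), len(B[0])
    assert all(len(row) == s for row in A), "middle dimensions must agree"
    return [[sum(A[i][k] * B[k][j] for k in range(s)) for j in range(t)]
            for i in range(r)]

B = [[4, -1], [0, 2]]
print(mat_mul(B, B))  # [[16, -6], [0, 4]], as in Example A.2.5 below
```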

Alternatively, if we let \(n \in \mathbb{Z}_+\) be a positive integer, then another way of viewing matrix multiplication is through the use of the standard inner product on \(\mathbb{F}^n = \mathbb{F}^{1 \times n} = \mathbb{F}^{n \times 1}\). In particular, we define the dot product (a.k.a. Euclidean inner product) of the row vector \(x = (x_{1j}) \in \mathbb{F}^{1 \times n}\) and the column vector \(y = (y_{i1}) \in \mathbb{F}^{n \times 1}\) to be

    \[ x \cdot y = \left[ \begin{array}{ccc} x_{1 1}, & \cdots, & x_{1 n} \end{array} \right] \cdot \left[ \begin{array}{c} y_{1 1}\\ \vdots \\ y_{n 1} \end{array} \right]=\sum_{k=1}^{n}x_{1 k}y_{k 1}
    \in \mathbb{F}. \]

    We can then decompose matrices \(A = (a_{ij} )_{r \times s}\) and \(B = (b_{ij} )_{s \times t}\) into their constituent row vectors by fixing a positive integer \(k \in \mathbb{Z}_+\) and setting

    \[A^{(k,\cdot)} = \left[ \begin{array}{ccc} a_{k 1}, & \cdots, & a_{k s} \end{array} \right] \in \mathbb{F}^{1 \times s} \rm{~and~} B^{(k,\cdot)} = \left[ \begin{array}{ccc} b_{k 1}, & \cdots, & b_{k t} \end{array} \right] \in \mathbb{F}^{1 \times t}.\]

    Similarly, fixing \( l \in \mathbb{Z}_+ \), we can also decompose \(A\) and \(B\) into the column vectors

    \[A^{(\cdot, l)} = \left[ \begin{array}{c} a_{1 l} \\ \vdots \\ a_{r l} \end{array} \right] \in \mathbb{F}^{r \times 1} \rm{~and~} B^{(\cdot, l)} = \left[ \begin{array}{c} b_{1 l} \\ \vdots \\ b_{s l} \end{array} \right] \in \mathbb{F}^{s \times 1}.\]

    It follows that the product \(AB\) is the following matrix of dot products:

    \[AB= \left[ \begin{array}{ccc} A^{(1,\cdot)}\cdot B^{(\cdot,1)} & \cdots & A^{(1,\cdot)}\cdot B^{(\cdot,t)} \\ \vdots & \ddots & \vdots \\ A^{(r,\cdot)}\cdot B^{(\cdot,1)} & \cdots & A^{(r,\cdot)}\cdot B^{(\cdot,t)}\end{array} \right] \in \mathbb{F}^{r \times t}.\]
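    Equivalently, in code: the “\(i, j\) entry” of \(AB\) is the dot product of the \(i\)th row of \(A\) with the \(j\)th column of \(B\). A sketch (ours, purely illustrative):

```python
def dot(x, y):
    """Euclidean inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))

def mat_mul_dots(A, B):
    """Product AB assembled from dot products of rows of A with columns of B."""
    cols = list(zip(*B))  # the columns B^{(., 1)}, ..., B^{(., t)} of B
    return [[dot(row, col) for col in cols] for row in A]
```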

    Example A.2.5. With notation as in Example A.1.3, you should sit down and use the above definitions in order to verify that the following matrix products hold.

    \[ AC= \left[ \begin{array}{c} 3 \\ -1 \\ 1 \end{array} \right] \left[ \begin{array}{ccc} 1, & 4, & 2 \end{array} \right] = \left[ \begin{array}{ccc} 3 & 12 & 6 \\ -1 & -4 & -2 \\ 1 & 4 & 2 \end{array} \right] \in \mathbb{F}^{3 \times 3}, \]

\[ CA = \left[ \begin{array}{ccc} 1, & 4, & 2 \end{array} \right] \left[ \begin{array}{c} 3 \\ -1 \\ 1 \end{array} \right] = 3-4+2=1 \in \mathbb{F},\]

    \[ B^2=BB= \left[ \begin{array}{cc} 4 & -1 \\ 0 & 2 \end{array} \right] \left[ \begin{array}{cc} 4 & -1 \\ 0 & 2 \end{array} \right] = \left[ \begin{array}{cc} 16 & -6 \\ 0 & 4 \end{array} \right] \in \mathbb{F}^{2 \times 2}, \]

    \[ CE= \left[ \begin{array}{ccc} 1, & 4, & 2 \end{array} \right] \left[ \begin{array}{ccc} 6 & 1 & 3 \\ -1 & 1 & 2 \\ 4 & 1 & 3 \end{array} \right] = \left[ \begin{array}{ccc} 10, & 7, & 17 \end{array} \right] \in \mathbb{F}^{1 \times 3}, \rm{~and~} \]

    \[ DA = \left[ \begin{array}{ccc} 1 & 5 & 2 \\ -1 & 0 & 1 \\ 3 & 2 & 4 \end{array} \right] \left[ \begin{array}{c} 3 \\ -1 \\ 1 \end{array} \right] = \left[ \begin{array}{c} 0 \\ -2 \\ 11 \end{array} \right] \in \mathbb{F}^{3 \times 1}.\]

    Note, though, that \(B\) cannot be multiplied by any of the other matrices, nor does it make sense to try to form the products \(AD, AE, DC, \rm{~and~} EC\) due to the inherent size mismatches.

    As illustrated in Example A.2.5 above, matrix multiplication is not a commutative operation (since, e.g., \(AC \in \mathbb{F}^{3 \times 3} \rm{~while~} CA \in \mathbb{F}^{1 \times 1}\) ). Nonetheless, despite the complexity of its definition, the matrix product otherwise satisfies many familiar properties of a multiplication operation. We summarize the most basic of these properties in the following theorem.

    Theorem A.2.6. Let \(r, s, t, u \in \mathbb{Z}_+\) be positive integers.

1. (associativity of matrix multiplication) Given \(A \in \mathbb{F}^{r \times s} , B \in \mathbb{F}^{s \times t} \), and \(C \in \mathbb{F}^{t \times u},\) \[A(BC) = (AB)C.\]
    2. (distributivity of matrix multiplication) Given \(A \in \mathbb{F}^{r \times s} , B, C \in \mathbb{F}^{s \times t} \), and \(D \in \mathbb{F}^{t \times u} , \) \[A(B + C) = AB + AC \rm{~and ~} (B + C)D = BD + CD.\]
    3. (compatibility with scalar multiplication) Given \(A \in \mathbb{F}^{r \times s} , B \in \mathbb{F}^{s \times t} \), and \(\alpha \in \mathbb{F},\) \[ \alpha (AB) = (\alpha A)B = A(\alpha B).\]

Moreover, given any positive integer \(n \in \mathbb{Z}_+\), \(\mathbb{F}^{n \times n}\) is an algebra over \(\mathbb{F}.\)

    As with Theorem A.2.3, you should work through a proof of each part of Theorem A.2.6 (and especially of the first part) in order to practice manipulating the indices of entries correctly. We state and prove a useful followup to Theorems A.2.3 and A.2.6 as an illustration.

    Theorem A.2.7. Let \(A, B \in \mathbb{F}^{n \times n}\) be upper triangular matrices and \(c \in \mathbb{F}\) be any scalar.

Then each of the following properties holds:

    1. \(cA\) is upper triangular.
    2. \(A + B\) is upper triangular.
    3. \(AB\) is upper triangular.

In other words, the set of all \(n \times n\) upper triangular matrices forms an algebra over \(\mathbb{F}.\)

    Moreover, each of the above statements still holds when upper triangular is replaced by
    lower triangular.

Proof. The proofs of Parts 1 and 2 are straightforward and follow directly from the appropriate definitions. Moreover, the case of lower triangular matrices follows from the fact that a matrix \(A\) is upper triangular if and only if \(A^T\) is lower triangular, where \(A^T\) denotes the transpose of \(A\). (See Section A.5.1 for the definition of transpose.)

To prove Part 3, we start from the definition of the matrix product. Denoting \(A = (a_{ij})\) and \(B = (b_{ij}),\) note that \(AB = ((ab)_{ij})\) is an \(n \times n\) matrix having “\(i, j\) entry” given by

    \[(ab)_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}.\]

Since \(A\) and \(B\) are upper triangular, we have that \(a_{ik} = 0\) when \(i > k\) and that \(b_{kj} = 0\) when \(k > j.\) Thus, to obtain a non-zero summand \(a_{ik} b_{kj} \neq 0,\) we must have both \(a_{ik} \neq 0,\) which implies that \(i \leq k,\) and \(b_{kj} \neq 0,\) which implies that \(k \leq j.\) In particular, these two conditions can be satisfied simultaneously only when \(i \leq j.\) Therefore, \((ab)_{ij} = 0\) when \(i > j,\) from which it follows that \(AB\) is upper triangular.
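    Again, a computation on particular matrices illustrates (but does not prove) Part 3. A brief sketch (ours):

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_upper_triangular(A):
    # All entries strictly below the diagonal (i > j) are zero.
    return all(A[i][j] == 0 for i in range(len(A)) for j in range(i))

A = [[1, 2, 3], [0, 4, 5], [0, 0, 6]]
B = [[7, 8, 9], [0, 1, 2], [0, 0, 3]]
assert is_upper_triangular(mat_mul(A, B))
```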

    At the same time, you should be careful not to blithely perform operations on matrices as you would with numbers. The fact that matrix multiplication is not a commutative operation should make it clear that significantly more care is required with matrix arithmetic. As another example, given a positive integer \(n \in \mathbb{Z}_+\) , the set \(\mathbb{F}^{n \times n}\) has what are called zero divisors. That is, there exist non-zero matrices \(A, B \in \mathbb{F}^{n \times n}\) such that \(AB = 0_{n \times n}:\)

    \[\left[ \begin{array}{cc} 0 & 1 \\ 0 & 0 \end{array} \right]^2= \left[ \begin{array}{cc} 0 & 1 \\ 0 & 0 \end{array} \right] \left[ \begin{array}{cc} 0 & 1 \\ 0 & 0 \end{array} \right]= \left[ \begin{array}{cc} 0 & 0 \\ 0 & 0 \end{array} \right]= 0_{2 \times 2}.\]

    Moreover, note that there exist matrices \(A, B, C \in \mathbb{F}^{n \times n}\) such that \(AB = AC\) but \(B \neq C\):

    \[\left[ \begin{array}{cc} 0 & 1 \\ 0 & 0 \end{array} \right] \left[ \begin{array}{cc} 1 & 0 \\ 0 & 0 \end{array} \right]= 0_{2 \times 2}=\left[ \begin{array}{cc} 0 & 1 \\ 0 & 0 \end{array} \right]\left[ \begin{array}{cc} 0 & 1 \\ 0 & 0 \end{array} \right].\]

As a result, we say that the set \(\mathbb{F}^{n \times n}\) fails to have the so-called cancellation property. This failure is a direct result of the fact that there are non-zero matrices in \(\mathbb{F}^{n \times n}\) that have no multiplicative inverse. We discuss matrix invertibility at length in the next section and define a special subset \(GL(n, \mathbb{F}) \subset \mathbb{F}^{n \times n}\) upon which the cancellation property does hold.
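    Both failures are easy to confirm computationally. The following sketch (ours) reproduces the two displays above:

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

N = [[0, 1], [0, 0]]
assert mat_mul(N, N) == [[0, 0], [0, 0]]          # a non-zero zero divisor
B, C = [[1, 0], [0, 0]], [[0, 1], [0, 0]]
assert mat_mul(N, B) == mat_mul(N, C) and B != C  # cancellation fails
```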

    A.2.3 Invertibility of square matrices

In this section, we explore the notion of a multiplicative inverse for matrices. More specifically, we characterize those square matrices for which multiplicative inverses exist.

    Definition A.2.8. Given a positive integer \(n \in \mathbb{Z}_+ \), we say that the square matrix \(A \in \mathbb{F}^{n \times n}\) is invertible (a.k.a. nonsingular) if there exists a square matrix \(B \in \mathbb{F}^{n \times n}\) such that

    \[AB = BA = I_{n} .\]

    We use \(GL(n, \mathbb{F})\) to denote the set of all invertible \(n \times n\) matrices having entries from \(\mathbb{F}\).

One can prove that, if the multiplicative inverse of a matrix exists, then the inverse is unique. As such, we will usually denote the so-called inverse matrix of \(A \in GL(n, \mathbb{F})\) by \(A^{-1}\). Even though this notation is analogous to the notation for the multiplicative inverse of a scalar, you should not take this to mean that it is possible to “divide” by a matrix. Moreover, note that the zero matrix \(0_{n \times n} \not\in GL(n, \mathbb{F})\). This means that \(GL(n, \mathbb{F})\) is not a vector subspace of \(\mathbb{F}^{n \times n}\).

Since matrix multiplication is not a commutative operation, care must be taken when working with the multiplicative inverses of invertible matrices. Nonetheless, many of the algebraic properties for multiplicative inverses of scalars, when properly modified, continue to hold. We summarize the most basic of these properties in the following theorem.

    Theorem A.2.9. Let \(n \in \mathbb{Z}_+\) be a positive integer and \(A, B \in GL(n, \mathbb{F}).\) Then

1. the inverse matrix \(A^{-1} \in GL(n, \mathbb{F})\) and satisfies \((A^{-1})^{-1} = A.\)
2. the matrix power \(A^m \in GL(n, \mathbb{F})\) and satisfies \((A^m)^{-1} = (A^{-1})^m\), where \(m \in \mathbb{Z}_+\) is any positive integer.
3. the matrix \(\alpha A \in GL(n, \mathbb{F})\) and satisfies \((\alpha A)^{-1} = \alpha^{-1} A^{-1}\), where \(\alpha \in \mathbb{F}\) is any non-zero scalar.
4. the product \(AB \in GL(n, \mathbb{F})\) and has inverse given by the formula \[(AB)^{-1} = B^{-1} A^{-1}.\]

    Moreover, \(GL(n, \mathbb{F})\) has the cancellation property. In other words, given any three matrices \(A, B, C \in GL(n, \mathbb{F})\), if \(AB = AC,\) then \(B = C.\)

    At the same time, it is important to note that the zero matrix is not the only non-invertible matrix. As an illustration of the subtlety involved in understanding invertibility, we give the following theorem for the \(2 \times 2\) case.

    Theorem A.2.10. Let \(A = \left[ \begin{array}{cc} a_{1 1} & a_{1 2} \\ a_{2 1} & a_{2 2} \end{array} \right] \in \mathbb{F}^{2 \times 2}.\) Then \(A\) is invertible if and only if \(A\) satisfies

    \[a_{1 1}a_{2 2}- a_{1 2}a_{2 1} \neq 0.\]

    Moreover, if \(A\) is invertible, then

    \[A^{-1}= \left[ \begin{array}{cc} \frac{a_{2 2}}{a_{1 1}a_{2 2}- a_{1 2}a_{2 1} } & \frac{-a_{1 2}}{a_{1 1}a_{2 2}- a_{1 2}a_{2 1} } \\ \frac{-a_{2 1}}{a_{1 1}a_{2 2}- a_{1 2}a_{2 1} } & \frac{a_{1 1}}{a_{1 1}a_{2 2}- a_{1 2}a_{2 1} } \end{array} \right].\]
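    The formula of Theorem A.2.10 transcribes directly into code. A sketch (ours, purely illustrative), using exact rational arithmetic so that the check \(AA^{-1} = I_2\) holds without rounding error:

```python
from fractions import Fraction

def inverse_2x2(A):
    """Inverse of an invertible 2 x 2 matrix via Theorem A.2.10."""
    [[a, b], [c, d]] = A
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[4, -1], [0, 2]]
print(inverse_2x2(A))  # [[1/4, 1/8], [0, 1/2]]
```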

    A more general theorem holds for larger matrices, but its statement requires substantially more machinery than could reasonably be included here. We nonetheless state this result for completeness and refer the reader to Chapter 8 for the definition of the determinant.

Theorem A.2.11. Let \(n \in \mathbb{Z}_+\) be a positive integer, and let \(A = (a_{ij}) \in \mathbb{F}^{n \times n}\) be an \(n \times n\) matrix. Then \(A\) is invertible if and only if \(\det(A) \neq 0.\) Moreover, if \(A\) is invertible, then the “\(i, j\) entry” of \(A^{-1}\) is \(A_{ji} / \det(A).\) Here, \(A_{ij} = (-1)^{i+j} M_{ij}\), and \(M_{ij}\) is the determinant of the matrix obtained when both the \(i\)th row and \(j\)th column are removed from \(A\).
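    For small matrices, Theorem A.2.11 can likewise be transcribed directly, with the determinant computed by cofactor expansion along the first row. The following sketch (ours, purely illustrative) is fine for small \(n\), though the recursion is far too slow for large matrices:

```python
from fractions import Fraction

def minor(A, i, j):
    """The matrix obtained by deleting row i and column j of A."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    """Determinant via cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def inverse(A):
    d = Fraction(det(A))
    if d == 0:
        raise ValueError("matrix is not invertible")
    n = len(A)
    # The (i, j) entry of the inverse is A_{ji} / det(A), where
    # A_{ji} = (-1)^{j+i} M_{ji}, as in Theorem A.2.11.
    return [[(-1) ** (i + j) * det(minor(A, j, i)) / d for j in range(n)]
            for i in range(n)]
```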

We close this section by noting that the set \(GL(n, \mathbb{F})\) of all invertible \(n \times n\) matrices over \(\mathbb{F}\) is often called the general linear group. This set has so many important uses in mathematics that there are many equivalent notations for it, including \(GL_n(\mathbb{F})\) and \(GL(\mathbb{F}^n)\), and sometimes simply \(GL(n)\) or \(GL_n\) if it is not important to emphasize the dependence on \(\mathbb{F}.\) Note, moreover, that the usage of the term “group” in the name “general linear group” is highly technical. This is because \(GL(n, \mathbb{F})\) forms a non-abelian group under matrix multiplication. (See Section C.2 for the definition of a group.)

