
3.2 Matrix Operations


    Introduction

In Chapter 2 matrices were introduced to represent systems of linear equations. The coefficients of a linear system were collected in the coefficient matrix \(A \), and the system as a whole could be condensed into the augmented matrix. In Section Sec:LinearTrafo we used matrices to construct linear transformations. In this chapter we will study matrices as entities in their own right, though every now and then we will keep in mind their role in the two contexts just mentioned.

    Sum, Scalar Multiple and Transpose

In this section we will define the sum, the scalar multiple, and the product of two matrices, as well as the transpose of a matrix. Recall that an \(m\times n \) matrix has \(m \) (horizontal) rows of size \(n \) or, equivalently, \(n \) (vertical) columns of size \(m \).

    Definition
    Two matrices are said to have the same size if they have the same number of rows and the same number of columns. Two matrices \(A \) and \(B \) are equal if they have the same size, say \(m \) rows and \(n \) columns, and all the corresponding entries are equal, i.e. \[ a_{ij} = b_{ij}, \text{ for }i = 1,\ldots,m, j = 1,\ldots,n. \nonumber\]
    Definition
    A zero matrix \(O \) is a matrix with all entries equal to 0. If the context requires clarity as to its size it may be denoted by \(O_{mn} \).
    Definition (Scalar multiplication)
    If \(A \) is an \(m\times n \) matrix and \(c \) is a scalar, then \(cA \) is the \(m \times n \) matrix that is the result of multiplying each entry of \(A \) by \(c \): \[ c \left[\begin{array}{cccc} a_{11} & a_{12}& \ldots& a_{1n} \\ a_{21} & a_{22}& \ldots& a_{2n} \\ \vdots & \vdots& \cdots& \vdots \\ a_{m1} & a_{m2}& \ldots& a_{mn} \end{array} \right] = \left[\begin{array}{cccc} ca_{11} & ca_{12}& \ldots& ca_{1n} \\ ca_{21} & ca_{22}& \ldots& ca_{2n} \\ \vdots & \vdots& \cdots& \vdots \\ ca_{m1} & ca_{m2}& \ldots& ca_{mn} \end{array} \right]. \nonumber\]
    Definition (The sum of two matrices)
    If \(A \) and \(B \) are two \(m\times n \) matrices then the sum \(A+B \) is the \(m\times n \) matrix of which the entry on the position \((i,j) \) is the sum of the corresponding entries of \(A \) and \(B \): \[ \left[\begin{array}{cccc} a_{11} & a_{12}& \ldots& a_{1n} \\ a_{21} & a_{22}& \ldots& a_{2n} \\ \vdots & \vdots& \cdots& \vdots \\ a_{m1} & a_{m2}& \ldots& a_{mn} \end{array} \right] + \left[\begin{array}{cccc} b_{11} & b_{12}& \ldots& b_{1n} \\ b_{21} & b_{22}& \ldots& b_{2n} \\ \vdots & \vdots& \cdots& \vdots \\ b_{m1} & b_{m2}& \ldots& b_{mn} \end{array} \right] = \nonumber\] \[ = \left[\begin{array}{cccc} a_{11}+b_{11} & a_{12}+b_{12}& \ldots& a_{1n}+b_{1n} \\ a_{21}+b_{21} & a_{22}+b_{22}& \ldots& a_{2n}+b_{2n} \\ \vdots & \vdots& \cdots& \vdots \\ a_{m1}+b_{m1} & a_{m2}+b_{m2}& \ldots& a_{mn}+b_{mn} \end{array} \right]. \nonumber\] If \(A \) and \(B \) are not of the same size their sum is not defined.
    Example
\[ \left[\begin{array}{rr} 1 & 3 \\ 5 & 2 \\ 6 & -4 \end{array}\right] + \left[\begin{array}{rr} 3 & 2 \\ 4 & -5 \\ 2 & 5 \end{array}\right] = \left[\begin{array}{rr} 4 & 5 \\ 9 & -3 \\ 8 & 1 \end{array}\right], \nonumber\] \[ \left[\begin{array}{rr} 1 & 3 \\ 5 & 2 \\ 6 & -4 \end{array}\right] + \left[\begin{array}{rr} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{array}\right] = \left[\begin{array}{rr} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{array}\right] + \left[\begin{array}{rr} 1 & 3 \\ 5 & 2 \\ 6 & -4 \end{array}\right] = \left[\begin{array}{rr} 1 & 3 \\ 5 & 2 \\ 6 & -4 \end{array}\right], \nonumber\] \[ \begin{array}{lcl} \left[\begin{array}{rrr} 1 & 3 & 5 \\ 2 & 4 & 1 \end{array}\right] + (-1) \left[\begin{array}{rrr} 1 & 3 & 5 \\ 2 & 4 & 1 \end{array}\right] &=& \left[\begin{array}{rrr} 1 & 3 & 5 \\ 2 & 4 & 1 \end{array}\right] + \left[\begin{array}{rrr} -1 & -3 & -5 \\ -2 & -4 & -1 \end{array}\right] \\ &=& \left[\begin{array}{rrr} 0 & 0 & 0 \\ 0 & 0 & 0 \end{array}\right]. \end{array} \nonumber\]

    The multiple \((-1)A \) is also written as \(-A \). An obvious property, illustrated in the third example, is: \[ A + (-A) = O, \nonumber\] where \(O \) is the zero matrix.
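These componentwise operations are easy to check numerically. Below is a minimal sketch using Python with NumPy (an illustration added here, not part of the original text; any matrix library would do) that reproduces the first sum above and the property \(A + (-A) = O \).

```python
import numpy as np

A = np.array([[1, 3], [5, 2], [6, -4]])
B = np.array([[3, 2], [4, -5], [2, 5]])

print(A + B)         # componentwise sum: [[4 5] [9 -3] [8 1]]
print(A + (-1) * A)  # A + (-A) = O, the 3x2 zero matrix
```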

    Example
    \[ \left[\begin{array}{r} 1 & 3 \\ 5 & 2 \\ 6 & -4 \end{array}\right] + \left[\begin{array}{rr} 1 & 3 & 5 \\ 2 & 4 & 1 \end{array}\right] \nonumber\] is not defined. This is because the matrices do not have the same size.
    Note
The two definitions of sum and scalar multiple are called componentwise definitions. They are completely analogous to the definitions of the scalar multiple of a vector and the sum of two vectors. Hence it is not surprising that they obey exactly the same rules, as is summarized in the next proposition (cf. Section Sec:Vectors).
    Proposition
    Suppose \(A, B \) and \(C \) are \(m\times n \) matrices and let \(c_{1},c_{2} \) be two real numbers. Then we have:
    1. \(A+O_{mn}=A=O_{mn}+A \)
2. \((A+B)+C=A+(B+C) \)
3. \(A+B=B+A \)
    4. \(A+(-A)=O \)
    5. \(1A=A \)
    6. \(c_{1}(A+B)=c_{1}A+c_{1}B \)
    7. \((c_{1}+c_{2})A=c_{1}A+c_{2}A \)
    8. \(c_{1}(c_{2}A)=(c_{1}c_{2})A \)

An operation whose usefulness is not immediately clear, but which fits naturally among the matrix operations of this section, is the following:

    Definition
    The transpose of an \(m \times n \) matrix \(A \) with entries \(a_{ij} \) is the \(n \times m \) matrix \(B \) with entries \(b_{ij} \) defined by \( b_{ij} = a_{ji} \). It is denoted by \(B = A^T \).
    Example
    \[ \left[\begin{array}{r} 1 & 3 \\ 5 & 2 \\ 6 & 4 \end{array}\right]^T = \left[\begin{array}{rr} 1 & 5 & 6 \\ 3 & 2 & 4 \end{array}\right] \quad \text{and} \quad \left[\begin{array}{rrr} -1 & 2 & -4 & 0\end{array}\right]^T = \left[\begin{array}{r} -1 \\ 2 \\-4 \\ 0\end{array}\right]. \nonumber\]
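As a quick numerical illustration (assuming NumPy, as in the other sketches on this page), the transpose is the `.T` attribute of an array:

```python
import numpy as np

A = np.array([[1, 3], [5, 2], [6, 4]])
print(A.T)        # [[1 5 6] [3 2 4]]: entry (i,j) of A.T is entry (j,i) of A
print(A.T.shape)  # (2, 3): a 3x2 matrix becomes 2x3
```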

    The following rules involving the three operators defined so far in this section are easy to prove:

    Proposition
    Let \(A \) and \(B \) be \(m\times n \) matrices and \(c \) a scalar. Then we have
    1. \((cA)^T = c A^T \)
    2. \((A+B)^T = A^T + B^T \)
    3. \((A^T)^T = A \).
    Proof
We will prove the second statement and leave the other two to the diligent reader. See Exercise 13. So, suppose \(A \) and \(B \) are two \(m \times n \) matrices. Then \(A+B \) is an \(m \times n \) matrix too, hence \((A+B)^T \) is an \(n \times m \) matrix. The matrix \(A^T + B^T \) on the right-hand side of the equation is the sum of two \(n \times m \) matrices, which is again an \(n \times m \) matrix. So the matrices on both sides of the equation have the same size. Next we have to show that they have equal entries in the corresponding positions. If we put \[ E = (A+B)^T \quad \text{and}\quad F = A^T + B^T \nonumber\] we see that \[ e_{ij} = \text{ entry of } (A+B) \text{ on position }(j,i) \nonumber\] and \[ \begin{array}{rl} f_{ij} &= \text{ entry of } A^T \text{ on position }(i,j) + \text{ entry of } B^T \text{ on position }(i,j) \\ &= \text{ entry of } A \text{ on position }(j,i) + \text{ entry of } B \text{ on position }(j,i)\\ &= \text{ entry of } (A+B) \text{ on position }(j,i)\\ &= e_{ij}, \end{array} \nonumber\] so we are done. If you are lost in the forest of indices, have a look at Example 12.
    Example
     
    We check property (ii) for two general \(3\times 4 \) matrices \(A \) and \(B \) on the position \((2,3) \). Let \[ A = \left[\begin{array}{rrr} a_{11}& a_{12} & a_{13} & a_{14} \\ a_{21}& a_{22} & a_{23} & a_{24} \\ a_{31} & \fbox{\(a_{32} \)} & a_{33} & a_{34} \end{array}\right] \quad \text{and} \quad B = \left[\begin{array}{rrr} b_{11}& b_{12} & b_{13} & b_{14} \\ b_{21}& b_{22} & b_{23} & b_{24} \\ b_{31} & \fbox{\(b_{32} \)} & b_{33} & b_{34} \end{array}\right]. \nonumber\] Then \[ E = (A+B)^T = \left[\begin{array}{rrr} a_{11}+b_{11}& a_{12}+b_{12} & a_{13}+b_{13} & a_{14} +b_{14}\\ a_{21}+b_{21}& a_{22}+b_{22} & a_{23}+b_{23} & a_{24}+b_{24} \\ a_{31}+b_{31} & \fbox{\(a_{32}+b_{32} \)} & a_{33}+b_{33} & a_{34}+b_{34} \end{array}\right]^T \nonumber\] so \[ E = \left[\begin{array} {rr} a_{11}+b_{11}& a_{21}+b_{21} & a_{31}+b_{31} \\ a_{12}+b_{12}& a_{22}+b_{22} & \fbox{\(a_{32}+b_{32} \)} \\ a_{13}+b_{13}& a_{23}+b_{23} & a_{33}+b_{33} \\ a_{14} +b_{14}& a_{24}+b_{24} & a_{34}+b_{34} \end{array}\right], \nonumber\] and on position \((2,3) \) we have \(a_{32}+b_{32} \). On the other hand \[ F = A^T + B^T = \left[\begin{array} {rr} a_{11}& a_{21} & a_{31} \\ a_{12}& a_{22} & \fbox{\(a_{32} \)} \\ a_{13}& a_{23} & a_{33}\\ a_{14}& a_{24} & a_{34}\end{array}\right] + \left[\begin{array} {rr} b_{11}& b_{21} & b_{31} \\ b_{12}& b_{22} & \fbox{\(b_{32} \)} \\ b_{13}& b_{23} & b_{33}\\ b_{14}& b_{24} & b_{34} \end{array}\right], \nonumber\] with on position \((2,3) \) the value \({a_{32}} \) + \({b_{32}} \).
    Exercise
     
    Prove statements (i) and (iii) of the above proposition.
    Example
Find \(X \) if \(A + 2X^T + B = C \), where \[ A = \left[\begin{array}{rrr} 1 & 1 & 2 \\ 3 & 1 & 0 \end{array}\right], \quad B = \left[\begin{array}{rrr} 2 & 0 & 3 \\ 2 & 3 & 4 \end{array}\right], \quad \text{and} \quad C = \left[\begin{array}{rrr} 7 & 5 & 1 \\ 1 & 4 & 2 \end{array}\right]. \nonumber\] We extricate \(X \) step by step: \[ A + 2X^T + B = C \iff 2X^T = C-A-B \iff X^T = \tfrac12(C-A-B). \nonumber\] Next we transpose both sides to find \[ X = \tfrac12(C-A-B)^T = \frac12 \left[\begin{array}{rrr} 4 & 4 & -4 \\ -4 & 0 & -2 \end{array}\right]^T = \left[\begin{array}{rr} 2 & -2 \\ 2 & 0 \\ -2 & -1 \end{array}\right]. \nonumber\]
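The same manipulation can be checked numerically. A sketch, assuming NumPy and the matrices of the example:

```python
import numpy as np

A = np.array([[1, 1, 2], [3, 1, 0]])
B = np.array([[2, 0, 3], [2, 3, 4]])
C = np.array([[7, 5, 1], [1, 4, 2]])

# From A + 2 X^T + B = C it follows that X = ((C - A - B) / 2)^T.
X = ((C - A - B) / 2).T
print(X)                                # [[ 2. -2.] [ 2.  0.] [-2. -1.]]
print(np.allclose(A + 2 * X.T + B, C))  # True
```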

    The product of two matrices

     

Next we turn our attention to the most important matrix operation, namely the product \(AB \) of two matrices. In the previous chapter we already saw the special case where \(B \) is a matrix of just one column, i.e., \[ B = \vect{x} = \left[\begin{array}{r}x_1 \\ x_2 \\ \vdots \\ x_n \end{array}\right], \nonumber\] a vector in \(\mathbb{R}^n \), which we can identify with an \(n \times 1 \) matrix. Of course, we want the definition to be consistent with this special case.

    Definition
    The product of an \(m\times n \) matrix \(A \) and an \(n\times p \) matrix \(B = [ \vect{b}_1 \vect{b}_2 \ldots \vect{b}_p ] \) is defined by \[ AB = [ A\vect{b}_1 A\vect{b}_2 \ldots A\vect{b}_p ]. \nonumber \nonumber\] So we have \[ j\text{-th column of } AB = A\text{ times \(j \)-th column of } B, \quad j = 1,2,\ldots,p \nonumber\] Note that this makes \(AB \) an \(m \times p \) matrix. If the number of columns of \(A \) is not equal to the number of rows of \(B \) the product \(AB \) is not defined.
    Example
    \[ \left[\begin{array}{r} 1 & -3 \\ -1 & 2 \\ 3& -2 \end{array}\right] \left[\begin{array}{rr} 2 & 1 & 1\\ 3 & 0 & 2 \end{array}\right] = \left[\begin{array}{rr} -7 & 1 & -5 \\ 4 & -1 & 3 \\ 0 & 3 &-1 \end{array}\right]. \nonumber\] For instance, the third column is computed as \[ \left[\begin{array}{r} 1 & -3 \\ -1 & 2 \\ 3& -2 \end{array}\right] \left[\begin{array}{r} 1\\ 2 \end{array}\right] = 1 \left[\begin{array}{r} 1 \\ -1 \\ 3\end{array}\right] + 2 \left[\begin{array}{r} -3 \\ 2 \\ -2 \end{array}\right] = \left[\begin{array}{r} -5 \\ 3 \\ -1\end{array}\right] \nonumber\]
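The column-by-column definition translates directly into code. A sketch (NumPy assumed) that rebuilds the product above one column at a time and compares it with the built-in matrix product `@`:

```python
import numpy as np

A = np.array([[1, -3], [-1, 2], [3, -2]])
B = np.array([[2, 1, 1], [3, 0, 2]])

# the j-th column of AB is A times the j-th column of B
AB = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])
print(AB)                         # [[-7  1 -5] [ 4 -1  3] [ 0  3 -1]]
print(np.array_equal(AB, A @ B))  # True
```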
    Proposition
     
    The product of the \(m\times n \) matrix \(A \) and the \(n\times p \) matrix \(B \) is the \(m\times p \) matrix \(C \) for which the entry on the position \((i,j) \) is given by \[ c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \ldots + a_{in}b_{nj} = \left[\begin{array}{rrr}a_{i1} & a_{i2} & \ldots & a_{in} \end{array}\right] \left[\begin{array}{r} b_{1j} \\ b_{2j} \\ \vdots \\ b_{nj}\end{array}\right]. \nonumber \nonumber\]
    Proof
    We already saw this row-column expansion in Section Sec:MatVecProduct.

The following scheme nicely visualizes the row-column expansion \[ \begin{array}{cc} & \left[\begin{array}{rrrrrr} b_{11} & b_{12} & \ldots & {\color{blue}b_{1j}} & \ldots & b_{1p} \\ b_{21} & b_{22} & \ldots & {\color{blue}b_{2j}} & \ldots & b_{2p} \\ \vdots & \vdots & & \vdots & & \vdots \\ b_{n1} & b_{n2} & \ldots & {\color{blue}b_{nj}} & \ldots & b_{np} \end{array}\right] \\ \left[\begin{array}{rrrrr} a_{11} & a_{12} & \ldots & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & \ldots & a_{2n} \\ \vdots & \vdots & & & \vdots \\ {\color{blue}a_{i1}} & {\color{blue}a_{i2}} & {\color{blue}\ldots} & {\color{blue}\ldots} & {\color{blue}a_{in}} \\ \vdots & \vdots & & & \vdots \\ a_{m1} & a_{m2} & \ldots & \ldots & a_{mn} \end{array}\right] & \left[\begin{array}{rrrrrr} c_{11} & c_{12} & \ldots & c_{1j} & \ldots & c_{1p} \\ c_{21} & c_{22} & \ldots & c_{2j} & \ldots & c_{2p} \\ \vdots & \vdots & & \vdots & & \vdots \\ c_{i1} & c_{i2} & \ldots & {\color{blue}c_{ij}} & \ldots & c_{ip} \\ \vdots & \vdots & & \vdots & & \vdots \\ c_{m1} & c_{m2} & \ldots & c_{mj} & \ldots & c_{mp} \end{array}\right] \end{array} \nonumber\]

    Example
    Let us consider the same matrix product \[ \left[\begin{array}{r} 1 & -3 \\ -1 & 2 \\ 3& -2 \end{array}\right] \left[\begin{array}{rr} 2 & 1 & 1\\ 3 & 0 & 2 \end{array}\right] = \left[\begin{array}{rr} -7 & 1 & -5 \\ 4 & -1 & 3 \\ 0 & 3 &-1 \end{array}\right]. \nonumber \nonumber\] The \(-5 \) on position \((1,3) \) and the \(3 \) on position \((3,2) \) in the product come from \[ -5 = \left[\begin{array}{r} 1 & -3 \end{array}\right] \left[\begin{array}{r} 1\\ 2 \end{array}\right] \text{ and } 3 = \left[\begin{array}{r} 3 & -2 \end{array}\right] \left[\begin{array}{r} 1\\ 0 \end{array}\right]. \nonumber \nonumber\]
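The same entries can be recomputed one at a time with the row-column rule; a short sketch (NumPy assumed):

```python
import numpy as np

A = np.array([[1, -3], [-1, 2], [3, -2]])
B = np.array([[2, 1, 1], [3, 0, 2]])
C = A @ B

# c_ij is the dot product of row i of A with column j of B
print(C[0, 2], A[0, :] @ B[:, 2])  # -5 -5   (position (1,3))
print(C[2, 1], A[2, :] @ B[:, 1])  # 3 3     (position (3,2))
```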
    Exercise
     
    Explain why the product \[ \left[\begin{array}{r} 1 & -3 \\ -1 & 2 \\ 3& -2 \end{array}\right] \left[\begin{array}{r} 1 & -3 \\ -1 & 2 \\ 3& -2 \end{array}\right] \nonumber \nonumber\] is not defined.
    Note
    The product of a matrix \(A \) with itself is only defined if \(A \) is an \(n \times n \) matrix. In that case we use the obvious notation \[ A^2 = A\cdot A. \nonumber \nonumber\]
    Exercise
     
    Suppose \(A = \left[\begin{array}{c} \vect{a}_1 & \vect{a}_2 & \ldots & \vect{a}_n \end{array}\right] \) is an \(m\times n \) matrix \(A \) and \(B= \left[\begin{array}{c} \vect{b}_1 & \vect{b}_2 & \ldots & \vect{b}_p \end{array}\right] \) an \(m\times p \) matrix. Show that \[ A^TB = \left[\begin{array}{rrr} \vect{a}_1 \ip \vect{b}_1 & \vect{a}_1\ip\vect{b}_2 & \ldots & \vect{a}_1\ip \vect{b}_p \\ \vect{a}_2\ip \vect{b}_1 & \vect{a}_2\ip\vect{b}_2 & \ldots & \vect{a}_2\ip \vect{b}_p \\ \vdots & \vdots & & \vdots \\ \vect{a}_n\ip \vect{b}_1 & \vect{a}_n\ip\vect{b}_2 & \ldots & \vect{a}_n\ip \vect{b}_p \\ \end{array}\right], \nonumber\] where \(\vect{a}\ip\vect{b} \) is the dot product of the vectors \(\vect{a} \) and \(\vect{b} \).
    Exercise
     
The special case in the previous exercise where \(A = B \) will become very important when we look at orthogonal projections. For now, show that the columns of a matrix \(A \) are orthogonal if and only if the matrix \(A^TA \) is a diagonal matrix.
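A numerical illustration of the claim, not a proof: for a matrix with pairwise orthogonal columns, \(A^TA \) indeed comes out diagonal. The matrix below is a made-up example; NumPy assumed.

```python
import numpy as np

A = np.array([[1, 1], [1, -1], [0, 3]])  # columns have dot product 1 - 1 + 0 = 0
print(A.T @ A)  # [[ 2  0] [ 0 11]]: a diagonal matrix
```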

     

    Example
     
    \[ \left[\begin{array}{rr} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ a_{31}& a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{array}\right] \left[\begin{array}{rr} 1 & 0 & 0 \\ 0 & 1 & 0\\ 0 & 0 & 1 \end{array}\right] = \left[\begin{array}{rr} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ a_{31}& a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{array}\right]. \nonumber \nonumber\]

This example illustrates the existence of a 'unit element' with respect to the multiplication. To identify it we first introduce some more terminology.

    Definition
    An \(n\times n \) matrix \(A \) is called a square matrix. So it is a matrix where the number of columns is equal to the number of rows. For a square matrix \(A \) we call the elements \(a_{ii} \) the diagonal elements. Together the diagonal elements form the (main) diagonal of \(A \). A square matrix where all non-diagonal elements are equal to 0 is called a diagonal matrix.
    Note
The other diagonal of a square matrix, the one from bottom left to top right, plays a minor role. For this reason we don't reserve a name for it. By 'diagonal' we will always mean: main diagonal.
    Example
    Consider the matrices \[ A = \left[\begin{array}{r} 2 & 2 \\ 3 & 3 \end{array}\right], \quad B = \left[\begin{array}{rr} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 6 \end{array}\right], \quad C = \left[\begin{array}{r} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{array}\right]. \nonumber \nonumber\] The matrices \(A \) and \(B \) are square, and only \(B \) is a diagonal matrix.
    Exercise
     
    Is the following statement true or false? The \(n \times n \) zero matrix \(O_{nn} \) is a diagonal matrix.
    Definition
    The identity matrix \(I_n \) is the \(n \times n \) diagonal matrix with 1's on the diagonal. If the size is irrelevant or clear from the context, we denote it simply by \(I \).
    Exercise
     
    Let \[ I = I_4 = \left[\begin{array}{rrr}1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right] \quad \text{and} \quad A = \left[\begin{array}{rr} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ a_{31}& a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{array}\right] \nonumber \nonumber\] Show that \(IA = A \).

The definition of the product of two matrices and the earlier definition of the product of a matrix and a vector (Def. Dfn:MatVectProd:ProductMatVec) immediately imply that the columns of the product of two matrices are linear combinations of the columns of the first matrix. As is often the case in linear algebra, things can be looked at from a different perspective. From Proposition 17 it follows that the elements \(c_{i1},c_{i2},\ldots,c_{ip} \) of the \(i \)-th row of the product \(C = AB \) depend, as far as \(A \) is concerned, only on the elements \(a_{ik} \) of its \(i \)-th row. The following proposition explains in which way.

    Proposition
     
    The \(i \)-th row of the product \(AB \) is the linear combination of the rows of the second matrix, \(B \), with the entries of the \(i \)-th row of \(A \) as coefficients.
    Proof
The indicated linear combination yields: \[ a_{i1} \left[\begin{array}{rrrr}b_{11} & b_{12} & \ldots &b_{1p} \end{array}\right] + a_{i2} \left[\begin{array}{rrrr}b_{21} & b_{22} & \ldots &b_{2p} \end{array}\right] + \ldots + a_{in} \left[\begin{array}{rrrr}b_{n1} & b_{n2} & \ldots &b_{np} \end{array}\right] \nonumber\] \[ = \left[\begin{array}{ccc} (a_{i1}b_{11} + a_{i2}b_{21}+ \ldots +a_{in}b_{n1}) & \ldots & (a_{i1}b_{1p} + a_{i2} b_{2p} + \ldots + a_{in}b_{np}) \end{array}\right]. \nonumber\] This is a row vector with in the \(j \)-th position the number \[ a_{i1}b_{1j} + a_{i2} b_{2j} + \ldots + a_{in}b_{nj}, \nonumber\] and that is precisely the entry \(c_{ij} \) of the matrix \(C = AB \).

Interestingly, this opens the way to describing the row operations of Chapter 2 via matrix multiplication. The following example illustrates this for the three basic row operations.

    Example
The following multiplication adds the first row of the matrix \[ A = \left[\begin{array}{rrr} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ a_{31}& a_{32} & a_{33} \end{array}\right] \nonumber\] four times to the second row: \[ \left[\begin{array}{rrr} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 0 & 0 & 1\end{array}\right] \left[\begin{array}{rrr} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ a_{31}& a_{32} & a_{33} \end{array}\right] = \left[\begin{array}{rrr} a_{11}& a_{12} & a_{13} \\ 4a_{11}+a_{21}&4a_{12} +a_{22}& 4a_{13}+a_{23} \\ a_{31}& a_{32} & a_{33} \end{array}\right]. \nonumber\] Here the third row is scaled by a factor of 5: \[ \left[\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 5\end{array}\right] \left[\begin{array}{rrr} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{array}\right] = \left[\begin{array}{rrr} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ 5a_{31}& 5a_{32} & 5a_{33} \end{array}\right]. \nonumber\] And with the following multiplication the first and third row of \(A \) are swapped: \[ \left[\begin{array}{rrr} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0\end{array}\right] \left[\begin{array}{rrr} a_{11}& a_{12} & a_{13} \\ a_{21}& a_{22} & a_{23} \\ a_{31}& a_{32} & a_{33} \end{array}\right] = \left[\begin{array}{rrr}a_{31}& a_{32} & a_{33} \\ a_{21}& a_{22} & a_{23} \\ a_{11}& a_{12} & a_{13} \end{array}\right]. \nonumber\]
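These three multiplications can be replayed numerically. A sketch (NumPy assumed, with an arbitrary 3 x 3 matrix standing in for \(A \)):

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3)  # stand-in for a general 3x3 matrix

E_add   = np.array([[1, 0, 0], [4, 1, 0], [0, 0, 1]])  # add 4 times row 1 to row 2
E_scale = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 5]])  # scale row 3 by 5
E_swap  = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]])  # swap rows 1 and 3

print(E_add @ A)
print(E_scale @ A)
print(E_swap @ A)
```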

    For future reference we give these matrices a name:

    Definition
     
    The matrices \(E \) that perform one single row operation (row replacement, row scaling, or row exchange) via \(A \mapsto EA \) are called elementary matrices.
    Exercise
     
    Describe in words which row operations are the effect of pre-multiplying a \(4\times n \) matrix \(A \) with the following elementary matrices: \[ E_1 = \left[\begin{array}{rrr} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & -1\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{array}\right], \quad \quad E_2 = \left[\begin{array}{rrr} 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0 \end{array}\right]. \nonumber \nonumber\]
    Example
     
    The following product may at first sight seem a bit odd, but it is exactly according to the definition: \[ \left[\begin{array}{r} 1 \\-2\\3\\4 \end{array}\right] \left[\begin{array}{rrr} 2&4&0& -1 \end{array}\right] = \left[\begin{array}{rrr} 2 & 4 & 0 & -1 \\ -4 & -8 & 0 & 2 \\ 6 & 12 & 0 & -3 \\ 8 & 16 & 0 & -4 \end{array}\right]. \nonumber \nonumber\]

    The column-row product in the last example is the building block for yet another way to look at the matrix product. The next exercise explains how.

    Exercise
     
Denote the columns of the \(m\times n \) matrix \(A \) by \(A_{(1)}, \ldots, A_{(n)} \), and the rows of the \(n\times p \) matrix \(B \) by \(B^{(1)}, \ldots, B^{(n)} \), so \[ A_{(j)} = \left[\begin{array}{r} a_{1j} \\ \vdots \\ a_{mj}\end{array}\right] \quad \text{and} \quad B^{(i)} = \left[\begin{array}{rrrr} b_{i1} & b_{i2} & \ldots & b_{ip}\end{array}\right]. \nonumber\] Show that \[ AB = A_{(1)} B^{(1)} + A_{(2)} B^{(2)} + \ldots + A_{(n)} B^{(n)}, \nonumber\] i.e., \(AB \) is the sum of \(n \) column-row products (as in Example 34).
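The identity can be illustrated numerically before proving it; a sketch (NumPy assumed, reusing the matrices of the earlier product example):

```python
import numpy as np

A = np.array([[1, -3], [-1, 2], [3, -2]])
B = np.array([[2, 1, 1], [3, 0, 2]])

# AB as a sum of column-row (outer) products A_(k) B^(k)
S = sum(np.outer(A[:, k], B[k, :]) for k in range(A.shape[1]))
print(np.array_equal(S, A @ B))  # True
```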

    Properties of the matrix product

     

Now let us have a look at which of the rules for products of numbers also hold for products of matrices, and which do not.

    Proposition
     
    For all \(m \times n \) matrices \(A,A_1,A_2 \), all \(n \times p \) matrices \(B,B_1,B_2 \), all \(p \times q \) matrices \(C \) and all real numbers \(c \) the following are true:
    1. \(A(B_1+B_2) = AB_1 + AB_2 \) and \((A_1+A_2)B = A_1B+A_2B \);
    2. \(A(cB) = c(AB) = (cA)B \);
    3. \(AI_n = A \) and \(I_mA = A \) (the identity matrix \(I \) acts as a unit element);
4. \(A(BC) = (AB)C \).
    Example
As an illustration of rule (iv), we compute the two triple products for the three matrices \[ A = \left[\begin{array}{rr} 3 & 1 \\ 2 & 1 \\ 0 & 5 \end{array}\right], \quad B = \left[\begin{array}{rr} 1 & 2 \\ 3 & 0 \end{array}\right], \quad C = \left[\begin{array}{rrr} 1 & 2 & 0 \\ 2 & 1 & 2 \end{array}\right]. \nonumber\] On the one hand \[ A(BC) = \left[\begin{array}{rr} 3 & 1 \\ 2 & 1 \\ 0 & 5 \end{array}\right] \left[\begin{array}{rrr} 5 & 4 & 4\\ 3 & 6 & 0 \end{array}\right] = \left[\begin{array}{rrr} 18 & 18 & 12\\ 13 & 14 & 8 \\ 15 & 30 & 0 \end{array}\right], \nonumber\] and on the other hand \[ (AB)C = \left[\begin{array}{rr} 6 & 6 \\ 5 & 4 \\ 15 & 0 \end{array}\right] \left[\begin{array}{rrr} 1 & 2 & 0 \\ 2 & 1 & 2 \end{array}\right] = \left[\begin{array}{rrr} 18 & 18 & 12 \\ 13 & 14 & 8 \\ 15 & 30 & 0 \end{array}\right]. \nonumber\] So the products are indeed equal. But it is not immediately clear why: the value 14 on position (2,2) comes about in two ways \[ \text{via } A(BC)\!: 14 = 2\cdot4 + 1\cdot 6, \quad \text{via } (AB)C\!: 14 = 5\cdot2 + 4\cdot1. \nonumber\] We need a good perspective to give a proof of the general case.
    Proof
    Rules (i), (ii) are checked in a straightforward way. See Exercise Exc:RulesProduct.
3. We saw instances of this property already in Example 23 and Exercise 29. For the general case, one way to show validity of the first statement is to note that the \(j \)-th column of \(AI_n \) is \(A\vect{e}_j \), where \(\vect{e}_j \) is the \(j \)-th column of the identity matrix \(I_n \). This gives the linear combination \[ A\vect{e}_j = 0\vect{a}_1 + 0\vect{a}_2 + \ldots + 1\vect{a}_j + \ldots + 0\vect{a}_n = \vect{a}_j, \nonumber\] which shows that the \(j \)-th column of \(AI_n \) is equal to the \(j \)-th column of \(A \). And this holds for any column. The identity \[ I_mA = A \nonumber\] is shown in an analogous way, working row by row.
4. First we observe that both triple products yield \(m \times q \) matrices. Then the identity can be proved 'column by column', as was the previous one. We are done if we can show that \[ \begin{array}{rcl} k\text{-th column of }A(BC) &=& k\text{-th column of }(AB)C \\ &=& (AB)( k\text{-th column of }C) = (AB)\vect{c}_k, \end{array} \nonumber\] for \(k = 1,2,\ldots,q \). Now recall that (by definition) \[ k\text{-th column of }BC = B\vect{c}_k, \nonumber\] so \[ k\text{-th column of }A(BC) = A (B\vect{c}_k). \nonumber\] Making extensive use of the rule \[ A(c_1\vect{x} + c_2\vect{y}) = c_1A\vect{x} + c_2A\vect{y} \nonumber\] we find \[ \begin{array}{ccl} A (B\vect{c}_k) & = & A (c_{1k}\vect{b}_1 +c_{2k}\vect{b}_2 + \ldots + c_{pk}\vect{b}_p)\\ & = & c_{1k}(A\vect{b}_1) +c_{2k}(A\vect{b}_2) + \ldots + c_{pk}(A\vect{b}_p)\\ & = & \left[\begin{array}{rrrr} A\vect{b}_1 & A\vect{b}_2 & \ldots & A\vect{b}_p \end{array}\right] \left[\begin{array}{r} c_{1k} \\ \vdots \\ c_{pk} \end{array}\right] \\ & = & (AB)\vect{c}_k. \end{array} \nonumber\]
    Exercise
     
    Prove rules (i) and (ii). Recall: matrices are equal when they have the same size and the entries on corresponding positions are equal (which may be checked column by column or row by row).
    Note
     
The proof of (iv) can be seen in another light. In Section Sec:LinTrafo we saw that an \(m\times n \) matrix \(A \) defines a transformation \(T \) from \(\mathbb{R}^n \) to \(\mathbb{R}^m \), namely \[ \text{for } \vect{x} \in \mathbb{R}^n: \vect{x} \mapsto T(\vect{x}) = A\vect{x}. \nonumber\] The definition of the product of two matrices then precisely matches the composition of two such transformations: if \(A \) is an \(m\times n \) matrix and \(B \) is an \(n\times p \) matrix, then \[ \vect{x}\in\mathbb{R}^p \stackrel{B}{\longrightarrow} \vect{y}_1 = B\vect{x}\in\mathbb{R}^n \stackrel{A}{\longrightarrow} \vect{y}_2 = A(B\vect{x}) \in \mathbb{R}^m \nonumber\] and \[ \vect{x}\in\mathbb{R}^p \stackrel{AB}{\longrightarrow} \vect{y}_3 = (AB)\vect{x} \in \mathbb{R}^m \nonumber\] yield the same vector: \[ \vect{y}_2 = \vect{y}_3. \nonumber\]
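This consistency of the product with composition can be spot-checked numerically; a sketch with randomly chosen integer matrices (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(2, 3))  # transformation R^3 -> R^2
B = rng.integers(-3, 4, size=(3, 4))  # transformation R^4 -> R^3
x = rng.integers(-3, 4, size=4)

# applying B, then A, agrees with applying AB in one step
print(np.array_equal(A @ (B @ x), (A @ B) @ x))  # True
```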

    So far so good: matrix multiplication behaves as multiplication of numbers. However, in two important respects the concepts deviate. First of all, commutativity no longer holds.

    Example
For the matrices \[ A = \left[\begin{array}{rrr} 2 & 2 & 1\\ 3 & 3 & 0 \end{array}\right] \quad \text{and} \quad B = \left[\begin{array}{rr} 1 & 3 \\ 3 & 1 \\ 4 & 0 \end{array}\right] \nonumber\] it is clear that \[ AB \neq BA, \nonumber\] simply because the two products are not of the same size: \(AB \) is a \(2\times 2 \) matrix, \(BA \) a \(3\times3 \) matrix. The following example illustrates that \(AB = BA \) is not even guaranteed for two \(n\times n \) matrices \(A \) and \(B \): \[ \left[\begin{array}{rr} 1 & 3 \\ 2 & 1 \end{array}\right] \left[\begin{array}{rr} 0 & 1 \\ 1 & 2 \end{array}\right] = \left[\begin{array}{rr} 3 & 7 \\ 1 & 4 \end{array}\right] \neq \left[\begin{array}{rr} 2 & 1 \\ 5 & 5 \end{array}\right] = \left[\begin{array}{rr} 0 & 1 \\ 1 & 2 \end{array}\right] \left[\begin{array}{rr} 1 & 3 \\ 2 & 1 \end{array}\right]. \nonumber\]

    The fact that \(AB \neq BA \) can be understood by thinking about the composition of the two transformations corresponding to \(A \) and \(B \). (See Section Sec:LinTrafo.) The following two exercises shed some light on the non-commutativity.

    Example
     
    Consider the two matrices \[ A = \left[\begin{array}{r} 2 & 0 \\ 0 & 1 \end{array}\right] \quad \text{and} \quad B = \left[\begin{array}{r} 0 & 1 \\ 1 & 0 \end{array}\right] \nonumber \nonumber\] and the corresponding linear transformations \[ T_A: \mathbb{R}^2 \to \mathbb{R}^2, \quad \vect{x} \mapsto T_A(\vect{x}) = A \vect{x}, \quad T_B: \mathbb{R}^2 \to \mathbb{R}^2, \quad \vect{x} \mapsto T_B(\vect{x}) = B \vect{x}. \nonumber \nonumber\] We get \[ \vect{x} = \left[\begin{array}{r} x_1\\ x_2 \end{array}\right] \mapsto A\vect{x} = \left[\begin{array}{r} 2 & 0 \\ 0 & 1 \end{array}\right] \left[\begin{array}{r} x_1\\ x_2 \end{array}\right] = \left[\begin{array}{r} 2x_1\\ x_2 \end{array}\right] \nonumber \nonumber\] and likewise \[ T_B(\vect{x}) = \left[\begin{array}{r} 0 & 1 \\ 1 & 0 \end{array}\right] \left[\begin{array}{r} x_1\\ x_2 \end{array}\right] = \left[\begin{array}{r} x_2\\ x_1 \end{array}\right] \nonumber\]
Figure 1: the transformations corresponding to \(AB \) and \(BA \).

    Note that \(T_A \) is a transformation that 'stretches' horizontally, and \(T_B \) is a reflection. Figure 1 visualizes the transformations corresponding to \(AB \) and \(BA \). When we apply the transformations one after another, the order in which we do this is important.
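The order dependence is visible directly in the products; a sketch (NumPy assumed):

```python
import numpy as np

A = np.array([[2, 0], [0, 1]])  # horizontal stretch
B = np.array([[0, 1], [1, 0]])  # reflection swapping the coordinates

print(A @ B)  # [[0 2] [1 0]]: first reflect, then stretch
print(B @ A)  # [[0 1] [2 0]]: first stretch, then reflect
```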
    Exercise
     
    Recall that the matrices \[ E_1 = \left[\begin{array}{r} 1 & 0 \\ 2 & 1 \end{array}\right] \quad \text{and} \quad E_2 = \left[\begin{array}{r} 3 & 0 \\ 0 & 1 \end{array}\right] \nonumber \nonumber\] perform row operations, when multiplied with a \(2 \times n \) matrix \(A \).
    1. Describe in words the row operations corresponding to \(E_1 \) and \(E_2 \).
    2. Describe in words the combined row operations corresponding to \(E_1E_2 \) and \(E_2E_1 \). Can you explain why \(E_1E_2 \neq E_2E_1 \)?
    3. Compute \(E_1E_2 \) and \(E_2E_1 \) to double check the last non-identity.

The second major difference between the product of numbers and the product of matrices: for two (e.g. real) numbers \(a \) and \(b \) it is known that \[ \text{if } a \neq 0 \text{ and } b \neq 0 \text{ then } ab \neq 0, \nonumber\] or, equivalently, \[ ab = 0 \Rightarrow a = 0 \text{ or } b = 0. \nonumber\] As the following example shows, things are different in the realm of matrices.

    Example
    \[ \left[\begin{array}{r} 1 & 2 \\ 2 & 4 \end{array}\right] \left[\begin{array}{r} 2 & 6 \\ -1 & -3 \end{array}\right] = \left[\begin{array}{r} 0 & 0 \\ 0 & 0 \end{array}\right]. \nonumber \nonumber\] So the product of two nonzero matrices may be the zero matrix.

The following example shows that things are even 'worse':

    Example
    \[ \left[\begin{array}{rr} 1 & -3 & 2 \\ 1 & -3 & 2 \\ 1 & -3 & 2 \end{array}\right] \left[\begin{array}{rr}1 & -3 & 2 \\ 1 & -3 & 2 \\ 1 & -3 & 2 \end{array}\right] = \left[\begin{array}{rr} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{array}\right], \nonumber \nonumber\] which shows that we cannot even conclude from \(A\cdot A = O \) that \(A \) itself must be the zero matrix.
    Note
    The next list gives six situations where matrix multiplication acts differently than multiplication of numbers. In fact, all statements can be related to one of the first two.
    1. In general, \(AB = BA \) does not hold for two \(n\times n \) matrices \(A \) and \(B \).
    2. In general, from \(AB = O \) it does not follow that either \(A =O \) or \(B = O \).
    3. In general, \((A+B)(A+B) = A^2 + 2AB + B^2 \) does not hold for two \(n\times n \) matrices \(A \) and \(B \).
    4. In general, \((A+B)(A-B) = A^2 - B^2 \) does not hold for two \(n\times n \) matrices \(A \) and \(B \).
    5. In general, from \(AB = AC \) and \(A \neq O \) it does not follow that \(B = C \).
6. In general, from \(A^2 = I \) it does not follow that either \(A = I \) or \(A = -I \).

For each statement counterexamples can be given, as we already did for the first two. To get more insight into what is really going on, we can also try to find out how the third through the sixth statements relate to the first two. For instance, the third statement is closely related to the first. Let us check where things go wrong: \[ \begin{array}{cl} (A+B)(A+B)& = A(A+B) +B(A+B)\\ & = A^2 + AB + BA + B^2. \end{array} \nonumber\] The last expression is equal to \[ A^2 + 2AB + B^2 \nonumber\] if and only if \[ AB + BA = 2AB \iff BA = AB. \nonumber\] So any pair of matrices \(A \) and \(B \) with \[ AB \neq BA \nonumber\] provides a counterexample where \[ (A+B)(A+B) \neq A^2 + 2AB + B^2. \nonumber\] Likewise, (v) follows from (ii): \[ AB = AC \iff AB - AC = O \iff A(B-C) = O. \nonumber\] According to (ii), from the last equation we cannot deduce that \[ \text{either } A = O \quad \text{or}\quad B-C = O. \nonumber\] We can create a counterexample by taking for \(A \) and \(B \) nonzero matrices for which \[ AB = O, \nonumber\] and letting \(C \) be the zero matrix. Then \(B \neq C \), whereas \[ AB = AC = O \text{ and (by assumption) } A \neq O. \nonumber\] Statement (vi) also relates to (ii): \[ A^2 = I \iff A^2 - I = (A+I)(A-I) = O, \nonumber\] from which we cannot conclude that one of the factors \((A+I) \) or \((A-I) \) must be the zero matrix. In this case we do not get a counterexample for free. You are asked to construct counterexamples in Exercise 46.
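The failure of the binomial formula in statement 3 is easy to exhibit with the non-commuting pair from the earlier example; a sketch (NumPy assumed):

```python
import numpy as np

A = np.array([[1, 3], [2, 1]])
B = np.array([[0, 1], [1, 2]])  # AB != BA, as computed earlier

lhs = (A + B) @ (A + B)
print(np.array_equal(lhs, A @ A + 2 * (A @ B) + B @ B))    # False
print(np.array_equal(lhs, A @ A + A @ B + B @ A + B @ B))  # True
```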
    Exercise
     
    1. Give a \(2 \times 2 \) matrix \(A \neq \pm I \) for which \(A^2 = I \).
    2. Give a \(2 \times 2 \) matrix \(A \) not containing any zeros, for which \(A^2 = I \).
    3. Give a \(2 \times 2 \) matrix \(B \) for which \(B^2 = -I \).

The following property connects the two operations of matrix transposition and matrix multiplication.

    Proposition
    If \(A \) is an \(m\times n \) matrix and \(B \) an \(n\times p \) matrix, then \[ (AB)^T = B^TA^T. \nonumber \nonumber\]

    Before we present the proof, we consider a typical example.

    Example
     
We verify the rule for the two matrices \[ A = \left[\begin{array}{rrr} 2 & 1 & -1 \\ 1 & -1 & 3 \end{array}\right] \quad\text{and}\quad B = \left[\begin{array}{rrr} 1 & -3 & 0\\ 4 & 2 & -1 \\ 5 & 2 & 1\end{array}\right]. \nonumber\] We can easily check: \[ AB = \left[\begin{array}{rrr} 2 & 1 & -1 \\ 1 & -1 & 3 \end{array}\right] \left[\begin{array}{rrr} 1 & -3 & 0 \\ 4 & 2 & -1\\ 5 & 2 & 1 \end{array}\right] = \left[\begin{array}{rrr} 1 & -6 & -2 \\ 12 & 1 & 4\end{array}\right] \nonumber\] and \[ B^TA^T = \left[\begin{array}{rrr} 1 & 4 & 5 \\ -3 & 2 & 2 \\ 0 & -1 & 1 \end{array}\right] \left[\begin{array}{rr} 2& 1 \\ 1 & -1 \\ -1 & 3 \end{array}\right] = \left[\begin{array}{rr} 1 & 12 \\ -6 & 1 \\ -2 & 4 \end{array}\right], \nonumber\] so that indeed \[ B^TA^T = \left[\begin{array}{rr} 1 & 12 \\ -6 & 1 \\ -2 & 4 \end{array}\right] = \left[\begin{array}{rrr} 1 & -6 & -2 \\ 12 & 1 & 4\end{array}\right]^T = (AB)^T. \nonumber\] Careful inspection shows that for the two matrix products exactly the same sums and products of numbers have to be computed. For instance, in both products the 12 is the sum of products \[ 12 = 1\cdot1 +4\cdot(-1) +5\cdot3 = 1\cdot1 +(-1)\cdot4 +3\cdot5. \nonumber\]
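The same verification in a few lines (NumPy assumed):

```python
import numpy as np

A = np.array([[2, 1, -1], [1, -1, 3]])
B = np.array([[1, -3, 0], [4, 2, -1], [5, 2, 1]])

print((A @ B).T)                             # [[ 1 12] [-6  1] [-2  4]]
print(np.array_equal((A @ B).T, B.T @ A.T))  # True
```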

As Example 48 illustrates, the rule is not restricted to square matrices \(A \) and \(B \). The proof for general matrices \(A \) and \(B \) for which the product \(AB \) is well defined is as follows.

    Proof
    To show that \[ (AB)^T = B^TA^T \nonumber \nonumber\] we have to show that the matrices have the same size, and are equal entry by entry. First, we see that \(AB \) is an \(m \times p \) matrix, so \((AB)^T \) is a \(p \times m \) matrix, and \(B^TA^T \), being the product of a \(p \times n \) matrix with an \(n \times m \), is also a \(p \times m \) matrix. Second, the \((i,j) \) entry of \((AB)^T \) is the \((j,i) \) entry of \(AB \), which is the (row-column) product of the \(j \)-th row of \(A \) and the \(i \)-th column of \(B \): \[ [(AB)^T]_{ij} = \left[\begin{array}{rrr} a_{j1} & a_{j2} & \ldots & a_{jn} \end{array}\right] \left[\begin{array}{r} b_{1i} \\ b_{2i} \\ \vdots \\ b_{ni} \end{array}\right]. \nonumber \nonumber\] The \((i,j) \) entry of \(B^TA^T \) is the product of the \(i \)-th row of \(B^T \) and the \(j \)-th column of \(A^T \). Now the \(i \)-th row of \(B^T \) is the \(i \)-th column of \(B \) written as a row, and the \(j \)-th column of \(A^T \) is the \(j \)-th row of \(A \) written as a column: \[ [B^TA^T]_{ij} = \left[\begin{array}{rrr} b_{1i} & b_{2i} & \ldots & b_{ni} \end{array}\right] \left[\begin{array}{r} a_{j1} \\ a_{j2} \\ \vdots \\ a_{jn} \end{array}\right]. \nonumber \nonumber\] Both row-column products end up as the same value \[ a_{j1}b_{1i} + a_{j2}b_{2i} + \ldots + a_{jn}b_{ni} = b_{1i}a_{j1} + b_{2i}a_{j2} + \ldots + b_{ni}a_{jn}. \nonumber \nonumber\]
    Note
    We already defined \(A^2 \) for a square matrix \(A \). We can extend this to higher powers of \(A \) in an obvious way: \[ A^3 = A(A^2),\quad A^4 = A(A^3), \quad \text{ and so on.} \nonumber \nonumber\] Since \[ A(A^2) = A(AA) = (AA)A, \nonumber \nonumber\] we can do without the parentheses. For the same reason \[ A^kA^{\ell} = A^{k+\ell}, \text{ for integers } k,\ell \geq 1. \nonumber \nonumber\] If we define \[ A^0 = I, \nonumber \nonumber\] then \[ A^kA^{\ell} = A^{k+\ell} \text{ holds for all integers } k,\ell \geq 0. \nonumber \nonumber\] And what can we say about \(A^{-1} \)? We will dedicate the whole next section to this topic.
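Powers behave as described; a sketch (NumPy assumed, using `np.linalg.matrix_power`):

```python
import numpy as np

A = np.array([[1, 1], [0, 1]])

A2 = np.linalg.matrix_power(A, 2)
A3 = np.linalg.matrix_power(A, 3)
print(np.array_equal(A2 @ A3, np.linalg.matrix_power(A, 5)))  # A^2 A^3 = A^5: True
print(np.linalg.matrix_power(A, 0))                           # A^0 = I: [[1 0] [0 1]]
```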

    3.2 Matrix Operations is shared under a CC BY license and was authored, remixed, and/or curated by LibreTexts.