2.4: Matrix Multiplication

    In Section [sec:2_2] matrix-vector products were introduced. If \(A\) is an \(m \times n\) matrix, the product \(A\mathbf{x}\) was defined for any \(n\)-column \(\mathbf{x}\) in \(\mathbb{R}^n\) as follows: If \(A = \left[ \begin{array}{cccc} \mathbf{a}_{1} & \mathbf{a}_{2} & \cdots & \mathbf{a}_{n} \end{array} \right]\) where the \(\mathbf{a}_{j}\) are the columns of \(A\), and if \(\mathbf{x} = \left[ \begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array} \right]\), Definition [def:002668] reads

    \[\label{eq:mvectorprod} A\mathbf{x} = x_{1}\mathbf{a}_{1} + x_{2}\mathbf{a}_{2} + \cdots + x_{n}\mathbf{a}_{n} \]

    This was motivated as a way of describing systems of linear equations with coefficient matrix \(A\). Indeed every such system has the form \(A\mathbf{x} = \mathbf{b}\) where \(\mathbf{b}\) is the column of constants.

    In this section we extend this matrix-vector multiplication to a way of multiplying matrices in general, and then investigate matrix algebra for its own sake. While it shares several properties of ordinary arithmetic, it will soon become clear that matrix arithmetic is different in a number of ways.

    Matrix multiplication is closely related to composition of transformations.

    Composition and Matrix Multiplication

    Sometimes two transformations “link” together as follows:

    \[\mathbb{R}^{k} \xrightarrow{T} \mathbb{R}^{n} \xrightarrow{S} \mathbb{R}^{m} \nonumber \]

    In this case we can apply \(T\) first and then apply \(S\), and the result is a new transformation

    \[S \circ T : \mathbb{R}^{k} \rightarrow \mathbb{R}^{m} \nonumber \]

    called the composite of \(S\) and \(T\), defined by

    \[(S \circ T)(\mathbf{x}) = S \left[ T(\mathbf{x}) \right] \quad \mbox{ for all } \mathbf{x} \mbox{ in } \mathbb{R}^{k} \nonumber \]

    The action of \(S \circ T\) can be described as “first \(T\) then \(S\)” (note the order!).\(^{1}\) The reader will have encountered composition of ordinary functions: for example, consider \(\mathbb{R} \xrightarrow{g} \mathbb{R} \xrightarrow{f} \mathbb{R}\) where \(f(x) = x^{2}\) and \(g(x) = x + 1\) for all \(x\) in \(\mathbb{R}\). Then

    \[\begin{aligned} (f \circ g)(x) &= f \left[ g(x) \right] = f(x + 1) = (x + 1)^{2} \\ (g \circ f)(x) &= g \left[ f(x) \right] = g(x^{2}) = x^{2} + 1\end{aligned} \nonumber \]

    for all \(x\) in \(\mathbb{R}\).

    Our concern here is with matrix transformations. Suppose that \(A\) is an \(m \times n\) matrix and \(B\) is an \(n \times k\) matrix, and let \(\mathbb{R}^{k} \xrightarrow{T_{B}} \mathbb{R}^{n} \xrightarrow{T_{A}} \mathbb{R}^{m}\) be the matrix transformations induced by \(B\) and \(A\) respectively, that is:

    \[T_{B}(\mathbf{x}) = B\mathbf{x} \mbox{ for all } \mathbf{x} \mbox{ in } \mathbb{R}^{k} \quad \mbox{ and } \quad T_{A}(\mathbf{y}) = A\mathbf{y} \mbox{ for all } \mathbf{y} \mbox{ in } \mathbb{R}^{n} \nonumber \]

    Write \(B = \left[ \begin{array}{cccc} \mathbf{b}_{1} & \mathbf{b}_{2} & \cdots & \mathbf{b}_{k} \end{array} \right]\) where \(\mathbf{b}_{j}\) denotes column \(j\) of \(B\) for each \(j\). Hence each \(\mathbf{b}_{j}\) is an \(n\)-vector (\(B\) is \(n \times k\)) so we can form the matrix-vector product \(A\mathbf{b}_{j}\). In particular, we obtain an \(m \times k\) matrix

    \[\left[ \begin{array}{cccc} A\mathbf{b}_{1} & A\mathbf{b}_{2} & \cdots & A\mathbf{b}_{k} \end{array} \right] \nonumber \]

    with columns \(A\mathbf{b}_{1}, A\mathbf{b}_{2}, \cdots, A\mathbf{b}_{k}\). Now compute \((T_{A} \circ T_{B})(\mathbf{x})\) for any \(\mathbf{x} = \left[ \begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{k} \end{array} \right]\) in \(\mathbb{R}^k\):

    \[\begin{array}{lllll} (T_{A} \circ T_{B})(\mathbf{x}) & = & T_{A} \left[ T_{B}(\mathbf{x}) \right] & & \mbox{Definition of } T_{A} \circ T_{B} \\ & = & A(B\mathbf{x}) & & A \mbox{ and } B \mbox{ induce } T_{A} \mbox{ and } T_{B} \\ & = & A(x_{1}\mathbf{b}_{1} + x_{2}\mathbf{b}_{2} + \cdots + x_{k}\mathbf{b}_{k}) & & \mbox{Equation~\ref{eq:mvectorprod} above} \\ & = & A(x_{1}\mathbf{b}_{1}) + A(x_{2}\mathbf{b}_{2}) + \cdots + A(x_{k}\mathbf{b}_{k}) & & \mbox{Theorem~\ref{thm:002811}} \\ & = & x_{1}(A\mathbf{b}_{1}) + x_{2}(A\mathbf{b}_{2}) + \cdots + x_{k}(A\mathbf{b}_{k}) & & \mbox{Theorem~\ref{thm:002811}} \\ & = & \left[ \begin{array}{cccc} A\mathbf{b}_{1} & A\mathbf{b}_{2} & \cdots & A\mathbf{b}_{k} \end{array} \right] \mathbf{x} & & \mbox{Equation~\ref{eq:mvectorprod} above} \end{array} \nonumber \]

    Because \(\mathbf{x}\) was an arbitrary vector in \(\mathbb{R}^k\), this shows that \(T_{A} \circ T_{B}\) is the matrix transformation induced by the matrix \(\left[ \begin{array}{cccc} A\mathbf{b}_{1} & A\mathbf{b}_{2} & \cdots & A\mathbf{b}_{k} \end{array} \right]\). This motivates the following definition.

    Matrix Multiplication003447 Let \(A\) be an \(m \times n\) matrix, let \(B\) be an \(n \times k\) matrix, and write \(B = \left[ \begin{array}{cccc} \mathbf{b}_{1} & \mathbf{b}_{2} & \cdots & \mathbf{b}_{k} \end{array} \right]\) where \(\mathbf{b}_{j}\) is column \(j\) of \(B\) for each \(j\). The product matrix \(AB\) is the \(m \times k\) matrix defined as follows:

    \[AB = A \left[ \begin{array}{cccc} \mathbf{b}_{1} & \mathbf{b}_{2} & \cdots & \mathbf{b}_{k} \end{array} \right] = \left[ \begin{array}{cccc} A\mathbf{b}_{1} & A\mathbf{b}_{2} & \cdots & A\mathbf{b}_{k} \end{array} \right] \nonumber \]

    Thus the product matrix \(AB\) is given in terms of its columns \(A\mathbf{b}_{1}, A\mathbf{b}_{2}, \dots, A\mathbf{b}_{k}\): Column \(j\) of \(AB\) is the matrix-vector product \(A\mathbf{b}_{j}\) of \(A\) and the corresponding column \(\mathbf{b}_{j}\) of \(B\). Note that each such product \(A\mathbf{b}_{j}\) makes sense by Definition [def:002668] because \(A\) is \(m \times n\) and each \(\mathbf{b}_{j}\) is in \(\mathbb{R}^n\) (since \(B\) has \(n\) rows). Note also that if \(B\) is a column matrix, this definition reduces to Definition [def:002668] for matrix-vector multiplication.

    Given matrices \(A\) and \(B\), Definition [def:003447] and the above computation give

    \[A(B\mathbf{x}) = \left[ \begin{array}{cccc} A\mathbf{b}_{1} & A\mathbf{b}_{2} & \cdots & A\mathbf{b}_{k} \end{array} \right] \mathbf{x} = (AB)\mathbf{x} \nonumber \]

    for all \(\mathbf{x}\) in \(\mathbb{R}^k\). We record this for reference.

    003469 Let \(A\) be an \(m \times n\) matrix and let \(B\) be an \(n \times k\) matrix. Then the product matrix \(AB\) is \(m \times k\) and satisfies

    \[A(B\mathbf{x}) = (AB)\mathbf{x} \quad \mbox{ for all } \mathbf{x} \mbox{ in } \mathbb{R}^{k} \nonumber \]

    Here is an example of how to compute the product \(AB\) of two matrices using Definition [def:003447].

    003474 Compute \(AB\) if \(A = \left[ \begin{array}{rrr} 2 & 3 & 5 \\ 1 & 4 & 7 \\ 0 & 1 & 8 \end{array} \right]\) and \(B = \left[ \begin{array}{rr} 8 & 9 \\ 7 & 2 \\ 6 & 1 \end{array} \right]\).

    The columns of \(B\) are \(\mathbf{b}_{1} = \left[ \begin{array}{r} 8 \\ 7 \\ 6 \end{array} \right]\) and \(\mathbf{b}_{2} = \left[ \begin{array}{r} 9 \\ 2 \\ 1 \end{array} \right]\), so Definition [def:002668] gives

    \[A\mathbf{b}_{1} = \left[ \begin{array}{rrr} 2 & 3 & 5 \\ 1 & 4 & 7 \\ 0 & 1 & 8 \end{array} \right] \left[ \begin{array}{r} 8 \\ 7 \\ 6 \end{array} \right] = \left[ \begin{array}{r} 67 \\ 78 \\ 55 \end{array} \right] \mbox{ and } A\mathbf{b}_{2} = \left[ \begin{array}{rrr} 2 & 3 & 5 \\ 1 & 4 & 7 \\ 0 & 1 & 8 \end{array} \right] \left[ \begin{array}{r} 9 \\ 2 \\ 1 \end{array} \right] = \left[ \begin{array}{r} 29 \\ 24 \\ 10 \end{array} \right] \nonumber \]

    Hence Definition [def:003447] above gives \(AB = \left[ \begin{array}{cc} A\mathbf{b}_{1} & A\mathbf{b}_{2} \end{array} \right] = \left[ \begin{array}{rr} 67 & 29 \\ 78 & 24 \\ 55 & 10 \end{array} \right]\).
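
    For readers who like to check such calculations by machine, here is a minimal NumPy sketch (ours, not part of the text) that builds \(AB\) one column at a time, exactly as Definition [def:003447] prescribes, using the matrices of this example.

```python
import numpy as np

A = np.array([[2, 3, 5],
              [1, 4, 7],
              [0, 1, 8]])
B = np.array([[8, 9],
              [7, 2],
              [6, 1]])

# Column j of AB is the matrix-vector product A @ b_j.
AB = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])
print(AB)                          # [[67 29] [78 24] [55 10]]
print(np.array_equal(AB, A @ B))   # True: agrees with NumPy's built-in product
```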

    If \(A\) is \(m \times n\) and \(B\) is \(n \times k\), Theorem [thm:003469] gives a simple formula for the composite of the matrix transformations \(T_{A}\) and \(T_{B}\):

    \[T_{A} \circ T_{B} = T_{AB} \nonumber \]

    Given any \(\mathbf{x}\) in \(\mathbb{R}^{k}\),

    \[\begin{aligned} (T_{A} \circ T_{B})(\mathbf{x}) &= T_{A}[T_{B}(\mathbf{x})] \\ &= A[B\mathbf{x}]\\ &= (AB)\mathbf{x}\\ &= T_{AB}(\mathbf{x})\end{aligned} \nonumber \]
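
    A quick numerical spot check of this formula is easy to run; the sketch below (ours, with arbitrarily chosen sizes) verifies \(A(B\mathbf{x}) = (AB)\mathbf{x}\) for random integer matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 6, size=(3, 4))   # m x n with m = 3, n = 4
B = rng.integers(-5, 6, size=(4, 2))   # n x k with k = 2
x = rng.integers(-5, 6, size=2)        # x in R^k

# First T_B then T_A agrees with the single transformation T_{AB}.
print(np.array_equal(A @ (B @ x), (A @ B) @ x))   # True
```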

    While Definition [def:003447] is important, there is another way to compute the matrix product \(AB\) that gives a way to calculate each individual entry. In Section [sec:2_2] we defined the dot product of two \(n\)-tuples to be the sum of the products of corresponding entries. We went on to show (Theorem [thm:002903]) that if \(A\) is an \(m \times n\) matrix and \(\mathbf{x}\) is an \(n\)-vector, then entry \(j\) of the product \(A\mathbf{x}\) is the dot product of row \(j\) of \(A\) with \(\mathbf{x}\). This observation was called the “dot product rule” for matrix-vector multiplication, and the next theorem shows that it extends to matrix multiplication in general.

    Dot Product Rule003488 Let \(A\) and \(B\) be matrices of sizes \(m \times n\) and \(n \times k\), respectively. Then the \((i, j)\)-entry of \(AB\) is the dot product of row \(i\) of \(A\) with column \(j\) of \(B\).

    Write \(B = \left[ \begin{array}{cccc} \mathbf{b}_{1} & \mathbf{b}_{2} & \cdots & \mathbf{b}_{k} \end{array} \right]\) in terms of its columns. Then \(A\mathbf{b}_{j}\) is column \(j\) of \(AB\) for each \(j\). Hence the \((i, j)\)-entry of \(AB\) is entry \(i\) of \(A\mathbf{b}_{j}\), which is the dot product of row \(i\) of \(A\) with \(\mathbf{b}_{j}\). This proves the theorem.

    Thus to compute the \((i, j)\)-entry of \(AB\), proceed as follows (see the diagram):

    Go across row \(i\) of \(A\), and down column \(j\) of \(B\), multiply corresponding entries, and add the results.

    \[\left[ \begin{array}{ccc} & & \\ a_{i1} & \cdots & a_{in} \\ & & \end{array} \right] \left[ \begin{array}{ccc} & b_{1j} & \\ & \vdots & \\ & b_{nj} & \end{array} \right] = \left[ \begin{array}{ccc} & & \\ & a_{i1}b_{1j} + \cdots + a_{in}b_{nj} & \\ & & \end{array} \right] \nonumber \]

    Note that this requires that the rows of \(A\) be the same length as the columns of \(B\). The following rule is useful for remembering this and for deciding the size of the product matrix \(AB\).

    Let \(A\) and \(B\) denote matrices. If \(A\) is \(m \times n\) and \(B\) is \(n^\prime \times k\), the product \(AB\) can be formed if and only if \(n=n^\prime\). In this case the size of the product matrix \(AB\) is \(m \times k\), and we say that \(AB\) is defined, or that \(A\) and \(B\) are compatible for multiplication.

    This gives a useful mnemonic for size: in \((m \times n)(n \times k)\) the adjacent “inner” dimensions must agree, and the “outer” dimensions \(m \times k\) give the size of \(AB\). We adopt the following convention:

    Whenever a product of matrices is written, it is tacitly assumed that the sizes of the factors are such that the product is defined.

    To illustrate the dot product rule, we recompute the matrix product in Example [exa:003474].

    003516 Compute \(AB\) if \(A = \left[ \begin{array}{rrr} 2 & 3 & 5 \\ 1 & 4 & 7 \\ 0 & 1 & 8 \end{array} \right]\) and \(B = \left[ \begin{array}{rr} 8 & 9 \\ 7 & 2 \\ 6 & 1 \end{array} \right]\).

    Here \(A\) is \(3 \times 3\) and \(B\) is \(3 \times 2\), so the product matrix \(AB\) is defined and will be of size \(3 \times 2\). Theorem [thm:003488] gives each entry of \(AB\) as the dot product of the corresponding row of \(A\) with the corresponding column of \(B\); that is,

    \[AB = \left[ \begin{array}{rrr} 2 & 3 & 5 \\ 1 & 4 & 7 \\ 0 & 1 & 8 \end{array} \right] \left[ \begin{array}{rr} 8 & 9 \\ 7 & 2 \\ 6 & 1 \end{array} \right] = \left[ \arraycolsep=8pt \begin{array}{cc} 2 \cdot 8 + 3 \cdot 7 + 5 \cdot 6 & 2 \cdot 9 + 3 \cdot 2 + 5 \cdot 1 \\ 1 \cdot 8 + 4 \cdot 7 + 7 \cdot 6 & 1 \cdot 9 + 4 \cdot 2 + 7 \cdot 1 \\ 0 \cdot 8 + 1 \cdot 7 + 8 \cdot 6 & 0 \cdot 9 + 1 \cdot 2 + 8 \cdot 1 \end{array} \right] = \left[ \begin{array}{rr} 67 & 29 \\ 78 & 24 \\ 55 & 10 \end{array} \right] \nonumber \]

    Of course, this agrees with Example [exa:003474].
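
    The dot product rule translates directly into the familiar triple loop. The following plain-Python sketch (ours; the function name mat_mult is our own) computes each entry of \(AB\) exactly as Theorem [thm:003488] describes.

```python
def mat_mult(A, B):
    """(i, j)-entry of AB = dot product of row i of A with column j of B."""
    m, n, k = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "rows of A must match columns of B in length"
    # Go across row i of A, down column j of B, multiply and add.
    return [[sum(A[i][p] * B[p][j] for p in range(n)) for j in range(k)]
            for i in range(m)]

A = [[2, 3, 5], [1, 4, 7], [0, 1, 8]]
B = [[8, 9], [7, 2], [6, 1]]
print(mat_mult(A, B))   # [[67, 29], [78, 24], [55, 10]]
```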

    003527 Compute the \((1, 3)\)- and \((2, 4)\)-entries of \(AB\) where

    \[A = \left[ \begin{array}{rrr} 3 & -1 & 2 \\ 0 & 1 & 4 \end{array} \right] \mbox{ and } B = \left[ \begin{array}{rrrr} 2 & 1 & 6 & 0 \\ 0 & 2 & 3 & 4 \\ -1 & 0 & 5 & 8 \end{array} \right]. \nonumber \]

    Then compute \(AB\).

    The \((1, 3)\)-entry of \(AB\) is the dot product of row 1 of \(A\) and column 3 of \(B\) (highlighted in the following display), computed by multiplying corresponding entries and adding the results.

    \[\left[ \begin{array}{rrr} 3 & -1 & 2 \\ 0 & 1 & 4 \end{array} \right] \left[ \begin{array}{rrrr} 2 & 1 & 6 & 0 \\ 0 & 2 & 3 & 4 \\ -1 & 0 & 5 & 8 \end{array} \right] \qquad (1, 3)\mbox{-entry} = 3 \cdot 6 + (-1) \cdot 3 + 2 \cdot 5 = 25 \nonumber \]

    Similarly, the \((2, 4)\)-entry of \(AB\) involves row 2 of \(A\) and column 4 of \(B\).

    \[\left[ \begin{array}{rrr} 3 & -1 & 2 \\ 0 & 1 & 4 \end{array} \right] \left[ \begin{array}{rrrr} 2 & 1 & 6 & 0 \\ 0 & 2 & 3 & 4 \\ -1 & 0 & 5 & 8 \end{array} \right] \qquad (2, 4)\mbox{-entry} = 0 \cdot 0 + 1 \cdot 4 + 4 \cdot 8 = 36 \nonumber \]

    Since \(A\) is \(2 \times 3\) and \(B\) is \(3 \times 4\), the product is \(2 \times 4\).

    \[AB = \left[ \begin{array}{rrr} 3 & -1 & 2 \\ 0 & 1 & 4 \end{array} \right] \left[ \begin{array}{rrrr} 2 & 1 & 6 & 0 \\ 0 & 2 & 3 & 4 \\ -1 & 0 & 5 & 8 \end{array} \right] = \left[ \begin{array}{rrrr} 4 & 1 & 25 & 12 \\ -4 & 2 & 23 & 36 \end{array} \right] \nonumber \]

    003540 If \(A = \left[ \begin{array}{ccc} 1 & 3 & 2\end{array}\right]\) and \(B = \left[ \begin{array}{r} 5 \\ 6 \\ 4 \end{array} \right]\), compute \(A^{2}\), \(AB\), \(BA\), and \(B^{2}\) when they are defined.

    Here, \(A\) is a \(1 \times 3\) matrix and \(B\) is a \(3 \times 1\) matrix, so \(A^{2}\) and \(B^{2}\) are not defined. However, the compatibility rule reads

    \[\begin{array}{ccc} \begin{array}{cc} A & B\\ 1 \times 3 & 3 \times 1 \end{array} & \mbox{ and } & \begin{array}{cc} B & A \\ 3 \times 1 & 1 \times 3 \end{array} \end{array} \nonumber \]

    so both \(AB\) and \(BA\) can be formed and these are \(1 \times 1\) and \(3 \times 3\) matrices, respectively.

    \[AB = \left[ \begin{array}{rrr} 1 & 3 & 2 \end{array} \right] \left[ \begin{array}{r} 5 \\ 6 \\ 4 \end{array} \right] = \left[ \begin{array}{c} 1 \cdot 5 + 3 \cdot 6 + 2 \cdot 4 \end{array} \right] = \arraycolsep=1.5pt \left[ \begin{array}{c} 31 \end{array}\right] \nonumber \]

    \[BA = \left[ \begin{array}{r} 5 \\ 6 \\ 4 \end{array} \right] \left[ \begin{array}{rrr} 1 & 3 & 2 \end{array} \right] = \left[ \begin{array}{rrr} 5 \cdot 1 & 5 \cdot 3 & 5 \cdot 2 \\ 6 \cdot 1 & 6 \cdot 3 & 6 \cdot 2 \\ 4 \cdot 1 & 4 \cdot 3 & 4 \cdot 2 \end{array} \right] = \left[ \begin{array}{rrr} 5 & 15 & 10 \\ 6 & 18 & 12 \\ 4 & 12 & 8 \end{array} \right] \nonumber \]
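
    A one-line check in NumPy (ours, not the text's) makes the size contrast vivid: the same two matrices give a \(1 \times 1\) product one way and a \(3 \times 3\) product the other.

```python
import numpy as np

A = np.array([[1, 3, 2]])       # 1 x 3 row matrix
B = np.array([[5], [6], [4]])   # 3 x 1 column matrix

print(A @ B)   # [[31]]                                  -- a 1 x 1 matrix
print(B @ A)   # [[5 15 10] [6 18 12] [4 12 8]]          -- a 3 x 3 matrix
```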

    Unlike numerical multiplication, matrix products \(AB\) and \(BA\) need not be equal. In fact they need not even be the same size, as Example [exa:003540] shows. It turns out to be rare that \(AB = BA\) (although it is by no means impossible), and \(A\) and \(B\) are said to commute when this happens.

    003554 Let \(A = \left[ \begin{array}{rr} 6 & 9 \\ -4 & -6 \end{array} \right]\) and \(B = \left[ \begin{array}{rr} 1 & 2 \\ -1 & 0 \end{array} \right]\). Compute \(A^{2}\), \(AB\), \(BA\).

    \(A^{2} = \left[ \begin{array}{rr} 6 & 9 \\ -4 & -6 \end{array} \right] \left[ \begin{array}{rr} 6 & 9 \\ -4 & -6 \end{array} \right] = \left[ \begin{array}{rr} 0 & 0 \\ 0 & 0 \end{array} \right]\), so \(A^{2} = 0\) can occur even if \(A \neq 0\). Next,

    \[\begin{aligned} AB & = \left[ \begin{array}{rr} 6 & 9 \\ -4 & -6 \end{array} \right] \left[ \begin{array}{rr} 1 & 2 \\ -1 & 0 \end{array} \right] = \left[ \begin{array}{rr} -3 & 12 \\ 2 & -8 \end{array} \right] \\ BA & = \left[ \begin{array}{rr} 1 & 2 \\ -1 & 0 \end{array} \right] \left[ \begin{array}{rr} 6 & 9 \\ -4 & -6 \end{array} \right] = \left[ \begin{array}{rr} -2 & -3 \\ -6 & -9 \end{array} \right]\end{aligned} \nonumber \]

    Hence \(AB \neq BA\), even though \(AB\) and \(BA\) are the same size.
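
    Both phenomena in this example are easy to confirm numerically; here is a brief sketch (ours) using the same matrices.

```python
import numpy as np

A = np.array([[6, 9], [-4, -6]])
B = np.array([[1, 2], [-1, 0]])

print(A @ A)                          # the zero matrix, although A != 0
print(np.array_equal(A @ B, B @ A))   # False: A and B do not commute
```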

    003567 If \(A\) is any matrix, then \(IA = A\) and \(AI = A\), where \(I\) denotes an identity matrix of a size such that the multiplications are defined.

    These both follow from the dot product rule as the reader should verify. For a more formal proof, write \(A = \left[ \begin{array}{rrrr} \mathbf{a}_{1} & \mathbf{a}_{2} & \cdots & \mathbf{a}_{n} \end{array} \right]\) where \(\mathbf{a}_{j}\) is column \(j\) of \(A\). Then Definition [def:003447] and Example [exa:002949] give

    \[IA = \left[ \begin{array}{rrrr} I\mathbf{a}_{1} & I\mathbf{a}_{2} & \cdots & I\mathbf{a}_{n} \end{array} \right] = \left[ \begin{array}{rrrr} \mathbf{a}_{1} & \mathbf{a}_{2} & \cdots & \mathbf{a}_{n} \end{array} \right] = A \nonumber \]

    If \(\mathbf{e}_{j}\) denotes column \(j\) of \(I\), then \(A\mathbf{e}_{j} = \mathbf{a}_{j}\) for each \(j\) by Example [exa:002964]. Hence Definition [def:003447] gives:

    \[AI = A \left[ \begin{array}{rrrr} \mathbf{e}_{1} & \mathbf{e}_{2} & \cdots & \mathbf{e}_{n} \end{array} \right] = \left[ \begin{array}{rrrr} A\mathbf{e}_{1} & A\mathbf{e}_{2} & \cdots & A\mathbf{e}_{n} \end{array} \right] = \left[ \begin{array}{rrrr} \mathbf{a}_{1} & \mathbf{a}_{2} & \cdots & \mathbf{a}_{n} \end{array} \right] = A \nonumber \]

    The following theorem collects several results about matrix multiplication that are used everywhere in linear algebra.

    003584 Assume that \(a\) is any scalar, and that \(A\), \(B\), and \(C\) are matrices of sizes such that the indicated matrix products are defined. Then:

    1. \(IA = A\) and \(AI = A\) where \(I\) denotes an identity matrix.
    2. \(A(BC) = (AB)C\).
    3. \(A(B + C) = AB + AC\).
    4. \((B + C)A = BA + CA\).
    5. \(a(AB) = (aA)B = A(aB)\).
    6. \((AB)^{T} = B^{T}A^{T}\).

    Condition (1) is Example [exa:003567]; we prove (2), (4), and (6) and leave (3) and (5) as exercises.

    2. If \(C = \left[ \begin{array}{cccc} \mathbf{c}_{1} & \mathbf{c}_{2} & \cdots & \mathbf{c}_{k} \end{array} \right]\) in terms of its columns, then \(BC = \left[ \begin{array}{cccc} B\mathbf{c}_{1} & B\mathbf{c}_{2} & \cdots & B\mathbf{c}_{k} \end{array} \right]\) by Definition [def:003447], so

      \[\begin{array}{lllll} A(BC) & = & \left[ \begin{array}{rrrr} A(B\mathbf{c}_{1}) & A(B\mathbf{c}_{2}) & \cdots & A(B\mathbf{c}_{k}) \end{array} \right] & & \mbox{Definition~\ref{def:003447}} \\ & & & & \\ & = & \left[ \begin{array}{rrrr} (AB)\mathbf{c}_{1} & (AB)\mathbf{c}_{2} & \cdots & (AB)\mathbf{c}_{k} \end{array} \right] & & \mbox{Theorem~\ref{thm:003469}} \\ & & & & \\ & = & (AB)C & & \mbox{Definition~\ref{def:003447}} \end{array} \nonumber \]

    4. We know (Theorem [thm:002811]) that \((B + C)\mathbf{x} = B\mathbf{x} + C\mathbf{x}\) holds for every column \(\mathbf{x}\). If we write
      \(A = \left[ \begin{array}{rrrr} \mathbf{a}_{1} & \mathbf{a}_{2} & \cdots & \mathbf{a}_{n} \end{array} \right]\) in terms of its columns, we get

      \[\begin{array}{lllll} (B + C)A & = & \left[ \begin{array}{rrrr} (B + C)\mathbf{a}_{1} & (B + C)\mathbf{a}_{2} & \cdots & (B + C)\mathbf{a}_{n} \end{array} \right] & & \mbox{Definition~\ref{def:003447}} \\ & & & & \\ & = & \left[ \begin{array}{rrrr} B\mathbf{a}_{1} + C\mathbf{a}_{1} & B\mathbf{a}_{2} + C\mathbf{a}_{2} & \cdots & B\mathbf{a}_{n} + C\mathbf{a}_{n} \end{array} \right] & & \mbox{Theorem~\ref{thm:002811}} \\ & & & & \\ & = & \left[ \begin{array}{rrrr} B\mathbf{a}_{1} & B\mathbf{a}_{2} & \cdots & B\mathbf{a}_{n} \end{array} \right] + \left[ \begin{array}{rrrr} C\mathbf{a}_{1} & C\mathbf{a}_{2} & \cdots & C\mathbf{a}_{n} \end{array} \right] & & \mbox{Adding Columns} \\ & & & & \\ & = & BA + CA & & \mbox{Definition~\ref{def:003447}} \end{array} \nonumber \]

    6. As in Section [sec:2_1], write \(A = [a_{ij}]\) and \(B = [b_{ij}]\), so that \(A^{T} = [a^\prime_{ij}]\) and \(B^{T} = [b^\prime_{ij}]\) where \(a^\prime_{ij} = a_{ji}\) and \(b^\prime_{ij} = b_{ji}\) for all \(i\) and \(j\). If \(c_{ij}\) denotes the \((i, j)\)-entry of \(B^{T}A^{T}\), then \(c_{ij}\) is the dot product of row \(i\) of \(B^{T}\) with column \(j\) of \(A^{T}\). Hence

      \[\begin{aligned} c_{ij} = b_{i1}^\prime a_{1j}^\prime + b_{i2}^\prime a_{2j}^\prime + \cdots + b_{in}^\prime a_{nj}^\prime &= b_{1i} a_{j1} + b_{2i} a_{j2} + \cdots + b_{ni} a_{jn} \\ &= a_{j1}b_{1i} + a_{j2}b_{2i} + \cdots + a_{jn}b_{ni} \end{aligned} \nonumber \]

      This last sum is the dot product of row \(j\) of \(A\) with column \(i\) of \(B\); that is, it is the \((j, i)\)-entry of \(AB\), which is the \((i, j)\)-entry of \((AB)^{T}\). Hence \(B^{T}A^{T} = (AB)^{T}\), proving (6).

    Property 2 in Theorem [thm:003584] is called the associative law of matrix multiplication. It asserts that the equation \(A(BC) = (AB)C\) holds for all matrices (if the products are defined). Hence this product is the same no matter how it is formed, and so is written simply as \(ABC\). This extends: The product \(ABCD\) of four matrices can be formed several ways—for example, \((AB)(CD)\), \([A(BC)]D\), and \(A[B(CD)]\)—but the associative law implies that they are all equal and so are written as \(ABCD\). A similar remark applies in general: Matrix products can be written unambiguously with no parentheses.

    However, a note of caution is in order: the fact that \(AB\) and \(BA\) need not be equal means that the order of the factors matters in a product of matrices. For example, \(ABCD\) and \(ADCB\) may not be equal.

    If the order of the factors in a product of matrices is changed, the product matrix may change (or may not be defined). Ignoring this warning is a source of many errors by students of linear algebra!

    Properties 3 and 4 in Theorem [thm:003584] are called distributive laws. They assert that \(A(B + C) = AB + AC\) and \((B + C)A = BA + CA\) hold whenever the sums and products are defined. These rules extend to more than two terms and, together with Property 5, ensure that many manipulations familiar from ordinary algebra extend to matrices. For example

    \[\begin{aligned} A(2B - 3C + D - 5E) & = 2AB - 3AC + AD - 5AE \\ (A + 3C - 2D)B & = AB + 3CB - 2DB\end{aligned} \nonumber \]

    Note again that the order warning is in effect: for example, \(A(B - C)\) equals \(AB - AC\) but need not equal \(AB - CA\). These rules make possible a great deal of simplification of matrix expressions.

    003663 Simplify the expression \(A(BC - CD) + A(C - B)D - AB(C - D)\).

    \[\begin{aligned} A(BC - CD) + A(C - B)D - AB(C - D) &= A(BC) - A(CD) + (AC-AB)D - (AB)C + (AB)D \\ &= ABC - ACD + ACD - ABD - ABC + ABD \\ &= 0\end{aligned} \nonumber \]
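
    Identities like this one can be spot-checked with random matrices; the sketch below (ours) evaluates the original expression for random \(4 \times 4\) matrices and confirms it vanishes.

```python
import numpy as np

rng = np.random.default_rng(1)
A, B, C, D = (rng.standard_normal((4, 4)) for _ in range(4))

# A(BC - CD) + A(C - B)D - AB(C - D) should simplify to the zero matrix.
expr = A @ (B @ C - C @ D) + A @ (C - B) @ D - A @ B @ (C - D)
print(np.allclose(expr, 0))   # True
```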

    Example [exa:003671] and Example [exa:003678] below show how we can use the properties in Theorem [thm:003584] to deduce other facts about matrix multiplication. Matrices \(A\) and \(B\) are said to commute if \(AB = BA\).

    003671 Suppose that \(A\), \(B\), and \(C\) are \(n \times n\) matrices and that both \(A\) and \(B\) commute with \(C\); that is, \(AC = CA\) and \(BC = CB\). Show that \(AB\) commutes with \(C\).

    Showing that \(AB\) commutes with \(C\) means verifying that \((AB)C = C(AB)\). The computation uses the associative law several times, as well as the given facts that \(AC = CA\) and \(BC = CB\).

    \[(AB)C = A(BC) = A(CB) = (AC)B = (CA)B = C(AB) \nonumber \]

    003678 Show that \(AB = BA\) if and only if \((A - B)(A + B) = A^{2} - B^{2}\).

    The following always holds:

    \[\label{eq:commute} (A - B)(A + B) = A(A + B) - B(A + B) = A^{2} + AB - BA -B^{2} \]

    Hence if \(AB = BA\), then \((A - B)(A + B) = A^{2} - B^{2}\) follows. Conversely, if this last equation holds, then equation ([eq:commute]) becomes

    \[A^{2} - B^{2} = A^{2} + AB - BA - B^{2} \nonumber \]

    This gives \(0 = AB - BA\), and \(AB = BA\) follows.
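
    Here is a small numerical illustration (ours): the identity holds when \(B\) is a polynomial in \(A\) (such matrices always commute with \(A\)) and fails for a typical non-commuting pair.

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.]])

B = 2 * A + np.eye(2)                # a polynomial in A, so AB = BA
print(np.allclose((A - B) @ (A + B), A @ A - B @ B))   # True

B = np.array([[0., 1.], [0., 0.]])   # here AB != BA
print(np.allclose((A - B) @ (A + B), A @ A - B @ B))   # False
```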

    In Section [sec:2_2] we saw (in Theorem [thm:002684]) that every system of linear equations has the form

    \[A\mathbf{x} = \mathbf{b} \nonumber \]

    where \(A\) is the coefficient matrix, \(\mathbf{x}\) is the column of variables, and \(\mathbf{b}\) is the constant matrix. Thus the system of linear equations becomes a single matrix equation. Matrix multiplication can yield information about such a system.

    003694 Consider a system \(A\mathbf{x} = \mathbf{b}\) of linear equations where \(A\) is an \(m \times n\) matrix. Assume that a matrix \(C\) exists such that \(CA = I_{n}\). If the system \(A\mathbf{x} = \mathbf{b}\) has a solution, show that this solution must be \(C\mathbf{b}\). Give a condition guaranteeing that \(C\mathbf{b}\) is in fact a solution.

    Suppose that \(\mathbf{x}\) is any solution to the system, so that \(A\mathbf{x} = \mathbf{b}\). Multiply both sides of this matrix equation by \(C\) to obtain, successively,

    \[C(A\mathbf{x}) = C\mathbf{b}, \quad (CA)\mathbf{x} = C\mathbf{b}, \quad I_{n}\mathbf{x} = C\mathbf{b}, \quad \mathbf{x} = C\mathbf{b} \nonumber \]

    This shows that if the system has a solution \(\mathbf{x}\), then that solution must be \(\mathbf{x} = C\mathbf{b}\), as required. But it does not guarantee that the system has a solution. However, if we write \(\mathbf{x}_{1} = C\mathbf{b}\), then

    \[A\mathbf{x}_{1} = A(C\mathbf{b}) = (AC)\mathbf{b} \nonumber \]

    Thus \(\mathbf{x}_{1} = C\mathbf{b}\) will be a solution if the condition \(AC = I_{m}\) is satisfied.
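
    As a concrete instance (ours, under the assumption that \(A\) has independent columns), the matrix \(C = (A^{T}A)^{-1}A^{T}\) satisfies \(CA = I_{n}\), and for a consistent system the solution is indeed \(C\mathbf{b}\):

```python
import numpy as np

A = np.array([[1., 0.], [0., 1.], [1., 1.]])   # 3 x 2, independent columns
C = np.linalg.inv(A.T @ A) @ A.T               # then CA = I_2
print(np.allclose(C @ A, np.eye(2)))           # True

b = A @ np.array([2., -1.])    # a consistent right-hand side, by construction
x = C @ b                      # the only possible solution
print(np.allclose(A @ x, b))   # True: Cb really does solve Ax = b
```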

    The ideas in Example [exa:003694] lead to important information about matrices; this will be pursued in the next section.

    Block Multiplication

    Block Partition of a Matrix003711 It is often useful to consider matrices whose entries are themselves matrices (called blocks). A matrix viewed in this way is said to be partitioned into blocks.

    For example, writing a matrix \(B\) in the form

    \[B = \left[ \begin{array}{cccc} \mathbf{b}_{1} & \mathbf{b}_{2} & \cdots & \mathbf{b}_{k} \end{array} \right] \mbox{ where the } \mathbf{b}_{j} \mbox{ are the columns of } B \nonumber \]

    is such a block partition of \(B\). Here is another example.

    Consider the matrices

    \[A = \left[ \begin{array}{rr|rrr} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ \hline 2 & -1 & 4 & 2 & 1 \\ 3 & 1 & -1 & 7 & 5 \end{array} \right] = \left[ \begin{array}{cc} I_{2} & 0_{23} \\ P & Q \end{array} \right] \quad \mbox{ and } \quad B = \left[ \begin{array}{rr} 4 & -2 \\ 5 & 6 \\ \hline 7 & 3 \\ -1 & 0 \\ 1 & 6 \end{array} \right] = \left[ \begin{array}{c} X \\ Y \end{array} \right] \nonumber \]

    where the blocks have been labelled as indicated. This is a natural way to partition \(A\) into blocks in view of the blocks \(I_{2}\) and \(0_{23}\) that occur. This notation is particularly useful when we are multiplying the matrices \(A\) and \(B\) because the product \(AB\) can be computed in block form as follows:

    \[AB = \left[ \begin{array}{cc} I & 0 \\ P & Q \end{array} \right] \left[ \begin{array}{c} X \\ Y \end{array} \right] = \left[ \begin{array}{c} IX + 0Y \\ PX + QY \end{array} \right] = \left[ \begin{array}{c} X \\ PX + QY \end{array} \right] = \left[ \begin{array}{rr} 4 & -2 \\ 5 & 6 \\ \hline 30 & 8 \\ 8 & 27 \end{array} \right] \nonumber \]

    This is easily checked to be the product \(AB\), computed in the conventional manner.
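
    The block formula is easy to replicate with NumPy's np.block; the sketch below (ours) assembles \(A\) and \(B\) from the blocks above and checks that \(AB\) equals \(\left[ \begin{array}{c} X \\ PX + QY \end{array} \right]\).

```python
import numpy as np

P = np.array([[2, -1], [3, 1]])
Q = np.array([[4, 2, 1], [-1, 7, 5]])
X = np.array([[4, -2], [5, 6]])
Y = np.array([[7, 3], [-1, 0], [1, 6]])

A = np.block([[np.eye(2, dtype=int), np.zeros((2, 3), dtype=int)],
              [P,                    Q]])
B = np.block([[X], [Y]])

blockwise = np.block([[X], [P @ X + Q @ Y]])   # [X; PX + QY]
print(np.array_equal(A @ B, blockwise))        # True
```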

    In other words, we can compute the product \(AB\) by ordinary matrix multiplication, using blocks as entries. The only requirement is that the blocks be compatible. That is, the sizes of the blocks must be such that all (matrix) products of blocks that occur make sense. This means that the number of columns in each block of \(A\) must equal the number of rows in the corresponding block of \(B\).

    Block Multiplication003723 If matrices \(A\) and \(B\) are partitioned compatibly into blocks, the product \(AB\) can be computed by matrix multiplication using blocks as entries.

    We omit the proof.

    We have been using two cases of block multiplication. If \(B = \left[ \begin{array}{cccc} \mathbf{b}_{1} & \mathbf{b}_{2} & \cdots & \mathbf{b}_{k} \end{array} \right]\) is a matrix where the \(\mathbf{b}_{j}\) are the columns of \(B\), and if the matrix product \(AB\) is defined, then we have

    \[AB = A \left[ \begin{array}{rrrr} \mathbf{b}_{1} & \mathbf{b}_{2} & \cdots & \mathbf{b}_{k} \end{array} \right] = \left[ \begin{array}{rrrr} A\mathbf{b}_{1} & A\mathbf{b}_{2} & \cdots & A\mathbf{b}_{k} \end{array} \right] \nonumber \]

    This is Definition [def:003447] and is a block multiplication where \(A = \left[ A \right]\) has only one block. As another illustration,

    \[B\mathbf{x} = \left[ \begin{array}{rrrr} \mathbf{b}_{1} & \mathbf{b}_{2} & \cdots & \mathbf{b}_{k} \end{array} \right] \left[ \begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{k} \end{array} \right] = x_{1}\mathbf{b}_{1} + x_{2}\mathbf{b}_{2} + \cdots + x_{k}\mathbf{b}_{k} \nonumber \]

    where \(\mathbf{x}\) is any \(k \times 1\) column matrix (this is Definition [def:002668]).

    It is not our intention to pursue block multiplication in detail here. However, we give one more example because it will be used below.

    003738 Suppose matrices \(A = \left[ \begin{array}{cc} B & X \\ 0 & C \end{array} \right]\) and \(A_{1} = \left[ \begin{array}{cc} B_{1} & X_{1} \\ 0 & C_{1} \end{array} \right]\) are partitioned as shown where \(B\) and \(B_{1}\) are square matrices of the same size, and \(C\) and \(C_{1}\) are also square of the same size. These are compatible partitionings and block multiplication gives

    \[AA_{1} = \left[ \begin{array}{cc} B & X \\ 0 & C \end{array} \right] \left[ \begin{array}{cc} B_{1} & X_{1} \\ 0 & C_{1} \end{array} \right] = \left[ \begin{array}{cc} BB_{1} & BX_{1} + XC_{1} \\ 0 & CC_{1} \end{array} \right] \nonumber \]

    003746 Obtain a formula for \(A^{k}\) where \(A = \left[ \begin{array}{cc} I & X \\ 0 & 0 \end{array} \right]\) is square and \(I\) is an identity matrix.

    We have \(A^{2} = \left[ \begin{array}{cc} I & X \\ 0 & 0 \end{array} \right] \left[ \begin{array}{cc} I & X \\ 0 & 0 \end{array} \right] = \left[ \begin{array}{cc} I^{2} & IX + X0 \\ 0 & 0^{2} \end{array} \right] = \left[ \begin{array}{cc} I & X \\ 0 & 0 \end{array} \right] = A\). Hence \(A^{3} = AA^{2} = AA = A^{2} = A\). Continuing in this way, we see that \(A^{k} = A\) for every \(k \geq 1\).
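
    A numerical check (ours, with a sample block \(X\)) confirms that the powers stabilize immediately:

```python
import numpy as np

X = np.array([[1., 2.], [3., 4.]])
A = np.block([[np.eye(2),         X],
              [np.zeros((2, 2)),  np.zeros((2, 2))]])

# A^k = A for every k >= 1, as derived above.
print(np.allclose(np.linalg.matrix_power(A, 5), A))   # True
```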

    Block multiplication has theoretical uses as we shall see. However, it is also useful in computing products of matrices in a computer with limited memory capacity. The matrices are partitioned into blocks in such a way that each product of blocks can be handled. Then the blocks are stored in auxiliary memory and their products are computed one by one.

    Directed Graphs

    The study of directed graphs illustrates how matrix multiplication arises in ways other than the study of linear equations or matrix transformations.

    A directed graph consists of a set of points (called vertices) connected by arrows (called edges). For example, the vertices could represent cities and the edges available flights. If the graph has \(n\) vertices \(v_{1}, v_{2}, \dots, v_{n}\), the adjacency matrix \(A = \left[ a_{ij} \right]\) is the \(n \times n\) matrix whose \((i, j)\)-entry \(a_{ij}\) is \(1\) if there is an edge from \(v_{j}\) to \(v_{i}\) (note the order), and zero otherwise. For example, the adjacency matrix of the directed graph with edges \(v_{1} \to v_{1}\), \(v_{1} \to v_{2}\), \(v_{1} \to v_{3}\), \(v_{2} \to v_{1}\), and \(v_{3} \to v_{2}\) is \(A = \left[ \begin{array}{rrr} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 0 \end{array} \right]\).

    A path of length \(r\) (or an \(r\)-path) from vertex \(j\) to vertex \(i\) is a sequence of \(r\) edges leading from \(v_{j}\) to \(v_{i}\). Thus \(v_{1} \to v_{2} \to v_{1} \to v_{1} \to v_{3}\) is a \(4\)-path from \(v_{1}\) to \(v_{3}\) in the given graph. The edges are just the paths of length \(1\), so the \((i, j)\)-entry \(a_{ij}\) of the adjacency matrix \(A\) is the number of \(1\)-paths from \(v_{j}\) to \(v_{i}\). This observation has an important extension:

    003785 If \(A\) is the adjacency matrix of a directed graph with \(n\) vertices, then the \((i, j)\)-entry of \(A^r\) is the number of \(r\)-paths \(v_{j} \to v_{i}\).

    As an illustration, consider the adjacency matrix \(A\) of the graph just described. Then

    \[A = \left[ \begin{array}{rrr} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 0 \end{array} \right], \quad A^{2} = \left[ \begin{array}{rrr} 2 & 1 & 1 \\ 2 & 1 & 0 \\ 1 & 1 & 0 \end{array} \right], \quad \mbox{ and } \quad A^{3} = \left[ \begin{array}{rrr} 4 & 2 & 1 \\ 3 & 2 & 1 \\ 2 & 1 & 1 \end{array} \right] \nonumber \]

    Hence, since the \((2, 1)\)-entry of \(A^{2}\) is \(2\), there are two \(2\)-paths \(v_{1} \to v_{2}\) (in fact they are \(v_{1} \to v_{1} \to v_{2}\) and \(v_{1} \to v_{3} \to v_{2}\)). Similarly, the \((2, 3)\)-entry of \(A^{2}\) is zero, so there are no \(2\)-paths \(v_{3} \to v_{2}\), as the reader can verify. The fact that no entry of \(A^{3}\) is zero shows that it is possible to go from any vertex to any other vertex in exactly three steps.
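
    The path counts can be verified directly; the sketch below (ours) computes the powers of \(A\) and reads off the entries discussed above (note that NumPy indices start at 0, so the \((2, 1)\)-entry is A2[1, 0]).

```python
import numpy as np

A = np.array([[1, 1, 0],
              [1, 0, 1],
              [1, 0, 0]])

A2 = np.linalg.matrix_power(A, 2)
A3 = np.linalg.matrix_power(A, 3)

print(A2[1, 0])         # 2: two 2-paths v1 -> v2
print(A2[1, 2])         # 0: no 2-paths v3 -> v2
print((A3 > 0).all())   # True: any vertex reaches any other in exactly 3 steps
```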

    To see why Theorem [thm:003785] is true, observe that it asserts that

    \[\label{eq:assert} \mbox{the } (i, j)\mbox{-entry of } A^{r} \mbox{ equals the number of } r\mbox{-paths } v_{j} \rightarrow v_{i} \]

    holds for each \(r \geq 1\). We proceed by induction on \(r\) (see Appendix [chap:appcinduction]). The case \(r = 1\) is the definition of the adjacency matrix. So assume inductively that ([eq:assert]) is true for some \(r \geq 1\); we must prove that ([eq:assert]) also holds for \(r + 1\). But every \((r + 1)\)-path \(v_{j} \to v_{i}\) is the result of an \(r\)-path \(v_{j} \to v_{k}\) for some \(k\), followed by a \(1\)-path \(v_{k} \to v_{i}\). Writing \(A = \left[ a_{ij} \right]\) and \(A^{r} = \left[ b_{ij} \right]\), there are \(b_{kj}\) paths of the former type (by induction) and \(a_{ik}\) of the latter type, and so there are \(a_{ik}b_{kj}\) such paths in all. Summing over \(k\), this shows that there are

    \[a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} \quad (r + 1)\mbox{-paths } v_{j} \rightarrow v_{i} \nonumber \]

    But this sum is the dot product of the \(i\)th row \(\left[ \begin{array}{cccc} a_{i1} & a_{i2} & \cdots & a_{in} \end{array} \right]\) of \(A\) with the \(j\)th column \(\left[ \begin{array}{cccc} b_{1j} & b_{2j} & \cdots & b_{nj} \end{array} \right]^{T}\) of \(A^{r}\). As such, it is the \((i, j)\)-entry of the matrix product \(A^{r}A = A^{r+1}\). This shows that ([eq:assert]) holds for \(r + 1\), as required.


    1. When reading the notation \(S \circ T\), we read \(S\) first and then \(T\) even though the action is “first \(T\) then \(S\) ”. This annoying state of affairs results because we write \(T(\mathbf{x})\) for the effect of the transformation \(T\) on \(\mathbf{x}\), with \(T\) on the left. If we wrote this instead as (\(\mathbf{x}\))\(T\), the confusion would not occur. However the notation \(T(\mathbf{x})\) is well established.↩

    This page titled 2.4: Matrix Multiplication is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by W. Keith Nicholson (Lyryx Learning Inc.) via source content that was edited to the style and standards of the LibreTexts platform.