2.3: Elementary Matrices

    Learning Objectives
    • Use multiplication by an elementary matrix to apply row operations.
    • Write a matrix as a product of elementary matrices.

We now turn our attention to a special type of matrix called an elementary matrix. An elementary matrix is always a square matrix. Recall the row operations given in Definition 1.2.2. Any elementary matrix, which we often denote by \(E\), is obtained by applying one row operation to the identity matrix of the same size.

For example, the matrix \[E = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \end{array} \right]\nonumber \] is the elementary matrix obtained by switching the two rows of the \(2 \times 2\) identity matrix. The matrix \[E = \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{array} \right]\nonumber \] is the elementary matrix obtained by multiplying the second row of the \(3 \times 3\) identity matrix by \(3\). The matrix \[E = \left[ \begin{array}{rr} 1 & 0 \\ -3 & 1 \end{array} \right]\nonumber \] is the elementary matrix obtained by adding \(-3\) times the first row of the \(2 \times 2\) identity matrix to the second row.

You may construct an elementary matrix from any row operation, but remember that exactly one operation is applied to the identity matrix.
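The construction is easy to experiment with numerically. The following is a minimal sketch, assuming the NumPy library is available (it is not part of this text); it builds one elementary matrix of each type by applying a single row operation to an identity matrix.

```python
import numpy as np

# Build each type of elementary matrix by applying exactly one
# row operation to an identity matrix.
I2 = np.eye(2)
I3 = np.eye(3)

E_swap = I2[[1, 0], :]    # switch rows 1 and 2 of I_2
E_scale = I3.copy()
E_scale[1, 1] = 3         # multiply row 2 of I_3 by 3
E_add = I2.copy()
E_add[1, 0] = -3          # add -3 times row 1 of I_2 to row 2

print(E_swap)   # [[0. 1.] [1. 0.]]
print(E_scale)  # diagonal 1, 3, 1
print(E_add)    # [[ 1. 0.] [-3. 1.]]
```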

    Consider the following definition.

    Definition \(\PageIndex{1}\): Elementary Matrices and Row Operations

    Let \(E\) be an \(n \times n\) matrix. Then \(E\) is an elementary matrix if it is the result of applying one row operation to the \(n \times n\) identity matrix \(I_n\).

    Those which involve switching rows of the identity matrix are called permutation matrices.

Therefore, the matrix \(E\) constructed above by switching the two rows of \(I_2\) is a permutation matrix.

Elementary matrices can be used in place of row operations and are therefore very useful. It turns out that multiplying a matrix on the left by an elementary matrix \(E\) has the same effect as performing the row operation used to obtain \(E\).

    The following theorem is an important result which we will use throughout this text.

    Theorem \(\PageIndex{1}\): Multiplication by an Elementary Matrix and Row Operations

    To perform any of the three row operations on a matrix \(A\) it suffices to take the product \(EA\), where \(E\) is the elementary matrix obtained by using the desired row operation on the identity matrix.

Therefore, instead of performing row operations on a matrix \(A\), we can row reduce through matrix multiplication with the appropriate elementary matrix. We will examine this theorem in detail for each of the three row operations given in Definition 1.2.2.

    First, consider the following lemma.

    Lemma \(\PageIndex{1}\): Action of Permutation Matrix

    Let \(P^{ij}\) denote the elementary matrix which involves switching the \(i^{th}\) and the \(j^{th}\) rows. Then \(P^{ij}\) is a permutation matrix and \[P^{ij}A=B\nonumber \] where \(B\) is obtained from \(A\) by switching the \(i^{th}\) and the \(j^{th}\) rows.

    We will explore this idea more in the following example.

    Example \(\PageIndex{1}\): Switching Rows with an Elementary Matrix

    Let \[P^{12} = \left[ \begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array} \right], A = \left[ \begin{array}{cc} a & b \\ g & d \\ e & f \end{array} \right] \nonumber \]

    Find \(B\) where \(B = P^{12}A\).

    Solution

    You can see that the matrix \(P^{12}\) is obtained by switching the first and second rows of the \(3 \times 3\) identity matrix \(I\).

    Using our usual procedure, compute the product \(P^{12}A = B\). The result is given by

    \[B =\left[ \begin{array}{cc} g & d \\ a & b \\ e & f \end{array} \right] \nonumber\]

    Notice that \(B\) is the matrix obtained by switching rows \(1\) and \(2\) of \(A\). Therefore by multiplying \(A\) by \(P^{12}\), the row operation which was applied to \(I\) to obtain \(P^{12}\) is applied to \(A\) to obtain \(B\).
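This computation can be reproduced symbolically; the following sketch assumes the SymPy library, which is not part of this text.

```python
from sympy import Matrix, symbols

a, b, g, d, e, f = symbols('a b g d e f')

P12 = Matrix([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 1]])
A = Matrix([[a, b],
            [g, d],
            [e, f]])

# The product swaps rows 1 and 2 of A, exactly as in Example 1.
print(P12 * A)  # Matrix([[g, d], [a, b], [e, f]])
```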

    Theorem \(\PageIndex{1}\) applies to all three row operations, and we now look at the row operation of multiplying a row by a scalar. Consider the following lemma.

    Lemma \(\PageIndex{2}\): Multiplication by a Scalar and Elementary Matrices

Let \(E\left( k,i\right)\) denote the elementary matrix corresponding to the row operation in which the \(i^{th}\) row is multiplied by the nonzero scalar \(k\). Then

    \[E\left( k,i\right) A=B \nonumber\]

    where \(B\) is obtained from \(A\) by multiplying the \(i^{th}\) row of \(A\) by \(k\).

    We will explore this lemma further in the following example.

    Example \(\PageIndex{2}\): Multiplication of a Row by 5 Using Elementary Matrix

    Let

    \[E \left(5, 2\right) = \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 1 \end{array} \right], A = \left[ \begin{array}{cc} a & b \\ c & d \\ e & f \end{array} \right] \nonumber\]

Find the matrix \(B\) where \(B = E \left(5, 2\right)A\).

    Solution

    You can see that \(E \left(5, 2\right)\) is obtained by multiplying the second row of the identity matrix by \(5\).

    Using our usual procedure for multiplication of matrices, we can compute the product \(E \left(5, 2\right)A\). The resulting matrix is given by

    \[B =\left[ \begin{array}{cc} a & b \\ 5c & 5d \\ e & f \end{array} \right] \nonumber\]

    Notice that \(B\) is obtained by multiplying the second row of \(A\) by the scalar \(5\).
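A short SymPy sketch (again, SymPy is an assumption and not part of this text) confirms the computation.

```python
from sympy import Matrix, symbols

a, b, c, d, e, f = symbols('a b c d e f')

E52 = Matrix([[1, 0, 0],
              [0, 5, 0],
              [0, 0, 1]])
A = Matrix([[a, b],
            [c, d],
            [e, f]])

# Multiplying on the left by E(5, 2) scales row 2 of A by 5.
print(E52 * A)  # Matrix([[a, b], [5*c, 5*d], [e, f]])
```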

    There is one last row operation to consider. The following lemma discusses the final operation of adding a multiple of a row to another row.

    Lemma \(\PageIndex{3}\): Adding Multiples of Rows and Elementary Matrices

    Let \(E\left( k \times i+j\right)\) denote the elementary matrix obtained from \(I\) by adding \(k\) times the \(i^{th}\) row to the \(j^{th}\). Then

    \[E\left( k \times i+j\right) A=B\nonumber \]

    where \(B\) is obtained from \(A\) by adding \(k\) times the \(i^{th}\) row to the \(j^{th}\) row of \(A.\)

    Consider the following example.

    Example \(\PageIndex{3}\): Adding Two Times the First Row to the Last

    Let

    \[E\left( 2 \times 1+3\right) = \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{array} \right], A = \left[ \begin{array}{cc} a & b \\ c & d \\ e & f \end{array} \right] \nonumber\]

    Find \(B\) where \(B = E\left( 2 \times 1+3\right)A\).

    Solution

    You can see that the matrix \(E\left( 2 \times 1+3\right)\) was obtained by adding \(2\) times the first row of \(I\) to the third row of \(I\).

    Using our usual procedure, we can compute the product \(E\left( 2 \times 1+3\right)A\). The resulting matrix \(B\) is given by \[B = \left[ \begin{array}{cc} a & b \\ c & d \\ 2a+e & 2b+f \end{array} \right] \nonumber\]

    You can see that \(B\) is the matrix obtained by adding \(2\) times the first row of \(A\) to the third row.
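As with the previous examples, a brief SymPy sketch (SymPy being an assumption, as above) confirms the result.

```python
from sympy import Matrix, symbols

a, b, c, d, e, f = symbols('a b c d e f')

E213 = Matrix([[1, 0, 0],
               [0, 1, 0],
               [2, 0, 1]])
A = Matrix([[a, b],
            [c, d],
            [e, f]])

# Multiplying on the left by E(2 x 1 + 3) adds 2 times row 1 to row 3.
print(E213 * A)  # Matrix([[a, b], [c, d], [2*a + e, 2*b + f]])
```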

    Suppose we have applied a row operation to a matrix \(A\). Consider the row operation required to return \(A\) to its original form, to undo the row operation. It turns out that this action is how we find the inverse of an elementary matrix \(E\).

    Consider the following theorem.

    Theorem \(\PageIndex{2}\): Elementary Matrices and Inverses

    Every elementary matrix is invertible and its inverse is also an elementary matrix.

    In fact, the inverse of an elementary matrix is constructed by doing the reverse row operation on \(I\). \(E^{-1}\) will be obtained by performing the row operation which would carry \(E\) back to \(I\).

    • If \(E\) is obtained by switching rows \(i\) and \(j\), then \(E^{-1}\) is also obtained by switching rows \(i\) and \(j\).
    • If \(E\) is obtained by multiplying row \(i\) by the scalar \(k\), then \(E^{-1}\) is obtained by multiplying row \(i\) by the scalar \(\frac{1}{k}\).
    • If \(E\) is obtained by adding \(k\) times row \(i\) to row \(j\), then \(E^{-1}\) is obtained by subtracting \(k\) times row \(i\) from row \(j\).
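These three rules are easy to verify numerically. The following sketch, again assuming NumPy, pairs each elementary matrix with the elementary matrix of the reverse row operation and checks that each product is an identity matrix.

```python
import numpy as np

E_swap  = np.array([[0., 1.], [1., 0.]])    # switch rows 1 and 2
E_scale = np.diag([1., 3., 1.])             # multiply row 2 by 3
E_add   = np.array([[1., 0.], [-3., 1.]])   # add -3 times row 1 to row 2

inv_swap  = E_swap                          # switching again undoes the switch
inv_scale = np.diag([1., 1/3, 1.])          # multiply row 2 by 1/3
inv_add   = np.array([[1., 0.], [3., 1.]])  # add +3 times row 1 to row 2

for E, E_inv in [(E_swap, inv_swap), (E_scale, inv_scale), (E_add, inv_add)]:
    assert np.allclose(E @ E_inv, np.eye(E.shape[0]))
```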

    Consider the following example.

    Example \(\PageIndex{4}\): Inverse of an Elementary Matrix

    Let \[E = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 2 \end{array} \right]\nonumber \]

    Find \(E^{-1}\).

    Solution

    Consider the elementary matrix \(E\) given by

    \[E = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 2 \end{array} \right]\nonumber \]

    Here, \(E\) is obtained from the \(2 \times 2\) identity matrix by multiplying the second row by \(2\). In order to carry \(E\) back to the identity, we need to multiply the second row of \(E\) by \(\frac{1}{2}\). Hence,

    \(E^{-1}\) is given by \[E^{-1} = \left[ \begin{array}{rr} 1 & 0 \\ 0 & \frac{1}{2} \end{array} \right] \nonumber\]

    We can verify that \(EE^{-1}=I\). Take the product \(EE^{-1}\), given by

    \[EE^{-1} = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 2 \end{array} \right] \left[ \begin{array}{rr} 1 & 0 \\ 0 & \frac{1}{2} \end{array} \right] = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \end{array} \right] \nonumber\]

This equals \(I\), so we know that we have computed \(E^{-1}\) properly.

    Suppose an \(m \times n\) matrix \(A\) is row reduced to its reduced row-echelon form. By tracking each row operation completed, this row reduction can be completed through multiplication by elementary matrices.

    Consider the following definition.

    Definition \(\PageIndex{2}\): The Form \(B=UA\)

    Let \(A\) be an \(m \times n\) matrix and let \(B\) be the reduced row-echelon form of \(A\). Then we can write \(B = UA\) where \(U\) is the product of all elementary matrices representing the row operations done to \(A\) to obtain \(B\).

    Consider the following example.

    Example \(\PageIndex{5}\): The Form \(B=UA\)

Let \(A = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \\ 2 & 0 \end{array} \right]\). Find \(B\), the reduced row-echelon form of \(A\), and write it in the form \(B=UA\).

    Solution

    To find \(B\), row reduce \(A\). For each step, we will record the appropriate elementary matrix. First, switch rows \(1\) and \(2\).

    \[\left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \\ 2 & 0 \end{array} \right] \rightarrow \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 2 & 0 \end{array} \right]\nonumber \]

Performing this row operation is equivalent to multiplying \(A\) on the left by \(P^{12} =\left[ \begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array} \right]\); the matrix above is the product \(P^{12}A\).

    Next, add \((-2)\) times row \(1\) to row \(3\).

    \[\left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 2 & 0 \end{array} \right] \rightarrow \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{array} \right]\nonumber \]

    This is equivalent to multiplying by the matrix \(E(-2 \times 1 + 3) = \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{array} \right]\). Notice that the resulting matrix is \(B\), the required reduced row-echelon form of \(A\).

    We can then write

\[\begin{aligned} B &= E(-2 \times 1 + 3) \left( P^{12} A \right) \\ &= \left( E(-2 \times 1 + 3) P^{12} \right) A \\ &= U A\end{aligned}\]

    It remains to find the matrix \(U\).

\[\begin{aligned} U &= E(-2 \times 1 + 3) P^{12} \\ &= \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{array} \right] \left[ \begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array} \right] \\ &= \left[ \begin{array}{rrr} 0 & 1 & 0\\ 1 & 0 & 0 \\ 0 & -2 & 1 \end{array} \right]\end{aligned}\]

    We can verify that \(B = UA\) holds for this matrix \(U\): \[\begin{aligned} UA &= \left[ \begin{array}{rrr} 0 & 1 & 0\\ 1 & 0 & 0 \\ 0 & -2 & 1 \end{array} \right] \left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \\ 2 & 0 \end{array} \right] \\ &= \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{array} \right] \\ &= B \end{aligned}\]
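The bookkeeping in this example can be mirrored in a few lines of NumPy (an assumption, as above): multiply the recorded elementary matrices together, later operations on the left, and check that \(UA=B\).

```python
import numpy as np

A   = np.array([[0., 1.], [1., 0.], [2., 0.]])
P12 = np.array([[0., 1., 0.], [1., 0., 0.], [0., 0., 1.]])   # switch rows 1, 2
E   = np.array([[1., 0., 0.], [0., 1., 0.], [-2., 0., 1.]])  # E(-2 x 1 + 3)

U = E @ P12   # the later operation multiplies on the left
B = U @ A

print(U)  # [[0, 1, 0], [1, 0, 0], [0, -2, 1]]
print(B)  # [[1, 0], [0, 1], [0, 0]], the reduced row-echelon form of A
```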

    While the process used in the above example is reliable and simple when only a few row operations are used, it becomes cumbersome in a case where many row operations are needed to carry \(A\) to \(B\). The following theorem provides an alternate way to find the matrix \(U\).

    Theorem \(\PageIndex{3}\): Finding the Matrix \(U\)

    Let \(A\) be an \(m \times n\) matrix and let \(B\) be its reduced row-echelon form. Then \(B = UA\) where \(U\) is an invertible \(m \times m\) matrix found by forming the matrix \(\left[ A | I_m \right]\) and row reducing to \(\left[ B | U \right]\).

    Let’s revisit the above example using the process outlined in Theorem \(\PageIndex{3}\).

    Example \(\PageIndex{6}\): The Form \(B=UA\), Revisited

    Let \(A = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \\ 2 & 0 \end{array}\right]\). Using the process outlined in Theorem \(\PageIndex{3}\), find \(U\) such that \(B=UA\).

    Solution

    First, set up the matrix \(\left[ A | I_m \right]\). \[\left[ \begin{array}{rr|rrr} 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 2 & 0 & 0 & 0 & 1 \end{array}\right]\nonumber \] Now, row reduce this matrix until the left side equals the reduced row-echelon form of \(A\).

    \[\begin{aligned} \left[ \begin{array}{rr|rrr} 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 2 & 0 & 0 & 0 & 1 \end{array}\right] &\rightarrow \left[ \begin{array}{rr|rrr} 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 2 & 0 & 0 & 0 & 1 \end{array}\right] \\ &\rightarrow \left[ \begin{array}{rr|rrr} 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & -2 & 1 \end{array}\right]\end{aligned}\]

    The left side of this matrix is \(B\), and the right side is \(U\). Comparing this to the matrix \(U\) found above in Example \(\PageIndex{5}\), you can see that the same matrix is obtained regardless of which process is used.
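The procedure of Theorem \(\PageIndex{3}\) can also be sketched with SymPy (an assumption, as above), whose Matrix class provides an elementary_row_op method; we apply the same two row operations to the augmented matrix and then read off \(B\) and \(U\).

```python
from sympy import Matrix, eye

A = Matrix([[0, 1], [1, 0], [2, 0]])
M = A.row_join(eye(3))   # the augmented matrix [A | I_3]

# Switch rows 1 and 2 (indices 0 and 1), then add -2 times row 1 to row 3.
M = M.elementary_row_op('n<->m', row1=0, row2=1)
M = M.elementary_row_op('n->n+km', row=2, k=-2, row2=0)

B, U = M[:, :2], M[:, 2:]
print(B)            # the reduced row-echelon form of A
print(U)            # Matrix([[0, 1, 0], [1, 0, 0], [0, -2, 1]])
assert U * A == B
```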

    Recall from Algorithm 2.2.1 that an \(n \times n\) matrix \(A\) is invertible if and only if \(A\) can be carried to the \(n \times n\) identity matrix using the usual row operations. This leads to an important consequence related to the above discussion.

    Suppose \(A\) is an \(n \times n\) invertible matrix. Then, set up the matrix \(\left[ A | I_n \right]\) as done above, and row reduce until it is of the form \(\left[ B | U \right]\). In this case, \(B = I_n\) because \(A\) is invertible.

    \[\begin{aligned} B &= UA \\ I_n &=UA \\ U^{-1} &= A \end{aligned}\]

    Now suppose that \(U = E_1 E_2 \cdots E_k\) where each \(E_i\) is an elementary matrix representing a row operation used to carry \(A\) to \(I\). Then,

    \[U^{-1} = \left( E_1 E_2 \cdots E_k \right) ^{-1} = E_k^{-1} \cdots E_2^{-1} E_1^{-1}\nonumber \]

    Remember that if \(E_i\) is an elementary matrix, so too is \(E_i^{-1}\). It follows that

    \[\begin{aligned} A&= U^{-1} \\ &= E_k^{-1} \cdots E_2^{-1} E_1^{-1}\end{aligned}\]

    and \(A\) can be written as a product of elementary matrices.

    Theorem \(\PageIndex{4}\): Product of Elementary Matrices

    Let \(A\) be an \(n \times n\) matrix. Then \(A\) is invertible if and only if it can be written as a product of elementary matrices.

    Consider the following example.

    Example \(\PageIndex{7}\): Product of Elementary Matrices

    Let \(A = \left[ \begin{array}{rrr} 0 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & -2 & 1 \end{array} \right]\). Write \(A\) as a product of elementary matrices.

    Solution

    We will use the process outlined in Theorem \(\PageIndex{3}\) to write \(A\) as a product of elementary matrices. We will set up the matrix \(\left[ A | I \right]\) and row reduce, recording each row operation as an elementary matrix.

    First:

    \[\left[ \begin{array}{rrr|rrr} 0 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{rrr|rrr} 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \nonumber\]

    represented by the elementary matrix \(E_1= \left[ \begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array} \right]\).

    Secondly:

    \[\left[ \begin{array}{rrr|rrr} 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{rrr|rrr} 1 & 0 & 0 & -1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \nonumber\]

    represented by the elementary matrix \(E_2 = \left[ \begin{array}{rrr} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right]\).

    Finally:

    \[\left[ \begin{array}{rrr|rrr} 1 & 0& 0 & -1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{rrr|rrr} 1 & 0 & 0 &-1 & 1 & 0\\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 2 & 0 & 1 \end{array} \right] \nonumber\]

    represented by the elementary matrix \(E_3= \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{array} \right]\).

    Notice that the reduced row-echelon form of \(A\) is \(I\). Hence \(I = UA\) where \(U\) is the product of the above elementary matrices. It follows that \(A = U^{-1}\). Since we want to write \(A\) as a product of elementary matrices, we wish to express \(U^{-1}\) as a product of elementary matrices.

\[\begin{aligned} U^{-1} &= \left( E_3 E_2 E_1 \right)^{-1}\\[6pt] &= E_1^{-1} E_2^{-1} E_3^{-1} \\[6pt] &= \left[ \begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array} \right] \left[ \begin{array}{rrr} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right] \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{array} \right] \\ &= A\end{aligned}\]

    This gives \(A\) written as a product of elementary matrices. By Theorem \(\PageIndex{4}\) it follows that \(A\) is invertible.
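As a check, the product of these three elementary matrices can be multiplied out with SymPy (an assumption, as above) to confirm that it reconstructs \(A\).

```python
from sympy import Matrix

E1_inv = Matrix([[0, 1, 0], [1, 0, 0], [0, 0, 1]])   # switch rows 1 and 2
E2_inv = Matrix([[1, 1, 0], [0, 1, 0], [0, 0, 1]])   # add row 2 to row 1
E3_inv = Matrix([[1, 0, 0], [0, 1, 0], [0, -2, 1]])  # add -2 times row 2 to row 3

A = Matrix([[0, 1, 0], [1, 1, 0], [0, -2, 1]])
assert E1_inv * E2_inv * E3_inv == A
```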

    More on Matrix Inverses

    In this section, we will prove three theorems which will clarify the concept of matrix inverses. In order to do this, first recall some important properties of elementary matrices.

    Recall that an elementary matrix is a square matrix obtained by performing an elementary operation on an identity matrix. Each elementary matrix is invertible, and its inverse is also an elementary matrix. If \(E\) is an \(m \times m\) elementary matrix and \(A\) is an \(m \times n\) matrix, then the product \(EA\) is the result of applying to \(A\) the same elementary row operation that was applied to the \(m \times m\) identity matrix in order to obtain \(E\).

    Let \(R\) be the reduced row-echelon form of an \(m \times n\) matrix \(A\). \(R\) is obtained by iteratively applying a sequence of elementary row operations to \(A\). Denote by \(E_1, E_2, \cdots, E_k\) the elementary matrices associated with the elementary row operations which were applied, in order, to the matrix \(A\) to obtain the resulting \(R\). We then have that \(R = \left( E_k \cdots \left( E_2 \left( E_1A \right) \right)\right) = E_k \cdots E_2E_1A\). Let \(E\) denote the product matrix \(E_k \cdots E_2E_1\) so that we can write \(R=EA\) where \(E\) is an invertible matrix whose inverse is the product \((E_1)^{-1}(E_2)^{-1} \cdots (E_k)^{-1}\).

    Now, we will consider some preliminary lemmas.

    Lemma \(\PageIndex{4}\): Invertible Matrix and Zeros

    Suppose that \(A\) and \(B\) are matrices such that the product \(AB\) is an identity matrix. Then the reduced row-echelon form of \(A\) does not have a row of zeros.

    Proof

Let \(R\) be the reduced row-echelon form of \(A\). Then \(R=EA\) for some invertible square matrix \(E\) as described above. By hypothesis, \(AB=I\) where \(I\) is an identity matrix, so we have a chain of equalities \[R(BE^{-1}) = (EA)(BE^{-1}) = E(AB)E^{-1} = EIE^{-1} = EE^{-1} = I\nonumber \] If \(R\) had a row of zeros, then so would the product \(R(BE^{-1})\). But since the identity matrix \(I\) does not have a row of zeros, neither can \(R\).

    We now consider a second important lemma.

    Lemma \(\PageIndex{5}\): Size of Invertible Matrix

    Suppose that \(A\) and \(B\) are matrices such that the product \(AB\) is an identity matrix. Then \(A\) has at least as many columns as it has rows.

    Proof

Let \(R\) be the reduced row-echelon form of \(A\). By Lemma \(\PageIndex{4}\), we know that \(R\) does not have a row of zeros, and therefore each row of \(R\) has a leading \(1\). Since each column of \(R\) contains at most one of these leading \(1\)s, \(R\) must have at least as many columns as it has rows.

    An important theorem follows from this lemma.

    Theorem \(\PageIndex{5}\): Invertible Matrices are Square

    Only square matrices can be invertible.

    Proof

    Suppose that \(A\) and \(B\) are matrices such that both products \(AB\) and \(BA\) are identity matrices. We will show that \(A\) and \(B\) must be square matrices of the same size. Let the matrix \(A\) have \(m\) rows and \(n\) columns, so that \(A\) is an \(m \times n\) matrix. Since the product \(AB\) exists, \(B\) must have \(n\) rows, and since the product \(BA\) exists, \(B\) must have \(m\) columns so that \(B\) is an \(n \times m\) matrix. To finish the proof, we need only verify that \(m=n\).

We first apply Lemma \(\PageIndex{5}\) with \(A\) and \(B\) to obtain the inequality \(m \leq n\). We then apply Lemma \(\PageIndex{5}\) again (switching the order of the matrices) to obtain the inequality \(n \leq m\). It follows that \(m=n\), as we wanted.

    Of course, not all square matrices are invertible. In particular, zero matrices are not invertible, along with many other square matrices.

    The following proposition will be useful in proving the next theorem.

    Proposition \(\PageIndex{1}\): Reduced Row-Echelon Form of a Square Matrix

    If \(R\) is the reduced row-echelon form of a square matrix, then either \(R\) has a row of zeros or \(R\) is an identity matrix.

    The proof of this proposition is left as an exercise to the reader. We now consider the second important theorem of this section.

    Theorem \(\PageIndex{6}\): Unique Inverse of a Matrix

    Suppose \(A\) and \(B\) are square matrices such that \(AB=I\) where \(I\) is an identity matrix. Then it follows that \(BA=I\). Further, both \(A\) and \(B\) are invertible and \(B=A^{-1}\) and \(A=B^{-1}\).

    Proof

    Let \(R\) be the reduced row-echelon form of a square matrix \(A\). Then, \(R=EA\) where \(E\) is an invertible matrix. Since \(AB=I\), Lemma \(\PageIndex{1}\) gives us that \(R\) does not have a row of zeros. By noting that \(R\) is a square matrix and applying Proposition \(\PageIndex{1}\), we see that \(R=I\). Hence, \(EA=I\).

    Using both that \(EA=I\) and \(AB=I\), we can finish the proof with a chain of equalities as given by \[\begin{aligned} BA = IBIA &= (EA)B(E^{-1}E)A \\ &= E(AB)E^{-1}(EA) \\ &= EIE^{-1}I \\ &= EE^{-1} = I\end{aligned}\]

    It follows from the definition of the inverse of a matrix that \(B=A^{-1}\) and \(A=B^{-1}\).

    This theorem is very useful, since with it we need only test one of the products \(AB\) or \(BA\) in order to check that \(B\) is the inverse of \(A\). The hypothesis that \(A\) and \(B\) are square matrices is very important, and without this the theorem does not hold.

    We will now consider an example.

    Example \(\PageIndex{8}\): Non Square Matrices

Let \[A = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{array} \right]\nonumber \] Show that \(A^{T}A = I\) but \(AA^{T} \neq I\).

    Solution

Consider the product \(A^{T}A\) given by \[\left[ \begin{array}{rrr} 1 & 0 & 0\\ 0 & 1 & 0 \end{array} \right] \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{array} \right] = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \end{array} \right]\] Therefore, \(A^{T}A = I_2\), where \(I_2\) is the \(2 \times 2\) identity matrix. However, the product \(AA^{T}\) is \[\left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{array} \right] \left[ \begin{array}{rrr} 1 & 0 & 0\\ 0 & 1 & 0 \end{array} \right] = \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{array} \right]\] Hence \(AA^{T}\) is not the \(3 \times 3\) identity matrix. This shows that for Theorem \(\PageIndex{6}\), it is essential that both matrices be square and of the same size.
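A quick NumPy check (NumPy being an assumption, as above) reproduces both products.

```python
import numpy as np

A = np.array([[1., 0.], [0., 1.], [0., 0.]])

print(A.T @ A)  # the 2 x 2 identity matrix
print(A @ A.T)  # [[1, 0, 0], [0, 1, 0], [0, 0, 0]], not the 3 x 3 identity
```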

Is it possible to have matrices \(A\) and \(B\) such that \(AB=I\) while \(BA=0\)? This question is left to the reader; take a moment to consider the answer.

    We conclude this section with an important theorem.

    Theorem \(\PageIndex{7}\): The reduced row-echelon form of an Invertible Matrix

    For any matrix \(A\) the following conditions are equivalent:

    • \(A\) is invertible
    • The reduced row-echelon form of \(A\) is an identity matrix
    Proof

    In order to prove this, we show that for any given matrix \(A\), each condition implies the other. We first show that if \(A\) is invertible, then its reduced row-echelon form is an identity matrix, then we show that if the reduced row-echelon form of \(A\) is an identity matrix, then \(A\) is invertible.

If \(A\) is invertible, there is some matrix \(B\) such that \(AB = I\). By Lemma \(\PageIndex{4}\), we get that the reduced row-echelon form of \(A\) does not have a row of zeros. Then by Theorem \(\PageIndex{5}\), it follows that \(A\) and the reduced row-echelon form of \(A\) are square matrices. Finally, by Proposition \(\PageIndex{1}\), this reduced row-echelon form of \(A\) must be an identity matrix. This proves the first implication.

Now suppose the reduced row-echelon form of \(A\) is an identity matrix \(I\). Then \(I=EA\) for some product \(E\) of elementary matrices. By Theorem \(\PageIndex{6}\), we can conclude that \(A\) is invertible.

Theorem \(\PageIndex{7}\) corresponds to Algorithm 2.2.1, which claims that \(A^{-1}\) is found by row reducing the augmented matrix \(\left[ A|I\right]\) to the form \(\left[ I|A^{-1}\right]\). This will be a matrix product \(E\left[ A|I\right]\) where \(E\) is a product of elementary matrices. By the rules of matrix multiplication, we have that \(E\left[ A|I\right] = \left[ EA|EI\right] = \left[ EA|E\right]\).

It follows that the reduced row-echelon form of \(\left[ A|I\right]\) is \(\left[ EA|E\right]\), where \(EA\) gives the reduced row-echelon form of \(A\). By Theorem \(\PageIndex{7}\), if \(EA \neq I\), then \(A\) is not invertible, and if \(EA=I\), \(A\) is invertible. If \(EA=I\), then by Theorem \(\PageIndex{6}\), \(E=A^{-1}\). This proves that Algorithm 2.2.1 does in fact find \(A^{-1}\).
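Because the reduced row-echelon form is unique, and the augmented columns of \(\left[ A|I\right]\) are never pivot columns when \(A\) is invertible, a full row reduction recovers \(A^{-1}\) directly. The following SymPy sketch (SymPy again being an assumption) illustrates this for the matrix of Example 7.

```python
from sympy import Matrix, eye

A = Matrix([[0, 1, 0], [1, 1, 0], [0, -2, 1]])
M, _ = A.row_join(eye(3)).rref()   # fully row reduce [A | I]

left, right = M[:, :3], M[:, 3:]
assert left == eye(3)              # the left block is I, so A is invertible
assert right == A.inv()            # and the right block is A^{-1}
```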


    This page titled 2.3: Elementary Matrices is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by Doli Bambhania, Fatemeh Yarahmadi, and Bill Wilson.