2.1: Elementary Matrices
It is now clear that elementary row operations are important in linear algebra: They are essential in solving linear systems (using the gaussian algorithm) and in inverting a matrix (using the matrix inversion algorithm). It turns out that they can be performed by left multiplying by certain invertible matrices. These matrices are the subject of this section.
Definition [def:005200] (Elementary Matrices). An \(n \times n\) matrix \(E\) is called an elementary matrix if it can be obtained from the identity matrix \(I_{n}\) by a single elementary row operation (called the operation corresponding to \(E\)). We say that \(E\) is of type I, II, or III if the operation is of that type (see Definition [def:000795]).
Hence
\[E_{1} = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \end{array} \right], \quad E_{2} = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 9 \end{array} \right], \quad \mbox{ and } \quad E_{3} = \left[ \begin{array}{rr} 1 & 5 \\ 0 & 1 \end{array} \right] \nonumber \]
are elementary of types I, II, and III, respectively, obtained from the \(2 \times 2\) identity matrix by interchanging rows 1 and 2, multiplying row 2 by \(9\), and adding \(5\) times row 2 to row 1.
Suppose now that the matrix \(A = \left[ \begin{array}{ccc} a & b & c \\ p & q & r \end{array} \right]\) is left multiplied by the above elementary matrices \(E_{1}\), \(E_{2}\), and \(E_{3}\). The results are:
\[\begin{aligned} E_{1}A &= \left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \end{array} \right] \left[ \begin{array}{rrr} a & b & c \\ p & q & r \end{array} \right] = \left[ \begin{array}{rrr} p & q & r \\ a & b & c \end{array} \right] \\ E_{2}A &= \left[ \begin{array}{rr} 1 & 0 \\ 0 & 9 \end{array} \right] \left[ \begin{array}{rrr} a & b & c \\ p & q & r \end{array} \right] = \left[ \begin{array}{ccc} a & b & c \\ 9p & 9q & 9r \end{array} \right] \\ E_{3}A &= \left[ \begin{array}{rr} 1 & 5 \\ 0 & 1 \end{array} \right] \left[ \begin{array}{ccc} a & b & c \\ p & q & r \end{array} \right] = \left[ \begin{array}{ccc} a + 5p & b + 5q & c + 5r \\ p & q & r \end{array} \right]\end{aligned} \nonumber \]
In each case, left multiplying \(A\) by the elementary matrix has the same effect as doing the corresponding row operation to \(A\). This works in general.
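This is easy to check numerically. Below is a minimal NumPy sketch (the concrete entries standing in for \(a, b, c, p, q, r\) and the variable names are my own choices for illustration), comparing left multiplication by \(E_{1}\), \(E_{2}\), \(E_{3}\) with the row operations performed directly on \(A\):

```python
import numpy as np

A = np.array([[1., 2., 3.],    # stands in for the row [a b c]
              [4., 5., 6.]])   # stands in for the row [p q r]

E1 = np.array([[0., 1.], [1., 0.]])   # type I:  interchange rows 1 and 2
E2 = np.array([[1., 0.], [0., 9.]])   # type II: multiply row 2 by 9
E3 = np.array([[1., 5.], [0., 1.]])   # type III: add 5 times row 2 to row 1

# The same three operations applied directly to A
swapped = A[[1, 0], :]
scaled = A.copy(); scaled[1] *= 9
added = A.copy(); added[0] += 5 * A[1]

assert np.allclose(E1 @ A, swapped)
assert np.allclose(E2 @ A, scaled)
assert np.allclose(E3 @ A, added)
```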
Lemma [lem:005213]. If an elementary row operation is performed on an \(m \times n\) matrix \(A\), the result is \(EA\), where \(E\) is the elementary matrix obtained by performing the same operation on the \(m \times m\) identity matrix.
Proof. We prove it for operations of type III; the proofs for types I and II are left as exercises. Let \(E\) be the elementary matrix corresponding to the operation that adds \(k\) times row \(p\) to row \(q \neq p\). The proof depends on the fact that each row of \(EA\) is equal to the corresponding row of \(E\) times \(A\). Let \(K_{1}, K_{2}, \dots, K_{m}\) denote the rows of \(I_{m}\). Then row \(i\) of \(E\) is \(K_{i}\) if \(i \neq q\), while row \(q\) of \(E\) is \(K_{q} + kK_{p}\). Hence:
\[\begin{aligned} \mbox{If } i \neq q, \mbox{ then row } i \mbox{ of } EA = K_{i}A &= (\mbox{row } i \mbox{ of } A). \\ \mbox{Row } q \mbox{ of } EA = (K_{q} + kK_{p})A &= K_{q}A + k(K_{p}A) \\ &= (\mbox{row } q \mbox{ of } A) \mbox{ plus } k\ (\mbox{row } p \mbox{ of } A).\end{aligned} \nonumber \]
Thus \(EA\) is the result of adding \(k\) times row \(p\) of \(A\) to row \(q\), as required.
The effect of an elementary row operation can be reversed by another such operation (called its inverse), which is also elementary of the same type (see the discussion following Example [exa:000809]). It follows that each elementary matrix \(E\) is invertible. In fact, if a row operation on \(I\) produces \(E\), then the inverse operation carries \(E\) back to \(I\). If \(F\) is the elementary matrix corresponding to the inverse operation, this means \(FE = I\) (by Lemma [lem:005213]). Thus \(F = E^{-1}\), and we have proved:
Lemma [lem:005237]. Every elementary matrix \(E\) is invertible, and \(E^{-1}\) is also an elementary matrix (of the same type). Moreover, \(E^{-1}\) corresponds to the inverse of the row operation that produces \(E\).
The following table gives the inverse of each type of elementary row operation:
| Type | Operation | Inverse Operation |
| :-: | :-- | :-- |
| I | Interchange rows \(p\) and \(q\) | Interchange rows \(p\) and \(q\) |
| II | Multiply row \(p\) by \(k \neq 0\) | Multiply row \(p\) by \(\frac{1}{k}\) |
| III | Add \(k\) times row \(p\) to row \(q \neq p\) | Subtract \(k\) times row \(p\) from row \(q\) |
Note that elementary matrices of type I are self-inverse.
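The table translates directly into code. The sketch below (the helper names are my own, not from the text, and rows are indexed from 0) builds each type of elementary matrix from \(I_{n}\) and verifies that the inverse operation listed in the table really does produce the matrix inverse:

```python
import numpy as np

def swap_rows(n, p, q):
    """Type I: interchange rows p and q of I_n."""
    E = np.eye(n)
    E[[p, q]] = E[[q, p]]
    return E

def scale_row(n, p, k):
    """Type II: multiply row p of I_n by k != 0."""
    E = np.eye(n)
    E[p, p] = k
    return E

def add_multiple(n, p, q, k):
    """Type III: add k times row p to row q != p of I_n."""
    E = np.eye(n)
    E[q, p] = k
    return E

n, k = 4, 7.0
assert np.allclose(swap_rows(n, 0, 2) @ swap_rows(n, 0, 2), np.eye(n))      # type I is its own inverse
assert np.allclose(scale_row(n, 1, k) @ scale_row(n, 1, 1 / k), np.eye(n))  # multiplying by 1/k undoes k
assert np.allclose(add_multiple(n, 0, 3, k) @ add_multiple(n, 0, 3, -k), np.eye(n))  # subtracting undoes adding
```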
Example [exa:005264]. Find the inverse of each of the elementary matrices
\[E_{1} = \left[ \begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array} \right], \quad E_{2} = \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 9 \end{array} \right], \quad \mbox{and} \quad E_{3} = \left[ \begin{array}{rrr} 1 & 0 & 5 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right]. \nonumber \]
Solution. \(E_{1}\), \(E_{2}\), and \(E_{3}\) are of types I, II, and III, respectively, so the table gives
\[E_{1}^{-1} = \left[ \begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array} \right] = E_{1}, \quad E_{2}^{-1} = \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \frac{1}{9} \end{array} \right], \quad \mbox{and} \quad E_{3}^{-1} = \left[ \begin{array}{rrr} 1 & 0 & -5 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right]. \nonumber \]
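A quick numerical check of these answers (floating-point arithmetic; `np.linalg.inv` is used here only to confirm the inverses read off the table):

```python
import numpy as np

E1 = np.array([[0., 1., 0.], [1., 0., 0.], [0., 0., 1.]])
E2 = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 9.]])
E3 = np.array([[1., 0., 5.], [0., 1., 0.], [0., 0., 1.]])

assert np.allclose(np.linalg.inv(E1), E1)                        # type I is self-inverse
assert np.allclose(np.linalg.inv(E2), np.diag([1., 1., 1 / 9]))  # row 3 multiplied by 1/9
assert np.allclose(np.linalg.inv(E3) @ E3, np.eye(3))            # the -5 undoes the +5
```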
Inverses and Elementary Matrices
Suppose that an \(m \times n\) matrix \(A\) is carried to a matrix \(B\) (written \(A \to B\)) by a series of \(k\) elementary row operations. Let \(E_{1}, E_{2}, \dots, E_{k}\) denote the corresponding elementary matrices. By Lemma [lem:005213], the reduction becomes
\[A \rightarrow E_{1}A \rightarrow E_{2}E_{1}A \rightarrow E_{3}E_{2}E_{1}A \rightarrow \cdots \rightarrow E_{k}E_{k-1} \cdots E_{2}E_{1}A = B \nonumber \]
In other words,
\[A \rightarrow UA = B \quad \mbox{ where } U = E_{k}E_{k-1} \cdots E_{2}E_{1} \nonumber \]
The matrix \(U = E_{k}E_{k-1} \cdots E_{2}E_{1}\) is invertible, being a product of invertible matrices by Lemma [lem:005237]. Moreover, \(U\) can be computed without finding the \(E_{i}\) as follows: If the above series of operations carrying \(A \to B\) is performed on \(I_{m}\) in place of \(A\), the result is \(I_{m} \to UI_{m} = U\). Hence this series of operations carries the block matrix \(\left[ \begin{array}{cc} A & I_{m} \end{array} \right] \to \left[ \begin{array}{cc} B & U \end{array} \right]\). This, together with the above discussion, proves
Theorem [thm:005294]. Suppose \(A\) is \(m \times n\) and \(A \to B\) by elementary row operations.
- \(B = UA\) where \(U\) is an \(m \times m\) invertible matrix.
- \(U\) can be computed by \(\left[ \begin{array}{cc} A & I_{m} \end{array} \right] \to \left[ \begin{array}{cc} B & U \end{array} \right]\) using the operations carrying \(A \to B\).
- \(U = E_{k}E_{k-1} \cdots E_{2}E_{1}\) where \(E_{1}, E_{2}, \dots, E_{k}\) are the elementary matrices corresponding (in order) to the elementary row operations carrying \(A\) to \(B\).
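Theorem [thm:005294] is constructive: row-reducing the block matrix \(\left[ \begin{array}{cc} A & I_{m} \end{array} \right]\) produces \(R\) and \(U\) at the same time. Here is a sketch of that computation, a plain Gauss–Jordan elimination with partial pivoting; the function name and tolerance are my own choices, not from the text:

```python
import numpy as np

def rref_with_transform(A, tol=1e-12):
    """Carry [A | I_m] -> [R | U], so that R is the reduced row-echelon
    form of A and U is invertible with U @ A == R."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    M = np.hstack([A, np.eye(m)])        # the block matrix [A | I_m]
    row = 0
    for col in range(n):                 # pivot only on the columns of A
        pivot = row + np.argmax(np.abs(M[row:, col]))
        if abs(M[pivot, col]) < tol:
            continue                     # no pivot in this column
        M[[row, pivot]] = M[[pivot, row]]          # type I operation
        M[row] /= M[row, col]                      # type II operation
        for r in range(m):
            if r != row:
                M[r] -= M[r, col] * M[row]         # type III operations
        row += 1
        if row == m:
            break
    return M[:, :n], M[:, n:]            # R and U

A = np.array([[2., 3., 1.], [1., 2., 1.]])   # the matrix of Example 005312 below
R, U = rref_with_transform(A)
assert np.allclose(U @ A, R)                 # B = UA
assert abs(np.linalg.det(U)) > 1e-12         # U is invertible
```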
Example [exa:005312]. If \(A = \left[ \begin{array}{rrr} 2 & 3 & 1 \\ 1 & 2 & 1 \end{array} \right]\), express the reduced row-echelon form \(R\) of \(A\) as \(R = UA\) where \(U\) is invertible.
Solution. Reduce the double matrix \(\left[ \begin{array}{cc} A & I \end{array} \right] \rightarrow \left[ \begin{array}{cc} R & U \end{array} \right]\) as follows:
\[\begin{aligned} \left[ \begin{array}{cc} A & I \end{array} \right] = \left[ \begin{array}{rrr|rr} 2 & 3 & 1 & 1 & 0 \\ 1 & 2 & 1 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{rrr|rr} 1 & 2 & 1 & 0 & 1 \\ 2 & 3 & 1 & 1 & 0 \end{array} \right] &\rightarrow \left[ \begin{array}{rrr|rr} 1 & 2 & 1 & 0 & 1 \\ 0 & -1 & -1 & 1 & -2 \end{array} \right] \\ &\rightarrow \left[ \begin{array}{rrr|rr} 1 & 0 & -1 & 2 & -3 \\ 0 & 1 & 1 & -1 & 2 \end{array} \right]\end{aligned} \nonumber \]
Hence \(R = \left[ \begin{array}{rrr} 1 & 0 & -1 \\ 0 & 1 & 1 \end{array} \right]\) and \(U = \left[ \begin{array}{rr} 2 & -3 \\ -1 & 2 \end{array} \right]\).
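The answer is easily verified (a short NumPy check with the matrices above):

```python
import numpy as np

A = np.array([[2., 3., 1.], [1., 2., 1.]])
R = np.array([[1., 0., -1.], [0., 1., 1.]])
U = np.array([[2., -3.], [-1., 2.]])

assert np.allclose(U @ A, R)                                  # R = UA
assert np.allclose(np.linalg.inv(U), [[2., 3.], [1., 2.]])    # U is invertible
```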
Now suppose that \(A\) is invertible. We know that \(A \to I\) by Theorem [thm:004553], so taking \(B = I\) in Theorem [thm:005294] gives \(\left[ \begin{array}{cc} A & I \end{array} \right] \rightarrow \left[ \begin{array}{cc} I & U \end{array} \right]\) where \(I = UA\). Thus \(U = A^{-1}\), so we have \(\left[ \begin{array}{cc} A & I \end{array} \right] \rightarrow \left[ \begin{array}{cc} I & A^{-1} \end{array} \right]\). This is the matrix inversion algorithm in Section [sec:2_4]. However, more is true: Theorem [thm:005294] gives \(A^{-1} = U = E_{k}E_{k-1} \cdots E_{2}E_{1}\) where \(E_{1}, E_{2}, \dots, E_{k}\) are the elementary matrices corresponding (in order) to the row operations carrying \(A \to I\). Hence
\[A = \left(A^{-1}\right)^{-1} = \left(E_{k}E_{k-1} \cdots E_{2}E_{1}\right)^{-1} = E_{1}^{-1}E_{2}^{-1} \cdots E_{k-1}^{-1}E_{k}^{-1} \nonumber \]
By Lemma [lem:005237], this shows that every invertible matrix \(A\) is a product of elementary matrices. Since elementary matrices are invertible (again by Lemma [lem:005237]), this proves the following important characterization of invertible matrices.
Theorem [thm:005336]. A square matrix is invertible if and only if it is a product of elementary matrices.
It follows from Theorem [thm:005294] that \(A \to B\) by row operations if and only if \(B = UA\) for some invertible matrix \(U\). In this case we say that \(A\) and \(B\) are row-equivalent. (See Exercise [ex:ex2_5_17].)
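In particular, \(A\) and \(UA\) have the same reduced row-echelon form for every invertible \(U\). A quick illustration using SymPy's exact `rref` (the particular matrices are arbitrary choices of mine):

```python
from sympy import Matrix

A = Matrix([[2, 3, 1], [1, 2, 1]])
U = Matrix([[1, 4], [2, 9]])        # invertible: det = 1

B = U * A                           # B = UA, so A and B are row-equivalent
assert A.rref()[0] == B.rref()[0]   # same reduced row-echelon form
```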
Example [exa:005340]. Express \(A = \left[ \begin{array}{rr} -2 & 3 \\ 1 & 0 \end{array} \right]\) as a product of elementary matrices.
Solution. Using Lemma [lem:005213], the reduction of \(A \to I\) is as follows:
\[A = \left[ \begin{array}{rr} -2 & 3 \\ 1 & 0 \end{array} \right] \rightarrow E_{1}A = \left[ \begin{array}{rr} 1 & 0 \\ -2 & 3 \end{array} \right] \rightarrow E_{2}E_{1}A = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 3 \end{array} \right] \rightarrow E_{3}E_{2}E_{1}A = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \end{array} \right] \nonumber \]
where the corresponding elementary matrices are
\[E_{1} = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \end{array} \right], \quad E_{2} = \left[ \begin{array}{rr} 1 & 0 \\ 2 & 1 \end{array} \right], \quad E_{3} = \left[ \begin{array}{rr} 1 & 0 \\ 0 & \frac{1}{3} \end{array} \right] \nonumber \]
Hence \((E_{3}\ E_{2}\ E_{1})A = I\), so:
\[A = \left(E_{3}E_{2}E_{1}\right)^{-1} = E_{1}^{-1}E_{2}^{-1}E_{3}^{-1} = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \end{array} \right] \left[ \begin{array}{rr} 1 & 0 \\ -2 & 1 \end{array} \right] \left[ \begin{array}{rr} 1 & 0 \\ 0 & 3 \end{array} \right] \nonumber \]
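Multiplying out the three factors confirms the result (NumPy, purely as a check):

```python
import numpy as np

E1_inv = np.array([[0., 1.], [1., 0.]])    # interchange rows 1 and 2
E2_inv = np.array([[1., 0.], [-2., 1.]])   # subtract 2 times row 1 from row 2
E3_inv = np.array([[1., 0.], [0., 3.]])    # multiply row 2 by 3

A = np.array([[-2., 3.], [1., 0.]])
assert np.allclose(E1_inv @ E2_inv @ E3_inv, A)   # the product recovers A
```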
Smith Normal Form
Let \(A\) be an \(m \times n\) matrix of rank \(r\), and let \(R\) be the reduced row-echelon form of \(A\). Theorem [thm:005294] shows that \(R = UA\) where \(U\) is invertible, and that \(U\) can be found from \(\left[ \begin{array}{cc} A & I_{m} \end{array} \right] \rightarrow \left[ \begin{array}{cc} R & U \end{array} \right]\).
The matrix \(R\) has \(r\) leading ones (since \(\text{rank } A = r\)) so, as \(R\) is reduced, the \(n \times m\) matrix \(R^{T}\) contains each row of \(I_{r}\) in the first \(r\) columns. Thus row operations will carry \(R^{T} \rightarrow \left[ \begin{array}{cc} I_{r} & 0 \\ 0 & 0 \end{array} \right]_{n \times m}\). Hence Theorem [thm:005294] (again) shows that \(\left[ \begin{array}{cc} I_{r} & 0 \\ 0 & 0 \end{array} \right]_{n \times m} = U_{1}R^{T}\) where \(U_{1}\) is an \(n \times n\) invertible matrix. Writing \(V = U_{1}^{T}\), we obtain
\[UAV = RV = RU_{1}^{T} = \left(U_{1}R^{T}\right)^{T} = \left( \left[ \begin{array}{cc} I_{r} & 0 \\ 0 & 0 \end{array} \right]_{n \times m} \right)^{T} = \left[ \begin{array}{cc} I_{r} & 0 \\ 0 & 0 \end{array} \right]_{m \times n} \nonumber \]
Moreover, the matrix \(U_{1} = V^{T}\) can be computed by \(\left[ \begin{array}{cc} R^{T} & I_{n} \end{array} \right] \rightarrow \left[ \begin{array}{cc} \left[ \begin{array}{cc} I_{r} & 0 \\ 0 & 0 \end{array} \right]_{n \times m} & V^{T} \end{array} \right]\). This proves:
Theorem [thm:005369]. Let \(A\) be an \(m \times n\) matrix of rank \(r\). There exist invertible matrices \(U\) and \(V\) of size \(m \times m\) and \(n \times n\), respectively, such that
\[UAV = \left[ \begin{array}{cc} I_{r} & 0 \\ 0 & 0 \end{array} \right]_{m \times n} \nonumber \]
Moreover, if \(R\) is the reduced row-echelon form of \(A\), then:
- \(U\) can be computed by \(\left[ \begin{array}{cc} A & I_{m} \end{array} \right] \rightarrow \left[ \begin{array}{cc} R & U \end{array} \right]\);
- \(V\) can be computed by \(\left[ \begin{array}{cc} R^{T} & I_{n} \end{array} \right] \rightarrow \left[ \begin{array}{cc} \left[ \begin{array}{cc} I_{r} & 0 \\ 0 & 0 \end{array} \right]_{n \times m} & V^{T} \end{array} \right]\).
If \(A\) is an \(m \times n\) matrix of rank \(r\), the matrix \(\left[ \begin{array}{cc} I_{r} & 0 \\ 0 & 0 \end{array} \right]\) is called the Smith normal form\(^{1}\) of \(A\). Whereas the reduced row-echelon form of \(A\) is the “nicest” matrix to which \(A\) can be carried by row operations, the Smith normal form is the “nicest” matrix to which \(A\) can be carried by row and column operations. This is because doing row operations to \(R^{T}\) amounts to doing column operations to \(R\) and then transposing.
Example [exa:005384]. Given \(A = \left[ \begin{array}{rrrr} 1 & -1 & 1 & 2 \\ 2 & -2 & 1 & -1 \\ -1 & 1 & 0 & 3 \end{array} \right]\), find invertible matrices \(U\) and \(V\) such that \(UAV = \left[ \begin{array}{cc} I_{r} & 0 \\ 0 & 0 \end{array} \right]\), where \(r = \text{rank } A\).
Solution. The matrix \(U\) and the reduced row-echelon form \(R\) of \(A\) are computed by the row reduction \(\left[ \begin{array}{cc} A & I_{3} \end{array} \right] \rightarrow \left[ \begin{array}{cc} R & U \end{array} \right]\):
\[\left[ \begin{array}{rrrr|rrr} 1 & -1 & 1 & 2 & 1 & 0 & 0 \\ 2 & -2 & 1 & -1 & 0 & 1 & 0 \\ -1 & 1 & 0 & 3 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{rrrr|rrr} 1 & -1 & 0 & -3 & -1 & 1 & 0 \\ 0 & 0 & 1 & 5 & 2 & -1 & 0 \\ 0 & 0 & 0 & 0 & -1 & 1 & 1 \end{array} \right] \nonumber \]
Hence
\[R = \left[ \begin{array}{rrrr} 1 & -1 & 0 & -3 \\ 0 & 0 & 1 & 5 \\ 0 & 0 & 0 & 0 \end{array} \right] \quad \mbox{ and } \quad U = \left[ \begin{array}{rrr} -1 & 1 & 0 \\ 2 & -1 & 0 \\ -1 & 1 & 1 \end{array} \right] \nonumber \]
In particular, \(r = \text{rank } R = 2\). Now row-reduce \(\left[ \begin{array}{cc} R^{T} & I_{4} \end{array} \right] \rightarrow \left[ \begin{array}{cc} \left[ \begin{array}{cc} I_{r} & 0 \\ 0 & 0 \end{array} \right] & V^{T} \end{array} \right]\):
\[\left[ \begin{array}{rrr|rrrr} 1 & 0 & 0 & 1 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 1 & 0 \\ -3 & 5 & 0 & 0 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{rrr|rrrr} 1 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 3 & 0 & -5 & 1 \end{array} \right] \nonumber \]
whence
\[V^{T} = \left[ \begin{array}{rrrr} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 3 & 0 & -5 & 1 \end{array} \right] \quad \mbox{ so } \quad V = \left[ \begin{array}{rrrr} 1 & 0 & 1 & 3 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & -5 \\ 0 & 0 & 0 & 1 \end{array} \right] \nonumber \]
Then \(UAV = \left[ \begin{array}{cc} I_{2} & 0 \\ 0 & 0 \end{array} \right]\) as is easily verified.
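The factorization is again easy to verify numerically (a NumPy check of \(UAV\) with the matrices found above):

```python
import numpy as np

A = np.array([[ 1., -1., 1.,  2.],
              [ 2., -2., 1., -1.],
              [-1.,  1., 0.,  3.]])
U = np.array([[-1.,  1., 0.],
              [ 2., -1., 0.],
              [-1.,  1., 1.]])
V = np.array([[1., 0., 1.,  3.],
              [0., 0., 1.,  0.],
              [0., 1., 0., -5.],
              [0., 0., 0.,  1.]])

expected = np.zeros((3, 4))
expected[:2, :2] = np.eye(2)            # the block matrix [[I_2, 0], [0, 0]]
assert np.allclose(U @ A @ V, expected)
```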
Uniqueness of the Reduced Row-echelon Form
In this short subsection, Theorem [thm:005294] is used to prove the following important theorem.
Theorem [thm:005405]. If a matrix \(A\) is carried to reduced row-echelon matrices \(R\) and \(S\) by row operations, then \(R = S\).
Proof. Observe first that \(UR = S\) for some invertible matrix \(U\) (by Theorem [thm:005294] there exist invertible matrices \(P\) and \(Q\) such that \(R = PA\) and \(S = QA\); take \(U = QP^{-1}\)). We show that \(R = S\) by induction on the number \(m\) of rows of \(R\) and \(S\). The case \(m = 1\) is left to the reader. If \(R_{j}\) and \(S_{j}\) denote column \(j\) in \(R\) and \(S\) respectively, the fact that \(UR = S\) gives
\[\label{eq:urs} UR_{j} = S_{j} \quad \mbox{ for each } j \]
Since \(U\) is invertible, this shows that \(R\) and \(S\) have the same zero columns. Hence, by passing to the matrices obtained by deleting the zero columns from \(R\) and \(S\), we may assume that \(R\) and \(S\) have no zero columns.
But then the first column of \(R\) and \(S\) is the first column of \(I_{m}\) because \(R\) and \(S\) are row-echelon, so ([eq:urs]) shows that the first column of \(U\) is column 1 of \(I_{m}\). Now write \(U\), \(R\), and \(S\) in block form as follows.
\[U = \left[ \begin{array}{cc} 1 & X \\ 0 & V \end{array} \right], \quad R = \left[ \begin{array}{cc} 1 & Y \\ 0 & R^\prime \end{array} \right], \quad \mbox{ and } \quad S = \left[ \begin{array}{cc} 1 & Z \\ 0 & S^\prime \end{array} \right] \nonumber \]
Since \(UR = S\), block multiplication gives \(VR^\prime = S^\prime\) so, since \(V\) is invertible (\(U\) is invertible) and both \(R^\prime\) and \(S^\prime\) are reduced row-echelon, we obtain \(R^\prime = S^\prime\) by induction. Hence \(R\) and \(S\) have the same number (say \(r\)) of leading \(1\)s, and so both have \(m - r\) zero rows.
In fact, \(R\) and \(S\) have leading ones in the same columns, say \(r\) of them. Applying ([eq:urs]) to these columns shows that the first \(r\) columns of \(U\) are the first \(r\) columns of \(I_{m}\). Hence we can write \(U\), \(R\), and \(S\) in block form as follows:
\[U = \left[ \begin{array}{cc} I_{r} & M \\ 0 & W \end{array} \right], \quad R = \left[ \begin{array}{cc} R_{1} & R_{2} \\ 0 & 0 \end{array} \right], \quad \mbox{ and } \quad S = \left[ \begin{array}{cc} S_{1} & S_{2} \\ 0 & 0 \end{array} \right] \nonumber \]
where \(R_{1}\) and \(S_{1}\) are \(r \times r\). Then, using \(UR = S\), block multiplication gives \(R_{1} = S_{1}\) and \(R_{2} = S_{2}\); that is, \(S = R\). This completes the proof.
1. Named after Henry John Stephen Smith (1826–83).