2.1: Elementary Matrices
It is now clear that elementary row operations are important in linear algebra: they are essential in solving linear systems (using the Gaussian algorithm) and in inverting a matrix (using the matrix inversion algorithm). It turns out that they can be performed by left multiplication by certain invertible matrices. These matrices are the subject of this section.
Definition 005200 (Elementary Matrices). An $n \times n$ matrix $E$ is called an elementary matrix if it can be obtained from the identity matrix $I_n$ by a single elementary row operation (called the operation corresponding to $E$). We say that $E$ is of type I, II, or III if the operation is of that type (see Definition [def:000795]).
Hence
$$E_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad E_2 = \begin{bmatrix} 1 & 0 \\ 0 & 9 \end{bmatrix}, \quad \text{and} \quad E_3 = \begin{bmatrix} 1 & 5 \\ 0 & 1 \end{bmatrix}$$
are elementary of types I, II, and III, respectively, obtained from the 2×2 identity matrix by interchanging rows 1 and 2, multiplying row 2 by 9, and adding 5 times row 2 to row 1.
Suppose now that the matrix $A = \begin{bmatrix} a & b & c \\ p & q & r \end{bmatrix}$ is left multiplied by the above elementary matrices $E_1$, $E_2$, and $E_3$. The results are:
$$\begin{aligned}
E_1 A &= \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} a & b & c \\ p & q & r \end{bmatrix} = \begin{bmatrix} p & q & r \\ a & b & c \end{bmatrix} \\
E_2 A &= \begin{bmatrix} 1 & 0 \\ 0 & 9 \end{bmatrix} \begin{bmatrix} a & b & c \\ p & q & r \end{bmatrix} = \begin{bmatrix} a & b & c \\ 9p & 9q & 9r \end{bmatrix} \\
E_3 A &= \begin{bmatrix} 1 & 5 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a & b & c \\ p & q & r \end{bmatrix} = \begin{bmatrix} a+5p & b+5q & c+5r \\ p & q & r \end{bmatrix}
\end{aligned}$$
In each case, left multiplying A by the elementary matrix has the same effect as doing the corresponding row operation to A. This works in general.
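These products are easy to check by machine as well. Here is a minimal sketch using the sympy library (an assumed dependency, not something the text itself uses), multiplying a symbolic $2 \times 3$ matrix by each $E_i$:

```python
# Sketch (sympy assumed): left multiplication by E1, E2, E3 performs
# the corresponding row operation on a symbolic 2 x 3 matrix A.
from sympy import Matrix, symbols

a, b, c, p, q, r = symbols('a b c p q r')
A = Matrix([[a, b, c], [p, q, r]])

E1 = Matrix([[0, 1], [1, 0]])   # type I:   interchange rows 1 and 2
E2 = Matrix([[1, 0], [0, 9]])   # type II:  multiply row 2 by 9
E3 = Matrix([[1, 5], [0, 1]])   # type III: add 5 times row 2 to row 1

print(E1 * A)   # Matrix([[p, q, r], [a, b, c]])
print(E2 * A)   # Matrix([[a, b, c], [9*p, 9*q, 9*r]])
print(E3 * A)   # Matrix([[a + 5*p, b + 5*q, c + 5*r], [p, q, r]])
```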
Lemma 005213. If an elementary row operation is performed on an $m \times n$ matrix $A$, the result is $EA$, where $E$ is the elementary matrix obtained by performing the same operation on the $m \times m$ identity matrix.
We prove it for operations of type III; the proofs for types I and II are left as exercises. Let $E$ be the elementary matrix corresponding to the operation that adds $k$ times row $p$ to row $q \neq p$. The proof depends on the fact that each row of $EA$ is equal to the corresponding row of $E$ times $A$. Let $K_1, K_2, \dots, K_m$ denote the rows of $I_m$. Then row $i$ of $E$ is $K_i$ if $i \neq q$, while row $q$ of $E$ is $K_q + kK_p$. Hence:
If $i \neq q$, then row $i$ of $EA$ is $K_i A = (\text{row } i \text{ of } A)$. Row $q$ of $EA$ is
$$(K_q + kK_p)A = K_q A + k(K_p A) = (\text{row } q \text{ of } A) \text{ plus } k\,(\text{row } p \text{ of } A).$$
Thus $EA$ is the result of adding $k$ times row $p$ of $A$ to row $q$, as required.
The effect of an elementary row operation can be reversed by another such operation (called its inverse), which is also elementary of the same type (see the discussion following Example [exa:000809]). It follows that each elementary matrix $E$ is invertible. In fact, if a row operation on $I$ produces $E$, then the inverse operation carries $E$ back to $I$. If $F$ is the elementary matrix corresponding to the inverse operation, this means $FE = I$ (by Lemma [lem:005213]). Thus $F = E^{-1}$, and we have proved
Lemma 005237. Every elementary matrix $E$ is invertible, and $E^{-1}$ is also an elementary matrix (of the same type). Moreover, $E^{-1}$ corresponds to the inverse of the row operation that produces $E$.
The following table gives the inverse of each type of elementary row operation:
| Type | Operation | Inverse Operation |
|------|-----------|-------------------|
| I | Interchange rows $p$ and $q$ | Interchange rows $p$ and $q$ |
| II | Multiply row $p$ by $k \neq 0$ | Multiply row $p$ by $1/k$, $k \neq 0$ |
| III | Add $k$ times row $p$ to row $q \neq p$ | Subtract $k$ times row $p$ from row $q$, $q \neq p$ |
Note that elementary matrices of type I are self-inverse.
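To illustrate the table, the following sketch (sympy assumed) pairs each elementary matrix $E$ with the matrix $F$ of its inverse operation and confirms that $FE = I$ in all three cases:

```python
# Sketch (sympy assumed): the inverse row operation's matrix F
# satisfies F*E = I for each type of elementary matrix E.
from sympy import Matrix, Rational, eye

E1 = Matrix([[0, 1], [1, 0]])               # type I:   swap rows 1 and 2
F1 = Matrix([[0, 1], [1, 0]])               # inverse:  swap them back

E2 = Matrix([[1, 0], [0, 9]])               # type II:  multiply row 2 by 9
F2 = Matrix([[1, 0], [0, Rational(1, 9)]])  # inverse:  multiply row 2 by 1/9

E3 = Matrix([[1, 5], [0, 1]])               # type III: add 5(row 2) to row 1
F3 = Matrix([[1, -5], [0, 1]])              # inverse:  subtract 5(row 2) from row 1

assert F1 * E1 == eye(2)
assert F2 * E2 == eye(2)
assert F3 * E3 == eye(2)
```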
Example 005264. Find the inverse of each of the elementary matrices
$$E_1 = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 9 \end{bmatrix}, \quad \text{and} \quad E_3 = \begin{bmatrix} 1 & 0 & 5 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
$E_1$, $E_2$, and $E_3$ are of types I, II, and III respectively, so the table gives
$$E_1^{-1} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} = E_1, \quad E_2^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \frac{1}{9} \end{bmatrix}, \quad \text{and} \quad E_3^{-1} = \begin{bmatrix} 1 & 0 & -5 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
Inverses and Elementary Matrices
Suppose that an $m \times n$ matrix $A$ is carried to a matrix $B$ (written $A \to B$) by a series of $k$ elementary row operations. Let $E_1, E_2, \dots, E_k$ denote the corresponding elementary matrices. By Lemma [lem:005213], the reduction becomes
$$A \to E_1 A \to E_2 E_1 A \to E_3 E_2 E_1 A \to \cdots \to E_k E_{k-1} \cdots E_2 E_1 A = B$$
In other words,
$$A \to UA = B \quad \text{where} \quad U = E_k E_{k-1} \cdots E_2 E_1$$
The matrix $U = E_k E_{k-1} \cdots E_2 E_1$ is invertible, being a product of invertible matrices by Lemma [lem:005237]. Moreover, $U$ can be computed without finding the $E_i$ as follows: if the above series of operations carrying $A \to B$ is performed on $I_m$ in place of $A$, the result is $I_m \to U I_m = U$. Hence this series of operations carries the block matrix $\begin{bmatrix} A & I_m \end{bmatrix} \to \begin{bmatrix} B & U \end{bmatrix}$. This, together with the above discussion, proves
Theorem 005294. Suppose $A$ is $m \times n$ and $A \to B$ by elementary row operations. Then:
- $B = UA$ where $U$ is an $m \times m$ invertible matrix.
- $U$ can be computed by $\begin{bmatrix} A & I_m \end{bmatrix} \to \begin{bmatrix} B & U \end{bmatrix}$ using the operations carrying $A \to B$.
- $U = E_k E_{k-1} \cdots E_2 E_1$, where $E_1, E_2, \dots, E_k$ are the elementary matrices corresponding (in order) to the elementary row operations carrying $A$ to $B$.
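The second point of the theorem is a practical recipe: row-reduce the block matrix $\begin{bmatrix} A & I_m \end{bmatrix}$ and read off $B$ and $U$. A minimal sketch, assuming sympy and using its rref routine in place of hand reduction (so $B$ here is the reduced row-echelon form $R$):

```python
# Sketch of Theorem 005294 (sympy assumed): reducing [A | I_m]
# yields [R | U] with R = U*A and U invertible.
from sympy import Matrix, eye

A = Matrix([[1, 2], [3, 4], [5, 6]])   # any m x n matrix; here m = 3, n = 2
m, n = A.shape

red, _ = A.row_join(eye(m)).rref()     # carry [A | I_m] -> [R | U]
R, U = red[:, :n], red[:, n:]

assert U * A == R                      # B = UA
assert U.det() != 0                    # U is invertible
```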
Example 005312. If $A = \begin{bmatrix} 2 & 3 & 1 \\ 1 & 2 & 1 \end{bmatrix}$, express the reduced row-echelon form $R$ of $A$ as $R = UA$ where $U$ is invertible.
Reduce the double matrix $\begin{bmatrix} A & I \end{bmatrix} \to \begin{bmatrix} R & U \end{bmatrix}$ as follows:
$$\begin{bmatrix} A & I \end{bmatrix} = \left[ \begin{array}{ccc|cc} 2 & 3 & 1 & 1 & 0 \\ 1 & 2 & 1 & 0 & 1 \end{array} \right] \to \left[ \begin{array}{ccc|cc} 1 & 2 & 1 & 0 & 1 \\ 2 & 3 & 1 & 1 & 0 \end{array} \right] \to \left[ \begin{array}{ccc|cc} 1 & 2 & 1 & 0 & 1 \\ 0 & -1 & -1 & 1 & -2 \end{array} \right] \to \left[ \begin{array}{ccc|cc} 1 & 0 & -1 & 2 & -3 \\ 0 & 1 & 1 & -1 & 2 \end{array} \right]$$
Hence $R = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 1 \end{bmatrix}$ and $U = \begin{bmatrix} 2 & -3 \\ -1 & 2 \end{bmatrix}$.
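The answer is easily checked by multiplication (sympy assumed):

```python
# Checking Example 005312 (sympy assumed): U*A equals R and U is invertible.
from sympy import Matrix

A = Matrix([[2, 3, 1], [1, 2, 1]])
U = Matrix([[2, -3], [-1, 2]])
R = Matrix([[1, 0, -1], [0, 1, 1]])

assert U * A == R
assert U.det() != 0
```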
Now suppose that $A$ is invertible. We know that $A \to I$ by Theorem [thm:004553], so taking $B = I$ in Theorem [thm:005294] gives $\begin{bmatrix} A & I \end{bmatrix} \to \begin{bmatrix} I & U \end{bmatrix}$ where $I = UA$. Thus $U = A^{-1}$, so we have $\begin{bmatrix} A & I \end{bmatrix} \to \begin{bmatrix} I & A^{-1} \end{bmatrix}$. This is the matrix inversion algorithm in Section [sec:2_4]. However, more is true: Theorem [thm:005294] gives $A^{-1} = U = E_k E_{k-1} \cdots E_2 E_1$ where $E_1, E_2, \dots, E_k$ are the elementary matrices corresponding (in order) to the row operations carrying $A \to I$. Hence
$$A = (A^{-1})^{-1} = (E_k E_{k-1} \cdots E_2 E_1)^{-1} = E_1^{-1} E_2^{-1} \cdots E_{k-1}^{-1} E_k^{-1}$$
By Lemma [lem:005237], this shows that every invertible matrix A is a product of elementary matrices. Since elementary matrices are invertible (again by Lemma [lem:005237]), this proves the following important characterization of invertible matrices.
Theorem 005336. A square matrix is invertible if and only if it is a product of elementary matrices.
It follows from Theorem [thm:005294] that $A \to B$ by row operations if and only if $B = UA$ for some invertible matrix $U$. In this case we say that $A$ and $B$ are row-equivalent. (See Exercise [ex:ex2_5_17].)
Example 005340. Express $A = \begin{bmatrix} -2 & 3 \\ 1 & 0 \end{bmatrix}$ as a product of elementary matrices.
Using Lemma [lem:005213], the reduction of A→I is as follows:
$$A = \begin{bmatrix} -2 & 3 \\ 1 & 0 \end{bmatrix} \to E_1 A = \begin{bmatrix} 1 & 0 \\ -2 & 3 \end{bmatrix} \to E_2 E_1 A = \begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix} \to E_3 E_2 E_1 A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
where the corresponding elementary matrices are
$$E_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad E_2 = \begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix}, \quad E_3 = \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{3} \end{bmatrix}$$
Hence $(E_3 E_2 E_1)A = I$, so:
$$A = (E_3 E_2 E_1)^{-1} = E_1^{-1} E_2^{-1} E_3^{-1} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix}$$
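As a check (sympy assumed), multiplying the three inverses in order recovers $A$:

```python
# Checking Example 005340 (sympy assumed): E1^{-1} E2^{-1} E3^{-1} = A.
from sympy import Matrix

E1_inv = Matrix([[0, 1], [1, 0]])
E2_inv = Matrix([[1, 0], [-2, 1]])
E3_inv = Matrix([[1, 0], [0, 3]])

A = Matrix([[-2, 3], [1, 0]])
assert E1_inv * E2_inv * E3_inv == A
```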
Smith Normal Form
Let $A$ be an $m \times n$ matrix of rank $r$, and let $R$ be the reduced row-echelon form of $A$. Theorem [thm:005294] shows that $R = UA$ where $U$ is invertible, and that $U$ can be found from $\begin{bmatrix} A & I_m \end{bmatrix} \to \begin{bmatrix} R & U \end{bmatrix}$.
The matrix $R$ has $r$ leading ones (since $\text{rank } A = r$), so, as $R$ is reduced, the $n \times m$ matrix $R^T$ contains each row of $I_r$ in the first $r$ columns. Thus row operations will carry $R^T \to \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n \times m}$. Hence Theorem [thm:005294] (again) shows that $\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n \times m} = U_1 R^T$ where $U_1$ is an $n \times n$ invertible matrix. Writing $V = U_1^T$, we obtain
$$UAV = RV = RU_1^T = (U_1 R^T)^T = \left( \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n \times m} \right)^T = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{m \times n}$$
Moreover, the matrix $U_1 = V^T$ can be computed by $\begin{bmatrix} R^T & I_n \end{bmatrix} \to \begin{bmatrix} \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n \times m} & V^T \end{bmatrix}$. This proves
Theorem 005369. Let $A$ be an $m \times n$ matrix of rank $r$. There exist invertible matrices $U$ and $V$ of size $m \times m$ and $n \times n$, respectively, such that
$$UAV = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{m \times n}$$
Moreover, if $R$ is the reduced row-echelon form of $A$, then:
- $U$ can be computed by $\begin{bmatrix} A & I_m \end{bmatrix} \to \begin{bmatrix} R & U \end{bmatrix}$;
- $V$ can be computed by $\begin{bmatrix} R^T & I_n \end{bmatrix} \to \begin{bmatrix} \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}_{n \times m} & V^T \end{bmatrix}$.
If $A$ is an $m \times n$ matrix of rank $r$, the matrix $\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$ is called the Smith normal form¹ of $A$. Whereas the reduced row-echelon form of $A$ is the "nicest" matrix to which $A$ can be carried by row operations, the Smith canonical form is the "nicest" matrix to which $A$ can be carried by row and column operations. This is because doing row operations to $R^T$ amounts to doing column operations to $R$ and then transposing.
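The two bullet points of Theorem [thm:005369] translate directly into a computation. The following sketch (sympy assumed; `smith_like_form` is a hypothetical helper name) performs both reductions with rref. Because rref also normalizes the identity block, the $V$ it returns may differ from one found by hand, but $UAV$ still has the required block form:

```python
# Sketch of Theorem 005369 (sympy assumed): compute invertible U, V
# with U*A*V = [I_r 0; 0 0] via two rref reductions.
from sympy import Matrix, eye, zeros

def smith_like_form(A):
    m, n = A.shape
    red1, _ = A.row_join(eye(m)).rref()     # [A | I_m] -> [R | U]
    R, U = red1[:, :n], red1[:, n:]
    red2, _ = R.T.row_join(eye(n)).rref()   # [R^T | I_n] -> [S^T | V^T]
    V = red2[:, m:].T
    return U, V

A = Matrix([[1, -1, 1, 2], [2, -2, 1, -1], [-1, 1, 0, 3]])
U, V = smith_like_form(A)
r = A.rank()

expected = zeros(*A.shape)      # build [I_r 0; 0 0] of the right size
expected[:r, :r] = eye(r)
assert U * A * V == expected
```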
Example 005384. Given $A = \begin{bmatrix} 1 & -1 & 1 & 2 \\ 2 & -2 & 1 & -1 \\ -1 & 1 & 0 & 3 \end{bmatrix}$, find invertible matrices $U$ and $V$ such that $UAV = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$, where $r = \text{rank } A$.
The matrix $U$ and the reduced row-echelon form $R$ of $A$ are computed by the row reduction $\begin{bmatrix} A & I_3 \end{bmatrix} \to \begin{bmatrix} R & U \end{bmatrix}$:
$$\left[ \begin{array}{cccc|ccc} 1 & -1 & 1 & 2 & 1 & 0 & 0 \\ 2 & -2 & 1 & -1 & 0 & 1 & 0 \\ -1 & 1 & 0 & 3 & 0 & 0 & 1 \end{array} \right] \to \left[ \begin{array}{cccc|ccc} 1 & -1 & 0 & -3 & -1 & 1 & 0 \\ 0 & 0 & 1 & 5 & 2 & -1 & 0 \\ 0 & 0 & 0 & 0 & -1 & 1 & 1 \end{array} \right]$$
Hence
$$R = \begin{bmatrix} 1 & -1 & 0 & -3 \\ 0 & 0 & 1 & 5 \\ 0 & 0 & 0 & 0 \end{bmatrix} \quad \text{and} \quad U = \begin{bmatrix} -1 & 1 & 0 \\ 2 & -1 & 0 \\ -1 & 1 & 1 \end{bmatrix}$$
In particular, $r = \text{rank } R = 2$. Now row-reduce $\begin{bmatrix} R^T & I_4 \end{bmatrix} \to \begin{bmatrix} \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} & V^T \end{bmatrix}$:
$$\left[ \begin{array}{ccc|cccc} 1 & 0 & 0 & 1 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 1 & 0 \\ -3 & 5 & 0 & 0 & 0 & 0 & 1 \end{array} \right] \to \left[ \begin{array}{ccc|cccc} 1 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 3 & 0 & -5 & 1 \end{array} \right]$$
whence
$$V^T = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 3 & 0 & -5 & 1 \end{bmatrix} \quad \text{so} \quad V = \begin{bmatrix} 1 & 0 & 1 & 3 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & -5 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Then $UAV = \begin{bmatrix} I_2 & 0 \\ 0 & 0 \end{bmatrix}$, as is easily verified.
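The answer can be confirmed directly (sympy assumed):

```python
# Checking Example 005384 (sympy assumed): U*A*V = [I_2 0; 0 0].
from sympy import Matrix

A = Matrix([[1, -1, 1, 2], [2, -2, 1, -1], [-1, 1, 0, 3]])
U = Matrix([[-1, 1, 0], [2, -1, 0], [-1, 1, 1]])
V = Matrix([[1, 0, 1, 3], [0, 0, 1, 0], [0, 1, 0, -5], [0, 0, 0, 1]])

print(U * A * V)
# Matrix([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]])
```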
Uniqueness of the Reduced Row-echelon Form
In this short subsection, Theorem [thm:005294] is used to prove the following important theorem.
Theorem 005405. If a matrix $A$ is carried to reduced row-echelon matrices $R$ and $S$ by row operations, then $R = S$.
Observe first that $UR = S$ for some invertible matrix $U$ (by Theorem [thm:005294] there exist invertible matrices $P$ and $Q$ such that $R = PA$ and $S = QA$; take $U = QP^{-1}$). We show that $R = S$ by induction on the number $m$ of rows of $R$ and $S$. The case $m = 1$ is left to the reader. If $R_j$ and $S_j$ denote column $j$ in $R$ and $S$ respectively, the fact that $UR = S$ gives
$$UR_j = S_j \quad \text{for each } j \qquad \text{([eq:urs])}$$
Since U is invertible, this shows that R and S have the same zero columns. Hence, by passing to the matrices obtained by deleting the zero columns from R and S, we may assume that R and S have no zero columns.
But then the first column of $R$ and $S$ is the first column of $I_m$ because $R$ and $S$ are row-echelon, so ([eq:urs]) shows that the first column of $U$ is column 1 of $I_m$. Now write $U$, $R$, and $S$ in block form as follows.
$$U = \begin{bmatrix} 1 & X \\ 0 & V \end{bmatrix}, \quad R = \begin{bmatrix} 1 & Y \\ 0 & R' \end{bmatrix}, \quad \text{and} \quad S = \begin{bmatrix} 1 & Z \\ 0 & S' \end{bmatrix}$$
Since $UR = S$, block multiplication gives $VR' = S'$, so, since $V$ is invertible ($U$ is invertible) and both $R'$ and $S'$ are reduced row-echelon, we obtain $R' = S'$ by induction. Hence $R$ and $S$ have the same number (say $r$) of leading 1s, and so both have $m - r$ zero rows.
In fact, $R$ and $S$ have leading ones in the same columns, say $r$ of them. Applying ([eq:urs]) to these columns shows that the first $r$ columns of $U$ are the first $r$ columns of $I_m$. Hence we can write $U$, $R$, and $S$ in block form as follows:
$$U = \begin{bmatrix} I_r & M \\ 0 & W \end{bmatrix}, \quad R = \begin{bmatrix} R_1 & R_2 \\ 0 & 0 \end{bmatrix}, \quad \text{and} \quad S = \begin{bmatrix} S_1 & S_2 \\ 0 & 0 \end{bmatrix}$$
where $R_1$ and $S_1$ are $r \times r$. Then block multiplication using $UR = S$ gives $R_1 = S_1$ and $R_2 = S_2$; that is, $S = R$. This completes the proof.
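One computational consequence of the theorem: row-equivalent matrices have identical reduced row-echelon forms. A small sketch (sympy assumed), taking $B = UA$ for an invertible $U$:

```python
# Sketch of Theorem 005405 (sympy assumed): if B = U*A with U
# invertible, then A and B have the same reduced row-echelon form.
from sympy import Matrix

A = Matrix([[1, -1, 1, 2], [2, -2, 1, -1], [-1, 1, 0, 3]])
U = Matrix([[1, 2, 0], [0, 1, 0], [3, 0, 1]])   # invertible: det = 1
B = U * A                                       # row-equivalent to A

assert A.rref()[0] == B.rref()[0]
```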
¹ Named after Henry John Stephen Smith (1826–83).