# 2.8: Elementary Matrices

- Page ID
- 19846

We now turn our attention to a special type of matrix called an **elementary matrix**. An elementary matrix is always a square matrix. Recall the row operations given in Definition 1.3.2. Any elementary matrix, which we often denote by \(E\), is obtained from applying *one* row operation to the identity matrix of the same size.

For example, the matrix \[E = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \end{array} \right]\nonumber \] is the elementary matrix obtained from switching the two rows. The matrix \[E = \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{array} \right]\nonumber \] is the elementary matrix obtained from multiplying the second row of the \(3 \times 3\) identity matrix by \(3\). The matrix \[E = \left[ \begin{array}{rr} 1 & 0 \\ -3 & 1 \end{array} \right]\nonumber \] is the elementary matrix obtained from adding \(-3\) times the first row to the third row.

You may construct an elementary matrix from any row operation, but remember that you can only apply one operation.

Consider the following definition.

Let \(E\) be an \(n \times n\) matrix. Then \(E\) is an **elementary matrix** if it is the result of applying one row operation to the \(n \times n\) identity matrix \(I_n\).

Those which involve switching rows of the identity matrix are called permutation matrices.

Therefore, \(E\) constructed above by switching the two rows of \(I_2\) is called a permutation matrix.

Elementary matrices can be used in place of row operations and therefore are very useful. It turns out that multiplying (on the left hand side) by an elementary matrix \(E\) will have the same effect as doing the row operation used to obtain \(E\).

The following theorem is an important result which we will use throughout this text.

To perform any of the three row operations on a matrix \(A\) it suffices to take the product \(EA\), where \(E\) is the elementary matrix obtained by using the desired row operation on the identity matrix.

Therefore, instead of performing row operations on a matrix \(A\), we can row reduce through matrix multiplication with the appropriate elementary matrix. We will examine this theorem in detail for each of the three row operations given in Definition 1.3.2.

First, consider the following lemma.

Let \(P^{ij}\) denote the elementary matrix which involves switching the \(i^{th}\) and the \(j^{th}\) rows. Then \(P^{ij}\) is a permutation matrix and \[P^{ij}A=B\nonumber \] where \(B\) is obtained from \(A\) by switching the \(i^{th}\) and the \(j^{th}\) rows.

We will explore this idea more in the following example.

Let \[P^{12} = \left[ \begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array} \right], A = \left[ \begin{array}{cc} a & b \\ g & d \\ e & f \end{array} \right] \nonumber \]

Find \(B\) where \(B = P^{12}A\).

###### Solution

You can see that the matrix \(P^{12}\) is obtained by switching the first and second rows of the \(3 \times 3\) identity matrix \(I\).

Using our usual procedure, compute the product \(P^{12}A = B\). The result is given by

\[B =\left[ \begin{array}{cc} g & d \\ a & b \\ e & f \end{array} \right] \nonumber\]

Notice that \(B\) is the matrix obtained by switching rows \(1\) and \(2\) of \(A\). Therefore by multiplying \(A\) by \(P^{12}\), the row operation which was applied to \(I\) to obtain \(P^{12}\) is applied to \(A\) to obtain \(B\).

Theorem \(\PageIndex{1}\) applies to all three row operations, and we now look at the row operation of multiplying a row by a scalar. Consider the following lemma.

Let \(E\left( k,i\right)\) denote the elementary matrix corresponding to the row operation in which the \(i^{th}\) row is multiplied by the nonzero scalar, \(k.\) Then

\[E\left( k,i\right) A=B \nonumber\]

where \(B\) is obtained from \(A\) by multiplying the \(i^{th}\) row of \(A\) by \(k\).

We will explore this lemma further in the following example.

Let

\[E \left(5, 2\right) = \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 1 \end{array} \right], A = \left[ \begin{array}{cc} a & b \\ c & d \\ e & f \end{array} \right] \nonumber\]

Find the matrix \(B\) where \(B = E \left(5, 2\right)A\)

###### Solution

You can see that \(E \left(5, 2\right)\) is obtained by multiplying the second row of the identity matrix by \(5\).

Using our usual procedure for multiplication of matrices, we can compute the product \(E \left(5, 2\right)A\). The resulting matrix is given by

\[B =\left[ \begin{array}{cc} a & b \\ 5c & 5d \\ e & f \end{array} \right] \nonumber\]

Notice that \(B\) is obtained by multiplying the second row of \(A\) by the scalar \(5\).

There is one last row operation to consider. The following lemma discusses the final operation of adding a multiple of a row to another row.

Let \(E\left( k \times i+j\right)\) denote the elementary matrix obtained from \(I\) by adding \(k\) times the \(i^{th}\) row to the \(j^{th}\). Then

\[E\left( k \times i+j\right) A=B\nonumber \]

where \(B\) is obtained from \(A\) by adding \(k\) times the \(i^{th}\) row to the \(j^{th}\) row of \(A.\)

Consider the following example.

Let

\[E\left( 2 \times 1+3\right) = \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{array} \right], A = \left[ \begin{array}{cc} a & b \\ c & d \\ e & f \end{array} \right] \nonumber\]

Find \(B\) where \(B = E\left( 2 \times 1+3\right)A\).

###### Solution

You can see that the matrix \(E\left( 2 \times 1+3\right)\) was obtained by adding \(2\) times the first row of \(I\) to the third row of \(I\).

Using our usual procedure, we can compute the product \(E\left( 2 \times 1+3\right)A\). The resulting matrix \(B\) is given by \[B = \left[ \begin{array}{cc} a & b \\ c & d \\ 2a+e & 2b+f \end{array} \right] \nonumber\]

You can see that \(B\) is the matrix obtained by adding \(2\) times the first row of \(A\) to the third row.

Suppose we have applied a row operation to a matrix \(A\). Consider the row operation required to return \(A\) to its original form, to undo the row operation. It turns out that this action is how we find the inverse of an elementary matrix \(E\).

Consider the following theorem.

Every elementary matrix is invertible and its inverse is also an elementary matrix.

In fact, the inverse of an elementary matrix is constructed by doing the *reverse *row operation on \(I\). \(E^{-1}\) will be obtained by performing the row operation which would carry \(E\) back to \(I\).

- If \(E\) is obtained by switching rows \(i\) and \(j\), then \(E^{-1}\) is also obtained by switching rows \(i\) and \(j\).
- If \(E\) is obtained by multiplying row \(i\) by the scalar \(k\), then \(E^{-1}\) is obtained by multiplying row \(i\) by the scalar \(\frac{1}{k}\).
- If \(E\) is obtained by adding \(k\) times row \(i\) to row \(j\), then \(E^{-1}\) is obtained by subtracting \(k\) times row \(i\) from row \(j\).

Consider the following example.

Let \[E = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 2 \end{array} \right]\nonumber \]

Find \(E^{-1}\).

###### Solution

Consider the elementary matrix \(E\) given by

\[E = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 2 \end{array} \right]\nonumber \]

Here, \(E\) is obtained from the \(2 \times 2\) identity matrix by multiplying the second row by \(2\). In order to carry \(E\) back to the identity, we need to multiply the second row of \(E\) by \(\frac{1}{2}\). Hence,

\(E^{-1}\) is given by \[E^{-1} = \left[ \begin{array}{rr} 1 & 0 \\ 0 & \frac{1}{2} \end{array} \right] \nonumber\]

We can verify that \(EE^{-1}=I\). Take the product \(EE^{-1}\), given by

\[EE^{-1} = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 2 \end{array} \right] \left[ \begin{array}{rr} 1 & 0 \\ 0 & \frac{1}{2} \end{array} \right] = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \end{array} \right] \nonumber\]

This equals \(I\) so we know that we have compute \(E^{-1}\) properly.

Suppose an \(m \times n\) matrix \(A\) is row reduced to its reduced row-echelon form. By tracking each row operation completed, this row reduction can be completed through multiplication by elementary matrices.

Consider the following definition.

Let \(A\) be an \(m \times n\) matrix and let \(B\) be the reduced row-echelon form of \(A\). Then we can write \(B = UA\) where \(U\) is the product of all elementary matrices representing the row operations done to \(A\) to obtain \(B\).

Consider the following example.

Let \(A = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \\ 2 & 0 \end{array} \right]\). Find \(B\), the reduced row-echelon form of \(A\) and write it in the form \(B=UA\).

###### Solution

To find \(B\), row reduce \(A\). For each step, we will record the appropriate elementary matrix. First, switch rows \(1\) and \(2\).

\[\left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \\ 2 & 0 \end{array} \right] \rightarrow \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 2 & 0 \end{array} \right]\nonumber \]

The resulting matrix is equivalent to finding the product of \(P^{12} =\left[ \begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array} \right]\) and \(A\).

Next, add \((-2)\) times row \(1\) to row \(3\).

\[\left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 2 & 0 \end{array} \right] \rightarrow \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{array} \right]\nonumber \]

This is equivalent to multiplying by the matrix \(E(-2 \times 1 + 3) = \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{array} \right]\). Notice that the resulting matrix is \(B\), the required reduced row-echelon form of \(A\).

We can then write

\[\begin{aligned} B &= E(-2 \times 1 + 2) \left( P^{12} A \right) \\ &= \left( E(-2 \times 1 + 2) P^{12} \right) A \\ &= U A\end{aligned}\]

It remains to find the matrix \(U\).

\[\begin{aligned} U &= E(-2 \times 1 + 2) P^{12} \\ &= \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{array} \right] \left[ \begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array} \right] \\ &= \left[ \begin{array}{rrr} 0 & 1 & 0\\ 1 & 0 & 0 \\ 0 & -2 & 1 \end{array} \right]\end{aligned}\]

We can verify that \(B = UA\) holds for this matrix \(U\): \[\begin{aligned} UA &= \left[ \begin{array}{rrr} 0 & 1 & 0\\ 1 & 0 & 0 \\ 0 & -2 & 1 \end{array} \right] \left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \\ 2 & 0 \end{array} \right] \\ &= \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{array} \right] \\ &= B \end{aligned}\]

While the process used in the above example is reliable and simple when only a few row operations are used, it becomes cumbersome in a case where many row operations are needed to carry \(A\) to \(B\). The following theorem provides an alternate way to find the matrix \(U\).

Let \(A\) be an \(m \times n\) matrix and let \(B\) be its reduced row-echelon form. Then \(B = UA\) where \(U\) is an invertible \(m \times m\) matrix found by forming the matrix \(\left[ A | I_m \right]\) and row reducing to \(\left[ B | U \right]\).

Let’s revisit the above example using the process outlined in Theorem \(\PageIndex{3}\).

Let \(A = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \\ 2 & 0 \end{array}\right]\). Using the process outlined in Theorem \(\PageIndex{3}\), find \(U\) such that \(B=UA\).

###### Solution

First, set up the matrix \(\left[ A | I_m \right]\). \[\left[ \begin{array}{rr|rrr} 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 2 & 0 & 0 & 0 & 1 \end{array}\right]\nonumber \] Now, row reduce this matrix until the left side equals the reduced row-echelon form of \(A\).

\[\begin{aligned} \left[ \begin{array}{rr|rrr} 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 2 & 0 & 0 & 0 & 1 \end{array}\right] &\rightarrow \left[ \begin{array}{rr|rrr} 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 2 & 0 & 0 & 0 & 1 \end{array}\right] \\ &\rightarrow \left[ \begin{array}{rr|rrr} 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & -2 & 1 \end{array}\right]\end{aligned}\]

The left side of this matrix is \(B\), and the right side is \(U\). Comparing this to the matrix \(U\) found above in Example \(\PageIndex{5}\), you can see that the same matrix is obtained regardless of which process is used.

Recall from Algorithm 2.7.1 that an \(n \times n\) matrix \(A\) is invertible if and only if \(A\) can be carried to the \(n \times n\) identity matrix using the usual row operations. This leads to an important consequence related to the above discussion.

Suppose \(A\) is an \(n \times n\) invertible matrix. Then, set up the matrix \(\left[ A | I_n \right]\) as done above, and row reduce until it is of the form \(\left[ B | U \right]\). In this case, \(B = I_n\) because \(A\) is invertible.

\[\begin{aligned} B &= UA \\ I_n &=UA \\ U^{-1} &= A \end{aligned}\]

Now suppose that \(U = E_1 E_2 \cdots E_k\) where each \(E_i\) is an elementary matrix representing a row operation used to carry \(A\) to \(I\). Then,

\[U^{-1} = \left( E_1 E_2 \cdots E_k \right) ^{-1} = E_k^{-1} \cdots E_2^{-1} E_1{-1}\nonumber \]

Remember that if \(E_i\) is an elementary matrix, so too is \(E_i^{-1}\). It follows that

\[\begin{aligned} A&= U^{-1} \\ &= E_k^{-1} \cdots E_2^{-1} E_1{-1}\end{aligned}\]

and \(A\) can be written as a product of elementary matrices.

Let \(A\) be an \(n \times n\) matrix. Then \(A\) is invertible if and only if it can be written as a product of elementary matrices.

Consider the following example.

Let \(A = \left[ \begin{array}{rrr} 0 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & -2 & 1 \end{array} \right]\). Write \(A\) as a product of elementary matrices.

###### Solution

We will use the process outlined in Theorem \(\PageIndex{3}\) to write \(A\) as a product of elementary matrices. We will set up the matrix \(\left[ A | I \right]\) and row reduce, recording each row operation as an elementary matrix.

First:

\[\left[ \begin{array}{rrr|rrr} 0 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{rrr|rrr} 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \nonumber\]

represented by the elementary matrix \(E_1= \left[ \begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array} \right]\).

Secondly:

\[\left[ \begin{array}{rrr|rrr} 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{rrr|rrr} 1 & 0 & 0 & -1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \nonumber\]

represented by the elementary matrix \(E_2 = \left[ \begin{array}{rrr} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right]\).

Finally:

\[\left[ \begin{array}{rrr|rrr} 1 & 0& 0 & -1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{rrr|rrr} 1 & 0 & 0 &-1 & 1 & 0\\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 2 & 0 & 1 \end{array} \right] \nonumber\]

represented by the elementary matrix \(E_3= \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{array} \right]\).

Notice that the reduced row-echelon form of \(A\) is \(I\). Hence \(I = UA\) where \(U\) is the product of the above elementary matrices. It follows that \(A = U^{-1}\). Since we want to write \(A\) as a product of elementary matrices, we wish to express \(U^{-1}\) as a product of elementary matrices. \[\begin{aligned} U^{-1} &= \left( E_3 E_2 E_1 \right)^{-1}\\ &= E_1^{-1} E_2^{-1} E_3^{-1} \\ &= \left[ \begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array} \right] \left[ \begin{array}{rrr} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right] \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{array} \right] \\ &= A\end{aligned}\]

This gives \(A\) written as a product of elementary matrices. By Theorem \(\PageIndex{4}\) it follows that \(A\) is invertible.