3.1: Invertibility
\(\newcommand{\zerovec}{\mathbf 0}\) \(\newcommand{\twovec}[2]{\begin{pmatrix} #1 \\ #2 \end{pmatrix} } \) \(\newcommand{\threevec}[3]{\begin{pmatrix} #1 \\ #2 \\ #3 \end{pmatrix} } \) \(\newcommand{\fourvec}[4]{\begin{pmatrix} #1 \\ #2 \\ #3 \\ #4 \end{pmatrix} } \) \(\newcommand{\fivevec}[5]{\begin{pmatrix} #1 \\ #2 \\ #3 \\ #4 \\ #5 \end{pmatrix} } \)
In previous sections, we have found solutions to linear systems using the Gaussian elimination algorithm. We will now investigate another way of finding solutions to equations \(A\mathbf x=\mathbf b\) in the special case where the matrix \(A\) has the same number of rows and columns. To get started, let's look at some familiar examples.
Preview Activity 3.1.1.
- Explain how you would solve the equation \(3x = 5\) without using the concept of division.
- Find the \(2\times2\) matrix \(A\) that rotates vectors counterclockwise by \(90^\circ\text{.}\)
- Find the \(2\times2\) matrix \(B\) that rotates vectors clockwise by \(90^\circ\text{.}\)
- What do you expect the product \(BA\) to be? Explain the reasoning behind your expectation and then compute \(BA\) to verify it.
- Solve the equation \(A\mathbf x = \twovec{3}{-2}\) using Gaussian elimination.
- Explain why your solution may also be found by computing \(\mathbf x = B\twovec{3}{-2}\text{.}\)
Invertible matrices
The preview activity began with a familiar type of equation, \(3x = 5\text{,}\) and asked for a strategy to solve it. One possible response is to divide both sides by 3; instead, let's rephrase this as multiplying by \(3^{-1} = \frac 13\text{,}\) the multiplicative inverse of 3.
Now that we are interested in solving equations of the form \(A\mathbf x = \mathbf b\text{,}\) we might try to find a similar approach. Is there a matrix \(A^{-1}\) that plays the role of the multiplicative inverse? Of course, we can't expect every matrix to have a multiplicative inverse; after all, the real number \(0\) doesn't have an inverse. We will see, however, that many matrices do.
An \(n\times n\) matrix \(A\) is called invertible if there is a matrix \(B\) such that \(BA = I_n\text{,}\) where \(I_n\) is the \(n\times n\) identity matrix. The matrix \(B\) is called the inverse of \(A\) and denoted \(A^{-1}\text{.}\)
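In the preview activity, we considered the matrices
\begin{equation*} A = \left[\begin{array}{rr} 0 & -1 \\ 1 & 0 \\ \end{array}\right], \hspace{24pt} B = \left[\begin{array}{rr} 0 & 1 \\ -1 & 0 \\ \end{array}\right] \end{equation*}
since \(A\) rotates vectors in \(\mathbb R^2\) by \(90^\circ\) and \(B\) rotates vectors by \(-90^\circ\text{.}\) It's easy to check that
\begin{equation*} BA = \left[\begin{array}{rr} 0 & 1 \\ -1 & 0 \\ \end{array}\right] \left[\begin{array}{rr} 0 & -1 \\ 1 & 0 \\ \end{array}\right] = \left[\begin{array}{rr} 1 & 0 \\ 0 & 1 \\ \end{array}\right] = I\text{.} \end{equation*}
This shows that \(B = A^{-1}\text{.}\)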
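The preview activity also indicates how matrix inverses are used. Since we have \(A^{-1}A = I\text{,}\) we can solve the equation \(A\mathbf x = \mathbf b\) by multiplying both sides on the left by \(A^{-1}\text{:}\)
\begin{equation*} \begin{aligned} A\mathbf x & {}={} \mathbf b \\ A^{-1}A\mathbf x & {}={} A^{-1}\mathbf b \\ I\mathbf x & {}={} A^{-1}\mathbf b \\ \mathbf x & {}={} A^{-1}\mathbf b\text{.} \\ \end{aligned} \end{equation*}
Notice that this is similar to finding the solution to \(3x=5\) as \(x=\frac13\cdot 5 = \frac53\text{.}\)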
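The preview activity can be checked numerically. The following is a minimal sketch in plain Python rather than Sage; the helper names `mat_mul` and `mat_vec` are introduced here for illustration only, and the matrices are the standard \(90^\circ\) rotation matrices.

```python
# Standard 90-degree rotation matrices, stored as lists of rows.
A = [[0, -1],
     [1,  0]]    # rotates counterclockwise by 90 degrees
B = [[0,  1],
     [-1, 0]]    # rotates clockwise by 90 degrees

def mat_mul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def mat_vec(X, v):
    """Multiply a matrix by a vector stored as a flat list."""
    return [sum(X[i][k] * v[k] for k in range(len(v)))
            for i in range(len(X))]

b = [3, -2]
x = mat_vec(B, b)            # x = A^{-1} b, using B as the inverse of A

print(mat_mul(B, A))         # the product BA
print(x)                     # the candidate solution
print(mat_vec(A, x) == b)    # confirms that A x = b
```

Multiplying by \(B\) undoes the rotation performed by \(A\text{,}\) which is why \(B\twovec{3}{-2}\) agrees with the solution found by Gaussian elimination.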
Activity 3.1.2.
Let's consider the matrices
- Define these matrices in Sage and verify that \(BA = I\) so that \(B=A^{-1}\text{.}\)
- Find the solution to the equation \(A\mathbf x = \threevec{4}{-1}{4}\) using \(A^{-1}\text{.}\)
- Using your Sage cell above, multiply \(A\) and \(B\) in the opposite order; that is, what do you find when you evaluate \(AB\text{?}\)
- Suppose that \(A\) is an \(n\times n\) invertible matrix with inverse \(A^{-1}\text{.}\) This means that every equation of the form \(A\mathbf x=\mathbf b\) has a solution, namely, \(\mathbf x = A^{-1}\mathbf b\text{.}\) What can you conclude about the span of the columns of \(A\text{?}\)
- What can you conclude about the pivot positions of the matrix \(A\text{?}\)
- If \(A\) is an invertible \(4\times4\) matrix, what is its reduced row echelon form?
This activity demonstrates a few important things. First, we said that \(A\) is invertible if there is a matrix \(B\) such that \(BA = I\text{.}\) In general, multiplying matrices requires care because the product depends on the order in which the matrices are multiplied. However, in this case, we can check that \(BA = I\) implies that \(AB = I\) as well. This means that \(B\) is also invertible and that \(A=B^{-1}\text{.}\) This is the subject of Exercise 3.1.5.9.
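Also, if the matrix \(A\) is invertible, then every equation \(A\mathbf x = \mathbf b\) has a solution \(\mathbf x = A^{-1}\mathbf b\text{.}\) This means that the span of the columns of \(A\) is \(\mathbb R^n\) so that \(A\) has a pivot in every row. Since the matrix \(A\) has \(n\) rows and \(n\) columns, there must be a pivot in every row and every column. Therefore, the reduced row echelon form of \(A\) is the identity matrix:
\begin{equation*} A \sim \left[\begin{array}{rrrr} 1 & 0 & \ldots & 0 \\ 0 & 1 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & 1 \\ \end{array}\right] = I_n\text{.} \end{equation*}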
This provides us with a useful characterization of invertible matrices.
Constructing a matrix inverse
We have seen that an invertible matrix \(A\) has the property that its reduced row echelon form is the identity; that is, \(A\sim I\text{.}\) Here, we will use this fact to construct the inverse of a matrix \(A\text{.}\)
Activity 3.1.3.
In this activity, we will begin with the matrix
and construct its inverse \(A^{-1}\text{.}\) For the time being, let's denote the inverse by \(B\) so that \(B=A^{-1}\text{.}\)
-
We know that \(AB = I\text{.}\) If we write \(B = \left[\begin{array}{rr}\mathbf b_1& \mathbf b_2\end{array}\right]\text{,}\) then we have
\begin{equation*} AB = \left[\begin{array}{rr} A\mathbf b_1 & A\mathbf b_2 \end{array}\right] = \left[\begin{array}{rr} \mathbf e_1 & \mathbf e_2 \end{array}\right] = I\text{.} \end{equation*}
This means that we need to solve the equations
\begin{equation*} \begin{aligned} A\mathbf b_1 & {}={} \mathbf e_1 \\ A\mathbf b_2 & {}={} \mathbf e_2 \\ \end{aligned}\text{.} \end{equation*}Using the Sage cell below, solve these equations for the columns of \(B\text{.}\)
- What is the matrix \(B\text{?}\) Check that \(AB = I\) and \(BA = I\text{.}\)
-
To find the columns of \(B\text{,}\) we solved two equations, \(A\mathbf b_1=\mathbf e_1\) and \(A\mathbf b_2=\mathbf e_2\text{.}\) We could do this by augmenting \(A\) two separate times, forming matrices
\begin{equation*} \begin{aligned} \left[\begin{array}{r|r} A & \mathbf e_1 \end{array}\right] & \\ \left[\begin{array}{r|r} A & \mathbf e_2 \end{array}\right] & \\ \end{aligned} \end{equation*}
and finding their reduced row echelon forms. But instead of solving these two equations separately, we could also solve them together by forming the augmented matrix \(\left[\begin{array}{r|rr} A & \mathbf e_1 & \mathbf e_2 \end{array}\right]\) and finding its reduced row echelon form. In other words, we augment \(A\) by the matrix \(I\) to form \(\left[\begin{array}{r|r} A & I \end{array} \right] \text{.}\)
Form this augmented matrix and find its reduced row echelon form to find \(A^{-1}\text{.}\)
Assuming \(A\) is invertible, we have shown that
\begin{equation*} \left[\begin{array}{r|r} A & I \end{array}\right] \sim \left[\begin{array}{r|r} I & A^{-1} \end{array}\right]\text{.} \end{equation*}
-
If you have defined a matrix \(A\) in Sage, you can find its inverse as A.inverse(). Use Sage to find the inverse of the matrix
\begin{equation*} A = \left[\begin{array}{rrr} 1 & -2 & -1 \\ -1 & 5 & 6 \\ 5 & -4 & 6 \\ \end{array}\right]\text{.} \end{equation*}
-
What happens when we try to find the inverse of the matrix
\begin{equation*} \left[\begin{array}{rr} -4 & 2 \\ -2 & 1 \\ \end{array}\right]\text{?} \end{equation*}
- Suppose that \(n\times n\) matrices \(C\) and \(D\) are both invertible. What do you find when you simplify the product \((D^{-1}C^{-1})(CD)\text{?}\) Explain why the product \(CD\) is invertible and \((CD)^{-1} = D^{-1}C^{-1}\text{.}\)
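Finding the inverse of an \(n\times n\) matrix \(A\) requires us to solve \(n\) equations. If we write the inverse as
\begin{equation*} B = \left[\begin{array}{rrrr} \mathbf b_1 & \mathbf b_2 & \ldots & \mathbf b_n \end{array}\right]\text{,} \end{equation*}
then we need to solve
\begin{equation*} \begin{aligned} A\mathbf b_1 & {}={} \mathbf e_1 \\ A\mathbf b_2 & {}={} \mathbf e_2 \\ & \vdots \\ A\mathbf b_n & {}={} \mathbf e_n\text{.} \\ \end{aligned} \end{equation*}
We can, of course, solve each equation separately, but it is more efficient to bundle the equations together by forming the augmented matrix \(\left[\begin{array}{r|r} A & I \end{array}\right]\) and finding its reduced row echelon form. We then find
\begin{equation*} \left[\begin{array}{r|r} A & I \end{array}\right] \sim \left[\begin{array}{r|r} I & A^{-1} \end{array}\right]\text{.} \end{equation*}
We saw earlier that, if \(A\) has an inverse, then \(A\sim I\text{.}\) We have now seen that, if \(A\sim I\text{,}\) then \(A\) has an inverse.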
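The procedure of row reducing \(\left[\begin{array}{r|r} A & I \end{array}\right]\) can be sketched in code. The version below is a minimal illustration in plain Python with exact fractions, not the routine Sage uses internally; the function names `inverse` and `mat_mul` are chosen here for illustration. It returns `None` when some column has no pivot, which is the case where \(A\) is not row equivalent to \(I\text{.}\)

```python
from fractions import Fraction

def inverse(A):
    """Row reduce [A | I]; return the right-hand block, or None if A is not invertible."""
    n = len(A)
    # Build the augmented matrix [A | I] with exact rational entries.
    M = [[Fraction(A[i][j]) for j in range(n)] +
         [Fraction(int(i == j)) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        # Look for a nonzero entry in this column, at or below the diagonal.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                          # no pivot in this column
        M[col], M[pivot] = M[pivot], M[col]      # row interchange
        p = M[col][col]
        M[col] = [entry / p for entry in M[col]] # scale so the pivot is 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                # Row replacement: clear the rest of the pivot column.
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    # The left block is now I, so the right block is the inverse.
    return [row[n:] for row in M]

def mat_mul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[1, -2, -1],
     [-1, 5, 6],
     [5, -4, 6]]
A_inv = inverse(A)
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(mat_mul(A, A_inv) == I3)         # checks A A^{-1} = I
print(mat_mul(A_inv, A) == I3)         # checks A^{-1} A = I
print(inverse([[-4, 2], [-2, 1]]))     # a matrix whose second column has no pivot
```

Using `Fraction` keeps the arithmetic exact, so the comparisons with the identity matrix are not affected by rounding.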
Finally, we see that the product of two invertible matrices \(A\) and \(B\) is also invertible. This is because
\begin{equation*} (B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I\text{.} \end{equation*}
Therefore, we have \((AB)^{-1} = B^{-1}A^{-1}\text{.}\) Because the matrix product depends on the order in which we multiply matrices, use care when applying this relationship. The inverse of a product is the product of the inverses with the order of multiplication reversed.
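A small numerical check shows why the order matters. The matrices below are illustrative choices that do not come from the text, and `mat_mul` is a helper written for this sketch.

```python
def mat_mul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

I2 = [[1, 0], [0, 1]]

# Illustrative invertible matrices together with their inverses.
A     = [[1, 1], [0, 1]]
A_inv = [[1, -1], [0, 1]]
B     = [[1, 0], [1, 1]]
B_inv = [[1, 0], [-1, 1]]

AB = mat_mul(A, B)

# Reversed order: B^{-1} A^{-1} is the inverse of AB.
print(mat_mul(mat_mul(B_inv, A_inv), AB) == I2)   # True

# Same order: A^{-1} B^{-1} is not the inverse of AB for these matrices.
print(mat_mul(mat_mul(A_inv, B_inv), AB) == I2)   # False
```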
Properties of invertible matrices.
- An \(n\times n\) matrix \(A\) is invertible if and only if \(A\sim I\text{.}\)
- If \(A\) is invertible, then the solution to the equation \(A\mathbf x = \mathbf b\) is given by \(\mathbf x = A^{-1}\mathbf b\text{.}\)
-
We can find \(A^{-1}\) by finding the reduced row echelon form of \(\left[\begin{array}{r|r} A & I \end{array}\right]\text{;}\) namely,
\begin{equation*} \left[\begin{array}{r|r} A & I \end{array}\right] \sim \left[\begin{array}{r|r} I & A^{-1} \end{array}\right]\text{.} \end{equation*}
- If \(A\) and \(B\) are two invertible \(n\times n\) matrices, then their product \(AB\) is also invertible and \((AB)^{-1} = B^{-1}A^{-1}\text{.}\)
There is a simple formula for finding the inverse of a \(2\times2\) matrix:
\begin{equation*} \left[\begin{array}{rr} a & b \\ c & d \\ \end{array}\right]^{-1} = \frac{1}{ad-bc} \left[\begin{array}{rr} d & -b \\ -c & a \\ \end{array}\right]\text{,} \end{equation*}
which can be easily checked. The condition that \(A\) be invertible is, in this case, reduced to the condition that \(ad-bc\neq 0\text{.}\) We will understand this condition better once we have explored determinants in Section 3.4. There is a similar formula for the inverse of a \(3\times 3\) matrix, but there is not a good reason to write it here.
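The \(2\times2\) formula translates directly into a few lines of code. This is a minimal sketch in plain Python that assumes integer entries; the function name `inverse_2x2` is hypothetical.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Invert [[a, b], [c, d]] with integer entries; return None when ad - bc = 0."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        return None                 # the matrix is not invertible
    # Swap the diagonal entries, negate the off-diagonal ones, divide by ad - bc.
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

rotation = [[0, -1], [1, 0]]        # counterclockwise rotation by 90 degrees
print(inverse_2x2(rotation) == [[0, 1], [-1, 0]])   # True: the clockwise rotation
print(inverse_2x2([[-4, 2], [-2, 1]]))              # None, since (-4)(1) - (2)(-2) = 0
```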
Triangular matrices and Gaussian elimination
Generally speaking, solving an equation \(A\mathbf x=\mathbf b\) by first finding \(A^{-1}\) and then evaluating \(\mathbf x = A^{-1}\mathbf b\) is not the best strategy since row reducing the augmented matrix \(\left[\begin{array}{r|r} A & \mathbf b \end{array}\right]\) involves considerably less work. This becomes clear once we remember that finding the inverse \(A^{-1}\) requires us to solve \(n\) equations of this form.
For the class of triangular matrices, however, finding inverses is relatively efficient and useful, as we will see in Section 5.1.
We say that a matrix \(A\) is lower triangular if all its entries above the diagonal are zero. Similarly, \(A\) is upper triangular if all the entries below the diagonal are zero.
For example, the matrix \(L\) below is a lower triangular matrix while \(U\) is an upper triangular one.
We can develop a simple test to determine whether an \(n\times n\) lower triangular matrix is invertible. Let's use Gaussian elimination to find the reduced row echelon form of the lower triangular matrix
Because the entries on the diagonal are nonzero, we find a pivot position in every row, which tells us that the matrix is invertible. If, however, there is a zero entry on the diagonal, the matrix cannot be invertible. Considering the matrix below, we see that having a zero on the diagonal leads to a row without a pivot position.
An \(n\times n\) triangular matrix is invertible if and only if the entries on the diagonal are all nonzero.
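This test is easy to express in code. The sketch below is plain Python; the function names and the two sample matrices are illustrative choices and are not taken from the text.

```python
def is_triangular(M):
    """Return True if the square matrix M is lower or upper triangular."""
    n = len(M)
    lower = all(M[i][j] == 0 for i in range(n) for j in range(i + 1, n))
    upper = all(M[i][j] == 0 for i in range(n) for j in range(i))
    return lower or upper

def triangular_is_invertible(M):
    """Apply the diagonal test; it is valid only for triangular matrices."""
    if not is_triangular(M):
        raise ValueError("the diagonal test applies only to triangular matrices")
    return all(M[i][i] != 0 for i in range(len(M)))

# Illustrative lower triangular matrices.
L_good = [[1, 0, 0],
          [-2, -2, 0],
          [3, 4, -4]]
L_bad = [[1, 0, 0],
         [-2, 0, 0],     # zero on the diagonal
         [3, 4, -4]]

print(triangular_is_invertible(L_good))   # True
print(triangular_is_invertible(L_bad))    # False
```

The check that the matrix is triangular matters: a general square matrix with nonzero diagonal entries need not be invertible.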
Up to this point, our primary tool for studying linear systems, sets of vectors, and matrices has been Gaussian elimination. As the next activity demonstrates, we can express the row operations performed in Gaussian elimination in terms of matrix multiplication. In Section 5.1, we will use this observation to create an efficient way to solve equations of the form \(A\mathbf x=\mathbf b\text{.}\)
Activity 3.1.4.
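As an example, we will consider the matrix
\begin{equation*} A = \left[\begin{array}{rrr} 1 & 2 & 1 \\ 2 & 0 & -2 \\ -1 & 2 & -1 \\ \end{array}\right]\text{.} \end{equation*}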
When performing Gaussian elimination on \(A\text{,}\) we first apply a row replacement operation in which we multiply the first row by \(-2\) and add to the second row. After this step, we have a new matrix \(A_1\text{.}\)
-
Show that multiplying \(A\) by the lower triangular matrix
\begin{equation*} L_1 = \left[\begin{array}{rrr} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \\ \end{array}\right] \end{equation*}
has the same effect as this row operation; that is, show that \(L_1A = A_1\text{.}\)
- Explain why \(L_1\) is invertible and find its inverse \(L_1^{-1}\text{.}\)
- You should see that there is a simple relationship between \(L_1\) and \(L_1^{-1}\text{.}\) Describe this relationship and explain why it holds.
-
To continue the Gaussian elimination algorithm, we need to apply two more row replacements to bring \(A\) into a triangular form \(U\) where
\begin{equation*} A = \left[\begin{array}{rrr} 1 & 2 & 1 \\ 2 & 0 & -2 \\ -1 & 2 & -1 \\ \end{array}\right] \sim \left[\begin{array}{rrr} 1 & 2 & 1 \\ 0 & -4 & -4 \\ 0 & 0 & -4 \\ \end{array}\right] = U\text{.} \end{equation*}
Find the matrices \(L_2\) and \(L_3\) that perform these row replacement operations so that \(L_3L_2L_1 A = U\text{.}\)
- Explain why the matrix product \(L_3L_2L_1\) is invertible and use this fact to write \(A = LU\text{.}\) What is the matrix \(L\) that you find? Why do you think we denote it by \(L\text{?}\)
-
Row replacement operations may always be performed by multiplying by a lower triangular matrix. It turns out the other two row operations, scaling and interchange, may also be performed using matrix multiplication. For instance, consider the two matrices
\begin{equation*} S = \left[\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \\ \end{array}\right], \hspace{24pt} P = \left[\begin{array}{rrr} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \\ \end{array}\right]\text{.} \end{equation*}
Show that multiplying \(A\) by \(S\) performs a scaling operation and that multiplying by \(P\) performs a row interchange.
- Explain why the matrices \(S\) and \(P\) are invertible and state their inverses.
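One way to check your work in this activity is to carry out the multiplications by machine. The sketch below uses plain Python rather than Sage. The matrices \(A\text{,}\) \(U\text{,}\) \(L_1\text{,}\) \(S\text{,}\) and \(P\) are those given in the activity, while `L2` and `L3` are one possible choice of the remaining row replacements, and `mat_mul` is a helper written for this example.

```python
def mat_mul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[1, 2, 1], [2, 0, -2], [-1, 2, -1]]
U = [[1, 2, 1], [0, -4, -4], [0, 0, -4]]

# Row replacements written as lower triangular matrices.
L1 = [[1, 0, 0], [-2, 1, 0], [0, 0, 1]]   # add -2 * (row 1) to row 2
L2 = [[1, 0, 0], [0, 1, 0], [1, 0, 1]]    # add  1 * (row 1) to row 3
L3 = [[1, 0, 0], [0, 1, 0], [0, 1, 1]]    # add  1 * (row 2) to row 3

# Each inverse undoes its replacement: negate the off-diagonal multiplier.
L1_inv = [[1, 0, 0], [2, 1, 0], [0, 0, 1]]
L2_inv = [[1, 0, 0], [0, 1, 0], [-1, 0, 1]]
L3_inv = [[1, 0, 0], [0, 1, 0], [0, -1, 1]]

# (L3 L2 L1)^{-1} = L1^{-1} L2^{-1} L3^{-1}, with the order reversed.
L = mat_mul(L1_inv, mat_mul(L2_inv, L3_inv))

print(mat_mul(L3, mat_mul(L2, mat_mul(L1, A))) == U)   # True
print(L)                                               # a lower triangular matrix
print(mat_mul(L, U) == A)                              # True, so A = LU

# Scaling and interchange are matrix multiplications as well.
S = [[1, 0, 0], [0, 3, 0], [0, 0, 1]]   # scales row 2 by 3
P = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]   # interchanges rows 1 and 3
print(mat_mul(S, A))
print(mat_mul(P, A))
```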
We will demonstrate the ideas in this activity again using the matrix
After performing three row replacement operations, we find the row equivalent upper triangular matrix \(U\text{:}\)
The first row replacement operation multiplies the first row by \(3\) and adds the result to the second row. We can perform this operation by multiplying \(A\) by the lower triangular matrix \(L_1\) where
The next two row replacement operations are performed by the matrices
so that \(L_3L_2L_1A = U\text{.}\)
Notice that the inverse of \(L_1\) has the simple form:
This makes sense; if we want to undo the operation of multiplying the first row by \(3\) and adding to the second row, we should multiply the first row by \(-3\) and add it to the second row. This is the effect of \(L_1^{-1}\text{.}\)
The other row operations we use in implementing Gaussian elimination can also be performed by multiplying by an invertible matrix. In particular, if we scale a row by a nonzero number \(s\text{,}\) we can undo this operation by scaling by \(\frac 1s\text{.}\) This leads to the invertible diagonal matrices, such as
Similarly, a row interchange leads to a matrix \(P\text{,}\) which is its own inverse. An example is
Summary
In this section, we found conditions guaranteeing that a matrix has an inverse. When these conditions hold, we also found an algorithm for finding the inverse.
- The \(n\times n\) matrix \(A\) is invertible if and only if it is row equivalent to \(I_n\text{,}\) the \(n\times n\) identity matrix.
- If a matrix \(A\) is invertible, then the solution to the equation \(A\mathbf x = \mathbf b\) is \(\mathbf x = A^{-1}\mathbf b\text{.}\)
-
If a matrix \(A\) is invertible, we can use Gaussian elimination to find its inverse:
\begin{equation*} \left[\begin{array}{r|r} A & I \end{array}\right] \sim \left[\begin{array}{r|r} I & A^{-1} \end{array}\right]\text{.} \end{equation*}
- The row operations used in performing Gaussian elimination can be performed by multiplying by invertible matrices. More specifically, a row replacement operation may be performed by multiplying by an invertible lower triangular matrix.
Exercises 3.1.5
Consider the matrix
- Explain why \(A\) has an inverse.
- Find the inverse of \(A\) by augmenting by the identity \(I\) to form \(\left[\begin{array}{r|r}A & I \end{array}\right]\text{.}\)
- Use your inverse to solve the equation \(A\mathbf x = \fourvec{3}{2}{-3}{-1}\text{.}\)
In this exercise, we will consider \(2\times 2\) matrices as defining linear transformations.
- Write the matrix \(A\) that performs a \(45^\circ\) rotation. What geometric operation undoes this rotation? Find the matrix that performs this operation and verify that it is \(A^{-1}\text{.}\)
- Write the matrix \(A\) that performs a \(180^\circ\) rotation. Verify that \(A^2 = I\) so that \(A^{-1} = A\text{,}\) and explain geometrically why this is the case.
- Find three more matrices \(A\) that satisfy \(A^2 = I\text{.}\)
Suppose that \(A\) is an \(n\times n\) matrix.
- Suppose that \(A^2 = AA\) is invertible with inverse \(B\text{.}\) This means that \(BA^2 = BAA = I\text{.}\) Explain why \(A\) must be invertible with inverse \(BA\text{.}\)
- Suppose that \(A^{100}\) is invertible with inverse \(B\text{.}\) Explain why \(A\) is invertible. What is \(A^{-1}\) in terms of \(A\) and \(B\text{?}\)
Our definition of an invertible matrix requires that \(A\) be a square \(n\times n\) matrix. Let's examine what happens when \(A\) is not square. For instance, suppose that
-
Verify that \(BA = I_2\text{.}\) In this case, we say that \(B\) is a left inverse of \(A\text{.}\)
-
If \(A\) has a left inverse \(B\text{,}\) we can still use it to find solutions to linear equations. If we know there is a solution to the equation \(A\mathbf x = \mathbf b\text{,}\) we can multiply both sides of the equation by \(B\) to find \(\mathbf x = B\mathbf b\text{.}\)
Suppose you know there is a solution to the equation \(A\mathbf x = \threevec{-1}{-3}{6}\text{.}\) Use the left inverse \(B\) to find \(\mathbf x\) and verify that it is a solution.
-
Now consider the matrix
\begin{equation*} C = \left[\begin{array}{rrr} 1 & -1 & 0 \\ -2 & 1 & 0 \\ \end{array}\right] \end{equation*}
and verify that \(C\) is also a left inverse of \(A\text{.}\) This shows that the matrix \(A\) may have more than one left inverse.
- When \(A\) is a square matrix, we said that \(BA=I\) implies that \(AB=I\text{.}\) In this problem, we have a non-square matrix \(A\) with \(BA = I\text{.}\) What happens when we compute \(AB\text{?}\)
If a matrix \(A\) is invertible, there is a sequence of row operations that transforms \(A\) into the identity matrix \(I\text{.}\) We have seen that every row operation can be performed by matrix multiplication. If the \(j^{th}\) step in the Gaussian elimination process is performed by multiplying by \(E_j\text{,}\) then, after \(p\) steps, we have
\begin{equation*} E_p\cdots E_2E_1 A = I\text{,} \end{equation*}
which means that
\begin{equation*} A^{-1} = E_p\cdots E_2E_1\text{.} \end{equation*}
For each of the following matrices, find a sequence of row operations that transforms the matrix to the identity \(I\text{.}\) Write the matrices \(E_j\) that perform the steps and use them to find \(A^{-1}\text{.}\)
-
\begin{equation*} A = \left[\begin{array}{rrr} 0 & 2 & 0 \\ -3 & 0 & 0 \\ 0 & 0 & 1 \\ \end{array}\right]\text{.} \end{equation*}
-
\begin{equation*} A = \left[\begin{array}{rrrr} 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 0 & -3 & 1 & 0 \\ 0 & 0 & 2 & 1 \\ \end{array}\right]\text{.} \end{equation*}
-
\begin{equation*} A = \left[\begin{array}{rrr} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \\ \end{array}\right]\text{.} \end{equation*}
Determine whether the following statements are true or false and explain your reasoning.
- If \(A\) is invertible, then the columns of \(A\) are linearly independent.
- If \(A\) is a square matrix whose diagonal entries are all nonzero, then \(A\) is invertible.
- If \(A\) is an invertible \(n\times n\) matrix, then the columns of \(A\) span \(\mathbb R^n\text{.}\)
- If \(A\) is invertible, then there is a nontrivial solution to the homogeneous equation \(A\mathbf x = \zerovec\text{.}\)
- If \(A\) is an \(n\times n\) matrix and the equation \(A\mathbf x = \mathbf b\) has a solution for every vector \(\mathbf b\text{,}\) then \(A\) is invertible.
Provide a justification for your response to the following questions.
- Suppose that \(A\) is a square matrix with two identical columns. Can \(A\) be invertible?
- Suppose that \(A\) is a square matrix with two identical rows. Can \(A\) be invertible?
- Suppose that \(A\) is an invertible matrix and that \(AB = AC\text{.}\) Can you conclude that \(B = C\text{?}\)
- Suppose that \(A\) is an invertible \(n\times n\) matrix. What can you say about the span of the columns of \(A^{-1}\text{?}\)
- Suppose that \(A\) is an invertible matrix and that \(B\) is row equivalent to \(A\text{.}\) Can you guarantee that \(B\) is invertible?
Suppose that we start with the \(3\times3\) matrix \(A\) and perform the following sequence of row operations:
- Multiply row 1 by -2 and add to row 2.
- Multiply row 1 by 4 and add to row 3.
- Scale row 2 by \(1/2\text{.}\)
- Multiply row 2 by -1 and add to row 3.
Suppose we arrive at the upper triangular matrix
- Write the matrices \(E_1\text{,}\) \(E_2\text{,}\) \(E_3\text{,}\) and \(E_4\) that perform the four row operations.
- Find the matrix \(E = E_4E_3E_2E_1\text{.}\)
- We then have \(E_4E_3E_2E_1 A = EA = U\text{.}\) Now that we have the matrix \(E\text{,}\) find the original matrix \(A = E^{-1}U\text{.}\)
We defined an \(n\times n\) matrix \(A\) to be invertible if there is a matrix \(B\) such that \(BA=I_n\text{.}\) In this exercise, we will explain why \(B\) is also invertible and why \(AB = I\text{.}\) This means that, if \(B=A^{-1}\text{,}\) then \(A = B^{-1}\text{.}\)
- Given the fact that \(BA = I_n\text{,}\) explain why the matrix \(B\) must also be a square \(n\times n\) matrix.
- Suppose that \(\mathbf b\) is a vector in \(\mathbb R^n\text{.}\) Since we have \(BA = I\text{,}\) it follows that \(B(A\mathbf b) = \mathbf b\text{.}\) Use this to explain why the columns of \(B\) span \(\mathbb R^n\text{.}\) What does this say about the pivot positions of \(B\text{?}\)
- Explain why the equation \(B\mathbf x = \zerovec\) has only the trivial solution.
-
Beginning with the equation \(BA=I\text{,}\) multiply both sides on the right by \(B\) to obtain \(BAB = B\text{.}\) We will rearrange this equation:
\begin{equation*} \begin{aligned} BAB & {}={} B \\ BAB - B& {}={} 0 \\ B(AB-I) & {}={} 0\text{.} \\ \end{aligned} \end{equation*}
Since the homogeneous equation \(B\mathbf x =\zerovec\) has only the trivial solution, explain why \(AB-I = 0\) and therefore, \(AB = I\text{.}\)