2.5: Matrix Inverses
Three basic operations on matrices, addition, multiplication, and subtraction, are analogs for matrices of the same operations for numbers. In this section we introduce the matrix analog of numerical division.
To begin, consider how a numerical equation \(ax = b\) is solved when \(a\) and \(b\) are known numbers. If \(a = 0\), there is no solution (unless \(b = 0\)). But if \(a \neq 0\), we can multiply both sides by the inverse \(a^{-1} = \frac{1}{a}\) to obtain the solution \(x = a^{-1}b\). Of course multiplying by \(a^{-1}\) is just dividing by \(a\), and the property of \(a^{-1}\) that makes this work is that \(a^{-1}a = 1\). Moreover, we saw in Section [sec:2_2] that the role that \(1\) plays in arithmetic is played in matrix algebra by the identity matrix \(I\). This suggests the following definition.
Definition 004202 (Matrix Inverses). If \(A\) is a square matrix, a matrix \(B\) is called an inverse of \(A\) if and only if
\[AB = I \quad \mbox{ and } \quad BA = I \nonumber \]
A matrix \(A\) that has an inverse is called an invertible matrix.
Example 004207. Show that \(B = \left[ \begin{array}{rr} -1 & 1 \\ 1 & 0 \end{array} \right]\) is an inverse of \(A = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 1 \end{array} \right]\).
Compute \(AB\) and \(BA\).
\[AB = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 1 \end{array} \right] \left[ \begin{array}{rr} -1 & 1 \\ 1 & 0 \end{array} \right] = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \end{array} \right] \quad BA = \left[ \begin{array}{rr} -1 & 1 \\ 1 & 0 \end{array} \right] \left[ \begin{array}{rr} 0 & 1 \\ 1 & 1 \end{array} \right] = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \end{array} \right] \nonumber \]
Hence \(AB = I = BA\), so \(B\) is indeed an inverse of \(A\).
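For readers who wish to experiment, the same check can be carried out numerically. The following NumPy sketch (added for illustration; it is not part of the original text) forms both products and prints the identity matrix twice:

```python
import numpy as np

A = np.array([[0, 1],
              [1, 1]])
B = np.array([[-1, 1],
              [1, 0]])

# Both products should equal the 2x2 identity matrix.
print(A @ B)   # [[1 0] [0 1]]
print(B @ A)   # [[1 0] [0 1]]
```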
Example 004217. Show that \(A = \left[ \begin{array}{rr} 0 & 0 \\ 1 & 3 \end{array} \right]\) has no inverse.
Let \(B = \left[ \begin{array}{rr} a & b \\ c & d \end{array} \right]\) denote an arbitrary \(2 \times 2\) matrix. Then
\[AB = \left[ \begin{array}{rr} 0 & 0 \\ 1 & 3 \end{array} \right] \left[ \begin{array}{rr} a & b \\ c & d \end{array} \right] = \left[ \begin{array}{cc} 0 & 0 \\ a + 3c & b + 3d \end{array} \right] \nonumber \]
so \(AB\) has a row of zeros. Hence \(AB\) cannot equal \(I\) for any \(B\).
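A quick numerical illustration (a NumPy sketch, not from the text): the determinant of this \(A\) is zero, and asking NumPy for an inverse raises an error.

```python
import numpy as np

A = np.array([[0, 0],
              [1, 3]], dtype=float)

print(np.linalg.det(A))        # 0.0 -- a zero determinant signals no inverse
try:
    np.linalg.inv(A)
except np.linalg.LinAlgError as err:
    print("inverse does not exist:", err)   # "Singular matrix"
```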
The argument in Example [exa:004217] shows that no zero matrix has an inverse. But Example [exa:004217] also shows that, unlike arithmetic, it is possible for a nonzero matrix to have no inverse. However, if a matrix does have an inverse, it has only one.
Theorem 004227. If \(B\) and \(C\) are both inverses of \(A\), then \(B = C\).
Since \(B\) and \(C\) are both inverses of \(A\), we have \(CA = I = AB\). Hence
\[B = IB = (CA)B = C(AB) = CI = C \nonumber \]
If \(A\) is an invertible matrix, the (unique) inverse of \(A\) is denoted \(A^{-1}\). Hence \(A^{-1}\) (when it exists) is a square matrix of the same size as \(A\) with the property that
\[AA^{-1} = I \quad \mbox{ and } \quad A^{-1}A = I \nonumber \]
These equations characterize \(A^{-1}\) in the following sense:
Inverse Criterion: If somehow a matrix \(B\) can be found such that \(AB = I\) and \(BA = I\), then \(A\) is invertible and \(B\) is the inverse of \(A\); in symbols, \(B = A^{-1}\).
This is a way to verify that the inverse of a matrix exists. Example [exa:004241] and Example [exa:004261] offer illustrations.
Example 004241. If \(A = \left[ \begin{array}{rr} 0 & -1 \\ 1 & -1 \end{array} \right]\), show that \(A^{3} = I\) and so find \(A^{-1}\).
We have \(A^{2} = \left[ \begin{array}{rr} 0 & -1 \\ 1 & -1 \end{array} \right] \left[ \begin{array}{rr} 0 & -1 \\ 1 & -1 \end{array} \right] = \left[ \begin{array}{rr} -1 & 1 \\ -1 & 0 \end{array} \right]\), and so
\[A^{3} = A^{2}A = \left[ \begin{array}{rr} -1 & 1 \\ -1 & 0 \end{array} \right] \left[ \begin{array}{rr} 0 & -1 \\ 1 & -1 \end{array} \right] = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \end{array} \right] = I \nonumber \]
Hence \(A^{3} = I\), as asserted. This can be written as \(A^{2}A = I = AA^{2}\), so it shows that \(A^{2}\) is the inverse of \(A\). That is, \(A^{-1} = A^{2} = \left[ \begin{array}{rr} -1 & 1 \\ -1 & 0 \end{array} \right]\).
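The same computation can be confirmed numerically; the short NumPy sketch below (added for illustration) checks that \(A^{3} = I\) and prints \(A^{2}\), which is \(A^{-1}\).

```python
import numpy as np

A = np.array([[0, -1],
              [1, -1]])
A2 = A @ A

print(np.array_equal(A2 @ A, np.eye(2, dtype=int)))  # True: A^3 = I
print(A2)                                            # A^{-1} = [[-1, 1], [-1, 0]]
```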
The next example presents a useful formula for the inverse of a \(2 \times 2\) matrix \(A = \left[ \begin{array}{cc} a & b \\ c & d \end{array} \right]\) when it exists. To state it, we define the determinant \(\det A\) and the adjugate \(\text{adj} A\) of the matrix \(A\) as follows:
\[\det \left[ \begin{array}{cc} a & b \\ c & d \end{array} \right] = ad - bc, \quad \mbox{ and } \quad \text{adj} \left[ \begin{array}{cc} a & b \\ c & d \end{array} \right] = \left[ \begin{array}{rr} d & -b \\ -c & a \end{array} \right] \nonumber \]
Example 004261. If \(A = \left[ \begin{array}{cc} a & b \\ c & d \end{array} \right]\), show that \(A\) has an inverse if and only if \(\det A \neq 0\), and in this case
\[A^{-1} = \frac{1}{\det A} \text{adj} A \nonumber \]
For convenience, write \(e = \det A = ad - bc\) and \(B = \text{adj} A = \left[ \begin{array}{rr} d & -b \\ -c & a \end{array} \right]\). Then \(AB = eI = BA\) as the reader can verify. So if \(e \neq 0\), scalar multiplication by \(\frac{1}{e}\) gives
\[A(\frac{1}{e}B) = I = (\frac{1}{e}B)A \nonumber \]
Hence \(A\) is invertible and \(A^{-1} = \frac{1}{e}B\). Thus it remains only to show that if \(A^{-1}\) exists, then \(e \neq 0\).
We prove this by showing that assuming \(e = 0\) leads to a contradiction. In fact, if \(e = 0\), then \(AB = eI = 0\), so left multiplication by \(A^{-1}\) gives \(A^{-1}AB = A^{-1}0\); that is, \(IB = 0\), so \(B = 0\). But this implies that \(a\), \(b\), \(c\), and \(d\) are all zero, so \(A = 0\), contrary to the assumption that \(A^{-1}\) exists.
As an illustration, if \(A = \left[ \begin{array}{rr} 2 & 4 \\ -3 & 8 \end{array} \right]\) then \(\det A = 2 \cdot 8 - 4 \cdot (-3) = 28 \neq 0\). Hence \(A\) is invertible and \(A^{-1} = \frac{1}{\det A} \text{adj} A = \frac{1}{28} \left[ \begin{array}{rr} 8 &-4 \\ 3 & 2 \end{array} \right]\), as the reader is invited to verify.
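The formula of Example [exa:004261] is easy to implement directly. The following Python function (an illustrative sketch, not part of the text) inverts a \(2 \times 2\) matrix via the determinant and adjugate, and tests it on the matrix above.

```python
import numpy as np

def inverse_2x2(A):
    """Invert a 2x2 matrix using the determinant/adjugate formula of Example 004261."""
    a, b = A[0]
    c, d = A[1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible (det = 0)")
    adj = np.array([[d, -b],
                    [-c, a]], dtype=float)
    return adj / det

A = np.array([[2, 4],
              [-3, 8]], dtype=float)
print(inverse_2x2(A))          # (1/28) * [[8, -4], [3, 2]]
print(A @ inverse_2x2(A))      # the identity matrix (up to rounding)
```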
The determinant and adjugate will be defined in Chapter [chap:3] for any square matrix, and the conclusions in Example [exa:004261] will be proved in full generality.
Inverses and Linear Systems
Matrix inverses can be used to solve certain systems of linear equations. Recall that a system of linear equations can be written as a single matrix equation
\[A\mathbf{x} = \mathbf{b} \nonumber \]
where \(A\) and \(\mathbf{b}\) are known and \(\mathbf{x}\) is to be determined. If \(A\) is invertible, we multiply each side of the equation on the left by \(A^{-1}\) to get
\[\begin{aligned} A^{-1}A\mathbf{x} &= A^{-1}\mathbf{b} \\ I\mathbf{x} &= A^{-1}\mathbf{b} \\ \mathbf{x} &= A^{-1}\mathbf{b}\end{aligned} \nonumber \]
This gives the solution to the system of equations (the reader should verify that \(\mathbf{x} = A^{-1}\mathbf{b}\) really does satisfy \(A\mathbf{x} = \mathbf{b}\)). Furthermore, the argument shows that if \(\mathbf{x}\) is any solution, then necessarily \(\mathbf{x} = A^{-1}\mathbf{b}\), so the solution is unique. Of course the technique works only when the coefficient matrix \(A\) has an inverse. This proves Theorem [thm:004292].
Theorem 004292. Suppose a system of \(n\) equations in \(n\) variables is written in matrix form as
\[A\mathbf{x} = \mathbf{b} \nonumber \]
If the \(n \times n\) coefficient matrix \(A\) is invertible, the system has the unique solution
\[\mathbf{x} = A^{-1}\mathbf{b} \nonumber \]
Example 004298. Use Example [exa:004261] to solve the system \(\left\lbrace \begin{array}{rrrrr} 5x_{1} & - & 3x_{2} & = & -4 \\ 7x_{1} & + & 4x_{2} & = & 8 \end{array} \right.\).
In matrix form this is \(A\mathbf{x} = \mathbf{b}\) where \(A = \left[ \begin{array}{rr} 5 & -3 \\ 7 & 4 \end{array} \right]\), \(\mathbf{x} = \left[ \begin{array}{c} x_{1} \\ x_{2} \end{array} \right]\), and \(\mathbf{b} = \left[ \begin{array}{r} -4 \\ 8 \end{array} \right]\). Then \(\det A = 5 \cdot 4 - (-3) \cdot 7 = 41\), so \(A\) is invertible and \(A^{-1} = \frac{1}{41} \left[ \begin{array}{rr} 4 & 3 \\ -7 & 5 \end{array} \right]\) by Example [exa:004261]. Thus Theorem [thm:004292] gives
\[\mathbf{x} = A^{-1}\mathbf{b} = \frac{1}{41} \left[ \begin{array}{rr} 4 & 3 \\ -7 & 5 \end{array} \right] \left[ \begin{array}{r} -4 \\ 8 \end{array} \right] = \frac{1}{41} \left[ \begin{array}{r} 8 \\ 68 \end{array} \right] \nonumber \]
so the solution is \(x_{1} = \frac{8}{41}\) and \(x_{2} = \frac{68}{41}\).
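The same solution can be reproduced numerically. In the NumPy sketch below (illustrative only), `np.linalg.inv` plays the role of the formula from Example [exa:004261]:

```python
import numpy as np

A = np.array([[5, -3],
              [7, 4]], dtype=float)
b = np.array([-4, 8], dtype=float)

A_inv = np.linalg.inv(A)       # (1/41) * [[4, 3], [-7, 5]]
x = A_inv @ b
print(x)                       # [8/41, 68/41] ~ [0.1951, 1.6585]
print(np.allclose(A @ x, b))   # True: x really does satisfy Ax = b
```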
An Inversion Method
If a matrix \(A\) is \(n \times n\) and invertible, it is desirable to have an efficient technique for finding the inverse. The following procedure will be justified in Section [sec:2_5].
Matrix Inversion Algorithm 004348. If \(A\) is an invertible (square) matrix, there exists a sequence of elementary row operations that carry \(A\) to the identity matrix \(I\) of the same size, written \(A \to I\). This same sequence of row operations carries \(I\) to \(A^{-1}\); that is, \(I \to A^{-1}\). The algorithm can be summarized as follows:
\[\left[ \begin{array}{cc} A & I \end{array} \right] \rightarrow \left[ \begin{array}{cc} I & A^{-1} \end{array} \right] \nonumber \]
where the row operations on \(A\) and \(I\) are carried out simultaneously.
Example 004354. Use the inversion algorithm to find the inverse of the matrix
\[A = \left[ \begin{array}{rrr} 2 & 7 & 1 \\ 1 & 4 & -1 \\ 1 & 3 & 0 \end{array} \right] \nonumber \]
Apply elementary row operations to the double matrix
\[\left[ \begin{array}{cc} A & I \end{array} \right] = \left[ \begin{array}{rrr|rrr} 2 & 7 & 1 & 1 & 0 & 0 \\ 1 & 4 & -1 & 0 & 1 & 0 \\ 1 & 3 & 0 & 0 & 0 & 1 \end{array} \right] \nonumber \]
so as to carry \(A\) to \(I\). First interchange rows 1 and 2.
\[\left[ \begin{array}{rrr|rrr} 1 & 4 & -1 & 0 & 1 & 0 \\ 2 & 7 & 1 & 1 & 0 & 0 \\ 1 & 3 & 0 & 0 & 0 & 1 \end{array} \right] \nonumber \]
Next subtract \(2\) times row 1 from row 2, and subtract row 1 from row 3.
\[\left[ \begin{array}{rrr|rrr} 1 & 4 & -1 & 0 & 1 & 0 \\ 0 & -1 & 3 & 1 & -2 & 0 \\ 0 & -1 & 1 & 0 & -1 & 1 \end{array} \right] \nonumber \]
Continue to reduced row-echelon form.
\[\left[ \begin{array}{rrr|rrr} 1 & 0 & 11 & 4 & -7 & 0 \\ 0 & 1 & -3 & -1 & 2 & 0 \\ 0 & 0 & -2 & -1 & 1 & 1 \end{array} \right] \nonumber \]
\[\left[ \def\arraystretch{1.5} \begin{array}{rrr|rrr} 1 & 0 & 0 & \frac{-3}{2} & \frac{-3}{2} & \frac{11}{2} \\ 0 & 1 & 0 & \frac{1}{2} & \frac{1}{2} & \frac{-3}{2} \\ 0 & 0 & 1 & \frac{1}{2} & \frac{-1}{2} & \frac{-1}{2} \end{array} \right] \nonumber \]
Hence \(A^{-1} = \frac{1}{2} \left[ \begin{array}{rrr} -3 & -3 & 11 \\ 1 & 1 & -3 \\ 1 & -1 & -1 \end{array} \right]\), as is readily verified.
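The matrix inversion algorithm itself is straightforward to program. The sketch below (an illustration, not part of the text) row-reduces the double matrix \(\left[ \begin{array}{cc} A & I \end{array} \right]\) and returns the right half; a pivot search is added for numerical safety, which the hand computation above did not need.

```python
import numpy as np

def invert_by_row_reduction(A):
    """Carry [A | I] to [I | A^{-1}] by elementary row operations."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])                       # the double matrix [A | I]
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))   # choose a usable pivot row
        if np.isclose(M[pivot, col], 0.0):
            raise ValueError("matrix is not invertible")
        M[[col, pivot]] = M[[pivot, col]]               # interchange rows
        M[col] /= M[col, col]                           # make the pivot entry 1
        for r in range(n):
            if r != col:
                M[r] -= M[r, col] * M[col]              # clear the rest of the column
    return M[:, n:]                                     # right half is A^{-1}

A = [[2, 7, 1],
     [1, 4, -1],
     [1, 3, 0]]
print(invert_by_row_reduction(A))   # (1/2) * [[-3, -3, 11], [1, 1, -3], [1, -1, -1]]
```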
Given any \(n \times n\) matrix \(A\), Theorem [thm:001017] shows that \(A\) can be carried by elementary row operations to a matrix \(R\) in reduced row-echelon form. If \(R = I\), the matrix \(A\) is invertible (this will be proved in the next section), so the algorithm produces \(A^{-1}\). If \(R \neq I\), then \(R\) has a row of zeros (it is square), so no system of linear equations \(A\mathbf{x} = \mathbf{b}\) can have a unique solution. But then \(A\) is not invertible by Theorem [thm:004292]. Hence, the algorithm is effective in the sense conveyed in Theorem [thm:004371].
Theorem 004371. If \(A\) is an \(n \times n\) matrix, either \(A\) can be reduced to \(I\) by elementary row operations or it cannot. In the first case, the algorithm produces \(A^{-1}\); in the second case, \(A^{-1}\) does not exist.
Properties of Inverses
The following properties of an invertible matrix are used everywhere.
Example 004379 (Cancellation Laws). Let \(A\) be an invertible matrix. Show that:
1. If \(AB = AC\), then \(B = C\).
2. If \(BA = CA\), then \(B = C\).
Given the equation \(AB = AC\), left multiply both sides by \(A^{-1}\) to obtain \(A^{-1}AB = A^{-1}AC\). Thus \(IB = IC\), that is \(B = C\). This proves (1) and the proof of (2) is left to the reader.
Properties (1) and (2) in Example [exa:004379] are described by saying that an invertible matrix can be “left cancelled” and “right cancelled”, respectively. Note however that “mixed” cancellation does not hold in general: If \(A\) is invertible and \(AB = CA\), then \(B\) and \(C\) may not be equal, even if both are \(2 \times 2\). Here is a specific example:
\[A = \left[ \begin{array}{rr} 1 & 1 \\ 0 & 1 \end{array} \right],\ B = \left[ \begin{array}{rr} 0 & 0 \\ 1 & 2 \end{array} \right], C = \left[ \begin{array}{rr} 1 & 1 \\ 1 & 1 \end{array} \right] \nonumber \]
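A short NumPy check (added for illustration, not part of the text) confirms that \(AB = CA\) while \(B \neq C\), even though \(A\) is invertible:

```python
import numpy as np

A = np.array([[1, 1], [0, 1]])
B = np.array([[0, 0], [1, 2]])
C = np.array([[1, 1], [1, 1]])

print(np.array_equal(A @ B, C @ A))   # True: AB = CA
print(np.array_equal(B, C))           # False: yet B != C
print(np.linalg.det(A))               # 1.0, so A is invertible
```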
Sometimes the inverse of a matrix is given by a formula. Example [exa:004261] is one illustration; Example [exa:004397] and Example [exa:004423] provide two more. The idea is the Inverse Criterion: If a matrix \(B\) can be found such that \(AB = I = BA\), then \(A\) is invertible and \(A^{-1} = B\).
Example 004397. If \(A\) is an invertible matrix, show that the transpose \(A^{T}\) is also invertible. Show further that the inverse of \(A^{T}\) is just the transpose of \(A^{-1}\); in symbols, \((A^{T})^{-1} = (A^{-1})^{T}\).
\(A^{-1}\) exists (by assumption). Its transpose \((A^{-1})^{T}\) is the candidate proposed for the inverse of \(A^{T}\). Using the inverse criterion, we test it as follows:
\[ \begin{array}{lllllll} A^{T}(A^{-1})^{T} & = & (A^{-1}A)^{T} & = & I^{T} & = & I \\ (A^{-1})^{T}A^{T} & = & (AA^{-1})^{T} & = & I^{T} & = & I \end{array} \nonumber \]
Hence \((A^{-1})^{T}\) is indeed the inverse of \(A^{T}\); that is, \((A^{T})^{-1} = (A^{-1})^{T}\).
Example 004423. If \(A\) and \(B\) are invertible \(n \times n\) matrices, show that their product \(AB\) is also invertible and \((AB)^{-1} = B^{-1}A^{-1}\).
We are given a candidate for the inverse of \(AB\), namely \(B^{-1}A^{-1}\). We test it as follows:
\[\begin{aligned} (B^{-1}A^{-1})(AB) &= B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I \\ (AB)(B^{-1}A^{-1}) &= A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I\end{aligned} \nonumber \]
Hence \(B^{-1}A^{-1}\) is the inverse of \(AB\); in symbols, \((AB)^{-1} = B^{-1}A^{-1}\).
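Both identities are easy to test numerically. The sketch below (illustrative only; the random matrices are made invertible by adding a multiple of the identity) checks Example [exa:004397] and Example [exa:004423]:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 3)) + 3 * np.eye(3)   # diagonally dominant, hence invertible
B = rng.random((3, 3)) + 3 * np.eye(3)

# Example 004397: (A^T)^{-1} = (A^{-1})^T
print(np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T))    # True

# Example 004423: (AB)^{-1} = B^{-1} A^{-1}
print(np.allclose(np.linalg.inv(A @ B),
                  np.linalg.inv(B) @ np.linalg.inv(A)))       # True
```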
We now collect several basic properties of matrix inverses for reference.
Theorem 004442. All the following matrices are square matrices of the same size.
1. \(I\) is invertible and \(I^{-1} = I\).
2. If \(A\) is invertible, so is \(A^{-1}\), and \((A^{-1})^{-1} = A\).
3. If \(A\) and \(B\) are invertible, so is \(AB\), and \((AB)^{-1} = B^{-1}A^{-1}\).
4. If \(A_{1}, A_{2}, \dots, A_{k}\) are all invertible, so is their product \(A_{1}A_{2} \cdots A_{k}\), and
\[(A_{1}A_{2} \cdots A_{k})^{-1} = A_{k}^{-1} \cdots A_{2}^{-1}A_{1}^{-1}. \nonumber \]
5. If \(A\) is invertible, so is \(A^k\) for any \(k \geq 1\), and \((A^{k})^{-1} = (A^{-1})^{k}\).
6. If \(A\) is invertible and \(a \neq 0\) is a number, then \(aA\) is invertible and \((aA)^{-1} = \frac{1}{a}A^{-1}\).
7. If \(A\) is invertible, so is its transpose \(A^{T}\), and \((A^{T})^{-1} = (A^{-1})^{T}\).
Proof.
1. This is an immediate consequence of the fact that \(I^{2} = I\).
2. The equations \(AA^{-1} = I = A^{-1}A\) show that \(A\) is the inverse of \(A^{-1}\); in symbols, \((A^{-1})^{-1} = A\).
3. This is Example [exa:004423].
4. Use induction on \(k\). If \(k = 1\), there is nothing to prove, and if \(k = 2\), the result is property 3. If \(k > 2\), assume inductively that \((A_1A_2 \cdots A_{k-1})^{-1} = A_{k-1}^{-1} \cdots A_2^{-1}A_1^{-1}\). We apply this fact together with property 3 as follows:
\[\begin{aligned} \left[ A_{1}A_{2} \cdots A_{k-1}A_{k} \right]^{-1} &= \left[ \left(A_{1}A_{2} \cdots A_{k-1}\right)A_{k} \right]^{-1} \\ &= A_{k}^{-1}\left(A_{1}A_{2} \cdots A_{k-1}\right)^{-1} \\ &= A_{k}^{-1}\left(A_{k-1}^{-1} \cdots A_{2}^{-1}A_{1}^{-1}\right) \\ &= A_{k}^{-1}A_{k-1}^{-1} \cdots A_{2}^{-1}A_{1}^{-1}\end{aligned} \nonumber \]
5. This is property 4 with \(A_{1} = A_{2} = \cdots = A_{k} = A\).
6. This is left as Exercise [ex:ex2_4_29].
7. This is Example [exa:004397].
The reversal of the order of the inverses in properties 3 and 4 of Theorem [thm:004442] is a consequence of the fact that matrix multiplication is not commutative. Another manifestation of this comes when matrix equations are dealt with. If a matrix equation \(B = C\) is given, it can be left-multiplied by a matrix \(A\) to yield \(AB = AC\). Similarly, right-multiplication gives \(BA = CA\). However, we cannot mix the two: If \(B = C\), it need not be the case that \(AB = CA\) even if \(A\) is invertible, for example, \(A = \left[ \begin{array}{rr} 1 & 1 \\ 0 & 1 \end{array} \right]\), \(B = \left[ \begin{array}{rr} 0 & 0 \\ 1 & 0 \end{array} \right] = C\).
Part 7 of Theorem [thm:004442] together with the fact that \((A^{T})^{T} = A\) gives
Corollary 004537. A square matrix \(A\) is invertible if and only if \(A^{T}\) is invertible.
Example 004541. Find \(A\) if \((A^{T} - 2I)^{-1} = \left[ \begin{array}{rr} 2 & 1 \\ -1 & 0 \end{array} \right]\).
By Theorem [thm:004442](2) and Example [exa:004261], we have
\[(A^{T} - 2I) = \left[ \left(A^{T} - 2I\right)^{-1} \right]^{-1} = \left[ \begin{array}{rr} 2 & 1 \\ -1 & 0 \end{array} \right]^{-1} = \left[ \begin{array}{rr} 0 & -1 \\ 1 & 2 \end{array} \right] \nonumber \]
Hence \(A^{T} = 2I + \left[ \begin{array}{rr} 0 & -1 \\ 1 & 2 \end{array} \right] = \left[ \begin{array}{rr} 2 & -1 \\ 1 & 4 \end{array} \right]\), so \(A = \left[ \begin{array}{rr} 2 & 1 \\ -1 & 4 \end{array} \right]\) by Theorem [thm:004442](7).
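The computation can be checked numerically; the following NumPy sketch (added for illustration) rebuilds \(A\) from the given inverse and verifies the result:

```python
import numpy as np

M = np.array([[2, 1],
              [-1, 0]], dtype=float)     # the given (A^T - 2I)^{-1}

At = 2 * np.eye(2) + np.linalg.inv(M)    # A^T = 2I + (A^T - 2I)
A = At.T
print(A)                                                    # [[2, 1], [-1, 4]]
print(np.allclose(np.linalg.inv(A.T - 2 * np.eye(2)), M))   # True
```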
The following important theorem collects a number of conditions all equivalent\(^{1}\) to invertibility. It will be referred to frequently below.
Theorem 004553 (Inverse Theorem). The following conditions are equivalent for an \(n \times n\) matrix \(A\):
1. \(A\) is invertible.
2. The homogeneous system \(A\mathbf{x} = \mathbf{0}\) has only the trivial solution \(\mathbf{x} = \mathbf{0}\).
3. \(A\) can be carried to the identity matrix \(I_{n}\) by elementary row operations.
4. The system \(A\mathbf{x} = \mathbf{b}\) has at least one solution \(\mathbf{x}\) for every choice of column \(\mathbf{b}\).
5. There exists an \(n \times n\) matrix \(C\) such that \(AC = I_{n}\).
We show that each of these conditions implies the next, and that (5) implies (1).
(1) \(\Rightarrow\) (2). If \(A^{-1}\) exists, then \(A\mathbf{x} = \mathbf{0}\) gives \(\mathbf{x} = I_{n}\mathbf{x} = A^{-1}A\mathbf{x} = A^{-1}\mathbf{0} = \mathbf{0}\).
(2) \(\Rightarrow\) (3). Assume that (2) is true. Certainly \(A \to R\) by row operations where \(R\) is a reduced row-echelon matrix. It suffices to show that \(R = I_{n}\). Suppose that this is not the case. Then \(R\) has a row of zeros (being square). Now consider the augmented matrix \(\left[ \begin{array}{c|c} A & \mathbf{0} \end{array} \right]\) of the system \(A\mathbf{x} = \mathbf{0}\). Then \(\left[ \begin{array}{c|c} A & \mathbf{0} \end{array} \right] \to \left[ \begin{array}{c|c} R & \mathbf{0} \end{array} \right]\) is the reduced form, and \(\left[ \begin{array}{c|c} R & \mathbf{0} \end{array} \right]\) also has a row of zeros. Since \(R\) is square there must be at least one nonleading variable, and hence at least one parameter. Hence the system \(A\mathbf{x} = \mathbf{0}\) has infinitely many solutions, contrary to (2). So \(R = I_{n}\) after all.
(3) \(\Rightarrow\) (4). Consider the augmented matrix \(\left[ \begin{array}{c|c} A & \mathbf{b} \end{array} \right]\) of the system \(A\mathbf{x} = \mathbf{b}\). Using (3), let \(A \to I_{n}\) by a sequence of row operations. Then these same operations carry \(\left[ \begin{array}{c|c} A & \mathbf{b} \end{array} \right] \to \left[ \begin{array}{c|c} I_{n} & \mathbf{c} \end{array} \right]\) for some column \(\mathbf{c}\). Hence the system \(A\mathbf{x} = \mathbf{b}\) has a solution (in fact unique) by gaussian elimination. This proves (4).
(4) \(\Rightarrow\) (5). Write \(I_{n} = \left[ \begin{array}{cccc} \mathbf{e}_{1} & \mathbf{e}_{2} & \cdots & \mathbf{e}_{n} \end{array} \right]\) where \(\mathbf{e}_{1}, \mathbf{e}_{2}, \dots, \mathbf{e}_{n}\) are the columns of \(I_{n}\). For each \(j = 1, 2, \dots, n\), the system \(A\mathbf{x} = \mathbf{e}_{j}\) has a solution \(\mathbf{c}_{j}\) by (4), so \(A\mathbf{c}_{j} = \mathbf{e}_{j}\). Now let \(C = \left[ \begin{array}{cccc} \mathbf{c}_{1} & \mathbf{c}_{2} & \cdots & \mathbf{c}_{n} \end{array} \right]\) be the \(n \times n\) matrix with these matrices \(\mathbf{c}_{j}\) as its columns. Then Definition [def:003447] gives (5):
\[AC = A \left[ \begin{array}{cccc} \mathbf{c}_{1} & \mathbf{c}_{2} & \cdots & \mathbf{c}_{n} \end{array} \right] = \left[ \begin{array}{cccc} A\mathbf{c}_{1} & A\mathbf{c}_{2} & \cdots & A\mathbf{c}_{n} \end{array} \right] = \left[ \begin{array}{cccc} \mathbf{e}_{1} & \mathbf{e}_{2} & \cdots & \mathbf{e}_{n} \end{array} \right] = I_{n} \nonumber \]
(5) \(\Rightarrow\) (1). Assume that (5) is true so that \(AC = I_{n}\) for some matrix \(C\). Then \(C\mathbf{x} = \mathbf{0}\) implies \(\mathbf{x} = \mathbf{0}\) (because \(\mathbf{x} = I_{n}\mathbf{x} = AC\mathbf{x} = A\mathbf{0} = \mathbf{0}\)). Thus condition (2) holds for the matrix \(C\) rather than \(A\). Hence the argument above that (2) \(\Rightarrow\) (3) \(\Rightarrow\) (4) \(\Rightarrow\) (5) (with \(A\) replaced by \(C\)) shows that a matrix \(C^\prime\) exists such that \(CC^\prime = I_{n}\). But then
\[A = AI_{n} = A(CC^\prime) = (AC)C^\prime = I_{n}C^\prime = C^\prime \nonumber \]
Thus \(CA = CC^\prime = I_{n}\) which, together with \(AC = I_{n}\), shows that \(C\) is the inverse of \(A\). This proves (1).
The proof of (5) \(\Rightarrow\) (1) in Theorem [thm:004553] shows that if \(AC = I\) for square matrices, then necessarily \(CA = I\), and hence that \(C\) and \(A\) are inverses of each other. We record this important fact for reference.
Corollary 004612. If \(A\) and \(C\) are square matrices such that \(AC = I\), then also \(CA = I\). In particular, both \(A\) and \(C\) are invertible, \(C = A^{-1}\), and \(A = C^{-1}\).
Here is a quick way to remember Corollary [cor:004612]. If \(A\) is a square matrix, then
- If \(AC=I\) then \(C=A^{-1}\).
- If \(CA=I\) then \(C=A^{-1}\).
Observe that Corollary [cor:004612] is false if \(A\) and \(C\) are not square matrices. For example, we have
\[\left[ \begin{array}{rrr} 1 & 2 & 1 \\ 1 & 1 & 1 \end{array} \right] \left[ \begin{array}{rr} -1 & 1 \\ 1 & -1 \\ 0 & 1 \end{array} \right] = I_{2} \quad \mbox{ but } \left[ \begin{array}{rr} -1 & 1 \\ 1 & -1 \\ 0 & 1 \end{array} \right] \left[ \begin{array}{rrr} 1 & 2 & 1 \\ 1 & 1 & 1 \end{array} \right] \neq I_{3} \nonumber \]
In fact, it can be verified that if \(AB = I_{m}\) and \(BA = I_{n}\), where \(A\) is \(m \times n\) and \(B\) is \(n \times m\), then \(m = n\) and \(A\) and \(B\) are (square) inverses of each other.
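A NumPy sketch (illustrative only) of the non-square example above shows the asymmetry directly:

```python
import numpy as np

A = np.array([[1, 2, 1],
              [1, 1, 1]])
B = np.array([[-1, 1],
              [1, -1],
              [0, 1]])

print(A @ B)    # the 2x2 identity matrix
print(B @ A)    # a 3x3 matrix that is NOT the identity
```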
An \(n \times n\) matrix \(A\) has rank \(n\) if and only if (3) of Theorem [thm:004553] holds. Hence
Corollary 004623. An \(n \times n\) matrix \(A\) is invertible if and only if \(\text{rank } A = n\).
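NumPy's `matrix_rank` gives a quick way to apply Corollary 004623; the sketch below (added for illustration) tests the invertible matrix of Example 004354 and the non-invertible matrix of Example 004217.

```python
import numpy as np

A = np.array([[2, 7, 1],
              [1, 4, -1],
              [1, 3, 0]])
B = np.array([[0, 0],
              [1, 3]])

print(np.linalg.matrix_rank(A))   # 3, so A is invertible
print(np.linalg.matrix_rank(B))   # 1, so B is not invertible
```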
Here is a useful fact about inverses of block matrices.
Example 004627. Let \(P = \left[ \begin{array}{cc} A & X \\ 0 & B \end{array} \right]\) and \(Q = \left[ \begin{array}{cc} A & 0 \\ Y & B \end{array} \right]\) be block matrices where \(A\) is \(m \times m\) and \(B\) is \(n \times n\) (possibly \(m \neq n\)).
a. Show that \(P\) is invertible if and only if \(A\) and \(B\) are both invertible. In this case, show that
\[P^{-1} = \left[ \begin{array}{cc} A^{-1} & -A^{-1}XB^{-1} \\ 0 & B^{-1} \end{array} \right] \nonumber \]
b. Show that \(Q\) is invertible if and only if \(A\) and \(B\) are both invertible. In this case, show that
\[Q^{-1} = \left[ \begin{array}{cc} A^{-1} & 0 \\ -B^{-1}YA^{-1} & B^{-1} \end{array} \right] \nonumber \]
We do (a) and leave (b) for the reader.
If \(A^{-1}\) and \(B^{-1}\) both exist, write \(R = \left[ \begin{array}{cc} A^{-1} & -A^{-1}XB^{-1} \\ 0 & B^{-1} \end{array} \right]\). Using block multiplication, one verifies that \(PR = I_{m+n} = RP\), so \(P\) is invertible, and \(P^{-1} = R\). Conversely, suppose that \(P\) is invertible, and write \(P^{-1} = \left[ \begin{array}{cc} C & V \\ W & D \end{array} \right]\) in block form, where \(C\) is \(m \times m\) and \(D\) is \(n \times n\).
Then the equation \(PP^{-1} = I_{m+n}\) becomes
\[\left[ \begin{array}{cc} A & X \\ 0 & B \end{array} \right] \left[ \begin{array}{cc} C & V \\ W & D \end{array} \right] = \left[ \begin{array}{cc} AC + XW & AV + XD \\ BW & BD \end{array} \right] = I_{m + n} = \left[ \begin{array}{cc} I_{m} & 0 \\ 0 & I_{n} \end{array} \right] \nonumber \]
using block notation. Equating corresponding blocks, we find
\[AC + XW = I_{m}, \quad BW = 0, \quad \mbox{ and } BD = I_{n} \nonumber \]
Hence \(B\) is invertible because \(BD = I_{n}\) (by Corollary [cor:004537]), then \(W = 0\) because \(BW = 0\), and finally, \(AC = I_{m}\) (so \(A\) is invertible, again by Corollary [cor:004537]).
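The block formula in part (a) is easy to verify numerically. The sketch below (illustrative; the random blocks are made invertible by adding a multiple of the identity) builds \(P\) and the claimed inverse and checks both products:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((2, 2)) + 2 * np.eye(2)   # invertible (diagonally dominant)
B = rng.random((3, 3)) + 3 * np.eye(3)
X = rng.random((2, 3))

# Block upper-triangular P = [[A, X], [0, B]]
P = np.block([[A, X],
              [np.zeros((3, 2)), B]])

# The claimed inverse from Example 004627(a)
Ainv, Binv = np.linalg.inv(A), np.linalg.inv(B)
R = np.block([[Ainv, -Ainv @ X @ Binv],
              [np.zeros((3, 2)), Binv]])

print(np.allclose(P @ R, np.eye(5)))   # True
print(np.allclose(R @ P, np.eye(5)))   # True
```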
Inverses of Matrix Transformations
Let \(T = T_{A} : \mathbb{R}^{n} \to \mathbb{R}^{n}\) denote the matrix transformation induced by the \(n \times n\) matrix \(A\). Since \(A\) is square, it may very well be invertible, and this leads to the question:
What does it mean geometrically for \(T\) that \(A\) is invertible?
To answer this, let \(T^\prime = T_{A^{-1}} : \mathbb{R}^{n} \to \mathbb{R}^{n}\) denote the transformation induced by \(A^{-1}\). Then
\[\label{eq:inverse1} T^\prime \left[ T(\mathbf{x}) \right] = A^{-1} \left[ A\mathbf{x} \right] = I\mathbf{x} = \mathbf{x} \quad \mbox{ and } \quad T \left[ T^\prime(\mathbf{x}) \right] = A \left[ A^{-1}\mathbf{x} \right] = I\mathbf{x} = \mathbf{x} \quad \mbox{ for all } \mathbf{x} \mbox{ in } \mathbb{R}^{n} \]
The first of these equations asserts that, if \(T\) carries \(\mathbf{x}\) to a vector \(T(\mathbf{x})\), then \(T^\prime\) carries \(T(\mathbf{x})\) right back to \(\mathbf{x}\); that is \(T^\prime\) “reverses” the action of \(T\). Similarly \(T\) “reverses” the action of \(T^\prime\). Conditions ([eq:inverse1]) can be stated compactly in terms of composition:
\[\label{eq:inverse2} T^\prime \circ T = 1_{\mathbb{R}^{n}} \quad \mbox{ and } \quad T \circ T^\prime = 1_{\mathbb{R}^{n}} \]
When these conditions hold, we say that the matrix transformation \(T^\prime\) is an inverse of \(T\), and we have shown that if the matrix \(A\) of \(T\) is invertible, then \(T\) has an inverse (induced by \(A^{-1}\)).
The converse is also true: If \(T\) has an inverse, then its matrix \(A\) must be invertible. Indeed, suppose \(S : \mathbb{R}^{n} \to \mathbb{R}^{n}\) is any inverse of \(T\), so that \(S \circ T = 1_{\mathbb{R}^{n}}\) and \(T \circ S = 1_{\mathbb{R}^{n}}\). It can be shown that \(S\) is also a matrix transformation. If \(B\) is the matrix of \(S\), we have
\[BA\mathbf{x} = S \left[ T(\mathbf{x}) \right] = (S \circ T)(\mathbf{x}) = 1_{\mathbb{R}^{n}}(\mathbf{x}) = \mathbf{x} = I_{n}\mathbf{x} \quad \mbox{ for all } \mathbf{x} \mbox{ in } \mathbb{R}^{n} \nonumber \]
It follows by Theorem [thm:002985] that \(BA = I_{n}\), and a similar argument shows that \(AB = I_{n}\). Hence \(A\) is invertible with \(A^{-1} = B\). Furthermore, the inverse transformation \(S\) has matrix \(A^{-1}\), so \(S = T^\prime\) using the earlier notation. This proves the following important theorem.
Theorem 004693. Let \(T : \mathbb{R}^{n} \to \mathbb{R}^{n}\) denote the matrix transformation induced by an \(n \times n\) matrix \(A\). Then
\(A\) is invertible if and only if \(T\) has an inverse.
In this case, \(T\) has exactly one inverse (which we denote as \(T^{-1}\)), and \(T^{-1} : \mathbb{R}^{n} \to \mathbb{R}^{n}\) is the transformation induced by the matrix \(A^{-1}\). In other words
\[\left(T_{A}\right)^{-1} = T_{A^{-1}} \nonumber \]
The geometrical relationship between \(T\) and \(T^{-1}\) is embodied in equations ([eq:inverse1]) above:
\[T^{-1} \left[ T(\mathbf{x}) \right] = \mathbf{x} \quad \mbox{ and } \quad T \left[ T^{-1}(\mathbf{x}) \right] = \mathbf{x} \quad \mbox{ for all } \mathbf{x} \mbox{ in } \mathbb{R}^{n} \nonumber \]
These equations are called the fundamental identities relating \(T\) and \(T^{-1}\). Loosely speaking, they assert that each of \(T\) and \(T^{-1}\) “reverses” or “undoes” the action of the other.
This geometric view of the inverse of a linear transformation provides a new way to find the inverse of a matrix \(A\). More precisely, if \(A\) is an invertible matrix, we proceed as follows:
- Let \(T\) be the linear transformation induced by \(A\).
- Obtain the linear transformation \(T^{-1}\) which “reverses” the action of \(T\).
- Then \(A^{-1}\) is the matrix of \(T^{-1}\).
Here is an example.
Example 004725. Find the inverse of \(A = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \end{array} \right]\) by viewing it as a linear transformation \(\mathbb{R}^{2} \to \mathbb{R}^{2}\).
If \(\mathbf{x} = \left[ \begin{array}{c} x \\ y \end{array} \right]\), the vector \(A\mathbf{x} = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 0 \end{array} \right] \left[ \begin{array}{c} x \\ y \end{array} \right] = \left[ \begin{array}{c} y \\ x \end{array} \right]\) is the result of reflecting \(\mathbf{x}\) in the line \(y = x\) (see the diagram). Hence, if \(Q_{1} : \mathbb{R}^{2} \to \mathbb{R}^{2}\) denotes reflection in the line \(y = x\), then \(A\) is the matrix of \(Q_{1}\). Now observe that \(Q_{1}\) reverses itself because reflecting a vector \(\mathbf{x}\) twice results in \(\mathbf{x}\). Consequently \(Q_{1}^{-1} = Q_{1}\). Since \(A^{-1}\) is the matrix of \(Q_{1}^{-1}\) and \(A\) is the matrix of \(Q_{1}\), it follows that \(A^{-1} = A\). Of course this conclusion is also clear from the direct observation that \(A^{2} = I\), but the geometric method can succeed where direct computation is less straightforward.
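As a final numerical note (added for illustration, not part of the original text), the reflection matrix is indeed its own inverse:

```python
import numpy as np

A = np.array([[0, 1],
              [1, 0]])       # reflection in the line y = x

print(A @ A)                                            # A^2 = I
print(np.allclose(np.linalg.inv(A.astype(float)), A))   # True: A^{-1} = A
```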
\(^{1}\) If \(p\) and \(q\) are statements, we say that \(p\) implies \(q\) (written \(p \Rightarrow q\)) if \(q\) is true whenever \(p\) is true. The statements are called equivalent if both \(p \Rightarrow q\) and \(q \Rightarrow p\) (written \(p \Leftrightarrow q\), spoken “\(p\) if and only if \(q\)”). See Appendix [chap:appbproofs].