3.6: The Invertible Matrix Theorem
This section consists of a single important theorem containing many equivalent conditions for a matrix to be invertible. This is one of the most important theorems in this textbook. We will append two more criteria in Section 5.1.
Theorem \(\PageIndex{1}\): The Invertible Matrix Theorem

Let \(A\) be an \(n\times n\) matrix, and let \(T\colon\mathbb{R}^n \to\mathbb{R}^n \) be the matrix transformation \(T(x) = Ax\). The following statements are equivalent:
- \(A\) is invertible.
- \(A\) has \(n\) pivots.
- \(\text{Nul}(A) = \{0\}\).
- The columns of \(A\) are linearly independent.
- The columns of \(A\) span \(\mathbb{R}^n \).
- \(Ax=b\) has a unique solution for each \(b\) in \(\mathbb{R}^n \).
- \(T\) is invertible.
- \(T\) is one-to-one.
- \(T\) is onto.
Proof
\((1\iff 2)\text{:}\) The matrix \(A\) has \(n\) pivots if and only if its reduced row echelon form is the identity matrix \(I_n\). This happens exactly when the procedure of Theorem 3.5.1 in Section 3.5 for computing the inverse succeeds.
\((2\iff 3)\text{:}\) The null space of a matrix is \(\{0\}\) if and only if the matrix has no free variables, which means that every column is a pivot column, which means \(A\) has \(n\) pivots. See Recipe: Compute a Spanning Set for a Null Space in Section 2.6.
\((2\iff 4,\,2\iff 5)\text{:}\) These follow from the Recipe: Checking Linear Independence in Section 2.5 and Theorem 2.3.1 in Section 2.3, respectively, since \(A\) has \(n\) pivots if and only if it has a pivot in every row (equivalently, in every column).
\((4+5\iff 6)\text{:}\) We know \(Ax=b\) has at least one solution for every \(b\) if and only if the columns of \(A\) span \(\mathbb{R}^n \) by Theorem 3.2.2 in Section 3.2, and \(Ax=b\) has at most one solution for every \(b\) if and only if the columns of \(A\) are linearly independent by Theorem 3.2.1 in Section 3.2. Hence \(Ax=b\) has exactly one solution for every \(b\) if and only if its columns are linearly independent and span \(\mathbb{R}^n \).
\((1\iff 7)\text{:}\) This is the content of Theorem 3.5.3 in Section 3.5.
\((7\implies 8+9)\text{:}\) See Proposition 3.5.2 in Section 3.5.
\((8\iff 4,\,9\iff 5)\text{:}\) See Theorem 3.2.1 and Theorem 3.2.2 in Section 3.2, respectively.
To reiterate, the invertible matrix theorem means:
There are two kinds of square matrices:
- invertible matrices, and
- non-invertible matrices.
For invertible matrices, all of the statements of the invertible matrix theorem are true.
For non-invertible matrices, all of the statements of the invertible matrix theorem are false.
The reader should be comfortable translating any of the statements in the invertible matrix theorem into a statement about the pivots of a matrix.
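To make the pivot criterion concrete, here is a minimal computational sketch in Python using SymPy (our own choice of tool; the text prescribes no software, and both example matrices are hypothetical). It counts pivots from the reduced row echelon form and compares the answer with an independent test of invertibility.

```python
import sympy as sp

def count_pivots(A):
    """Number of pivot columns of A, read off from the reduced row echelon form."""
    _, pivot_cols = A.rref()  # rref() returns (rref matrix, tuple of pivot column indices)
    return len(pivot_cols)

A_invertible = sp.Matrix([[1, 2], [3, 4]])  # hypothetical invertible example
A_singular   = sp.Matrix([[1, 2], [2, 4]])  # second row is twice the first

for A in (A_invertible, A_singular):
    n = A.rows
    # "A has n pivots" should agree with every other condition of the theorem,
    # for instance with a nonzero determinant.
    print(count_pivots(A) == n, A.det() != 0)
# prints: True True, then False False
```

As the theorem predicts, the two tests agree on both matrices: all conditions hold together or fail together.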
The following conditions are also equivalent to the invertibility of a square matrix \(A\). They are all simple restatements of conditions in the invertible matrix theorem, and a short computational check follows the list.
- The reduced row echelon form of \(A\) is the identity matrix \(I_n\).
- \(Ax=0\) has no solutions other than the trivial one.
- \(\text{nullity}(A) = 0\).
- The columns of \(A\) form a basis for \(\mathbb{R}^n \).
- \(Ax=b\) is consistent for all \(b\) in \(\mathbb{R}^n \).
- \(\text{Col}(A) = \mathbb{R}^n .\)
- \(\dim\text{Col}(A) = n.\)
- \(\text{rank}(A) = n.\)
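As promised, here is a short check of these restatements, again with SymPy (an assumption of this sketch, as is the particular matrix): for an invertible \(3\times 3\) matrix, the reduced row echelon form, the nullity, the dimension of the column space, and the rank all behave as the list predicts.

```python
import sympy as sp

A = sp.Matrix([[2, 0, 1], [1, 1, 0], [0, 3, 1]])  # hypothetical invertible 3x3 matrix
n = A.rows

print(A.rref()[0] == sp.eye(n))   # True: the reduced row echelon form of A is I_n
print(len(A.nullspace()) == 0)    # True: nullity(A) = 0, i.e. Nul(A) = {0}
print(len(A.columnspace()) == n)  # True: dim Col(A) = n, so Col(A) = R^n
print(A.rank() == n)              # True: rank(A) = n
```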
Now we can show that to check \(B = A^{-1}\text{,}\) it's enough to show \(AB=I_n\) or \(BA=I_n\).
Theorem \(\PageIndex{2}\)

Let \(A\) be an \(n\times n\) matrix, and suppose that there exists an \(n\times n\) matrix \(B\) such that \(AB=I_n\) or \(BA=I_n\). Then \(A\) is invertible and \(B = A^{-1}\).
Proof
Suppose that \(AB = I_n\). We claim that \(T(x)=Ax\) is onto. Indeed, for any \(b\) in \(\mathbb{R}^n \text{,}\) we have
\[ b = I_nb = (AB)b = A(Bb), \nonumber \]
so \(T(Bb) = b\text{,}\) and hence \(b\) is in the range of \(T\). Therefore, \(A\) is invertible by Theorem \(\PageIndex{1}\). Since \(A\) is invertible, we have
\[ A^{-1} = A^{-1} I_n = A^{-1} (AB) = (A^{-1} A)B = I_n B = B, \nonumber \]
so \(B = A^{-1}.\)
Now suppose that \(BA = I_n\). We claim that \(T(x) = Ax\) is one-to-one. Indeed, suppose that \(T(x) = T(y)\). Then \(Ax = Ay\text{,}\) so \(BAx = BAy\). But \(BA = I_n\text{,}\) so \(I_nx = I_ny\text{,}\) and hence \(x=y\). Therefore, \(A\) is invertible by Theorem \(\PageIndex{1}\). One shows that \(B = A^{-1}\) as above.
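The following sketch (SymPy assumed, with a hypothetical \(2\times 2\) matrix) illustrates the theorem numerically: we solve \(AB = I_2\) for \(B\) column by column, and the resulting right inverse turns out to be a left inverse as well.

```python
import sympy as sp

A = sp.Matrix([[1, 2], [3, 5]])  # hypothetical invertible 2x2 matrix
B = A.LUsolve(sp.eye(2))         # solve A*B = I_2, producing a right inverse B

print(A * B == sp.eye(2))  # True: AB = I_2 by construction
print(B * A == sp.eye(2))  # True: BA = I_2 follows, exactly as the theorem predicts
```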
We conclude with some common situations in which the invertible matrix theorem is useful.
Is this matrix invertible?
\[ A = \left(\begin{array}{ccc}1&2&-1\\2&4&7\\-2&-4&1\end{array}\right) \nonumber \]
Solution
The second column of \(A\) is twice the first, so the columns are linearly dependent. Hence \(A\) does not satisfy condition 4 of Theorem \(\PageIndex{1}\), and \(A\) is not invertible.
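One can confirm this with a quick computation. The sketch below (SymPy assumed) checks the column relation directly and verifies that the rank falls short of \(3\).

```python
import sympy as sp

A = sp.Matrix([[1, 2, -1], [2, 4, 7], [-2, -4, 1]])

print(A.col(1) == 2 * A.col(0))  # True: the second column is twice the first
print(A.rank())                  # 2: fewer than 3 pivots, so condition 4 fails
print(A.det())                   # 0: consistent with A not being invertible
```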
Let \(A\) be an \(n\times n\) matrix and let \(T(x) = Ax\). Suppose that the range of \(T\) is \(\mathbb{R}^n \). Show that the columns of \(A\) are linearly independent.
Solution
The range of \(T\) is the column space of \(A\text{,}\) so \(A\) satisfies condition 5 of Theorem \(\PageIndex{1}\). Therefore, \(A\) also satisfies condition 4, which says that the columns of \(A\) are linearly independent.
Let \(A\) be a \(3\times 3\) matrix such that
\[ A\left(\begin{array}{c}1\\7\\0\end{array}\right) = A\left(\begin{array}{c}2\\0\\-1\end{array}\right). \nonumber \]
Show that the rank of \(A\) is at most \(2\).
Solution
If we set
\[ b = A\left(\begin{array}{c}1\\7\\0\end{array}\right) = A\left(\begin{array}{c}2\\0\\-1\end{array}\right), \nonumber \]
then \(Ax=b\) has more than one solution, so \(A\) does not satisfy condition 6 of Theorem \(\PageIndex{1}\). Therefore, \(A\) does not satisfy condition 5, so the columns of \(A\) do not span \(\mathbb{R}^3 \). Hence the column space has dimension strictly less than 3, and the rank is at most \(2\).
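To see this numerically, note that the difference \(v = (1,7,0) - (2,0,-1) = (-1,7,1)\) must lie in \(\text{Nul}(A)\). The sketch below (SymPy assumed; the particular matrix is hypothetical, chosen only so that \(Av = 0\)) illustrates the conclusion on one such \(A\).

```python
import sympy as sp

v = sp.Matrix([-1, 7, 1])    # difference of the two given vectors; A*v = 0
A = sp.Matrix([[7, 1, 0],
               [0, 1, -7],
               [7, 2, -7]])  # hypothetical: each row is orthogonal to v

print(A * v)     # the zero vector, so v is a nonzero vector in Nul(A)
print(A.rank())  # 2: the rank is at most 2, as argued above
```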
Suppose that \(A\) is an \(n\times n\) matrix such that \(Ax=b\) is inconsistent for some vector \(b\). Show that \(Ax=b\) has infinitely many solutions for some (other) vector \(b\).
Solution
By hypothesis, \(A\) does not satisfy condition 6 of Theorem \(\PageIndex{1}\). Therefore, it does not satisfy condition \(3\text{,}\) so \(\text{Nul}(A)\) contains a nonzero vector, and with it every scalar multiple of that vector: \(\text{Nul}(A)\) is an infinite set. If we take \(b=0\text{,}\) then the equation \(Ax=b\) has infinitely many solutions.
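A final sketch (SymPy assumed, with a hypothetical singular matrix) makes the last step concrete: once \(\text{Nul}(A)\) contains a nonzero vector, every scalar multiple of it also solves \(Ax = 0\), so there are infinitely many solutions.

```python
import sympy as sp

A = sp.Matrix([[1, 2], [2, 4]])  # hypothetical singular matrix: rows are proportional
basis = A.nullspace()            # a basis for Nul(A); here a single nonzero vector

print(basis[0].T)                # (-2, 1): a nonzero vector in Nul(A)
t = sp.symbols('t')
x = t * basis[0]                 # every scalar multiple lies in Nul(A) as well
print(A * x)                     # the zero vector for all t: infinitely many solutions
```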