3.8: Proof of the Cofactor Expansion Theorem
Recall that our definition of the term determinant is inductive: the determinant of any \(1 \times 1\) matrix is defined first; then it is used to define the determinants of \(2 \times 2\) matrices, that in turn is used for the \(3 \times 3\) case, and so on. The case of a \(1 \times 1\) matrix \(\left[ a \right]\) poses no problem. We simply define
\[\det \left[ a \right] = a \nonumber \]
as in Section 3.1. Given an \(n \times n\) matrix \(A\), define \(A_{ij}\) to be the \((n - 1) \times (n - 1)\) matrix obtained from \(A\) by deleting row \(i\) and column \(j\). Now assume that the determinant of any \((n - 1) \times (n - 1)\) matrix has been defined. Then the determinant of \(A\) is defined to be
\[\begin{aligned} \det A & = a_{11} \det A_{11} - a_{21} \det A_{21} + \cdots + (-1)^{n+1} a_{n1} \det A_{n1} \\ &= \sum_{i=1}^n (-1)^{i+1} a_{i1} \det A_{i1}\end{aligned} \nonumber \]
where summation notation has been introduced for convenience.\(^{1}\) Observe that, in the terminology of Section 3.1, this is just the cofactor expansion of \(\det A\) along the first column, and that \((-1)^{i+j} \det A_{ij}\) is the \((i, j)\)-cofactor (previously denoted as \(c_{ij}(A)\)).\(^{2}\) To illustrate the definition, consider the \(2 \times 2\) matrix \(A = \left[ \begin{array}{cc} a_{11} & a_{12} \\ a_{21} & a_{22} \end{array}\right]\). Then the definition gives
\[\det \left[ \begin{array}{cc} a_{11} & a_{12} \\ a_{21} & a_{22} \end{array}\right] = a_{11} \det \left[ a_{22} \right] - a_{21} \det \left[ a_{12} \right] = a_{11} a_{22} - a_{21} a_{12} \nonumber \]
and this is the same as the definition in Section 3.1.
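To make the inductive definition concrete, here is a minimal Python sketch (my own illustration, not part of the text) that computes \(\det A\) by expanding along the first column exactly as defined above:

```python
def det(A):
    """Determinant by cofactor expansion along the first column,
    mirroring the inductive definition: det [a] = a, and
    det A = sum_i (-1)^(i+1) a_{i1} det A_{i1}  (1-indexed in the text)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    # Deleting row i and column 1 leaves the rows k != i with their
    # first entry dropped; Python's 0-based i makes the sign (-1)**i.
    return sum((-1) ** i * A[i][0] * det([row[1:] for k, row in enumerate(A) if k != i])
               for i in range(n))

print(det([[1, 2], [3, 4]]))                   # a11*a22 - a21*a12 = -2
print(det([[2, 0, 1], [1, 3, 4], [0, 5, 6]]))  # 1
```

Because the recursion bottoms out at \(1 \times 1\) matrices, this routine is a direct transcription of the induction, not an efficient algorithm (it performs \(n!\) scalar multiplications).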
Of course, the task now is to use this definition to prove that the cofactor expansion along any row or column yields \(\det A\) (this is Theorem 3.1.1). The proof proceeds by first establishing the properties of determinants stated in Theorem 3.1.2 but for rows only (see Lemma 3.8.2). This being done, the full proof of Theorem 3.1.1 is not difficult. The proof of Lemma 3.8.2 requires the following preliminary result.
Lemma 3.8.1. Let \(A\), \(B\), and \(C\) be \(n \times n\) matrices that are identical except that the \(p\)th row of \(A\) is the sum of the \(p\)th rows of \(B\) and \(C\). Then
\[\det A = \det B + \det C \nonumber \]
Proof. We proceed by induction on \(n\); the cases \(n = 1\) and \(n = 2\) are easily checked. Consider \(a_{i1}\) and \(A_{i1}\):
Case 1: If \(i \neq p\),
\[a_{i1} = b_{i1} = c_{i1} \quad \mbox{ and } \quad \det A_{i1} = \det B_{i1} = \det C_{i1} \nonumber \]
by induction because \(A_{i1}\), \(B_{i1}\), \(C_{i1}\) are identical except that one row of \(A_{i1}\) is the sum of the corresponding rows of \(B_{i1}\) and \(C_{i1}\).
Case 2: If \(i = p\),
\[a_{p1} = b_{p1} + c_{p1} \quad \mbox{ and } \quad A_{p1} = B_{p1} = C_{p1} \nonumber \]
Now write out the defining sum for \(\det A\), splitting off the \(p\)th term for special attention.
\[\begin{aligned} \det A &= \sum_{i \neq p} a_{i1} (-1)^{i+1} \det A_{i1} + a_{p1} (-1)^{p+1} \det A_{p1} \\ &= \sum_{i \neq p} a_{i1} (-1)^{i+1} \left[ \det B_{i1} + \det C_{i1} \right] + (b_{p1} + c_{p1}) (-1)^{p+1} \det A_{p1}\end{aligned} \nonumber \]
where \(\det A_{i1} = \det B_{i1} + \det C_{i1}\) by induction. But the terms here involving \(B_{i1}\) and \(b_{p1}\) add up to \(\det B\) because \(a_{i1} = b_{i1}\) if \(i \neq p\) and \(A_{p1} = B_{p1}\). Similarly, the terms involving \(C_{i1}\) and \(c_{p1}\) add up to \(\det C\). Hence \(\det A = \det B + \det C\), as required.
\(\square\)
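Lemma 3.8.1 is easy to spot-check numerically. The sketch below (an illustration I am adding, with arbitrarily chosen integer matrices) builds \(A\) from \(B\) and \(C\) so that only row \(p\) differs, then confirms \(\det A = \det B + \det C\):

```python
def det(A):
    """First-column cofactor expansion (the inductive definition)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** i * A[i][0] * det([row[1:] for k, row in enumerate(A) if k != i])
               for i in range(len(A)))

p = 1  # the one row (0-based) where A differs from B and C
B = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
C = [[1, 2, 3], [9, 7, 5], [7, 8, 10]]  # same as B except row p
A = [row[:] for row in B]
A[p] = [b + c for b, c in zip(B[p], C[p])]  # row p of A = row p of B + row p of C

print(det(A) == det(B) + det(C))  # True
```

Note that the lemma is about a single row: \(\det\) is *not* additive in general, since \(\det(B + C) \neq \det B + \det C\) for most \(B\) and \(C\).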
Lemma 3.8.2. Let \(A = \left[ a_{ij} \right]\) denote an \(n \times n\) matrix. Then:
- If \(B = \left[ b_{ij} \right]\) is formed from \(A\) by multiplying a row of \(A\) by a number \(u\), then \(\det B = u \det A\).
- If \(A\) contains a row of zeros, then \(\det A = 0\).
- If \(B = \left[ b_{ij} \right]\) is formed by interchanging two rows of \(A\), then \(\det B = -\det A\).
- If \(A\) contains two identical rows, then \(\det A = 0\).
- If \(B = \left[ b_{ij} \right]\) is formed by adding a multiple of one row of \(A\) to a different row, then \(\det B = \det A\).
Proof. For later reference, the defining sums for \(\det A\) and \(\det B\) are as follows:
\[\begin{align} \det A &= \sum_{i=1}^n a_{i1}(-1)^{i+1} \det A_{i1} \label{eq:cofactor1} \\ \det B &= \sum_{i=1}^n b_{i1}(-1)^{i+1} \det B_{i1} \label{eq:cofactor2} \end{align} \]
Property 1. The proof is by induction on \(n\), the cases \(n = 1\) and \(n = 2\) being easily verified. Consider the \(i\)th term in the sum \ref{eq:cofactor2} for \(\det B\) where \(B\) is the result of multiplying row \(p\) of \(A\) by \(u\).
- If \(i \neq p\), then \(b_{i1} = a_{i1}\) and \(\det B_{i1} = u \det A_{i1}\) by induction because \(B_{i1}\) comes from \(A_{i1}\) by multiplying a row by \(u\).
- If \(i = p\), then \(b_{p1} = ua_{p1}\) and \(B_{p1} = A_{p1}\).
In either case, each term in Equation \ref{eq:cofactor2} is \(u\) times the corresponding term in Equation \ref{eq:cofactor1}, so it is clear that \(\det B = u \det A\).
Property 2. This is clear by property 1 because the row of zeros has a common factor \(u = 0\).
Property 3. Observe first that it suffices to prove property 3 for interchanges of adjacent rows. (Rows \(p\) and \(q\) \((q > p)\) can be interchanged by carrying out \(2(q - p) - 1\) adjacent changes, which results in an odd number of sign changes in the determinant.) So suppose that rows \(p\) and \(p + 1\) of \(A\) are interchanged to obtain \(B\). Again consider the \(i\)th term in Equation \ref{eq:cofactor2}.
- If \(i \neq p\) and \(i \neq p + 1\), then \(b_{i1} = a_{i1}\) and \(\det B_{i1} = -\det A_{i1}\) by induction because \(B_{i1}\) results from interchanging adjacent rows in \(A_{i1}\). Hence the \(i\)th term in Equation \ref{eq:cofactor2} is the negative of the \(i\)th term in Equation \ref{eq:cofactor1}.
- If \(i = p\) or \(i = p + 1\), then \(b_{p1} = a_{p+1,1}\) and \(B_{p1} = A_{p+1,1}\), whereas \(b_{p+1,1} = a_{p1}\) and \(B_{p+1,1} = A_{p1}\). Hence terms \(p\) and \(p + 1\) in Equation \ref{eq:cofactor2} are
\[\begin{aligned} b_{p1}(-1)^{p+1}\det B_{p1} &= -a_{p+1,1}(-1)^{(p+1)+1} \det A_{p+1,1} \\ b_{p+1,1}(-1)^{(p+1)+1}\det B_{p+1,1} &= -a_{p1}(-1)^{p+1} \det A_{p1}\end{aligned} \nonumber \]
This means that terms \(p\) and \(p + 1\) in Equation \ref{eq:cofactor2} are the same as these terms in Equation \ref{eq:cofactor1}, except that the order is reversed and the signs are changed. Thus the sum \ref{eq:cofactor2} is the negative of the sum \ref{eq:cofactor1}; that is, \(\det B = -\det A\).
Property 4. If rows \(p\) and \(q\) in \(A\) are identical, let \(B\) be obtained from \(A\) by interchanging these rows. Then \(B = A\) so \(\det A = \det B\). But \(\det B = -\det A\) by property 3 so \(\det A = -\det A\). This implies that \(\det A = 0\).
Property 5. Suppose \(B\) results from adding \(u\) times row \(q\) of \(A\) to row \(p\). Then Lemma 3.8.1 applies to \(B\) to show that \(\det B = \det A + \det C\), where \(C\) is obtained from \(A\) by replacing row \(p\) by \(u\) times row \(q\). It now follows from properties 1 and 4 that \(\det C = 0\) so \(\det B = \det A\), as asserted.
\(\square\)
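The row properties just proved can likewise be confirmed on a concrete matrix. This check (my addition; the matrix and scalar are arbitrary) exercises properties 1 through 5 directly against the first-column definition:

```python
def det(A):
    """First-column cofactor expansion (the inductive definition)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** i * A[i][0] * det([row[1:] for k, row in enumerate(A) if k != i])
               for i in range(len(A)))

A = [[2, 0, 1], [1, 3, 4], [0, 5, 6]]
d = det(A)
u = 7

# Property 1: multiplying a row by u multiplies det by u.
assert det([A[0], [u * x for x in A[1]], A[2]]) == u * d
# Property 2: a row of zeros forces det = 0.
assert det([A[0], [0, 0, 0], A[2]]) == 0
# Property 3: interchanging two rows changes the sign.
assert det([A[2], A[1], A[0]]) == -d
# Property 4: two identical rows force det = 0.
assert det([A[0], A[0], A[2]]) == 0
# Property 5: adding u times row 3 to row 1 leaves det unchanged.
assert det([[x + u * y for x, y in zip(A[0], A[2])], A[1], A[2]]) == d

print("all five row properties verified; det A =", d)
```

Of course a finite check proves nothing; it only confirms that the statements of Lemma 3.8.2 mean what they say on a sample matrix.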
These facts are enough to enable us to prove Theorem 3.1.1. For convenience, it is restated here in the notation of the foregoing lemmas. The only difference between the notations is that the \((i, j)\)-cofactor of an \(n \times n\) matrix \(A\) was denoted earlier by
\[c_{ij}(A) = (-1)^{i+j} \det A_{ij} \nonumber \]
Theorem 3.1.1. If \(A = \left[ a_{ij} \right]\) is an \(n \times n\) matrix, then
- \(\det A = \sum_{i=1}^n a_{ij} (-1)^{i+j} \det A_{ij} \quad (\mbox{cofactor expansion along column } j).\)
- \(\det A = \sum_{j=1}^n a_{ij} (-1)^{i+j} \det A_{ij} \quad (\mbox{cofactor expansion along row } i).\)
Here \(A_{ij}\) denotes the matrix obtained from \(A\) by deleting row \(i\) and column \(j\).
Lemma 3.8.2 establishes the truth of Theorem 3.1.2 for rows. With this information, the arguments in Section 3.2 proceed exactly as written to establish that \(\det A = \det A^{T}\) holds for any \(n \times n\) matrix \(A\). Now suppose \(B\) is obtained from \(A\) by interchanging two columns. Then \(B^{T}\) is obtained from \(A^{T}\) by interchanging two rows so, by property 3 of Lemma 3.8.2,
\[\det B = \det B^T = -\det A^T = -\det A \nonumber \]
Hence property 3 of Lemma 3.8.2 holds for columns too.
This enables us to prove the cofactor expansion for columns. Given an \(n \times n\) matrix \(A = \left[ a_{ij} \right]\), let \(B = \left[ b_{ij} \right]\) be obtained by moving column \(j\) to the left side, using \(j - 1\) interchanges of adjacent columns. Then \(\det B = (-1)^{j-1} \det A\) and, because \(B_{i1} = A_{ij}\) and \(b_{i1} = a_{ij}\) for all \(i\), we obtain
\[\begin{aligned} \det A & = (-1)^{j-1} \det B = (-1)^{j-1} \sum_{i=1}^n b_{i1}(-1)^{i+1} \det B_{i1}\\ &= \sum_{i=1}^n a_{ij} (-1)^{i+j} \det A_{ij}\end{aligned} \nonumber \]
This is the cofactor expansion of \(\det A\) along column \(j\).
Finally, to prove the row expansion, write \(B = A^{T}\). Then \(B_{ij} = \left( A_{ji} \right)^T\) and \(b_{ij} = a_{ji}\) for all \(i\) and \(j\). Expanding \(\det B\) along column \(j\) gives
\[\begin{aligned} \det A &=\det A^T = \det B = \sum_{i=1}^n b_{ij} (-1)^{i+j} \det B_{ij} \\ &= \sum_{i=1}^n a_{ji}(-1)^{j+i} \det \left[ \left( A_{ji} \right)^T \right] = \sum_{i=1}^n a_{ji} (-1)^{j+i} \det A_{ji}\end{aligned} \nonumber \]
This is the required expansion of \(\det A\) along row \(j\).
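Putting it all together, Theorem 3.1.1 says that every row and every column expansion returns the same number as the defining first-column expansion. A short check of that claim (again my own illustration, on an arbitrary \(4 \times 4\) matrix):

```python
def minor(A, i, j):
    """A with row i and column j deleted (0-indexed); A_{ij} in the text."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """The inductive definition: cofactor expansion along the first column."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** i * A[i][0] * det(minor(A, i, 0)) for i in range(len(A)))

def expand_row(A, i):
    """Cofactor expansion of det A along row i (0-indexed)."""
    return sum((-1) ** (i + j) * A[i][j] * det(minor(A, i, j)) for j in range(len(A)))

def expand_col(A, j):
    """Cofactor expansion of det A along column j (0-indexed)."""
    return sum((-1) ** (i + j) * A[i][j] * det(minor(A, i, j)) for i in range(len(A)))

A = [[2, 0, 1, 3], [1, 3, 4, 0], [0, 5, 6, 2], [7, 1, 0, 1]]
d = det(A)
assert all(expand_row(A, i) == d for i in range(4))
assert all(expand_col(A, j) == d for j in range(4))
print("every row and column expansion gives", d)
```

The signs work out because 0-indexed \((-1)^{i+j}\) has the same parity as the 1-indexed \((-1)^{(i+1)+(j+1)}\) of the theorem.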
1. Summation notation is a convenient shorthand for sums of similar expressions. For example, \(a_1 +a_2 +a_3 +a_4 = \sum_{i=1}^4 a_i\), \(a_5b_5 + a_6b_6 + a_7b_7 + a_8b_8 = \sum_{k=5}^8 a_kb_k\), and \(1^2 +2^2 + 3^2+ 4^2 + 5^2 = \sum_{j=1}^5 j^2\).
2. Note that we used the expansion along row 1 at the beginning of Section 3.1. Taking the expansion along column 1 as the definition is more convenient here.


