3.2: The Matrix Trace


    Learning Objectives
    • T/F: We only compute the trace of square matrices.
    • T/F: One can tell if a matrix is invertible by computing the trace.

    In the previous section, we learned about an operation we can perform on matrices, namely the transpose. Given a matrix \(A\), we can “find the transpose of \(A\),” which is another matrix. In this section we learn about a new operation called the trace. It is a different type of operation than the transpose. Given a matrix \(A\), we can “find the trace of \(A\),” which is not a matrix but rather a number. We formally define it here.

    Definition: The Trace

    Let \(A\) be an \(n\times n\) matrix. The trace of \(A\), denoted \(\text{tr}(A)\), is the sum of the diagonal elements of \(A\). That is,

    \[\text{tr}(A)=a_{11}+a_{22}+\cdots +a_{nn}. \nonumber \]

    This seems like a simple definition, and it really is. Just to make sure it is clear, let’s practice.
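The definition above translates directly into code. The following is a minimal sketch in Python with NumPy (an assumption of this sketch; the text itself does not use software): it sums the diagonal entries by hand and agrees with NumPy's built-in `np.trace`.

```python
import numpy as np

def trace(A):
    """Sum of the diagonal elements of a square matrix A."""
    n, m = A.shape
    if n != m:
        raise ValueError("trace is only defined for square matrices")
    return sum(A[i, i] for i in range(n))

A = np.array([[1, 2],
              [3, 4]])
print(trace(A))  # 1 + 4 = 5
```

Note that the hand-rolled version enforces squareness explicitly, matching the definition.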

    Example \(\PageIndex{1}\)

    Find the trace of \(A\), \(B\), \(C\), and \(I_{4}\), where

    \[A=\left[\begin{array}{cc}{1}&{2}\\{3}&{4}\end{array}\right],\quad B=\left[\begin{array}{ccc}{1}&{2}&{0}\\{3}&{8}&{1}\\{-2}&{7}&{-5}\end{array}\right]\quad\text{and}\quad C=\left[\begin{array}{ccc}{1}&{2}&{3}\\{4}&{5}&{6}\end{array}\right]. \nonumber \]

    Solution

    To find the trace of \(A\), note that the diagonal elements of \(A\) are \(1\) and \(4\). Therefore, \(\text{tr}(A)=1+4=5\).

    We see that the diagonal elements of \(B\) are \(1,\: 8\) and \(-5\), so \(\text{tr}(B)=1+8-5=4\).

    The matrix \(C\) is not a square matrix, and our definition states that we must start with a square matrix. Therefore \(\text{tr}(C)\) is not defined.

    Finally, the diagonal of \(I_{4}\) consists of four 1s. Therefore \(\text{tr}(I_{4}) = 4\).

    Now that we have defined the trace of a matrix, we should think like mathematicians and ask some questions. The first questions that should pop into our minds should be along the lines of “How does the trace work with other matrix operations?”\(^{1}\) We should think about how the trace works with matrix addition, scalar multiplication, matrix multiplication, matrix inverses, and the transpose.

    We’ll give a theorem that will formally tell us what is true in a moment, but first let’s play with two sample matrices and see if we can see what will happen. Let

    \[A=\left[\begin{array}{ccc}{2}&{1}&{3}\\{2}&{0}&{-1}\\{3}&{-1}&{3}\end{array}\right]\quad\text{and}\quad B=\left[\begin{array}{ccc}{2}&{0}&{1}\\{-1}&{2}&{0}\\{0}&{2}&{-1}\end{array}\right]. \nonumber \]

    It should be clear that \(\text{tr}(A)=5\) and \(\text{tr}(B)=3\). What is \(\text{tr}(A+B)\)?

    \[\begin{align}\begin{aligned}\text{tr}(A+B)&=\text{tr}\left(\left[\begin{array}{ccc}{2}&{1}&{3}\\{2}&{0}&{-1}\\{3}&{-1}&{3}\end{array}\right]+\left[\begin{array}{ccc}{2}&{0}&{1}\\{-1}&{2}&{0}\\{0}&{2}&{-1}\end{array}\right]\right) \\ &=\text{tr}\left(\left[\begin{array}{ccc}{4}&{1}&{4}\\{1}&{2}&{-1}\\{3}&{1}&{2}\end{array}\right]\right) \\ &=8\end{aligned}\end{align} \nonumber \]

    So we notice that \(\text{tr}(A+B)=\text{tr}(A)+\text{tr}(B)\). This probably isn’t a coincidence.

    How does the trace work with scalar multiplication? If we multiply \(A\) by \(4\), then the diagonal elements will be \(8,\: 0\) and \(12\), so \(\text{tr}(4A)=20\). Is it a coincidence that this is \(4\) times the trace of \(A\)?

    Let’s move on to matrix multiplication. How will the trace of \(AB\) relate to the traces of \(A\) and \(B\)? Let’s see:

    \[\begin{align}\begin{aligned}\text{tr}(AB)&=\text{tr}\left(\left[\begin{array}{ccc}{2}&{1}&{3}\\{2}&{0}&{-1}\\{3}&{-1}&{3}\end{array}\right]\left[\begin{array}{ccc}{2}&{0}&{1}\\{-1}&{2}&{0}\\{0}&{2}&{-1}\end{array}\right]\right) \\ &=\text{tr}\left(\left[\begin{array}{ccc}{3}&{8}&{-1}\\{4}&{-2}&{3}\\{7}&{4}&{0}\end{array}\right]\right) \\ &=1\end{aligned}\end{align} \nonumber \]

    It isn’t exactly clear what the relationship is among \(\text{tr}(A)\), \(\text{tr}(B)\) and \(\text{tr}(AB)\). Before moving on, let’s find \(\text{tr}(BA)\):

    \[\begin{align}\begin{aligned}\text{tr}(BA)&=\text{tr}\left(\left[\begin{array}{ccc}{2}&{0}&{1}\\{-1}&{2}&{0}\\{0}&{2}&{-1}\end{array}\right]\left[\begin{array}{ccc}{2}&{1}&{3}\\{2}&{0}&{-1}\\{3}&{-1}&{3}\end{array}\right]\right) \\ &=\text{tr}\left(\left[\begin{array}{ccc}{7}&{1}&{9}\\{2}&{-1}&{-5}\\{1}&{1}&{-5}\end{array}\right]\right) \\ &=1\end{aligned}\end{align} \nonumber \]

    We notice that \(\text{tr}(AB)=\text{tr}(BA)\). Is this coincidental?
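The experiments above are easy to reproduce. A sketch using NumPy (an assumption of this example), with the same \(A\) and \(B\) as in the text:

```python
import numpy as np

A = np.array([[2,  1,  3],
              [2,  0, -1],
              [3, -1,  3]])
B = np.array([[ 2, 0,  1],
              [-1, 2,  0],
              [ 0, 2, -1]])

print(np.trace(A + B))  # 8  == tr(A) + tr(B) = 5 + 3
print(np.trace(4 * A))  # 20 == 4 * tr(A)
print(np.trace(A @ B))  # 1
print(np.trace(B @ A))  # 1  == tr(AB), even though AB != BA as matrices
```

The last two lines are the interesting ones: the products \(AB\) and \(BA\) are different matrices, yet their traces agree.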

    How are the traces of \(A\) and \(A^{-1}\) related? We compute \(A^{-1}\) and find that

    \[A^{-1}=\left[\begin{array}{ccc}{1/17}&{6/17}&{1/17}\\{9/17}&{3/17}&{-8/17}\\{2/17}&{-5/17}&{2/17}\end{array}\right]. \nonumber \]

    Therefore \(\text{tr}(A^{-1})=6/17\). Again, the relationship isn’t clear.\(^{2}\)

    Finally, let’s see how the trace is related to the transpose. We actually don’t have to formally compute anything. Recall from the previous section that the diagonals of \(A\) and \(A^{T}\) are identical; therefore, \(\text{tr}(A)=\text{tr}(A^{T})\). That, we know for sure, isn’t a coincidence.

    We now formally state what equalities are true when considering the interaction of the trace with other matrix operations.

    Theorem \(\PageIndex{1}\): Properties of the Matrix Trace

    Let \(A\) and \(B\) be \(n\times n\) matrices. Then:

    1. \(\text{tr}(A+B)=\text{tr}(A)+\text{tr}(B)\)
    2. \(\text{tr}(A-B)=\text{tr}(A)-\text{tr}(B)\)
    3. \(\text{tr}(kA)=k\cdot\text{tr}(A)\)
    4. \(\text{tr}(AB)=\text{tr}(BA)\)
    5. \(\text{tr}(A^{T})=\text{tr}(A)\)

    One of the key things to note here is what this theorem does not say. It says nothing about how the trace relates to inverses. The reason for the silence on that point is that there simply is no such relationship.
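The five properties in the theorem can be spot-checked on randomly generated matrices. A sketch using NumPy (an assumption of this example; the seed and dimensions are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-9, 10, size=(4, 4))  # random 4x4 integer matrices
B = rng.integers(-9, 10, size=(4, 4))
k = 7

assert np.trace(A + B) == np.trace(A) + np.trace(B)  # property 1
assert np.trace(A - B) == np.trace(A) - np.trace(B)  # property 2
assert np.trace(k * A) == k * np.trace(A)            # property 3
assert np.trace(A @ B) == np.trace(B @ A)            # property 4
assert np.trace(A.T) == np.trace(A)                  # property 5
print("all five properties hold")
```

Integer matrices are used here so every equality is exact; with floating-point entries one would compare with `np.isclose` instead. Of course, passing these checks is evidence, not proof; the theorem is what guarantees the properties for all square matrices.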

    We end this section by again wondering why anyone would care about the trace of a matrix. One reason mathematicians are interested in it is that it can give a measurement of the “size”\(^{3}\) of a matrix.

    Consider the following \(2 \times 2\) matrices:

    \[A=\left[\begin{array}{cc}{1}&{-2}\\{1}&{1}\end{array}\right]\quad\text{and}\quad B=\left[\begin{array}{cc}{6}&{7}\\{11}&{-4}\end{array}\right]. \nonumber \]

    These matrices have the same trace, yet \(B\) clearly has bigger elements in it. So how can we use the trace to determine a “size” of these matrices? We can consider \(\text{tr}(A^{T}A)\) and \(\text{tr}(B^{T}B)\).

    \[\begin{align}\begin{aligned}\text{tr}(A^{T}A)&=\text{tr}\left(\left[\begin{array}{cc}{1}&{1}\\{-2}&{1}\end{array}\right]\left[\begin{array}{cc}{1}&{-2}\\{1}&{1}\end{array}\right]\right) \\ &=\text{tr}\left(\left[\begin{array}{cc}{2}&{-1}\\{-1}&{5}\end{array}\right]\right) \\ &=7\end{aligned}\end{align} \nonumber \]

    \[\begin{align}\begin{aligned}\text{tr}(B^{T}B)&=\text{tr}\left(\left[\begin{array}{cc}{6}&{11}\\{7}&{-4}\end{array}\right]\left[\begin{array}{cc}{6}&{7}\\{11}&{-4}\end{array}\right]\right) \\ &=\text{tr}\left(\left[\begin{array}{cc}{157}&{-2}\\{-2}&{65}\end{array}\right]\right) \\ &=222\end{aligned}\end{align} \nonumber \]

    Our concern is not how to interpret what this “size” measurement means, but rather to demonstrate that the trace (along with the transpose) can be used to give (perhaps useful) information about a matrix.\(^{4}\)
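The two “size” computations above can be reproduced directly, and they also illustrate the observation made in footnote 4: \(\text{tr}(A^{T}A)\) equals the sum of the squares of the entries of \(A\). A sketch using NumPy (an assumption of this example):

```python
import numpy as np

A = np.array([[1, -2],
              [1,  1]])
B = np.array([[ 6,  7],
              [11, -4]])

print(np.trace(A.T @ A))  # 7
print(np.trace(B.T @ B))  # 222
# Each trace equals the sum of the squares of the matrix's entries:
print(np.sum(A**2))       # 7
print(np.sum(B**2))       # 222
```

(The square root of this quantity, mentioned in footnote 3, is what NumPy computes as `np.linalg.norm(A)`, the Frobenius norm.)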

    Footnotes

    [1] Recall that we asked a similar question once we learned about the transpose.

    [2] Something to think about: we know that not all square matrices are invertible. Would we be able to tell just by the trace? That seems unlikely.

    [3] There are many different measurements of matrix size. In this text, we have so far referred only to a matrix's dimensions. Some measurements of size refer to the magnitude of the elements in the matrix. The next section describes yet another measurement of matrix size.

    [4] This example brings to light many interesting ideas that we’ll flesh out just a little bit here.

    1. Notice that the elements of \(A\) are \(1\), \(-2\), \(1\) and \(1\). Add the squares of these numbers: \(1^2 + (-2)^2 + 1^2 + 1^2 = 7 =\text{tr}(A^{T}A)\).
      Notice that the elements of \(B\) are \(6\), \(7\), \(11\) and \(-4\). Add the squares of these numbers: \(6^2 + 7^2 + 11^2 + (-4)^2 = 222 =\text{tr}(B^{T}B)\).
      Can you see why this is true? When looking at multiplying \(A^{T}A\), focus only on where the elements on the diagonal come from since they are the only ones that matter when taking the trace.
    2. You can confirm on your own that regardless of the dimensions of \(A\), \(\text{tr}(A^{T}A)=\text{tr}(AA^{T})\). To see why this is true, consider the previous point. (Recall also that \(A^{T}A\) and \(AA^{T}\) are always square, regardless of the dimensions of \(A\).)
    3. Mathematicians are actually more interested in \(\sqrt{\text{tr}(A^{T}A)}\) than just \(\text{tr}(A^{T}A)\). The reason for this is a bit complicated; the short answer is that “it works better.” The reason “it works better” is related to the Pythagorean Theorem, of all things. If we know that the legs of a right triangle have length \(a\) and \(b\), we are more interested in \(\sqrt{a^2+b^2}\) than just \(a^2+b^2\). Of course, this explanation raises more questions than it answers; our goal here is just to whet your appetite and get you to do some more reading. A Numerical Linear Algebra book would be a good place to start.

    This page titled 3.2: The Matrix Trace is shared under a CC BY-NC 3.0 license and was authored, remixed, and/or curated by Gregory Hartman et al. via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.