4.3: Diagonalization, similarity, and powers of a matrix
\(\newcommand{\twovec}[2]{\begin{pmatrix} #1 \\ #2 \end{pmatrix} } \) \(\newcommand{\threevec}[3]{\begin{pmatrix} #1 \\ #2 \\ #3 \end{pmatrix} } \) \(\newcommand{\fourvec}[4]{\begin{pmatrix} #1 \\ #2 \\ #3 \\ #4 \end{pmatrix} } \) \(\newcommand{\fivevec}[5]{\begin{pmatrix} #1 \\ #2 \\ #3 \\ #4 \\ #5 \end{pmatrix} } \)
The first example we considered in this chapter was the matrix \(A=\left[\begin{array}{rr} 1 & 2 \\ 2 & 1 \\ \end{array}\right] \text{,}\) which has eigenvectors \(\mathbf v_1=\twovec{1}{1}\) and \(\mathbf v_2 = \twovec{-1}{1}\) and associated eigenvalues \(\lambda_1=3\) and \(\lambda_2=-1\text{.}\) In Subsection 4.1.2, we described how \(A\) is, in some sense, equivalent to the diagonal matrix \(D = \left[\begin{array}{rr} 3 & 0 \\ 0 & -1\\ \end{array}\right] \text{.}\)
This equivalence is summarized by Figure 4.3.1. The diagonal matrix \(D\) has the geometric effect of stretching vectors horizontally by a factor of \(3\) and flipping vectors vertically. The matrix \(A\) has the geometric effect of stretching vectors by a factor of \(3\) in the direction \(\mathbf v_1\) and flipping them in the direction of \(\mathbf v_2\text{.}\) The geometric effect of \(A\) is the same as that of \(D\) when viewed in a basis of eigenvectors of \(A\text{.}\)
Now that we have developed some algebraic techniques for finding eigenvalues and eigenvectors, we will explore this observation more deeply. In particular, we will make precise the sense in which \(A\) and \(D\) are equivalent by using the coordinate system defined by the basis of eigenvectors \(\mathbf v_1\) and \(\mathbf v_2\text{.}\)
Preview Activity 4.3.1.
Let's recall how a vector in \(\mathbb R^2\) can be represented in a coordinate system defined by a basis \(\mathcal{B}=\{\mathbf v_1, \mathbf v_2\}\text{.}\)
-
Suppose that we consider the basis \(\mathcal{B}\) defined by
\[ \mathbf v_1 = \twovec{1}{1},\qquad \mathbf v_2 = \twovec{-1}{0}\text{.} \]
Find the vector \(\mathbf x\) whose representation in the coordinate system defined by \(\mathcal{B}\) is \(\left\{\mathbf{x}\right\}_{\mathcal{B}} = \twovec{-3}{2}\text{.}\)
- Consider the vector \(\mathbf x=\twovec{4}{5}\) and find its representation \(\left\{\mathbf{x}\right\}_{\mathcal{B}} \) in the coordinate system defined by \(\mathcal{B}\text{.}\)
- How do we use the matrix \(C_{\mathcal{B}} = \left[\begin{array}{rr} \mathbf v_1 & \mathbf v_2 \end{array}\right]\) to convert a vector's representation \(\left\{\mathbf{x}\right\}_{\mathcal{B}} \) in the coordinate system defined by \(\mathcal{B}\) into its standard representation \(\mathbf x\text{?}\) How do we use this matrix to convert \(\mathbf x\) into \(\left\{\mathbf{x}\right\}_{\mathcal{B}}\text{?}\)
- Suppose that we have a matrix \(A\) whose eigenvectors are \(\mathbf v_1\) and \(\mathbf v_2\) and associated eigenvalues are \(\lambda_1=4\) and \(\lambda_2 = 2\text{.}\) Express the vector \(A(-3\mathbf v_1 +5\mathbf v_2)\) as a linear combination of \(\mathbf v_1\) and \(\mathbf v_2\text{.}\)
- If \(\left\{\mathbf{x}\right\}_{\mathcal{B}} = \twovec{-3}{5}\text{,}\) find \(\left\{A\mathbf{x}\right\}_{\mathcal{B}}\text{.}\)
Diagonalization of matrices
As we have investigated eigenvalues and eigenvectors of matrices in this chapter, we have frequently asked whether we can find a basis of eigenvectors, as in Question 4.1.7. In fact, Proposition 4.2.3 tells us that if \(A\) is an \(n\times n\) matrix having \(n\) distinct real eigenvalues, then there is a basis for \(\mathbb R^n\) consisting of eigenvectors of \(A\text{.}\) There are, in addition, other conditions on \(A\) that guarantee such a basis, as we will see in subsequent chapters, but for now, suffice it to say that for many matrices, we can find a basis of eigenvectors. We will now see how such a matrix \(A\) is equivalent to a diagonal matrix \(D\text{.}\)
Remember also that we have seen how to use a basis \(\mathcal{B}=\{\mathbf v_1,\mathbf v_2,\ldots,\mathbf v_n\}\) of \(\mathbb R^n\) to construct a coordinate system for \(\mathbb R^n\text{.}\) In particular, \(\left\{\mathbf{x}\right\}_{\mathcal{B}} = \fourvec{c_1}{c_2}{\vdots}{c_n}\) if \(\mathbf x = c_1\mathbf v_1 + c_2\mathbf v_2 + \ldots + c_n\mathbf v_n\text{.}\) We also used matrix multiplication to express this fact: if \(C_{\mathcal{B}} = \left[\begin{array}{rrrr} \mathbf v_1 & \mathbf v_2 & \ldots & \mathbf v_n \end{array}\right]\text{,}\) then
\[ \mathbf x = C_{\mathcal{B}}\left\{\mathbf{x}\right\}_{\mathcal{B}}, \qquad \left\{\mathbf{x}\right\}_{\mathcal{B}} = C_{\mathcal{B}}^{-1}\mathbf x\text{.} \]
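The two conversions can be tried out numerically. The following is a minimal sketch in plain Python (standard library only, rather than Sage); the basis, the coordinate vector, and the helper names `mat_vec` and `inverse_2x2` are illustrative choices made for this sketch and are not part of the text.

```python
from fractions import Fraction

def mat_vec(M, x):
    """Multiply a 2x2 matrix (list of rows) by a length-2 vector."""
    return [M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1]]

def inverse_2x2(M):
    """Invert a 2x2 matrix with the adjugate formula; assumes det != 0."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

# Illustrative basis (an assumption for this sketch): v1 = (1, 1), v2 = (-1, 1).
v1, v2 = [1, 1], [-1, 1]
C = [[v1[0], v2[0]], [v1[1], v2[1]]]    # the columns of C are the basis vectors

coords = [1, 2]                          # {x}_B, meaning x = 1*v1 + 2*v2
x = mat_vec(C, coords)                   # standard representation: x = C {x}_B
recovered = mat_vec(inverse_2x2(C), x)   # back again: {x}_B = C^{-1} x

print(x, recovered)
```

Multiplying by \(C_{\mathcal{B}}\) moves from \(\mathcal{B}\)-coordinates to standard coordinates, and multiplying by \(C_{\mathcal{B}}^{-1}\) undoes this, so `recovered` agrees with `coords`.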
Activity 4.3.2.
Once again, we will consider the matrices
\[ A = \left[\begin{array}{rr} 1 & 2 \\ 2 & 1 \\ \end{array}\right],\qquad D = \left[\begin{array}{rr} 3 & 0 \\ 0 & -1 \\ \end{array}\right]\text{.} \]
The matrix \(A\) has eigenvectors \(\mathbf v_1=\twovec{1}{1}\) and \(\mathbf v_2=\twovec{-1}{1}\) and eigenvalues \(\lambda_1=3\) and \(\lambda_2=-1\text{.}\) We will consider the basis of \(\mathbb R^2\) consisting of eigenvectors \(\mathcal{B}= \{\mathbf v_1, \mathbf v_2\}\text{.}\)
- If \(\mathbf x= 2\mathbf v_1 - 3\mathbf v_2\text{,}\) write \(A\mathbf x\) as a linear combination of \(\mathbf v_1\) and \(\mathbf v_2\text{.}\)
- If \(\left\{\mathbf{x}\right\}_{\mathcal{B}}=\twovec{2}{-3}\text{,}\) find \(\left\{A\mathbf{x}\right\}_{\mathcal{B}}\text{,}\) the representation of \(A\mathbf x\) in the coordinate system defined by \(\mathcal{B}\text{.}\)
- If \(\left\{\mathbf{x}\right\}_{\mathcal{B}}=\twovec{c_1}{c_2}\text{,}\) find \(\left\{A\mathbf{x}\right\}_{\mathcal{B}}\text{,}\) the representation of \(A\mathbf x\) in the coordinate system defined by \(\mathcal{B}\text{.}\)
- Explain why \(\left\{A\mathbf{x}\right\}_{\mathcal{B}} = D\left\{\mathbf{x}\right\}_{\mathcal{B}} \text{.}\)
-
Explain why \(C_{\mathcal{B}}^{-1}A\mathbf x = DC_{\mathcal{B}}^{-1}\mathbf x\) for all vectors \(\mathbf x\) and hence
\[ C_{\mathcal{B}}^{-1}A = DC_{\mathcal{B}}^{-1}\text{.} \]
-
Explain why \(A = C_{\mathcal{B}}DC_{\mathcal{B}}^{-1}\) and verify this relationship by computing \(C_{\mathcal{B}}DC_{\mathcal{B}}^{-1}\text{,}\) for instance in Sage.
The key to understanding the equivalence of a matrix \(A\) and a diagonal matrix \(D\) is through the coordinate system defined by a basis consisting of eigenvectors of \(A\text{.}\) We will assume that \(A\) is an \(n\times n\) matrix and that there is a basis \(\mathcal{B}=\{\mathbf v_1,\mathbf v_2,\ldots,\mathbf v_n\}\) consisting of eigenvectors of \(A\) with associated eigenvalues \(\lambda_1, \lambda_2,\ldots,\lambda_n\text{.}\)
We know that if
\[ \mathbf x = c_1\mathbf v_1 + c_2\mathbf v_2 + \ldots + c_n\mathbf v_n\text{,} \]
then
\[ A\mathbf x = \lambda_1c_1\mathbf v_1 + \lambda_2c_2\mathbf v_2 + \ldots + \lambda_nc_n\mathbf v_n \text{.} \]
This fact is conveniently expressed using the coordinate system defined by \(\mathcal{B}\text{;}\) in particular,
\[ \left\{\mathbf{x}\right\}_{\mathcal{B}} = \fourvec{c_1}{c_2}{\vdots}{c_n},\qquad \left\{A\mathbf{x}\right\}_{\mathcal{B}} = \fourvec{\lambda_1c_1}{\lambda_2c_2}{\vdots}{\lambda_nc_n}\text{.} \]
Forming the diagonal matrix
\[ D = \left[\begin{array}{cccc} \lambda_1 & 0 & \ldots & 0 \\ 0 & \lambda_2 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & \lambda_n \\ \end{array}\right]\text{,} \]
we see that
\[ \left\{A\mathbf{x}\right\}_{\mathcal{B}} = D\left\{\mathbf{x}\right\}_{\mathcal{B}}\text{.} \]
We now use the fact that the matrix \(C_{\mathcal{B}} = \left[\begin{array}{cccc} \mathbf v_1 & \mathbf v_2 & \ldots & \mathbf v_n \end{array}\right]\) performs the change of coordinates; that is, \(\left\{A\mathbf{x}\right\}_{\mathcal{B}} = C_{\mathcal{B}}^{-1}A\mathbf x\) and \(\left\{\mathbf{x}\right\}_{\mathcal{B}} = C_{\mathcal{B}}^{-1}\mathbf x\text{.}\) This says that
\[ C_{\mathcal{B}}^{-1}A\mathbf x = DC_{\mathcal{B}}^{-1}\mathbf x\text{,} \]
for all vectors \(\mathbf x\text{,}\) which means that \(C_{\mathcal{B}}^{-1}A = DC_{\mathcal{B}}^{-1}\) or
\[ A = C_{\mathcal{B}}DC_{\mathcal{B}}^{-1}\text{.} \]
So that the form of this expression stands out more clearly, it is customary to denote the matrix \(C_{\mathcal{B}}\) as \(P\) so that we have \(P = C_{\mathcal{B}}\) and hence
\[ A = PDP^{-1}\text{.} \]
We say that the matrix \(A\) is diagonalizable if there is a diagonal matrix \(D\) and invertible matrix \(P\) such that
\[ A = PDP^{-1}\text{.} \]
This is the sense in which we mean that \(A\) is equivalent to a diagonal matrix \(D\text{.}\) The expression \(A=PDP^{-1}\) says that \(A\text{,}\) expressed in the basis defined by the columns of \(P\text{,}\) has the same geometric effect as \(D\text{,}\) expressed in the standard basis \(\mathbf e_1, \mathbf e_2,\ldots,\mathbf e_n\text{.}\)
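As a quick numerical check of this relationship for the matrices \(A\) and \(D\) from the beginning of the section, here is a small sketch in plain Python (standard library only, rather than Sage). The helper names `mat_mul` and `inverse_2x2` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse_2x2(M):
    """Invert a 2x2 matrix with the adjugate formula; assumes det != 0."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 2], [2, 1]]
D = [[3, 0], [0, -1]]
C = [[1, -1], [1, 1]]          # columns are v1 = (1, 1) and v2 = (-1, 1)
C_inv = inverse_2x2(C)

left = mat_mul(C_inv, A)       # C^{-1} A
right = mat_mul(D, C_inv)      # D C^{-1}
rebuilt = mat_mul(mat_mul(C, D), C_inv)   # C D C^{-1}

print(left == right, rebuilt == A)
```

Exact fractions are used so that the comparisons are not affected by rounding.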
We have now seen the following proposition.
If \(A\) is an \(n\times n\) matrix and there is a basis \(\{\mathbf v_1,\mathbf v_2,\ldots,\mathbf v_n\}\) consisting of eigenvectors of \(A\) having associated eigenvalues \(\lambda_1, \lambda_2, \ldots, \lambda_n\text{,}\) then \(A\) is diagonalizable. That is, we can write \(A=PDP^{-1}\) where \(D\) is the diagonal matrix whose diagonal entries are the eigenvalues of \(A\)
\[ D = \left[\begin{array}{cccc} \lambda_1 & 0 & \ldots & 0 \\ 0 & \lambda_2 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & \lambda_n \\ \end{array}\right] \]
and the matrix \(P = \left[\begin{array}{cccc} \mathbf v_1 & \mathbf v_2 & \ldots & \mathbf v_n \end{array}\right] \text{.}\)
In fact, if we only know that \(A = PDP^{-1}\) where \(P = \left[\begin{array}{cccc} \mathbf v_1 & \mathbf v_2 & \ldots & \mathbf v_n \end{array}\right]\text{,}\) we can say that the vectors \(\mathbf v_j\) are eigenvectors of \(A\) and that the associated eigenvalue is the \(j^{th}\) diagonal entry of \(D\text{.}\)
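One way to see why this holds, sketched here as a short derivation: multiplying \(A=PDP^{-1}\) on the right by \(P\) gives \(AP = PD\text{.}\) Writing \(\lambda_j\) for the \(j^{th}\) diagonal entry of \(D\) and computing each side one column at a time, we find

\[ AP = \left[\begin{array}{cccc} A\mathbf v_1 & A\mathbf v_2 & \ldots & A\mathbf v_n \end{array}\right], \qquad PD = \left[\begin{array}{cccc} \lambda_1\mathbf v_1 & \lambda_2\mathbf v_2 & \ldots & \lambda_n\mathbf v_n \end{array}\right]\text{.} \]

Comparing columns shows that \(A\mathbf v_j = \lambda_j\mathbf v_j\) for each \(j\text{.}\) Since \(P\) is invertible, none of its columns is the zero vector, so each \(\mathbf v_j\) is indeed an eigenvector.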
We will try to find a diagonalization of \(A = \left[\begin{array}{rr} -5 & 6 \\ -3 & 4 \\ \end{array}\right] \text{.}\)
First, we find the eigenvalues of \(A\) by solving the characteristic equation
\[ \det(A-\lambda I) = (-5-\lambda)(4-\lambda)+18 = (-2-\lambda)(1-\lambda) = 0\text{.} \]
This shows that the eigenvalues of \(A\) are \(\lambda_1 = -2\) and \(\lambda_2 = 1\text{.}\)
By constructing \(Nul(A-(-2)I)\text{,}\) we find a basis for \(E_{-2}\) consisting of the vector \(\mathbf v_1 = \twovec{2}{1}\text{.}\) Similarly, a basis for \(E_1\) consists of the vector \(\mathbf v_2 = \twovec{1}{1}\text{.}\) This shows that we can construct a basis \(\mathcal{B}=\{\mathbf v_1,\mathbf v_2\}\) of \(\mathbb R^2\) consisting of eigenvectors of \(A\text{.}\)
We now form the matrices
\[ D = \left[\begin{array}{rr} -2 & 0 \\ 0 & 1 \\ \end{array}\right],\qquad P = \left[\begin{array}{cc} \mathbf v_1 & \mathbf v_2 \end{array}\right] = \left[\begin{array}{rr} 2 & 1 \\ 1 & 1 \\ \end{array}\right] \]
and verify that
\[ PDP^{-1} = \left[\begin{array}{rr} 2 & 1 \\ 1 & 1 \\ \end{array}\right] \left[\begin{array}{rr} -2 & 0 \\ 0 & 1 \\ \end{array}\right] \left[\begin{array}{rr} 1 & -1 \\ -1 & 2 \\ \end{array}\right] = \left[\begin{array}{rr} -5 & 6 \\ -3 & 4 \\ \end{array}\right] = A\text{.} \]
There are, of course, many ways to diagonalize \(A\text{.}\) For instance, we could change the order of the eigenvalues and eigenvectors and write
\[ D = \left[\begin{array}{rr} 1 & 0 \\ 0 & -2 \\ \end{array}\right],\qquad P = \left[\begin{array}{cc} \mathbf v_2 & \mathbf v_1 \end{array}\right] = \left[\begin{array}{rr} 1 & 2 \\ 1 & 1 \\ \end{array}\right]\text{.} \]
If we choose a different basis for the eigenspaces, we will also find a different matrix \(P\) that diagonalizes \(A\text{.}\) The point is that there are many ways in which \(A\) can be written in the form \(A=PDP^{-1}\text{.}\)
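To illustrate that several choices of \(P\) and \(D\) reproduce the same matrix, here is a minimal sketch in plain Python (standard library only). The helper names and the third choice, which rescales \(\mathbf v_1\) to \(2\mathbf v_1\text{,}\) are assumptions made for this illustration.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse_2x2(M):
    """Invert a 2x2 matrix with the adjugate formula; assumes det != 0."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def rebuild(P, D):
    """Return the product P D P^{-1}."""
    return mat_mul(mat_mul(P, D), inverse_2x2(P))

A = [[-5, 6], [-3, 4]]

# Each entry pairs a matrix P of eigenvectors with the matching diagonal D.
choices = {
    "v1, v2":   ([[2, 1], [1, 1]], [[-2, 0], [0, 1]]),
    "v2, v1":   ([[1, 2], [1, 1]], [[1, 0], [0, -2]]),
    "2*v1, v2": ([[4, 1], [2, 1]], [[-2, 0], [0, 1]]),   # rescaled eigenvector
}

results = {name: rebuild(P, D) for name, (P, D) in choices.items()}
for name, product in results.items():
    print(name, product == A)
```

What matters is that the \(j^{th}\) column of \(P\) is an eigenvector whose eigenvalue sits in the \(j^{th}\) diagonal position of \(D\text{.}\)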
We will try to find a diagonalization of \(A = \left[\begin{array}{rr} 0 & 4 \\ -1 & 4 \\ \end{array}\right] \text{.}\)
Once again, we find the eigenvalues by solving the characteristic equation:
\[ \det(A-\lambda I) = -\lambda(4-\lambda) + 4 = (2-\lambda)^2 = 0\text{.} \]
In this case, there is a single eigenvalue \(\lambda=2\text{.}\)
We find a basis for the eigenspace \(E_2\) by describing \(Nul(A-2I)\text{:}\)
\[ A-2I = \left[\begin{array}{rr} -2 & 4 \\ -1 & 2 \\ \end{array}\right] \sim \left[\begin{array}{rr} 1 & -2 \\ 0 & 0 \\ \end{array}\right]\text{.} \]
This shows that the eigenspace \(E_2\) is one-dimensional with \(\mathbf v_1=\twovec{2}{1}\) forming a basis.
In this case, there is not a basis of \(\mathbb R^2\) consisting of eigenvectors of \(A\text{,}\) which tells us that \(A\) is not diagonalizable.
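The same conclusion can be reached computationally. The sketch below, in plain Python, relies on the fact that a \(2\times2\) matrix has characteristic polynomial \(\lambda^2 - (\text{trace})\lambda + \det\text{;}\) the helper `nullity_2x2` is a hypothetical name introduced for this illustration and only handles the \(2\times2\) case.

```python
from fractions import Fraction

def nullity_2x2(M):
    """Dimension of the null space of a 2x2 matrix M."""
    (a, b), (c, d) = M
    if a == b == c == d == 0:
        return 2              # zero matrix: every vector is in the null space
    if a * d - b * c == 0:
        return 1              # nonzero but singular: a line of solutions
    return 0                  # invertible: only the zero vector

A = [[0, 4], [-1, 4]]
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
discriminant = trace ** 2 - 4 * det    # of lambda^2 - trace*lambda + det

lam = Fraction(trace, 2)               # the repeated root when discriminant == 0
shifted = [[A[0][0] - lam, A[0][1]],
           [A[1][0], A[1][1] - lam]]   # the matrix A - lam*I
eigenspace_dim = nullity_2x2(shifted)

print(discriminant, lam, eigenspace_dim)
```

A zero discriminant signals a single repeated eigenvalue, and an eigenspace of dimension one cannot supply the two linearly independent eigenvectors needed for a basis of \(\mathbb R^2\text{.}\)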
Suppose we know that \(A=PDP^{-1}\) where
\[ D = \left[\begin{array}{rr} 2 & 0 \\ 0 & -2 \\ \end{array}\right],\qquad P = \left[\begin{array}{cc} \mathbf v_1 & \mathbf v_2 \end{array}\right] = \left[\begin{array}{rr} 1 & 1 \\ 1 & 2 \\ \end{array}\right]\text{.} \]
In this case, we know that the columns of \(P\) are eigenvectors of \(A\text{.}\) For instance, \(\mathbf v_1 = \twovec{1}{1}\) is an eigenvector of \(A\) with eigenvalue \(\lambda_1 = 2\text{.}\) Also, \(\mathbf v_2 = \twovec{1}{2}\) is an eigenvector with eigenvalue \(\lambda_2=-2\text{.}\)
We can verify this by computing
\[ A = PDP^{-1} = \left[\begin{array}{rr} 6 & -4 \\ 8 & -6 \\ \end{array}\right]\text{.} \]
Then, we can compute that \(A\mathbf v_1 = \twovec{2}{2}=2\mathbf v_1\) and \(A\mathbf v_2 = \twovec{-2}{-4} = -2\mathbf v_2\text{.}\)
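This check can also be carried out in code. The following plain Python sketch (standard library only, with hypothetical helper names) builds \(A\) from \(P\) and \(D\) and then tests each column of \(P\text{.}\)

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse_2x2(M):
    """Invert a 2x2 matrix with the adjugate formula; assumes det != 0."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

P = [[1, 1], [1, 2]]
D = [[2, 0], [0, -2]]
A = mat_mul(mat_mul(P, D), inverse_2x2(P))    # A = P D P^{-1}

images = []
checks = []
for j in range(2):
    v = [P[0][j], P[1][j]]                    # j-th column of P
    lam = D[j][j]                             # j-th diagonal entry of D
    Av = [A[0][0] * v[0] + A[0][1] * v[1],
          A[1][0] * v[0] + A[1][1] * v[1]]
    images.append(Av)
    checks.append(Av == [lam * v[0], lam * v[1]])   # is A v = lam v ?

print(A, images, checks)
```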
Activity 4.3.3.
-
Find a diagonalization of \(A\text{,}\) if one exists, when
\[ A = \left[\begin{array}{rr} 3 & -2 \\ 6 & -5 \\ \end{array}\right]\text{.} \]
-
Can the diagonal matrix
\[ A = \left[\begin{array}{rr} 2 & 0 \\ 0 & -5 \\ \end{array}\right] \]
be diagonalized? If so, explain how to find the matrices \(P\) and \(D\text{.}\)
-
Find a diagonalization of \(A\text{,}\) if one exists, when
\[ A = \left[\begin{array}{rrr} -2 & 0 & 0 \\ 1 & -3& 0 \\ 2 & 0 & -3 \\ \end{array}\right]\text{.} \]
-
Find a diagonalization of \(A\text{,}\) if one exists, when
\[ A = \left[\begin{array}{rrr} -2 & 0 & 0 \\ 1 & -3& 0 \\ 2 & 1 & -3 \\ \end{array}\right]\text{.} \]
-
Suppose that \(A=PDP^{-1}\) where
\[ D = \left[\begin{array}{rr} 3 & 0 \\ 0 & -1 \\ \end{array}\right],\qquad P = \left[\begin{array}{cc} \mathbf v_1 & \mathbf v_2 \end{array}\right] = \left[\begin{array}{rr} 2 & 2 \\ 1 & -1 \\ \end{array}\right]\text{.} \]
- Explain why \(A\) is invertible.
- Find a diagonalization of \(A^{-1}\text{.}\)
- Find a diagonalization of \(A^3\text{.}\)
Powers of a diagonalizable matrix
In several earlier examples, we have been interested in computing powers of a given matrix. For instance, in Activity 4.1.3, we were given the matrix \(A = \left[\begin{array}{rr} 0.8 & 0.6 \\ 0.2 & 0.4 \\ \end{array}\right]\) and an initial vector \(\mathbf x_0=\twovec{1000}{0}\text{,}\) and we wanted to compute
\[ \begin{aligned} \mathbf x_1 & {}={} A\mathbf x_0 \\ \mathbf x_2 & {}={} A\mathbf x_1 = A^2\mathbf x_0 \\ \mathbf x_3 & {}={} A\mathbf x_2 = A^3\mathbf x_0\text{.} \\ \end{aligned} \]
More generally, we would like to find \(\mathbf x_k=A^k\mathbf x_0\) and determine what happens as \(k\) becomes very large. If a matrix \(A\) is diagonalizable, writing \(A=PDP^{-1}\) can help us understand powers of \(A\) easily.
Activity 4.3.4.
-
Let's begin with the diagonal matrix
\[ D = \left[\begin{array}{rr} 2 & 0 \\ 0 & -1 \\ \end{array}\right]\text{.} \]
Find the powers \(D^2\text{,}\) \(D^3\text{,}\) and \(D^4\text{.}\) What is \(D^k\) for a general value of \(k\text{?}\)
- Suppose that \(A\) is a matrix with eigenvector \(\mathbf v\) and associated eigenvalue \(\lambda\text{;}\) that is, \(A\mathbf v = \lambda\mathbf v\text{.}\) By considering \(A^2\mathbf v\text{,}\) explain why \(\mathbf v\) is also an eigenvector of \(A^2\) with eigenvalue \(\lambda^2\text{.}\)
-
Suppose that \(A= PDP^{-1}\) where
\[ D = \left[\begin{array}{rr} 2 & 0 \\ 0 & -1 \\ \end{array}\right]\text{.} \]
Remembering that the columns of \(P\) are eigenvectors of \(A\text{,}\) explain why \(A^2\) is diagonalizable and find a diagonalization of it.
-
Give another explanation of the diagonalizability of \(A^2\) by writing
\[ A^2 = (PDP^{-1})(PDP^{-1}) = PD(P^{-1}P)DP^{-1}\text{.} \]
- In the same way, find a diagonalization of \(A^3\text{,}\) \(A^4\text{,}\) and \(A^k\text{.}\)
- Suppose that \(A\) is a diagonalizable \(2\times2\) matrix with eigenvalues \(\lambda_1 = 0.5\) and \(\lambda_2=0.1\text{.}\) What happens to \(A^k\) as \(k\) becomes very large?
We begin by noting that the eigenvectors of a matrix \(A\) are also eigenvectors of the powers of \(A\text{.}\) For instance, if \(A\mathbf v = \lambda\mathbf v\text{,}\) then
\[ A^2\mathbf v = A(A\mathbf v) = A(\lambda\mathbf v) = \lambda A\mathbf v = \lambda^2\mathbf v\text{.} \]
In this way, we see that \(\mathbf v\) is an eigenvector of \(A^2\) with eigenvalue \(\lambda^2\text{.}\) Furthermore, for any \(k\text{,}\) \(\mathbf v\) is an eigenvector of \(A^k\) with eigenvalue \(\lambda^k\text{.}\)
Now if \(A\) is diagonalizable, we can write \(A=PDP^{-1}\) where the columns of \(P\) are eigenvectors of \(A\) and the diagonal entries of \(D\) are the eigenvalues. If \(D = \left[\begin{array}{rr} \lambda_1 & 0 \\ 0 & \lambda_2 \\ \end{array}\right] \text{,}\) then
\[ A^2 = P\left[\begin{array}{rr} \lambda_1^2 & 0 \\ 0 & \lambda_2^2 \\ \end{array}\right] P^{-1} = PD^2P^{-1}\text{.} \]
We have the same matrix \(P\) in this expression since the eigenvectors of \(A^2\) are also the eigenvectors of \(A\text{.}\)
Another way to see this is to note that
\[ \begin{aligned} A^2 & {}={} (PDP^{-1})(PDP^{-1}) \\ & {}={} PD(P^{-1}P)DP^{-1} \\ & {}={} PDIDP^{-1} \\ & {}={} PDDP^{-1} \\ & {}={} PD^2P^{-1}\text{.} \end{aligned} \]
Similarly, any power of \(A\) is diagonalizable; in particular, \(A^k = PD^kP^{-1}\text{.}\)
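To see the formula \(A^k = PD^kP^{-1}\) in action, the following plain Python sketch compares it with repeated multiplication for the matrix \(A = \left[\begin{array}{rr} -5 & 6 \\ -3 & 4 \\ \end{array}\right]\) diagonalized earlier in this section. The exponent \(k=5\) and the helper names are arbitrary choices made for this illustration.

```python
def mat_mul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def diag_power(D, k):
    """Raise a diagonal matrix to the k-th power, one diagonal entry at a time."""
    n = len(D)
    return [[D[i][i] ** k if i == j else 0 for j in range(n)] for i in range(n)]

A = [[-5, 6], [-3, 4]]
P = [[2, 1], [1, 1]]           # eigenvectors of A as columns
P_inv = [[1, -1], [-1, 2]]     # inverse of P
D = [[-2, 0], [0, 1]]          # matching eigenvalues
k = 5

via_diagonalization = mat_mul(mat_mul(P, diag_power(D, k)), P_inv)

by_repeated_products = [[1, 0], [0, 1]]       # start from the identity
for _ in range(k):
    by_repeated_products = mat_mul(by_repeated_products, A)

print(via_diagonalization, by_repeated_products)
```

The diagonalization route needs only the powers \(\lambda_1^k\) and \(\lambda_2^k\) and two matrix products, no matter how large \(k\) is.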
In the next section, we will see some important uses of our ability to deal with powers in this way. Until then, consider the case where \(D = \left[\begin{array}{rr} 0.5 & 0 \\ 0 & 0.1 \\ \end{array}\right]\) so that \(D^k = \left[\begin{array}{rr} 0.5^k & 0 \\ 0 & 0.1^k \\ \end{array}\right] \text{.}\) As \(k\) becomes very large, the diagonal entries become increasingly close to zero. This means that \(D^k\) becomes increasingly close to the zero matrix \(\left[\begin{array}{rr} 0 & 0 \\ 0 & 0 \\ \end{array}\right]\) as does \(A^k = PD^kP^{-1}\text{.}\) In other words, no matter what vector \(\mathbf x_0\) we begin with, the vectors \(A^k\mathbf x_0\) become increasingly close to \(\mathbf{0} \text{.}\)
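The decay can be observed numerically. In the sketch below, written in plain Python, the eigenvalues \(0.5\) and \(0.1\) come from the discussion above, while the matrix \(P\) is a hypothetical choice made only so that there is a concrete \(A^k = PD^kP^{-1}\) to examine.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

P = [[2, 1], [1, 1]]           # hypothetical eigenvector matrix (an assumption)
P_inv = [[1, -1], [-1, 2]]     # its inverse
eigenvalues = [Fraction(1, 2), Fraction(1, 10)]

def power_of_A(k):
    """Return A^k computed as P D^k P^{-1}."""
    D_k = [[eigenvalues[0] ** k, 0], [0, eigenvalues[1] ** k]]
    return mat_mul(mat_mul(P, D_k), P_inv)

largest_entry = {}
for k in (1, 5, 10, 20):
    A_k = power_of_A(k)
    # Size of the largest entry of A^k in absolute value.
    largest_entry[k] = max(abs(entry) for row in A_k for entry in row)
    print(k, float(largest_entry[k]))
```

Every entry of \(A^k\) is a combination of \(0.5^k\) and \(0.1^k\text{,}\) so the entries shrink toward zero as \(k\) grows, whichever invertible \(P\) is used.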
Similarity and complex eigenvalues
We have been interested in diagonalizing a matrix \(A\) because doing so relates a matrix \(A\) to a simpler diagonal matrix \(D\text{.}\) If we write \(A=PDP^{-1}\text{,}\) we see that multiplying a vector by \(A\) in the coordinates defined by the columns of \(P\) is the same as multiplying by \(D\) in standard coordinates. Under this change of coordinates, \(A\) and \(D\) have the same effect on vectors.
More generally, if we have two matrices \(A\) and \(B\) such that \(A=PBP^{-1}\text{,}\) we may regard multiplication by \(A\) and \(B\) as having the same effect on vectors under the change of coordinates defined by the columns of \(P\text{.}\) That is, if \(\mathcal{B}\) is the basis formed by the columns of \(P\text{,}\) then \(\left\{A\mathbf{x}\right\}_{\mathcal{B}} = B\left\{\mathbf{x}\right\}_{\mathcal{B}}\text{.}\) This leads to the following definition.
We say that \(A\) is similar to \(B\) if there is an invertible matrix \(P\) such that \(A = PBP^{-1}\text{.}\)
Notice that a matrix is diagonalizable if and only if it is similar to a diagonal matrix. We have, however, seen several examples of a matrix \(A\) that is not diagonalizable. In this case, it is natural to ask if there is some simpler matrix that is similar to \(A\text{.}\)
Let's consider the matrix \(A = \left[\begin{array}{rr} -2 & 2 \\ -5 & 4 \\ \end{array}\right]\) whose characteristic equation is
\[ \det(A-\lambda I) = (-2-\lambda)(4-\lambda)+10 = 2 - 2\lambda + \lambda^2 = 0\text{.} \]
Applying the quadratic formula to find the eigenvalues, we obtain
\[ \lambda = \frac{2\pm\sqrt{(-2)^2-4\cdot1\cdot2}}{2}=1\pm i\text{.} \]
Here we see that the matrix \(A\) has two complex eigenvalues and is therefore not diagonalizable.
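The quadratic-formula computation can be reproduced with Python's standard `cmath` module, which provides complex square roots. This is only a sketch for the \(2\times2\) case, where the characteristic polynomial is \(\lambda^2 - (\text{trace})\lambda + \det\text{;}\) the variable names are illustrative.

```python
import cmath

A = [[-2, 2], [-5, 4]]
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]

# For a 2x2 matrix, det(A - lambda*I) = lambda^2 - trace*lambda + det.
discriminant = trace ** 2 - 4 * det
root = cmath.sqrt(discriminant)     # complex square root; accepts negative input

lambda_plus = (trace + root) / 2
lambda_minus = (trace - root) / 2

print(discriminant, lambda_plus, lambda_minus)
```

A negative discriminant is exactly the situation in which the two eigenvalues form a pair of complex conjugates.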
In case a matrix \(A\) has complex eigenvalues, we will find a simpler matrix \(C\) that is similar to \(A\text{.}\) In particular, if \(A\) has an eigenvalue \(\lambda = a+bi\text{,}\) then \(A\) is similar to \(C=\left[\begin{array}{rr} a & -b \\ b & a \\ \end{array}\right] \text{.}\)
The next activity shows that \(C\) has a simple geometric effect on \(\mathbb R^2\text{.}\) First, however, we will rewrite \(C\) in polar coordinates, as shown in the figure. We form the point \((a,b)\text{,}\) which defines \(r\text{,}\) the distance from the origin, and \(\theta\text{,}\) the angle formed with the positive horizontal axis. We then have
\[ \begin{aligned} a & {}={} r\cos\theta \\ b & {}={} r\sin\theta\text{.} \\ \end{aligned} \]
Notice that \(r=\sqrt{a^2+b^2}\text{.}\)
Activity 4.3.5.
-
We will rewrite \(C\) in terms of \(r\) and \(\theta\text{.}\) Explain why
\[ \left[\begin{array}{rr} a & -b \\ b & a \\ \end{array}\right] = \left[\begin{array}{rr} r\cos\theta & -r\sin\theta \\ r\sin\theta & r\cos\theta \\ \end{array}\right] = \left[\begin{array}{rr} r & 0 \\ 0 & r \\ \end{array}\right] \left[\begin{array}{rr} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \\ \end{array}\right]\text{.} \]
- Explain why \(C\) has the geometric effect of rotating vectors by \(\theta\) and stretching them by a factor of \(r\text{.}\)
-
Let's now consider the matrix \(A\) from Example 4.3.8:
\[ A = \left[\begin{array}{rr} -2 & 2 \\ -5 & 4 \\ \end{array}\right] \]
whose eigenvalues are \(\lambda_1 = 1+i\) and \(\lambda_2 = 1-i\text{.}\) We will choose to focus on one of the eigenvalues \(\lambda_1 = a+bi = 1+i\text{.}\)
Form the matrix \(C\) using these values of \(a\) and \(b\text{.}\) Then rewrite the point \((a,b)\) in polar coordinates by identifying the values of \(r\) and \(\theta\text{.}\) Explain the geometric effect of multiplying vectors by \(C\text{.}\)
-
Suppose that \(P=\left[\begin{array}{rr} 1 & 1 \\ 2 & 1 \\ \end{array}\right] \text{.}\) Verify that \(A = PCP^{-1}\text{.}\)
- Explain why \(A^k = PC^kP^{-1}\text{.}\)
- We formed the matrix \(C\) by choosing the eigenvalue \(\lambda_1=1+i\text{.}\) Suppose we had instead chosen \(\lambda_2 = 1-i\text{.}\) Form the matrix \(C'\) and use polar coordinates to describe the geometric effect of \(C'\text{.}\)
- Using the matrix \(P' = \left[\begin{array}{rr} 1 & -1 \\ 2 & -1 \\ \end{array}\right] \text{,}\) show that \(A = P'C'P'^{-1}\text{.}\)
If the \(2\times2\) matrix \(A\) has a complex eigenvalue \(\lambda = a + bi\text{,}\) this activity demonstrates the fact that \(A\) is similar to the matrix \(C = \left[\begin{array}{rr} a & -b \\ b & a \\ \end{array}\right] \text{.}\) When we consider the matrix \(A = \left[\begin{array}{rr} -2 & 2 \\ -5 & 4 \\ \end{array}\right] \text{,}\) we find the complex eigenvalue \(\lambda=1+i\text{,}\) which leads to the matrix
\[ C = \left[\begin{array}{rr} 1 & -1 \\ 1 & 1 \\ \end{array}\right] = \left[\begin{array}{rr} \sqrt{2} & 0 \\ 0 & \sqrt{2} \\ \end{array}\right] \left[\begin{array}{rr} \cos(45^\circ) & -\sin(45^\circ) \\ \sin(45^\circ) & \cos(45^\circ) \\ \end{array}\right]\text{.} \]
The matrix has the geometric effect of rotating vectors by \(45^\circ\) and stretching them by a factor of \(\sqrt{2}\text{,}\) as shown in the figure.
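The polar form can be computed directly from \(a\) and \(b\text{.}\) The sketch below uses Python's standard `math` module; `math.hypot` returns \(\sqrt{a^2+b^2}\) and `math.atan2` returns the angle of the point \((a,b)\) in radians. The variable names are illustrative.

```python
import math

a, b = 1, 1                    # from the eigenvalue 1 + i
r = math.hypot(a, b)           # distance from the origin to the point (a, b)
theta = math.atan2(b, a)       # angle with the positive horizontal axis, in radians

C = [[a, -b], [b, a]]

# The same matrix written as a rotation by theta scaled by r.
scaled_rotation = [[r * math.cos(theta), -r * math.sin(theta)],
                   [r * math.sin(theta),  r * math.cos(theta)]]

max_error = max(abs(C[i][j] - scaled_rotation[i][j])
                for i in range(2) for j in range(2))

print(r, math.degrees(theta), max_error)
```

Because these are floating-point computations, the two forms of \(C\) agree only up to rounding error, which is why the comparison uses a small tolerance instead of exact equality.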
As we saw in the activity, our original matrix \(A\) is similar to \(C\text{.}\) That is, we saw that there is a matrix \(P\) such that \(A=PCP^{-1}\text{.}\) This means that, when expressed in the coordinates defined by the columns of \(P\text{,}\) multiplying a vector by \(A\) is equivalent to multiplying by \(C\text{;}\) that is, if \(\mathcal{B}\) is the basis formed by the columns of \(P\text{,}\) then \(\left\{A\mathbf{x}\right\}_{\mathcal{B}} = C\left\{\mathbf{x}\right\}_{\mathcal{B}}\text{.}\)
Had we chosen the other eigenvalue \(\lambda_2 = 1-i\text{,}\) we would have formed the matrix
\[ C' = \left[\begin{array}{rr} 1 & 1 \\ -1 & 1 \\ \end{array}\right] = \left[\begin{array}{rr} \sqrt{2} & 0 \\ 0 & \sqrt{2} \\ \end{array}\right] \left[\begin{array}{rr} \cos(-45^\circ) & -\sin(-45^\circ) \\ \sin(-45^\circ) & \cos(-45^\circ) \\ \end{array}\right]\text{.} \]
In other words, this matrix \(C'\) rotates vectors by \(-45^\circ\) and stretches them by a factor of \(\sqrt{2}\text{.}\) The original matrix \(A\) is also similar to \(C'\text{.}\)
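Both similarity relations can be confirmed with exact arithmetic. The following plain Python sketch uses the matrices \(P\) and \(P'\) given in Activity 4.3.5; the helper names are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse_2x2(M):
    """Invert a 2x2 matrix with the adjugate formula; assumes det != 0."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def conjugate(P, M):
    """Return the product P M P^{-1}."""
    return mat_mul(mat_mul(P, M), inverse_2x2(P))

A = [[-2, 2], [-5, 4]]

C = [[1, -1], [1, 1]]              # built from the eigenvalue 1 + i
P = [[1, 1], [2, 1]]

C_prime = [[1, 1], [-1, 1]]        # built from the eigenvalue 1 - i
P_prime = [[1, -1], [2, -1]]

from_C = conjugate(P, C)
from_C_prime = conjugate(P_prime, C_prime)

print(from_C == A, from_C_prime == A)
```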
Depending on which complex eigenvalue we choose, we find a matrix \(C\) that performs either a counterclockwise or a clockwise rotation. In our future uses, we will focus on \(r\text{,}\) the stretching factor, and not be concerned about the direction of the rotation.
Summary
The ideas in this section demonstrate how the eigenvalues and eigenvectors of a matrix \(A\) can provide us with a new coordinate system in which multiplying by \(A\) reduces to a simpler operation.
- We said that \(A\) is diagonalizable if we can write \(A = PDP^{-1}\) where \(D\) is a diagonal matrix. The columns of \(P\) consist of eigenvectors of \(A\) and the diagonal entries of \(D\) are the associated eigenvalues.
- An \(n\times n\) matrix \(A\) is diagonalizable if and only if there is a basis of \(\mathbb R^n\) consisting of eigenvectors of \(A\text{.}\)
- We said that \(A\) and \(B\) are similar if there is an invertible matrix \(P\) such that \(A=PBP^{-1}\text{.}\) In this case, \(A^k = PB^kP^{-1}\text{.}\)
- If \(A\) is a \(2\times2\) matrix with complex eigenvalue \(\lambda = a+bi\text{,}\) then \(A\) is similar to \(C = \left[\begin{array}{rr} a & -b \\ b & a \\ \end{array} \right] \text{.}\) Writing the point \((a,b)\) in polar coordinates \(r\) and \(\theta\text{,}\) we see that \(C\) rotates vectors through an angle \(\theta\) and stretches them by a factor of \(r=\sqrt{a^2+b^2}\text{.}\)
Exercises
Determine whether the following matrices are diagonalizable. If so, find matrices \(D\) and \(P\) such that \(A=PDP^{-1}\text{.}\)
- \(A = \left[\begin{array}{rr} -2 & -2 \\ -2 & 1 \\ \end{array}\right] \text{.}\)
- \(A = \left[\begin{array}{rr} -1 & 1 \\ -1 & -3 \\ \end{array}\right] \text{.}\)
- \(A = \left[\begin{array}{rr} 3 & -4 \\ 2 & -1 \\ \end{array}\right] \text{.}\)
- \(A = \left[\begin{array}{rrr} 1 & 0 & 0 \\ 2 & -2 & 0 \\ 0 & 1 & 4 \\ \end{array}\right] \text{.}\)
- \(A = \left[\begin{array}{rrr} 1 & 2 & 2 \\ 2 & 1 & 2 \\ 2 & 2 & 1 \\ \end{array}\right] \text{.}\)
Determine whether the following matrices have complex eigenvalues. If so, find the matrix \(C\) such that \(A = PCP^{-1}\text{.}\)
- \(A = \left[\begin{array}{rr} -2 & -2 \\ -2 & 1 \\ \end{array}\right] \text{.}\)
- \(A = \left[\begin{array}{rr} -1 & 1 \\ -1 & -3 \\ \end{array}\right] \text{.}\)
- \(A = \left[\begin{array}{rr} 3 & -4 \\ 2 & -1 \\ \end{array}\right] \text{.}\)
Determine whether the following statements are true or false and provide a justification for your response.
- If \(A\) is invertible, then \(A\) is diagonalizable.
- If \(A\) and \(B\) are similar and \(A\) is invertible, then \(B\) is also invertible.
- If \(A\) is a diagonalizable \(n\times n\) matrix, then there is a basis of \(\mathbb R^n\) consisting of eigenvectors of \(A\text{.}\)
- If \(A\) is diagonalizable, then \(A^{10}\) is also diagonalizable.
- If \(A\) is diagonalizable, then \(A\) is invertible.
Provide a justification for your response to the following questions.
- If \(A\) is a \(3\times3\) matrix having eigenvalues \(\lambda = 2, 3, -4\text{,}\) can you guarantee that \(A\) is diagonalizable?
- If \(A\) is a \(2\times 2\) matrix with a complex eigenvalue, can you guarantee that \(A\) is diagonalizable?
- If \(A\) is similar to the matrix \(B = \left[\begin{array}{rrr} -5 & 0 & 0 \\ 0 & -5 & 0 \\ 0 & 0 & 3 \\ \end{array}\right] \text{,}\) is \(A\) diagonalizable?
- What matrices are similar to the identity matrix?
- If \(A\) is a diagonalizable \(2\times2\) matrix with a single eigenvalue \(\lambda = 4\text{,}\) what is \(A\text{?}\)
Describe the geometric effect that the following matrices have on \(\mathbb R^2\text{:}\)
- \(\displaystyle A = \left[\begin{array}{rr} 2 & 0 \\ 0 & 2 \\ \end{array}\right]\)
- \(\displaystyle A = \left[\begin{array}{rr} 4 & 2 \\ 0 & 4 \\ \end{array}\right]\)
- \(\displaystyle A = \left[\begin{array}{rr} 3 & -6 \\ 6 & 3 \\ \end{array}\right]\)
- \(\displaystyle A = \left[\begin{array}{rr} 4 & 0 \\ 0 & -2 \\ \end{array}\right]\)
- \(\displaystyle A = \left[\begin{array}{rr} 1 & 3 \\ 3 & 1 \\ \end{array}\right]\)
We say that \(A\) is similar to \(B\) if there is an invertible matrix \(P\) such that \(A = PBP^{-1}\text{.}\)
- If \(A\) is similar to \(B\text{,}\) explain why \(B\) is similar to \(A\text{.}\)
- If \(A\) is similar to \(B\) and \(B\) is similar to \(C\text{,}\) explain why \(A\) is similar to \(C\text{.}\)
- If \(A\) is similar to \(B\) and \(B\) is diagonalizable, explain why \(A\) is diagonalizable.
- If \(A\) and \(B\) are similar, explain why \(A\) and \(B\) have the same characteristic polynomial; that is, explain why \(\det(A-\lambda I) = \det(B-\lambda I)\text{.}\)
- If \(A\) and \(B\) are similar, explain why \(A\) and \(B\) have the same eigenvalues.
Suppose that \(A = PDP^{-1}\) where
\[ D = \left[\begin{array}{rr} 1 & 0 \\ 0 & 0 \\ \end{array}\right],\qquad P = \left[\begin{array}{rr} 1 & -2 \\ 2 & 1 \\ \end{array}\right]\text{.} \]
- Explain the geometric effect that \(D\) has on vectors in \(\mathbb R^2\text{.}\)
- Explain the geometric effect that \(A\) has on vectors in \(\mathbb R^2\text{.}\)
- What can you say about \(A^2\) and other powers of \(A\text{?}\)
- Is \(A\) invertible?
When \(A\) is a \(2\times2\) matrix with a complex eigenvalue \(\lambda = a+bi\text{,}\) we have said that there is a matrix \(P\) such that \(A=PCP^{-1}\) where \(C=\left[\begin{array}{rr} a & -b \\ b & a \\ \end{array}\right] \text{.}\) In this exercise, we will learn how to find the matrix \(P\text{.}\) As an example, we will consider the matrix \(A = \left[\begin{array}{rr} 2 & 2 \\ -1 & 4 \\ \end{array}\right] \text{.}\)
- Show that the eigenvalues of \(A\) are complex.
- Choose one of the complex eigenvalues \(\lambda=a+bi\) and construct the usual matrix \(C\text{.}\)
- Using the same eigenvalue, we will find an eigenvector \(\mathbf v\) where the entries of \(\mathbf v\) are complex numbers. As always, we will describe \(Nul(A-\lambda I)\) by constructing the matrix \(A-\lambda I\) and finding its reduced row echelon form. In doing so, we will necessarily need to use complex arithmetic.
- We have now found a complex eigenvector \(\mathbf v\text{.}\) Write \(\mathbf v = \mathbf v_1 - i \mathbf v_2\) to identify vectors \(\mathbf v_1\) and \(\mathbf v_2\) having real entries.
- Construct the matrix \(P = \left[\begin{array}{rr} \mathbf v_1 & \mathbf v_2 \end{array}\right]\) and verify that \(A=PCP^{-1}\text{.}\)
For each of the following matrices, sketch the vector \(\mathbf x = \twovec{1}{0}\) and powers \(A^k\mathbf x\) for \(k=1,2,3,4\text{.}\)
-
\(A = \left[\begin{array}{rr} 0 & -1.4 \\ 1.4 & 0 \\ \end{array}\right] \text{.}\)
-
\(A = \left[\begin{array}{rr} 0 & -0.8 \\ 0.8 & 0 \\ \end{array}\right] \text{.}\)
-
\(A = \left[\begin{array}{rr} 0 & -1 \\ 1 & 0 \\ \end{array}\right] \text{.}\)
-
Consider a matrix of the form \(C=\left[\begin{array}{rr} a & -b \\ b & a \\ \end{array}\right]\) with \(r=\sqrt{a^2+b^2}\text{.}\) What happens to \(C^k\mathbf x\) as \(k\) becomes very large when
- \(r \lt 1\text{.}\)
- \(r = 1\text{.}\)
- \(r \gt 1\text{.}\)
For each of the following matrices and vectors, sketch the vector \(\mathbf x\) along with \(A^k\mathbf x\) for \(k=1,2,3,4\text{.}\)
-
\[ \begin{aligned} A & {}={} \left[\begin{array}{rr} 1.4 & 0 \\ 0 & 0.7 \\ \end{array}\right] \\ \\ \mathbf x & {}={} \twovec{1}{2}\text{.} \end{aligned} \]
-
\[ \begin{aligned} A & {}={} \left[\begin{array}{rr} 0.6 & 0 \\ 0 & 0.9 \\ \end{array}\right] \\ \\ \mathbf x & {}={} \twovec{4}{3}\text{.} \end{aligned} \]
-
\[ \begin{aligned} A & {}={} \left[\begin{array}{rr} 1.2 & 0 \\ 0 & 1.4 \\ \end{array}\right] \\ \\ \mathbf x& {}={}\twovec{2}{1}\text{.} \end{aligned} \]
-
\[ \begin{aligned} A & {}={} \left[\begin{array}{rr} 0.95 & 0.25 \\ 0.25 & 0.95 \\ \end{array}\right] \\ \\ \mathbf x& {}={}\twovec{3}{0}\text{.} \end{aligned} \]
Find the eigenvalues and eigenvectors of \(A\) to create your sketch.
- If \(A\) is a \(2\times2\) matrix with eigenvalues \(\lambda_1=0.7\) and \(\lambda_2=0.5\) and \(\mathbf x\) is any vector, what happens to \(A^k\mathbf x\) when \(k\) becomes very large?