7.2: Diagonalization

Outcomes
  1. Determine when it is possible to diagonalize a matrix.
  2. When possible, diagonalize a matrix.

Similarity and Diagonalization

We begin this section by recalling the definition of similar matrices. Recall that if A,B are two n×n matrices, then they are similar if and only if there exists an invertible matrix P such that \[A = P^{-1}BP\]

In this case we write \(A \sim B\). The concept of similarity is an example of an equivalence relation.

Lemma 7.2.1: Similarity is an Equivalence Relation

Similarity is an equivalence relation, i.e. for n×n matrices A,B, and C,

  1. \(A \sim A\) (reflexive)
  2. If \(A \sim B\), then \(B \sim A\) (symmetric)
  3. If \(A \sim B\) and \(B \sim C\), then \(A \sim C\) (transitive)
Proof

It is clear that \(A \sim A\), taking \(P = I\).

Now, if \(A \sim B\), then for some invertible P, \(A = P^{-1}BP\), and so \(PAP^{-1} = B\). But then \((P^{-1})^{-1}AP^{-1} = B\), which shows that \(B \sim A\).

Now suppose \(A \sim B\) and \(B \sim C\). Then there exist invertible matrices P,Q such that \(A = P^{-1}BP\) and \(B = Q^{-1}CQ\). Then \[A = P^{-1}(Q^{-1}CQ)P = (QP)^{-1}C(QP),\] showing that A is similar to C.

Another important concept necessary to this section is the trace of a matrix. Consider the definition.

Definition 7.2.1: Trace of a Matrix

If \(A = [a_{ij}]\) is an n×n matrix, then the trace of A is \[\mathrm{trace}(A) = \sum_{i=1}^{n} a_{ii}.\]

In words, the trace of a matrix is the sum of the entries on the main diagonal.

Lemma 7.2.2: Properties of Trace

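For n×n matrices A and B, and any \(k \in \mathbb{R}\),

  1. \(\mathrm{trace}(A+B) = \mathrm{trace}(A) + \mathrm{trace}(B)\)
  2. \(\mathrm{trace}(kA) = k \cdot \mathrm{trace}(A)\)
  3. \(\mathrm{trace}(AB) = \mathrm{trace}(BA)\)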
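The three trace properties can be spot-checked numerically. The following is a minimal sketch in plain Python; the matrices A and B, the scalar k, and the helper names `matmul` and `trace` are chosen here for illustration and are not from the text. Note that the third property holds even though AB and BA themselves differ.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def trace(M):
    """Sum of the main-diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

# Illustrative matrices and scalar (assumptions, not taken from the text).
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
k = 3

A_plus_B = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
kA = [[k * a for a in row] for row in A]

print(trace(A_plus_B) == trace(A) + trace(B))        # property 1
print(trace(kA) == k * trace(A))                     # property 2
print(matmul(A, B) == matmul(B, A))                  # the products themselves differ
print(trace(matmul(A, B)) == trace(matmul(B, A)))    # property 3
```

A check on one pair of matrices is of course not a proof; it only illustrates what the lemma asserts.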

The following theorem includes a reference to the characteristic polynomial of a matrix. Recall that for any n×n matrix A, the characteristic polynomial of A is \(c_A(x) = \det(xI - A)\).

Theorem 7.2.1: Properties of Similar Matrices

If A and B are n×n matrices and \(A \sim B\), then

  1. \(\det(A) = \det(B)\)
  2. \(\mathrm{rank}(A) = \mathrm{rank}(B)\)
  3. \(\mathrm{trace}(A) = \mathrm{trace}(B)\)
  4. \(c_A(x) = c_B(x)\)
  5. A and B have the same eigenvalues
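The text states this theorem without proof. As a sketch of one standard argument for items 1, 3, and 4 (not necessarily the author's), write A in terms of B using an invertible matrix P, then use the product rule for determinants and the third trace property from Lemma 7.2.2:

```latex
\begin{align*}
\det(A) &= \det(P^{-1}BP) = \det(P^{-1})\det(B)\det(P) = \det(B), \\
\mathrm{trace}(A) &= \mathrm{trace}\left(P^{-1}(BP)\right) = \mathrm{trace}\left((BP)P^{-1}\right) = \mathrm{trace}(B), \\
c_A(x) &= \det(xI - P^{-1}BP) = \det\left(P^{-1}(xI - B)P\right) = \det(xI - B) = c_B(x).
\end{align*}
```

Item 5 then follows from item 4, since the eigenvalues are the roots of the characteristic polynomial.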

We now proceed to the main concept of this section. When a matrix is similar to a diagonal matrix, the matrix is said to be diagonalizable. We define a diagonal matrix D as a matrix containing a zero in every entry except those on the main diagonal. More precisely, if \(d_{ij}\) is the \(ij\)th entry of a diagonal matrix D, then \(d_{ij} = 0\) unless \(i = j\). Such matrices look like the following. \[D = \begin{bmatrix} * & & 0 \\ & \ddots & \\ 0 & & * \end{bmatrix}\] where \(*\) is a number which might not be zero.

The following is the formal definition of a diagonalizable matrix.

Definition 7.2.2: Diagonalizable

Let A be an n×n matrix. Then A is said to be diagonalizable if there exists an invertible matrix P such that \[P^{-1}AP = D\] where D is a diagonal matrix.

Notice that the above equation can be rearranged as \(A = PDP^{-1}\). Suppose we wanted to compute \(A^{100}\). By diagonalizing A first, it suffices to compute \((PDP^{-1})^{100}\), which reduces to \(PD^{100}P^{-1}\) because each interior product \(P^{-1}P\) cancels. This last computation is much simpler than computing \(A^{100}\) directly, since \(D^{100}\) is found by raising each diagonal entry of D to the 100th power. While this process is described in detail later, it provides motivation for diagonalization.
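To make the motivation concrete, here is a minimal sketch in plain Python that compares the two ways of computing a high power. It borrows the matrices A, P, and the inverse of P from Example 7.2.1 later in this section; the helper names (`matmul`, `matpow`, `power_via_diagonalization`) are illustrative choices, and exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matpow(M, k):
    """Naive power: repeated multiplication, for k >= 1."""
    result = M
    for _ in range(k - 1):
        result = matmul(result, M)
    return result

# Matrices taken from Example 7.2.1 (A = P D P^{-1} with D = diag(2, 2, 6)).
A = [[2, 0, 0], [1, 4, -1], [-2, -4, 4]]
P = [[-2, 1, 0], [1, 0, 1], [0, 1, -2]]
P_inv = [[F(-1, 4), F(1, 2), F(1, 4)],
         [F(1, 2),  F(1),    F(1, 2)],
         [F(1, 4),  F(1, 2), F(-1, 4)]]
eigenvalues = [2, 2, 6]

def power_via_diagonalization(k):
    """Compute A^k as P D^k P^{-1}; D^k only needs scalar powers."""
    D_k = [[eigenvalues[i] ** k if i == j else 0 for j in range(3)]
           for i in range(3)]
    return matmul(matmul(P, D_k), P_inv)

print(power_via_diagonalization(100) == matpow(A, 100))
```

The diagonal route needs two matrix products regardless of the exponent, whereas the naive route needs one product per increment of the exponent.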

Diagonalizing a Matrix

The most important theorem about diagonalizability is the following major result.

Theorem 7.2.2: Eigenvectors and Diagonalizable Matrices

An n×n matrix A is diagonalizable if and only if there is an invertible matrix P given by \[P = \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix}\] where the \(X_k\) are eigenvectors of A.

Moreover if A is diagonalizable, the corresponding eigenvalues of A are the diagonal entries of the diagonal matrix D.

Proof

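Suppose P is given as above as an invertible matrix whose columns are eigenvectors of A. Then \(P^{-1}\) is of the form \[P^{-1} = \begin{bmatrix} W_1^T \\ W_2^T \\ \vdots \\ W_n^T \end{bmatrix}\] where \(W_k^T X_j = \delta_{kj}\), which is the Kronecker symbol defined by \[\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}\]

Then \[P^{-1}AP = \begin{bmatrix} W_1^T \\ W_2^T \\ \vdots \\ W_n^T \end{bmatrix} \begin{bmatrix} AX_1 & AX_2 & \cdots & AX_n \end{bmatrix} = \begin{bmatrix} W_1^T \\ W_2^T \\ \vdots \\ W_n^T \end{bmatrix} \begin{bmatrix} \lambda_1 X_1 & \lambda_2 X_2 & \cdots & \lambda_n X_n \end{bmatrix} = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}\]

Conversely, suppose A is diagonalizable so that \(P^{-1}AP = D\). Let \[P = \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix}\] where the columns are the \(X_k\) and \[D = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}\] Then \[AP = PD = \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix} \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}\] and so \[\begin{bmatrix} AX_1 & AX_2 & \cdots & AX_n \end{bmatrix} = \begin{bmatrix} \lambda_1 X_1 & \lambda_2 X_2 & \cdots & \lambda_n X_n \end{bmatrix}\] showing the \(X_k\) are eigenvectors of A and the \(\lambda_k\) are the corresponding eigenvalues.

Notice that because the matrix P defined above is invertible it follows that the set of eigenvectors of A, \(\{X_1, X_2, \ldots, X_n\}\), forms a basis of \(\mathbb{R}^n\).

We demonstrate the concept given in the above theorem in the next example. Note that not only are the columns of the matrix P formed by eigenvectors, but P must be invertible, so its columns must be linearly independent. We achieve this by using basic eigenvectors for the columns of P.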

Example 7.2.1: Diagonalize a Matrix

Let \[A = \begin{bmatrix} 2 & 0 & 0 \\ 1 & 4 & -1 \\ -2 & -4 & 4 \end{bmatrix}\] Find an invertible matrix P and a diagonal matrix D such that \(P^{-1}AP = D\).

Solution

By Theorem 7.2.2 we use the eigenvectors of A as the columns of P, and the corresponding eigenvalues of A as the diagonal entries of D.

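First, we will find the eigenvalues of A. To do so, we solve \(\det(\lambda I - A) = 0\) as follows. \[\det\left(\lambda \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 2 & 0 & 0 \\ 1 & 4 & -1 \\ -2 & -4 & 4 \end{bmatrix}\right) = 0\]

This computation is left as an exercise, and you should verify that the eigenvalues are \(\lambda_1 = 2, \lambda_2 = 2\), and \(\lambda_3 = 6\).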
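For readers who want to check their work on this exercise, one possible route is a cofactor expansion along the first row, which has only one nonzero entry. This is a sketch of that route, not the only way to carry out the computation:

```latex
\begin{align*}
\det(\lambda I - A)
  &= \det \begin{bmatrix} \lambda - 2 & 0 & 0 \\ -1 & \lambda - 4 & 1 \\ 2 & 4 & \lambda - 4 \end{bmatrix} \\
  &= (\lambda - 2)\left[(\lambda - 4)(\lambda - 4) - (1)(4)\right] \\
  &= (\lambda - 2)(\lambda^2 - 8\lambda + 12) \\
  &= (\lambda - 2)(\lambda - 2)(\lambda - 6)
\end{align*}
```

The roots are 2 (with multiplicity two) and 6.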

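Next, we need to find the eigenvectors. We first find the eigenvectors for \(\lambda_1, \lambda_2 = 2\). Solving \((2I - A)X = 0\) to find the eigenvectors, we find that the eigenvectors are \[t \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} + s \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}\] where t,s are scalars. Hence there are two basic eigenvectors which are given by \[X_1 = \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}, \quad X_2 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}\]

You can verify that the basic eigenvector for \(\lambda_3 = 6\) is \[X_3 = \begin{bmatrix} 0 \\ 1 \\ -2 \end{bmatrix}\]

Then, we construct the matrix P as follows. \[P = \begin{bmatrix} X_1 & X_2 & X_3 \end{bmatrix} = \begin{bmatrix} -2 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & -2 \end{bmatrix}\] That is, the columns of P are the basic eigenvectors of A. Then, you can verify that \[P^{-1} = \begin{bmatrix} -\frac{1}{4} & \frac{1}{2} & \frac{1}{4} \\ \frac{1}{2} & 1 & \frac{1}{2} \\ \frac{1}{4} & \frac{1}{2} & -\frac{1}{4} \end{bmatrix}\] Thus, \[P^{-1}AP = \begin{bmatrix} -\frac{1}{4} & \frac{1}{2} & \frac{1}{4} \\ \frac{1}{2} & 1 & \frac{1}{2} \\ \frac{1}{4} & \frac{1}{2} & -\frac{1}{4} \end{bmatrix} \begin{bmatrix} 2 & 0 & 0 \\ 1 & 4 & -1 \\ -2 & -4 & 4 \end{bmatrix} \begin{bmatrix} -2 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & -2 \end{bmatrix} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 6 \end{bmatrix}\]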

You can see that the result here is a diagonal matrix where the entries on the main diagonal are the eigenvalues of A. We expected this based on Theorem 7.2.2. Notice that eigenvalues on the main diagonal must be in the same order as the corresponding eigenvectors in P.
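The verification in this example, and the remark about ordering, can be sketched in plain Python with exact fractions. The matrices are the ones from the example; the helper `matmul` and the variable names `order`, `P2`, and `P2_inv` are illustrative. Reordering the columns of P to \(X_3, X_1, X_2\) reorders the rows of the inverse in the same way, and the diagonal of D follows.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[2, 0, 0], [1, 4, -1], [-2, -4, 4]]
P = [[-2, 1, 0], [1, 0, 1], [0, 1, -2]]          # columns X1, X2, X3
P_inv = [[F(-1, 4), F(1, 2), F(1, 4)],
         [F(1, 2),  F(1),    F(1, 2)],
         [F(1, 4),  F(1, 2), F(-1, 4)]]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
D = matmul(matmul(P_inv, A), P)
print(matmul(P_inv, P) == identity)              # P_inv is the inverse of P
print(D == [[2, 0, 0], [0, 2, 0], [0, 0, 6]])    # eigenvalues in the order 2, 2, 6

# Use the columns in the order X3, X1, X2 instead.
order = [2, 0, 1]
P2 = [[row[j] for j in order] for row in P]      # permute columns of P
P2_inv = [P_inv[j] for j in order]               # permute rows of P_inv to match
D2 = matmul(matmul(P2_inv, A), P2)
print(D2 == [[6, 0, 0], [0, 2, 0], [0, 0, 2]])   # eigenvalues follow the columns
```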

Consider the next important theorem.

Theorem 7.2.3: Linearly Independent Eigenvectors

Let A be an n×n matrix, and suppose that A has distinct eigenvalues \(\lambda_1, \lambda_2, \ldots, \lambda_m\). For each i, let \(X_i\) be a \(\lambda_i\)-eigenvector of A. Then \(\{X_1, X_2, \ldots, X_m\}\) is linearly independent.

The corollary that follows from this theorem gives a useful tool in determining if A is diagonalizable.

Corollary 7.2.1: Distinct Eigenvalues

Let A be an n×n matrix and suppose it has n distinct eigenvalues. Then it follows that A is diagonalizable.

It is possible that a matrix A cannot be diagonalized. In other words, we cannot find an invertible matrix P so that \(P^{-1}AP = D\).

Consider the following example.

Example 7.2.2: A Matrix which cannot be Diagonalized

Let \[A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\] If possible, find an invertible matrix P and diagonal matrix D so that \(P^{-1}AP = D\).

Solution

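Through the usual procedure, we find that the eigenvalues of A are \(\lambda_1 = 1, \lambda_2 = 1\). To find the eigenvectors, we solve the equation \((\lambda I - A)X = 0\). The matrix \((\lambda I - A)\) is given by \[\begin{bmatrix} \lambda - 1 & -1 \\ 0 & \lambda - 1 \end{bmatrix}\]

Substituting in \(\lambda = 1\), we have the matrix \[\begin{bmatrix} 1 - 1 & -1 \\ 0 & 1 - 1 \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 0 & 0 \end{bmatrix}\]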

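Then, solving the equation \((\lambda I - A)X = 0\) involves carrying the following augmented matrix to its reduced row-echelon form. \[\left[\begin{array}{cc|c} 0 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right] \rightarrow \left[\begin{array}{cc|c} 0 & 1 & 0 \\ 0 & 0 & 0 \end{array}\right]\]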

Then the eigenvectors are of the form \[t \begin{bmatrix} 1 \\ 0 \end{bmatrix}\] and the basic eigenvector is \[X_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\]

In this case, the matrix A has one eigenvalue of multiplicity two, but only one basic eigenvector. In order to diagonalize A, we need to construct an invertible 2×2 matrix P. However, because A only has one basic eigenvector, we cannot construct this P. Notice that if we were to use \(X_1\) as both columns of P, P would not be invertible. For this reason, we cannot repeat eigenvectors in P.

Hence this matrix cannot be diagonalized.
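To spell out why no choice of columns can work here: every eigenvector of A is a scalar multiple of the single basic eigenvector found above, so any matrix P whose columns are eigenvectors has a zero second row. In the sketch below, a and b are arbitrary nonzero scalars introduced only for illustration.

```latex
\[
P = \begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix}
\quad \Longrightarrow \quad
\det(P) = a \cdot 0 - b \cdot 0 = 0
\]
```

A matrix with determinant zero is not invertible, so no such P can diagonalize A.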

The idea that a matrix may not be diagonalizable suggests that conditions exist to determine when it is possible to diagonalize a matrix. We saw earlier in Corollary 7.2.1 that an n×n matrix with n distinct eigenvalues is diagonalizable. It turns out that there are other useful diagonalizability tests.

First we need the following definition.

Definition 7.2.3: Eigenspace

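Let A be an n×n matrix and \(\lambda \in \mathbb{R}\). The eigenspace of A corresponding to \(\lambda\), written \(E_\lambda(A)\), is the set of all eigenvectors corresponding to \(\lambda\), together with the zero vector.

In other words, the eigenspace \(E_\lambda(A)\) is all X such that \(AX = \lambda X\). Notice that this set can be written \(E_\lambda(A) = \mathrm{null}(\lambda I - A)\), showing that \(E_\lambda(A)\) is a subspace of \(\mathbb{R}^n\).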
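Because the eigenspace is a null space, a basis for it can be found by row reduction. The following is a minimal sketch in plain Python using exact fractions; the function names `rref` and `eigenspace_basis` are illustrative, and the matrix is the one from Example 7.2.1. It produces one basis vector per free variable, which is how the basic eigenvectors in the examples arise.

```python
from fractions import Fraction

def rref(M):
    """Return the reduced row-echelon form of M and its pivot column indices."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    pivots = []
    r = 0
    for c in range(cols):
        # Look for a row at or below r with a nonzero entry in column c.
        pivot_row = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot_row is None:
            continue
        M[r], M[pivot_row] = M[pivot_row], M[r]
        M[r] = [x / M[r][c] for x in M[r]]           # scale the pivot to 1
        for i in range(rows):
            if i != r and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    return M, pivots

def eigenspace_basis(A, lam):
    """Basis of null(lam*I - A), with one vector per free variable."""
    n = len(A)
    M = [[(lam if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
    R, pivots = rref(M)
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n
        v[f] = Fraction(1)                           # set this free variable to 1
        for row, p in zip(R, pivots):                # solve for the pivot variables
            v[p] = -row[f]
        basis.append(v)
    return basis

A = [[2, 0, 0], [1, 4, -1], [-2, -4, 4]]
print(eigenspace_basis(A, 2))   # two vectors: (-2, 1, 0) and (1, 0, 1)
print(eigenspace_basis(A, 6))   # one vector: (0, -1/2, 1)
```

For the eigenvalue 6 the sketch returns (0, -1/2, 1), a scalar multiple of the basic eigenvector (0, 1, -2) used in Example 7.2.1; any nonzero multiple spans the same eigenspace.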

Recall that the multiplicity of an eigenvalue λ is the number of times that it occurs as a root of the characteristic polynomial.

Consider now the following lemma.

Lemma 7.2.3: Dimension of the Eigenspace

If A is an n×n matrix, then \[\dim(E_\lambda(A)) \leq m\] where \(\lambda\) is an eigenvalue of A of multiplicity m.

This result tells us that if λ is an eigenvalue of A, then the number of linearly independent λ-eigenvectors is never more than the multiplicity of λ. We now use this fact to provide a useful diagonalizability condition.

Theorem 7.2.4: Diagonalizability Condition

Let A be an n×n matrix. Then A is diagonalizable if and only if for each eigenvalue \(\lambda\) of A, \(\dim(E_\lambda(A))\) is equal to the multiplicity of \(\lambda\).
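The condition in this theorem can be checked mechanically once the eigenvalues and their multiplicities are known. The sketch below, in plain Python, takes those as given inputs (it does not compute them) and uses the fact that the dimension of a null space equals n minus the rank. The names `rank` and `is_diagonalizable` are illustrative, and the two test matrices are the ones from Examples 7.2.1 and 7.2.2.

```python
from fractions import Fraction

def rank(M):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def is_diagonalizable(A, eigenvalues_with_multiplicity):
    """Dimension test; assumes the (eigenvalue, multiplicity) pairs are correct."""
    n = len(A)
    for lam, mult in eigenvalues_with_multiplicity:
        shifted = [[(lam if i == j else 0) - A[i][j] for j in range(n)]
                   for i in range(n)]
        dim_eigenspace = n - rank(shifted)       # nullity = n - rank
        if dim_eigenspace != mult:
            return False
    return True

A1 = [[2, 0, 0], [1, 4, -1], [-2, -4, 4]]        # Example 7.2.1
A2 = [[1, 1], [0, 1]]                            # Example 7.2.2
print(is_diagonalizable(A1, [(2, 2), (6, 1)]))   # True
print(is_diagonalizable(A2, [(1, 2)]))           # False
```

For the second matrix the eigenspace for the eigenvalue 1 has dimension 1 while the multiplicity is 2, which is the failure seen in Example 7.2.2.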

Complex Eigenvalues

In some applications, a matrix may have eigenvalues which are complex numbers. For example, this often occurs in differential equations. These questions are approached in the same way as above.

Consider the following example.

Example 7.2.3: A Real Matrix with Complex Eigenvalues

Let \[A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix}\] Find the eigenvalues and eigenvectors of A.

Solution

We will first find the eigenvalues as usual by solving the following equation.

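\[\det\left(\lambda \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix}\right) = 0\] This reduces to \((\lambda - 1)(\lambda^2 - 4\lambda + 5) = 0\). The solutions are \(\lambda_1 = 1, \lambda_2 = 2 + i\) and \(\lambda_3 = 2 - i\).

There is nothing new about finding the eigenvectors for \(\lambda_1 = 1\) so this is left as an exercise.

Consider now the eigenvalue \(\lambda_2 = 2 + i\). As usual, we solve the equation \((\lambda I - A)X = 0\) as given by \[\left((2 + i) \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix}\right) X = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}\] In other words, we need to solve the system represented by the augmented matrix \[\left[\begin{array}{ccc|c} 1 + i & 0 & 0 & 0 \\ 0 & i & 1 & 0 \\ 0 & -1 & i & 0 \end{array}\right]\]

We now use our row operations to solve the system. Divide the first row by \((1 + i)\) and then take \(-i\) times the second row and add to the third row. This yields \[\left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & i & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]\] Now multiply the second row by \(-i\) to obtain the reduced row-echelon form, given by \[\left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & -i & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]\] Therefore, the eigenvectors are of the form \[t \begin{bmatrix} 0 \\ i \\ 1 \end{bmatrix}\] and the basic eigenvector is given by \[X_2 = \begin{bmatrix} 0 \\ i \\ 1 \end{bmatrix}\]

As an exercise, verify that the eigenvectors for \(\lambda_3 = 2 - i\) are of the form \[t \begin{bmatrix} 0 \\ -i \\ 1 \end{bmatrix}\] Hence, the basic eigenvector is given by \[X_3 = \begin{bmatrix} 0 \\ -i \\ 1 \end{bmatrix}\]

As usual, be sure to check your answers! To verify, we check that \(AX_3 = (2 - i)X_3\) as follows. \[\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix} \begin{bmatrix} 0 \\ -i \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ -1 - 2i \\ 2 - i \end{bmatrix} = (2 - i) \begin{bmatrix} 0 \\ -i \\ 1 \end{bmatrix}\]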

Therefore, we know that this eigenvector and eigenvalue are correct.
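The same check can be run for both complex eigenvalues using Python's built-in complex numbers, where `1j` denotes the imaginary unit. This is a minimal sketch; the helper names `matvec` and `is_eigenpair` are illustrative. Exact equality is used only because every entry here is a small integer combination, so no rounding occurs; with general floating-point data a tolerance would be needed.

```python
A = [[1, 0, 0], [0, 2, -1], [0, 1, 2]]

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def is_eigenpair(M, lam, v):
    """Check whether M v equals lam v, entry by entry."""
    return matvec(M, v) == [lam * x for x in v]

pairs = [
    (2 + 1j, [0, 1j, 1]),    # eigenvalue 2 + i with eigenvector (0, i, 1)
    (2 - 1j, [0, -1j, 1]),   # eigenvalue 2 - i with eigenvector (0, -i, 1)
]
for lam, X in pairs:
    print(lam, is_eigenpair(A, lam, X))
```

Swapping the two eigenvectors makes the check fail, which confirms that each eigenvector belongs to its own eigenvalue.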

Notice that in Example 7.2.3, two of the eigenvalues were given by \(\lambda_2 = 2 + i\) and \(\lambda_3 = 2 - i\). You may recall that these two complex numbers are conjugates. It turns out that whenever a matrix containing real entries has a complex eigenvalue \(\lambda\), it also has an eigenvalue equal to \(\overline{\lambda}\), the conjugate of \(\lambda\).
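A short sketch of why this holds, using the standard conjugation argument (the text itself does not give a proof): suppose A has real entries and X is a nonzero vector with \(AX = \lambda X\). Conjugating every entry on both sides, and using that conjugation leaves the real matrix A unchanged, gives

```latex
\[
A\overline{X} = \overline{A}\,\overline{X} = \overline{AX} = \overline{\lambda X} = \overline{\lambda}\,\overline{X}
\]
```

Since the conjugate of a nonzero vector is nonzero, the conjugate of X is an eigenvector for the conjugate of \(\lambda\). This matches Example 7.2.3, where the basic eigenvectors for the two complex eigenvalues are entrywise conjugates of each other.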


This page titled 7.2: Diagonalization is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by Ken Kuttler (Lyryx) via source content that was edited to the style and standards of the LibreTexts platform.
