2.10: Appendix- Diagonalization and Linear Systems

    As we have seen, the matrix formulation for linear systems can be powerful, especially for \(n\) differential equations involving \(n\) unknown functions. Our ability to proceed toward solutions depended upon the solution of eigenvalue problems. However, in the case of repeated eigenvalues we saw some additional complications. All of this rests on the underlying linear algebra; namely, we relied on being able to diagonalize the given coefficient matrix. In this section we will discuss the limitations of diagonalization and introduce the Jordan canonical form.

    We begin with the notion of similarity. Matrix \(A\) is similar to matrix \(B\) if and only if there exists a nonsingular matrix \(P\) such that

    \[B=P^{-1} A P \label{2.113} \]

    Recall that a nonsingular matrix has a nonzero determinant and is invertible.

    We note that the similarity relation is an equivalence relation. Namely, it satisfies the following:

    1. \(A\) is similar to itself.
    2. If \(A\) is similar to \(B\), then \(B\) is similar to \(A\).
    3. If \(A\) is similar to \(B\) and \(B\) is similar to \(C\), then \(A\) is similar to \(C\).

    Also, if \(A\) is similar to \(B\), then they have the same eigenvalues. This follows from a simple computation of the eigenvalue equation. Namely,

    \[\begin{aligned}
    0 &=\operatorname{det}(B-\lambda I) \\
    &=\operatorname{det}\left(P^{-1} A P-\lambda P^{-1} I P\right) \\
    &=\operatorname{det}\left(P^{-1}(A-\lambda I) P\right) \\
    &=\operatorname{det}(P)^{-1} \operatorname{det}(A-\lambda I) \operatorname{det}(P) \\
    &=\operatorname{det}(A-\lambda I)
    \end{aligned} \label{2.114} \]

    Therefore, \(\operatorname{det}(A-\lambda I)=0\) and \(\lambda\) is an eigenvalue of both \(A\) and \(B\).

    An \(n \times n\) matrix \(A\) is diagonalizable if and only if \(A\) is similar to a diagonal matrix \(D\); i.e., there exists a nonsingular matrix \(P\) such that

    \[D=P^{-1} A P \label{2.115} \]

    One of the most important theorems in linear algebra is the Spectral Theorem. This theorem tells us when a matrix can be diagonalized. In fact, it goes beyond matrices to the diagonalization of linear operators. We learn in linear algebra that linear operators can be represented by matrices once we pick a particular representation basis. Diagonalization is simplest for finite dimensional vector spaces and requires some generalization for infinite dimensional vector spaces. Examples of operators to which the spectral theorem applies are self-adjoint operators (more generally, normal operators on Hilbert spaces). We will explore some of these ideas later in the course. The spectral theorem provides a canonical decomposition, called the spectral decomposition, or eigendecomposition, of the underlying vector space on which the operator acts.

    The next theorem tells us how to diagonalize a matrix:

    Theorem 2.23.

    Let \(A\) be an \(n \times n\) matrix. Then \(A\) is diagonalizable if and only if \(A\) has \(n\) linearly independent eigenvectors. If so, then

    \[D=P^{-1} A P \nonumber \]

    If \(\left\{v_{1}, \ldots, v_{n}\right\}\) are the eigenvectors of \(A\) and \(\left\{\lambda_{1}, \ldots, \lambda_{n}\right\}\) are the corresponding eigenvalues, then \(v_{j}\) is the \(j\)th column of \(P\) and \(D_{j j}=\lambda_{j}\).
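    As a quick illustration of this construction, here is a minimal NumPy sketch (the matrix used is assumed purely for illustration): the columns of \(P\) are the computed eigenvectors, and \(P^{-1} A P\) reproduces the diagonal matrix of eigenvalues.

```python
import numpy as np

# A small example matrix (assumed for illustration). It has distinct
# eigenvalues, so Theorem 2.23 guarantees it is diagonalizable.
A = np.array([[3.0, 1.0],
              [0.0, 2.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose columns are
# the corresponding eigenvectors; that matrix plays the role of P.
eigvals, P = np.linalg.eig(A)

D = np.linalg.inv(P) @ A @ P             # equals diag(eigvals), up to round-off
print(np.allclose(D, np.diag(eigvals)))  # True
```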

    A simpler test for diagonalizability results by noting the following:

    Theorem 2.24.

    Let \(A\) be an \(n \times n\) matrix with \(n\) real and distinct eigenvalues. Then \(A\) is diagonalizable.

    Therefore, when the eigenvalues are real and distinct, we need only look at them to conclude that the matrix is diagonalizable. In fact, one also has the following result from linear algebra:

    Theorem 2.25.

    Let \(A\) be an \(n \times n\) real symmetric matrix. Then \(A\) is diagonalizable.

    Recall that a symmetric matrix is one whose transpose is the same as the matrix, or \(A_{i j}=A_{j i}\).

    Example 2.26. Consider the matrix

    \(A=\left(\begin{array}{lll}
    1 & 2 & 2 \\
    2 & 3 & 0 \\
    2 & 0 & 3
    \end{array}\right)\)

    This is a real symmetric matrix. The characteristic polynomial is found to be

    \[\operatorname{det}(A-\lambda I)=-(\lambda-5)(\lambda-3)(\lambda+1)=0 \nonumber \]

    As before, we can determine the corresponding eigenvectors (for \(\lambda=-1,3,5\), respectively) as

    \(\left(\begin{array}{c}
    -2 \\
    1 \\
    1
    \end{array}\right), \quad\left(\begin{array}{c}
    0 \\
    -1 \\
    1
    \end{array}\right), \quad\left(\begin{array}{l}
    1 \\
    1 \\
    1
    \end{array}\right)\)

    We can use these to construct the diagonalizing matrix \(P\). Namely, we have

    \[P^{-1} A P=\left(\begin{array}{ccc}
    -2 & 0 & 1 \\
    1 & -1 & 1 \\
    1 & 1 & 1
    \end{array}\right)^{-1}\left(\begin{array}{lll}
    1 & 2 & 2 \\
    2 & 3 & 0 \\
    2 & 0 & 3
    \end{array}\right)\left(\begin{array}{ccc}
    -2 & 0 & 1 \\
    1 & -1 & 1 \\
    1 & 1 & 1
    \end{array}\right)=\left(\begin{array}{ccc}
    -1 & 0 & 0 \\
    0 & 3 & 0 \\
    0 & 0 & 5
    \end{array}\right) \label{2.116} \]
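    As an optional sanity check, this computation can be verified numerically, for example with NumPy (a sketch using the eigenvector ordering \(\lambda=-1,3,5\) from above):

```python
import numpy as np

# The symmetric matrix A and the diagonalizing matrix P of Example 2.26,
# whose columns are the eigenvectors for lambda = -1, 3, 5.
A = np.array([[1, 2, 2],
              [2, 3, 0],
              [2, 0, 3]], dtype=float)
P = np.array([[-2,  0, 1],
              [ 1, -1, 1],
              [ 1,  1, 1]], dtype=float)

D = np.linalg.inv(P) @ A @ P
print(np.round(D))   # diag(-1, 3, 5), confirming (2.116)
```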

    Now diagonalization is an important idea in solving linear systems of first order equations, as we have seen for simple systems. If our system is originally diagonal, that means our equations are completely uncoupled. Let our system take the form

    \[\dfrac{dy}{dt} = Dy \label{2.117} \]

    where \(D\) is diagonal with entries \(\lambda_{i}, i=1, \ldots, n\). The system of equations, \(y_{i}^{\prime}=\lambda_{i} y_{i}\), has solutions

    \[y_{i}(t)=c_{i} e^{\lambda_{i} t}. \nonumber \]

    Thus, it is easy to solve a diagonal system.

    Now let \(A\) be similar to this diagonal matrix. Then

    \[\dfrac{d \mathbf{y}}{d t}=P^{-1} A P \mathbf{y} \label{2.118} \]

    This can be rewritten as

    \[\dfrac{d P \mathbf{y}}{d t}=A P \mathbf{y} \nonumber \]

    Defining \(\mathbf{x}=P \mathbf{y}\), we have

    \[\dfrac{d \mathbf{x}}{d t}=A \mathbf{x} \label{2.119} \]

    This simple derivation shows that if \(A\) is diagonalizable, then a transformation of the original system in \(\mathbf{x}\) to new coordinates, or a new basis, results in a simpler system in \(\mathbf{y}\).
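    Putting the pieces together: if \(D=P^{-1} A P\) is diagonal, we solve the uncoupled system for \(\mathbf{y}\) and then recover \(\mathbf{x}=P \mathbf{y}\). Since the columns of \(P\) are the eigenvectors \(\mathbf{v}_{i}\) of \(A\), this gives

    \[\mathbf{x}(t)=P \mathbf{y}(t)=\sum_{i=1}^{n} c_{i} e^{\lambda_{i} t} \mathbf{v}_{i}, \nonumber \]

    which is the familiar eigenvalue-eigenvector form of the general solution.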

    However, it is not always possible to diagonalize a given square matrix. This happens when the matrix does not have \(n\) linearly independent eigenvectors, which can occur only when eigenvalues are repeated. Nevertheless, we have the following theorem:

    Theorem 2.27.

    Every \(n \times n\) matrix \(A\) is similar to a matrix of the form

    \(J=\operatorname{diag}\left[J_{1}, J_{2}, \ldots, J_{n}\right]\),

    where

    \[J_{i}=\left(\begin{array}{ccccc}
    \lambda_{i} & 1 & 0 & \cdots & 0 \\
    0 & \lambda_{i} & 1 & \cdots & 0 \\
    \vdots & \ddots & \ddots & \ddots & \vdots \\
    0 & \cdots & 0 & \lambda_{i} & 1 \\
    0 & 0 & \cdots & 0 & \lambda_{i}
    \end{array}\right) \label{2.120} \]

    We will not go into the details of how one finds this Jordan Canonical Form or proving the theorem. In practice you can use a computer algebra system to determine this and the similarity matrix. However, we would still need to know how to use it to solve our system of differential equations.
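    For instance, here is a minimal sketch using SymPy (one such computer algebra system); its Matrix.jordan_form() method returns a similarity matrix \(P\) and the Jordan form \(J\) with \(A=P J P^{-1}\). The particular matrix used is assumed only for illustration.

```python
from sympy import Matrix

# A matrix that is not diagonalizable (assumed for illustration): the
# eigenvalue 2 is repeated but contributes only one eigenvector.
A = Matrix([[2, 1, 0],
            [0, 2, 0],
            [0, 0, 3]])

P, J = A.jordan_form()   # A == P * J * P**-1
print(J)   # a 2x2 Jordan block for eigenvalue 2 and a 1x1 block for 3
print(P)   # the similarity (transformation) matrix
```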

    Example 2.28. Let's consider a simple system with the \(3 \times 3\) Jordan block

    \(A=\left(\begin{array}{lll}
    2 & 1 & 0 \\
    0 & 2 & 1 \\
    0 & 0 & 2
    \end{array}\right)\)

    The corresponding system of coupled first order differential equations takes the form

    \[\begin{aligned}
    &\dfrac{d x_{1}}{d t}=2 x_{1}+x_{2} \\
    &\dfrac{d x_{2}}{d t}=2 x_{2}+x_{3} \\
    &\dfrac{d x_{3}}{d t}=2 x_{3}
    \end{aligned} \label{2.121} \]

    The last equation is simple to solve, giving \(x_{3}(t)=c_{3} e^{2 t}\). Inserting this into the second equation, you have

    \[\dfrac{d x_{2}}{d t}=2 x_{2}+c_{3} e^{2 t}. \nonumber \]

    Using the integrating factor \(e^{-2 t}\), one can solve this equation to get \(x_{2}(t)= \left(c_{2}+c_{3} t\right) e^{2 t}\). Similarly, one can solve the first equation to obtain \(x_{1}(t)= \left(c_{1}+c_{2} t+\dfrac{1}{2} c_{3} t^{2}\right) e^{2 t}\).
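    In more detail, multiplying the equation for \(x_{2}\) by \(e^{-2 t}\) gives

    \[\dfrac{d}{d t}\left(e^{-2 t} x_{2}\right)=e^{-2 t}\left(\dfrac{d x_{2}}{d t}-2 x_{2}\right)=c_{3}, \nonumber \]

    so \(e^{-2 t} x_{2}=c_{2}+c_{3} t\), which yields \(x_{2}(t)=\left(c_{2}+c_{3} t\right) e^{2 t}\). The first equation is handled the same way, with the solution for \(x_{2}\) playing the role of the forcing term.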

    This should remind you of a problem we had solved earlier, leading to the generalized eigenvalue problem in (2.43). This suggests that there is a more general theory when there are repeated eigenvalues, one related to the Jordan canonical form.

    Let's write the solution we just obtained in vector form. We have

    \[\mathbf{x}(t)=\left[c_{1}\left(\begin{array}{l}
    1 \\
    0 \\
    0
    \end{array}\right)+c_{2}\left(\begin{array}{l}
    t \\
    1 \\
    0
    \end{array}\right)+c_{3}\left(\begin{array}{c}
    \dfrac{1}{2} t^{2} \\
    t \\
    1
    \end{array}\right)\right] e^{2 t} \label{2.122} \]

    It looks like this solution is a linear combination of three linearly independent solutions,

    \[\begin{aligned}
    &\mathbf{x}=\mathbf{v}_{1} e^{\lambda t} \\
    &\mathbf{x}=\left(t \mathbf{v}_{1}+\mathbf{v}_{2}\right) e^{\lambda t} \\
    &\mathbf{x}=\left(\dfrac{1}{2} t^{2} \mathbf{v}_{1}+t \mathbf{v}_{2}+\mathbf{v}_{3}\right) e^{\lambda t}
    \end{aligned} \label{2.123} \]

    where \(\lambda=2\) and the vectors satisfy the equations

    \[\begin{aligned}
    &(A-\lambda I) \mathbf{v}_{1}=0 \\
    &(A-\lambda I) \mathbf{v}_{2}=\mathbf{v}_{1} \\
    &(A-\lambda I) \mathbf{v}_{3}=\mathbf{v}_{2}
    \end{aligned} \label{2.124} \]

    and, therefore, applying \(A-\lambda I\) repeatedly,

    \[\begin{aligned}
    &(A-\lambda I) \mathbf{v}_{1}=0 \\
    &(A-\lambda I)^{2} \mathbf{v}_{2}=0 \\
    &(A-\lambda I)^{3} \mathbf{v}_{3}=0
    \end{aligned} \label{2.125} \]
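    For the Jordan block above, with \(\lambda=2\), one can check these equations directly: \(A-2 I\) has ones on the superdiagonal and zeros elsewhere, so the standard basis vectors

    \[\mathbf{v}_{1}=\left(\begin{array}{l}
    1 \\
    0 \\
    0
    \end{array}\right), \quad \mathbf{v}_{2}=\left(\begin{array}{l}
    0 \\
    1 \\
    0
    \end{array}\right), \quad \mathbf{v}_{3}=\left(\begin{array}{l}
    0 \\
    0 \\
    1
    \end{array}\right) \nonumber \]

    satisfy \((A-2 I) \mathbf{v}_{1}=0\), \((A-2 I) \mathbf{v}_{2}=\mathbf{v}_{1}\), and \((A-2 I) \mathbf{v}_{3}=\mathbf{v}_{2}\). These are exactly the vectors multiplying \(c_{1}, c_{2}, c_{3}\) in (2.122) at \(t=0\).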

    It is easy to generalize this result to build linearly independent solutions corresponding to multiple roots (eigenvalues) of the characteristic equation.
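    More precisely, if \(\lambda\) is an eigenvalue with a chain of generalized eigenvectors \(\mathbf{v}_{1}, \ldots, \mathbf{v}_{k}\) satisfying \((A-\lambda I) \mathbf{v}_{j}=\mathbf{v}_{j-1}\) (with \(\mathbf{v}_{0}=\mathbf{0}\)), then one can check that the functions

    \[\mathbf{x}_{j}(t)=\left(\dfrac{t^{j-1}}{(j-1) !} \mathbf{v}_{1}+\dfrac{t^{j-2}}{(j-2) !} \mathbf{v}_{2}+\cdots+t \mathbf{v}_{j-1}+\mathbf{v}_{j}\right) e^{\lambda t}, \quad j=1, \ldots, k, \nonumber \]

    are linearly independent solutions of \(\mathbf{x}^{\prime}=A \mathbf{x}\), reproducing (2.123) in the case \(k=3\), \(\lambda=2\).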


    This page titled 2.10: Appendix- Diagonalization and Linear Systems is shared under a CC BY-NC-SA 3.0 license and was authored, remixed, and/or curated by Russell Herman via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.
