7.3: Orthogonal Diagonalization


    There is a natural way to define a symmetric linear operator \(T\) on a finite dimensional inner product space \(V\). If \(T\) is such an operator, it is shown in this section that \(V\) has an orthogonal basis consisting of eigenvectors of \(T\). This yields another proof of the principal axis theorem in the context of inner product spaces.

    Theorem \(\PageIndex{1}\)

    Let \(T: V \rightarrow V\) be a linear operator on a finite dimensional space \(V\). Then the following conditions are equivalent.
    1. \(V\) has a basis consisting of eigenvectors of \(T\).
    2. There exists a basis \(B\) of \(V\) such that \(M_B(T)\) is diagonal.


    Proof. We have \(M_B(T)=\left[C_B\left[T\left(\mathbf{b}_1\right)\right] C_B\left[T\left(\mathbf{b}_2\right)\right] \cdots C_B\left[T\left(\mathbf{b}_n\right)\right]\right]\) where \(B=\left\{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n\right\}\) is any basis of \(V\). By comparing columns:
    \[
    M_B(T)=\left[\begin{array}{cccc}
    \lambda_1 & 0 & \cdots & 0 \\
    0 & \lambda_2 & \cdots & 0 \\
    \vdots & \vdots & & \vdots \\
    0 & 0 & \cdots & \lambda_n
    \end{array}\right] \text { if and only if } T\left(\mathbf{b}_i\right)=\lambda_i \mathbf{b}_i \text { for each } i
    \]
    Theorem 10.3.1 follows.

    Definition: Diagonalizable

    A linear operator \(T\) on a finite dimensional space \(V\) is called diagonalizable if \(V\) has a basis consisting of eigenvectors of \(T\).

    Example \(\PageIndex{1}\)

    Let \(T: \mathbf{P}_2 \rightarrow \mathbf{P}_2\) be given by
    \[
    T\left(a+b x+c x^2\right)=(a+4 c)-2 b x+(3 a+2 c) x^2
    \]
    Find the eigenspaces of \(T\) and hence find a basis of eigenvectors.

    Solution

    If \(B_0=\left\{1, x, x^2\right\}\), then
    \[
    M_{B_0}(T)=\left[\begin{array}{rrr}
    1 & 0 & 4 \\
    0 & -2 & 0 \\
    3 & 0 & 2
    \end{array}\right]
    \]

    so \(c_T(x)=(x+2)^2(x-5)\), and the eigenvalues of \(T\) are \(\lambda=-2\) and \(\lambda=5\). One sees that
    \[
    \left\{\left[\begin{array}{r}
    0 \\
    1 \\
    0
    \end{array}\right],\left[\begin{array}{r}
    4 \\
    0 \\
    -3
    \end{array}\right],\left[\begin{array}{r}
    1 \\
    0 \\
    1
    \end{array}\right]\right\}
    \]
    is a basis of eigenvectors of \(M_{B_0}(T)\), so \(B=\left\{x, 4-3 x^2, 1+x^2\right\}\) is a basis of \(\mathbf{P}_2\) consisting of eigenvectors of \(T\).
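
    For readers who want a numerical check, here is a minimal sketch (using NumPy, which is not part of the text) confirming that the three coordinate columns above are eigenvectors of \(M_{B_0}(T)\) with eigenvalues \(-2\), \(-2\), and \(5\).

        import numpy as np

        # Matrix of T with respect to the basis B0 = {1, x, x^2} from the solution above
        M = np.array([[1.0,  0.0, 4.0],
                      [0.0, -2.0, 0.0],
                      [3.0,  0.0, 2.0]])

        # Coordinate columns of the claimed eigenvectors x, 4 - 3x^2, and 1 + x^2
        claimed = [(-2, np.array([0.0, 1.0, 0.0])),
                   (-2, np.array([4.0, 0.0, -3.0])),
                   ( 5, np.array([1.0, 0.0, 1.0]))]

        for lam, v in claimed:
            assert np.allclose(M @ v, lam * v)   # M v = lambda v in B0-coordinates
        print("All three eigenvectors check out.")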


    If \(V\) is an inner product space, the expansion theorem gives a simple formula for the matrix of a linear operator with respect to an orthogonal basis.

    Theorem \(\PageIndex{2}\)

    Let \(T: V \rightarrow V\) be a linear operator on an inner product space \(V\). If \(B=\left\{\boldsymbol{b}_1, \boldsymbol{b}_2, \ldots, \boldsymbol{b}_n\right\}\) is an orthogonal basis of \(V\), then
    \[
    M_B(T)=\left[\frac{\left\langle\boldsymbol{b}_i, T\left(\boldsymbol{b}_j\right)\right\rangle}{\left\|\boldsymbol{b}_i\right\|^2}\right]
    \]


    Proof. Write \(M_B(T)=\left[a_{i j}\right]\). The \(j\)th column of \(M_B(T)\) is \(C_B\left[T\left(\mathbf{b}_j\right)\right]\), so
    \[
    T\left(\mathbf{b}_j\right)=a_{1 j} \mathbf{b}_1+\cdots+a_{i j} \mathbf{b}_i+\cdots+a_{n j} \mathbf{b}_n
    \]
    On the other hand, the expansion theorem (Theorem 10.2.4) gives
    \[
    \mathbf{v}=\frac{\left\langle\mathbf{b}_1, \mathbf{v}\right\rangle}{\left\|\mathbf{b}_1\right\|^2} \mathbf{b}_1+\cdots+\frac{\left\langle\mathbf{b}_i, \mathbf{v}\right\rangle}{\left\|\mathbf{b}_i\right\|^2} \mathbf{b}_i+\cdots+\frac{\left\langle\mathbf{b}_n, \mathbf{v}\right\rangle}{\left\|\mathbf{b}_n\right\|^2} \mathbf{b}_n
    \]
    for any \(\mathbf{v}\) in \(V\). The result follows by taking \(\mathbf{v}=T\left(\mathbf{b}_j\right)\).
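
    Because the basis in the theorem need only be orthogonal, not orthonormal, the division by \(\left\|\boldsymbol{b}_i\right\|^2\) matters. The sketch below (NumPy, with a made-up operator and a made-up orthogonal basis, neither taken from the text) compares the formula against the usual change-of-basis computation \(M_B(T)=P^{-1} A P\).

        import numpy as np

        # A made-up operator on R^3, acting as T(v) = A v
        A = np.array([[2.0, 1.0, 0.0],
                      [1.0, 3.0, 1.0],
                      [0.0, 1.0, 2.0]])

        # A made-up orthogonal (but not orthonormal) basis of R^3
        B = [np.array([1.0,  1.0, 0.0]),
             np.array([1.0, -1.0, 0.0]),
             np.array([0.0,  0.0, 2.0])]

        # Entry (i, j) from the theorem: <b_i, T(b_j)> / ||b_i||^2
        M_formula = np.array([[np.dot(bi, A @ bj) / np.dot(bi, bi) for bj in B]
                              for bi in B])

        # The same matrix from the change-of-basis description M_B(T) = P^{-1} A P
        P = np.column_stack(B)
        M_change_of_basis = np.linalg.inv(P) @ A @ P

        assert np.allclose(M_formula, M_change_of_basis)
        print(M_formula)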

    Example \(\PageIndex{2}\)

    Let \(T: \mathbb{R}^3 \rightarrow \mathbb{R}^3\) be given by
    \[
    T(a, b, c)=(a+2 b-c, 2 a+3 c,-a+3 b+2 c)
    \]
    If the dot product in \(\mathbb{R}^3\) is used, find the matrix of \(T\) with respect to the standard basis \(B=\left\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\right\}\), where \(\mathbf{e}_1=(1,0,0), \mathbf{e}_2=(0,1,0), \mathbf{e}_3=(0,0,1)\).

    Solution

    The basis \(B\) is orthonormal, so Theorem 10.3.2 gives
    \[
    M_B(T)=\left[\begin{array}{lll}
    \mathbf{e}_1 \cdot T\left(\mathbf{e}_1\right) & \mathbf{e}_1 \cdot T\left(\mathbf{e}_2\right) & \mathbf{e}_1 \cdot T\left(\mathbf{e}_3\right) \\
    \mathbf{e}_2 \cdot T\left(\mathbf{e}_1\right) & \mathbf{e}_2 \cdot T\left(\mathbf{e}_2\right) & \mathbf{e}_2 \cdot T\left(\mathbf{e}_3\right) \\
    \mathbf{e}_3 \cdot T\left(\mathbf{e}_1\right) & \mathbf{e}_3 \cdot T\left(\mathbf{e}_2\right) & \mathbf{e}_3 \cdot T\left(\mathbf{e}_3\right)
    \end{array}\right]=\left[\begin{array}{rrr}
    1 & 2 & -1 \\
    2 & 0 & 3 \\
    -1 & 3 & 2
    \end{array}\right]
    \]
    Of course, this can also be found in the usual way.
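
    The same matrix can be reproduced mechanically; here is a minimal NumPy sketch (not part of the text) that builds \(M_B(T)\) entry by entry as \(\mathbf{e}_i \cdot T\left(\mathbf{e}_j\right)\).

        import numpy as np

        def T(v):
            # The operator from this example: T(a, b, c) = (a + 2b - c, 2a + 3c, -a + 3b + 2c)
            a, b, c = v
            return np.array([a + 2*b - c, 2*a + 3*c, -a + 3*b + 2*c])

        E = np.eye(3)   # the standard orthonormal basis e1, e2, e3 as rows

        # For an orthonormal basis, the (i, j) entry of M_B(T) is e_i . T(e_j)
        M = np.array([[np.dot(E[i], T(E[j])) for j in range(3)] for i in range(3)])
        print(M)   # [[ 1.  2. -1.], [ 2.  0.  3.], [-1.  3.  2.]]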

    It is not difficult to verify that an \(n \times n\) matrix \(A\) is symmetric if and only if \(\mathbf{x} \cdot(A \mathbf{y})=(A \mathbf{x}) \cdot \mathbf{y}\) holds for all columns \(\mathbf{x}\) and \(\mathbf{y}\) in \(\mathbb{R}^n\). The analog for operators is as follows:
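
    The verification is routine, but a small numerical experiment makes the point; the sketch below (NumPy, with made-up \(2 \times 2\) matrices not taken from the text) checks the identity on random vectors for a symmetric matrix and exhibits a failure for a non-symmetric one.

        import numpy as np

        rng = np.random.default_rng(0)
        A_sym = np.array([[2.0, -1.0],
                          [-1.0, 5.0]])    # symmetric
        A_not = np.array([[2.0, -1.0],
                          [3.0,  5.0]])    # not symmetric

        # For the symmetric matrix, x . (A y) = (A x) . y for every sampled pair
        for _ in range(100):
            x, y = rng.standard_normal(2), rng.standard_normal(2)
            assert np.isclose(x @ (A_sym @ y), (A_sym @ x) @ y)

        # For the non-symmetric matrix, standard basis vectors already give a counterexample
        x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])
        print(x @ (A_not @ y), (A_not @ x) @ y)   # -1.0 versus 3.0 (entries a_12 and a_21)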

    Theorem \(\PageIndex{3}\)

    Let \(V\) be a finite dimensional inner product space. The following conditions are equivalent for a linear operator \(T: V \rightarrow V\).
    1. \(\langle\mathbf{v}, T(\mathbf{w})\rangle=\langle T(\mathbf{v}), \mathbf{w}\rangle\) for all \(\mathbf{v}\) and \(\mathbf{w}\) in \(V\).
    2. The matrix of \(T\) is symmetric with respect to every orthonormal basis of \(V\).
    3. The matrix of \(T\) is symmetric with respect to some orthonormal basis of \(V\).
    4. There is an orthonormal basis \(B=\left\{\boldsymbol{f}_1, \boldsymbol{f}_2, \ldots, \boldsymbol{f}_n\right\}\) of \(V\) such that \(\left\langle\boldsymbol{f}_i, T\left(\boldsymbol{f}_j\right)\right\rangle=\left\langle T\left(\boldsymbol{f}_i\right), \boldsymbol{f}_j\right\rangle\) holds for all \(i\) and \(j\).

    Proof. 1. \(\Rightarrow\) 2. Let \(B=\left\{\mathbf{f}_1, \ldots, \mathbf{f}_n\right\}\) be an orthonormal basis of \(V\), and write \(M_B(T)=\left[a_{i j}\right]\). Then \(a_{i j}=\left\langle\mathbf{f}_i, T\left(\mathbf{f}_j\right)\right\rangle\) by Theorem 10.3.2. Hence (1.) and axiom \(\mathrm{P}2\) give
    \[
    a_{i j}=\left\langle\mathbf{f}_i, T\left(\mathbf{f}_j\right)\right\rangle=\left\langle T\left(\mathbf{f}_i\right), \mathbf{f}_j\right\rangle=\left\langle\mathbf{f}_j, T\left(\mathbf{f}_i\right)\right\rangle=a_{j i}
    \]
    for all \(i\) and \(j\). This shows that \(M_B(T)\) is symmetric.
    2. \(\Rightarrow 3\). This is clear.
    3. \(\Rightarrow\) 4. Let \(B=\left\{\mathbf{f}_1, \ldots, \mathbf{f}_n\right\}\) be an orthonormal basis of \(V\) such that \(M_B(T)\) is symmetric. By (3.) and Theorem 10.3.2, \(\left\langle\mathbf{f}_i, T\left(\mathbf{f}_j\right)\right\rangle=\left\langle\mathbf{f}_j, T\left(\mathbf{f}_i\right)\right\rangle\) for all \(i\) and \(j\), so (4.) follows from axiom P2.
    4. \(\Rightarrow 1\). Let \(\mathbf{v}\) and \(\mathbf{w}\) be vectors in \(V\) and write them as \(\mathbf{v}=\sum_{i=1}^n v_i \mathbf{f}_i\) and \(\mathbf{w}=\sum_{j=1}^n w_j \mathbf{f}_j\). Then
    \[
    \begin{aligned}
    \langle\mathbf{v}, T(\mathbf{w})\rangle=\left\langle\sum_i v_i \mathbf{f}_i, \sum_j w_j T \mathbf{f}_j\right\rangle & =\sum_i \sum_j v_i w_j\left\langle\mathbf{f}_i, T\left(\mathbf{f}_j\right)\right\rangle \\
    & =\sum_i \sum_j v_i w_j\left\langle T\left(\mathbf{f}_i\right), \mathbf{f}_j\right\rangle \\
    & =\left\langle\sum_i v_i T\left(\mathbf{f}_i\right), \sum_j w_j \mathbf{f}_j\right\rangle \\
    & =\langle T(\mathbf{v}), \mathbf{w}\rangle
    \end{aligned}
    \]
    where we used (4.) at the third stage. This proves (1.).
    A linear operator \(T\) on an inner product space \(V\) is called symmetric if \(\langle\mathbf{v}, T(\mathbf{w})\rangle=\langle T(\mathbf{v}), \mathbf{w}\rangle\) holds for all \(\mathbf{v}\) and \(\mathbf{w}\) in \(V\).

    Example \(\PageIndex{3}\)

    If \(A\) is an \(n \times n\) matrix, let \(T_A: \mathbb{R}^n \rightarrow \mathbb{R}^n\) be the matrix operator given by \(T_A(\mathbf{v})=A \mathbf{v}\) for all columns \(\mathbf{v}\). If the dot product is used in \(\mathbb{R}^n\), then \(T_A\) is a symmetric operator if and only if \(A\) is a symmetric matrix.

    Solution

    If \(E\) is the standard basis of \(\mathbb{R}^n\), then \(E\) is orthonormal when the dot product is used. We have \(M_E\left(T_A\right)=A\) (by Example 9.1.4), so the result follows immediately from part (3) of Theorem 10.3.3.

    It is important to note that whether an operator is symmetric depends on which inner product is being used (see Exercise 2).

    If \(V\) is a finite dimensional inner product space, the eigenvalues of an operator \(T: V \rightarrow V\) are the same as those of \(M_B(T)\) for any orthonormal basis \(B\) (see Theorem 9.3.3). If \(T\) is symmetric, \(M_B(T)\) is a symmetric matrix and so has real eigenvalues by Theorem 5.5.7. Hence we have the following:

    Theorem \(\PageIndex{4}\)

    A symmetric linear operator on a finite dimensional inner product space has real eigenvalues.

    If \(U\) is a subspace of an inner product space \(V\), recall that its orthogonal complement is the subspace \(U^{\perp}\) of \(V\) defined by
    \[
    U^{\perp}=\{\mathbf{v} \text { in } V \mid\langle\mathbf{v}, \mathbf{u}\rangle=0 \text { for all } \mathbf{u} \text { in } U\}
    \]

    Theorem \(\PageIndex{5}\)

    Let \(T: V \rightarrow V\) be a symmetric linear operator on an inner product space \(V\), and let \(U\) be a \(T\)-invariant subspace of \(V\). Then:
    1. The restriction of \(T\) to \(U\) is a symmetric linear operator on \(U\).
    2. \(U^{\perp}\) is also T-invariant.

    Proof.
    1. \(U\) is itself an inner product space using the same inner product, and condition 1 in Theorem 10.3.3 that \(T\) is symmetric is clearly preserved.
    2. If \(\mathbf{v}\) is in \(U^{\perp}\), our task is to show that \(T(\mathbf{v})\) is also in \(U^{\perp}\); that is, \(\langle T(\mathbf{v}), \mathbf{u}\rangle=0\) for all \(\mathbf{u}\) in \(U\). But if \(\mathbf{u}\) is in \(U\), then \(T(\mathbf{u})\) also lies in \(U\) because \(U\) is \(T\)-invariant, so
    \[
    \langle T(\mathbf{v}), \mathbf{u}\rangle=\langle\mathbf{v}, T(\mathbf{u})\rangle=0
    \]
    using the symmetry of \(T\) for the first equality and, for the second, the fact that \(\mathbf{v}\) is in \(U^{\perp}\) while \(T(\mathbf{u})\) is in \(U\).

    The principal axis theorem (Theorem 8.2.2) asserts that an \(n \times n\) matrix \(A\) is symmetric if and only if \(\mathbb{R}^n\) has an orthogonal basis of eigenvectors of \(A\). The following result not only extends this theorem to an arbitrary \(n\)-dimensional inner product space, but the proof is much more intuitive.

    Theorem \(\PageIndex{6}\)

    Principal Axis Theorem
    The following conditions are equivalent for a linear operator \(T\) on a finite dimensional inner product space \(V\).
    1. \(T\) is symmetric.
    2. \(V\) has an orthogonal basis consisting of eigenvectors of \(T\).

    Proof. 1. \(\Rightarrow 2\). Assume that \(T\) is symmetric and proceed by induction on \(n=\operatorname{dim} V\). If \(n=1\), every nonzero vector in \(V\) is an eigenvector of \(T\), so there is nothing to prove. If \(n \geq 2\), assume inductively that the theorem holds for spaces of dimension less than \(n\). Let \(\lambda_1\) be a real eigenvalue of \(T\) (by Theorem 10.3.4) and choose an eigenvector \(\mathbf{f}_1\) corresponding to \(\lambda_1\). Then \(U=\mathbb{R} \mathbf{f}_1\) is \(T\)-invariant, so \(U^{\perp}\) is also \(T\)-invariant by Theorem 10.3.5 (T is symmetric). Because \(\operatorname{dim} U^{\perp}=n-1\) (Theorem 10.2.6), and because the restriction of \(T\) to \(U^{\perp}\) is a symmetric operator (Theorem 10.3.5), it follows by induction that \(U^{\perp}\) has an orthogonal basis \(\left\{\mathbf{f}_2, \ldots, \mathbf{f}_n\right\}\) of eigenvectors of \(T\). Hence \(B=\left\{\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n\right\}\) is an orthogonal basis of \(V\), which proves (2.).
    2. \(\Rightarrow 1\). If \(B=\left\{\mathbf{f}_1, \ldots, \mathbf{f}_n\right\}\) is a basis as in (2.), then \(M_B(T)\) is symmetric (indeed diagonal), so \(T\) is symmetric by Theorem 10.3.3.

    The matrix version of the principal axis theorem is an immediate consequence of Theorem 10.3.6. If \(A\) is an \(n \times n\) symmetric matrix, then \(T_A: \mathbb{R}^n \rightarrow \mathbb{R}^n\) is a symmetric operator, so let \(B\) be an orthonormal basis of \(\mathbb{R}^n\) consisting of eigenvectors of \(T_A\) (and hence of \(A\) ). Then \(P^T A P\) is diagonal where \(P\) is the orthogonal matrix whose columns are the vectors in \(B\) (see Theorem 9.2.4).
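
    In NumPy (not part of the text) this computation is immediate: the routine numpy.linalg.eigh returns the real eigenvalues of a symmetric matrix together with an orthogonal matrix \(P\) of eigenvectors. The sketch below uses the symmetric matrix that reappears in Example \(\PageIndex{4}\) below.

        import numpy as np

        # The symmetric matrix that appears again in Example 4 below
        A = np.array([[ 8.0, -2.0, 2.0],
                      [-2.0,  5.0, 4.0],
                      [ 2.0,  4.0, 5.0]])

        # eigh is designed for symmetric matrices: real eigenvalues, orthonormal eigenvectors
        eigenvalues, P = np.linalg.eigh(A)

        assert np.allclose(P.T @ P, np.eye(3))                  # P is orthogonal
        assert np.allclose(P.T @ A @ P, np.diag(eigenvalues))   # P^T A P is diagonal
        print(eigenvalues)   # approximately [0., 9., 9.]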

    Similarly, let \(T: V \rightarrow V\) be a symmetric linear operator on the \(n\)-dimensional inner product space \(V\) and let \(B_0\) be any convenient orthonormal basis of \(V\). Then an orthonormal basis of eigenvectors of \(T\) can be computed from \(M_{B_0}(T)\). In fact, if \(P^T M_{B_0}(T) P\) is diagonal where \(P\) is orthogonal, let \(B=\left\{\mathbf{f}_1, \ldots, \mathbf{f}_n\right\}\) be the vectors in \(V\) such that \(C_{B_0}\left(\mathbf{f}_j\right)\) is column \(j\) of \(P\) for each \(j\). Then \(B\) consists of eigenvectors of \(T\) by Theorem 9.3.3, and they are orthonormal because \(B_0\) is orthonormal. Indeed
    \[
    \left\langle\mathbf{f}_i, \mathbf{f}_j\right\rangle=C_{B_0}\left(\mathbf{f}_i\right) \cdot C_{B_0}\left(\mathbf{f}_j\right)
    \]
    holds for all \(i\) and \(j\), as the reader can verify. Here is an example.

    Example \(\PageIndex{4}\)

    Let \(T: \mathbf{P}_2 \rightarrow \mathbf{P}_2\) be given by
    \[
    T\left(a+b x+c x^2\right)=(8 a-2 b+2 c)+(-2 a+5 b+4 c) x+(2 a+4 b+5 c) x^2
    \]
    Using the inner product \(\left\langle a+b x+c x^2, a^{\prime}+b^{\prime} x+c^{\prime} x^2\right\rangle=a a^{\prime}+b b^{\prime}+c c^{\prime}\), show that \(T\) is symmetric and find an orthonormal basis of \(\mathbf{P}_2\) consisting of eigenvectors.

    Solution

    If \(B_0=\left\{1, x, x^2\right\}\), then \(M_{B_0}(T)=\left[\begin{array}{rrr}8 & -2 & 2 \\ -2 & 5 & 4 \\ 2 & 4 & 5\end{array}\right]\) is symmetric, so \(T\) is symmetric. This matrix was analyzed in Example 8.2.5, where it was found that an orthonormal basis of eigenvectors is \(\left\{\frac{1}{3}\left[\begin{array}{lll}1 & 2 & -2\end{array}\right]^T, \frac{1}{3}\left[\begin{array}{lll}2 & 1 & 2\end{array}\right]^T, \frac{1}{3}\left[\begin{array}{lll}-2 & 2 & 1\end{array}\right]^T\right\}\). Because \(B_0\) is orthonormal, the corresponding orthonormal basis of \(\mathbf{P}_2\) is
    \[
    B=\left\{\frac{1}{3}\left(1+2 x-2 x^2\right), \frac{1}{3}\left(2+x+2 x^2\right), \frac{1}{3}\left(-2+2 x+x^2\right)\right\} .
    \]
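
    As a final check, one can verify numerically that the coordinate columns of \(B\) form an orthonormal set that diagonalizes \(M_{B_0}(T)\); a short NumPy sketch (not part of the text) follows.

        import numpy as np

        M = np.array([[ 8.0, -2.0, 2.0],
                      [-2.0,  5.0, 4.0],
                      [ 2.0,  4.0, 5.0]])

        # Coordinate columns (with respect to B0 = {1, x, x^2}) of the basis B above
        F = np.column_stack([np.array([ 1.0, 2.0, -2.0]) / 3,
                             np.array([ 2.0, 1.0,  2.0]) / 3,
                             np.array([-2.0, 2.0,  1.0]) / 3])

        assert np.allclose(F.T @ F, np.eye(3))        # the columns are orthonormal
        D = F.T @ M @ F
        assert np.allclose(D, np.diag(np.diag(D)))    # M is diagonalized by F
        print(np.diag(D))   # the eigenvalues of T: approximately [0., 9., 9.]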


    This page titled 7.3: Orthogonal Diagonalization is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by W. Keith Nicholson (Lyryx Learning Inc.) via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.