9.1: The Spectral Representation of a Symmetric Matrix
Introduction
Our goal is to show that if \(B\) is symmetric then
- each \(\lambda_{j}\) is real,
- each \(P_{j}\) is symmetric and
- each \(D_{j}\) vanishes.
Let us begin with an example.
The transfer function of
\[B = \begin{pmatrix} {1}&{1}&{1}\\ {1}&{1}&{1}\\ {1}&{1}&{1} \end{pmatrix} \nonumber\]
is
\[R(s) = \frac{1}{s(s-3)} \begin{pmatrix} {s-2}&{1}&{1}\\ {1}&{s-2}&{1}\\ {1}&{1}&{s-2} \end{pmatrix} \nonumber\]
\[R(s) = \frac{1}{s} \begin{pmatrix} {2/3}&{-1/3}&{-1/3}\\ {-1/3}&{2/3}&{-1/3}\\ {-1/3}&{-1/3}&{2/3} \end{pmatrix}+\frac{1}{s-3} \begin{pmatrix} {1/3}&{1/3}&{1/3}\\ {1/3}&{1/3}&{1/3}\\ {1/3}&{1/3}&{1/3} \end{pmatrix}\nonumber\]
\[R(s) = \frac{1}{s-\lambda_{1}}P_{1}+\frac{1}{s-\lambda_{2}}P_{2} \nonumber\]
and so indeed each of the bullets holds true. With each of the \(D_{j}\) falling by the wayside you may also expect that the respective geometric and algebraic multiplicities coincide.
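The example can be checked numerically. The following is a minimal NumPy sketch, not part of the derivation itself; the test point `s = 1.7` and the variable names are arbitrary choices. It takes \(P_{1} = I - B/3\) (for \(\lambda_{1} = 0\)) and \(P_{2} = B/3\) (for \(\lambda_{2} = 3\)), compares the partial fraction form with a directly computed inverse of \(sI-B\), and confirms that both projections are symmetric, idempotent, and annihilate one another.

```python
import numpy as np

B = np.ones((3, 3))
I = np.eye(3)

# Projections from the partial fraction expansion of the example.
P1 = I - B / 3.0   # diagonal 2/3, off-diagonal -1/3, belongs to lambda_1 = 0
P2 = B / 3.0       # every entry 1/3, belongs to lambda_2 = 3

s = 1.7  # arbitrary test point that is not an eigenvalue
R_direct = np.linalg.inv(s * I - B)
R_partial_fractions = P1 / s + P2 / (s - 3.0)

expansion_matches = np.allclose(R_direct, R_partial_fractions)
symmetric = np.allclose(P1, P1.T) and np.allclose(P2, P2.T)
idempotent = np.allclose(P1 @ P1, P1) and np.allclose(P2 @ P2, P2)
products_vanish = np.allclose(P1 @ P2, 0.0)

print(expansion_matches, symmetric, idempotent, products_vanish)
```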
The Spectral Representation
We have amassed anecdotal evidence in support of the claim that each \(D_{j}\) in the spectral representation
\[B = \sum_{j=1}^{h} \lambda_{j}P_{j}+\sum_{j=1}^{h}D_{j} \nonumber\]
is the zero matrix when \(B\) is symmetric, i.e., when \(B = B^T\), or, more generally, when \(B = B^H\) where \(B^H \equiv \overline{B}^T\). Matrices for which \(B = B^H\) are called Hermitian. Of course real symmetric matrices are Hermitian.
Taking the conjugate transpose throughout we find,
\[B^{H} = \sum_{j=1}^{h} \overline{\lambda_{j}}P_{j}^{H}+\sum_{j=1}^{h}D_{j}^{H} \nonumber\]
That is, the \(\overline{\lambda_{j}}\) are the eigenvalues of \(B^H\) with corresponding projections \(P_{j}^H\) and nilpotents \(D_{j}^{H}\). Hence, if \(B = B^H\), we find on equating terms that
\[\lambda_{j} = \overline{\lambda_{j}} \nonumber\]
\[P_{j} = P_{j}^H \nonumber\]
and
\[D_{j} = D_{j}^H \nonumber\]
The first of these states that the eigenvalues of a Hermitian matrix are real. Our main concern, however, is with the consequences of the last. To wit, recalling that \(D_{j}^{m_j} = 0\), notice that for arbitrary \(x\)
\[(||D_{j}^{m_j-1}x||)^2 = x^{H}(D_{j}^{m_j-1})^{H}D_{j}^{m_j-1}x \nonumber\]
\[(||D_{j}^{m_j-1}x||)^2 = x^{H}D_{j}^{m_j-1}D_{j}^{m_j-1}x \nonumber\]
\[(||D_{j}^{m_j-1}x||)^2 = x^{H}D_{j}^{m_j-2}D_{j}^{m_j}x \nonumber\]
\[(||D_{j}^{m_j-1}x||)^2 = 0 \nonumber\]
As \(D_{j}^{m_j-1}x = 0\) for every \(x\) it follows (recall the previous exercise) that \(D_{j}^{m_j-1} = 0\). Continuing in this fashion we find \(D_{j}^{m_j-2} = 0\), and so, eventually, \(D_{j} = 0\). If, in addition, \(B\) is real then as the eigenvalues are real and all the \(D_{j}\) vanish, the \(P_{j}\) must also be real. We have now established
If \(B\) is real and symmetric then
\[B = \sum_{j=1}^{h} \lambda_{j} P_{j} \nonumber\]
where the \(\lambda_{j}\) are real and the \(P_{j}\) are real orthogonal projections that sum to the identity and whose pairwise products vanish.
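One way to see this statement at work is to compute the \(\lambda_{j}\) and \(P_{j}\) numerically. The sketch below assumes NumPy and uses `numpy.linalg.eigh`; the helper `spectral_projections`, its tolerance `tol`, and the test matrix are hypothetical choices made for illustration. The helper groups numerically equal eigenvalues and sums \(vv^{T}\) over the orthonormal eigenvectors in each group.

```python
import numpy as np

def spectral_projections(B, tol=1e-9):
    """Distinct eigenvalues and orthogonal projections of a real symmetric B.

    Hypothetical helper: eigenvalues returned by numpy.linalg.eigh that lie
    within `tol` of the first one in a group are treated as equal.
    """
    w, V = np.linalg.eigh(B)   # ascending eigenvalues, orthonormal eigenvectors
    lams, projs = [], []
    start = 0
    for k in range(1, len(w) + 1):
        # Close the current group at the end or when the eigenvalue jumps.
        if k == len(w) or w[k] - w[start] > tol:
            Vj = V[:, start:k]
            lams.append(w[start:k].mean())
            projs.append(Vj @ Vj.T)
            start = k
    return np.array(lams), projs

# Symmetric test matrix with eigenvalues 1, 3, 3, so h = 2.
B = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])
lams, projs = spectral_projections(B)

B_rebuilt = sum(lam * P for lam, P in zip(lams, projs))
sum_of_projections = sum(projs)
cross_product = projs[0] @ projs[1]

print(lams)
print(np.allclose(B_rebuilt, B))
```

Grouping by a tolerance is a design choice forced by floating point arithmetic: repeated eigenvalues are rarely returned as bit-identical numbers.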
One indication that things are simpler when using the spectral representation is
\[B^{100} = \sum_{j=1}^{h} \lambda_{j}^{100} P_{j} \nonumber\]
As this holds for all powers it even holds for power series. As a result,
\[e^B = \sum_{j=1}^{h} e^{\lambda_{j}} P_{j} \nonumber\]
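Both formulas can be tried on the all-ones matrix of the introduction, for which \(\lambda_{1} = 0\), \(\lambda_{2} = 3\), \(P_{1} = I - B/3\) and \(P_{2} = B/3\). The following NumPy sketch compares the spectral form of \(B^{100}\) with repeated multiplication and the spectral form of \(e^{B}\) with a truncated power series; the cutoff of 60 terms is an assumption chosen so that the remainder is negligible for this matrix.

```python
import numpy as np

B = np.ones((3, 3))
I = np.eye(3)
lams = [0.0, 3.0]
projs = [I - B / 3.0, B / 3.0]

# B^100 from the spectral representation and by repeated multiplication.
B100_spectral = sum(lam**100 * P for lam, P in zip(lams, projs))
B100_direct = np.linalg.matrix_power(B, 100)

# e^B from the spectral representation.
expB_spectral = sum(np.exp(lam) * P for lam, P in zip(lams, projs))

# e^B from the power series sum_k B^k / k!, truncated after 60 terms.
expB_series = np.eye(3)
term = np.eye(3)
for k in range(1, 60):
    term = term @ B / k
    expB_series = expB_series + term

print(np.allclose(B100_spectral, B100_direct))
print(np.allclose(expB_spectral, expB_series))
```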
It is also extremely useful in attempting to solve
\[Bx = b \nonumber\]
for \(x\). Replacing \(B\) by its spectral representation and \(b\) by \(Ib\) or, more to the point, by \(\sum_{j} P_{j}b\), we find
\[\sum_{j=1}^{h} \lambda_{j} P_{j} x = \sum_{j=1}^{h} P_{j} b \nonumber\]
Multiplying through by \(P_{1}\) gives \(\lambda_{1}P_{1}x = P_{1}b\) or \(P_{1}x = \frac{P_{1}b}{\lambda_{1}}\). Multiplying through by the subsequent \(P_{j}\)'s gives \(P_{j}x = \frac{P_{j}b}{\lambda_{j}}\).
Hence,
\[x = \sum_{j=1}^{h} P_{j}x \nonumber\]
\[x = \sum_{j=1}^{h} \frac{1}{\lambda_{j}}P_{j}b \nonumber\]
We clearly run into trouble when one of the eigenvalues vanishes. This, of course, is to be expected. For a zero eigenvalue indicates a nontrivial null space which signifies dependencies in the columns of \(B\) and hence the lack of a unique solution to \(Bx = b\).
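When no eigenvalue vanishes, the formula for \(x\) can be tried directly. The sketch below uses a small symmetric matrix chosen for illustration (it does not appear in the text), with eigenvalues 1 and 3 and projections written out by hand from the eigenvectors \((1,-1)^{T}/\sqrt{2}\) and \((1,1)^{T}/\sqrt{2}\). The result is compared against `numpy.linalg.solve`.

```python
import numpy as np

# Illustrative symmetric matrix with eigenvalues 1 and 3.
B = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lams = [1.0, 3.0]
projs = [0.5 * np.array([[1.0, -1.0], [-1.0, 1.0]]),   # onto span{(1, -1)}
         0.5 * np.array([[1.0, 1.0], [1.0, 1.0]])]     # onto span{(1, 1)}

b = np.array([1.0, 0.0])

# x = sum_j (1 / lambda_j) P_j b
x_spectral = sum(P @ b / lam for lam, P in zip(lams, projs))
x_direct = np.linalg.solve(B, b)

print(x_spectral)
print(np.allclose(x_spectral, x_direct))
```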
Another way to view this solution formula is to note that, when \(B\) is symmetric, the partial fraction expansion of the transfer function \((zI-B)^{-1}\) takes the form
\[(zI-B)^{-1} = \sum_{j=1}^{h} \frac{1}{z-\lambda_{j}} P_{j} \nonumber\]
Now if 0 is not an eigenvalue we may set \(z=0\) in the above and arrive at
\[B^{-1} = \sum_{j=1}^{h} \frac{1}{\lambda_{j}} P_{j} \nonumber\]
Hence, the solution to \(Bx = b\) is
\[x = B^{-1}b = \sum_{j=1}^{h} \frac{1}{\lambda_{j}} P_{j} b \nonumber\]
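The two expansions above can be compared with directly computed inverses. The NumPy sketch below assumes a symmetric matrix whose eigenvalues are distinct, so that each \(P_{j}\) is simply \(v_{j}v_{j}^{T}\) for the unit eigenvector \(v_{j}\) returned by `numpy.linalg.eigh`; the matrix and the test point `z = 1.0` are illustrative choices.

```python
import numpy as np

# Illustrative symmetric matrix with three distinct, nonzero eigenvalues.
B = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])
I = np.eye(3)

w, V = np.linalg.eigh(B)
# Distinct eigenvalues: each projection is the outer product of a unit eigenvector.
projs = [np.outer(V[:, j], V[:, j]) for j in range(3)]

z = 1.0  # arbitrary point that is not an eigenvalue
resolvent_direct = np.linalg.inv(z * I - B)
resolvent_spectral = sum(P / (z - lam) for lam, P in zip(w, projs))

# Setting z = 0 and cancelling the common sign gives the inverse itself.
B_inv_spectral = sum(P / lam for lam, P in zip(w, projs))
B_inv_direct = np.linalg.inv(B)

print(np.allclose(resolvent_spectral, resolvent_direct))
print(np.allclose(B_inv_spectral, B_inv_direct))
```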
We have finally reached a point where we can begin to define an inverse even for matrices with dependent columns, i.e., with a zero eigenvalue. We simply exclude the offending term from the sum above. Supposing that \(\lambda_{h} = 0\) we define the pseudo-inverse of \(B\) to be
\[B^{+} = \sum_{j=1}^{h-1} \frac{1}{\lambda_{j}} P_{j} \nonumber\]
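As a concrete instance of this definition, consider again the all-ones matrix, whose only nonzero eigenvalue is 3 with projection \(B/3\); the term belonging to the zero eigenvalue is the one dropped, whichever index it carries. The NumPy sketch below builds \(B^{+}\) this way and compares it with `numpy.linalg.pinv`, which computes a pseudo-inverse by a different route (the singular value decomposition). It also evaluates \(B(B^{+}b)\) for one right-hand side taken from the range of \(B\); the vector used to generate \(b\) is an arbitrary choice.

```python
import numpy as np

B = np.ones((3, 3))

# Keep only the term of the nonzero eigenvalue: (1/3) * (B/3).
P_nonzero = B / 3.0
B_plus = P_nonzero / 3.0

B_plus_numpy = np.linalg.pinv(B)

# A right-hand side in the range of B, built from an arbitrary vector.
b = B @ np.array([1.0, 2.0, 3.0])
x = B_plus @ b

print(np.allclose(B_plus, B_plus_numpy))
print(np.allclose(B @ x, b))
```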
Let us now see whether it is deserving of its name. More precisely, when \(b \in \mathscr{R}(B)\) we would expect that \(x = B^{+} b\) indeed satisfies \(Bx = b\). Well
\[BB^{+} b = B \sum_{j=1}^{h-1} \frac{1}{\lambda_{j}} P_{j} b = \sum_{j=1}^{h-1} \frac{1}{\lambda_{j}} BP_{j}b = \sum_{j=1}^{h-1} \frac{1}{\lambda_{j}} \lambda_{j}P_{j}b = \sum_{j=1}^{h-1}P_{j}b \nonumber\]
It remains to argue that the latter sum really is \(b\). We know that
\[b = \sum_{j=1}^{h} P_{j}b \quad \text{and} \quad b \in \mathscr{R}(B) \nonumber\]
The latter informs us that \(b \perp N(B^T)\). As \(B = B^T\), we have, in fact, that \(b \perp N(B)\). As \(P_{h}\) is nothing but orthogonal projection onto \(N(B)\) it follows that \(P_{h}b = 0\) and so \(B(B^{+}b)=b\), that is, \(x = B^{+}b\) is a solution to \(Bx = b\).
The spectral representation is unarguably terse and in fact is often written out in terms of individual eigenvectors. Let us see how this is done. Note that if \(x \in \mathscr{R}(P_{1})\) then \(x=P_{1}x\) and so,
\[Bx = BP_{1}x = \sum_{j=1}^{h} \lambda_{j} P_{j}P_{1}x = \lambda_{1}P_{1}x = \lambda_{1}x \nonumber\]
i.e., \(x\) is an eigenvector of \(B\) associated with \(\lambda_{1}\). Similarly, every (nonzero) vector in \(\mathscr{R}(P_{j})\) is an eigenvector of \(B\) associated with \(\lambda_{j}\).
Next let us demonstrate that each element of \(\mathscr{R}(P_{j})\) is orthogonal to each element of \(\mathscr{R}(P_{k})\) when \(j \ne k\). If \(x \in \mathscr{R}(P_{j})\) and \(y \in \mathscr{R}(P_{k})\) then
\[x^{T}y = (P_{j}x)^{T}P_{k}y = x^{T}P_{j}P_{k}y = 0 \nonumber\]
With this we note that if \(\{x_{j,1}, x_{j,2}, \cdots, x_{j,n_{j}}\}\) constitutes a basis for \(\mathscr{R}(P_{j})\) then in fact the union of such bases,
\[\{x_{j,p} | (1 \le j \le h) \wedge (1 \le p \le n_{j})\} \nonumber\]
forms a linearly independent set. Notice now that this set has
\[\sum_{j=1}^{h} n_{j} \nonumber\]
elements. That these dimensions indeed sum to the ambient dimension, \(n\), follows directly from the fact that the underlying \(P_{j}\) sum to the \(n\)-by-\(n\) identity matrix. We have just proven the following.
If \(B\) is real and symmetric and \(n\)-by-\(n\), then \(B\) has a set of \(n\) linearly independent eigenvectors.
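This can be observed numerically. In the NumPy sketch below, `numpy.linalg.eigh` returns eigenvectors as the columns of `V`; they come back orthonormal, which is more than the linear independence claimed here. The test matrix, with eigenvalues 1, 3, 3, is an illustrative choice. The code checks that every column is an eigenvector, that the columns have full rank, and that the eigenvector for \(\lambda = 1\) is orthogonal to those for \(\lambda = 3\).

```python
import numpy as np

# Illustrative symmetric matrix with eigenvalues 1, 3, 3.
B = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])
n = B.shape[0]

w, V = np.linalg.eigh(B)   # eigenvalues ascending, eigenvectors in columns

# B v_i = w_i v_i for every column; V * w scales column i by w[i].
columns_are_eigenvectors = np.allclose(B @ V, V * w)

# n linearly independent eigenvectors means V has rank n.
rank = np.linalg.matrix_rank(V)

# Eigenvectors belonging to different eigenvalues are orthogonal.
cross = V[:, :1].T @ V[:, 1:]

print(columns_are_eigenvectors, rank)
```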
Getting back to a more concrete version of the spectral representation we now assemble matrices whose columns are the individual basis vectors,
\[E_{j} \equiv \begin{pmatrix} x_{j,1} & x_{j,2} & \cdots & x_{j,n_{j}} \end{pmatrix} \nonumber\]
and note, once again, that \(P_{j} = E_{j}(E_{j}^{T}E_{j})^{-1}E_{j}^{T}\), and so,
\[B = \sum_{j=1}^{h} \lambda_{j}E_{j}(E_{j}^{T}E_{j})^{-1}E_{j}^{T} \nonumber\]
I understand that you may feel a little overwhelmed with this formula. If we work a bit harder we can remove the presence of the annoying inverse. What I mean is that it is possible to choose a basis for each \(\mathscr{R}(P_{j})\) for which each of the corresponding \(E_{j}\) satisfies \(E_{j}^{T}E_{j} = I\). As this construction is fairly general let us devote a separate section to it (see Gram-Schmidt Orthogonalization).
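The formula with the inverse can be exercised on a small case. The NumPy sketch below uses an illustrative symmetric matrix with eigenvalues 1 and 3 and deliberately non-orthogonal bases for the two eigenspaces; the helper name `projection_onto_columns` is hypothetical. To illustrate what an orthonormal basis buys, it then orthonormalizes one basis with `numpy.linalg.qr`, which is a different technique from Gram-Schmidt but produces an orthonormal basis for the same column space.

```python
import numpy as np

def projection_onto_columns(E):
    """E (E^T E)^{-1} E^T, using a linear solve rather than an explicit inverse."""
    return E @ np.linalg.solve(E.T @ E, E.T)

# Illustrative symmetric matrix with eigenvalues 1 and 3 (the latter repeated).
B = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])

# Columns of E1 span the eigenspace of 1; columns of E2 span that of 3.
# The two columns of E2 are intentionally not orthogonal.
E1 = np.array([[1.0], [-1.0], [0.0]])
E2 = np.array([[1.0, 1.0],
               [1.0, 1.0],
               [0.0, 1.0]])

P1 = projection_onto_columns(E1)
P2 = projection_onto_columns(E2)
B_rebuilt = 1.0 * P1 + 3.0 * P2

# Orthonormal basis for the same column space (reduced QR, Q2 is 3-by-2).
Q2, _ = np.linalg.qr(E2)
P2_from_Q = Q2 @ Q2.T   # no inverse needed because Q2^T Q2 = I

print(np.allclose(B_rebuilt, B))
print(np.allclose(P2_from_Q, P2))
```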