15: Diagonalizing Symmetric Matrices
Symmetric matrices have many applications. For example, if we consider the shortest distance between pairs of important cities, we might get a table like this:
\[
\begin{array}{c|ccc}
 & \text{Davis} & \text{Seattle} & \text{San Francisco} \\ \hline
\text{Davis} & 0 & 2000 & 80 \\
\text{Seattle} & 2000 & 0 & 2010 \\
\text{San Francisco} & 80 & 2010 & 0
\end{array}
\]
Encoded as a matrix, we obtain:
\[
M=\begin{pmatrix}0&2000&80\\2000&0&2010\\80&2010&0\end{pmatrix}=M^{T}.
\]
Definition: Symmetric Matrix
A matrix is symmetric if it obeys \(M=M^{T}\).
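The definition is easy to check numerically. Here is a small sketch using NumPy (not part of the text), applied to the city-distance matrix above:

```python
import numpy as np

# The city-distance matrix from the text; it is symmetric because
# the distance from A to B equals the distance from B to A.
M = np.array([[   0, 2000,   80],
              [2000,    0, 2010],
              [  80, 2010,    0]])

# A matrix is symmetric exactly when it equals its own transpose.
print(np.array_equal(M, M.T))  # True
```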
One nice property of symmetric matrices is that they always have real eigenvalues. Review exercise 1 guides you through the general proof, but here's an example for 2×2 matrices:
Example 15.1:
For a general symmetric 2×2 matrix, we have:
\[
P_{\lambda}\begin{pmatrix}a&b\\b&d\end{pmatrix}=\det\begin{pmatrix}\lambda-a&-b\\-b&\lambda-d\end{pmatrix}=(\lambda-a)(\lambda-d)-b^{2}=\lambda^{2}-(a+d)\lambda-b^{2}+ad
\]
\[
\Rightarrow\ \lambda=\frac{a+d}{2}\pm\sqrt{b^{2}+\left(\frac{a-d}{2}\right)^{2}}.
\]
Notice that the discriminant \(4b^{2}+(a-d)^{2}\) is always non-negative, so that the eigenvalues must be real.
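The quadratic formula above can be checked against a numerical eigenvalue solver. A sketch using NumPy; the particular values of \(a\), \(b\), \(d\) below are arbitrary sample choices, not from the text:

```python
import numpy as np

# Entries of a symmetric 2x2 matrix (any real sample values work).
a, b, d = 2.0, 1.0, -3.0

# Eigenvalues from the formula derived in the example.
mean = (a + d) / 2
radius = np.sqrt(b**2 + ((a - d) / 2)**2)
lam_minus, lam_plus = mean - radius, mean + radius

# Compare with NumPy's symmetric eigenvalue solver (ascending order).
evals = np.linalg.eigvalsh(np.array([[a, b], [b, d]]))
print(np.allclose(evals, [lam_minus, lam_plus]))  # True
```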
Now, suppose a symmetric matrix \(M\) has two distinct eigenvalues \(\lambda\neq\mu\) with eigenvectors \(x\) and \(y\):
\[
Mx=\lambda x,\qquad My=\mu y.
\]
Consider the dot product \(x\cdot y=x^{T}y=y^{T}x\) and calculate:
\[
x^{T}My=x^{T}\mu y=\mu\,x\cdot y,
\]
and
\[
x^{T}My=(x^{T}My)^{T}\ \text{(transposing a }1\times1\text{ matrix)}=y^{T}M^{T}x=y^{T}Mx=y^{T}\lambda x=\lambda\,x\cdot y.
\]
Subtracting these two results tells us that:
\[
0=\mu\,x\cdot y-\lambda\,x\cdot y=(\mu-\lambda)\,x\cdot y.
\]
Since \(\mu\) and \(\lambda\) were assumed to be distinct eigenvalues, \(\mu-\lambda\) is non-zero, and so \(x\cdot y=0\). We have proved the following theorem.
Theorem
Eigenvectors of a symmetric matrix with distinct eigenvalues are orthogonal.
Example 15.2:
The matrix \(M=\begin{pmatrix}2&1\\1&2\end{pmatrix}\) has eigenvalues determined by
\[
\det(M-\lambda I)=(2-\lambda)^{2}-1=0.
\]
So the eigenvalues of \(M\) are 3 and 1, and the associated eigenvectors turn out to be \(\begin{pmatrix}1\\1\end{pmatrix}\) and \(\begin{pmatrix}1\\-1\end{pmatrix}\). It is easily seen that these eigenvectors are orthogonal:
\[
\begin{pmatrix}1\\1\end{pmatrix}\cdot\begin{pmatrix}1\\-1\end{pmatrix}=1-1=0.
\]
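Both the eigenvector equations and the orthogonality claim can be confirmed numerically; a quick sketch using NumPy (not part of the text):

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])

x = np.array([1.0,  1.0])   # eigenvector for eigenvalue 3
y = np.array([1.0, -1.0])   # eigenvector for eigenvalue 1

print(np.allclose(M @ x, 3 * x))  # True
print(np.allclose(M @ y, 1 * y))  # True
print(np.isclose(x @ y, 0))       # True: distinct eigenvalues -> orthogonal
```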
In chapter 14 we saw that the matrix \(P\) built from any orthonormal basis \((v_{1},\ldots,v_{n})\) for \(\mathbb{R}^{n}\) as its columns,
\[
P=\begin{pmatrix}v_{1}&\cdots&v_{n}\end{pmatrix},
\]
was an orthogonal matrix:
\[
P^{-1}=P^{T},\quad\text{or}\quad PP^{T}=I=P^{T}P.
\]
Moreover, given any unit vector \(x_{1}\), one can always find vectors \(x_{2},\ldots,x_{n}\) such that \((x_{1},\ldots,x_{n})\) is an orthonormal basis. (Such a basis can be obtained using the Gram-Schmidt procedure.)
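This completion step can be sketched in code by running Gram-Schmidt on \(x_{1}\) followed by the standard basis vectors, discarding any vector that becomes (numerically) dependent. The function name and tolerance below are my own choices, not from the text:

```python
import numpy as np

def complete_to_orthonormal_basis(x1):
    """Extend a unit vector x1 to an orthonormal basis of R^n via
    Gram-Schmidt on x1 followed by the standard basis vectors."""
    n = len(x1)
    basis = [x1]
    for e in np.eye(n):
        # Subtract the projections onto the vectors found so far.
        v = e - sum((b @ e) * b for b in basis)
        if np.linalg.norm(v) > 1e-10:     # skip dependent vectors
            basis.append(v / np.linalg.norm(v))
        if len(basis) == n:
            break
    return np.column_stack(basis)

x1 = np.array([1.0, 2.0, 2.0]) / 3.0      # a unit vector in R^3
P = complete_to_orthonormal_basis(x1)
print(np.allclose(P.T @ P, np.eye(3)))    # True: columns are orthonormal
```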
Now suppose \(M\) is a symmetric \(n\times n\) matrix and \(\lambda_{1}\) is an eigenvalue with eigenvector \(x_{1}\) (this is always the case because every matrix has at least one eigenvalue; see review problem 3). Let the square matrix of column vectors \(P\) be the following:
\[
P=\begin{pmatrix}x_{1}&x_{2}&\cdots&x_{n}\end{pmatrix},
\]
where \(x_{1}\) through \(x_{n}\) are orthonormal, and \(x_{1}\) is an eigenvector for \(M\), but the others are not necessarily eigenvectors for \(M\). Then
\[
MP=\begin{pmatrix}\lambda_{1}x_{1}&Mx_{2}&\cdots&Mx_{n}\end{pmatrix}.
\]
But \(P\) is an orthogonal matrix, so \(P^{-1}=P^{T}\). Then:
\[
P^{-1}=P^{T}=\begin{pmatrix}x_{1}^{T}\\\vdots\\x_{n}^{T}\end{pmatrix}
\]
\[
\Rightarrow\ P^{T}MP=\begin{pmatrix}x_{1}^{T}\lambda_{1}x_{1}&*&\cdots&*\\x_{2}^{T}\lambda_{1}x_{1}&*&\cdots&*\\\vdots&\vdots&&\vdots\\x_{n}^{T}\lambda_{1}x_{1}&*&\cdots&*\end{pmatrix}=\begin{pmatrix}\lambda_{1}&*&\cdots&*\\0&*&\cdots&*\\\vdots&\vdots&&\vdots\\0&*&\cdots&*\end{pmatrix}=\begin{pmatrix}\lambda_{1}&0&\cdots&0\\0&&&\\\vdots&&\hat{M}&\\0&&&\end{pmatrix}.
\]
The last equality follows since \(P^{T}MP\) is symmetric. The asterisks in the matrix are where "stuff" happens; this extra information is denoted by \(\hat{M}\) in the final expression. We know nothing about \(\hat{M}\) except that it is an \((n-1)\times(n-1)\) matrix and that it is symmetric. But then, by finding a unit eigenvector for \(\hat{M}\), we could repeat this procedure successively. The end result would be a diagonal matrix with the eigenvalues of \(M\) on the diagonal. Again, we have proved a theorem:
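A single step of this procedure can be illustrated numerically: take one unit eigenvector \(x_{1}\), complete it to an orthonormal basis, and check that \(P^{T}MP\) has the claimed block form. The example matrix and hand-picked basis below are my own, not from the text:

```python
import numpy as np

# A symmetric 3x3 example matrix (illustrative choice).
M = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])

# x1 is a unit eigenvector: M x1 = 2 x1.
x1 = np.array([1.0, 0.0, -1.0]) / np.sqrt(2)
print(np.allclose(M @ x1, 2 * x1))        # True

# Complete x1 to an orthonormal basis by hand; x2 and x3 are
# NOT eigenvectors of M, matching the setup in the text.
x2 = np.array([0.0, 1.0, 0.0])
x3 = np.array([1.0, 0.0, 1.0]) / np.sqrt(2)
P = np.column_stack([x1, x2, x3])
print(np.allclose(P.T @ P, np.eye(3)))    # True: P is orthogonal

B = P.T @ M @ P
# First column is (lambda_1, 0, 0); by symmetry so is the first row,
# leaving a symmetric 2x2 block M_hat in the lower right.
print(np.allclose(B[:, 0], [2, 0, 0]))    # True
print(np.allclose(B[0, :], [2, 0, 0]))    # True
```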
Theorem
Every symmetric matrix is similar to a diagonal matrix of its eigenvalues. In other words,
\[
M=M^{T}\quad\Leftrightarrow\quad M=PDP^{T}
\]
where P is an orthogonal matrix and D is a diagonal matrix whose entries are the eigenvalues of M.
To diagonalize a real symmetric matrix, begin by building an orthogonal matrix from an orthonormal basis of eigenvectors:
Example 15.3:
The symmetric matrix
\[
M=\begin{pmatrix}2&1\\1&2\end{pmatrix}
\]
has eigenvalues 3 and 1 with eigenvectors \(\begin{pmatrix}1\\1\end{pmatrix}\) and \(\begin{pmatrix}1\\-1\end{pmatrix}\) respectively. After normalizing these eigenvectors, we build the orthogonal matrix:
\[
P=\begin{pmatrix}\frac{1}{\sqrt{2}}&\frac{1}{\sqrt{2}}\\[4pt]\frac{1}{\sqrt{2}}&-\frac{1}{\sqrt{2}}\end{pmatrix}.
\]
Notice that \(P^{T}P=I\). Then:
\[
MP=\begin{pmatrix}\frac{3}{\sqrt{2}}&\frac{1}{\sqrt{2}}\\[4pt]\frac{3}{\sqrt{2}}&-\frac{1}{\sqrt{2}}\end{pmatrix}=\begin{pmatrix}\frac{1}{\sqrt{2}}&\frac{1}{\sqrt{2}}\\[4pt]\frac{1}{\sqrt{2}}&-\frac{1}{\sqrt{2}}\end{pmatrix}\begin{pmatrix}3&0\\0&1\end{pmatrix}.
\]
In short, \(MP=PD\), so \(D=P^{T}MP\). Then \(D\) is the diagonalized form of \(M\) and \(P\) the associated change-of-basis matrix from the standard basis to the basis of eigenvectors.
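The whole example can be verified numerically; a sketch using NumPy (not part of the text):

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])

s = 1 / np.sqrt(2)
P = np.array([[s,  s],
              [s, -s]])           # normalized eigenvectors as columns
D = np.diag([3.0, 1.0])           # eigenvalues on the diagonal

print(np.allclose(P.T @ P, np.eye(2)))  # True: P is orthogonal
print(np.allclose(M @ P, P @ D))        # True: MP = PD
print(np.allclose(P.T @ M @ P, D))      # True: D = P^T M P
print(np.allclose(P @ D @ P.T, M))      # True: M = P D P^T
```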
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)