11.1: The Singular Value Decomposition


The singular value decomposition is another name for the spectral representation of a rectangular matrix. Of course, if \(A\) is \(m\)-by-\(n\) and \(m \ne n\), then it does not make sense to speak of the eigenvalues of \(A\). We may, however, rely on the previous section to give us relevant spectral representations of the two symmetric matrices

  • \(A^TA\)
  • \(AA^T\)

That these two matrices together may indeed tell us 'everything' about A can be gleaned from

\[\mathcal{N}(A^TA)=\mathcal{N}(A)\]

\[\mathcal{N}(AA^T)=\mathcal{N}(A^T)\]

\[\mathcal{R}(A^TA)=\mathcal{R}(A^T)\]

\[\mathcal{R}(AA^T)=\mathcal{R}(A)\]

You have proven the first of these in a previous exercise. The proof of the second is identical. The row and column space results follow from the first two via orthogonality.
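These identities are easy to sanity-check numerically. The sketch below, in numpy (our choice of tool; the text itself works in Matlab), verifies the rank consequences of the null-space identities on the 3-by-4 swing matrix introduced in the next paragraph.

```python
import numpy as np

# The 3-by-4 adjacency matrix of the unstable swing (from the text).
A = np.array([[0., 1., 0., 0.],
              [-1., 0., 1., 0.],
              [0., 0., 0., 1.]])

# N(A^T A) = N(A) and N(A A^T) = N(A^T): equal null spaces force
# equal ranks, since rank + nullity = number of columns.
assert np.linalg.matrix_rank(A.T @ A) == np.linalg.matrix_rank(A)
assert np.linalg.matrix_rank(A @ A.T) == np.linalg.matrix_rank(A.T)

# A concrete shared null vector: x = (1, 0, 1, 0) is annihilated
# by both A and A^T A.
x = np.array([1., 0., 1., 0.])
assert np.allclose(A @ x, 0)
assert np.allclose(A.T @ A @ x, 0)
```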

On the spectral side, we shall now see that the eigenvalues of \(AA^T\) and \(A^TA\) are nonnegative and that their nonzero eigenvalues coincide. Let us first confirm this on the adjacency matrix associated with the unstable swing

\[A=\begin{pmatrix} 0&1&0&0\\ -1&0&1&0\\ 0&0&0&1 \end{pmatrix}\]

The respective products are

\[AA^T=\begin{pmatrix} 1&0&0\\ 0&2&0\\ 0&0&1 \end{pmatrix}\]

\[A^TA=\begin{pmatrix} 1&0&-1&0\\ 0&1&0&0\\ -1&0&1&0\\ 0&0&0&1 \end{pmatrix}\]

Analysis of the first is particularly simple. Its null space is clearly just the zero vector, while \(\lambda_1=2\) and \(\lambda_2=1\) are its eigenvalues, with geometric multiplicities \(n_1=1\) and \(n_2=2\). In \(A^TA\) we recognize the \(S\) matrix from the exercise in another module and recall that its eigenvalues are \(\lambda_1=2\), \(\lambda_2=1\), and \(\lambda_3=0\), with multiplicities \(n_1=1\), \(n_2=2\), and \(n_3=1\). Hence, at least for this \(A\), the eigenvalues of \(AA^T\) and \(A^TA\) are nonnegative and their nonzero eigenvalues coincide. In addition, the geometric multiplicities of the nonzero eigenvalues sum to 3, the rank of \(A\).
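A quick numerical check of these eigenvalue claims, again in numpy (our choice of tool):

```python
import numpy as np

A = np.array([[0., 1., 0., 0.],
              [-1., 0., 1., 0.],
              [0., 0., 0., 1.]])

# eigvalsh is for symmetric matrices; sort the eigenvalues descending.
eig_AAT = np.sort(np.linalg.eigvalsh(A @ A.T))[::-1]
eig_ATA = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]

# AA^T has eigenvalues 2, 1, 1; A^T A has 2, 1, 1, 0: all are
# nonnegative, the nonzero ones coincide, and three are nonzero,
# matching the rank of A.
assert np.allclose(eig_AAT, [2, 1, 1])
assert np.allclose(eig_ATA, [2, 1, 1, 0])
```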

Proposition

The eigenvalues of \(AA^T\) and \(A^TA\) are nonnegative. Their nonzero eigenvalues, including geometric multiplicities, coincide. The geometric multiplicities of the nonzero eigenvalues sum to the rank of \(A\).

If \(A^TAx=\lambda x\) then \(x^TA^TAx=\lambda x^Tx\), i.e., \(\|Ax\|^2=\lambda\|x\|^2\), and so \(\lambda \ge 0\). A similar argument works for \(AA^T\).

Now suppose that \(\lambda_j>0\) and that \(\{x_{j,k}\}_{k=1}^{n_j}\) constitutes an orthonormal basis for the eigenspace \(\mathcal{R}(P_j)\). Starting from

\[A^TAx_{j,k}=\lambda_j x_{j,k}\]

we find, on multiplying through (from the left) by A that

\[AA^TAx_{j,k}=\lambda_j Ax_{j,k}\]

i.e., \(\lambda_j\) is an eigenvalue of \(AA^T\) with eigenvector \(Ax_{j,k}\), so long as \(Ax_{j,k}\ne 0\).

It follows from the first paragraph of this proof that \(\|Ax_{j,k}\|=\sqrt{\lambda_j}\), which, by hypothesis, is nonzero. Hence,

\[y_{j,k} \equiv \frac{Ax_{j,k}}{\sqrt{\lambda_j}}, \quad 1\le k\le n_j\]

is a collection of unit eigenvectors of \(AA^T\) associated with \(\lambda_j\). Let us now show that these vectors are orthonormal for fixed \(j\):

\[y_{j,i}^Ty_{j,k}=\frac{1}{\lambda_j}x_{j,i}^TA^TAx_{j,k}=x_{j,i}^Tx_{j,k}=\delta_{ik}\]

We have now demonstrated that if \(\lambda_j>0\) is an eigenvalue of \(A^TA\) of geometric multiplicity \(n_j\), then it is an eigenvalue of \(AA^T\) of geometric multiplicity at least \(n_j\). Reversing the argument, i.e., generating eigenvectors of \(A^TA\) from those of \(AA^T\), we find that the geometric multiplicities must indeed coincide.

Regarding the rank statement, we discern from the eigenvalue equation \(A^TAx_{j,k}=\lambda_jx_{j,k}\) that if \(\lambda_j>0\) then \(x_{j,k}\in\mathcal{R}(A^TA)\). The union of these vectors indeed constitutes a basis for \(\mathcal{R}(A^TA)\), for anything orthogonal to each of these \(x_{j,k}\) necessarily lies in the eigenspace corresponding to a zero eigenvalue, i.e., in \(\mathcal{N}(A^TA)\). As \(\mathcal{R}(A^TA)=\mathcal{R}(A^T)\), it follows that \(\dim\mathcal{R}(A^TA)=r\), and hence the \(n_j\), for \(\lambda_j>0\), sum to \(r\).

Let us now gather together some of the separate pieces of the proof. For starters, we order the eigenvalues of ATA from high to low,

\[\lambda_1>\lambda_2>\cdots>\lambda_h\]

and write

\[A^TA=X\Lambda_nX^T\]

where

\[X_j=\{x_{j,1},\dots,x_{j,n_j}\}, \qquad X=\{X_1,\dots,X_h\}\]

and \(\Lambda_n\) is the \(n\)-by-\(n\) diagonal matrix with \(\lambda_1\) in the first \(n_1\) slots, \(\lambda_2\) in the next \(n_2\) slots, etc. Similarly

\[AA^T=Y\Lambda_mY^T\]

where

\[Y_j=\{y_{j,1},\dots,y_{j,n_j}\}, \qquad Y=\{Y_1,\dots,Y_h\}\]

and \(\Lambda_m\) is the \(m\)-by-\(m\) diagonal matrix with \(\lambda_1\) in the first \(n_1\) slots, \(\lambda_2\) in the next \(n_2\) slots, etc. The \(y_{j,k}\) were defined above under the assumption that \(\lambda_j>0\). If \(\lambda_j=0\), let \(Y_j\) denote an orthonormal basis for \(\mathcal{N}(AA^T)\). Finally, call

\[\sigma_j=\sqrt{\lambda_j}\]

and let \(\Sigma\) denote the \(m\)-by-\(n\) diagonal matrix with \(\sigma_1\) in the first \(n_1\) slots, \(\sigma_2\) in the next \(n_2\) slots, etc. Notice that

\[\Sigma^T\Sigma=\Lambda_n\]

\[\Sigma\Sigma^T=\Lambda_m\]

Now recognize that the definition of \(y_{j,k}\) may be written

\[Ax_{j,k}=\sigma_jy_{j,k}\]

and that this is simply the column by column rendition of

\[AX=Y\Sigma\]

As \(XX^T=I\), we may multiply through (from the right) by \(X^T\) and arrive at the singular value decomposition of \(A\):

\[A=Y\Sigma X^T\]

Let us confirm this on the swing matrix \(A\) above. We have

\[\Lambda_4=\begin{pmatrix} 2&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&0 \end{pmatrix}\]

\[X=\frac{1}{\sqrt{2}}\begin{pmatrix} -1&0&0&1\\ 0&\sqrt{2}&0&0\\ 1&0&0&1\\ 0&0&\sqrt{2}&0 \end{pmatrix}\]

\[\Lambda_3=\begin{pmatrix} 2&0&0\\ 0&1&0\\ 0&0&1 \end{pmatrix}\]

\[Y=\begin{pmatrix} 0&1&0\\ 1&0&0\\ 0&0&1 \end{pmatrix}\]

Hence

\[\Sigma=\begin{pmatrix} \sqrt{2}&0&0&0\\ 0&1&0&0\\ 0&0&1&0 \end{pmatrix}\]

and so \(A=Y\Sigma X^T\) says that \(A\) should coincide with

\[\begin{pmatrix} 0&1&0\\ 1&0&0\\ 0&0&1 \end{pmatrix}\begin{pmatrix} \sqrt{2}&0&0&0\\ 0&1&0&0\\ 0&0&1&0 \end{pmatrix}\frac{1}{\sqrt{2}}\begin{pmatrix} -1&0&1&0\\ 0&\sqrt{2}&0&0\\ 0&0&0&\sqrt{2}\\ 1&0&1&0 \end{pmatrix}\]

This indeed agrees with \(A\). It also agrees (up to sign changes on the columns of \(X\)) with what one receives upon typing [Y, SIG, X] = svd(A) in Matlab.
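The same confirmation can be run in numpy rather than Matlab. A sketch: numpy's svd returns our \(Y\), the singular values, and our \(X^T\), again with possible sign flips on the columns.

```python
import numpy as np

A = np.array([[0., 1., 0., 0.],
              [-1., 0., 1., 0.],
              [0., 0., 0., 1.]])

# numpy plays the role of Matlab's svd here: U is our Y, Vt is our X^T.
U, s, Vt = np.linalg.svd(A)

# Embed the singular values sqrt(2), 1, 1 in a rectangular 3-by-4 Sigma.
S = np.zeros(A.shape)
S[:len(s), :len(s)] = np.diag(s)

assert np.allclose(s, [np.sqrt(2), 1, 1])
assert np.allclose(U @ S @ Vt, A)   # A = Y Sigma X^T
```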

You now ask what we get for our troubles. I express the first dividend as a proposition that looks to me like a quantitative version of the fundamental theorem of linear algebra.

Proposition

If YΣXT is the singular value decomposition of A then

  1. The rank of \(A\), call it \(r\), is the number of nonzero elements in \(\Sigma\).
  2. The first \(r\) columns of \(X\) constitute an orthonormal basis for \(\mathcal{R}(A^T)\). The last \(n-r\) columns of \(X\) constitute an orthonormal basis for \(\mathcal{N}(A)\).
  3. The first \(r\) columns of \(Y\) constitute an orthonormal basis for \(\mathcal{R}(A)\). The last \(m-r\) columns of \(Y\) constitute an orthonormal basis for \(\mathcal{N}(A^T)\).
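For our swing matrix the proposition can be read directly off numpy's svd output; a sketch, with the rank detected by thresholding the singular values:

```python
import numpy as np

A = np.array([[0., 1., 0., 0.],
              [-1., 0., 1., 0.],
              [0., 0., 0., 1.]])

U, s, Vt = np.linalg.svd(A)

# 1. rank = number of nonzero singular values.
r = int(np.sum(s > 1e-12))
assert r == np.linalg.matrix_rank(A) == 3

# 2. The last n - r rows of Vt (i.e., columns of X) span N(A).
null_X = Vt[r:].T
assert np.allclose(A @ null_X, 0)

# 3. The first r columns of Y span R(A): projecting the columns of A
# onto their span changes nothing.
P = U[:, :r] @ U[:, :r].T
assert np.allclose(P @ A, A)
```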

Let us now 'solve' \(Ax=b\) with the help of the pseudo-inverse of \(A\). You know the 'right' thing to do, namely reciprocate all of the nonzero singular values. Because \(m\) is not necessarily \(n\), we must also be careful with dimensions. To be precise, let \(\Sigma^+\) denote the \(n\)-by-\(m\) matrix whose first \(n_1\) diagonal elements are \(1/\sigma_1\), whose next \(n_2\) diagonal elements are \(1/\sigma_2\), and so on. In the case that \(\sigma_h=0\), set the final \(n_h\) diagonal elements of \(\Sigma^+\) to zero. Now, one defines the pseudo-inverse of \(A\) to be

\[A^+ \equiv X\Sigma^+Y^T\]

For the \(A\) of our swing example, we find

\[\Sigma^+=\begin{pmatrix} 1/\sqrt{2}&0&0\\ 0&1&0\\ 0&0&1\\ 0&0&0 \end{pmatrix}\]

and so

\[A^+=\frac{1}{\sqrt{2}}\begin{pmatrix} -1&0&0&1\\ 0&\sqrt{2}&0&0\\ 1&0&0&1\\ 0&0&\sqrt{2}&0 \end{pmatrix}\begin{pmatrix} 1/\sqrt{2}&0&0\\ 0&1&0\\ 0&0&1\\ 0&0&0 \end{pmatrix}\begin{pmatrix} 0&1&0\\ 1&0&0\\ 0&0&1 \end{pmatrix}\]

therefore

\[A^+=\begin{pmatrix} 0&-1/2&0\\ 1&0&0\\ 0&1/2&0\\ 0&0&1 \end{pmatrix}\]
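Assembling Sigma^+ and the triple product in numpy reproduces this matrix; a sketch, compared against numpy's own pinv (the pseudo-inverse is unique, so the two must agree):

```python
import numpy as np

A = np.array([[0., 1., 0., 0.],
              [-1., 0., 1., 0.],
              [0., 0., 0., 1.]])

U, s, Vt = np.linalg.svd(A)   # U is our Y, Vt is our X^T

# Build the n-by-m Sigma^+ by reciprocating the nonzero singular values.
Sp = np.zeros((A.shape[1], A.shape[0]))
for i, sigma in enumerate(s):
    if sigma > 1e-12:
        Sp[i, i] = 1.0 / sigma

Ap = Vt.T @ Sp @ U.T          # A^+ = X Sigma^+ Y^T

# Agrees with numpy's pinv and with the hand computation above.
assert np.allclose(Ap, np.linalg.pinv(A))
assert np.allclose(Ap, [[0., -0.5, 0.],
                        [1., 0., 0.],
                        [0., 0.5, 0.],
                        [0., 0., 1.]])
```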

in agreement with what appears from pinv(A). Let us now investigate the sense in which \(A^+\) is the inverse of \(A\). Suppose that \(b\in\mathbb{R}^m\) and that we wish to solve \(Ax=b\). We suspect that \(A^+b\) should be a good candidate. Observe, using \(A^TA=X\Lambda_nX^T\) and \(A^+=X\Sigma^+Y^T\), that

\[(A^TA)A^+b=X\Lambda_nX^TX\Sigma^+Y^Tb\]

because \(X^TX=I\),

\[(A^TA)A^+b=X\Lambda_n\Sigma^+Y^Tb\]

\[(A^TA)A^+b=X\Sigma^T\Sigma\Sigma^+Y^Tb\]

because \(\Sigma^T\Sigma\Sigma^+=\Sigma^T\),

\[(A^TA)A^+b=X\Sigma^TY^Tb\]

\[(A^TA)A^+b=A^Tb\]

that is, \(A^+b\) satisfies the least-squares problem \(A^TAx=A^Tb\).
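In numpy this least-squares property is immediate; a sketch, where b is an arbitrary right-hand side of our choosing:

```python
import numpy as np

A = np.array([[0., 1., 0., 0.],
              [-1., 0., 1., 0.],
              [0., 0., 0., 1.]])
b = np.array([1., 2., 3.])    # any b in R^3 will do

x = np.linalg.pinv(A) @ b     # the candidate A^+ b

# A^+ b satisfies the normal equations A^T A x = A^T b ...
assert np.allclose(A.T @ A @ x, A.T @ b)

# ... and matches numpy's own (minimum-norm) least-squares solution.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x, x_ls)
```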


This page titled 11.1: The Singular Value Decomposition is shared under a CC BY 1.0 license and was authored, remixed, and/or curated by Steve Cox via source content that was edited to the style and standards of the LibreTexts platform.
