8.4: QR-Factorization
One of the main virtues of orthogonal matrices is that they can be easily inverted—the transpose is the inverse. This fact, combined with the factorization theorem in this section, provides a useful way to simplify many matrix calculations (for example, in least squares approximation).
Definition: QR-Factorization. Let $A$ be an $m \times n$ matrix with independent columns. A QR-factorization of $A$ expresses it as $A = QR$ where $Q$ is $m \times n$ with orthonormal columns and $R$ is an invertible upper triangular matrix with positive diagonal entries.
The importance of the factorization lies in the fact that there are computer algorithms that accomplish it with good control over round-off error, making it particularly useful in matrix calculations. The factorization is a matrix version of the Gram-Schmidt process.
Suppose $A = \begin{bmatrix} c_1 & c_2 & \cdots & c_n \end{bmatrix}$ is an $m \times n$ matrix with linearly independent columns $c_1, c_2, \dots, c_n$. The Gram-Schmidt algorithm can be applied to these columns to provide orthogonal columns $f_1, f_2, \dots, f_n$ where $f_1 = c_1$ and
\[
f_k = c_k - \frac{c_k \cdot f_1}{\|f_1\|^2}f_1 - \frac{c_k \cdot f_2}{\|f_2\|^2}f_2 - \cdots - \frac{c_k \cdot f_{k-1}}{\|f_{k-1}\|^2}f_{k-1}
\]
for each $k = 2, 3, \dots, n$. Now write $q_k = \frac{1}{\|f_k\|}f_k$ for each $k$. Then $q_1, q_2, \dots, q_n$ are orthonormal columns, and the above equation becomes
\[
\|f_k\| q_k = c_k - (c_k \cdot q_1)q_1 - (c_k \cdot q_2)q_2 - \cdots - (c_k \cdot q_{k-1})q_{k-1}
\]
Using these equations, express each $c_k$ as a linear combination of the $q_i$:
\[
\begin{aligned}
c_1 &= \|f_1\| q_1\\
c_2 &= (c_2 \cdot q_1)q_1 + \|f_2\| q_2\\
c_3 &= (c_3 \cdot q_1)q_1 + (c_3 \cdot q_2)q_2 + \|f_3\| q_3\\
&\ \ \vdots\\
c_n &= (c_n \cdot q_1)q_1 + (c_n \cdot q_2)q_2 + (c_n \cdot q_3)q_3 + \cdots + \|f_n\| q_n
\end{aligned}
\]
These equations have a matrix form that gives the required factorization:
\[
A = \begin{bmatrix} c_1 & c_2 & c_3 & \cdots & c_n \end{bmatrix}
  = \begin{bmatrix} q_1 & q_2 & q_3 & \cdots & q_n \end{bmatrix}
\begin{bmatrix}
\|f_1\| & c_2 \cdot q_1 & c_3 \cdot q_1 & \cdots & c_n \cdot q_1\\
0 & \|f_2\| & c_3 \cdot q_2 & \cdots & c_n \cdot q_2\\
0 & 0 & \|f_3\| & \cdots & c_n \cdot q_3\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
0 & 0 & 0 & \cdots & \|f_n\|
\end{bmatrix}
\]
Here the first factor $Q = \begin{bmatrix} q_1 & q_2 & q_3 & \cdots & q_n \end{bmatrix}$ has orthonormal columns, and the second factor is an $n \times n$ upper triangular matrix $R$ with positive diagonal entries (and so is invertible). We record this in the following theorem.
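The construction above translates directly into a short routine. The following is a minimal pure-Python sketch, not from the text: the function name `gram_schmidt_qr` and the list-of-columns representation are illustrative conventions, and the code assumes the input columns are linearly independent.

```python
import math

def gram_schmidt_qr(cols):
    """Classical Gram-Schmidt QR of a matrix given as a list of columns.

    Returns (q_cols, R) where q_cols is a list of orthonormal columns
    and R is an upper triangular list-of-rows with positive diagonal.
    Assumes the columns are linearly independent.
    """
    n = len(cols)
    q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for k, c in enumerate(cols):
        # f_k = c_k minus its projections onto the earlier q_i
        f = list(c)
        for i in range(k):
            R[i][k] = sum(x * y for x, y in zip(cols[k], q_cols[i]))  # c_k . q_i
            f = [x - R[i][k] * y for x, y in zip(f, q_cols[i])]
        norm = math.sqrt(sum(x * x for x in f))  # ||f_k||, the diagonal entry
        R[k][k] = norm
        q_cols.append([x / norm for x in f])     # q_k = f_k / ||f_k||
    return q_cols, R
```

Each entry stored in `R` is exactly one of the coefficients $\|f_k\|$ or $c_k \cdot q_i$ appearing in the triangular matrix above.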
Theorem [thm:025133] (QR-Factorization). Every $m \times n$ matrix $A$ with linearly independent columns has a QR-factorization $A = QR$ where $Q$ has orthonormal columns and $R$ is upper triangular with positive diagonal entries.
The matrices Q and R in Theorem [thm:025133] are uniquely determined by A; we return to this below.
Example. Find the QR-factorization of
\[
A = \begin{bmatrix}
1 & 1 & 0\\
-1 & 0 & 1\\
0 & 1 & 1\\
0 & 0 & 1
\end{bmatrix}
\]
Denote the columns of A as c1, c2, and c3, and observe that {c1,c2,c3} is independent. If we apply the Gram-Schmidt algorithm to these columns, the result is:
\[
f_1 = c_1 = \begin{bmatrix} 1\\ -1\\ 0\\ 0 \end{bmatrix}, \quad
f_2 = c_2 - \tfrac{1}{2}f_1 = \begin{bmatrix} \tfrac{1}{2}\\ \tfrac{1}{2}\\ 1\\ 0 \end{bmatrix}, \quad \mbox{ and } \quad
f_3 = c_3 + \tfrac{1}{2}f_1 - f_2 = \begin{bmatrix} 0\\ 0\\ 0\\ 1 \end{bmatrix}
\]
Write $q_j = \frac{1}{\|f_j\|}f_j$ for each $j$, so $\{q_1, q_2, q_3\}$ is orthonormal. Then equation ([matrixFactEq]) preceding Theorem [thm:025133] gives $A = QR$ where
\[
Q = \begin{bmatrix} q_1 & q_2 & q_3 \end{bmatrix}
  = \begin{bmatrix}
\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & 0\\
-\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & 0\\
0 & \frac{2}{\sqrt{6}} & 0\\
0 & 0 & 1
\end{bmatrix}
  = \frac{1}{\sqrt{6}}\begin{bmatrix}
\sqrt{3} & 1 & 0\\
-\sqrt{3} & 1 & 0\\
0 & 2 & 0\\
0 & 0 & \sqrt{6}
\end{bmatrix}
\]
\[
R = \begin{bmatrix}
\|f_1\| & c_2 \cdot q_1 & c_3 \cdot q_1\\
0 & \|f_2\| & c_3 \cdot q_2\\
0 & 0 & \|f_3\|
\end{bmatrix}
  = \begin{bmatrix}
\sqrt{2} & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}}\\
0 & \frac{\sqrt{3}}{\sqrt{2}} & \frac{\sqrt{3}}{\sqrt{2}}\\
0 & 0 & 1
\end{bmatrix}
  = \frac{1}{\sqrt{2}}\begin{bmatrix}
2 & 1 & -1\\
0 & \sqrt{3} & \sqrt{3}\\
0 & 0 & \sqrt{2}
\end{bmatrix}
\]
The reader can verify that indeed A=QR.
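One way to carry out that verification numerically is to multiply the computed $Q$ and $R$ and compare the product with $A$ entry by entry. A plain-Python sketch (the variable names are ours, not from the text):

```python
import math

s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)

# Q and R as computed in the example above
Q = [[1/s2, 1/s6, 0.0],
     [-1/s2, 1/s6, 0.0],
     [0.0, 2/s6, 0.0],
     [0.0, 0.0, 1.0]]
R = [[s2, 1/s2, -1/s2],
     [0.0, s3/s2, s3/s2],
     [0.0, 0.0, 1.0]]
A = [[1, 1, 0], [-1, 0, 1], [0, 1, 1], [0, 0, 1]]

# largest entrywise difference between A and the product QR
max_err = max(abs(sum(Q[i][k] * R[k][j] for k in range(3)) - A[i][j])
              for i in range(4) for j in range(3))
```

Up to floating-point round-off, `max_err` is zero, confirming $A = QR$.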
If a matrix $A$ has independent rows, we can apply the QR-factorization to $A^T$: writing $A^T = QR$ and transposing gives $A = R^TQ^T$, which yields the following.
Corollary. If $A$ has independent rows, then $A$ factors uniquely as $A = LP$ where $P$ has orthonormal rows and $L$ is an invertible lower triangular matrix with positive main diagonal entries.
Since a square matrix with orthonormal columns is orthogonal, we have
Corollary. Every square, invertible matrix $A$ has factorizations $A = QR$ and $A = LP$ where $Q$ and $P$ are orthogonal, $R$ is upper triangular with positive diagonal entries, and $L$ is lower triangular with positive diagonal entries.
In Section [sec:5_6] we saw how to find a best approximation $z$ to a solution of a (possibly inconsistent) system $Ax = b$ of linear equations: take $z$ to be any solution of the "normal" equations $(A^TA)z = A^Tb$. If $A$ has independent columns this $z$ is unique ($A^TA$ is invertible by Theorem [thm:015672]), so it is often desirable to compute $(A^TA)^{-1}$; this is particularly useful in least squares approximation (Section [sec:5_6]). The computation is simplified if we have a QR-factorization of $A$, and this is one of the main reasons for the importance of Theorem [thm:025133]. For if $A = QR$ is such a factorization, then $Q^TQ = I_n$ because $Q$ has orthonormal columns (verify), so we obtain
\[
A^TA = R^TQ^TQR = R^TR
\]
Hence computing $(A^TA)^{-1}$ amounts to finding $R^{-1}$, and this is a routine matter because $R$ is upper triangular. Thus the difficulty in computing $(A^TA)^{-1}$ lies in obtaining the QR-factorization of $A$.
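In fact, once $A = QR$ is known, the normal equations $(A^TA)z = A^Tb$ become $R^TRz = R^TQ^Tb$, so they reduce to solving $Rz = Q^Tb$ by back substitution. A minimal sketch of that final step (the helper name `back_substitute` and the sample values are hypothetical illustrations, not from the text):

```python
def back_substitute(R, y):
    """Solve R z = y for z, where R is upper triangular (list of rows)
    with nonzero diagonal entries."""
    n = len(y)
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        # subtract the contributions of the already-solved unknowns
        s = sum(R[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (y[i] - s) / R[i][i]
    return z

# hypothetical example: solve [[2, 1], [0, 3]] z = [5, 6]
z = back_substitute([[2.0, 1.0], [0.0, 3.0]], [5.0, 6.0])
```

Because $R$ is triangular, each unknown is recovered in a single pass, with no matrix inversion needed.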
We conclude by proving the uniqueness of the QR-factorization.
Theorem. Let $A$ be an $m \times n$ matrix with independent columns. If $A = QR$ and $A = Q_1R_1$ are QR-factorizations of $A$, then $Q_1 = Q$ and $R_1 = R$.
Write $Q = \begin{bmatrix} c_1 & c_2 & \cdots & c_n \end{bmatrix}$ and $Q_1 = \begin{bmatrix} d_1 & d_2 & \cdots & d_n \end{bmatrix}$ in terms of their columns, and observe first that $Q^TQ = I_n = Q_1^TQ_1$ because $Q$ and $Q_1$ have orthonormal columns. Hence it suffices to show that $Q_1 = Q$ (then $R_1 = Q_1^TA = Q^TA = R$). Since $Q_1^TQ_1 = I_n$, the equation $QR = Q_1R_1$ gives $Q_1^TQ = R_1R^{-1}$; for convenience we write this matrix as
\[
Q_1^TQ = R_1R^{-1} = [t_{ij}]
\]
This matrix is upper triangular with positive diagonal elements (since this is true for $R$ and $R_1$), so $t_{ii} > 0$ for each $i$ and $t_{ij} = 0$ if $i > j$. On the other hand, the $(i,j)$-entry of $Q_1^TQ$ is $d_i^Tc_j = d_i \cdot c_j$, so we have $d_i \cdot c_j = t_{ij}$ for all $i$ and $j$. But each $c_j$ is in $\text{span}\{d_1, d_2, \dots, d_n\}$ because $Q = Q_1(R_1R^{-1})$. Hence the expansion theorem gives
\[
c_j = (d_1 \cdot c_j)d_1 + (d_2 \cdot c_j)d_2 + \cdots + (d_n \cdot c_j)d_n = t_{1j}d_1 + t_{2j}d_2 + \cdots + t_{jj}d_j
\]
because $d_i \cdot c_j = t_{ij} = 0$ if $i > j$. The first few equations here are
\[
\begin{aligned}
c_1 &= t_{11}d_1\\
c_2 &= t_{12}d_1 + t_{22}d_2\\
c_3 &= t_{13}d_1 + t_{23}d_2 + t_{33}d_3\\
c_4 &= t_{14}d_1 + t_{24}d_2 + t_{34}d_3 + t_{44}d_4\\
&\ \ \vdots
\end{aligned}
\]
The first of these equations gives $1 = \|c_1\| = \|t_{11}d_1\| = |t_{11}|\,\|d_1\| = t_{11}$, whence $c_1 = d_1$. But then we have $t_{12} = d_1 \cdot c_2 = c_1 \cdot c_2 = 0$, so the second equation becomes $c_2 = t_{22}d_2$. Now a similar argument gives $c_2 = d_2$, and then $t_{13} = 0$ and $t_{23} = 0$ follow in the same way. Hence $c_3 = t_{33}d_3$ and $c_3 = d_3$. Continue in this way to get $c_i = d_i$ for all $i$. This means that $Q_1 = Q$, which is what we wanted.
1. This section is not used elsewhere in the book.