
8.4: QR-Factorization


One of the main virtues of orthogonal matrices is that they can be easily inverted—the transpose is the inverse. This fact, combined with the factorization theorem in this section, provides a useful way to simplify many matrix calculations (for example, in least squares approximation).

Definition 025100 (QR-Factorization). Let A be an m \times n matrix with independent columns. A QR-factorization1 of A expresses it as A = QR where Q is m \times n with orthonormal columns and R is an invertible, upper triangular matrix with positive diagonal entries.

The importance of the factorization lies in the fact that there are computer algorithms that accomplish it with good control over round-off error, making it particularly useful in matrix calculations. The factorization is a matrix version of the Gram-Schmidt process.

Suppose A = \left[ \begin{array}{cccc} \mathbf{c}_{1} & \mathbf{c}_{2} & \cdots & \mathbf{c}_{n} \end{array}\right] is an m \times n matrix with linearly independent columns \mathbf{c}_{1}, \mathbf{c}_{2}, \dots, \mathbf{c}_{n}. The Gram-Schmidt algorithm can be applied to these columns to provide orthogonal columns \mathbf{f}_{1}, \mathbf{f}_{2}, \dots, \mathbf{f}_{n} where \mathbf{f}_{1} = \mathbf{c}_{1} and

\mathbf{f}_{k} = \mathbf{c}_{k} - \frac{\mathbf{c}_{k}\bullet \mathbf{f}_{1}}{\| \mathbf{f}_{1} \|^2}\mathbf{f}_{1} - \frac{\mathbf{c}_{k}\bullet \mathbf{f}_{2}}{\| \mathbf{f}_{2} \|^2}\mathbf{f}_{2} - \dots - \frac{\mathbf{c}_{k}\bullet \mathbf{f}_{k-1}}{\| \mathbf{f}_{k-1} \|^2}\mathbf{f}_{k-1} \nonumber

for each k = 2, 3, \dots, n. Now write \mathbf{q}_{k} = \frac{1}{\| \mathbf{f}_{k} \|}\mathbf{f}_{k} for each k. Then \mathbf{q}_{1}, \mathbf{q}_{2}, \dots, \mathbf{q}_{n} are orthonormal columns, and the above equation becomes

\| \mathbf{f}_{k} \| \mathbf{q}_{k} = \mathbf{c}_{k} - (\mathbf{c}_{k}\bullet \mathbf{q}_{1})\mathbf{q}_{1} - (\mathbf{c}_{k}\bullet \mathbf{q}_{2})\mathbf{q}_{2} - \dots - (\mathbf{c}_{k}\bullet \mathbf{q}_{k-1})\mathbf{q}_{k-1} \nonumber

Using these equations, express each \mathbf{c}_{k} as a linear combination of the \mathbf{q}_{i}:

\begin{array}{ccl} \mathbf{c}_{1} &=& \| \mathbf{f}_{1} \| \mathbf{q}_{1} \\ \mathbf{c}_{2} &=& (\mathbf{c}_{2}\bullet \mathbf{q}_{1})\mathbf{q}_{1} + \| \mathbf{f}_{2} \| \mathbf{q}_{2} \\ \mathbf{c}_{3} &=& (\mathbf{c}_{3}\bullet \mathbf{q}_{1})\mathbf{q}_{1} + (\mathbf{c}_{3}\bullet \mathbf{q}_{2})\mathbf{q}_{2} + \| \mathbf{f}_{3} \| \mathbf{q}_{3} \\ \vdots && \vdots \\ \mathbf{c}_{n} &=& (\mathbf{c}_{n}\bullet \mathbf{q}_{1})\mathbf{q}_{1} + (\mathbf{c}_{n}\bullet \mathbf{q}_{2})\mathbf{q}_{2} + (\mathbf{c}_{n}\bullet \mathbf{q}_{3})\mathbf{q}_{3} + \dots + \| \mathbf{f}_{n} \| \mathbf{q}_{n} \end{array} \nonumber

These equations have a matrix form that gives the required factorization:

\begin{aligned} A & = \left[ \begin{array}{ccccc} \mathbf{c}_{1} & \mathbf{c}_{2} & \mathbf{c}_{3} & \cdots & \mathbf{c}_{n} \end{array} \right] \nonumber \\ &= \left[ \begin{array}{ccccc} \mathbf{q}_{1} & \mathbf{q}_{2} & \mathbf{q}_{3} & \cdots & \mathbf{q}_{n} \end{array} \right] \left[ \begin{array}{ccccc} \| \mathbf{f}_{1} \| & \mathbf{c}_{2}\bullet \mathbf{q}_{1} & \mathbf{c}_{3}\bullet \mathbf{q}_{1} & \cdots & \mathbf{c}_{n}\bullet \mathbf{q}_{1} \\ 0 & \| \mathbf{f}_{2} \| & \mathbf{c}_{3}\bullet \mathbf{q}_{2} & \cdots & \mathbf{c}_{n}\bullet \mathbf{q}_{2} \\ 0 & 0 & \| \mathbf{f}_{3} \| & \cdots & \mathbf{c}_{n}\bullet \mathbf{q}_{3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \| \mathbf{f}_{n} \| \end{array} \right] \label{matrixFactEq}\end{aligned}

Here the first factor Q = \left[ \begin{array}{ccccc} \mathbf{q}_{1} & \mathbf{q}_{2} & \mathbf{q}_{3} & \cdots & \mathbf{q}_{n} \end{array}\right] has orthonormal columns, and the second factor is an n \times n upper triangular matrix R with positive diagonal entries (and so is invertible). We record this in the following theorem.
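The construction above translates directly into an algorithm. The following is a minimal sketch in plain Python (the helper names `dot` and `qr_gram_schmidt` are ours, not from the text): the entries of R above the diagonal are the dot products \mathbf{c}_{k}\bullet \mathbf{q}_{i}, and the diagonal entries are the norms \| \mathbf{f}_{k} \|, exactly as in the displayed equations.

```python
from math import sqrt

def dot(u, v):
    """Dot product of two vectors given as lists."""
    return sum(x * y for x, y in zip(u, v))

def qr_gram_schmidt(cols):
    """Given the columns of A (assumed linearly independent), return
    (q_cols, R): q_cols are the orthonormal columns of Q and R is upper
    triangular with positive diagonal, so that A = QR."""
    n = len(cols)
    q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for k, c in enumerate(cols):
        f = list(c)                      # f_k starts as c_k ...
        for i in range(k):
            R[i][k] = dot(c, q_cols[i])  # r_ik = c_k . q_i
            # ... minus the projection onto each earlier q_i
            f = [fj - R[i][k] * qj for fj, qj in zip(f, q_cols[i])]
        R[k][k] = sqrt(dot(f, f))        # r_kk = ||f_k|| > 0
        q_cols.append([fj / R[k][k] for fj in f])
    return q_cols, R
```

Note that this is the classical Gram-Schmidt recipe of the text; production codes use variants (modified Gram-Schmidt or Householder reflections) for better round-off behaviour, as the remarks on error control suggest.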

Theorem [thm:025133] (QR-Factorization). Every m \times n matrix A with linearly independent columns has a QR-factorization A = QR where Q has orthonormal columns and R is upper triangular with positive diagonal entries.

The matrices Q and R in Theorem [thm:025133] are uniquely determined by A; we return to this below.

Example 025139. Find the QR-factorization of A = \left[ \begin{array}{rrr} 1 & 1 & 0 \\ -1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{array}\right].

Denote the columns of A as \mathbf{c}_{1}, \mathbf{c}_{2}, and \mathbf{c}_{3}, and observe that \{\mathbf{c}_{1}, \mathbf{c}_{2}, \mathbf{c}_{3}\} is independent. If we apply the Gram-Schmidt algorithm to these columns, the result is:

\mathbf{f}_{1} = \mathbf{c}_{1} = \left[ \begin{array}{r} 1 \\ -1 \\ 0 \\ 0 \end{array}\right], \quad \mathbf{f}_{2} = \mathbf{c}_{2} - \frac{1}{2}\mathbf{f}_{1} = \left[ \def\arraystretch{1.2} \begin{array}{r} \frac{1}{2} \\ \frac{1}{2} \\ 1 \\ 0 \end{array}\right], \mbox{ and } \quad \mathbf{f}_{3} = \mathbf{c}_{3} + \frac{1}{2}\mathbf{f}_{1} - \mathbf{f}_{2} = \left[ \begin{array}{r} 0 \\ 0 \\ 0 \\ 1 \end{array}\right]. \nonumber

Write \mathbf{q}_{j} = \frac{1}{\| \mathbf{f}_{j} \|}\mathbf{f}_{j} for each j, so \{\mathbf{q}_{1}, \mathbf{q}_{2}, \mathbf{q}_{3}\} is orthonormal. Then equation ([matrixFactEq]) preceding Theorem [thm:025133] gives A = QR where

\begin{aligned} Q &= \left[ \begin{array}{ccc} \mathbf{q}_{1} & \mathbf{q}_{2} & \mathbf{q}_{3} \end{array}\right] = \left[ \def\arraystretch{1.3}\begin{array}{ccc} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & 0 \\ \frac{-1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & 0 \\ 0 & \frac{2}{\sqrt{6}} & 0 \\ 0 & 0 & 1 \end{array}\right] = \frac{1}{\sqrt{6}} \left[ \begin{array}{ccc} \sqrt{3} & 1 & 0 \\ -\sqrt{3} & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & \sqrt{6} \end{array} \right] \\ R &= \left[ \begin{array}{ccc} \| \mathbf{f}_{1} \| & \mathbf{c}_{2}\bullet \mathbf{q}_{1} & \mathbf{c}_{3}\bullet \mathbf{q}_{1} \\ 0 & \| \mathbf{f}_{2} \| & \mathbf{c}_{3}\bullet \mathbf{q}_{2} \\ 0 & 0 & \| \mathbf{f}_{3} \| \\ \end{array} \right] = \left[ \def\arraystretch{1.5} \begin{array}{ccc} \sqrt{2} & \frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{2}} \\ 0 & \frac{\sqrt{3}}{\sqrt{2}} & \frac{\sqrt{3}}{\sqrt{2}} \\ 0 & 0 & 1 \end{array} \right] = \frac{1}{\sqrt{2}}\left[ \begin{array}{ccc} 2 & 1 & -1 \\ 0 & \sqrt{3} & \sqrt{3} \\ 0 & 0 & \sqrt{2} \end{array} \right] \end{aligned} \nonumber

The reader can verify that indeed A = QR.
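As a quick numerical sanity check (a sketch in plain Python, not from the text), the following multiplies the Q and R found above and confirms that the product is A and that Q^{T}Q = I:

```python
from math import sqrt, isclose

s2, s3, s6 = sqrt(2), sqrt(3), sqrt(6)
Q = [[1/s2, 1/s6, 0],
     [-1/s2, 1/s6, 0],
     [0, 2/s6, 0],
     [0, 0, 1]]
R = [[s2, 1/s2, -1/s2],
     [0, s3/s2, s3/s2],
     [0, 0, 1]]
A = [[1, 1, 0],
     [-1, 0, 1],
     [0, 1, 1],
     [0, 0, 1]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

QR = matmul(Q, R)
assert all(isclose(QR[i][j], A[i][j], abs_tol=1e-12)
           for i in range(4) for j in range(3))   # QR reproduces A

QtQ = matmul([list(col) for col in zip(*Q)], Q)   # Q^T Q
assert all(isclose(QtQ[i][j], float(i == j), abs_tol=1e-12)
           for i in range(3) for j in range(3))   # columns orthonormal
```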

If a matrix A has independent rows and we apply QR-factorization to A^{T}, the result is:

Corollary 025162. If A has independent rows, then A factors uniquely as A = LP where P has orthonormal rows and L is an invertible lower triangular matrix with positive main diagonal entries.

Since a square matrix with orthonormal columns is orthogonal, we have

Corollary 025166. Every square, invertible matrix A has factorizations A = QR and A = LP where Q and P are orthogonal, R is upper triangular with positive diagonal entries, and L is lower triangular with positive diagonal entries.
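The LP-factorization can be obtained from the same Gram-Schmidt routine applied to A^{T}: if A^{T} = QR, then A = R^{T}Q^{T} = LP with L = R^{T} lower triangular and P = Q^{T} having orthonormal rows. A self-contained sketch in plain Python (the helper names and the 2 \times 2 matrix are ours, chosen for illustration):

```python
from math import sqrt, isclose

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gram_schmidt_qr(cols):
    """QR-factorization of the matrix whose columns are `cols`, as in the text."""
    n = len(cols)
    q, R = [], [[0.0] * n for _ in range(n)]
    for k, c in enumerate(cols):
        f = list(c)
        for i in range(k):
            R[i][k] = dot(c, q[i])
            f = [fj - R[i][k] * qj for fj, qj in zip(f, q[i])]
        R[k][k] = sqrt(dot(f, f))
        q.append([fj / R[k][k] for fj in f])
    return q, R

A = [[2.0, 1.0],
     [1.0, 3.0]]          # square and invertible (illustrative numbers)

# The columns of A^T are the rows of A, so factor A^T = QR.
q_cols, R = gram_schmidt_qr(A)

L = [[R[j][i] for j in range(2)] for i in range(2)]   # L = R^T is lower triangular
P = q_cols                                            # rows of P = Q^T are the q's

# Check that A = L P.
LP = [[sum(L[i][k] * P[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
assert all(isclose(LP[i][j], A[i][j], abs_tol=1e-12)
           for i in range(2) for j in range(2))
```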

In Section [sec:5_6] we saw how to find a best approximation \mathbf{z} to a solution of a (possibly inconsistent) system A\mathbf{x} = \mathbf{b} of linear equations: take \mathbf{z} to be any solution of the “normal” equations (A^{T}A)\mathbf{z} = A^{T}\mathbf{b}. If A has independent columns this \mathbf{z} is unique (A^{T}A is invertible by Theorem [thm:015672]), so it is often desirable to compute (A^{T}A)^{-1}. This computation is simplified if we have a QR-factorization of A, and this is one of the main reasons for the importance of Theorem [thm:025133]. For if A = QR is such a factorization, then Q^{T}Q = I_{n} because Q has orthonormal columns (verify), so we obtain

A^TA = R^TQ^TQR = R^TR \nonumber

Hence computing (A^{T}A)^{-1} amounts to finding R^{-1}, and this is a routine matter because R is upper triangular. Thus the difficulty in computing (A^{T}A)^{-1} lies in obtaining the QR-factorization of A.
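Concretely, substituting A^{T}A = R^{T}R into the normal equations gives R^{T}R\mathbf{z} = R^{T}Q^{T}\mathbf{b}, and since R^{T} is invertible this reduces to R\mathbf{z} = Q^{T}\mathbf{b}; an upper triangular system like this is solved by back substitution, with no inversion at all. A sketch in plain Python (the matrix R and right-hand side are made-up illustrative numbers standing in for R and Q^{T}\mathbf{b}):

```python
from math import isclose

def back_substitute(R, y):
    """Solve R z = y for an upper triangular R with nonzero diagonal,
    working from the last equation upward."""
    n = len(y)
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (y[i] - sum(R[i][j] * z[j] for j in range(i + 1, n))) / R[i][i]
    return z

# Illustrative upper triangular system R z = y (y plays the role of Q^T b).
R = [[2.0, 1.0, -1.0],
     [0.0, 3.0, 1.0],
     [0.0, 0.0, 4.0]]
y = [1.0, 5.0, 8.0]
z = back_substitute(R, y)

# Verify the solution satisfies R z = y.
assert all(isclose(sum(R[i][j] * z[j] for j in range(3)), y[i])
           for i in range(3))
```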

We conclude by proving the uniqueness of the QR-factorization.

Theorem 025187 (Uniqueness). Let A be an m \times n matrix with independent columns. If A = QR and A = Q_{1}R_{1} are QR-factorizations of A, then Q_{1} = Q and R_{1} = R.

Write Q = \left[ \begin{array}{cccc} \mathbf{c}_{1} & \mathbf{c}_{2} & \cdots & \mathbf{c}_{n} \end{array} \right] and Q_{1} = \left[ \begin{array}{cccc} \mathbf{d}_{1} & \mathbf{d}_{2} & \cdots & \mathbf{d}_{n} \end{array} \right] in terms of their columns, and observe first that Q^TQ = I_{n} = Q_{1}^TQ_{1} because Q and Q_{1} have orthonormal columns. Hence it suffices to show that Q_{1} = Q (then R_{1} = Q_{1}^TA = Q^TA = R). Since Q_{1}^TQ_{1} = I_{n}, the equation QR = Q_{1}R_{1} gives Q_{1}^TQ = R_{1}R^{-1}; for convenience we write this matrix as

Q_{1}^TQ = R_{1}R^{-1} = \left[ \begin{array}{c} t_{ij} \end{array}\right] \nonumber

This matrix is upper triangular with positive diagonal elements (since this is true for R and R_{1}), so t_{ii} > 0 for each i and t_{ij} = 0 if i > j. On the other hand, the (i, j)-entry of Q_{1}^TQ is \mathbf{d}_{i}^T\mathbf{c}_{j} = \mathbf{d}_{i}\bullet \mathbf{c}_{j}, so we have \mathbf{d}_{i}\bullet \mathbf{c}_{j} = t_{ij} for all i and j. But each \mathbf{c}_{j} is in span \;\{\mathbf{d}_{1}, \mathbf{d}_{2}, \dots, \mathbf{d}_{n}\} because Q = Q_{1}(R_{1}R^{-1}). Hence the expansion theorem gives

\mathbf{c}_{j} = (\mathbf{d}_{1}\bullet \mathbf{c}_{j})\mathbf{d}_{1} + (\mathbf{d}_{2}\bullet \mathbf{c}_{j})\mathbf{d}_{2} + \dots + (\mathbf{d}_{n}\bullet \mathbf{c}_{j})\mathbf{d}_{n} = t_{1j}\mathbf{d}_{1} + t_{2j}\mathbf{d}_{2} + \dots + t_{jj}\mathbf{d}_{j} \nonumber

because \mathbf{d}_{i}\bullet \mathbf{c}_{j} = t_{ij} = 0 if i > j. The first few equations here are

\begin{array}{ccl} \mathbf{c}_{1} &=& t_{11}\mathbf{d}_{1} \\ \mathbf{c}_{2} &=& t_{12}\mathbf{d}_{1} + t_{22}\mathbf{d}_{2} \\ \mathbf{c}_{3} &=& t_{13}\mathbf{d}_{1} + t_{23}\mathbf{d}_{2} + t_{33}\mathbf{d}_{3} \\ \mathbf{c}_{4} &=& t_{14}\mathbf{d}_{1} + t_{24}\mathbf{d}_{2} + t_{34}\mathbf{d}_{3} + t_{44}\mathbf{d}_{4} \\ \vdots && \vdots \end{array} \nonumber

The first of these equations gives 1 = \| \mathbf{c}_{1} \| = \| t_{11}\mathbf{d}_{1} \| = | t_{11} | \| \mathbf{d}_{1} \| = t_{11}, whence \mathbf{c}_{1} = \mathbf{d}_{1}. But then we have t_{12} = \mathbf{d}_{1}\bullet \mathbf{c}_{2} = \mathbf{c}_{1}\bullet \mathbf{c}_{2} = 0, so the second equation becomes \mathbf{c}_{2} = t_{22}\mathbf{d}_{2}. Now a similar argument gives \mathbf{c}_{2} = \mathbf{d}_{2}, and then t_{13} = 0 and t_{23} = 0 follow in the same way. Hence \mathbf{c}_{3} = t_{33}\mathbf{d}_{3}, so \mathbf{c}_{3} = \mathbf{d}_{3}. Continue in this way to get \mathbf{c}_{i} = \mathbf{d}_{i} for all i. This means that Q_{1} = Q, which is what we wanted.


  1. This section is not used elsewhere in the book.

This page titled 8.4: QR-Factorization is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by W. Keith Nicholson (Lyryx Learning Inc.) via source content that was edited to the style and standards of the LibreTexts platform.
