6.3: Orthogonal bases and projections
\(\newcommand{\twovec}[2]{\begin{pmatrix} #1 \\ #2 \end{pmatrix} } \)
\(\newcommand{\threevec}[3]{\begin{pmatrix} #1 \\ #2 \\ #3 \end{pmatrix} } \)
\( \newcommand{\fourvec}[4]{\begin{pmatrix} #1 \\ #2 \\ #3 \\ #4 \end{pmatrix} } \)
We know that a linear system \(A\mathbf x=\mathbf b\) is inconsistent when \(\mathbf b\) is not in \(Col(A)\text{,}\) the column space of \(A\text{.}\) In Section 6.5, we'll develop a strategy for dealing with inconsistent systems by finding \(\widehat{\mathbf{b}}\text{,}\) the vector in \(Col(A)\) that is closest to \(\mathbf b\text{.}\) The equation \(A\mathbf x=\widehat{\mathbf{b}}\) is then consistent and its solution set can provide us with useful information about the original system.
In this section and the next, we'll develop some techniques that enable us to find \(\widehat{\mathbf{b}}\text{,}\) the vector in a given subspace \(W\) that is closest to a given vector \(\mathbf b\text{.}\)
Preview Activity 6.3.1.
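For this activity, it will be helpful to recall the distributive property of dot products:
\begin{equation*} \mathbf v\cdot(c_1\mathbf w_1 + c_2\mathbf w_2) = c_1\mathbf v\cdot\mathbf w_1 + c_2\mathbf v\cdot\mathbf w_2\text{.} \end{equation*}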
We'll work with the basis of \(\mathbb R^2\) formed by the vectors
- Verify that the vectors \(\mathbf w_1\) and \(\mathbf w_2\) are orthogonal.
- Suppose that \(\mathbf b =\twovec74\) and find the dot products \(\mathbf w_1\cdot\mathbf b\) and \(\mathbf w_2\cdot\mathbf b\text{.}\)
-
We would like to express \(\mathbf b\) as a linear combination of \(\mathbf w_1\) and \(\mathbf w_2\text{,}\) which means that we need to find weights \(c_1\) and \(c_2\) such that
\begin{equation*} \mathbf b = c_1\mathbf w_1 + c_2\mathbf w_2\text{.} \end{equation*}
To find the weight \(c_1\text{,}\) dot both sides of this expression with \(\mathbf w_1\text{:}\)
\begin{equation*} \mathbf b\cdot\mathbf w_1 = (c_1\mathbf w_1 + c_2\mathbf w_2)\cdot\mathbf w_1\text{,} \end{equation*}
and apply the distributive property.
- In a similar fashion, find the weight \(c_2\text{.}\)
- Verify that \(\mathbf b = c_1\mathbf w_1+c_2\mathbf w_2\) using the weights you have found.
We are frequently asked to write a given vector as a linear combination of given basis vectors. In the past, we have done this by solving a linear system. The preview activity illustrates how this task can be simplified when the basis vectors are orthogonal to one another. We'll explore this and other uses of orthogonal bases in this section.
Orthogonal sets
The preview activity dealt with a basis of \(\mathbb R^2\) formed by two orthogonal vectors. We will more generally consider a set of orthogonal vectors, as described in the next definition.
By an orthogonal set of vectors, we mean a set of nonzero vectors each of which is orthogonal to the others.
The 3-dimensional vectors
form an orthogonal set, which can be verified by computing
Notice that this set of vectors forms a basis for \(\mathbb R^3\text{.}\)
The vectors
form an orthogonal set of 4-dimensional vectors. Since there are only three vectors, this set does not form a basis for \(\mathbb R^4\text{.}\) It does, however, form a basis for a 3-dimensional subspace \(W\) of \(\mathbb R^4\text{.}\)
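Checking that a set is orthogonal amounts to computing every pairwise dot product. The following is a minimal sketch of that check in Python. The helper names `dot` and `is_orthogonal_set` and the sample vectors are assumptions made for illustration and are not taken from the examples above. The comparison with zero is exact only because the entries are integers; floating-point entries would need a tolerance.

```python
from itertools import combinations

def dot(v, w):
    """Dot product of two equal-length vectors."""
    return sum(a * c for a, c in zip(v, w))

def is_orthogonal_set(vectors):
    """True if every vector is nonzero and each pair has dot product 0."""
    if any(dot(v, v) == 0 for v in vectors):
        return False  # an orthogonal set contains only nonzero vectors
    return all(dot(v, w) == 0 for v, w in combinations(vectors, 2))

# Hypothetical sample vectors chosen for illustration.
w1, w2, w3 = (1, -1, 1), (1, 1, 0), (1, -1, -2)
print(is_orthogonal_set([w1, w2, w3]))         # True
print(is_orthogonal_set([w1, w2, (1, 0, 0)]))  # False, since w1 . (1,0,0) = 1
```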
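Suppose that a vector \(\mathbf b\) is a linear combination of an orthogonal set of vectors \(\mathbf w_1,\mathbf w_2,\ldots,\mathbf w_n\text{;}\) that is, suppose that
\begin{equation*} \mathbf b = c_1\mathbf w_1 + c_2\mathbf w_2 + \ldots + c_n\mathbf w_n\text{.} \end{equation*}
Just as in the preview activity, we can find the weight \(c_1\) by dotting both sides with \(\mathbf w_1\) and applying the distributive property of dot products:
\begin{equation*} \mathbf b\cdot\mathbf w_1 = (c_1\mathbf w_1 + c_2\mathbf w_2 + \ldots + c_n\mathbf w_n)\cdot\mathbf w_1 = c_1\mathbf w_1\cdot\mathbf w_1 + c_2\mathbf w_2\cdot\mathbf w_1 + \ldots + c_n\mathbf w_n\cdot\mathbf w_1 = c_1\mathbf w_1\cdot\mathbf w_1\text{.} \end{equation*}
Since \(\mathbf w_1\) is nonzero, this gives \(c_1 = \frac{\mathbf b\cdot\mathbf w_1}{\mathbf w_1\cdot\mathbf w_1}\text{.}\)
Notice how the presence of an orthogonal set causes most of the terms in the sum to vanish. In the same way, we find that
\begin{equation*} c_i = \frac{\mathbf b\cdot\mathbf w_i}{\mathbf w_i\cdot\mathbf w_i} \end{equation*}
so that
\begin{equation*} \mathbf b = \frac{\mathbf b\cdot\mathbf w_1}{\mathbf w_1\cdot\mathbf w_1}~\mathbf w_1 + \frac{\mathbf b\cdot\mathbf w_2}{\mathbf w_2\cdot\mathbf w_2}~\mathbf w_2 + \ldots + \frac{\mathbf b\cdot\mathbf w_n}{\mathbf w_n\cdot\mathbf w_n}~\mathbf w_n\text{.} \end{equation*}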
We'll record this fact in the following proposition.
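If a vector \(\mathbf b\) is a linear combination of an orthogonal set of vectors \(\mathbf w_1,\mathbf w_2,\ldots,\mathbf w_n\text{,}\) then
\begin{equation*} \mathbf b = \frac{\mathbf b\cdot\mathbf w_1}{\mathbf w_1\cdot\mathbf w_1}~\mathbf w_1 + \frac{\mathbf b\cdot\mathbf w_2}{\mathbf w_2\cdot\mathbf w_2}~\mathbf w_2 + \ldots + \frac{\mathbf b\cdot\mathbf w_n}{\mathbf w_n\cdot\mathbf w_n}~\mathbf w_n\text{.} \end{equation*}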
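To see how little work the weights require when the vectors are orthogonal, here is a small sketch that computes each weight as a ratio of two dot products, with no linear system to solve. The function name `weights` and the sample vectors are assumptions for illustration. Exact fractions from the standard library avoid rounding.

```python
from fractions import Fraction

def dot(v, w):
    """Dot product of two equal-length vectors."""
    return sum(a * c for a, c in zip(v, w))

def weights(b, basis):
    """Weights c_i = (b . w_i) / (w_i . w_i).

    Valid only when `basis` is an orthogonal set and b lies in its span.
    """
    return [Fraction(dot(b, w), dot(w, w)) for w in basis]

# Hypothetical orthogonal basis of R^3 and vector b, chosen for illustration.
basis = [(1, -1, 1), (1, 1, 0), (1, -1, -2)]
b = (4, 2, 1)
c = weights(b, basis)
print([str(ci) for ci in c])  # ['1', '3', '0']

# Rebuild b from the weights to confirm the linear combination.
rebuilt = [sum(ci * w[k] for ci, w in zip(c, basis)) for k in range(len(b))]
print(rebuilt == list(b))  # True
```

If the vectors were not orthogonal, the cross terms would not vanish and these ratios would generally give the wrong weights.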
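Using this proposition, we can see that an orthogonal set of vectors must be linearly independent. Suppose, for instance, that \(\mathbf w_1,\mathbf w_2,\ldots,\mathbf w_n\) is a set of nonzero orthogonal vectors and that one of the vectors is a linear combination of the others, say,
\begin{equation*} \mathbf w_3 = c_1\mathbf w_1 + c_2\mathbf w_2 + c_4\mathbf w_4 + \ldots + c_n\mathbf w_n\text{.} \end{equation*}
We therefore know that
\begin{equation*} \mathbf w_3\cdot\mathbf w_3 = (c_1\mathbf w_1 + c_2\mathbf w_2 + c_4\mathbf w_4 + \ldots + c_n\mathbf w_n)\cdot\mathbf w_3 = 0\text{,} \end{equation*}
which cannot happen since we know that \(\mathbf w_3\) is nonzero. This tells us that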
An orthogonal set of vectors \(\mathbf w_1,\mathbf w_2,\ldots,\mathbf w_n\) is linearly independent.
If the vectors \(\mathbf w_1,\mathbf w_2,\ldots,\mathbf w_n\) in an orthogonal set have dimension \(m\text{,}\) they form a linearly independent set in \(\mathbb R^m\) and are therefore a basis for the subspace \(W=Span \{\mathbf w_1,\mathbf w_2,\ldots,\mathbf w_n \}\text{.}\) If there are \(m\) vectors in the orthogonal set, they form a basis for \(\mathbb R^m\text{.}\)
Activity 6.3.2.
Consider the vectors
-
Verify that this set forms an orthogonal set of \(3\)-dimensional vectors.
- Explain why we now know that this set of vectors forms a basis for \(\mathbb R^3\text{.}\)
- Suppose that \(\mathbf b=\threevec24{-4}\text{.}\) Find the weights \(c_1\text{,}\) \(c_2\text{,}\) and \(c_3\) that express \(\mathbf b\) as a linear combination \(\mathbf b=c_1\mathbf w_1 + c_2\mathbf w_2 + c_3\mathbf w_3\) using Proposition 6.3.4.
-
If we multiply a vector \(\mathbf v\) by a positive scalar \(s\text{,}\) the length of \(\mathbf v\) is also multiplied by \(s\text{;}\) that is, \(|{s\mathbf v}| = s|{\mathbf v}|\text{.}\)
Using this observation, find a vector \(\mathbf u_1\) that is parallel to \(\mathbf w_1\) and has length 1. Such vectors are called unit vectors.
- Similarly, find a unit vector \(\mathbf u_2\) that is parallel to \(\mathbf w_2\) and a unit vector \(\mathbf u_3\) that is parallel to \(\mathbf w_3\text{.}\)
- Construct the matrix \(Q=\begin{bmatrix} \mathbf u_1 & \mathbf u_2 & \mathbf u_3 \end{bmatrix}\) and find the product \(Q^TQ\text{.}\) Use Proposition 6.2.8 to explain your result.
This activity introduces an important way of modifying an orthogonal set so that the vectors in the set have unit length. Recall that we may multiply any nonzero vector \(\mathbf w\) by a scalar so that the new vector has length 1. For instance, we know that, if \(s\) is a positive scalar, then \(|{s\mathbf w}| = s|{\mathbf w}| \text{.}\) To obtain a vector \(\mathbf u = s\mathbf w\) having unit length, we want
\begin{equation*} |{\mathbf u}| = |{s\mathbf w}| = s|{\mathbf w}| = 1 \end{equation*}
so that \(s=1/|{\mathbf w}| \text{.}\) Therefore,
\begin{equation*} \mathbf u = \frac{1}{|{\mathbf w}|}~\mathbf w \end{equation*}
becomes a unit vector parallel to \(\mathbf w\text{.}\)
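The scaling step can be sketched in a few lines of Python. The helper names `length` and `normalize` are assumptions for illustration. The sample vector is \(\mathbf w_1=\threevec22{-1}\text{,}\) which appears later in this section and has length 3.

```python
import math

def length(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(a * a for a in v))

def normalize(w):
    """Return the unit vector (1/|w|) w parallel to a nonzero vector w."""
    n = length(w)
    if n == 0:
        raise ValueError("the zero vector cannot be scaled to unit length")
    return [a / n for a in w]

w = (2, 2, -1)          # length is sqrt(4 + 4 + 1) = 3
u = normalize(w)
print(u)                # approximately [2/3, 2/3, -1/3]
print(math.isclose(length(u), 1.0))  # True
```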
Orthogonal sets in which the vectors have unit length are called orthonormal and are especially convenient.
An orthonormal set is an orthogonal set of vectors each of which has unit length.
The vectors
are an orthonormal set of vectors in \(\mathbb R^2\) and form an orthonormal basis for \(\mathbb R^2\text{.}\)
If we form the matrix
we find that \(Q^TQ = I\) since Proposition 6.2.8 tells us that
The previous activity and example illustrate the next proposition.
If the columns of the \(m\times n\) matrix \(Q\) form an orthonormal set, then \(Q^TQ = I_n\text{,}\) the \(n\times n\) identity matrix.
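As a quick check of this proposition, the sketch below forms \(Q^TQ\) for a \(3\times2\) matrix with orthonormal columns. The matrix and the helper names `transpose` and `matmul` are assumptions chosen for illustration. The entries are exact fractions so that the comparison with the identity is exact.

```python
from fractions import Fraction as F

def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Matrix product of A and B, both stored as lists of rows."""
    cols = transpose(B)
    return [[sum(a * c for a, c in zip(row, col)) for col in cols] for row in A]

# Hypothetical 3x2 matrix whose two columns are orthonormal.
Q = [[F(2, 3), F(1, 3)],
     [F(2, 3), F(-2, 3)],
     [F(-1, 3), F(-2, 3)]]

QtQ = matmul(transpose(Q), Q)
print(QtQ == [[1, 0], [0, 1]])  # True: Q^T Q is the 2x2 identity
```

Each entry of \(Q^TQ\) is a dot product between two columns of \(Q\text{,}\) which is why the diagonal entries are 1 and the others are 0.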
Orthogonal projections
We now turn to an important problem that will appear in many forms in the rest of our investigations. Suppose, as shown in Figure 6.3.9, that we have a subspace \(W\) of \(\mathbb R^m\) and a vector \(\mathbf b\) that is not in that subspace. We would like to find the vector \(\widehat{\mathbf{b}}\) in \(W\) that is closest to \(\mathbf b\text{.}\)
To get started, let's consider a simpler problem where we have a line \(L\) in \(\mathbb R^2\text{,}\) defined by the vector \(\mathbf w\text{,}\) and another vector \(\mathbf b\) that is not on the line, as shown on the left of Figure 6.3.10. We wish to find \(\widehat{\mathbf{b}}\text{,}\) the vector on the line that is closest to \(\mathbf b\text{,}\) as illustrated in the right of Figure 6.3.10.
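To find \(\widehat{\mathbf{b}}\text{,}\) we require that \(\mathbf b-\widehat{\mathbf{b}}\) be orthogonal to \(L\text{.}\) For instance, if \(\mathbf y\) is another vector on the line, as shown in Figure 6.3.11, then the Pythagorean theorem implies that
\begin{equation*} |{\mathbf b-\mathbf y}|^2 = |{\mathbf b-\widehat{\mathbf{b}}}|^2 + |{\widehat{\mathbf{b}}-\mathbf y}|^2\text{,} \end{equation*}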
which means that \(|{\mathbf b-\mathbf y}| \geq|\mathbf b-\widehat{\mathbf{b}}|\text{.}\) Therefore, \(\widehat{\mathbf{b}}\) is closer to \(\mathbf b\) than any other vector on the line \(L\text{.}\)
Given a vector \(\mathbf b\) in \(\mathbb R^m\) and a subspace \(W\) of \(\mathbb R^m\text{,}\) the orthogonal projection of \(\mathbf b\) onto \(W\) is the vector \(\widehat{\mathbf{b}}\) in \(W\) that is closest to \(\mathbf b\text{.}\) It is characterized by the property that \(\mathbf b-\widehat{\mathbf{b}}\) is orthogonal to \(W\text{.}\)
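The Pythagorean argument can be checked numerically. The sketch below projects a vector onto a line by choosing the scalar \(s\) so that \(\mathbf b - s\mathbf w\) is orthogonal to \(\mathbf w\text{,}\) and then compares the result with a few other points on the line. The data and the helper names are assumptions for illustration and are not taken from the activities in this section.

```python
from fractions import Fraction

def dot(v, w):
    """Dot product of two equal-length vectors."""
    return sum(a * c for a, c in zip(v, w))

def project_onto_line(b, w):
    """Return s*w with s chosen so that (b - s*w) . w = 0."""
    s = Fraction(dot(b, w), dot(w, w))
    return [s * a for a in w]

def dist2(p, q):
    """Squared distance between p and q."""
    return sum((a - c) ** 2 for a, c in zip(p, q))

# Hypothetical line direction w and vector b.
w, b = (3, 1), (1, 3)
b_hat = project_onto_line(b, w)
residual = [bi - hi for bi, hi in zip(b, b_hat)]
print(dot(residual, w) == 0)  # True: b - b_hat is orthogonal to the line

# Compare b_hat with a few other points y = t*w on the line.
others = [[t * a for a in w] for t in (-1, 0, Fraction(1, 2), 2)]
print(all(dist2(b, y) == dist2(b, b_hat) + dist2(b_hat, y) for y in others))  # True
print(all(dist2(b, y) >= dist2(b, b_hat) for y in others))                    # True
```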
Activity 6.3.3.
This activity demonstrates how to determine the orthogonal projection of a vector onto a subspace of \(\mathbb R^m\text{.}\)
-
Let's begin by considering a line \(L\text{,}\) defined by the vector \(\mathbf w=\twovec21\text{,}\) and a vector \(\mathbf b=\twovec24\) not on \(L\text{,}\) as illustrated in Figure 6.3.13.
-
To find \(\widehat{\mathbf{b}}\text{,}\) first notice that \(\widehat{\mathbf{b}} = s\mathbf w\) for some scalar \(s\text{.}\) Since \(\mathbf b-\widehat{\mathbf{b}} = \mathbf b - s\mathbf w\) is orthogonal to \(\mathbf w\text{,}\) what do we know about the dot product
\begin{equation*} (\mathbf b-s\mathbf w)\cdot\mathbf w\text{?} \end{equation*}
- Apply the distributive property of dot products to find the scalar \(s\text{.}\) What is the vector \(\widehat{\mathbf{b}}\text{,}\) the orthogonal projection of \(\mathbf b\) onto \(L\text{?}\)
-
More generally, explain why the orthogonal projection of \(\mathbf b\) onto the line defined by \(\mathbf w\) is
\begin{equation*} \widehat{\mathbf{b}}= \frac{\mathbf b\cdot\mathbf w}{\mathbf w\cdot\mathbf w}~\mathbf w\text{.} \end{equation*}
-
The same ideas apply more generally. Suppose we have an orthogonal set of vectors \(\mathbf w_1=\threevec22{-1}\) and \(\mathbf w_2=\threevec102\) that define a plane \(W\) in \(\mathbb R^3\text{.}\) If \(\mathbf b=\threevec396\) is another vector in \(\mathbb R^3\text{,}\) we seek the vector \(\widehat{\mathbf{b}}\) on the plane \(W\) closest to \(\mathbf b\text{.}\) As before, the vector \(\mathbf b-\widehat{\mathbf{b}}\) will be orthogonal to \(W\text{,}\) as illustrated in Figure 6.3.14.
- The vector \(\mathbf b-\widehat{\mathbf{b}}\) is orthogonal to \(W\text{.}\) What does this say about the dot products: \((\mathbf b-\widehat{\mathbf{b}})\cdot\mathbf w_1\) and \((\mathbf b-\widehat{\mathbf{b}})\cdot\mathbf w_2\text{?}\)
-
Since \(\widehat{\mathbf{b}}\) is in the plane \(W\text{,}\) we can write it as a linear combination \(\widehat{\mathbf{b}} = c_1\mathbf w_1 + c_2\mathbf w_2\text{.}\) Then
\begin{equation*} \mathbf b-\widehat{\mathbf{b}} = \mathbf b - (c_1\mathbf w_1+c_2\mathbf w_2)\text{.} \end{equation*}
Find the weight \(c_1\) by dotting \(\mathbf b-\widehat{\mathbf{b}}\) with \(\mathbf w_1\) and applying the distributive property of dot products. Similarly, find the weight \(c_2\text{.}\)
- What is the vector \(\widehat{\mathbf{b}}\text{,}\) the orthogonal projection of \(\mathbf b\) onto the plane \(W\text{?}\)
-
Suppose that \(W\) is a subspace of \(\mathbb R^m\) with orthogonal basis \(\mathbf w_1,\mathbf w_2,\ldots,\mathbf w_n\) and that \(\mathbf b\) is a vector in \(\mathbb R^m\text{.}\) Explain why the orthogonal projection of \(\mathbf b\) onto \(W\) is the vector
\begin{equation*} \widehat{\mathbf{b}} = \frac{\mathbf b\cdot\mathbf w_1}{\mathbf w_1\cdot\mathbf w_1}~\mathbf w_1 + \frac{\mathbf b\cdot\mathbf w_2}{\mathbf w_2\cdot\mathbf w_2}~\mathbf w_2 + \ldots + \frac{\mathbf b\cdot\mathbf w_n}{\mathbf w_n\cdot\mathbf w_n}~\mathbf w_n\text{.} \end{equation*}
-
Suppose that \(\mathbf u_1,\mathbf u_2,\ldots,\mathbf u_n\) is an orthonormal basis for \(W\text{;}\) that is, the vectors are orthogonal to one another and have unit length. Explain why the orthogonal projection is
\begin{equation*} \widehat{\mathbf{b}}= (\mathbf b\cdot\mathbf u_1)~\mathbf u_1 + (\mathbf b\cdot\mathbf u_2)~\mathbf u_2 + \ldots + (\mathbf b\cdot\mathbf u_n)~\mathbf u_n\text{.} \end{equation*}
- If \(Q=\begin{bmatrix} \mathbf u_1 & \mathbf u_2 & \ldots & \mathbf u_n \end{bmatrix}\) is the matrix whose columns are an orthonormal basis of \(W\text{,}\) use Proposition 6.2.8 to explain why \(\widehat{\mathbf{b}} = QQ^T\mathbf b\text{.}\)
In all the cases considered in the activity, we are looking for \(\widehat{\mathbf{b}}\text{,}\) the vector in a subspace \(W\) closest to a vector \(\mathbf b\text{,}\) which is found by requiring that \(\mathbf b-\widehat{\mathbf{b}}\) be orthogonal to \(W\text{.}\) This means that \((\mathbf b-\widehat{\mathbf{b}})\cdot\mathbf w = 0\) for any vector \(\mathbf w\) in \(W\text{.}\)
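If we have an orthogonal basis \(\mathbf w_1,\mathbf w_2,\ldots,\mathbf w_n\) for \(W\text{,}\) then \(\widehat{\mathbf{b}} = c_1\mathbf w_1+c_2\mathbf w_2+\ldots+c_n\mathbf w_n\text{.}\) Therefore,
\begin{equation*} \begin{aligned} (\mathbf b-\widehat{\mathbf{b}})\cdot\mathbf w_i & = 0 \\ \mathbf b\cdot\mathbf w_i & = \widehat{\mathbf{b}}\cdot\mathbf w_i = c_i\mathbf w_i\cdot\mathbf w_i \\ c_i & = \frac{\mathbf b\cdot\mathbf w_i}{\mathbf w_i\cdot\mathbf w_i}\text{.} \end{aligned} \end{equation*}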
This leads to the projection formula:
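If \(W\) is a subspace of \(\mathbb R^m\) having an orthogonal basis \(\mathbf w_1,\mathbf w_2,\ldots, \mathbf w_n\) and \(\mathbf b\) is a vector in \(\mathbb R^m\text{,}\) then the orthogonal projection of \(\mathbf b\) onto \(W\) is
\begin{equation*} \widehat{\mathbf{b}} = \frac{\mathbf b\cdot\mathbf w_1}{\mathbf w_1\cdot\mathbf w_1}~\mathbf w_1 + \frac{\mathbf b\cdot\mathbf w_2}{\mathbf w_2\cdot\mathbf w_2}~\mathbf w_2 + \ldots + \frac{\mathbf b\cdot\mathbf w_n}{\mathbf w_n\cdot\mathbf w_n}~\mathbf w_n\text{.} \end{equation*}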
Caution.
Remember that the projection formula given in Proposition 6.3.15 applies only when the basis \(\mathbf w_1,\mathbf w_2,\ldots,\mathbf w_n\) of \(W\) is orthogonal.
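The projection formula and the caution can both be illustrated with a short sketch. The function `project` below simply adds up the terms of the formula. It is an illustrative helper, not part of any library. The orthogonal vectors are the ones from the previous activity, while the non-orthogonal basis \(\mathbf w_1, \mathbf v_2 = \mathbf w_1+\mathbf w_2\) of the same plane is an assumption introduced here to show what goes wrong.

```python
from fractions import Fraction

def dot(v, w):
    """Dot product of two equal-length vectors."""
    return sum(a * c for a, c in zip(v, w))

def project(b, basis):
    """Sum of (b . w)/(w . w) * w over the basis.

    This equals the orthogonal projection only if `basis` is orthogonal.
    """
    b_hat = [Fraction(0)] * len(b)
    for w in basis:
        c = Fraction(dot(b, w), dot(w, w))
        b_hat = [h + c * a for h, a in zip(b_hat, w)]
    return b_hat

b = (1, 0, 0)
w1, w2 = (2, 2, -1), (1, 0, 2)      # orthogonal basis of the plane W
b_hat = project(b, [w1, w2])
print([str(x) for x in b_hat])       # ['29/45', '4/9', '8/45']

# A non-orthogonal basis of the same plane (hypothetical): v2 = w1 + w2.
v2 = (3, 2, 1)
wrong = project(b, [w1, v2])
r = [bi - x for bi, x in zip(b, wrong)]
print(dot(r, w1) == 0)               # False: b - wrong is not orthogonal to W
```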
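If we have an orthonormal basis \(\mathbf u_1,\mathbf u_2,\ldots,\mathbf u_n\) for \(W\text{,}\) the projection formula simplifies to
\begin{equation*} \widehat{\mathbf{b}}= (\mathbf b\cdot\mathbf u_1)~\mathbf u_1 + (\mathbf b\cdot\mathbf u_2)~\mathbf u_2 + \ldots + (\mathbf b\cdot\mathbf u_n)~\mathbf u_n\text{.} \end{equation*}
If we then form the matrix
\begin{equation*} Q = \begin{bmatrix} \mathbf u_1 & \mathbf u_2 & \ldots & \mathbf u_n \end{bmatrix}\text{,} \end{equation*}
this expression may be succinctly written
\begin{equation*} \widehat{\mathbf{b}} = QQ^T\mathbf b\text{.} \end{equation*}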
This leads to the following proposition.
If \(\mathbf u_1,\mathbf u_2,\ldots,\mathbf u_n\) is an orthonormal basis for a subspace \(W\) of \(\mathbb R^m\text{,}\) then the matrix transformation that projects vectors in \(\mathbb R^m\) orthogonally onto \(W\) is represented by the matrix \(QQ^T\) where
\begin{equation*} Q = \begin{bmatrix} \mathbf u_1 & \mathbf u_2 & \ldots & \mathbf u_n \end{bmatrix}\text{.} \end{equation*}
In the previous activity, we looked at the plane \(W\) defined by the two orthogonal vectors
\[ \mathbf w_1=\threevec22{-1},\hspace{24pt} \mathbf w_2=\threevec102\text{.} \]
We can form an orthonormal basis by scaling these vectors to have unit length:
\[ \mathbf u_1=\frac13\threevec22{-1} = \threevec{2/3}{2/3}{-1/3},\hspace{24pt} \mathbf u_2=\frac1{\sqrt{5}}\threevec102 = \threevec{1/\sqrt{5}}0{2/\sqrt{5}}\text{.} \]
Using these vectors, we form the matrix
\[Q = \begin{bmatrix} 2/3 & 1/\sqrt{5} \\ 2/3 & 0 \\ -1/3 & 2/\sqrt{5} \\ \end{bmatrix}\text{.} \]
The projection onto the plane \(W\) is then given by the matrix
\[ QQ^T = \begin{bmatrix} 2/3 & 1/\sqrt{5} \\ 2/3 & 0 \\ -1/3 & 2/\sqrt{5} \\ \end{bmatrix} \begin{bmatrix} 2/3 & 2/3 & -1/3 \\ 1/\sqrt{5} & 0 & 2/\sqrt{5} \\ \end{bmatrix} = \begin{bmatrix} {29}/{45} & {4}/{9} & {8}/{45} \\ {4}/{9} & {4}/{9} & -{2}/{9} \\ {8}/{45} & -{2}/{9} & {41}/{45} \end{bmatrix}\text{.}\]
Let's check that this works by considering the vector \(\mathbf b=\threevec100\) and finding \(\widehat{\mathbf{b}}\text{,}\) its orthogonal projection onto the plane \(W\text{.}\) In terms of the original basis \(\mathbf w_1\) and \(\mathbf w_2\text{,}\) the projection formula from Proposition 6.3.15 tells us that
\[
\widehat{\mathbf{b}}=\frac{\mathbf{b} \cdot \mathbf{w}_1}{\mathbf{w}_1 \cdot \mathbf{w}_1} \mathbf{w}_1+\frac{\mathbf{b} \cdot \mathbf{w}_2}{\mathbf{w}_2 \cdot \mathbf{w}_2} \mathbf{w}_2=\left[\begin{array}{c}
29 / 45 \\
4 / 9 \\
8 / 45
\end{array}\right]
\]
Alternatively, we use the matrix \(QQ^T\text{,}\) as in Proposition 6.3.16, to find that
\[
\widehat{\mathbf{b}}=Q Q^T \mathbf{b}=\left[\begin{array}{ccc}
29 / 45 & 4 / 9 & 8 / 45 \\
4 / 9 & 4 / 9 & -2 / 9 \\
8 / 45 & -2 / 9 & 41 / 45
\end{array}\right]\left[\begin{array}{l}
1 \\
0 \\
0
\end{array}\right]=\left[\begin{array}{c}
29 / 45 \\
4 / 9 \\
8 / 45
\end{array}\right]
\]
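The matrix \(QQ^T\) in this example can be reproduced exactly without square roots. Since \(\mathbf u_i = \mathbf w_i/|{\mathbf w_i}|\text{,}\) each product \(\mathbf u_i\mathbf u_i^T\) equals \(\mathbf w_i\mathbf w_i^T/(\mathbf w_i\cdot\mathbf w_i)\text{,}\) and \(QQ^T\) is the sum of these products. The sketch below uses that equivalent form. The helper names are assumptions for illustration.

```python
from fractions import Fraction

def line_projector(w):
    """The matrix w w^T / (w . w), which projects onto the line defined by w."""
    ww = sum(a * a for a in w)
    return [[Fraction(a * c, ww) for c in w] for a in w]

def add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

w1, w2 = (2, 2, -1), (1, 0, 2)       # orthogonal basis of the plane W
P = add(line_projector(w1), line_projector(w2))
for row in P:
    print([str(x) for x in row])
# ['29/45', '4/9', '8/45']
# ['4/9', '4/9', '-2/9']
# ['8/45', '-2/9', '41/45']

print([str(x) for x in apply(P, (1, 0, 0))])  # ['29/45', '4/9', '8/45']
```

Writing the projector as a sum over the basis vectors relies on the basis being orthogonal, just as the projection formula does.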
Activity 6.3.4.
-
Suppose that \(L\) is the line in \(\mathbb R^3\) defined by the vector \(\mathbf w=\threevec{1}{2}{-2}\text{.}\)
- Find an orthonormal basis \(\mathbf u\) for \(L\text{.}\)
- Construct the matrix \(Q = \begin{bmatrix}\mathbf u\end{bmatrix}\) and use it to construct the matrix \(P\) that projects vectors orthogonally onto \(L\text{.}\)
- Use your matrix to find \(\widehat{\mathbf{b}}\text{,}\) the orthogonal projection of \(\mathbf b=\threevec111\) onto \(L\text{.}\)
- Find \(Rank(P)\) and explain its geometric significance.
-
The vectors
\begin{equation*} \mathbf w_1 = \fourvec1111,\hspace{24pt} \mathbf w_2 = \fourvec011{-2} \end{equation*}
form an orthogonal basis of \(W\text{,}\) a two-dimensional subspace of \(\mathbb R^4\text{.}\)
- Use the projection formula from Proposition 6.3.15 to find \(\widehat{\mathbf{b}}\text{,}\) the orthogonal projection of \(\mathbf b=\fourvec92{-2}3\) onto \(W\text{.}\)
- Find an orthonormal basis \(\mathbf u_1\) and \(\mathbf u_2\) for \(W\) and use it to construct the matrix \(P\) that projects vectors orthogonally onto \(W\text{.}\) Check that \(P\mathbf b = \widehat{\mathbf{b}}\text{,}\) the orthogonal projection you found in the previous part of this activity.
- Find \(Rank(P)\) and explain its geometric significance.
- Find a basis for \(W^\perp\text{.}\)
-
Find a vector \(\mathbf b^\perp\) in \(W^\perp\) such that
\begin{equation*} \mathbf b = \widehat{\mathbf{b}} + \mathbf b^\perp. \end{equation*}
- Find the product \(Q^TQ\) and explain your result.
This activity demonstrates one issue of note. We found \(\widehat{\mathbf{b}}\text{,}\) the orthogonal projection of \(\mathbf b\) onto \(W\text{,}\) by requiring that \(\mathbf b-\widehat{\mathbf{b}}\) be orthogonal to \(W\text{.}\) In other words, \(\mathbf b-\widehat{\mathbf{b}}\) is a vector in the orthogonal complement \(W^\perp\text{,}\) which we may denote \(\mathbf b^\perp\text{.}\) This explains the following proposition, which is illustrated in Figure 6.3.19.
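If \(W\) is a subspace of \(\mathbb R^n\) with orthogonal complement \(W^\perp\text{,}\) then any \(n\)-dimensional vector \(\mathbf b\) can be uniquely written as
\begin{equation*} \mathbf b = \widehat{\mathbf{b}} + \mathbf b^\perp \end{equation*}
where \(\widehat{\mathbf{b}}\) is in \(W\) and \(\mathbf b^\perp\) is in \(W^\perp\text{.}\)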
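The decomposition \(\mathbf b = \widehat{\mathbf{b}} + \mathbf b^\perp\) can be computed directly once an orthogonal basis of \(W\) is known. The sketch below uses the orthogonal basis \(\mathbf w_1, \mathbf w_2\) from the activity above. The vector \(\mathbf b\) and the helper names are assumptions chosen for illustration.

```python
from fractions import Fraction

def dot(v, w):
    """Dot product of two equal-length vectors."""
    return sum(a * c for a, c in zip(v, w))

def project(b, basis):
    """Orthogonal projection of b onto the span of an orthogonal basis."""
    b_hat = [Fraction(0)] * len(b)
    for w in basis:
        c = Fraction(dot(b, w), dot(w, w))
        b_hat = [h + c * a for h, a in zip(b_hat, w)]
    return b_hat

w1, w2 = (1, 1, 1, 1), (0, 1, 1, -2)   # orthogonal basis of W in R^4
b = (2, 1, 3, 0)                       # hypothetical vector
b_hat = project(b, [w1, w2])
b_perp = [bi - hi for bi, hi in zip(b, b_hat)]

print([str(x) for x in b_hat])    # ['3/2', '13/6', '13/6', '1/6']
print([str(x) for x in b_perp])   # ['1/2', '-7/6', '5/6', '-1/6']
print(dot(b_perp, w1) == 0 and dot(b_perp, w2) == 0)  # True: b_perp is in W-perp
```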
Let's summarize what we've found. If \(Q\) is a matrix whose columns \(\mathbf u_1, \mathbf u_2,\ldots,\mathbf u_n\) form an orthonormal set in \(\mathbb R^m\text{,}\) then
- \(Q^TQ = I_n\text{,}\) the \(n\times n\) identity matrix, because this product computes the dot products between the columns of \(Q\text{.}\)
- \(QQ^T\) is the matrix that projects vectors orthogonally onto \(W\text{,}\) the subspace of \(\mathbb R^m\) spanned by \(\mathbf u_1,\ldots,\mathbf u_n\text{.}\)
As we've said before, matrix multiplication depends on the order in which we multiply the matrices, and we see this clearly here.
Because \(Q^TQ=I\text{,}\) there is a temptation to say that \(Q\) is invertible. This is usually not the case, however. Remember that an invertible matrix must be a square matrix, and the matrix \(Q\) will only be square if \(n=m\text{.}\) In this case, there are \(m\) vectors in the orthonormal set so the subspace \(W\) spanned by the vectors \(\mathbf u_1,\mathbf u_2,\ldots,\mathbf u_m\) is \(\mathbb R^m\text{.}\) If \(\mathbf b\) is a vector in \(\mathbb R^m\text{,}\) then \(\widehat{\mathbf{b}}=QQ^T\mathbf b\) is the orthogonal projection of \(\mathbf b\) onto \(\mathbb R^m\text{.}\) In other words, \(QQ^T\mathbf b\) is the closest vector in \(\mathbb R^m\) to \(\mathbf b\text{,}\) and this closest vector must be \(\mathbf b\) itself. Therefore, \(QQ^T\mathbf b = \mathbf b\text{,}\) which means that \(QQ^T=I\text{.}\) In this case, \(Q\) is an invertible matrix.
Consider the orthonormal set of vectors
and the matrix they define
In this case, \(\mathbf u_1\) and \(\mathbf u_2\) span a plane, a 2-dimensional subspace of \(\mathbb R^3\text{.}\) We know that \(Q^TQ = I_2\) and \(QQ^T\) projects vectors orthogonally onto the plane. However, \(Q\) is not a square matrix so it cannot be invertible.
Now consider the orthonormal set of vectors
and the matrix they define
Here, \(\mathbf u_1\text{,}\) \(\mathbf u_2\text{,}\) and \(\mathbf u_3\) form a basis for \(\mathbb R^3\) so that both \(Q^TQ=I_3\) and \(QQ^T=I_3\text{.}\) Therefore, \(Q\) is a square matrix and is invertible.
Moreover, since \(Q^TQ = I\text{,}\) we see that \(Q^{-1} = Q^T\) so finding the inverse of \(Q\) is as simple as writing its transpose. Matrices with this property are very special and will play an important role in our upcoming work. We will therefore give them a special name.
A square \(m\times m\) matrix \(Q\) whose columns form an orthonormal basis for \(\mathbb R^m\) is called orthogonal.
This terminology can be a little confusing. We call a basis orthogonal if the basis vectors are orthogonal to one another. However, a matrix is orthogonal if the columns are orthogonal to one another and have unit length. It pays to keep this in mind when reading statements about orthogonal bases and orthogonal matrices. In the meantime, we record the following proposition.
An orthogonal matrix \(Q\) is invertible and its inverse \(Q^{-1} = Q^T\text{.}\)
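To illustrate this proposition, the sketch below checks that both products \(Q^TQ\) and \(QQ^T\) are the identity for a \(3\times3\) matrix with orthonormal columns, and then solves \(Q\mathbf x=\mathbf b\) using only a transpose. The matrix, the right-hand side, and the helper names are assumptions chosen for illustration.

```python
from fractions import Fraction as F

def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Matrix product of A and B, both stored as lists of rows."""
    cols = transpose(B)
    return [[sum(a * c for a, c in zip(row, col)) for col in cols] for row in A]

def identity(n):
    """The n x n identity matrix."""
    return [[int(i == j) for j in range(n)] for i in range(n)]

# Hypothetical 3x3 matrix whose columns form an orthonormal basis of R^3.
Q = [[F(2, 3), F(1, 3), F(2, 3)],
     [F(2, 3), F(-2, 3), F(-1, 3)],
     [F(-1, 3), F(-2, 3), F(2, 3)]]
Qt = transpose(Q)

print(matmul(Qt, Q) == identity(3))  # True
print(matmul(Q, Qt) == identity(3))  # True

# Solve Q x = b using x = Q^T b, with b stored as a column.
b = [[3], [0], [0]]
x = matmul(Qt, b)
print(matmul(Q, x) == b)             # True
```

No row reduction is needed here because the transpose already serves as the inverse.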
Summary
This section introduced orthogonal sets and the projection formula that allows us to project vectors orthogonally onto a subspace.
-
Given an orthogonal set \(\mathbf w_1,\mathbf w_2,\ldots,\mathbf w_n\) that spans an \(n\)-dimensional subspace \(W\) of \(\mathbb R^m\text{,}\) the orthogonal projection of \(\mathbf b\) onto \(W\) is the vector in \(W\) closest to \(\mathbf b\) and may be written as
\begin{equation*} \widehat{\mathbf{b}} = \frac{\mathbf b\cdot\mathbf w_1}{\mathbf w_1\cdot\mathbf w_1}~\mathbf w_1 + \frac{\mathbf b\cdot\mathbf w_2}{\mathbf w_2\cdot\mathbf w_2}~\mathbf w_2 + \ldots + \frac{\mathbf b\cdot\mathbf w_n}{\mathbf w_n\cdot\mathbf w_n}~\mathbf w_n\text{.} \end{equation*}
- If \(\mathbf u_1,\mathbf u_2,\ldots,\mathbf u_n\) is an orthonormal basis of \(W\) and \(Q\) is the matrix whose columns are \(\mathbf u_i\text{,}\) then the matrix \(P=QQ^T\) projects vectors orthogonally onto \(W\text{.}\)
- If the columns of \(Q\) form an orthonormal basis for an \(n\)-dimensional subspace of \(\mathbb R^m\text{,}\) then \(Q^TQ=I_n\text{.}\)
- An orthogonal matrix \(Q\) is a square matrix whose columns form an orthonormal basis. In this case, \(QQ^T=Q^TQ = I\) so that \(Q^{-1} = Q^T\text{.}\)
Exercises
Suppose that
- Verify that \(\mathbf w_1\) and \(\mathbf w_2\) form an orthogonal basis for a plane \(W\) in \(\mathbb R^3\text{.}\)
- Use Proposition 6.3.15 to find \(\widehat{\mathbf{b}}\text{,}\) the orthogonal projection of \(\mathbf b=\threevec21{-1}\) onto \(W\text{.}\)
- Find an orthonormal basis \(\mathbf u_1\text{,}\) \(\mathbf u_2\) for \(W\text{.}\)
- Find the matrix \(P\) representing the matrix transformation that projects vectors in \(\mathbb R^3\) orthogonally onto \(W\text{.}\) Verify that \(\widehat{\mathbf{b}} = P\mathbf b\text{.}\)
- Determine \(Rank(P)\) and explain its geometric significance.
Consider the vectors
- Explain why these vectors form an orthogonal basis for \(\mathbb R^3\text{.}\)
- Suppose that \(A=\begin{bmatrix} \mathbf w_1 & \mathbf w_2 & \mathbf w_3 \end{bmatrix}\) and evaluate the product \(A^TA\text{.}\) Why is this product a diagonal matrix and what is the significance of the diagonal entries?
- Express the vector \(\mathbf b=\threevec{-3}{-6}3\) as a linear combination of \(\mathbf w_1\text{,}\) \(\mathbf w_2\text{,}\) and \(\mathbf w_3\text{.}\)
- Multiply the vectors \(\mathbf w_1\text{,}\) \(\mathbf w_2\text{,}\) \(\mathbf w_3\) by appropriate scalars to find an orthonormal basis \(\mathbf u_1\text{,}\) \(\mathbf u_2\text{,}\) \(\mathbf u_3\) of \(\mathbb R^3\text{.}\)
- If \(Q=\begin{bmatrix} \mathbf u_1 & \mathbf u_2 & \mathbf u_3 \end{bmatrix}\text{,}\) find the matrix product \(QQ^T\) and explain the result.
Suppose that
form an orthogonal basis for a subspace \(W\) of \(\mathbb R^4\text{.}\)
- Find \(\widehat{\mathbf{b}}\text{,}\) the orthogonal projection of \(\mathbf b=\fourvec{2}{-1}{-6}{7}\) onto \(W\text{.}\)
- Find the vector \(\mathbf b^\perp\) in \(W^\perp\) such that \(\mathbf b = \widehat{\mathbf{b}} + \mathbf b^\perp\text{.}\)
- Find a basis for \(W^\perp\) and express \(\mathbf b^\perp\) as a linear combination of the basis vectors.
Consider the vectors
- If \(L\) is the line defined by the vector \(\mathbf w_1\text{,}\) find the vector in \(L\) closest to \(\mathbf b\text{.}\) Call this vector \(\widehat{\mathbf{b}}_1\text{.}\)
- If \(W\) is the subspace spanned by \(\mathbf w_1\) and \(\mathbf w_2\text{,}\) find the vector in \(W\) closest to \(\mathbf b\text{.}\) Call this vector \(\widehat{\mathbf{b}}_2\text{.}\)
- Determine whether \(\widehat{\mathbf{b}}_1\) or \(\widehat{\mathbf{b}}_2\) is closer to \(\mathbf b\) and explain why.
Suppose that \(\mathbf w=\threevec2{-1}2\) defines a line \(L\) in \(\mathbb R^3\text{.}\)
- Find the orthogonal projections of the vectors \(\threevec100\text{,}\) \(\threevec010\text{,}\) \(\threevec001\) onto \(L\text{.}\)
- Find the matrix \(P = \frac{1}{|{\mathbf w}|^2} \mathbf w \mathbf w^T\text{.}\)
- Use Proposition 2.5.4 to explain why the columns of \(P\) are related to the orthogonal projections you found in the first part of this exercise.
Suppose that
form a basis for a plane \(W\) in \(\mathbb R^3\text{.}\)
- Find a basis for the line that is the orthogonal complement \(W^\perp\text{.}\)
- Given the vector \(\mathbf b=\threevec6{-6}2\text{,}\) find \(\mathbf y\text{,}\) the orthogonal projection of \(\mathbf b\) onto the line \(W^\perp\text{.}\)
- Explain why the vector \(\mathbf z = \mathbf b-\mathbf y\) must be in \(W\) and write \(\mathbf z\) as a linear combination of \(\mathbf v_1\) and \(\mathbf v_2\text{.}\)
Determine whether the following statements are true or false and explain your thinking.
- If the columns of \(Q\) form an orthonormal basis for a subspace \(W\) and \(\mathbf w\) is a vector in \(W\text{,}\) then \(QQ^T\mathbf w = \mathbf w\text{.}\)
- An orthogonal set of vectors in \(\mathbb R^8\) can have no more than 8 vectors.
- If \(Q\) is a \(7\times5\) matrix whose columns are orthonormal, then \(QQ^T = I_7\text{.}\)
- If \(Q\) is a \(7\times5\) matrix whose columns are orthonormal, then \(Q^TQ = I_5\text{.}\)
- Suppose that the orthogonal projection of \(\mathbf b\) onto a subspace \(W\) satisfies \(\widehat{\mathbf{b}} = \mathbf 0\text{.}\) Then \(\mathbf b\) is in \(W^\perp\text{.}\)
Suppose that \(Q\) is an orthogonal matrix.
-
Remembering that \(\mathbf v\cdot\mathbf w=\mathbf v^T\mathbf w\text{,}\) explain why
\begin{equation*} (Q\mathbf x)\cdot(Q\mathbf y) = \mathbf x\cdot\mathbf y\text{.} \end{equation*}
-
Explain why \(|{Q\mathbf x}| =|{\mathbf x}|\text{.}\)
This means that the length of a vector is unchanged after multiplying by an orthogonal matrix.
- If \(\lambda\) is a real eigenvalue of \(Q\text{,}\) explain why \(\lambda=\pm1\text{.}\)
Explain why the following statements are true.
- If \(Q\) is an orthogonal matrix, then \(\det Q = \pm 1\text{.}\)
- If \(Q\) is an \(8\times 4\) matrix whose columns are orthonormal, then \(QQ^T\) is an \(8\times8\) matrix whose rank is 4.
- If \(\widehat{\mathbf{b}}\) is the orthogonal projection of \(\mathbf b\) onto a subspace \(W\text{,}\) then \(\mathbf b-\widehat{\mathbf{b}}\) is the orthogonal projection of \(\mathbf b\) onto \(W^\perp\text{.}\)
This exercise is about \(2\times2\) orthogonal matrices.
- In Section 2.6, we saw that the matrix \(\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\) represents a rotation by an angle \(\theta\text{.}\) Explain why this matrix is an orthogonal matrix.
- We also saw that the matrix \(\begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}\) represents a reflection in a line. Explain why this matrix is an orthogonal matrix.
- Suppose that \(\mathbf u_1=\twovec{\cos\theta}{\sin\theta}\) is a 2-dimensional unit vector. Use a sketch to indicate all the possible vectors \(\mathbf u_2\) such that \(\mathbf u_1\) and \(\mathbf u_2\) form an orthonormal basis of \(\mathbb R^2\text{.}\)
- Explain why every \(2\times2\) orthogonal matrix is either a rotation or a reflection.