9.6: Orthogonal projections and minimization problems
\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)
\( \newcommand{\norm}[1]{\| #1 \|}\)
\( \newcommand{\kernel}{\mathrm{null}\,}\)
\( \newcommand{\range}{\mathrm{range}\,}\)
\( \newcommand{\Span}{\mathrm{span}}\)
Definition 9.6.1. Let \(V \) be a finite-dimensional inner product space, and let \(U\subset V \) be a subset (but not necessarily a subspace) of \(V\). Then the orthogonal complement of \(U \) is defined to be the set
\[ U^\bot = \{ v \in V \mid \inner{u}{v}=0 \text{ for all } u\in U \}. \]
Note that, in fact, \(U^\bot \) is always a subspace of \(V \) (as you should check!) and that
\[ \{0\}^\bot = V \quad \text{and} \quad V^\bot = \{0\}. \]
In addition, if \(U_{1} \) and \(U_{2} \) are subsets of \(V \) satisfying \(U_1\subset U_2\), then \(U_2^\bot \subset U_1^\bot\). Remarkably, if \(U\subset V \) is a subspace of \(V\), then we can say quite a bit more about \(U^{\bot}\).
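For readers who want to experiment numerically, here is a minimal sketch (using NumPy, with a hypothetical one-dimensional subspace \(U\subset\mathbb{R}^3\) chosen purely for illustration) of how an orthonormal basis of \(U^\bot \) can be computed: the orthogonal complement of the row space of a matrix is its null space, and the trailing right-singular vectors of the SVD span it.

```python
import numpy as np

# Hypothetical example: U = span{(1, 1, 1)} inside R^3, so U^perp consists
# of the vectors whose inner product with (1, 1, 1) vanishes.
A = np.array([[1.0, 1.0, 1.0]])  # rows span U

# The trailing right-singular vectors (those past the rank) form an
# orthonormal basis of the null space of A, i.e., of U^perp.
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-12))
U_perp = Vt[rank:]  # rows form an orthonormal basis of U^perp

print(np.allclose(A @ U_perp.T, 0.0))  # every basis vector is orthogonal to U
print(U_perp.shape[0])                 # dim U^perp = 3 - 1 = 2
```

Note that the computed dimension, \(\dim U^\bot = 2\), is exactly what the direct-sum decomposition of Theorem 9.6.2 below predicts.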
Theorem 9.6.2. If \(U\subset V \) is a subspace of \(V\), then \(V=U\oplus U^\bot\).
Proof. We need to show two things:
- \(V=U+U^\bot\).
- \(U\cap U^\bot = \{0\}\).
To show Condition~1 holds, let \((e_1,\ldots,e_m) \) be an orthonormal basis of \(U\). Then, for all \(v\in V\), we can write
\begin{equation} \label{eq:U Ubot}
v=\underbrace{\inner{v}{e_1} e_1 + \cdots + \inner{v}{e_m} e_m}_u +
\underbrace{v-\inner{v}{e_1} e_1 - \cdots - \inner{v}{e_m} e_m}_w. \tag{9.6.1}
\end{equation}
The vector \(u\in U\), and
\begin{equation*}
\inner{w}{e_j} = \inner{v}{e_j} - \inner{v}{e_j} = 0 \quad \text{for all } j=1,2,\ldots,m,
\end{equation*}
since \((e_1,\ldots,e_m) \) is an orthonormal list of vectors. Hence, \(w\in U^\bot\). This implies that \(V=U+U^\bot\).
To prove that Condition~2 also holds, let \(v\in U\cap U^\bot\). Then \(v \) has to be orthogonal to every vector in \(U\), including to itself, and so \(\inner{v}{v}=0\). However, this implies \(v=0 \) so that \(U\cap U^\bot=\{0\}\).
Example 9.6.3. \(\mathbb{R}^2 \) is the direct sum of any two orthogonal lines, and \(\mathbb{R}^3 \) is the direct sum of any plane and any line orthogonal to the plane as illustrated in Figure 9.6.1. For example,
\begin{equation*}
\begin{split}
\mathbb{R}^2 &= \{(x,0) \mid x\in \mathbb{R}\} \oplus \{ (0,y) \mid y\in\mathbb{R}\},\\
\mathbb{R}^3 &= \{(x,y,0) \mid x,y\in \mathbb{R}\} \oplus \{(0,0,z) \mid z\in \mathbb{R}\}.
\end{split}
\end{equation*}
Figure 9.6.1: \( \mathbb{R}^3 \) as a direct sum of a plane and a line.
Another fundamental fact about the orthogonal complement of a subspace is as follows.
Theorem 9.6.4. If \(U\subset V \) is a subspace of \(V\), then \(U=(U^\bot)^\bot\).
Proof. First we show that \(U\subset (U^\bot)^\bot\). Let \(u\in U\). Then, for all \(v\in U^\bot\), we have \(\inner{u}{v}=0\). Hence, \(u\in (U^\bot)^\bot \) by the definition of \((U^\bot)^\bot\).
Next we show that \((U^\bot)^\bot\subset U\). Suppose, for the sake of contradiction, that there exists some \(v\in (U^\bot)^\bot \) with \(v\not\in U\), and decompose \(v \) according to Theorem 9.6.2, i.e., as
\begin{equation*}
v = u_1 + u_2 \in U \oplus U^\bot
\end{equation*}
with \(u_1\in U \) and \(u_2\in U^\bot\). Then \(u_2\neq 0 \) since \(v\not\in U\).
Furthermore, \(\inner{u_2}{v} = \inner{u_2}{u_1} + \inner{u_2}{u_2} = \inner{u_2}{u_2} \neq 0 \) since \(u_2\in U^\bot \) is orthogonal to \(u_1\in U\). But then \(v \) is not orthogonal to \(u_2\in U^\bot\), so \(v\not\in (U^\bot)^\bot\), which contradicts our initial assumption. Hence, we must have that \((U^\bot)^\bot\subset U\).
By Theorem 9.6.2, we have the decomposition \(V=U\oplus U^\bot \) for every subspace \(U\subset V\). This allows us to define the orthogonal projection \(P_U \) of \(V \) onto \(U\).
Definition 9.6.5. Let \(U\subset V \) be a subspace of a finite-dimensional inner product space. Every \(v\in V \) can be uniquely written as \(v=u+w \) where \(u\in U \) and \(w\in U^\bot\). Define
\begin{equation*}
\begin{split}
P_U:V &\to V,\\
v &\mapsto u.
\end{split}
\end{equation*}
Note that \(P_U \) is called a projection operator since it satisfies \(P_U^2=P_U\). Further, since we also have
\begin{equation*}
\begin{split}
&\range(P_U)= U,\\
&\kernel(P_U) = U^\bot,
\end{split}
\end{equation*}
it follows that \(\range(P_U) \bot \kernel(P_U)\). Therefore, \(P_U \) is called an orthogonal projection.
The decomposition of a vector \(v\in V\) as given in Equation (9.6.1) yields the formula
\begin{equation} \label{eq:ortho decomp}
P_U v = \inner{v}{e_1} e_1 + \cdots + \inner{v}{e_m} e_m, \tag{9.6.2}
\end{equation}
where \((e_1,\ldots,e_m) \) is any orthonormal basis of \(U\). Equation (9.6.2) is a particularly useful tool for computing such things as the matrix of \(P_{U} \) with respect to the basis \((e_1,\ldots,e_m)\).
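Equation (9.6.2) translates directly into a short computation. The following sketch (again in NumPy, with the hypothetical choice of \(U \) as the \(xy\)-plane in \(\mathbb{R}^3\)) computes \(P_U v \) from an orthonormal basis of \(U \) and confirms that the leftover component lies in \(U^\bot\):

```python
import numpy as np

# Orthonormal basis (e1, e2) of U, here the xy-plane in R^3
# (a hypothetical choice for illustration), and an arbitrary v.
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])
v = np.array([1.0, 2.0, 3.0])

# Equation (9.6.2): P_U v = <v, e1> e1 + <v, e2> e2
Pv = np.dot(v, e1) * e1 + np.dot(v, e2) * e2
w = v - Pv  # the component of v in U^perp

print(Pv)             # [1. 2. 0.]
print(np.dot(w, e1))  # 0.0: w is orthogonal to U
```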
Let us now apply the inner product to the following minimization problem: Given a subspace \(U\subset V \) and a vector \(v\in V\), find the vector \(u\in U \) that is closest to the vector \(v\). In other words, we want to make \(\norm{v-u} \) as small as possible. The next proposition shows that \(P_Uv \) is the closest point in \(U \) to the vector \(v \) and that this minimum is, in fact, unique.
Proposition 9.6.6. Let \(U\subset V \) be a subspace of \(V \) and \(v\in V\). Then
\[ \norm{v-P_Uv} \le \norm{v-u} \qquad \text{for every \(u\in U\).}\]
Furthermore, equality holds if and only if \(u=P_Uv\).
Proof. Let \(u\in U \) and set \(P:=P_U \) for short. Then
\begin{equation*}
\begin{split}
\norm{v-P v}^2 &\le \norm{v-P v}^2 + \norm{Pv-u}^2\\
&= \norm{(v-P v)+(P v-u)}^2 = \norm{v-u}^2,
\end{split}
\end{equation*}
where the second line follows from the Pythagorean Theorem (Theorem 9.3.2) since \(v-Pv\in U^\bot \) and \(Pv-u\in U\). Furthermore, equality holds only if \(\norm{Pv-u}^2=0\), which is equivalent to \(Pv=u\).
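Proposition 9.6.6 can also be checked numerically. The sketch below (assuming, for illustration, that \(U \) is the \(xy\)-plane in \(\mathbb{R}^3 \) and \(v=(1,2,3)\)) compares \(\norm{v-P_Uv} \) against the distance from \(v \) to many randomly chosen vectors \(u\in U\); none should come closer than the projection:

```python
import numpy as np

# Assumed data for illustration: U is the xy-plane in R^3 and v = (1, 2, 3).
rng = np.random.default_rng(0)
E = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])  # rows: orthonormal basis of U
v = np.array([1.0, 2.0, 3.0])
Pv = E.T @ (E @ v)               # P_U v via Equation (9.6.2)

# Distances from v to 1000 random points u = c1*e1 + c2*e2 in U.
dists = [np.linalg.norm(v - E.T @ c) for c in rng.normal(size=(1000, 2))]
print(np.linalg.norm(v - Pv) <= min(dists))  # True: no u in U beats P_U v
```

Here \(\norm{v-P_Uv}=3\), the absolute value of the \(z\)-coordinate of \(v\), as expected for the distance to the \(xy\)-plane.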
Example 9.6.7. Consider the plane \(U\subset \mathbb{R}^3 \) through \(0 \) and perpendicular to the vector \(u=(1,1,1)\). Using the standard norm on \(\mathbb{R}^3\), we can calculate the distance from the point \(v=(1,2,3) \) to \(U \) using Proposition 9.6.6. In particular, the distance \(d \) between \(v \) and \(U \) is given by \(d=\norm{v-P_Uv}\). Let \((\frac{1}{\sqrt{3}}u,u_1,u_2) \) be an orthonormal basis of \(\mathbb{R}^3 \) such that \((u_1,u_2) \) is an orthonormal basis of \(U\). Then, by Equation (9.6.2), we have
\begin{align*}
v-P_Uv & = \left(\frac{1}{3}\inner{v}{u}u+\inner{v}{u_1}u_1+\inner{v}{u_2}u_2\right) - \left(\inner{v}{u_1}u_1 +\inner{v}{u_2}u_2\right)\\
& = \frac{1}{3}\inner{v}{u}u\\
& = \frac{1}{3}\inner{(1,2,3)}{(1,1,1)}(1,1,1)\\
& = (2,2,2).
\end{align*}
Hence, \(d=\norm{(2,2,2)}=2\sqrt{3}\).
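The arithmetic of Example 9.6.7 is easy to verify numerically. Since \(U \) is the orthogonal complement of the line spanned by \(u\), the component \(v-P_Uv \) is simply the projection of \(v \) onto the unit normal \(u/\norm{u}\):

```python
import numpy as np

# Data from Example 9.6.7: plane U through 0 perpendicular to u = (1, 1, 1).
u = np.array([1.0, 1.0, 1.0])
v = np.array([1.0, 2.0, 3.0])

n = u / np.linalg.norm(u)    # unit normal to the plane U
residual = np.dot(v, n) * n  # v - P_U v = (1/3) <v, u> u

print(residual)                  # [2. 2. 2.]
print(np.linalg.norm(residual))  # 2*sqrt(3) = 3.4641...
```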
Contributors
- Isaiah Lankham, Mathematics Department at UC Davis
- Bruno Nachtergaele, Mathematics Department at UC Davis
- Anne Schilling, Mathematics Department at UC Davis
Both hardbound and softbound versions of this textbook are available online at WorldScientific.com.