6.3: Orthogonal Projection


Learning Objectives
  1. Understand the orthogonal decomposition of a vector with respect to a subspace.
  2. Understand the relationship between orthogonal decomposition and orthogonal projection.
  3. Understand the relationship between orthogonal decomposition and the closest vector on / distance to a subspace.
  4. Learn the basic properties of orthogonal projections as linear transformations and as matrix transformations.
  5. Recipes: orthogonal projection onto a line, orthogonal decomposition by solving a system of equations, orthogonal projection via a complicated matrix product.
  6. Pictures: orthogonal decomposition, orthogonal projection.
  7. Vocabulary words: orthogonal decomposition, orthogonal projection.

Let \(W\) be a subspace of \(\mathbb{R}^n\) and let \(x\) be a vector in \(\mathbb{R}^n\). In this section, we will learn to compute the closest vector \(x_W\) to \(x\) in \(W\). The vector \(x_W\) is called the orthogonal projection of \(x\) onto \(W\). This is exactly what we will use to almost solve matrix equations, as discussed in the introduction to Chapter 6.

Orthogonal Decomposition

We begin by fixing some notation.

Definition 6.3.1: Notation

Let \(W\) be a subspace of \(\mathbb{R}^n\) and let \(x\) be a vector in \(\mathbb{R}^n\). We denote the closest vector to \(x\) on \(W\) by \(x_W\).

To say that \(x_W\) is the closest vector to \(x\) on \(W\) means that the difference \(x - x_W\) is orthogonal to the vectors in \(W\):

Figure 6.3.1: The vector \(x\), its closest vector \(x_W\) on the plane \(W\), and the vector from \(x_W\) to \(x\), which meets \(W\) perpendicularly.

In other words, if \(x_{W^\perp} = x - x_W\), then we have \(x = x_W + x_{W^\perp}\), where \(x_W\) is in \(W\) and \(x_{W^\perp}\) is in \(W^\perp\). The first order of business is to prove that the closest vector always exists.

Theorem 6.3.1: Orthogonal Decomposition

Let \(W\) be a subspace of \(\mathbb{R}^n\) and let \(x\) be a vector in \(\mathbb{R}^n\). Then we can write \(x\) uniquely as

\[ x = x_W + x_{W^\perp} \]

where \(x_W\) is the closest vector to \(x\) on \(W\) and \(x_{W^\perp}\) is in \(W^\perp\).

Proof

Let \(m = \dim(W)\), so \(n - m = \dim(W^\perp)\) by Fact 6.2.1 in Section 6.2. Let \(v_1, v_2, \ldots, v_m\) be a basis for \(W\) and let \(v_{m+1}, v_{m+2}, \ldots, v_n\) be a basis for \(W^\perp\). We showed in the proof of Fact 6.2.1 in Section 6.2 that \(\{v_1, v_2, \ldots, v_m, v_{m+1}, v_{m+2}, \ldots, v_n\}\) is linearly independent, so it forms a basis for \(\mathbb{R}^n\). Therefore, we can write

\[ x = (c_1v_1 + \cdots + c_mv_m) + (c_{m+1}v_{m+1} + \cdots + c_nv_n) = x_W + x_{W^\perp}, \]

where \(x_W = c_1v_1 + \cdots + c_mv_m\) and \(x_{W^\perp} = c_{m+1}v_{m+1} + \cdots + c_nv_n\). Since \(x_{W^\perp}\) is orthogonal to \(W\), the vector \(x_W\) is the closest vector to \(x\) on \(W\), so this proves that such a decomposition exists.

As for uniqueness, suppose that

\[ x = x_W + x_{W^\perp} = y_W + y_{W^\perp} \]

for \(x_W, y_W\) in \(W\) and \(x_{W^\perp}, y_{W^\perp}\) in \(W^\perp\). Rearranging gives

\[ x_W - y_W = y_{W^\perp} - x_{W^\perp}. \]

Since \(W\) and \(W^\perp\) are subspaces, the left side of the equation is in \(W\) and the right side is in \(W^\perp\). Therefore, \(x_W - y_W\) is in \(W\) and in \(W^\perp\), so it is orthogonal to itself, which implies \(x_W - y_W = 0\). Hence \(x_W = y_W\) and \(x_{W^\perp} = y_{W^\perp}\), which proves uniqueness.

Definition 6.3.2: Orthogonal Decomposition and Orthogonal Projection

Let \(W\) be a subspace of \(\mathbb{R}^n\) and let \(x\) be a vector in \(\mathbb{R}^n\). The expression

\[ x = x_W + x_{W^\perp} \]

for \(x_W\) in \(W\) and \(x_{W^\perp}\) in \(W^\perp\), is called the orthogonal decomposition of \(x\) with respect to \(W\), and the closest vector \(x_W\) is the orthogonal projection of \(x\) onto \(W\).

Since \(x_W\) is the closest vector on \(W\) to \(x\), the distance from \(x\) to the subspace \(W\) is the length of the vector from \(x_W\) to \(x\), i.e., the length of \(x_{W^\perp}\). To restate:

Note 6.3.1: Closest Vector and Distance

Let \(W\) be a subspace of \(\mathbb{R}^n\) and let \(x\) be a vector in \(\mathbb{R}^n\).

  • The orthogonal projection \(x_W\) is the closest vector to \(x\) in \(W\).
  • The distance from \(x\) to \(W\) is \(\|x_{W^\perp}\|\).
Example 6.3.1: Orthogonal decomposition with respect to the xy-plane

Let \(W\) be the \(xy\)-plane in \(\mathbb{R}^3\), so \(W^\perp\) is the \(z\)-axis. It is easy to compute the orthogonal decomposition of a vector with respect to this \(W\):

\[ x = \begin{pmatrix}1\\2\\3\end{pmatrix} \implies x_W = \begin{pmatrix}1\\2\\0\end{pmatrix},\quad x_{W^\perp} = \begin{pmatrix}0\\0\\3\end{pmatrix} \qquad\qquad x = \begin{pmatrix}a\\b\\c\end{pmatrix} \implies x_W = \begin{pmatrix}a\\b\\0\end{pmatrix},\quad x_{W^\perp} = \begin{pmatrix}0\\0\\c\end{pmatrix}. \tag{6.3.1} \]

We see that the orthogonal decomposition in this case expresses a vector in terms of a “horizontal” component (in the \(xy\)-plane) and a “vertical” component (on the \(z\)-axis).

Figure 6.3.2: A vector decomposed into a component in the plane \(W\) and a component perpendicular to \(W\).

Figure 6.3.3: Orthogonal decomposition of a vector with respect to the \(xy\)-plane in \(\mathbb{R}^3\). Note that \(x_W\) is in the \(xy\)-plane and \(x_{W^\perp}\) is in the \(z\)-axis.
Example 6.3.2: Orthogonal decomposition of a vector in W

If \(x\) is in a subspace \(W\), then the closest vector to \(x\) in \(W\) is \(x\) itself, so \(x = x_W\) and \(x_{W^\perp} = 0\). Conversely, if \(x = x_W\) then \(x\) is contained in \(W\) because \(x_W\) is contained in \(W\).

Example 6.3.3: Orthogonal decomposition of a vector in \(W^\perp\)

If \(W\) is a subspace and \(x\) is in \(W^\perp\), then the orthogonal decomposition of \(x\) is \(x = 0 + x\), where \(0\) is in \(W\) and \(x\) is in \(W^\perp\). It follows that \(x_W = 0\). Conversely, if \(x_W = 0\) then the orthogonal decomposition of \(x\) is \(x = x_W + x_{W^\perp} = 0 + x_{W^\perp}\), so \(x = x_{W^\perp}\) is in \(W^\perp\).

Example 6.3.4: Interactive: Orthogonal decomposition in \(\mathbb{R}^2\)

Figure 6.3.4: Orthogonal decomposition of a vector with respect to a line \(W\) in \(\mathbb{R}^2\). Note that \(x_W\) is in \(W\) and \(x_{W^\perp}\) is in the line perpendicular to \(W\).
Example 6.3.5: Interactive: Orthogonal decomposition in \(\mathbb{R}^3\)

Figure 6.3.5: Orthogonal decomposition of a vector with respect to a plane \(W\) in \(\mathbb{R}^3\). Note that \(x_W\) is in \(W\) and \(x_{W^\perp}\) is in the line perpendicular to \(W\).
Example 6.3.6: Interactive: Orthogonal decomposition in \(\mathbb{R}^3\)

Figure 6.3.6: Orthogonal decomposition of a vector with respect to a line \(W\) in \(\mathbb{R}^3\). Note that \(x_W\) is in \(W\) and \(x_{W^\perp}\) is in the plane perpendicular to \(W\).

Now we turn to the problem of computing \(x_W\) and \(x_{W^\perp}\). Of course, since \(x_{W^\perp} = x - x_W\), really all we need is to compute \(x_W\). The following theorem gives a method for computing the orthogonal projection onto a column space. To compute the orthogonal projection onto a general subspace, usually it is best to rewrite the subspace as the column space of a matrix, as in Note 2.6.3 in Section 2.6.

Theorem 6.3.2

Let \(A\) be an \(m \times n\) matrix, let \(W = \operatorname{Col}(A)\), and let \(x\) be a vector in \(\mathbb{R}^m\). Then the matrix equation

\[ A^TAc = A^Tx \]

in the unknown vector \(c\) is consistent, and \(x_W\) is equal to \(Ac\) for any solution \(c\).

Proof

Let \(x = x_W + x_{W^\perp}\) be the orthogonal decomposition with respect to \(W\). By definition \(x_W\) lies in \(W = \operatorname{Col}(A)\) and so there is a vector \(c\) in \(\mathbb{R}^n\) with \(Ac = x_W\). Choose any such vector \(c\). We know that \(x - x_W = x - Ac\) lies in \(W^\perp\), which is equal to \(\operatorname{Nul}(A^T)\) by the Recipe: Shortcuts for Computing Orthogonal Complements in Section 6.2. We thus have

\[ 0 = A^T(x - Ac) = A^Tx - A^TAc \]

and so

\[ A^TAc = A^Tx. \]

This exactly means that \(A^TAc = A^Tx\) is consistent. If \(c\) is any solution to \(A^TAc = A^Tx\) then by reversing the above logic, we conclude that \(x_W = Ac\).
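The theorem is easy to experiment with numerically. Here is a minimal NumPy sketch (ours, not part of the text) in which the columns of \(A\) are deliberately dependent, so \(A^TA\) is not invertible; the normal equations \(A^TAc = A^Tx\) are nevertheless consistent, and any solution \(c\) gives the same projection \(Ac = x_W\).

```python
import numpy as np

# Columns of A are linearly dependent (second = 2 * first), so A^T A is singular,
# but the system A^T A c = A^T x is still consistent, as Theorem 6.3.2 promises.
A = np.array([[1.0, 2.0],
              [0.0, 0.0],
              [-1.0, -2.0]])
x = np.array([1.0, 2.0, 3.0])

# lstsq is used here only as a convenient way to pick one solution c.
c, *_ = np.linalg.lstsq(A.T @ A, A.T @ x, rcond=None)
print(A @ c)   # x_W = (-1, 0, 1): the projection of x onto the line spanned by (1, 0, -1)
```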

Example 6.3.7: Orthogonal Projection onto a Line

Let \(L = \operatorname{Span}\{u\}\) be a line in \(\mathbb{R}^n\) and let \(x\) be a vector in \(\mathbb{R}^n\). By Theorem 6.3.2, to find \(x_L\) we must solve the matrix equation \(u^Tuc = u^Tx\), where we regard \(u\) as an \(n \times 1\) matrix (the column space of this matrix is exactly \(L\)!). But \(u^Tu = u \cdot u\) and \(u^Tx = u \cdot x\), so \(c = (u \cdot x)/(u \cdot u)\) is a solution of \(u^Tuc = u^Tx\), and hence \(x_L = uc = \dfrac{u \cdot x}{u \cdot u}\,u\).

Figure 6.3.7: The vector \(x\), its projection \(x_L\) onto the line \(L\), and the orthogonal component \(x_{L^\perp}\), with \(x_L = \dfrac{u \cdot x}{u \cdot u}\,u\).

To reiterate:

Recipe: Orthogonal Projection onto a Line

If \(L = \operatorname{Span}\{u\}\) is a line, then

\[ x_L = \frac{u \cdot x}{u \cdot u}\,u \qquad\text{and}\qquad x_{L^\perp} = x - x_L \]

for any vector \(x\).
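In code, this recipe is a one-liner. The following NumPy sketch (the helper name project_onto_line is ours) reproduces the numbers of Example 6.3.8 below.

```python
import numpy as np

def project_onto_line(u, x):
    """Orthogonal projection of x onto L = Span{u}: (u.x)/(u.u) * u."""
    u, x = np.asarray(u, float), np.asarray(x, float)
    return (u @ x) / (u @ u) * u

x, u = np.array([-6.0, 4.0]), np.array([3.0, 2.0])   # Example 6.3.8
x_L = project_onto_line(u, x)                        # (-30/13, -20/13)
x_Lperp = x - x_L                                    # (-48/13, 72/13)
print(x_L, x_Lperp, np.linalg.norm(x_Lperp))         # distance from x to L ≈ 6.656
```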

Remark: Simple proof for the formula for projection onto a line

In the special case where we are projecting a vector \(x\) in \(\mathbb{R}^n\) onto a line \(L = \operatorname{Span}\{u\}\), our formula for the projection can be derived very directly and simply. The vector \(x_L\) is a multiple of \(u\), say \(x_L = cu\). This multiple is chosen so that \(x - x_L = x - cu\) is perpendicular to \(u\), as in the following picture.

Figure 6.3.8: The vectors \(x\) and \(u\), and the perpendicular vector \(x - cu\), where \(c = \dfrac{u \cdot x}{u \cdot u}\).

In other words,

\[ (x - cu) \cdot u = 0. \]

Using the distributive property for the dot product and isolating the variable \(c\) gives us that

\[ c = \frac{u \cdot x}{u \cdot u} \]

and so

\[ x_L = cu = \frac{u \cdot x}{u \cdot u}\,u. \]

Example 6.3.8: Projection onto a line in \(\mathbb{R}^2\)

Compute the orthogonal projection of \(x = \begin{pmatrix}-6\\4\end{pmatrix}\) onto the line \(L\) spanned by \(u = \begin{pmatrix}3\\2\end{pmatrix}\), and find the distance from \(x\) to \(L\).

Solution

First we find

\[ x_L = \frac{x \cdot u}{u \cdot u}\,u = \frac{-18 + 8}{9 + 4}\begin{pmatrix}3\\2\end{pmatrix} = -\frac{10}{13}\begin{pmatrix}3\\2\end{pmatrix} \qquad x_{L^\perp} = x - x_L = \frac{1}{13}\begin{pmatrix}-48\\72\end{pmatrix}. \]

The distance from \(x\) to \(L\) is

\[ \|x_{L^\perp}\| = \frac{1}{13}\sqrt{48^2 + 72^2} \approx 6.656. \]

Figure 6.3.9: The projection of \(x = (-6,4)\) onto the line \(L\) spanned by \(u = (3,2)\).

Figure 6.3.10: Distance from the line \(L\).
Example 6.3.9: Projection onto a line in \(\mathbb{R}^3\)

Let

\[ x = \begin{pmatrix}2\\3\\-1\end{pmatrix} \qquad u = \begin{pmatrix}1\\1\\1\end{pmatrix}, \]

and let \(L\) be the line spanned by \(u\). Compute \(x_L\) and \(x_{L^\perp}\).

Solution

\[ x_L = \frac{x \cdot u}{u \cdot u}\,u = \frac{2 + 3 - 1}{1 + 1 + 1}\begin{pmatrix}1\\1\\1\end{pmatrix} = \frac{4}{3}\begin{pmatrix}1\\1\\1\end{pmatrix} \qquad x_{L^\perp} = x - x_L = \frac{1}{3}\begin{pmatrix}2\\5\\-7\end{pmatrix}. \]

Figure 6.3.11: Orthogonal projection onto the line \(L\).

When \(A\) is a matrix with more than one column, computing the orthogonal projection of \(x\) onto \(W = \operatorname{Col}(A)\) means solving the matrix equation \(A^TAc = A^Tx\). In other words, we can compute the closest vector by solving a system of linear equations. To be explicit, we state Theorem 6.3.2 as a recipe:

Recipe: Compute an Orthogonal Decomposition

Let \(W\) be a subspace of \(\mathbb{R}^m\). Here is a method to compute the orthogonal decomposition of a vector \(x\) with respect to \(W\) (a short numerical sketch follows the recipe):

  1. Rewrite \(W\) as the column space of a matrix \(A\). In other words, find a spanning set for \(W\), and let \(A\) be the matrix with those columns.
  2. Compute the matrix \(A^TA\) and the vector \(A^Tx\).
  3. Form the augmented matrix for the matrix equation \(A^TAc = A^Tx\) in the unknown vector \(c\), and row reduce.
  4. This equation is always consistent; choose one solution \(c\). Then

\[ x_W = Ac \qquad x_{W^\perp} = x - x_W. \]
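As mentioned above, here is a minimal NumPy sketch of this recipe (the function name orthogonal_decomposition is ours). It uses np.linalg.lstsq only to pick one solution of the always-consistent system \(A^TAc = A^Tx\), so it also works when the chosen spanning set is not a basis. The example reproduces Example 6.3.11 below.

```python
import numpy as np

def orthogonal_decomposition(A, x):
    """Return (x_W, x_Wperp) for W = Col(A), by solving A^T A c = A^T x."""
    A, x = np.asarray(A, float), np.asarray(x, float)
    c, *_ = np.linalg.lstsq(A.T @ A, A.T @ x, rcond=None)
    x_W = A @ c
    return x_W, x - x_W

# Example 6.3.11: W = Span{(1,0,-1), (1,1,0)}, x = (1,2,3)
A = np.array([[1.0, 1.0], [0.0, 1.0], [-1.0, 0.0]])
x = np.array([1.0, 2.0, 3.0])
x_W, x_Wperp = orthogonal_decomposition(A, x)
print(x_W)                       # (1/3, 8/3, 7/3)
print(np.linalg.norm(x_Wperp))   # distance from x to W ≈ 1.155
```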
Example 6.3.10: Projection onto the xy-plane

Use Theorem 6.3.2 to compute the orthogonal decomposition of a vector with respect to the \(xy\)-plane in \(\mathbb{R}^3\).

Solution

A basis for the \(xy\)-plane is given by the two standard coordinate vectors

\[ e_1 = \begin{pmatrix}1\\0\\0\end{pmatrix} \qquad e_2 = \begin{pmatrix}0\\1\\0\end{pmatrix}. \]

Let \(A\) be the matrix with columns \(e_1, e_2\):

\[ A = \begin{pmatrix}1&0\\0&1\\0&0\end{pmatrix}. \]

Then

\[ A^TA = \begin{pmatrix}1&0\\0&1\end{pmatrix} = I_2 \qquad A^T\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix} = \begin{pmatrix}1&0&0\\0&1&0\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix} = \begin{pmatrix}x_1\\x_2\end{pmatrix}. \]

It follows that the unique solution \(c\) of \(A^TAc = I_2c = A^Tx\) is given by the first two coordinates of \(x\), so

\[ x_W = A\begin{pmatrix}x_1\\x_2\end{pmatrix} = \begin{pmatrix}1&0\\0&1\\0&0\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix} = \begin{pmatrix}x_1\\x_2\\0\end{pmatrix} \qquad x_{W^\perp} = x - x_W = \begin{pmatrix}0\\0\\x_3\end{pmatrix}. \]

We have recovered Example 6.3.1.

Example 6.3.11: Projection onto a plane in \(\mathbb{R}^3\)

Let

\[ W = \operatorname{Span}\left\{\begin{pmatrix}1\\0\\-1\end{pmatrix}, \begin{pmatrix}1\\1\\0\end{pmatrix}\right\} \qquad x = \begin{pmatrix}1\\2\\3\end{pmatrix}. \]

Compute \(x_W\) and the distance from \(x\) to \(W\).

Solution

We have to solve the matrix equation \(A^TAc = A^Tx\), where

\[ A = \begin{pmatrix}1&1\\0&1\\-1&0\end{pmatrix}. \]

We have

\[ A^TA = \begin{pmatrix}2&1\\1&2\end{pmatrix} \qquad A^Tx = \begin{pmatrix}-2\\3\end{pmatrix}. \]

We form an augmented matrix and row reduce:

\[ \left(\begin{array}{cc|c}2&1&-2\\1&2&3\end{array}\right) \xrightarrow{\text{RREF}} \left(\begin{array}{cc|c}1&0&-7/3\\0&1&8/3\end{array}\right) \implies c = \frac{1}{3}\begin{pmatrix}-7\\8\end{pmatrix}. \]

It follows that

\[ x_W = Ac = \frac{1}{3}\begin{pmatrix}1\\8\\7\end{pmatrix} \qquad x_{W^\perp} = x - x_W = \frac{1}{3}\begin{pmatrix}2\\-2\\2\end{pmatrix}. \]

The distance from \(x\) to \(W\) is

\[ \|x_{W^\perp}\| = \frac{1}{3}\sqrt{4+4+4} \approx 1.155. \]

Figure 6.3.12: Orthogonal projection onto the plane \(W\).
Example 6.3.12: Projection onto another plane in \(\mathbb{R}^3\)

Let

\[ W = \left\{\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix} \;\middle|\; x_1 - 2x_2 = x_3\right\} \qquad\text{and}\qquad x = \begin{pmatrix}1\\1\\1\end{pmatrix}. \]

Compute \(x_W\).

Solution

Method 1: First we need to find a spanning set for \(W\). We notice that \(W\) is the solution set of the homogeneous equation \(x_1 - 2x_2 - x_3 = 0\), so \(W = \operatorname{Nul}\begin{pmatrix}1 & -2 & -1\end{pmatrix}\). We know how to compute a basis for a null space: we row reduce and find the parametric vector form. The matrix \(\begin{pmatrix}1 & -2 & -1\end{pmatrix}\) is already in reduced row echelon form. The parametric form is \(x_1 = 2x_2 + x_3\), so the parametric vector form is

\[ \begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix} = x_2\begin{pmatrix}2\\1\\0\end{pmatrix} + x_3\begin{pmatrix}1\\0\\1\end{pmatrix}, \]

and hence a basis for \(W\) is given by

\[ \left\{\begin{pmatrix}2\\1\\0\end{pmatrix}, \begin{pmatrix}1\\0\\1\end{pmatrix}\right\}. \]

We let \(A\) be the matrix whose columns are our basis vectors:

\[ A = \begin{pmatrix}2&1\\1&0\\0&1\end{pmatrix}. \]

Hence \(\operatorname{Col}(A) = \operatorname{Nul}\begin{pmatrix}1 & -2 & -1\end{pmatrix} = W\).

Now we can continue with step 2 of the recipe. We compute

\[ A^TA = \begin{pmatrix}5&2\\2&2\end{pmatrix} \qquad A^Tx = \begin{pmatrix}3\\2\end{pmatrix}. \]

We write the linear system \(A^TAc = A^Tx\) as an augmented matrix and row reduce:

\[ \left(\begin{array}{cc|c}5&2&3\\2&2&2\end{array}\right) \xrightarrow{\text{RREF}} \left(\begin{array}{cc|c}1&0&1/3\\0&1&2/3\end{array}\right). \]

Hence we can take \(c = \begin{pmatrix}1/3\\2/3\end{pmatrix}\), so

\[ x_W = Ac = \begin{pmatrix}2&1\\1&0\\0&1\end{pmatrix}\begin{pmatrix}1/3\\2/3\end{pmatrix} = \frac{1}{3}\begin{pmatrix}4\\1\\2\end{pmatrix}. \]

Figure 6.3.13: Orthogonal projection onto the plane \(W\).

Method 2: In this case, it is easier to compute \(x_{W^\perp}\). Indeed, since \(W = \operatorname{Nul}\begin{pmatrix}1 & -2 & -1\end{pmatrix}\), the orthogonal complement is the line

\[ V = W^\perp = \operatorname{Col}\begin{pmatrix}1\\-2\\-1\end{pmatrix}. \]

Using the formula for projection onto a line, Example 6.3.7, gives

\[ x_{W^\perp} = x_V = \frac{\begin{pmatrix}1\\1\\1\end{pmatrix} \cdot \begin{pmatrix}1\\-2\\-1\end{pmatrix}}{\begin{pmatrix}1\\-2\\-1\end{pmatrix} \cdot \begin{pmatrix}1\\-2\\-1\end{pmatrix}}\,\begin{pmatrix}1\\-2\\-1\end{pmatrix} = \frac{1}{3}\begin{pmatrix}-1\\2\\1\end{pmatrix}. \]

Hence we have

\[ x_W = x - x_{W^\perp} = \begin{pmatrix}1\\1\\1\end{pmatrix} - \frac{1}{3}\begin{pmatrix}-1\\2\\1\end{pmatrix} = \frac{1}{3}\begin{pmatrix}4\\1\\2\end{pmatrix}, \]

as above.
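Method 2 is also the quickest to code. A short NumPy check of this example (ours): project \(x\) onto the normal line spanned by \((1,-2,-1)\), then subtract.

```python
import numpy as np

x = np.array([1.0, 1.0, 1.0])
n = np.array([1.0, -2.0, -1.0])    # W = Nul([1, -2, -1]), so W-perp = Span{n}
x_Wperp = (n @ x) / (n @ n) * n    # projection of x onto the line W-perp
print(x - x_Wperp)                 # x_W = (4/3, 1/3, 2/3)
```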

Example 6.3.13: Projection onto a 3-space in \(\mathbb{R}^4\)

Let

\[ W = \operatorname{Span}\left\{\begin{pmatrix}1\\0\\1\\0\end{pmatrix}, \begin{pmatrix}0\\1\\0\\1\end{pmatrix}, \begin{pmatrix}1\\1\\-1\\1\end{pmatrix}\right\} \qquad x = \begin{pmatrix}0\\-1\\3\\4\end{pmatrix}. \]

Compute the orthogonal decomposition of \(x\) with respect to \(W\).

Solution

We have to solve the matrix equation \(A^TAc = A^Tx\), where

\[ A = \begin{pmatrix}1&0&1\\0&1&1\\1&0&-1\\0&1&1\end{pmatrix}. \]

We compute

\[ A^TA = \begin{pmatrix}2&0&0\\0&2&2\\0&2&4\end{pmatrix} \qquad A^Tx = \begin{pmatrix}3\\3\\0\end{pmatrix}. \]

We form an augmented matrix and row reduce:

\[ \left(\begin{array}{ccc|c}2&0&0&3\\0&2&2&3\\0&2&4&0\end{array}\right) \xrightarrow{\text{RREF}} \left(\begin{array}{ccc|c}1&0&0&3/2\\0&1&0&3\\0&0&1&-3/2\end{array}\right) \implies c = \frac{1}{2}\begin{pmatrix}3\\6\\-3\end{pmatrix}. \]

It follows that

\[ x_W = Ac = \frac{1}{2}\begin{pmatrix}0\\3\\6\\3\end{pmatrix} \qquad x_{W^\perp} = \frac{1}{2}\begin{pmatrix}0\\-5\\0\\5\end{pmatrix}. \]

In the context of the above recipe, if we start with a basis of \(W\), then it turns out that the square matrix \(A^TA\) is automatically invertible! (It is always the case that \(A^TA\) is square and the equation \(A^TAc = A^Tx\) is consistent, but \(A^TA\) need not be invertible in general.)

Corollary 6.3.1

Let \(A\) be an \(m \times n\) matrix with linearly independent columns and let \(W = \operatorname{Col}(A)\). Then the \(n \times n\) matrix \(A^TA\) is invertible, and for all vectors \(x\) in \(\mathbb{R}^m\), we have

\[ x_W = A(A^TA)^{-1}A^Tx. \]

Proof

We will show that \(\operatorname{Nul}(A^TA) = \{0\}\), which implies invertibility by Theorem 5.1.1 in Section 5.1. Suppose that \(A^TAc = 0\). Then \(A^TAc = A^T0\), so \(0_W = Ac\) by Theorem 6.3.2. But \(0_W = 0\) (the orthogonal decomposition of the zero vector is just \(0 = 0 + 0\)), so \(Ac = 0\), and therefore \(c\) is in \(\operatorname{Nul}(A)\). Since the columns of \(A\) are linearly independent, we have \(c = 0\), so \(\operatorname{Nul}(A^TA) = \{0\}\), as desired.

Let \(x\) be a vector in \(\mathbb{R}^m\) and let \(c\) be a solution of \(A^TAc = A^Tx\). Then \(c = (A^TA)^{-1}A^Tx\), so \(x_W = Ac = A(A^TA)^{-1}A^Tx\).

The corollary applies in particular to the case where we have a subspace \(W\) of \(\mathbb{R}^m\), and a basis \(v_1, v_2, \ldots, v_n\) for \(W\). To apply the corollary, we take \(A\) to be the \(m \times n\) matrix with columns \(v_1, v_2, \ldots, v_n\).

Example 6.3.14: Computing a projection

Continuing with Example 6.3.11 above, let

\[ W = \operatorname{Span}\left\{\begin{pmatrix}1\\0\\-1\end{pmatrix}, \begin{pmatrix}1\\1\\0\end{pmatrix}\right\} \qquad x = \begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix}. \]

Compute \(x_W\) using the formula \(x_W = A(A^TA)^{-1}A^Tx\).

Solution

Clearly the spanning vectors are noncollinear, so according to Corollary 6.3.1, we have \(x_W = A(A^TA)^{-1}A^Tx\), where

\[ A = \begin{pmatrix}1&1\\0&1\\-1&0\end{pmatrix}. \]

We compute

\[ A^TA = \begin{pmatrix}2&1\\1&2\end{pmatrix} \qquad (A^TA)^{-1} = \frac{1}{3}\begin{pmatrix}2&-1\\-1&2\end{pmatrix}, \]

so

\[ x_W = A(A^TA)^{-1}A^Tx = \begin{pmatrix}1&1\\0&1\\-1&0\end{pmatrix}\cdot\frac{1}{3}\begin{pmatrix}2&-1\\-1&2\end{pmatrix}\begin{pmatrix}1&0&-1\\1&1&0\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix} = \frac{1}{3}\begin{pmatrix}2&1&-1\\1&2&1\\-1&1&2\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix} = \frac{1}{3}\begin{pmatrix}2x_1+x_2-x_3\\x_1+2x_2+x_3\\-x_1+x_2+2x_3\end{pmatrix}. \tag{6.3.2} \]

So, for example, if \(x = (1,0,0)\), this formula tells us that \(x_W = \frac{1}{3}(2,1,-1)\).
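Here is a quick numerical check of formula (6.3.2), assuming the same two basis vectors (this sketch is ours, not part of the text).

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0], [-1.0, 0.0]])   # columns (1,0,-1) and (1,1,0)
P = A @ np.linalg.inv(A.T @ A) @ A.T                  # the matrix A (A^T A)^{-1} A^T
print(np.round(3 * P))                                # [[2,1,-1],[1,2,1],[-1,1,2]]
print(P @ np.array([1.0, 0.0, 0.0]))                  # x_W for x = (1,0,0): (2/3, 1/3, -1/3)
```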

Orthogonal Projection

In this subsection, we change perspective and think of the orthogonal projection \(x_W\) as a function of \(x\). This function turns out to be a linear transformation with many nice properties, and is a good example of a linear transformation which is not originally defined as a matrix transformation.

Proposition 6.3.1: Properties of Orthogonal Projections

Let \(W\) be a subspace of \(\mathbb{R}^n\), and define \(T : \mathbb{R}^n \to \mathbb{R}^n\) by \(T(x) = x_W\). Then:

  1. \(T\) is a linear transformation.
  2. \(T(x) = x\) if and only if \(x\) is in \(W\).
  3. \(T(x) = 0\) if and only if \(x\) is in \(W^\perp\).
  4. \(T \circ T = T\).
  5. The range of \(T\) is \(W\).
Proof
  1. We have to verify the defining properties of linearity, Definition 3.3.1 in Section 3.3. Let \(x, y\) be vectors in \(\mathbb{R}^n\), and let \(x = x_W + x_{W^\perp}\) and \(y = y_W + y_{W^\perp}\) be their orthogonal decompositions. Since \(W\) and \(W^\perp\) are subspaces, the sums \(x_W + y_W\) and \(x_{W^\perp} + y_{W^\perp}\) are in \(W\) and \(W^\perp\), respectively. Therefore, the orthogonal decomposition of \(x + y\) is \((x_W + y_W) + (x_{W^\perp} + y_{W^\perp})\), so \(T(x+y) = (x+y)_W = x_W + y_W = T(x) + T(y)\). Now let \(c\) be a scalar. Then \(cx_W\) is in \(W\) and \(cx_{W^\perp}\) is in \(W^\perp\), so the orthogonal decomposition of \(cx\) is \(cx_W + cx_{W^\perp}\), and therefore, \(T(cx) = (cx)_W = cx_W = cT(x)\). Since \(T\) satisfies the two defining properties, Definition 3.3.1 in Section 3.3, it is a linear transformation.
  2. See Example 6.3.2.
  3. See Example 6.3.3.
  4. For any \(x\) in \(\mathbb{R}^n\) the vector \(T(x)\) is in \(W\), so \(T \circ T(x) = T(T(x)) = T(x)\) by 2.
  5. Any vector \(x\) in \(W\) is in the range of \(T\), because \(T(x) = x\) for such vectors. On the other hand, for any vector \(x\) in \(\mathbb{R}^n\) the output \(T(x) = x_W\) is in \(W\), so \(W\) is the range of \(T\).

We compute the standard matrix of the orthogonal projection in the same way as for any other transformation, Theorem 3.3.1 in Section 3.3: by evaluating on the standard coordinate vectors. In this case, this means projecting the standard coordinate vectors onto the subspace.

Example 6.3.15: Matrix of a projection

Let \(L\) be the line in \(\mathbb{R}^2\) spanned by the vector \(u = \begin{pmatrix}3\\2\end{pmatrix}\), and define \(T : \mathbb{R}^2 \to \mathbb{R}^2\) by \(T(x) = x_L\). Compute the standard matrix \(B\) for \(T\).

Solution

The columns of \(B\) are \(T(e_1) = (e_1)_L\) and \(T(e_2) = (e_2)_L\). We have

\[ (e_1)_L = \frac{u \cdot e_1}{u \cdot u}\,u = \frac{3}{13}\begin{pmatrix}3\\2\end{pmatrix} \qquad (e_2)_L = \frac{u \cdot e_2}{u \cdot u}\,u = \frac{2}{13}\begin{pmatrix}3\\2\end{pmatrix} \qquad\implies\qquad B = \frac{1}{13}\begin{pmatrix}9&6\\6&4\end{pmatrix}. \]
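As a check on the arithmetic (our own observation, not from the text): the columns \((e_i)_L = \frac{u \cdot e_i}{u \cdot u}\,u\) assemble into the single matrix \(\frac{1}{u \cdot u}\,uu^T\), so the whole computation is one outer product.

```python
import numpy as np

u = np.array([3.0, 2.0])
B = np.outer(u, u) / (u @ u)   # column i is the projection of e_i onto L = Span{u}
print(13 * B)                  # [[9, 6], [6, 4]], i.e. B = (1/13) [[9, 6], [6, 4]]
```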

Example 6.3.16: Matrix of a projection

Let \(L\) be the line in \(\mathbb{R}^3\) spanned by the vector

\[ u = \begin{pmatrix}1\\1\\1\end{pmatrix}, \]

and define \(T : \mathbb{R}^3 \to \mathbb{R}^3\) by \(T(x) = x_L\). Compute the standard matrix \(B\) for \(T\).

Solution

The columns of \(B\) are \(T(e_1) = (e_1)_L\), \(T(e_2) = (e_2)_L\), and \(T(e_3) = (e_3)_L\). We have

\[ (e_1)_L = \frac{u \cdot e_1}{u \cdot u}\,u = \frac{1}{3}\begin{pmatrix}1\\1\\1\end{pmatrix} \qquad (e_2)_L = \frac{u \cdot e_2}{u \cdot u}\,u = \frac{1}{3}\begin{pmatrix}1\\1\\1\end{pmatrix} \qquad (e_3)_L = \frac{u \cdot e_3}{u \cdot u}\,u = \frac{1}{3}\begin{pmatrix}1\\1\\1\end{pmatrix} \qquad\implies\qquad B = \frac{1}{3}\begin{pmatrix}1&1&1\\1&1&1\\1&1&1\end{pmatrix}. \]

Example 6.3.17: Matrix of a projection

Continuing with Example 6.3.11, let

\[ W = \operatorname{Span}\left\{\begin{pmatrix}1\\0\\-1\end{pmatrix}, \begin{pmatrix}1\\1\\0\end{pmatrix}\right\}, \]

and define \(T : \mathbb{R}^3 \to \mathbb{R}^3\) by \(T(x) = x_W\). Compute the standard matrix \(B\) for \(T\).

Solution

The columns of \(B\) are \(T(e_1) = (e_1)_W\), \(T(e_2) = (e_2)_W\), and \(T(e_3) = (e_3)_W\). Let

\[ A = \begin{pmatrix}1&1\\0&1\\-1&0\end{pmatrix}. \]

To compute each \((e_i)_W\), we solve the matrix equation \(A^TAc = A^Te_i\) for \(c\), then use the equality \((e_i)_W = Ac\). First we note that

\[ A^TA = \begin{pmatrix}2&1\\1&2\end{pmatrix}; \qquad A^Te_i = \text{the } i\text{th column of } A^T = \begin{pmatrix}1&0&-1\\1&1&0\end{pmatrix}. \]

For \(e_1\), we form an augmented matrix and row reduce:

\[ \left(\begin{array}{cc|c}2&1&1\\1&2&1\end{array}\right) \xrightarrow{\text{RREF}} \left(\begin{array}{cc|c}1&0&1/3\\0&1&1/3\end{array}\right) \implies (e_1)_W = A\begin{pmatrix}1/3\\1/3\end{pmatrix} = \frac{1}{3}\begin{pmatrix}2\\1\\-1\end{pmatrix}. \]

We do the same for \(e_2\):

\[ \left(\begin{array}{cc|c}2&1&0\\1&2&1\end{array}\right) \xrightarrow{\text{RREF}} \left(\begin{array}{cc|c}1&0&-1/3\\0&1&2/3\end{array}\right) \implies (e_2)_W = A\begin{pmatrix}-1/3\\2/3\end{pmatrix} = \frac{1}{3}\begin{pmatrix}1\\2\\1\end{pmatrix} \]

and for \(e_3\):

\[ \left(\begin{array}{cc|c}2&1&-1\\1&2&0\end{array}\right) \xrightarrow{\text{RREF}} \left(\begin{array}{cc|c}1&0&-2/3\\0&1&1/3\end{array}\right) \implies (e_3)_W = A\begin{pmatrix}-2/3\\1/3\end{pmatrix} = \frac{1}{3}\begin{pmatrix}-1\\1\\2\end{pmatrix}. \]

It follows that

\[ B = \frac{1}{3}\begin{pmatrix}2&1&-1\\1&2&1\\-1&1&2\end{pmatrix}. \]

In the previous Example 6.3.17, we could have used the fact that

\[ \left\{\begin{pmatrix}1\\0\\-1\end{pmatrix}, \begin{pmatrix}1\\1\\0\end{pmatrix}\right\} \]

forms a basis for \(W\), so that

\[ T(x) = x_W = \left[A(A^TA)^{-1}A^T\right]x \qquad\text{for}\qquad A = \begin{pmatrix}1&1\\0&1\\-1&0\end{pmatrix} \]

by Corollary 6.3.1. In this case, we have already expressed \(T\) as a matrix transformation with matrix \(A(A^TA)^{-1}A^T\). See Example 6.3.14.

Note 6.3.2

Let \(W\) be a subspace of \(\mathbb{R}^n\) with basis \(v_1, v_2, \ldots, v_m\), and let \(A\) be the matrix with columns \(v_1, v_2, \ldots, v_m\). Then the standard matrix for \(T(x) = x_W\) is

\[ A(A^TA)^{-1}A^T. \]
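In code, this note becomes a short helper (the name projection_matrix is ours). It assumes the vectors passed in really form a basis; otherwise \(A^TA\) is not invertible and the recipe above should be used instead.

```python
import numpy as np

def projection_matrix(basis_vectors):
    """Standard matrix A (A^T A)^{-1} A^T of projection onto W = Span(basis_vectors).
    Assumes the given vectors are linearly independent."""
    A = np.column_stack([np.asarray(v, float) for v in basis_vectors])
    return A @ np.linalg.inv(A.T @ A) @ A.T

B = projection_matrix([[1, 0, -1], [1, 1, 0]])   # the plane W of Example 6.3.17
print(np.round(3 * B))                           # [[2,1,-1],[1,2,1],[-1,1,2]]
```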

We can translate the above properties of orthogonal projections, Proposition 6.3.1, into properties of the associated standard matrix.

Proposition 6.3.2: Properties of Projection Matrices

Let \(W\) be a subspace of \(\mathbb{R}^n\), define \(T : \mathbb{R}^n \to \mathbb{R}^n\) by \(T(x) = x_W\), and let \(B\) be the standard matrix for \(T\). Then:

  1. \(\operatorname{Col}(B) = W\).
  2. \(\operatorname{Nul}(B) = W^\perp\).
  3. \(B^2 = B\).
  4. If \(W \neq \{0\}\), then \(1\) is an eigenvalue of \(B\) and the \(1\)-eigenspace for \(B\) is \(W\).
  5. If \(W \neq \mathbb{R}^n\), then \(0\) is an eigenvalue of \(B\) and the \(0\)-eigenspace for \(B\) is \(W^\perp\).
  6. \(B\) is similar to the diagonal matrix with \(m\) ones and \(n-m\) zeros on the diagonal, where \(m = \dim(W)\).
Proof

The first four assertions are translations of properties 5, 3, 4, and 2 from Proposition 6.3.1, respectively, using Note 3.1.1 in Section 3.1 and Theorem 3.4.1 in Section 3.4. The fifth assertion is equivalent to the second, by Fact 5.1.2 in Section 5.1.

For the final assertion, we showed in the proof of Theorem 6.3.1 that there is a basis of \(\mathbb{R}^n\) of the form \(\{v_1, \ldots, v_m, v_{m+1}, \ldots, v_n\}\), where \(\{v_1, \ldots, v_m\}\) is a basis for \(W\) and \(\{v_{m+1}, \ldots, v_n\}\) is a basis for \(W^\perp\). Each \(v_i\) is an eigenvector of \(B\): indeed, for \(i \leq m\) we have

\[ Bv_i = T(v_i) = v_i = 1 \cdot v_i \]

because \(v_i\) is in \(W\), and for \(i > m\) we have

\[ Bv_i = T(v_i) = 0 = 0 \cdot v_i \]

because \(v_i\) is in \(W^\perp\). Therefore, we have found a basis of eigenvectors, with associated eigenvalues \(1, \ldots, 1, 0, \ldots, 0\) (\(m\) ones and \(n-m\) zeros). Now we use Theorem 5.4.1 in Section 5.4.

We emphasize that the properties of projection matrices, Proposition 6.3.2, would be very hard to prove in terms of matrices. By translating all of the statements into statements about linear transformations, they become much more transparent. For example, consider the projection matrix we found in Example 6.3.17. Just by looking at the matrix it is not at all obvious that when you square the matrix you get the same matrix back.

Example 6.3.18

Continuing with Example 6.3.17 above, we showed that

\[ B = \frac{1}{3}\begin{pmatrix}2&1&-1\\1&2&1\\-1&1&2\end{pmatrix} \]

is the standard matrix of the orthogonal projection onto

\[ W = \operatorname{Span}\left\{\begin{pmatrix}1\\0\\-1\end{pmatrix}, \begin{pmatrix}1\\1\\0\end{pmatrix}\right\}. \]

One can verify by hand that \(B^2 = B\) (try it!). We compute \(W^\perp\) as the null space of

\[ \begin{pmatrix}1&0&-1\\1&1&0\end{pmatrix} \xrightarrow{\text{RREF}} \begin{pmatrix}1&0&-1\\0&1&1\end{pmatrix}. \]

The free variable is \(x_3\), and the parametric form is \(x_1 = x_3,\ x_2 = -x_3\), so that

\[ W^\perp = \operatorname{Span}\left\{\begin{pmatrix}1\\-1\\1\end{pmatrix}\right\}. \]

It follows that \(B\) has eigenvectors

\[ \begin{pmatrix}1\\0\\-1\end{pmatrix},\quad \begin{pmatrix}1\\1\\0\end{pmatrix},\quad \begin{pmatrix}1\\-1\\1\end{pmatrix} \]

with eigenvalues \(1, 1, 0\), respectively, so that

\[ B = \begin{pmatrix}1&1&1\\0&1&-1\\-1&0&1\end{pmatrix}\begin{pmatrix}1&0&0\\0&1&0\\0&0&0\end{pmatrix}\begin{pmatrix}1&1&1\\0&1&-1\\-1&0&1\end{pmatrix}^{-1}. \]
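These claims are easy to confirm numerically; here is a small NumPy check (ours, not part of the text).

```python
import numpy as np

B = np.array([[2, 1, -1], [1, 2, 1], [-1, 1, 2]]) / 3.0
print(np.allclose(B @ B, B))                       # True: B^2 = B
print(np.round(np.linalg.eigvalsh(B), 6))          # eigenvalues [0. 1. 1.]
print(np.allclose(B @ np.array([1, -1, 1]), 0))    # True: (1,-1,1) spans the 0-eigenspace W-perp
```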

Remark

As we saw in Example 6.3.18, if you are willing to compute bases for \(W\) and \(W^\perp\), then this provides a third way of finding the standard matrix \(B\) for projection onto \(W\): indeed, if \(\{v_1, v_2, \ldots, v_m\}\) is a basis for \(W\) and \(\{v_{m+1}, v_{m+2}, \ldots, v_n\}\) is a basis for \(W^\perp\), then

\[ B = \begin{pmatrix} | & | & & | \\ v_1 & v_2 & \cdots & v_n \\ | & | & & | \end{pmatrix} \begin{pmatrix} 1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & & \vdots \\ 0 & \cdots & 1 & \cdots & 0 \\ \vdots & & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & \cdots & 0 \end{pmatrix} \begin{pmatrix} | & | & & | \\ v_1 & v_2 & \cdots & v_n \\ | & | & & | \end{pmatrix}^{-1}, \]

where the middle matrix in the product is the diagonal matrix with \(m\) ones and \(n-m\) zeros on the diagonal. However, since you already have a basis for \(W\), it is faster to multiply out the expression \(A(A^TA)^{-1}A^T\) as in Corollary 6.3.1.

Remark: Reflections

Let \(W\) be a subspace of \(\mathbb{R}^n\), and let \(x\) be a vector in \(\mathbb{R}^n\). The reflection of \(x\) over \(W\) is defined to be the vector

\[ \operatorname{ref}_W(x) = x - 2x_{W^\perp}. \]

In other words, to find \(\operatorname{ref}_W(x)\) one starts at \(x\), then moves to \(x - x_{W^\perp} = x_W\), then continues in the same direction one more time, to end on the opposite side of \(W\).

Figure 6.3.14: The vectors \(x\), \(x_W\), \(-x_{W^\perp}\), and \(\operatorname{ref}_W(x)\): subtracting \(x_{W^\perp}\) from \(x\) lands on \(W\), and subtracting it again lands at the reflection.

Since \(x_{W^\perp} = x - x_W\), we also have

\[ \operatorname{ref}_W(x) = x - 2(x - x_W) = 2x_W - x. \]

We leave it to the reader to check, using the definition, that the following hold (a short numerical illustration appears after the list):

  1. \(\operatorname{ref}_W \circ \operatorname{ref}_W = \operatorname{Id}_{\mathbb{R}^n}\).
  2. The \(1\)-eigenspace of \(\operatorname{ref}_W\) is \(W\), and the \(-1\)-eigenspace of \(\operatorname{ref}_W\) is \(W^\perp\).
  3. \(\operatorname{ref}_W\) is similar to the diagonal matrix with \(m = \dim(W)\) ones and \(n - m\) negative ones on the diagonal.
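As promised above, a brief numerical illustration (ours), using the projection matrix \(B\) from Example 6.3.18: since \(\operatorname{ref}_W(x) = 2x_W - x\), the matrix of the reflection is \(2B - I\).

```python
import numpy as np

B = np.array([[2, 1, -1], [1, 2, 1], [-1, 1, 2]]) / 3.0   # projection onto W
R = 2 * B - np.eye(3)                                     # ref_W(x) = 2 x_W - x = (2B - I) x
print(np.allclose(R @ R, np.eye(3)))    # True: reflecting twice is the identity
print(R @ np.array([1.0, 0.0, -1.0]))   # a vector in W is fixed:        (1, 0, -1)
print(R @ np.array([1.0, -1.0, 1.0]))   # a vector in W-perp is negated: (-1, 1, -1)
```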

This page titled 6.3: Orthogonal Projection is shared under a GNU Free Documentation License 1.3 license and was authored, remixed, and/or curated by Dan Margalit & Joseph Rabinoff via source content that was edited to the style and standards of the LibreTexts platform.
