1.2: Inner Product
Dot product
In this section we consider further geometric properties of vectors, such as the length of a vector and the angle between two vectors. When the angle between two vectors equals \(\frac12\pi \), the vectors are called perpendicular, or equivalently orthogonal. All of these properties can be expressed using a new operation: the inner product, or dot product. We start with vectors in \(\mathbb{R}^2 \) and \(\mathbb{R}^3 \); the translation of the concepts to the general space \(\mathbb{R}^n \) will then be more or less immediate.

Length and perpendicularity in \(\mathbb{R}^2 \) and \(\mathbb{R}^3 \)
The length of a vector \[ \vect{v}= \left[\begin{array} {r} a_{1}\\a_{2} \end{array}\right] \nonumber\] in the plane, which we denote by \(\norm{\vect{v}} \), can be computed using the Pythagorean theorem: \[ \label{Eq:InnerProduct:length-2D} \norm{\vect{v}} = \sqrt{a_1^2+a_2^2} \]

Figure \(\PageIndex{1}\): The length of a vector in the plane via Pythagoras' Theorem
Figure \(\PageIndex{2}\): The length of a vector in \(\mathbb{R}^3 \) via Pythagoras' Theorem
Using this theorem twice, we find a similar formula for the length of a vector \[ \vect{v}= \left[\begin{array}{r} a_{1}\\a_{2}\\a_{3}\end{array}\right] \nonumber\] in \(\mathbb{R}^3 \). Look at Figure \(\PageIndex{2}\). There are two right triangles: \(\Delta OPQ \), where \(\angle OPQ \) is right, and \(\Delta OQA \), where \(\angle OQA \) is right. From \[ OQ^2 = OP^2 + PQ^2 = a_1^2 + a_2^2, \nonumber\] where for two points \(A \) and \(B \), by \(AB \) we denote the length of the line segment \(AB \), and \[ OA^2 = OQ^2+QA^2 = a_1^2 + a_2^2+a_3^2 \nonumber\] we find that \[ \label{Eq:InnerProduct:length-3D} \norm{\vect{v}}= OA = \sqrt{a_1^2 + a_2^2+a_3^2} \]
Figure \(\PageIndex{3}\): Perpendicular versus non-perpendicular
Let us now turn our attention to another important geometric concept, namely that of perpendicularity. It is clear from Figure \(\PageIndex{3}\) that the vectors \( \left[\begin{array}{r}2\\3\end{array}\right] \) and \( \left[\begin{array}{r}-3\\2\end{array}\right] \) are perpendicular, whereas the vectors \( \left[\begin{array}{r}2\\3\end{array}\right] \) and \( \left[\begin{array}{r}-1\\3\end{array}\right] \) are not. But how does this work in \(\mathbb{R}^3 \)? Well, look at Figure \(\PageIndex{4}\):
Figure \(\PageIndex{4}\): Diagonal of a rectangle versus diagonal of a parallelogram
In both pictures, let \(A \) be the end point of vector \(\vect{v} \), \(B \) the end point of vector \(\vect{w} \), and \(C \) the end point of vector \(\vect{v}+\vect{w} \). The diagonals are \[ \overline{OC} = \vect{v}+\vect{w} \quad \text{and} \quad \overline{BA} = \vect{v}-\vect{w}. \nonumber\] In the left picture of Figure \(\PageIndex{4}\) the vectors \(\vect{v} \) and \(\vect{w} \) are perpendicular, so the parallelogram \(OACB \) is a rectangle. It follows that the two diagonals have the same length: \[ \label{EqualDiagonals} \norm{\vect{v}+\vect{w}} = \norm{\vect{v}-\vect{w}}. \] In the picture on the right the vectors are not perpendicular and \[ \norm{\vect{v}+\vect{w}} \neq \norm{\vect{v}-\vect{w}}. \nonumber\] The pictures suggest that we are talking about two (non-zero) vectors in the plane, i.e., in \(\mathbb{R}^2 \). However, two vectors in \(\mathbb{R}^3 \) form a parallelogram as well, which becomes a rectangle if and only if the vectors are perpendicular. We introduce a notation for this: if \( \vect{v} \) and \(\vect{w} \) are perpendicular, we write \[ \label{Eq:InnerProduct:Orthogonal} \vect{v} \perp \vect{w} \] Taking squares in Equation \eqref{EqualDiagonals}, we see that the following holds both in \(\mathbb{R}^2 \) and in \(\mathbb{R}^3 \): \[ \vect{v} \perp \vect{w} \iff \norm{\vect{v}+\vect{w}}^2 = \norm{\vect{v}-\vect{w}}^2. \nonumber\] If we write this out for two arbitrary vectors \(\vect{v}= \left[\begin{array}{r} a_{1}\\a_{2}\end{array}\right], \vect{w}= \left[\begin{array}{r} b_{1}\\b_{2}\end{array}\right] \) in \(\mathbb{R}^2 \) we get the following: \[ \begin{array}{rcl} \vect{v} \perp \vect{w} &\iff &\norm{\vect{v}+\vect{w}}^2 = \norm{\vect{v}-\vect{w}}^2\\ &\iff &(a_1+b_1)^2 + (a_2+b_2)^2 = (a_1-b_1)^2 + (a_2-b_2)^2\\ &\iff &a_1^2+2a_1b_1 + b_1^2 + a_2^2+2a_2b_2 + b_2^2 = a_1^2 -2a_1b_1+b_1^2+ a_2^2 -2a_2b_2+b_2^2\\ &\iff &4(a_1b_1 +a_2b_2)=0 \\ &\iff &a_1b_1 +a_2b_2=0. 
\end{array} \nonumber\] Likewise, for vectors \(\vect{v}= \left[\begin{array}{r} a_{1}\\a_{2}\\a_{3}\end{array}\right], \vect{w}= \left[\begin{array}{r} b_{1}\\b_{2}\\b_{3}\end{array}\right] \) in \(\mathbb{R}^3 \): \[ \label{Eq:InnerProduct:perp-in-3D} \vect{v} \perp \vect{w} \iff a_1b_1 +a_2b_2+a_3b_3=0. \] The derivation is completely analogous to the one above, only now there is one extra term. So to check 'algebraically' whether two vectors are perpendicular we just have to compute \(a_1b_1 +a_2b_2 \,( +\, a_3b_3 ) \) and see whether this is equal to 0. This expression is called the inner product (or dot product) of the vectors \(\vect{v} \) and \(\vect{w} \). We denote it by \(\vect{v}\ip\vect{w} \). Note that the dot product of a general vector \(\vect{v}= \left[\begin{array}{r} a_{1}\\a_{2}\\a_{3}\end{array}\right] \) in \(\mathbb{R}^3 \) with itself gives \[ \vect{v}\ip\vect{v} = a_1^2+a_2^2+a_3^2 = \norm{\vect{v}}^2, \nonumber\] so the length of a vector can be expressed as follows using the dot product: \[ \label{Eq:NormViaDotproduct} \norm{\vect{v}} = \sqrt{\vect{v}\ip\vect{v} }. \] Using the dot product, the concepts of length and perpendicularity easily carry over to any \(\mathbb{R}^n \), \(n \geq 4 \). Let us do so one by one, starting by generalizing the dot product in the next subsection.
Dot product in \(\mathbb{R}^n \)
The dot product (or inner product) of two vectors \(\vect{v}= \left[\begin{array}{r}a_{1}\\a_{2}\\ \vdots\\a_{n}\end{array}\right] \) and \(\vect{w}= \left[\begin{array}{r}b_{1}\\b_{2}\\ \vdots\\b_{n}\end{array}\right] \) in \(\mathbb{R}^n \) is defined as \[ \label{Eq:InnerProduct:DotProduct} \vect{v}\ip\vect{w} = a_1b_1 +a_2b_2+ \ldots + a_nb_n. \]
The dot product of the two vectors \[ \vect{v}_1= \left[\begin{array}{r} 5\\3\\4\\-2\end{array}\right] \quad \text{and}\quad \vect{v}_2= \left[\begin{array}{r} 2\\3\\0\\1\end{array}\right] \nonumber \nonumber\] is given by \[ \vect{v}_1\ip\vect{v}_2 = 5\cdot2 + 3\cdot3 +4\cdot0 + (-2)\cdot1 = 17 \nonumber \nonumber\] And the dot product of the two vectors \[ \vect{v}_1= \left[\begin{array}{r} 5\\3\\4\\-2\end{array}\right] \quad \text{and}\quad \vect{v}_3= \left[\begin{array}{r} -4\\3\\2\end{array}\right] \nonumber \nonumber\] is not defined. In fact, the dot product of a vector \(\vect{v} \) in \(\mathbb{R}^m \) and a vector \(\vect{w} \) in \(\mathbb{R}^n \) is only defined if \(m = n \).
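The componentwise definition translates directly into code. A minimal sketch in plain Python (the helper name `dot` is our own, not from the text), reproducing the example above and the rule that mismatched dimensions are not allowed:

```python
def dot(v, w):
    """Dot product of two vectors given as sequences of equal length."""
    if len(v) != len(w):
        raise ValueError("the dot product is only defined for vectors of the same length")
    return sum(a * b for a, b in zip(v, w))

# The worked example from the text: 5*2 + 3*3 + 4*0 + (-2)*1 = 17
v1 = [5, 3, 4, -2]
v2 = [2, 3, 0, 1]
assert dot(v1, v2) == 17
```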
The following properties hold for any vectors \(\vect{v},\vect{v}_1,\vect{v}_2,\vect{v}_3 \) in \(\mathbb{R}^n \) and scalars \(c \in \mathbb{R} \):
- \(\vect{v}_1\ip\vect{v}_2 = \vect{v}_2\ip\vect{v}_1 \).
- \((c\vect{v}_1)\ip\vect{v}_2 = c(\vect{v}_1\ip\vect{v}_2) = \vect{v}_1\ip(c \vect{v}_2) \).
- \((\vect{v}_1+\vect{v}_2)\ip\vect{v}_3 = \vect{v}_1\ip\vect{v}_3+\vect{v}_2\ip\vect{v}_3 \).
- \(\vect{v}\ip\vect{v} \geq 0 \), and \(\vect{v}\ip\vect{v} = 0 \iff \vect{v} = \vect{0} \).
Proof
The first three properties follow from the corresponding properties of real numbers. For instance, for the first rule we simply use that \(xy = yx \) holds for the product of real numbers.
- Let \[ \vect{v}_1= \left[\begin{array}{r} a_1\\a_2\\ \vdots\\ a_n\end{array}\right] \quad \text{and}\quad \vect{v}_2= \left[\begin{array}{r} b_1 \\ b_2 \\ \vdots \\ b_n \end{array}\right] \nonumber \nonumber\] be two arbitrary vectors in \(\mathbb{R}^n \). Then \begin{eqnarray*} \vect{v}_1\ip\vect{v}_2 &=& \left[\begin{array}{r}a_{1}\\a_{2}\\ \vdots\\a_{n}\end{array}\right]\ip \left[\begin{array}{r}b_{1}\\b_{2}\\ \vdots\\b_{n}\end{array}\right] = a_1b_1 +a_2b_2+ \ldots + a_nb_n \\ &=& b_1a_1 +b_2a_2+ \ldots + b_na_n = \left[\begin{array}{r}b_{1}\\b_{2}\\ \vdots\\b_{n}\end{array}\right]\ip \left[\begin{array}{r}a_{1}\\a_{2}\\ \vdots\\a_{n}\end{array}\right] = \vect{v}_2\ip\vect{v}_1. \end{eqnarray*}
- Taking \(\vect{v}_1 \), \(\vect{v}_2 \) as before: \begin{eqnarray*} (c\vect{v}_1)\ip\vect{v}_2 &=& \left[\begin{array}{r}ca_{1}\\ca_{2}\\ \vdots\\ca_{n}\end{array}\right]\ip \left[\begin{array}{r}b_{1}\\b_{2}\\ \vdots\\b_{n}\end{array}\right]\\ &=& (ca_1)b_1 + (ca_2)b_2+ \ldots + (ca_n)b_n \\ &=& c (a_1b_1 +a_2b_2+ \ldots + a_nb_n) = c (\vect{v}_1\ip\vect{v}_2). \end{eqnarray*} The remaining equality \(c(\vect{v}_1\ip\vect{v}_2) = \vect{v}_1\ip(c\vect{v}_2) \) follows in the same way.
- Is proved in the same way as (ii).
- \(\vect{v}\ip\vect{v} = a_1a_1 +a_2a_2+ \ldots + a_na_n = a_1^2+a_2^2 + \ldots + a_n^2 \) is the sum of squares of real numbers, so it is nonnegative. It only becomes 0 if all the squares are 0, which only happens if each entry \(a_i \) is equal to zero, that is, if \(\vect{v} = \vect{0} \).
Prove property (iii).
Prove the identity \[ (\vect{v}_1+\vect{v}_2)\ip(\vect{v}_1-\vect{v}_2) = \vect{v}_1\ip\vect{v}_1-\vect{v}_2\ip\vect{v}_2. \nonumber \nonumber\]
Prove the identity \[ \norm{\vect{v}_1+\vect{v}_2}^2 + \norm{\vect{v}_1-\vect{v}_2}^2 = 2 (\norm{\vect{v}_1}^2 + \norm{\vect{v}_2}^2), \nonumber \nonumber\] and explain why it is called the parallelogram rule.
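Identities such as the parallelogram rule are easy to sanity-check numerically before proving them. A quick sketch in plain Python, with a pair of vectors of our own choosing (the helper name `dot` is ours):

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

# Two arbitrary vectors in R^3 (our own choice, not from the text)
v1 = [1, -2, 3]
v2 = [4, 0, -1]
s = [a + b for a, b in zip(v1, v2)]   # v1 + v2
d = [a - b for a, b in zip(v1, v2)]   # v1 - v2

# ||v1 + v2||^2 + ||v1 - v2||^2 should equal 2(||v1||^2 + ||v2||^2)
lhs = dot(s, s) + dot(d, d)
rhs = 2 * (dot(v1, v1) + dot(v2, v2))
assert lhs == rhs == 62
```

A numeric check like this is of course no substitute for the proof, but it catches sign errors quickly.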
Orthogonality
In \(\mathbb{R}^2 \) and \(\mathbb{R}^3 \) the dot product gives an easy way to check whether two vectors are perpendicular: \[ \vect{v}\perp\vect{w} \iff \vect{v}\ip\vect{w} = 0. \nonumber\] We use this identity to define the concept of perpendicularity in \(\mathbb{R}^n \). It seems a bit 'academic', but in this more general setting the term orthogonal is used.

Two vectors \(\vect{v} \) and \(\vect{w} \) in \(\mathbb{R}^n \) are called orthogonal if \(\vect{v}\ip\vect{w} = 0 \). As before, we denote this by \(\vect{v}\perp\vect{w} \).
Let \(\vect{u} = \left[\begin{array}{r} 1\\2\\-1\\-1\end{array}\right] \), \(\vect{v} = \left[\begin{array}{r} 3\\-1\\2\\-1\end{array}\right] \), \(\vect{w} = \left[\begin{array}{r} 2\\2\\-1\\2\end{array}\right] \). We compute \[ \vect{u}\ip\vect{v} = 3-2-2+1 = 0, \nonumber\] \[ \vect{u}\ip\vect{w} = 2+4+1-2 = 5, \nonumber\] \[ \vect{v}\ip\vect{w} = 6 - 2 - 2 - 2 = 0, \nonumber\] and conclude: \(\vect{u} \) and \(\vect{v} \) are orthogonal, \(\vect{u} \) and \(\vect{w} \) are not orthogonal, and \(\vect{v} \) and \(\vect{w} \) are orthogonal. In \(\mathbb{R}^2 \), two nonzero vectors that are orthogonal to the same nonzero vector \(\vect{v} \) are automatically multiples of each other (i.e., have either the same or the opposite direction). In \(\mathbb{R}^n \) with \(n \geq 3 \) this no longer holds: in this example both \(\vect{u} \) and \(\vect{w} \) are orthogonal to the vector \(\vect{v} \), but \(\vect{u} \neq c\vect{w} \) for every scalar \(c \).
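Checking orthogonality in code is just a zero test on the dot product. A minimal sketch reproducing the computations above (helper names `dot` and `orthogonal` are our own):

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def orthogonal(v, w):
    """Two vectors are orthogonal iff their dot product is zero (exact for integers)."""
    return dot(v, w) == 0

u = [1, 2, -1, -1]
v = [3, -1, 2, -1]
w = [2, 2, -1, 2]
assert orthogonal(u, v)        # u . v = 3 - 2 - 2 + 1 = 0
assert not orthogonal(u, w)    # u . w = 2 + 4 + 1 - 2 = 5
assert orthogonal(v, w)        # v . w = 6 - 2 - 2 - 2 = 0
```

With floating-point entries one would test `abs(dot(v, w)) < tol` instead of exact equality to zero.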
Suppose \(\vect{v} \in \mathbb{R}^n \). Then \(\vect{v}\perp\vect{v} \iff \vect{v} = \vect{0} \).
Proof
By definition \[ \vect{v}\perp\vect{v} \iff \vect{v}\ip\vect{v}=0 \nonumber \nonumber\] In Proposition \(\PageIndex{3}\) (iv) we already showed that the last equality only holds for \(\vect{v} = \vect{0} \).
Let \(\vect{n} \) be any nonzero vector in the plane. The vectors that are orthogonal to \(\vect{n} \) all lie on a line through the origin. (See Figure \(\PageIndex{5}\).) If we agree that \(\vect{0}\perp\vect{n} \), it will be the whole line. The vector \(\vect{n} \) is often said to be a normal vector to the line.
Figure \(\PageIndex{5}\): Vectors orthogonal to a non-zero vector \(\vect{n} \) in the plane
The orthogonal projection of a vector \(\vect{w} \) onto the nonzero vector \(\vect{v} \) is the vector \(\vect{\hat{w}} = c\vect{v} \) for which \[ (\vect{w} - \vect{\hat{w}}) \perp \vect{v}. \nonumber \nonumber\] Another notation for this vector: \[ \vect{\hat{w}} = \text{proj}_{\vect{v}}(\vect{w}). \nonumber \nonumber\]
Figure \(\PageIndex{6}\): Projection of a vector \(\vect{w} \) onto a non-zero vector \(\vect{v} \)
In the definition above the vector \(\vect{\hat{w}} \) with these properties is unique and it is given by \[ \text{proj}_{\vect{v}}(\vect{w}) = \vect{\hat{w}} = \frac{\vect{w}\ip\vect{v}}{\vect{v}\ip\vect{v}} \vect{v}. \nonumber \nonumber\]
Proof
With the rules of the dot product the vector \(\vect{\hat{w}} \) is easily constructed: starting from \[ \vect{\hat{w}} = t\vect{v}, \text{ for some } t\in\mathbb{R} \nonumber\] and \[ (\vect{w} - \vect{\hat{w}}) \perp \vect{v} \nonumber\] it follows that we must have \[ (\vect{w} - t\vect{v}) \ip \vect{v} = \vect{w}\ip \vect{v} - t (\vect{v}\ip \vect{v}) = 0, \nonumber\] so that \(t \) is uniquely given by \[ t = \frac{\vect{w}\ip \vect{v}}{\vect{v}\ip \vect{v}} \nonumber\] and indeed \(\vect{\hat{w}} \) must be as stated.
We compute the orthogonal projection of the vector \[ \vect{w} = \left[\begin{array}{r} 2\\ -4 \\ -1 \\ -5\end{array}\right] \nonumber \nonumber\] onto the vector \[ \vect{v} = \left[\begin{array}{r} 1 \\1\\1\\1\end{array}\right]. \nonumber \nonumber\] As follows \[ \vect{\hat{w}} = \text{proj}_{\vect{v}}(\vect{w}) = \frac{\vect{w}\ip\vect{v}}{\vect{v}\ip\vect{v}} \vect{v} = \frac{-8}{4} \left[\begin{array}{r} 1 \\1\\1\\1\end{array}\right] = \left[\begin{array}{r} -2\\-2\\-2\\-2\end{array}\right]. \nonumber \nonumber\] We verify the orthogonality: \[ (\vect{w} - \vect{\hat{w}} )\ip \vect{v} = \left[\begin{array}{r} 4 \\-2\\1\\-3\end{array}\right] \ip \left[\begin{array}{r} 1 \\1\\1\\1\end{array}\right] = 4-2+1-3 = 0, \nonumber \nonumber\] so indeed \[ (\vect{w} - \vect{\hat{w}} )\perp \vect{v}, \nonumber \nonumber\] as required.
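The projection formula can be checked in code as well. A sketch using exact rational arithmetic via `fractions.Fraction`, so the orthogonality test is exact (the helper names `dot` and `proj` are our own):

```python
from fractions import Fraction

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def proj(w, v):
    """Orthogonal projection of w onto a nonzero vector v, in exact arithmetic."""
    t = Fraction(dot(w, v), dot(v, v))
    return [t * a for a in v]

# The worked example from the text
w = [2, -4, -1, -5]
v = [1, 1, 1, 1]
w_hat = proj(w, v)
assert w_hat == [-2, -2, -2, -2]   # (w . v)/(v . v) = -8/4 = -2
# (w - w_hat) is orthogonal to v, as required
residual = [a - b for a, b in zip(w, w_hat)]
assert dot(residual, v) == 0
```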
Find the projection \(\vect{w} = \text{proj}_{\vect{v}}(\vect{u}) \) of the vector \[ \vect{u} = \left[\begin{array}{r} 11\\ 8 \\ -5\end{array}\right] \text{ onto the vector } \vect{v} = \left[\begin{array}{r} 1\\2 \\ -3\end{array}\right], \nonumber \nonumber\] and show that \[ \norm{\vect{w}} \leq \norm{\vect{u}}. \nonumber \nonumber\]
Suppose \(\text{proj}_{\vect{v}}(\vect{w}_1) = \text{proj}_{\vect{v}}(\vect{w}_2) \), for three vectors \(\vect{v}, \vect{w}_1, \vect{w}_2 \) in \(\mathbb{R}^n \). What does this say about the relative positions of the three vectors? Verify your statement for the following three vectors \[ \vect{v} = \left[\begin{array}{r} 1\\ 1 \\ -2 \\ -3\end{array}\right], \quad \vect{w}_1 = \left[\begin{array}{r} 6\\ 4 \\ -7 \\ -7\end{array}\right], \quad \vect{w}_2 = \left[\begin{array}{r} 5\\ 6 \\ -2 \\ -10\end{array}\right]. \nonumber \nonumber\]
Norm in \(\mathbb{R}^n \)
The length of a vector in the plane can be computed using the dot product: for \(\vect{v}= \left[\begin{array}{r}a_{1}\\a_{2}\end{array}\right] \) in \(\mathbb{R}^2 \) we have seen that \[ \norm{\vect{v}} = \sqrt{a_1^2 + a_2^2} = \sqrt{\vect{v}\ip\vect{v}}. \nonumber\] The identity \(\norm{\vect{v}} = \sqrt{\vect{v}\ip\vect{v}} \) also holds in \(\mathbb{R}^3 \). It seems natural to extend the concept to \(\mathbb{R}^n \). Again, for this more general space a new word is introduced:

The norm of a vector \(\vect{v} \) in \(\mathbb{R}^n \), denoted by \(\norm{\vect{v}} \), is defined by \[ \norm{\vect{v}} = \sqrt{\vect{v}\ip\vect{v} }. \nonumber\]
For any \(\vect{v}, \vect{w} \in \mathbb{R}^{n} \) and all \(c \in \mathbb{R} \) the following holds:
- \(\norm{\vect{v}}\geq 0 \);
- \(\norm{c\vect{v}} = |c|\norm{\vect{v}}\quad \) (scaling property);
- \(\norm{\vect{v}+\vect{w}} \leq \norm{\vect{v}}+\norm{\vect{w}}\quad \) (triangle inequality).
Figure \(\PageIndex{7}\): The Triangle Inequality
We compute the norms of the vectors \[ \vect{v} = \left[\begin{array}{r} 1 \\ -2 \\ 3 \\ -1 \end{array}\right] \quad \text{and} \quad -2\vect{v} = \left[\begin{array}{r} -2 \\ 4 \\ -6 \\ 2 \end{array}\right]. \nonumber\] As follows: \[ \norm{\vect{v}} = \sqrt{1^2 + (-2)^2 + 3^2 + (-1)^2 } = \sqrt{15} \nonumber\] and \[ \norm{-2\vect{v}} = \sqrt{(-2)^2 + 4^2 + (-6)^2 + 2^2 } = \sqrt{60} = 2\sqrt{15}. \nonumber\] The last norm can also be found via the scaling property: \[ \norm{-2\vect{v}} = |-2|\cdot\norm{\vect{v}} = 2\sqrt{15}. \nonumber\]
A unit vector is a vector of norm 1. Moreover, for any nonzero vector \(\vect{v} \), the vector \[ \vect{u} = \frac{\vect{v}}{\norm{\vect{v}}} \nonumber \nonumber\] is called the unit vector in the direction of \(\vect{v} \).
For a nonzero vector \(\vect{v} \) \[ \frac{\vect{v}}{\norm{\vect{v}}} \nonumber\] is the unique vector \(\vect{u} \) of norm 1 such that \[ \vect{u} = k\vect{v}, \text{ for some } k > 0. \nonumber \nonumber\]
Proof
Assume that \(\vect{v} \neq \vect{0} \). For \(\vect{u} = k\vect{v} \), with \(\norm{\vect{u}} = 1 \) and \(k > 0 \), we must have \[ \norm{\vect{u}} = \norm{k\vect{v}} = |k|\norm{\vect{v}} = k\norm{\vect{v}} = 1. \nonumber\] We see that \[ k = \dfrac{1}{\norm{\vect{v}}} \nonumber\] and consequently \[ \vect{u} = k\vect{v} = \frac{\vect{v}}{\norm{\vect{v}}}. \nonumber\]
We compute the unit vector \(\vect{u} \) in the direction of the vector \(\vect{v} = \left[\begin{array}{r}1 \\ 2 \\ 4 \\ -2 \end{array}\right] \) in \(\mathbb{R}^4 \). As follows: \[ \norm{\vect{v}} = \sqrt{1^2+2^2+4^2+(-2)^2} = \sqrt{25} = 5 \quad \quad \Longrightarrow\quad \vect{u} = \dfrac{1}{5} \left[\begin{array}{r}1 \\ 2 \\ 4 \\ -2 \end{array}\right] = \left[\begin{array}{r}1/5 \\ 2/5 \\ 4/5 \\ -2/5 \end{array}\right]. \nonumber \nonumber\]
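Normalizing a vector follows the same recipe in code. A minimal sketch reproducing this example (the helper names `norm` and `unit` are ours):

```python
import math

def norm(v):
    """Norm of a vector: sqrt of the dot product of v with itself."""
    return math.sqrt(sum(a * a for a in v))

def unit(v):
    """Unit vector in the direction of a nonzero vector v."""
    n = norm(v)
    return [a / n for a in v]

v = [1, 2, 4, -2]
u = unit(v)
assert norm(v) == 5.0
assert math.isclose(norm(u), 1.0)  # u has norm 1, as a unit vector should
```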
For any two vectors \(\vect{v} \) and \(\vect{w} \) in \(\mathbb{R}^n \) we have \[ \norm{\vect{v}+\vect{w}}^2 = \norm{\vect{v}}^2 + \norm{\vect{w}}^2 \iff \vect{v} \perp \vect{w}. \nonumber \nonumber\]
Proof
This follows quite straightforwardly from the properties of the dot product: Let us start from the identity on the left and work our way to the conclusion on the right, making sure that each step is reversible. Note that from the definition of the norm it follows immediately that \(\norm{\vect{v}}^2 = \vect{v}\ip\vect{v} \). \[ \begin{array}{cl} &\norm{\vect{v}+\vect{w}}^2 = \norm{\vect{v}}^2 + \norm{\vect{w}}^2 \\ \iff &(\vect{v}+\vect{w})\ip(\vect{v}+\vect{w}) = \vect{v}\ip\vect{v} + \vect{w}\ip\vect{w} \\ \iff&\vect{v}\ip\vect{v} + \vect{v}\ip\vect{w}+\vect{w}\ip\vect{v}+ \vect{w}\ip\vect{w} = \vect{v}\ip\vect{v} + \vect{w}\ip\vect{w}. \end{array} \nonumber \nonumber\] Next we subtract \(\vect{v}\ip\vect{v} + \vect{w}\ip\vect{w} \) from both sides. Thus the last identity is equivalent to \[ \begin{array}{rcl} \vect{v}\ip\vect{w}+\vect{w}\ip\vect{v} = 0 &\iff& 2\vect{v}\ip\vect{w} = 0\\ &\iff& \vect{v}\ip\vect{w}= 0\\ &\iff& \vect{v}\perp\vect{w}. \end{array} \nonumber \nonumber\]
We verify the equality for the vectors \(\vect{v} = \left[\begin{array}{r} 2 \\ -3\\ 3 \\ 1 \end{array}\right] \) and \(\vect{w} = \left[\begin{array}{r} 2 \\ 4 \\ 1 \\ 5 \end{array}\right] \) in \(\mathbb{R}^4 \): First of all \[ \vect{v} \ip \vect{w} = 4 - 12 + 3 + 5 = 0, \nonumber \nonumber\] so \(\vect{v}\perp \vect{w} \), and second \[ \norm{\vect{v}} = \sqrt{2^2 + (-3)^2 + 3^2 + 1^2} = \sqrt{23}, \quad \norm{\vect{w}} = \sqrt{2^2 + 4^2 + 1^2 + 5^2} = \sqrt{46} \nonumber \nonumber\] Furthermore \[ \vect{v}+\vect{w} = \left[\begin{array}{r} 4 \\ 1 \\ 4 \\ 6 \end{array}\right] \Longrightarrow \norm{\vect{v}+\vect{w}} = \sqrt{4^2+1^2+4^2+6^2} = \sqrt{69} \nonumber \nonumber\] and we see that indeed \[ \norm{\vect{v}+\vect{w}}^2 = 69 = 23 + 46 = \norm{\vect{v}}^2+\norm{\vect{w}}^2. \nonumber \nonumber\]
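The Pythagorean identity above can be verified mechanically; a short sketch in plain Python with the same pair of vectors (the helper name `dot` is ours):

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

# The vectors from the verification above
v = [2, -3, 3, 1]
w = [2, 4, 1, 5]
s = [a + b for a, b in zip(v, w)]            # v + w

assert dot(v, w) == 0                        # v is orthogonal to w
assert dot(s, s) == dot(v, v) + dot(w, w)    # 69 = 23 + 46
```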
For any two vectors \(\vect{v} \) and \(\vect{w} \) in \(\mathbb{R}^n \) the Cauchy-Schwarz Inequality holds: \[ |\vect{v}\ip\vect{w}| \leq \norm{\vect{v}} \norm{\vect{w}}. \nonumber\]
Proof
There are many ways to prove the Cauchy-Schwarz inequality. There is even a whole book devoted to it: "The Cauchy-Schwarz Master Class" by J.M. Steele. The following proof is based on orthogonal projection and Pythagoras' Theorem. If \(\vect{v} = \vect{0} \), the zero vector, then the inequality obviously holds; in fact it becomes an equality: \[ \vect{v} = \vect{0} \quad \Longrightarrow \quad \norm{\vect{v}} = 0 \quad \Longrightarrow \quad \norm{\vect{v}}\cdot\norm{\vect{w}} = 0 \nonumber\] and also \[ \vect{v} = \vect{0} \quad \Longrightarrow \quad \vect{v}\ip \vect{w} = 0 \quad \Longrightarrow \quad |\vect{v}\ip \vect{w}| = 0. \nonumber\] So now suppose \(\vect{v} \neq \vect{0} \). Let \[ \vect{\hat{w}} = \dfrac{\vect{w}\ip\vect{v}}{\vect{v}\ip\vect{v}} \vect{v} \nonumber\] be the projection of \(\vect{w} \) onto \(\vect{v} \). Then we can apply Pythagoras' Theorem: \[ (\vect{w} - \vect{\hat{w}}) \perp \vect{\hat{w}} \quad \Longrightarrow \quad \norm{\vect{w} - \vect{\hat{w}}}^2 + \norm{ \vect{\hat{w}}}^2 = \norm{(\vect{w} - \vect{\hat{w}}) + \vect{\hat{w}}}^2 = \norm{\vect{w}}^2. \nonumber\] It follows that \[ \norm{ \vect{\hat{w}}}^2 = \norm{\vect{w}}^2 - \norm{\vect{w} - \vect{\hat{w}}}^2 \leq \norm{\vect{w}}^2, \nonumber\] and by substituting the expression for \(\vect{\hat{w}} \) we arrive at \[ \left(\dfrac{\vect{w}\ip\vect{v}}{\vect{v}\ip\vect{v}}\right)^2 \norm{\vect{v}}^2 = \dfrac{(\vect{w}\ip\vect{v})^2}{(\vect{v}\ip\vect{v})^2} \norm{\vect{v}}^2 \leq \norm{\vect{w}}^2. \nonumber\] Using \[ \vect{v}\ip\vect{v} = \norm{\vect{v}}^2 \nonumber\] we conclude that \[ (\vect{w}\ip\vect{v})^2 \leq \norm{\vect{v}}^2\norm{\vect{w}}^2, \nonumber\] and taking square roots on both sides gives \(|\vect{v}\ip\vect{w}| \leq \norm{\vect{v}}\norm{\vect{w}} \), as asserted.
We verify that the inequality holds for the vectors \(\vect{v} = \left[\begin{array}{r} 1 \\ -2\\ 3 \\ -4 \end{array}\right] \) and \(\vect{w} = \left[\begin{array}{r} -5 \\ 4 \\-3 \\ 0 \end{array}\right] \) in \(\mathbb{R}^4 \). As follows: \[ \vect{v}\ip\vect{w} = -5-8-9 = -22, \quad \norm{\vect{v}} = \sqrt{30}, \quad \norm{\vect{w}} = \sqrt{50}, \nonumber\] and we see that indeed \[ |\vect{v}\ip\vect{w}| = 22 \leq \norm{\vect{v}} \norm{\vect{w}} = \sqrt{1500}. \nonumber\]
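The same verification in code, comparing \(|\vect{v}\ip\vect{w}| \) against \(\norm{\vect{v}}\norm{\vect{w}} \) (the helper name `dot` is ours):

```python
import math

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

v = [1, -2, 3, -4]
w = [-5, 4, -3, 0]
lhs = abs(dot(v, w))                               # |v . w| = 22
rhs = math.sqrt(dot(v, v)) * math.sqrt(dot(w, w))  # sqrt(30) * sqrt(50) = sqrt(1500)
assert lhs == 22
assert lhs <= rhs
```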
For any two vectors in \(\mathbb{R}^n \): \[ \norm{\vect{v}+\vect{w}} \leq \norm{\vect{v}}+\norm{\vect{w}}. \nonumber\]
Proof
Since all terms involved are non-negative we may as well show that the inequality holds for the squares: \[ \begin{array}{l} \norm{\vect{v}+\vect{w}}^2 \leq (\norm{\vect{v}}+\norm{\vect{w}})^2 \\ \iff (\vect{v}+\vect{w})\ip(\vect{v}+\vect{w}) \leq \norm{\vect{v}}^2 + 2\norm{\vect{v}}\norm{\vect{w}} + \norm{\vect{w}}^2 \\ \iff \vect{v}\ip\vect{v} + 2\vect{v}\ip\vect{w}+\vect{w}\ip\vect{w} \leq \norm{\vect{v}}^2 + 2\norm{\vect{v}}\norm{\vect{w}} + \norm{\vect{w}}^2 \\ \iff 2\vect{v}\ip\vect{w} \leq 2\norm{\vect{v}}\norm{\vect{w}} \end{array} \nonumber\] and the last inequality holds by the Cauchy-Schwarz Inequality, since \(\vect{v}\ip\vect{w} \leq |\vect{v}\ip\vect{w}| \leq \norm{\vect{v}}\norm{\vect{w}} \).
We verify the inequality for the vectors \(\vect{v} = \left[\begin{array}{r} -1 \\ 2\\ 3 \end{array}\right] \) and \(\vect{w} = \left[\begin{array}{r} 4 \\ -4\\ 3 \end{array}\right] \): \[ \norm{\vect{v} + \vect{w}} = \sqrt{3^2+(-2)^2+6^2} =\sqrt{49} = 7 \nonumber\] and indeed \[ \norm{\vect{v}} + \norm{\vect{w}} = \sqrt{14} + \sqrt{41} > 3+5 = 8 > \norm{\vect{v} + \vect{w}}. \nonumber\]
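And the triangle inequality for this example, checked numerically (the helper name `norm` is ours):

```python
import math

def norm(v):
    return math.sqrt(sum(a * a for a in v))

v = [-1, 2, 3]
w = [4, -4, 3]
s = [a + b for a, b in zip(v, w)]   # v + w = (3, -2, 6)
assert norm(s) == 7.0
assert norm(s) <= norm(v) + norm(w)
```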
Angles in \(\mathbb{R}^n \)
The first motivation to consider the dot product came from the question of perpendicularity. We have seen that the length of a vector can also be computed using a dot product. Below we will show that the dot product can not only detect angles of \(\frac12\pi \) between vectors (namely, when the vectors are perpendicular), but that it is also possible to express the angle between any two nonzero vectors in terms of dot products.

Figure \(\PageIndex{8}\): Angle between two vectors
First we will show a geometrical characterization of the dot product that holds in \(\mathbb{R}^2 \) as well as in \(\mathbb{R}^3 \).
For two nonzero vectors \(\vect{v} \) and \(\vect{w} \) in either \(\mathbb{R}^2 \) or \(\mathbb{R}^3 \) the following identity holds: \[ \label{Eq:InnerProduct:GeometricDefinition} \vect{v}\ip\vect{w} = \norm{\vect{v}}\norm{\vect{w}} \cos(\varphi) \] where \(\varphi \) is the angle between \(\vect{v} \) and \(\vect{w} \). Note that this is in line with the special case of two perpendicular vectors: \[ \vect{v}\perp\vect{w} \iff \vect{v}\ip\vect{w}=0 \iff \cos(\varphi)=0. \nonumber\]
The angle between two nonzero vectors \(\vect{v} \) and \(\vect{w} \) is thus determined by dot products in the following way \[ \cos(\varphi) = \frac{\vect{w}\ip\vect{v}}{\norm{\vect{v}}\norm{\vect{w}}} \nonumber \nonumber\] so \[ \varphi = \arccos\left(\frac{\vect{w}\ip\vect{v}}{\norm{\vect{v}}\norm{\vect{w}}}\right) = \cos^{-1}\left(\frac{\vect{w}\ip\vect{v}}{\norm{\vect{v}}\norm{\vect{w}}}\right). \nonumber \nonumber\]
Proof
Now let us derive formula \eqref{Eq:InnerProduct:GeometricDefinition}. Assume that \(\vect{v} \) and \(\vect{w} \) are nonzero vectors. Recall the formula of the orthogonal projection \[ \vect{\hat{w}} = \dfrac{\vect{w}\ip\vect{v}}{\vect{v}\ip\vect{v}}\vect{v}. \nonumber\] Let \(\varphi \in [0,\pi] \) denote the angle between the two nonzero vectors \(\vect{v} \) and \(\vect{w} \). From Figure \(\PageIndex{8}\) it is clear that the factor \[ \dfrac{\vect{w}\ip\vect{v}}{\vect{v}\ip\vect{v}} \nonumber\] is positive if the angle is acute, zero if the angle is right, and negative if the angle is obtuse. In the case of an acute angle, by considering the right triangle \(\Delta OAB \), where \(A \) is the end point of \(\vect{\hat{w}} \) and \(B \) the end point of \(\vect{w} \), we see that on the one hand \[ OA = \norm{\dfrac{\vect{w}\ip\vect{v}}{\vect{v}\ip\vect{v}}\vect{v}} = \dfrac{|\vect{w}\ip\vect{v}|}{\vect{v}\ip\vect{v}}\norm{\vect{v}} = \dfrac{\vect{w}\ip\vect{v}}{\norm{\vect{v}}^2} \norm{\vect{v}} = \dfrac{\vect{w}\ip\vect{v}}{\norm{\vect{v}}} \nonumber\] (the absolute value may be dropped since the dot product is positive for an acute angle) and on the other hand \[ OA = OB\cos(\varphi) = \norm{\vect{w}}\cos(\varphi). \nonumber\] So we may conclude that \[ \vect{w}\ip\vect{v} = \norm{\vect{v}}\norm{\vect{w}}\cos(\varphi). \label{Eq:InnerProduct:GeometricInterpretation} \] In the case of an obtuse angle, we use that the projection of \(\vect{w} \) onto \(\vect{v} \) is equal to the projection of \(\vect{w} \) onto \(-\vect{v} \), as it is in fact the projection onto the line consisting of all multiples of \(\vect{v} \). Now look at the picture on the right of Figure \(\PageIndex{8}\). 
There you see that \(\vect{w} \) and \(-\vect{v} \) make an acute angle \(\psi = \pi - \varphi \), so we can apply Equation \eqref{Eq:InnerProduct:GeometricInterpretation} to \(\vect{w} \) and \(-\vect{v} \): \[ \begin{array}{rcl} \vect{w}\ip\vect{v} = - \vect{w}\ip(-\vect{v}) &=& -\norm{\vect{w}}\norm{-\vect{v}}\cos(\psi) \\ &=& -\norm{\vect{w}}\norm{\vect{v}}\cos(\pi-\varphi) \\ &=& \norm{\vect{w}}\norm{\vect{v}}\cos(\varphi). \end{array} \nonumber\]
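Putting the angle formula to work in code is essentially a one-liner around the arccosine. A sketch in plain Python (the helper names `dot` and `angle` are ours; the clamp guards against rounding pushing the cosine slightly outside \([-1,1]\)):

```python
import math

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def angle(v, w):
    """Angle (in radians) between two nonzero vectors."""
    c = dot(v, w) / (math.sqrt(dot(v, v)) * math.sqrt(dot(w, w)))
    # Clamp against floating-point drift before taking the arccosine.
    return math.acos(max(-1.0, min(1.0, c)))

assert math.isclose(angle([1, 0], [0, 1]), math.pi / 2)  # perpendicular vectors
assert math.isclose(angle([1, 0], [1, 1]), math.pi / 4)
assert math.isclose(angle([1, 0], [-1, 0]), math.pi)     # opposite directions
```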