14.5: Directional Derivatives
We still have not answered one of our first questions about the steepness of a surface: starting at a point on a surface given by \(f(x,y)\), and walking in a particular direction, how steep is the surface? We are now ready to answer the question.
We already know roughly what has to be done: as shown in Figure 14.3.1, we extend a line in the \(x\)-\(y\) plane to a vertical plane, and we then compute the slope of the curve that is the cross-section of the surface in that plane. The major stumbling block is that what appears in this plane to be the horizontal axis, namely the line in the \(x\)-\(y\) plane, is not an actual axis: we know nothing about the "units" along the axis. Our goal is to make this line into a \(t\) axis; then we need formulas to write \(x\) and \(y\) in terms of this new variable \(t\); then we can write \(z\) in terms of \(t\) since we know \(z\) in terms of \(x\) and \(y\); and finally we can simply take the derivative.
So we need to somehow "mark off" units on the line, and we need a convenient way to refer to the line in calculations. It turns out that we can accomplish both by using the vector form of a line. Suppose that \({\bf u}\) is a unit vector \(\langle u_1,u_2\rangle\) in the direction of interest. A vector equation for the line through \((x_0,y_0)\) in this direction is \({\bf v}(t)=\langle u_1t+x_0,u_2t+y_0\rangle\). The height of the surface above the point \((u_1t+x_0,u_2t+y_0)\) is \(g(t)=f(u_1t+x_0,u_2t+y_0)\). Because \({\bf u}\) is a unit vector, the value of \(t\) is precisely the distance along the line from \((x_0,y_0)\) to \((u_1t+x_0,u_2t+y_0)\); this means that the line is effectively a \(t\) axis, with origin at the point \((x_0,y_0)\), so the slope we seek is
\[\eqalign{ g'(0)&=\langle f_x(x_0,y_0),f_y(x_0,y_0)\rangle\cdot \langle u_1,u_2\rangle\cr &=\langle f_x,f_y\rangle\cdot{\bf u}\cr &=\nabla f\cdot {\bf u}.\cr }\]
Here we have used the chain rule and the derivatives \({d\over dt}(u_1t+x_0)=u_1\) and \({d\over dt}(u_2t+y_0)=u_2\). The vector \(\langle f_x,f_y\rangle\) is very useful, so it has its own symbol, \(\nabla f\), pronounced "del f"; it is also called the gradient of \(f\).
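As a rough numerical check of this derivation, the following Python sketch compares the two ways of getting the slope: differentiating \(g(t)=f(u_1t+x_0,u_2t+y_0)\) at \(t=0\), and taking the dot product \(\nabla f\cdot{\bf u}\). The helper names, the test surface, the base point, and the step size `h` are all illustrative assumptions, and the derivatives are only central-difference estimates.

```python
import math

def unit(direction):
    """Scale a 2D direction vector to length 1."""
    norm = math.hypot(direction[0], direction[1])
    return (direction[0] / norm, direction[1] / norm)

def gradient(f, x, y, h=1e-6):
    """Central-difference estimate of <f_x, f_y> at (x, y)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return (fx, fy)

def directional_derivative(f, x0, y0, direction, h=1e-6):
    """Slope of f at (x0, y0) along `direction`, computed as grad f . u."""
    u1, u2 = unit(direction)
    fx, fy = gradient(f, x0, y0, h)
    return fx * u1 + fy * u2

def slope_along_line(f, x0, y0, direction, h=1e-6):
    """g'(0) for g(t) = f(u1*t + x0, u2*t + y0), by central difference."""
    u1, u2 = unit(direction)
    g = lambda t: f(u1 * t + x0, u2 * t + y0)
    return (g(h) - g(-h)) / (2 * h)

# Arbitrary smooth test surface (an assumption, not from the text).
f = lambda x, y: x * y + math.sin(x)

via_gradient = directional_derivative(f, 0.5, -1.0, (1.0, 2.0))
via_line = slope_along_line(f, 0.5, -1.0, (1.0, 2.0))
print(via_gradient, via_line)  # the two estimates should agree closely
```

Normalizing the direction inside `unit` matters: without it, \(t\) would no longer measure distance along the line and the result would be scaled by the length of the direction vector.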
Example \(\PageIndex{1}\)
Find the slope of \(z=x^2+y^2\) at \((1,2)\) in the direction of the vector \(\langle 3,4\rangle\).
Solution
We first compute the gradient at \((1,2)\): \(\nabla f=\langle 2x,2y\rangle\), which is \(\langle 2,4\rangle\) at \((1,2)\). A unit vector in the desired direction is \(\langle 3/5,4/5\rangle\), and the desired slope is then
\[\langle 2,4\rangle\cdot\langle 3/5,4/5\rangle=6/5+16/5=22/5.\nonumber\]
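The arithmetic in this example can be reproduced exactly with Python's `fractions` module. This is only a sketch: the function name `grad_f` is an assumption, and the length of \(\langle 3,4\rangle\) is hard-coded as 5 so that everything stays rational.

```python
from fractions import Fraction

def grad_f(x, y):
    """Gradient of f(x, y) = x^2 + y^2."""
    return (2 * x, 2 * y)

direction = (3, 4)
length = 5  # sqrt(3^2 + 4^2); an integer here, so the arithmetic stays exact

# Unit vector in the direction of <3, 4>.
u = (Fraction(direction[0], length), Fraction(direction[1], length))

g = grad_f(1, 2)                   # <2, 4>
slope = g[0] * u[0] + g[1] * u[1]  # dot product of gradient and unit vector
print(slope)                       # prints 22/5
```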
Example \(\PageIndex{2}\)
Find a tangent vector to \(z=x^2+y^2\) at \((1,2)\) in the direction of the vector \(\langle 3,4\rangle\) and show that it is parallel to the tangent plane at that point.
Solution
Since \(\langle 3/5,4/5\rangle\) is a unit vector in the desired direction, we can easily expand it to a tangent vector simply by adding the third coordinate computed in the previous example: \(\langle 3/5,4/5,22/5\rangle\). To see that this vector is parallel to the tangent plane, we can compute its dot product with a normal to the plane. We know that a normal to the tangent plane is
\[\langle f_x(1,2),f_y(1,2),-1\rangle = \langle 2,4,-1\rangle, \nonumber\]
and the dot product is
\[\langle 2,4,-1\rangle\cdot\langle 3/5,4/5,22/5\rangle=6/5+16/5-22/5=0 \nonumber\]
so the two vectors are perpendicular. (Note that the vector normal to the surface, namely \(\langle f_x,f_y,-1\rangle\), is simply the gradient with a \(-1\) tacked on as the third component.)
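The same check can be carried out in exact arithmetic. The sketch below, with variable names chosen purely for illustration, rebuilds the tangent vector from the unit direction and the slope and then takes its dot product with the normal.

```python
from fractions import Fraction

u = (Fraction(3, 5), Fraction(4, 5))     # unit vector along <3, 4>
grad = (2, 4)                            # gradient of x^2 + y^2 at (1, 2)
slope = grad[0] * u[0] + grad[1] * u[1]  # directional derivative from Example 1

# One unit of horizontal travel along u raises the tangent line by `slope`.
tangent = (u[0], u[1], slope)
normal = (grad[0], grad[1], -1)          # normal to the tangent plane

dot = sum(n * t for n, t in zip(normal, tangent))
print(tangent, dot)  # a zero dot product means the vectors are perpendicular
```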
The slope of a surface given by \(z=f(x,y)\) in the direction of a (two-dimensional) vector \(\bf u\) is called the directional derivative of \(f\), written \(D_{\bf u}f\). The directional derivative immediately provides us with some additional information. We know that
\[D_{\bf u}f=\nabla f\cdot {\bf u}=|\nabla f||{\bf u}|\cos\theta= |\nabla f|\cos\theta\]
if \(\bf u\) is a unit vector; \(\theta\) is the angle between \(\nabla f\) and \(\bf u\). This tells us immediately that the largest value of \(D_{\bf u}f\) occurs when \(\cos\theta=1\), namely, when \(\theta=0\), so \(\nabla f\) is parallel to \(\bf u\). In other words, the gradient \(\nabla f\) points in the direction of steepest ascent of the surface, and \(|\nabla f|\) is the slope in that direction. Likewise, the smallest value of \(D_{\bf u}f\) occurs when \(\cos\theta=-1\), namely, when \(\theta=\pi\), so \(\nabla f\) is anti-parallel to \(\bf u\). In other words, \(-\nabla f\) points in the direction of steepest descent of the surface, and \(-|\nabla f|\) is the slope in that direction.
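To see this claim numerically, one can sweep the unit vector \({\bf u}=\langle\cos\theta,\sin\theta\rangle\) through a full turn and record the slope in each direction. The sketch below does this for \(z=x^2+y^2\) at \((1,2)\). The sample count `n` and the variable names are assumptions, and here \(\theta\) is measured from the positive \(x\) axis rather than from \(\nabla f\).

```python
import math

def grad_f(x, y):
    """Gradient of f(x, y) = x^2 + y^2."""
    return (2 * x, 2 * y)

gx, gy = grad_f(1.0, 2.0)

n = 3600  # number of sampled directions (0.1 degree apart)
slopes = []
for k in range(n):
    theta = 2 * math.pi * k / n
    # Directional derivative along u = <cos(theta), sin(theta)>.
    slopes.append((gx * math.cos(theta) + gy * math.sin(theta), theta))

best_slope, best_theta = max(slopes)    # steepest ascent among the samples
worst_slope, worst_theta = min(slopes)  # steepest descent among the samples

grad_norm = math.hypot(gx, gy)          # |grad f|
grad_angle = math.atan2(gy, gx)         # direction of grad f

print(best_slope, grad_norm)            # nearly equal
print(best_theta, grad_angle)           # nearly equal
print(worst_slope, worst_theta - math.pi)
```

The sampled maximum only approximates \(|\nabla f|\) because the gradient direction generally falls between two sampled angles.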
Example \(\PageIndex{3}\)
Investigate the direction of steepest ascent and descent for \(z=x^2+y^2\).
Solution
The gradient is \(\langle 2x,2y\rangle=2\langle x,y\rangle\); this is a vector parallel to the vector \(\langle x,y\rangle\), so the direction of steepest ascent is directly away from the origin, starting at the point \((x,y)\). The direction of steepest descent is thus directly toward the origin from \((x,y)\). Note that at \((0,0)\) the gradient vector is \(\langle 0,0\rangle\), which has no direction, and it is clear from the plot of this surface that there is a minimum point at the origin, and tangent vectors in all directions are parallel to the \(x\)-\(y\) plane.
If \(\nabla f\) is perpendicular to \(\bf u\), \(D_{\bf u}f=|\nabla f|\cos(\pi/2)=0\), since \(\cos(\pi/2)=0\). This means that in either of the two directions perpendicular to \(\nabla f\), the slope of the surface is 0; this implies that a vector in either of these directions is tangent to the level curve at that point. Starting with \(\nabla f=\langle f_x,f_y\rangle\), it is easy to find a vector perpendicular to it: either \(\langle f_y,-f_x\rangle\) or \(\langle -f_y,f_x\rangle\) will work.
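A short sketch, again using \(f(x,y)=x^2+y^2\) at \((1,2)\) as an assumed example, illustrates both statements: the two candidate vectors have zero dot product with the gradient, and a small step along one of them barely changes \(f\), while the same step along the gradient changes it noticeably.

```python
import math

def f(x, y):
    return x**2 + y**2

def grad_f(x, y):
    return (2 * x, 2 * y)

x0, y0 = 1.0, 2.0
fx, fy = grad_f(x0, y0)

# The two perpendicular candidates from the text: <f_y, -f_x> and <-f_y, f_x>.
perps = [(fy, -fx), (-fy, fx)]
dots = [fx * px + fy * py for px, py in perps]

step = 1e-4                 # small step length (illustrative choice)
norm = math.hypot(fx, fy)   # the perpendiculars have this same length
px, py = perps[0]

# Change in f after a small step tangent to the level curve...
change_tangent = f(x0 + step * px / norm, y0 + step * py / norm) - f(x0, y0)
# ...compared with the same step in the gradient direction.
change_gradient = f(x0 + step * fx / norm, y0 + step * fy / norm) - f(x0, y0)

print(dots)
print(change_tangent, change_gradient)
```

The tangent step changes \(f\) only at second order in the step length, which is what "slope 0" means here.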
If \(f(x,y,z)\) is a function of three variables, all the calculations proceed in essentially the same way. The rate at which \(f\) changes in a particular direction is \(\nabla f\cdot{\bf u}\), where now \(\nabla f=\langle f_x,f_y,f_z\rangle\) and \({\bf u}=\langle u_1,u_2,u_3\rangle\) is a unit vector. Again \(\nabla f\) points in the direction of maximum rate of increase, \(-\nabla f\) points in the direction of maximum rate of decrease, and any vector perpendicular to \(\nabla f\) is tangent to the level surface \(f(x,y,z)=k\) at the point in question. Of course there are no longer just two such vectors; the vectors perpendicular to \(\nabla f\) describe the tangent plane to the level surface, or in other words \(\nabla f\) is a normal to the tangent plane.
Example \(\PageIndex{4}\)
Suppose the temperature at a point in space is given by \(T(x,y,z)=T_0/(1+x^2+y^2+z^2)\); at the origin the temperature in Kelvin is \(T_0>0\), and it decreases in every direction from there. It might be, for example, that there is a source of heat at the origin, and as we get farther from the source, the temperature decreases. The gradient is
\[\eqalign{ \nabla T&=\left\langle {-2T_0x\over (1+x^2+y^2+z^2)^2}, {-2T_0y\over (1+x^2+y^2+z^2)^2},{-2T_0z\over (1+x^2+y^2+z^2)^2}\right\rangle\cr &={-2T_0\over (1+x^2+y^2+z^2)^2}\langle x,y,z\rangle.\cr } \nonumber\]
The gradient points directly at the origin from the point \((x,y,z)\)---by moving directly toward the heat source, we increase the temperature as quickly as possible.
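The gradient formula can be checked against finite differences. In the sketch below the value of `T0`, the sample point `p`, and the helper names are assumptions made for illustration. The last lines confirm that the gradient is a negative multiple of the position vector \(\langle x,y,z\rangle\), that is, it points toward the origin.

```python
import math

T0 = 300.0  # hypothetical temperature at the origin, in Kelvin

def T(x, y, z):
    return T0 / (1 + x * x + y * y + z * z)

def grad_T(x, y, z):
    """Closed-form gradient: (-2 T0 / (1 + r^2)^2) <x, y, z>."""
    c = -2 * T0 / (1 + x * x + y * y + z * z) ** 2
    return (c * x, c * y, c * z)

def numeric_grad(f, p, h=1e-6):
    """Central-difference estimate of the gradient of f at the point p."""
    out = []
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        out.append((f(*hi) - f(*lo)) / (2 * h))
    return tuple(out)

p = (1.0, -2.0, 0.5)  # arbitrary sample point
exact = grad_T(*p)
approx = numeric_grad(T, p)

# Cosine of the angle between the gradient and the position vector.
dot = sum(g * q for g, q in zip(exact, p))
cosine = dot / (math.hypot(*exact) * math.hypot(*p))

print(exact, approx)
print(cosine)  # close to -1: the gradient points back toward the origin
```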
Example \(\PageIndex{5}\)
Find the points on the surface defined by \(x^2+2y^2+3z^2=1\) where the tangent plane is parallel to the plane defined by \(3x-y+3z=1\).
Solution
Two planes are parallel if their normals are parallel or anti-parallel, so we want to find the points on the surface with normal parallel or anti-parallel to \(\langle 3,-1,3\rangle\). Let \(f=x^2+2y^2+3z^2\); the gradient of \(f\) is normal to the level surface at every point, so we are looking for a gradient parallel or anti-parallel to \(\langle 3,-1,3\rangle\). The gradient is \(\langle 2x,4y,6z\rangle\); if it is parallel or anti-parallel to \(\langle 3,-1,3\rangle\), then
\[\langle 2x,4y,6z\rangle=k\langle 3,-1,3\rangle \nonumber\]
for some \(k\). This means we need a solution to the equations
\[2x=3k\qquad 4y=-k\qquad 6z=3k\nonumber\]
but this is three equations in four unknowns---we need another equation. What we haven't used so far is that the points we seek are on the surface \(x^2+2y^2+3z^2=1\); this is the fourth equation. If we solve the first three equations for \(x\), \(y\), and \(z\) and substitute into the fourth equation we get
\[\eqalign{ 1&=\left({3k\over2}\right)^2+2\left({-k\over4}\right)^2+3\left({3k\over6}\right)^2\cr &=\left({9\over4}+{2\over16}+{3\over4}\right)k^2\cr &={25\over8}k^2\cr } \nonumber\]
so \( k=\pm{2\sqrt2\over 5}\). The desired points are \( \left({3\sqrt2\over5},-{\sqrt2\over10},{\sqrt2\over 5}\right)\) and \( \left(-{3\sqrt2\over5},{\sqrt2\over10},-{\sqrt2\over 5}\right)\). Here are the original plane and the two tangent planes, shown with the ellipsoid.
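As a sanity check on this solution, the following sketch (helper names are assumptions) substitutes both points back into the surface equation and verifies that the gradient at each point is parallel to \(\langle 3,-1,3\rangle\) by computing a cross product, which should vanish up to rounding.

```python
import math

normal = (3.0, -1.0, 3.0)  # normal of the plane 3x - y + 3z = 1

def surface(x, y, z):
    return x**2 + 2 * y**2 + 3 * z**2

def grad(x, y, z):
    return (2 * x, 4 * y, 6 * z)

def cross(a, b):
    """Cross product of two 3D vectors; zero when they are parallel."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

k = 2 * math.sqrt(2) / 5
# x = 3k/2, y = -k/4, z = k/2 for k = +2*sqrt(2)/5 and k = -2*sqrt(2)/5.
points = [(3 * s * k / 2, -s * k / 4, s * k / 2) for s in (1, -1)]

results = []
for p in points:
    residual = surface(*p) - 1       # should be about 0: p lies on the surface
    c = cross(grad(*p), normal)      # should be about <0, 0, 0>
    results.append((residual, max(abs(v) for v in c)))
    print(p, residual, c)
```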
Contributors
Integrated by Justin Marshall.