# 2.4: Directional Derivatives and the Gradient

For a function $$z = f (x, y)$$, we learned that the partial derivatives $$\dfrac{∂f}{∂x}\text{ and} \dfrac{∂f}{∂y}$$ represent the (instantaneous) rate of change of $$f$$ in the positive $$x$$ and $$y$$ directions, respectively. What about other directions? It turns out that we can find the rate of change in any direction using a more general type of derivative called a directional derivative.

Definition 2.5: directional derivative

Let $$f (x, y)$$ be a real-valued function with domain $$D$$ in $$\mathbb{R}^2$$, and let $$(a,b)$$ be a point in $$D$$. Let $$\textbf{v}$$ be a unit vector in $$\mathbb{R}^2$$. Then the directional derivative of $$f$$ at $$(a,b)$$ in the direction of $$\textbf{v}$$, denoted by $$D_v f(a,b)$$, is defined as

$D_v f(a,b)=\lim \limits_{h \to 0}\dfrac{f((a,b)+h\textbf{v})-f(a,b)}{h}\label{Eq2.8}$

Notice in the definition that we seem to be treating the point $$(a,b)$$ as a vector, since we are adding the vector $$h\textbf{v}$$ to it. But this is just the usual idea of identifying vectors with their terminal points, which the reader should be used to by now. If we were to write the vector $$\textbf{v}$$ as $$\textbf{v} = (v_1 ,v_2)$$, then

$D_v f (a,b)=\lim \limits_{h \to 0}\dfrac{f (a+ hv_1 ,b + hv_2)− f (a,b)}{h}\label{Eq2.9}$

From this we can immediately recognize that the partial derivatives $$\dfrac{∂f}{∂x}\text{ and} \dfrac{∂f}{∂y}$$ are special cases of the directional derivative with $$\textbf{v} = \textbf{i} = (1,0)\text{ and } \textbf{v} = \textbf{j} = (0,1)$$, respectively. That is, $$\dfrac{∂f}{∂x} = D_i f\text{ and } \dfrac{∂f}{∂y} = D_j f$$. Since there are many vectors with the same direction, we use a unit vector in the definition, as that represents a “standard” vector for a given direction.
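The limit in the definition can be explored numerically. The following is a minimal sketch, not part of the original text: the helper name `difference_quotient` is an assumption, and the sample function is borrowed from Example 2.15 later in this section. It evaluates the quotient from Equation \ref{Eq2.9} for shrinking values of $$h$$ in the directions $$\textbf{i}$$ and $$\textbf{j}$$.

```python
def difference_quotient(f, a, b, v, h):
    """Quotient from the definition: (f((a,b) + h v) - f(a,b)) / h."""
    v1, v2 = v
    return (f(a + h * v1, b + h * v2) - f(a, b)) / h


def f(x, y):
    # Sample function, borrowed from Example 2.15.
    return x * y**2 + x**3 * y


a, b = 1.0, 2.0
i_hat, j_hat = (1.0, 0.0), (0.0, 1.0)

# As h shrinks, the quotients in the directions i and j settle toward
# the partial derivatives of f with respect to x and y at (a, b).
for h in (1e-1, 1e-3, 1e-6):
    print(h,
          difference_quotient(f, a, b, i_hat, h),
          difference_quotient(f, a, b, j_hat, h))
```

For this function the two columns should approach $$\dfrac{∂f}{∂x}(1,2) = 10$$ and $$\dfrac{∂f}{∂y}(1,2) = 5$$. A fixed small $$h$$ only approximates the limit, so the printed values carry a small error.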

If $$f (x, y)$$ has continuous partial derivatives $$\dfrac{∂f}{∂x}\text{ and }\dfrac{∂f}{∂y}$$ (which will always be the case in this text), then there is a simple formula for the directional derivative:

Theorem 2.2

Let $$f (x, y)$$ be a real-valued function with domain $$D$$ in $$\mathbb{R}^2$$ such that the partial derivatives $$\dfrac{∂f}{∂x}\text{ and }\dfrac{∂f}{∂y}$$ exist and are continuous in $$D$$. Let $$(a,b)$$ be a point in $$D$$, and let $$\textbf{v} = (v_1 ,v_2)$$ be a unit vector in $$\mathbb{R}^2$$. Then

$D_v f (a,b) = v_1 \dfrac{∂f}{∂x} (a,b)+ v_2 \dfrac{∂f}{∂y} (a,b)\label{Eq2.10}$

Proof: Note that if $$\textbf{v} = \textbf{i} = (1,0)$$ then the above formula reduces to $$D_v f (a,b) = \dfrac{∂f}{∂x} (a,b)$$, which we know is true since $$D_i f = \dfrac{∂f}{∂x}$$, as we noted earlier. Similarly, for $$\textbf{v} = \textbf{j} = (0,1)$$ the formula reduces to $$D_v f (a,b) = \dfrac{∂f}{∂y} (a,b)$$, which is true since $$D_j f = \dfrac{∂f}{∂y}$$. The cases $$\textbf{v} = −\textbf{i}$$ and $$\textbf{v} = −\textbf{j}$$ follow as well, because replacing $$\textbf{v}$$ by $$−\textbf{v}$$ (and $$h$$ by $$−h$$ in the limit) changes the sign of both sides of the formula. Since $$\pm\textbf{i}$$ and $$\pm\textbf{j}$$ are the only unit vectors in $$\mathbb{R}^2$$ with a zero component, we need only show that the formula holds for unit vectors $$\textbf{v} = (v_1 ,v_2)\text{ with }v_1 \neq 0 \text{ and }v_2 \neq 0$$. So fix such a vector $$\textbf{v}$$ and fix a number $$h \neq 0$$.

Then

$f (a+ hv_1 ,b + hv_2)− f (a,b) = f (a+ hv_1 ,b + hv_2)− f (a+ hv_1 ,b)+ f (a+ hv_1 ,b)− f (a,b)\label{Eq2.11}$

Since $$h \neq 0 \text{ and }v_2 \neq 0$$, we have $$hv_2 \neq 0$$, and thus any number $$c$$ strictly between $$b \text{ and }b + hv_2$$ can be written as $$c = b+\alpha hv_2$$ for some number $$0 < \alpha < 1$$. Since $$a + hv_1$$ is a fixed number, $$g(y) = f (a + hv_1 , y)$$ is a real-valued function of $$y$$ alone, so the Mean Value Theorem from single-variable calculus can be applied to $$g$$ on the interval $$[b,b + hv_2]$$ (or $$[b + hv_2 ,b]$$ if $$hv_2$$ is negative) to find a number $$0 < \alpha < 1$$ such that

$\nonumber \dfrac{∂f}{∂y} (a+ hv_1 ,b +\alpha hv_2) = g ′ (b +\alpha hv_2)=\dfrac{g(b + hv_2)− g(b)}{b + hv_2 − b}=\dfrac{f (a+ hv_1 ,b + hv_2)− f (a+ hv_1 ,b)}{hv_2}$

and so

$\nonumber f (a+ hv_1 ,b + hv_2)− f (a+ hv_1 ,b) = hv_2 \dfrac{∂f}{∂y} (a+ hv_1 ,b +\alpha hv_2) .$

By a similar argument, there exists a number $$0 < \beta < 1$$ such that

$\nonumber f (a+ hv_1 ,b)− f (a,b) = hv_1 \dfrac{∂f}{∂x} (a+\beta hv_1 ,b) .$

Thus, by Equation \ref{Eq2.11}, we have

\nonumber \begin{align} \dfrac{f (a+ hv_1 ,b + hv_2)− f (a,b)}{h}&=\dfrac{hv_2 \dfrac{∂f}{∂y} (a+ hv_1 ,b +\alpha hv_2)+ hv_1 \dfrac{∂f}{∂x} (a+\beta hv_1 ,b)}{h} \\[4pt] \nonumber &=v_2 \dfrac{∂f}{∂y} (a+ hv_1 ,b +\alpha hv_2)+ v_1 \dfrac{∂f}{∂x} (a+\beta hv_1 ,b)\end{align}

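so by Equation \ref{Eq2.9}, and because $$0 < \alpha < 1 \text{ and } 0 < \beta < 1$$ force the points $$(a+ hv_1 ,b +\alpha hv_2)\text{ and }(a+\beta hv_1 ,b)$$ to approach $$(a,b)$$ as $$h \to 0$$, we have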

\nonumber \begin{align} D_v f (a,b)&=\lim \limits_{h \to 0}\dfrac{f (a+ hv_1 ,b + hv_2)− f (a,b)}{h} \\[4pt] \nonumber &=\lim \limits_{h \to 0}\left [v_2 \dfrac{∂f}{∂y} (a+ hv_1 ,b +\alpha hv_2)+ v_1 \dfrac{∂f}{∂x} (a+\beta hv_1 ,b)\right ] \\[4pt] \nonumber &= v_2 \dfrac{∂f}{∂y} (a,b)+ v_1 \dfrac{∂f}{∂x} (a,b) \text{ by the continuity of } \dfrac{∂f}{∂x} \text{ and } \dfrac{∂f}{∂y}\text{, so} \\[4pt] \nonumber D_v f (a,b) &= v_1 \dfrac{∂f}{∂x} (a,b)+ v_2 \dfrac{∂f}{∂y} (a,b) \end{align}

after reversing the order of summation. $$\textbf{QED}$$

Note that $$D_v f (a,b) = \textbf{v} \cdot \left (\dfrac{∂f}{∂x} (a,b), \dfrac{∂f}{∂y} (a,b) \right )$$. The second vector has a special name:

Definition 2.6

For a real-valued function $$f (x, y)$$, the gradient of $$f$$, denoted by $$\nabla f$$, is the vector

$\nabla f =\left ( \dfrac{∂f}{∂x} , \dfrac{∂f}{∂y} \right ) \label{Eq2.12}$

in $$\mathbb{R}^2$$. For a real-valued function $$f (x, y, z)$$, the gradient is the vector

$\nabla f = \left ( \dfrac{∂f}{∂x} , \dfrac{∂f}{∂y} , \dfrac{∂f}{∂z}\right ) \label{Eq2.13}$

in $$\mathbb{R}^ 3$$. The symbol $$\nabla$$ is pronounced “del”.

Corollary 2.3

$\nonumber D_v f = \textbf{v} \cdot \nabla f$
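As a sanity check on the formula $$D_v f = \textbf{v} \cdot \nabla f$$, the sketch below compares it with the difference quotient from the limit definition. Everything here is illustrative rather than taken from the text: the test function $$f(x,y) = \sin x \, e^y$$, the point, the direction, and the names `grad_f` and `directional_derivative` are all assumptions.

```python
import math


def f(x, y):
    # Hypothetical smooth test function (not from the text).
    return math.sin(x) * math.exp(y)


def grad_f(x, y):
    # Partial derivatives of f with respect to x and y, computed by hand.
    return (math.cos(x) * math.exp(y), math.sin(x) * math.exp(y))


def directional_derivative(grad, point, v):
    """D_v f = v . grad f, valid for a unit vector v."""
    g = grad(*point)
    return sum(vi * gi for vi, gi in zip(v, g))


point = (0.5, -0.3)
v = (3 / 5, 4 / 5)  # unit vector: (3/5)^2 + (4/5)^2 = 1

by_formula = directional_derivative(grad_f, point, v)

# Difference quotient from the limit definition, with a small fixed h.
h = 1e-6
by_quotient = (f(point[0] + h * v[0], point[1] + h * v[1]) - f(*point)) / h

print(by_formula, by_quotient)
```

The two printed numbers should agree to several decimal places. The dot-product form needs only the two partial derivatives, whereas the quotient has to be re-evaluated for every new direction.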

Example 2.15

Find the directional derivative of $$f (x, y) = x y^2 + x^3 y$$ at the point (1,2) in the direction of $$\textbf{v} = \left ( \dfrac{1}{\sqrt{2}},\dfrac{1}{\sqrt{2}}\right )$$.

Solution

We see that $$\nabla f = (y^2 +3x^2 y,2x y+ x^3 )$$, so

$\nonumber D_v f (1,2) = \textbf{v}\cdot \nabla f (1,2) = \left ( \dfrac{1}{\sqrt{2}},\dfrac{1}{\sqrt{2}}\right ) \cdot (2^2 +3(1)^2 (2),2(1)(2)+1^3 ) = \dfrac{15}{\sqrt{2}}$

A real-valued function $$z = f (x, y)$$ whose partial derivatives $$\dfrac{∂f}{∂x}\text{ and }\dfrac{∂f}{∂y}$$ exist and are continuous is called continuously differentiable. Assume that $$f (x, y)$$ is such a function and that $$\nabla f \neq \textbf{0}$$. Let $$c$$ be a real number in the range of $$f$$ and let $$\textbf{v}$$ be a unit vector in $$\mathbb{R}^2$$ which is tangent to the level curve $$f (x, y) = c$$ (see Figure 2.4.1).

The value of $$f (x, y)$$ is constant along a level curve, so since $$\textbf{v}$$ is a tangent vector to this curve, the rate of change of $$f$$ in the direction of $$\textbf{v}$$ is 0, i.e. $$D_v f = 0$$. But we know that $$D_v f = \textbf{v} \cdot \nabla f = \left\lVert \textbf{v} \right\rVert \left\lVert \nabla f \right\rVert \cos \theta$$, where $$\theta$$ is the angle between $$\textbf{v} \text{ and }\nabla f$$. Since $$\left\lVert \textbf{v} \right\rVert = 1$$, this gives $$D_v f = \left\lVert \nabla f \right\rVert \cos \theta$$. And since $$\nabla f \neq \textbf{0}$$, we get $$D_v f = 0 ⇒ \cos \theta = 0 ⇒ \theta = 90^\circ$$. In other words, $$\nabla f \perp \textbf{v}$$, which means that $$\nabla f$$ is normal to the level curve.

In general, for any unit vector $$\textbf{v}$$ in $$\mathbb{R}^2$$, we still have $$D_v f = \left\lVert \nabla f \right\rVert \cos \theta \text{, where }\theta$$ is the angle between $$\textbf{v} \text{ and }\nabla f$$. At a fixed point $$(x, y)$$ the length $$\left\lVert \nabla f \right\rVert$$ is fixed, and the value of $$D_v f$$ then varies as $$\theta$$ varies. The largest value that $$D_v f$$ can take is $$\left\lVert \nabla f \right\rVert$$, attained when $$\cos \theta = 1$$ ($$\theta = 0^\circ$$), while the smallest value is $$−\left\lVert \nabla f \right\rVert$$, attained when $$\cos \theta = −1$$ ($$\theta = 180^\circ$$). In other words, the value of the function $$f$$ increases the fastest in the direction of $$\nabla f$$ (since $$\theta = 0^\circ$$ in that case), and the value of $$f$$ decreases the fastest in the direction of $$−\nabla f$$ (since $$\theta = 180^\circ$$ in that case). We have thus proved the following theorem:

Theorem 2.4

Let $$f (x, y)$$ be a continuously differentiable real-valued function, with $$\nabla f \neq \textbf{0}$$. Then:

1. The gradient $$\nabla f$$ is normal to any level curve $$f (x, y) = c$$.
2. The value of $$f (x, y)$$ increases the fastest in the direction of $$\nabla f$$.
3. The value of $$f (x, y)$$ decreases the fastest in the direction of $$−\nabla f$$.
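The three statements of the theorem can be illustrated numerically. The sketch below is an assumption-laden example rather than anything from the text: it uses the hypothetical function $$f(x,y) = x^2 + 3y^2$$, whose level curve $$f = 4$$ is an ellipse parametrized by $$(2\cos s, \tfrac{2}{\sqrt{3}}\sin s)$$, and it samples $$D_v f = \textbf{v}\cdot\nabla f$$ over a grid of unit vectors $$\textbf{v} = (\cos t, \sin t)$$.

```python
import math


def grad_f(x, y):
    # Gradient of the hypothetical example f(x, y) = x**2 + 3*y**2.
    return (2 * x, 6 * y)


# A point on the level curve f = 4, and the tangent vector there,
# both taken from the parametrization (2 cos s, (2/sqrt 3) sin s).
s = math.pi / 3
point = (2 * math.cos(s), 2 / math.sqrt(3) * math.sin(s))  # approximately (1, 1)
tangent = (-2 * math.sin(s), 2 / math.sqrt(3) * math.cos(s))

gx, gy = grad_f(*point)
grad_norm = math.hypot(gx, gy)

# Part 1: the gradient is perpendicular to the tangent of the level curve.
tangent_dot = tangent[0] * gx + tangent[1] * gy


def d_v(t):
    """Directional derivative in the direction v = (cos t, sin t)."""
    return math.cos(t) * gx + math.sin(t) * gy


# Parts 2 and 3: scan directions and locate the extreme values of D_v f.
n = 3600
angles = [2 * math.pi * k / n for k in range(n)]
t_max = max(angles, key=d_v)
t_min = min(angles, key=d_v)
grad_angle = math.atan2(gy, gx)

print(tangent_dot, t_max, grad_angle, d_v(t_max), grad_norm, t_min - t_max)
```

Up to the grid resolution, `t_max` should match the angle of $$\nabla f$$, `d_v(t_max)` should match $$\left\lVert \nabla f \right\rVert$$, and `t_min` should differ from `t_max` by $$\pi$$, i.e. point along $$−\nabla f$$.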

Example 2.16

In which direction does the function $$f (x, y) = x y^2 + x^3 y$$ increase the fastest from the point (1,2)? In which direction does it decrease the fastest?

Solution

Since $$\nabla f = (y^2 + 3x^2 y,2xy + x^3 )$$, then $$\nabla f (1,2) = (10,5) \neq \textbf{0}$$. A unit vector in that direction is $$\textbf{v} = \dfrac{\nabla f}{\left\lVert \nabla f \right\rVert} = \left (\dfrac{2}{\sqrt{5}} ,\dfrac{1}{\sqrt{5}}\right )$$. Thus, $$f$$ increases the fastest in the direction of $$\left (\dfrac{2}{\sqrt{5}} ,\dfrac{1}{\sqrt{5}}\right )$$ and decreases the fastest in the direction of $$\left (\dfrac{−2}{\sqrt{5}},\dfrac{−1}{\sqrt{5}}\right )$$.
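The normalization step in this solution can be reproduced with a few lines of code. This is only a sketch: the helper `normalize` and the step length `t` are assumptions introduced for illustration. It also takes a small step from (1,2) in each of the two directions to confirm that $$f$$ rises along one and falls along the other.

```python
import math


def f(x, y):
    return x * y**2 + x**3 * y


def grad_f(x, y):
    # Gradient from the solution: (y^2 + 3x^2 y, 2xy + x^3).
    return (y**2 + 3 * x**2 * y, 2 * x * y + x**3)


def normalize(w):
    """Return the unit vector in the direction of a nonzero vector w."""
    length = math.hypot(*w)
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(c / length for c in w)


g = grad_f(1, 2)                   # (10, 5)
v_up = normalize(g)                # direction of fastest increase
v_down = tuple(-c for c in v_up)   # direction of fastest decrease

# Change in f after a small step of length t in each direction.
t = 1e-3
rise = f(1 + t * v_up[0], 2 + t * v_up[1]) - f(1, 2)
fall = f(1 + t * v_down[0], 2 + t * v_down[1]) - f(1, 2)

print(v_up, rise, fall)
```

Dividing `rise` by `t` gives a value close to $$\left\lVert \nabla f(1,2) \right\rVert = 5\sqrt{5}$$, the largest rate of change available at that point.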

Though we proved Theorem 2.4 for functions of two variables, a similar argument can be used to show that it also applies to functions of three or more variables. Likewise, the directional derivative in the three-dimensional case can also be defined by the formula $$D_v f = \textbf{v}\cdot \nabla f$$.
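Because the formula $$D_v f = \textbf{v}\cdot \nabla f$$ has the same shape in any number of variables, it is easy to code once for all dimensions. The sketch below approximates the gradient with central differences, which is a numerical stand-in for the exact partial derivatives and not a method from the text. The function names and the three-variable test function $$f(x,y,z) = xy + yz^2$$ are hypothetical.

```python
def numeric_gradient(f, point, h=1e-6):
    """Central-difference estimate of the gradient of f at point (any dimension)."""
    grad = []
    for k in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[k] += h
        backward[k] -= h
        grad.append((f(*forward) - f(*backward)) / (2 * h))
    return grad


def directional_derivative(f, point, v, h=1e-6):
    """D_v f = v . grad f for a unit vector v of the same dimension as point."""
    return sum(vk * gk for vk, gk in zip(v, numeric_gradient(f, point, h)))


def f(x, y, z):
    # Hypothetical three-variable example; its exact gradient is (y, x + z^2, 2yz).
    return x * y + y * z**2


v = (2 / 3, 1 / 3, 2 / 3)  # unit vector: 4/9 + 1/9 + 4/9 = 1
result = directional_derivative(f, (1.0, 2.0, 3.0), v)
print(result)
```

At the point (1,2,3) the exact gradient is (2,10,12), so the result should be close to $$\tfrac{2}{3}(2) + \tfrac{1}{3}(10) + \tfrac{2}{3}(12) = \tfrac{38}{3}$$.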

Example 2.17

The temperature $$T$$ of a solid is given by the function $$T(x, y, z) = e^{−x} + e^{−2y} + e^{4z}\text{, where }x, y, z$$ are space coordinates relative to the center of the solid. In which direction from the point (1,1,1) will the temperature decrease the fastest?

Solution

Since $$\nabla T = (−e^{−x} ,−2e^{−2y} ,4e^{4z} )$$, the temperature will decrease the fastest in the direction of $$−\nabla T (1,1,1) = (e^{−1} ,2e^{−2} ,−4e^4 )$$.

This page titled 2.4: Directional Derivatives and the Gradient is shared under a GNU Free Documentation License 1.3 license and was authored, remixed, and/or curated by Michael Corral via source content that was edited to the style and standards of the LibreTexts platform.