Mathematics LibreTexts

2.4: Directional Derivatives and the Gradient


For a function \(z = f(x, y)\), we learned that the partial derivatives \(f_x\) and \(f_y\) represent the (instantaneous) rate of change of \(f\) in the positive \(x\) and \(y\) directions, respectively. What about other directions? It turns out that we can find the rate of change in any direction using a more general type of derivative called a directional derivative.

Definition 2.5: directional derivative

Let \(f(x, y)\) be a real-valued function with domain \(D\) in \(\mathbb{R}^2\), and let \((a, b)\) be a point in \(D\). Let \(\textbf{v}\) be a unit vector in \(\mathbb{R}^2\). Then the directional derivative of \(f\) at \((a, b)\) in the direction of \(\textbf{v}\), denoted by \(D_v f(a, b)\), is defined as

\[D_v f(a, b) = \lim_{h \to 0} \dfrac{f((a, b) + h\textbf{v}) - f(a, b)}{h}\]

Notice in the definition that we seem to be treating the point \((a, b)\) as a vector, since we are adding the vector \(h\textbf{v}\) to it. But this is just the usual idea of identifying vectors with their terminal points, which the reader should be used to by now. If we were to write the vector \(\textbf{v}\) as \(\textbf{v} = (v_1, v_2)\), then

\[D_v f(a, b) = \lim_{h \to 0} \dfrac{f(a + hv_1, b + hv_2) - f(a, b)}{h}\]

From this we can immediately recognize that the partial derivatives \(f_x\) and \(f_y\) are special cases of the directional derivative with \(\textbf{v} = \textbf{i} = (1, 0)\) and \(\textbf{v} = \textbf{j} = (0, 1)\), respectively. That is, \(f_x = D_i f\) and \(f_y = D_j f\). Since there are many vectors with the same direction, we use a unit vector in the definition, as that represents a “standard” vector for a given direction.
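The limit in the definition can be approximated numerically by evaluating the difference quotient at a small fixed \(h\). The Python sketch below does this. The sample function, the helper name `difference_quotient`, and the step size `h = 1e-6` are illustrative assumptions, not part of the text.

```python
import math


def f(x, y):
    # Illustrative sample function (an assumption chosen for this sketch).
    return x * y**2 + x**3 * y


def difference_quotient(func, a, b, v, h=1e-6):
    """Evaluate (func((a, b) + h*v) - func(a, b)) / h for a small fixed h.

    v is rescaled to unit length first, since the definition uses a unit vector.
    """
    length = math.hypot(v[0], v[1])
    v1, v2 = v[0] / length, v[1] / length
    return (func(a + h * v1, b + h * v2) - func(a, b)) / h


# With v = i and v = j the quotient approximates the partial derivatives.
# For this f, f_x = y^2 + 3x^2 y and f_y = 2xy + x^3,
# so f_x(1, 2) = 10 and f_y(1, 2) = 5.
approx_fx = difference_quotient(f, 1.0, 2.0, (1.0, 0.0))
approx_fy = difference_quotient(f, 1.0, 2.0, (0.0, 1.0))
print(approx_fx, approx_fy)
```

A fixed small \(h\) only approximates the limit, so the printed values should be close to, but not exactly equal to, the exact partial derivatives.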

If \(f(x, y)\) has continuous partial derivatives \(f_x\) and \(f_y\) (which will always be the case in this text), then there is a simple formula for the directional derivative:

Theorem 2.2

Let \(f(x, y)\) be a real-valued function with domain \(D\) in \(\mathbb{R}^2\) such that the partial derivatives \(f_x\) and \(f_y\) exist and are continuous in \(D\). Let \((a, b)\) be a point in \(D\), and let \(\textbf{v} = (v_1, v_2)\) be a unit vector in \(\mathbb{R}^2\). Then

\[D_v f(a, b) = v_1 f_x(a, b) + v_2 f_y(a, b)\]

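Proof: Note that if \(\textbf{v} = \textbf{i} = (1, 0)\) then the above formula reduces to \(D_v f(a, b) = f_x(a, b)\), which we know is true since \(D_i f = f_x\), as we noted earlier. Similarly, for \(\textbf{v} = \textbf{j} = (0, 1)\) the formula reduces to \(D_v f(a, b) = f_y(a, b)\), which is true since \(D_j f = f_y\). The cases \(\textbf{v} = -\textbf{i}\) and \(\textbf{v} = -\textbf{j}\) follow from these, since replacing \(h\) by \(-h\) in the definition shows that \(D_{-v} f = -D_v f\). So since \(\pm\textbf{i}\) and \(\pm\textbf{j}\) are the only unit vectors in \(\mathbb{R}^2\) with a zero component, we need only show the formula holds for unit vectors \(\textbf{v} = (v_1, v_2)\) with \(v_1 \neq 0\) and \(v_2 \neq 0\). So fix such a vector \(\textbf{v}\) and fix a number \(h \neq 0\).

Then

\[f(a + hv_1, b + hv_2) - f(a, b) = f(a + hv_1, b + hv_2) - f(a + hv_1, b) + f(a + hv_1, b) - f(a, b)\]

Since \(h \neq 0\) and \(v_2 \neq 0\), then \(hv_2 \neq 0\) and thus any number \(c\) between \(b\) and \(b + hv_2\) can be written as \(c = b + \alpha h v_2\) for some number \(0 < \alpha < 1\). So since the function \(f(a + hv_1, y)\) is a real-valued function of \(y\) (since \(a + hv_1\) is a fixed number), then the Mean Value Theorem from single-variable calculus can be applied to the function \(g(y) = f(a + hv_1, y)\) on the interval \([b, b + hv_2]\) (or \([b + hv_2, b]\) if \(hv_2 < 0\)) to find a number \(0 < \alpha < 1\) such that

\[f_y(a + hv_1, b + \alpha h v_2) = g'(b + \alpha h v_2) = \dfrac{g(b + hv_2) - g(b)}{b + hv_2 - b} = \dfrac{f(a + hv_1, b + hv_2) - f(a + hv_1, b)}{hv_2}\]

and so

\[f(a + hv_1, b + hv_2) - f(a + hv_1, b) = hv_2 f_y(a + hv_1, b + \alpha h v_2).\]

By a similar argument, there exists a number \(0 < \beta < 1\) such that

\[f(a + hv_1, b) - f(a, b) = hv_1 f_x(a + \beta h v_1, b).\]

Thus, substituting these two expressions into the decomposition of \(f(a + hv_1, b + hv_2) - f(a, b)\) above, we have

\[\dfrac{f(a + hv_1, b + hv_2) - f(a, b)}{h} = \dfrac{hv_2 f_y(a + hv_1, b + \alpha h v_2) + hv_1 f_x(a + \beta h v_1, b)}{h} = v_2 f_y(a + hv_1, b + \alpha h v_2) + v_1 f_x(a + \beta h v_1, b)\]

so by the component form of the definition of the directional derivative we have

\[D_v f(a, b) = \lim_{h \to 0} \dfrac{f(a + hv_1, b + hv_2) - f(a, b)}{h} = \lim_{h \to 0} \left[ v_2 f_y(a + hv_1, b + \alpha h v_2) + v_1 f_x(a + \beta h v_1, b) \right] = v_2 f_y(a, b) + v_1 f_x(a, b)\]

by the continuity of \(f_x\) and \(f_y\) (since \(0 < \alpha < 1\) and \(0 < \beta < 1\), both points of evaluation approach \((a, b)\) as \(h \to 0\)), so

\[D_v f(a, b) = v_1 f_x(a, b) + v_2 f_y(a, b)\]

after reversing the order of summation.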
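As a numerical sanity check of the formula \(D_v f(a, b) = v_1 f_x(a, b) + v_2 f_y(a, b)\), the following Python sketch compares it with the difference quotient from the definition for eight unit vectors. The test function \(\sin x \, e^y\), the point \((0.5, -0.3)\), and the step size are hypothetical choices made only for illustration.

```python
import math


def f(x, y):
    # Hypothetical smooth test function with continuous partial derivatives.
    return math.sin(x) * math.exp(y)


def f_x(x, y):
    # Partial derivative of f with respect to x, computed by hand.
    return math.cos(x) * math.exp(y)


def f_y(x, y):
    # Partial derivative of f with respect to y, computed by hand.
    return math.sin(x) * math.exp(y)


a, b, h = 0.5, -0.3, 1e-6
max_error = 0.0
for k in range(8):
    angle = k * math.pi / 4
    v1, v2 = math.cos(angle), math.sin(angle)  # unit vector at this angle
    quotient = (f(a + h * v1, b + h * v2) - f(a, b)) / h  # definition, fixed small h
    formula = v1 * f_x(a, b) + v2 * f_y(a, b)             # formula from the theorem
    max_error = max(max_error, abs(quotient - formula))

print(max_error)
```

The largest discrepancy should be small and on the order of \(h\), which is what one expects from a one-sided difference quotient.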

Note that \(D_v f(a, b) = \textbf{v} \cdot (f_x(a, b), f_y(a, b))\). The second vector has a special name:

Definition 2.6

For a real-valued function \(f(x, y)\), the gradient of \(f\), denoted by \(\nabla f\), is the vector

\[\nabla f = (f_x, f_y)\]

in \(\mathbb{R}^2\). For a real-valued function \(f(x, y, z)\), the gradient is the vector

\[\nabla f = (f_x, f_y, f_z)\]

in \(\mathbb{R}^3\). The symbol \(\nabla\) is pronounced “del”.

Corollary 2.3

\[D_v f = \textbf{v} \cdot \nabla f\]

Example 2.15

Find the directional derivative of \(f(x, y) = xy^2 + x^3 y\) at the point \((1, 2)\) in the direction of \(\textbf{v} = \left(\dfrac{1}{\sqrt{2}}, \dfrac{1}{\sqrt{2}}\right)\).

Solution

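We see that \(\nabla f = (y^2 + 3x^2 y, 2xy + x^3)\), so

\[D_v f(1, 2) = \textbf{v} \cdot \nabla f(1, 2) = \left(\dfrac{1}{\sqrt{2}}, \dfrac{1}{\sqrt{2}}\right) \cdot (2^2 + 3(1)^2(2), 2(1)(2) + 1^3) = \dfrac{15}{\sqrt{2}}\]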
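The computation in this example can be reproduced with a short Python sketch. The helper names `grad_f` and `directional_derivative` are assumptions introduced here for illustration; the gradient is entered by hand exactly as in the solution.

```python
import math


def grad_f(x, y):
    # Gradient of f(x, y) = x*y**2 + x**3*y, computed by hand as in the solution.
    return (y**2 + 3 * x**2 * y, 2 * x * y + x**3)


def directional_derivative(grad, point, v):
    """Return v . grad(point); v is assumed to already be a unit vector."""
    gx, gy = grad(*point)
    return v[0] * gx + v[1] * gy


v = (1 / math.sqrt(2), 1 / math.sqrt(2))
value = directional_derivative(grad_f, (1, 2), v)
print(value, 15 / math.sqrt(2))
```

The two printed numbers should agree up to floating-point rounding.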

A real-valued function \(z = f(x, y)\) whose partial derivatives \(f_x\) and \(f_y\) exist and are continuous is called continuously differentiable. Assume that \(f(x, y)\) is such a function and that \(\nabla f \neq \textbf{0}\). Let \(c\) be a real number in the range of \(f\) and let \(\textbf{v}\) be a unit vector in \(\mathbb{R}^2\) which is tangent to the level curve \(f(x, y) = c\) (see Figure 2.4.1).

Figure 2.4.1

The value of \(f(x, y)\) is constant along a level curve, so since \(\textbf{v}\) is a tangent vector to this curve, then the rate of change of \(f\) in the direction of \(\textbf{v}\) is 0, i.e. \(D_v f = 0\). But we know that \(D_v f = \textbf{v} \cdot \nabla f = \left\lVert \textbf{v} \right\rVert \left\lVert \nabla f \right\rVert \cos \theta\), where \(\theta\) is the angle between \(\textbf{v}\) and \(\nabla f\). So since \(\left\lVert \textbf{v} \right\rVert = 1\) then \(D_v f = \left\lVert \nabla f \right\rVert \cos \theta\). So since \(\nabla f \neq \textbf{0}\) then \(D_v f = 0 \Rightarrow \cos \theta = 0 \Rightarrow \theta = 90^\circ\). In other words, \(\nabla f \perp \textbf{v}\), which means that \(\nabla f\) is normal to the level curve.
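The perpendicularity of the gradient and the level curve can be illustrated numerically. The sketch below is an illustration only: it reuses the function \(f(x, y) = xy^2 + x^3 y\) from Example 2.15 at the point \((1, 2)\), and the step size `h = 1e-4` and all variable names are assumptions. It takes a small step in the direction perpendicular to \(\nabla f\) and a step of the same length along \(\nabla f\), then compares how much \(f\) changes.

```python
import math


def f(x, y):
    return x * y**2 + x**3 * y


def grad_f(x, y):
    # Gradient of f, computed by hand.
    return (y**2 + 3 * x**2 * y, 2 * x * y + x**3)


a, b, h = 1.0, 2.0, 1e-4
gx, gy = grad_f(a, b)
length = math.hypot(gx, gy)

normal = (gx / length, gy / length)    # unit vector along the gradient
tangent = (-gy / length, gx / length)  # the gradient rotated by 90 degrees

# Change in f after a step of length h in each direction.
change_along_tangent = f(a + h * tangent[0], b + h * tangent[1]) - f(a, b)
change_along_normal = f(a + h * normal[0], b + h * normal[1]) - f(a, b)
print(change_along_tangent, change_along_normal)
```

The change along the perpendicular direction should be several orders of magnitude smaller than the change along the gradient, consistent with \(D_v f = 0\) for a tangent vector \(\textbf{v}\).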

In general, for any unit vector \(\textbf{v}\) in \(\mathbb{R}^2\), we still have \(D_v f = \left\lVert \nabla f \right\rVert \cos \theta\), where \(\theta\) is the angle between \(\textbf{v}\) and \(\nabla f\). At a fixed point \((x, y)\) the length \(\left\lVert \nabla f \right\rVert\) is fixed, and the value of \(D_v f\) then varies as \(\theta\) varies. The largest value that \(D_v f\) can take is when \(\cos \theta = 1\) (\(\theta = 0^\circ\)), while the smallest value occurs when \(\cos \theta = -1\) (\(\theta = 180^\circ\)). In other words, the value of the function \(f\) increases the fastest in the direction of \(\nabla f\) (since \(\theta = 0^\circ\) in that case), and the value of \(f\) decreases the fastest in the direction of \(-\nabla f\) (since \(\theta = 180^\circ\) in that case). We have thus proved the following theorem:

Theorem 2.4

Let \(f(x, y)\) be a continuously differentiable real-valued function, with \(\nabla f \neq \textbf{0}\). Then:

  1. The gradient \(\nabla f\) is normal to any level curve \(f(x, y) = c\).
  2. The value of \(f(x, y)\) increases the fastest in the direction of \(\nabla f\).
  3. The value of \(f(x, y)\) decreases the fastest in the direction of \(-\nabla f\).

Example 2.16

In which direction does the function \(f(x, y) = xy^2 + x^3 y\) increase the fastest from the point \((1, 2)\)? In which direction does it decrease the fastest?

Solution

Since \(\nabla f = (y^2 + 3x^2 y, 2xy + x^3)\), then \(\nabla f(1, 2) = (10, 5) \neq \textbf{0}\). A unit vector in that direction is \(\textbf{v} = \dfrac{\nabla f}{\left\lVert \nabla f \right\rVert} = \left(\dfrac{2}{\sqrt{5}}, \dfrac{1}{\sqrt{5}}\right)\). Thus, \(f\) increases the fastest in the direction of \(\left(\dfrac{2}{\sqrt{5}}, \dfrac{1}{\sqrt{5}}\right)\) and decreases the fastest in the direction of \(\left(\dfrac{-2}{\sqrt{5}}, \dfrac{-1}{\sqrt{5}}\right)\).
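A brute-force check of this answer is to evaluate \(D_v f(1, 2) = \textbf{v} \cdot \nabla f(1, 2)\) over many unit vectors and see which one gives the largest value. The Python sketch below does so. The angular resolution of 0.1 degree and the variable names are assumptions chosen for illustration.

```python
import math


def grad_f(x, y):
    # Gradient of f(x, y) = x*y**2 + x**3*y, computed by hand.
    return (y**2 + 3 * x**2 * y, 2 * x * y + x**3)


gx, gy = grad_f(1, 2)

best_value, best_v = -math.inf, None
steps = 3600  # assumed resolution: one direction every 0.1 degree
for k in range(steps):
    theta = 2 * math.pi * k / steps
    v = (math.cos(theta), math.sin(theta))  # unit vector at angle theta
    value = v[0] * gx + v[1] * gy           # D_v f(1, 2) = v . grad f(1, 2)
    if value > best_value:
        best_value, best_v = value, v

length = math.hypot(gx, gy)
print(best_v, (gx / length, gy / length))  # best sampled direction vs. normalized gradient
print(best_value, length)                  # largest sampled D_v f vs. gradient length
```

Up to the resolution of the sampling, the best direction should match \(\left(\dfrac{2}{\sqrt{5}}, \dfrac{1}{\sqrt{5}}\right)\), and the largest value of \(D_v f\) should match \(\left\lVert \nabla f(1, 2) \right\rVert\).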

Though we proved Theorem 2.4 for functions of two variables, a similar argument can be used to show that it also applies to functions of three or more variables. Likewise, the directional derivative in the three-dimensional case can also be defined by the formula \(D_v f = \textbf{v} \cdot \nabla f\).

Example 2.17

The temperature \(T\) of a solid is given by the function \(T(x, y, z) = e^{-x} + e^{-2y} + e^{4z}\), where \(x, y, z\) are space coordinates relative to the center of the solid. In which direction from the point \((1, 1, 1)\) will the temperature decrease the fastest?

Solution

Since \(\nabla T = (-e^{-x}, -2e^{-2y}, 4e^{4z})\), then the temperature will decrease the fastest in the direction of \(-\nabla T(1, 1, 1) = (e^{-1}, 2e^{-2}, -4e^4)\).
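The gradient in this example can be cross-checked with central differences. In the Python sketch below, the helper `numerical_gradient` and the step size `h = 1e-6` are assumptions made for illustration; the expected vector is the one found in the solution.

```python
import math


def T(x, y, z):
    return math.exp(-x) + math.exp(-2 * y) + math.exp(4 * z)


def numerical_gradient(func, point, h=1e-6):
    """Approximate each partial derivative of func at point by a central difference."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return grad


grad_T = numerical_gradient(T, (1.0, 1.0, 1.0))
steepest_descent = [-g for g in grad_T]  # direction of fastest decrease
expected = [math.exp(-1), 2 * math.exp(-2), -4 * math.exp(4)]
print(steepest_descent)
print(expected)
```

The third component dominates, so from \((1, 1, 1)\) the temperature falls fastest almost directly in the negative \(z\) direction.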


This page titled 2.4: Directional Derivatives and the Gradient is shared under a GNU Free Documentation License 1.3 license and was authored, remixed, and/or curated by Michael Corral via source content that was edited to the style and standards of the LibreTexts platform.
