6.1: Directional and Partial Derivatives


Elias Zakon
University of Windsor via The Trillia Group (support by Saylor Foundation)


In Chapter 5 we considered functions \(f : E^{1} \rightarrow E\) of one real variable.

Now we take up functions \(f : E^{\prime} \rightarrow E\) where both \(E^{\prime}\) and \(E\) are normed spaces.

The scalar field of both is always assumed the same: \(E^{1}\) or \(C\) (the complex field). The case \(E=E^{*}\) is excluded here; thus all is assumed finite.

We mostly use arrowed letters \(\vec{p}, \vec{q}, \ldots, \vec{x}, \vec{y}, \vec{z}\) for vectors in the domain space \(E^{\prime},\) and nonarrowed letters for those in \(E\) and for scalars.

As before, we adopt the convention that \(f\) is defined on all of \(E^{\prime},\) with \(f(\vec{x})=0\) if not defined otherwise.

Note that, if \(\vec{p} \in E^{\prime},\) one can express any point \(\vec{x} \in E^{\prime}\) as

\[\vec{x}=\vec{p}+t \vec{u},\]

with \(t \in E^{1}\) and \(\vec{u}\) a unit vector. For if \(\vec{x} \neq \vec{p},\) set

\[t=|\vec{x}-\vec{p}| \text { and } \vec{u}=\frac{1}{t}(\vec{x}-\vec{p});\]

and if \(\vec{x}=\vec{p},\) set \(t=0,\) and any unit vector \(\vec{u}\) will do. We often use the notation

\[\vec{t}=\Delta \vec{x}=\vec{x}-\vec{p}=t \vec{u} \quad\left(t \in E^{1}, \vec{t}, \vec{u} \in E^{\prime}\right).\]
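
As a small illustration (assuming, for this example only, that \(E^{\prime}=E^{2}\) with the Euclidean norm, and with points chosen arbitrarily), take \(\vec{p}=(1,2)\) and \(\vec{x}=(4,6).\) Then

\[\Delta \vec{x}=\vec{x}-\vec{p}=(3,4), \quad t=|\vec{x}-\vec{p}|=\sqrt{3^{2}+4^{2}}=5, \quad \vec{u}=\frac{1}{5}(3,4)=\left(\frac{3}{5}, \frac{4}{5}\right),\]

and indeed \(|\vec{u}|=1\) and \(\vec{p}+t \vec{u}=(1,2)+(3,4)=(4,6)=\vec{x}.\)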

First of all, we generalize Definition 1 in Chapter 5, §1.

Definition 1

Given \(f : E^{\prime} \rightarrow E\) and \(\vec{p}, \vec{u} \in E^{\prime}\) \((\vec{u} \neq \overrightarrow{0}),\) we define the directional derivative of \(f\) along \(\vec{u}\) (or \(\vec{u}\)-directed derivative of \(f\)) at \(\vec{p}\) by

\[D_{\vec{u}} f(\vec{p})=\lim _{t \rightarrow 0} \frac{1}{t}[f(\vec{p}+t \vec{u})-f(\vec{p})], \tag{1}\]

if this limit exists in \(E\) (finite).

We also define the \(\vec{u}\)-directed derived function,

\[D_{\vec{u}} f : E^{\prime} \rightarrow E,\]

as follows. For any \(\vec{p} \in E^{\prime}\),

\[D_{\vec{u}} f(\vec{p})=\left\{\begin{array}{ll}{\lim _{t \rightarrow 0} \frac{1}{t}[f(\vec{p}+t \vec{u})-f(\vec{p})]} & {\text { if this limit exists, }} \\ {0} & {\text { otherwise. }}\end{array}\right.\]

Thus \(D_{\vec{u}} f\) is always defined, but the name derivative is used only if the limit (1) exists (finite). If it exists for each \(\vec{p}\) in a set \(B \subseteq E^{\prime},\) we call \(D_{\vec{u}} f\) (in classical notation, \(\partial f / \partial \vec{u}\)) the \(\vec{u}\)-directed derivative of \(f\) on \(B\).
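
The limit (1) can be explored numerically by evaluating the difference quotient for shrinking values of \(t.\) The following is a minimal sketch only, assuming \(E^{\prime}=E^{2}\) and \(E=E^{1}\); the helper name `directional_quotient`, the sample function \(f(x, y)=x^{2} y,\) the point \(\vec{p}=(1,1),\) and the direction \(\vec{u}=(3,4)\) are all choices made here for illustration. A numerical table of this kind suggests a limit but does not prove that it exists.

```python
def directional_quotient(f, p, u, t):
    """Difference quotient (1/t) * [f(p + t*u) - f(p)] from formula (1)."""
    shifted = [pi + t * ui for pi, ui in zip(p, u)]
    return (f(shifted) - f(p)) / t


def f(x):
    # Illustrative choice (an assumption of this sketch): f(x, y) = x^2 * y.
    return x[0] ** 2 * x[1]


p = [1.0, 1.0]  # hypothetical base point
u = [3.0, 4.0]  # hypothetical direction; need not be a unit vector

# Evaluate the quotient for t = 10^-1, ..., 10^-6.
quotients = [directional_quotient(f, p, u, 10.0 ** -k) for k in range(1, 7)]
for k, q in enumerate(quotients, start=1):
    print(f"t = 1e-{k}: quotient = {q:.6f}")
```

For this particular \(f,\) expanding \(f(\vec{p}+t \vec{u})=(1+3 t)^{2}(1+4 t)\) shows that the quotient equals \(10+33 t+36 t^{2},\) so the printed values should approach \(10.\)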

Note that, as \(t \rightarrow 0, \vec{x}\) tends to \(\vec{p}\) over the line \(\vec{x}=\vec{p}+t \vec{u}.\) Thus \(D_{\vec{u}} f(\vec{p})\) can be treated as a relative limit over that line. Observe that it depends on both the direction and the length of \(\vec{u}.\) Indeed, we have the following result.

Corollary \(\PageIndex{1}\)

Given \(f : E^{\prime} \rightarrow E, \vec{u} \neq \overrightarrow{0},\) and a scalar \(s \neq 0,\) we have

\[D_{s \vec{u}} f=s D_{\vec{u}} f.\]

Moreover, \(D_{s \vec{u}} f(\vec{p})\) is a genuine derivative iff \(D_{\vec{u}} f(\vec{p})\) is.

Proof

Set \(t=s \theta\) in (1) to get

\[s D_{\vec{u}} f(\vec{p})=\lim _{\theta \rightarrow 0} \frac{1}{\theta}[f(\vec{p}+\theta s \vec{u})-f(\vec{p})]=D_{s \vec{u}} f(\vec{p}). \quad \square\]

In particular, taking \(s=1 /|\vec{u}|,\) we have

\[|s \vec{u}|=\frac{|\vec{u}|}{|\vec{u}|}=1 \text { and } D_{\vec{u}} f=\frac{1}{s} D_{s \vec{u}} f.\]

Thus all reduces to the case \(D_{\vec{v}} f,\) where \(\vec{v}=s \vec{u}\) is a unit vector. This device, called normalization, is often used, but actually it does not simplify matters.

If \(E^{\prime}=E^{n}\left(C^{n}\right),\) then \(f\) is a function of \(n\) scalar variables \(x_{k}(k=1, \ldots, n)\) and \(E^{\prime}\) has the \(n\) basic unit vectors \(\vec{e}_{k}.\) This example leads us to the following definition.

Definition 2

If in formula (1), \(E^{\prime}=E^{n}\left(C^{n}\right)\) and \(\vec{u}=\vec{e}_{k}\) for a fixed \(k \leq n,\) we call \(D_{\vec{u}} f\) the partially derived function for \(f,\) with respect to \(x_{k},\) denoted

\[D_{k} f \text { or } \frac{\partial f}{\partial x_{k}},\]

and the limit (1) is called the partial derivative of \(f\) at \(\vec{p},\) with respect to \(x_{k},\) denoted

\[D_{k} f(\vec{p}), \text { or } \frac{\partial}{\partial x_{k}} f(\vec{p}), \text { or }\left.\frac{\partial f}{\partial x_{k}}\right|_{\vec{x}=\vec{p}}.\]

If it exists for all \(\vec{p} \in B,\) we call \(D_{k} f\) the partial derivative (briefly, partial) of \(f\) on \(B,\) with respect to \(x_{k}\).

In any case, the derived functions \(D_{k} f(k=1, \ldots, n)\) are always defined on all of \(E^{n}\left(C^{n}\right).\)

If \(E^{\prime}=E^{3}\left(C^{3}\right),\) we often write \(x, y, z\) for \(x_{1}, x_{2}, x_{3},\) and

\[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \text { for } D_{k} f \quad(k=1,2,3).\]

Note 1. If \(E^{\prime}=E^{1},\) scalars are also "vectors," and \(D_{1} f\) coincides with \(f^{\prime}\) as defined in Chapter 5, §1 (except where \(f^{\prime}=\pm \infty\)). Explain!

Note 2. As we have observed, the \(\vec{u}\)-directed derivative (1) is obtained by keeping \(\vec{x}\) on the line \(\vec{x}=\vec{p}+t \vec{u}.\)

If \(\vec{u}=\vec{e}_{k},\) the line is parallel to the \(k\)th axis; so all coordinates of \(\vec{x},\) except \(x_{k},\) remain fixed \(\left(x_{i}=p_{i}, i \neq k\right),\) and \(f\) behaves like a function of one variable, \(x_{k}.\) Thus we can compute \(D_{k} f\) by the usual rules of differentiation, treating all \(x_{i}(i \neq k)\) as constants and \(x_{k}\) as the only variable.

For example, let \(f(x, y)=x^{2} y.\) Then

\[\frac{\partial f}{\partial x}=D_{1} f(x, y)=2 x y \text { and } \frac{\partial f}{\partial y}=D_{2} f(x, y)=x^{2}.\]

Note 3. More generally, given \(\vec{p}\) and \(\vec{u} \neq \overrightarrow{0},\) set

\[h(t)=f(\vec{p}+t \vec{u}), \quad t \in E^{1}.\]

Then \(h(0)=f(\vec{p});\) so

\[\begin{aligned} D_{\vec{u}} f(\vec{p}) &=\lim _{t \rightarrow 0} \frac{1}{t}[f(\vec{p}+t \vec{u})-f(\vec{p})] \\ &=\lim _{t \rightarrow 0} \frac{h(t)-h(0)}{t-0} \\ &=h^{\prime}(0) \end{aligned}\]

if the limit exists. Thus all reduces to a function \(h\) of one real variable.
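
To illustrate Note 3 (with \(\vec{p}\) and \(\vec{u}\) chosen here purely as an example, and the Euclidean norm on \(E^{2}\) assumed), let \(f(x, y)=x^{2} y,\) \(\vec{p}=(1,1),\) and \(\vec{u}=(3,4).\) Then

\[h(t)=f(1+3 t, 1+4 t)=(1+3 t)^{2}(1+4 t)=1+10 t+33 t^{2}+36 t^{3},\]

so

\[D_{\vec{u}} f(\vec{p})=h^{\prime}(0)=10.\]

For the normalized vector \(\vec{v}=\frac{1}{5} \vec{u}=\left(\frac{3}{5}, \frac{4}{5}\right),\) the corresponding function is \(k(t)=f(\vec{p}+t \vec{v})=h(t / 5),\) whence

\[D_{\vec{v}} f(\vec{p})=k^{\prime}(0)=\frac{1}{5} h^{\prime}(0)=2,\]

in agreement with Corollary 1: \(D_{\vec{u}} f(\vec{p})=5 D_{\vec{v}} f(\vec{p}).\)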

For functions \(f : E^{1} \rightarrow E,\) the existence of a finite derivative ("differentiability") at \(p\) implies continuity at \(p\) (Theorem 1 of Chapter 5, §1). But in the general case, \(f : E^{\prime} \rightarrow E,\) this may fail even if \(D_{\vec{u}} f(\vec{p})\) exists for all \(\vec{u} \neq \overrightarrow{0}\).

Examples

(a) Define \(f : E^{2} \rightarrow E^{1}\) by

\[f(x, y)=\frac{x^{2} y}{x^{4}+y^{2}}, \quad f(0,0)=0.\]

Fix a unit vector \(\vec{u}=\left(u_{1}, u_{2}\right)\) in \(E^{2}.\) Let \(\vec{p}=(0,0).\) To find \(D_{\vec{u}} f(\vec{p}),\) use the \(h\) of Note 3:

\[h(t)=f(\vec{p}+t \vec{u})=f(t \vec{u})=f\left(t u_{1}, t u_{2}\right)=\frac{t u_{1}^{2} u_{2}}{t^{2} u_{1}^{4}+u_{2}^{2}} \text { if } u_{2} \neq 0,\]

and \(h=0\) if \(u_{2}=0.\) Hence

\[D_{\vec{u}} f(\vec{p})=h^{\prime}(0)=\frac{u_{1}^{2}}{u_{2}} \text { if } u_{2} \neq 0,\]

and \(h^{\prime}(0)=0\) if \(u_{2}=0.\) Thus \(D_{\vec{u}} f(\overrightarrow{0})\) exists for all \(\vec{u}.\) Yet \(f\) is discontinuous at \(\overrightarrow{0}\) (see Problem 9 in Chapter 4, §3).
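
Both claims of Example (a) can be checked numerically. The sketch below is only an illustration: the helper name `quotient` and the sample direction \(\vec{u}=(0.6, 0.8)\) are assumptions made here, and the approach to \(\overrightarrow{0}\) along the parabola \(y=x^{2}\) is one convenient way to exhibit the discontinuity (on that curve, \(f(x, x^{2})=x^{4} /\left(x^{4}+x^{4}\right)=\frac{1}{2}\) for \(x \neq 0\)).

```python
def f(x, y):
    """Example (a): x^2 y / (x^4 + y^2), with f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * x * y / (x ** 4 + y * y)


def quotient(u1, u2, t):
    """Difference quotient (1/t) * [f(t*u) - f(0, 0)] at the origin."""
    return (f(t * u1, t * u2) - f(0.0, 0.0)) / t


# Hypothetical unit vector with u2 != 0 (0.6^2 + 0.8^2 = 1).
u1, u2 = 0.6, 0.8

# Along the line x = t*u, the quotients settle near u1^2 / u2.
line_values = [quotient(u1, u2, 10.0 ** -k) for k in range(1, 7)]

# Along the parabola y = x^2, f stays near 1/2 although (x, y) -> (0, 0).
parabola_values = [f(10.0 ** -k, (10.0 ** -k) ** 2) for k in range(1, 7)]

print("quotients along the line:", line_values)
print("predicted u1^2 / u2:     ", u1 ** 2 / u2)
print("f along the parabola:    ", parabola_values)
```

Since \(f\) stays near \(\frac{1}{2}\) along the parabola while \(f(0,0)=0,\) no limit of \(f\) at \(\overrightarrow{0}\) can equal \(f(\overrightarrow{0}),\) even though every directional quotient converges.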

(b) Let

\[f(x, y)=\left\{\begin{array}{ll}{x+y} & {\text { if } x y=0,} \\ {1} & {\text { otherwise.}}\end{array}\right.\]

Then \(f(x, y)=x\) on the \(x\)-axis; so \(D_{1} f(0,0)=1\).

Yet \(f\) is discontinuous at \(\overrightarrow{0}\) (even relatively so) over any line \(y=a x\) \((a \neq 0).\) For on that line, \(f(x, y)=1\) if \((x, y) \neq(0,0);\) so \(f(x, y) \rightarrow 1\) but \(f(0,0)=0+0=0\).

Thus continuity at \(\overrightarrow{0}\) fails. (But see Theorem 1 below!)
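
A brief check of Example (b), offered as an illustration: since \(f(0, y)=y\) on the \(y\)-axis, we likewise get \(D_{2} f(0,0)=1.\) However, if \(\vec{u}=\left(u_{1}, u_{2}\right)\) with \(u_{1} u_{2} \neq 0,\) then \(f(t \vec{u})=1\) for all \(t \neq 0,\) so

\[\frac{1}{t}[f(\overrightarrow{0}+t \vec{u})-f(\overrightarrow{0})]=\frac{1}{t}[1-0]=\frac{1}{t},\]

which has no finite limit as \(t \rightarrow 0.\) Thus \(D_{\vec{u}} f(\overrightarrow{0})\) is not a genuine derivative for such \(\vec{u},\) and the relative discontinuity over the line \(y=a x\) is consistent with Theorem 1.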

Hence, if differentiability is to imply continuity, it must be defined in a stronger manner. We do it in §3. For now, we prove only some theorems on partial and directional derivatives, based on those of Chapter 5.

Theorem \(\PageIndex{1}\)

If \(f : E^{\prime} \rightarrow E\) has a \(\vec{u}\)-directed derivative at \(\vec{p} \in E^{\prime},\) then \(f\) is relatively continuous at \(\vec{p}\) over the line

\[\vec{x}=\vec{p}+t \vec{u} \quad\left(\overrightarrow{0} \neq \vec{u} \in E^{\prime}\right).\]

Proof

Set \(h(t)=f(\vec{p}+t \vec{u}), t \in E^{1}\).

By Note 3, our assumption implies that \(h\) (a function on \(E^{1}\)) is differentiable at \(0.\)

By Theorem 1 in Chapter 5, §1, then, \(h\) is continuous at \(0;\) so

\[\lim _{t \rightarrow 0} h(t)=h(0)=f(\vec{p}),\]

i.e.,

\[\lim _{t \rightarrow 0} f(\vec{p}+t \vec{u})=f(\vec{p}).\]

But this means that \(f(\vec{x}) \rightarrow f(\vec{p})\) as \(\vec{x} \rightarrow \vec{p}\) over the line \(\vec{x}=\vec{p}+t \vec{u},\) for, on that line, \(\vec{x}=\vec{p}+t \vec{u}.\)

Thus, indeed, \(f\) is relatively continuous at \(\vec{p},\) as stated. \(\quad \square\)

Note that we actually used the substitution \(\vec{x}=\vec{p}+t \vec{u}.\) This is admissible since the dependence between \(\vec{x}\) and \(t\) is one-to-one (Corollary 2(iii) of Chapter 4, §2). Why?

Theorem \(\PageIndex{2}\)

Let \(E^{\prime} \ni \vec{u}=\vec{q}-\vec{p}, \vec{u} \neq \overrightarrow{0}\).

If \(f : E^{\prime} \rightarrow E\) is relatively continuous on the segment \(I=L[\vec{p}, \vec{q}]\) and has a \(\vec{u}\)-directed derivative on \(I-Q\) (\(Q\) countable), then

\[|f(\vec{q})-f(\vec{p})| \leq \sup \left|D_{\vec{u}} f(\vec{x})\right|, \quad \vec{x} \in I-Q. \tag{2}\]

Proof

Set again \(h(t)=f(\vec{p}+t \vec{u})\) and \(g(t)=\vec{p}+t \vec{u}\).

Then \(h=f \circ g,\) and \(g\) is continuous on \(E^{1}.\) (Why?)

As \(f\) is relatively continuous on \(I=L[\vec{p}, \vec{q}],\) so is \(h=f \circ g\) on the interval \(J=[0,1] \subset E^{1}\) (cf. Chapter 4, §8, Example (1)).

Now fix \(t_{0} \in J.\) If \(\vec{x}_{0}=\vec{p}+t_{0} \vec{u} \in I-Q,\) our assumptions imply the existence of

\[\begin{aligned} D_{\vec{u}} f\left(\vec{x}_{0}\right) &=\lim _{t \rightarrow 0} \frac{1}{t}\left[f\left(\vec{x}_{0}+t \vec{u}\right)-f\left(\vec{x}_{0}\right)\right] \\ &=\lim _{t \rightarrow 0} \frac{1}{t}\left[f\left(\vec{p}+t_{0} \vec{u}+t \vec{u}\right)-f\left(\vec{p}+t_{0} \vec{u}\right)\right] \\ &=\lim _{t \rightarrow 0} \frac{1}{t}\left[h\left(t_{0}+t\right)-h\left(t_{0}\right)\right] \\ &=h^{\prime}\left(t_{0}\right) . \quad \text {(Explain!)} \end{aligned}\]

This can fail for at most a countable set \(Q^{\prime}\) of points \(t_{0} \in J\) (those for which \(\vec{x}_{0} \in Q).\)

Thus \(h\) is differentiable on \(J-Q^{\prime};\) and so, by Corollary 1 in Chapter 5, §4,

\[|h(1)-h(0)| \leq \sup _{t \in J-Q^{\prime}}\left|h^{\prime}(t)\right|=\sup _{\vec{x} \in I-Q}\left|D_{\vec{u}} f(\vec{x})\right|.\]

Now as \(h(1)=f(\vec{p}+\vec{u})=f(\vec{q})\) and \(h(0)=f(\vec{p}),\) formula (2) follows. \(\quad \square\)

Theorem \(\PageIndex{3}\)

If in Theorem 2, \(E=E^{1}\) and if \(f\) has a \(\vec{u}\)-directed derivative at least on the open line segment \(L(\vec{p}, \vec{q}),\) then

\[f(\vec{q})-f(\vec{p})=D_{\vec{u}} f\left(\vec{x}_{0}\right) \tag{3}\]

for some \(\vec{x}_{0} \in L(\vec{p}, \vec{q})\).

Proof

The proof is as in Theorem 2, based on Corollary 3 in Chapter 5, §2 (instead of Corollary 1 in Chapter 5, §4). \(\quad \square\)
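
As a worked illustration of Theorems 2 and 3 (the function and endpoints are chosen here only as an example, with \(Q=\emptyset\)), let \(f(x, y)=x^{2} y,\) \(\vec{p}=(0,0),\) and \(\vec{q}=(1,2),\) so that \(\vec{u}=\vec{q}-\vec{p}=(1,2).\) Then

\[h(t)=f(\vec{p}+t \vec{u})=f(t, 2 t)=2 t^{3}, \quad D_{\vec{u}} f(\vec{p}+t \vec{u})=h^{\prime}(t)=6 t^{2} \quad(0 \leq t \leq 1),\]

by the computation in the proof of Theorem 2. Hence

\[|f(\vec{q})-f(\vec{p})|=|2-0|=2 \leq \sup _{0 \leq t \leq 1} 6 t^{2}=6,\]

as Theorem 2 asserts. For Theorem 3, the equation \(f(\vec{q})-f(\vec{p})=6 t_{0}^{2}\) becomes \(2=6 t_{0}^{2},\) so \(t_{0}=1 / \sqrt{3} \in(0,1)\) and

\[\vec{x}_{0}=\vec{p}+t_{0} \vec{u}=\left(\frac{1}{\sqrt{3}}, \frac{2}{\sqrt{3}}\right) \in L(\vec{p}, \vec{q}).\]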

Theorems 2 and 3 are often used in "normalized" form, as follows.

Corollary \(\PageIndex{2}\)

If in Theorems 2 and 3, we set

\[r=|\vec{u}|=|\vec{q}-\vec{p}| \neq 0 \text { and } \vec{v}=\frac{1}{r} \vec{u},\]

then formulas (2) and (3) can be written as

\[|f(\vec{q})-f(\vec{p})| \leq|\vec{q}-\vec{p}| \sup \left|D_{\vec{v}} f(\vec{x})\right|, \quad \vec{x} \in I-Q, \tag{2'}\]

and

\[f(\vec{q})-f(\vec{p})=|\vec{q}-\vec{p}| D_{\vec{v}} f\left(\vec{x}_{0}\right) \tag{3'}\]

for some \(\vec{x}_{0} \in L(\vec{p}, \vec{q})\).

For by Corollary 1,

\[D_{\vec{u}} f=r D_{\vec{v}} f=|\vec{q}-\vec{p}| D_{\vec{v}} f;\]

so (2') and (3') follow.