
# 6.1: Directional and Partial Derivatives


In Chapter 5 we considered functions $$f : E^{1} \rightarrow E$$ of one real variable.

Now we take up functions $$f : E^{\prime} \rightarrow E$$ where both $$E^{\prime}$$ and $$E$$ are normed spaces.

The scalar field of both is always assumed the same: $$E^{1}$$ or $$C$$ (the complex field). The case $$E=E^{*}$$ is excluded here; thus all is assumed finite.

We mostly use arrowed letters $$\vec{p}, \vec{q}, \ldots, \vec{x}, \vec{y}, \vec{z}$$ for vectors in the domain space $$E^{\prime},$$ and nonarrowed letters for those in $$E$$ and for scalars.

As before, we adopt the convention that $$f$$ is defined on all of $$E^{\prime},$$ with $$f(\vec{x})=0$$ if not defined otherwise.

Note that, if $$\vec{p} \in E^{\prime},$$ one can express any point $$\vec{x} \in E^{\prime}$$ as

$\vec{x}=\vec{p}+t \vec{u},$

with $$t \in E^{1}$$ and $$\vec{u}$$ a unit vector. For if $$\vec{x} \neq \vec{p},$$ set

$t=|\vec{x}-\vec{p}| \text { and } \vec{u}=\frac{1}{t}(\vec{x}-\vec{p});$

and if $$\vec{x}=\vec{p},$$ set $$t=0,$$ and any $$\vec{u}$$ will do. We often use the notation

$\vec{t}=\Delta \vec{x}=\vec{x}-\vec{p}=t \vec{u} \quad\left(t \in E^{1}, \vec{t}, \vec{u} \in E^{\prime}\right).$
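The decomposition $$\vec{x}=\vec{p}+t \vec{u}$$ can be sketched numerically. The following is only an illustration, assuming $$E^{\prime}=E^{n}$$ with the Euclidean norm; the helper name `decompose` is hypothetical and not part of the text.

```python
import math

def decompose(x, p):
    """Return (t, u) with x = p + t*u and |u| = 1 (Euclidean norm assumed)."""
    d = [xi - pi for xi, pi in zip(x, p)]
    t = math.sqrt(sum(di * di for di in d))
    if t == 0:
        # x == p: any unit vector will do; we pick the first basic unit vector.
        u = [1.0] + [0.0] * (len(d) - 1)
    else:
        u = [di / t for di in d]
    return t, u

t, u = decompose([4.0, 6.0], [1.0, 2.0])
print(t, u)  # x - p = (3, 4), so t = 5 and u = (0.6, 0.8)
```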

First of all, we generalize Definition 1 in Chapter 5, §1.

Definition 1

Given $$f : E^{\prime} \rightarrow E$$ and $$\vec{p}, \vec{u} \in E^{\prime}$$ $$(\vec{u} \neq \overrightarrow{0}),$$ we define the directional derivative of $$f$$ along $$\vec{u}$$ (or $$\vec{u}$$-directed derivative of $$f$$) at $$\vec{p}$$ by

$D_{\vec{u}} f(\vec{p})=\lim _{t \rightarrow 0} \frac{1}{t}[f(\vec{p}+t \vec{u})-f(\vec{p})], \tag{1}$

if this limit exists in $$E$$ (finite).

We also define the $$\vec{u}$$-directed derived function,

$D_{\vec{u}} f : E^{\prime} \rightarrow E,$

as follows. For any $$\vec{p} \in E^{\prime}$$,

$D_{\vec{u}} f(\vec{p})=\left\{\begin{array}{ll}{\lim _{t \rightarrow 0} \frac{1}{t}[f(\vec{p}+t \vec{u})-f(\vec{p})]} & {\text { if this limit exists, }} \\ {0} & {\text { otherwise. }}\end{array}\right.$

Thus $$D_{\vec{u}} f$$ is always defined, but the name derivative is used only if the limit (1) exists (finite). If it exists for each $$\vec{p}$$ in a set $$B \subseteq E^{\prime},$$ we call $$D_{\vec{u}} f$$ (in classical notation, $$\partial f / \partial \vec{u}$$) the $$\vec{u}$$-directed derivative of $$f$$ on $$B$$.
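To see the difference quotient of Definition 1 at work, here is a small numerical sketch. It assumes $$E^{\prime}=E^{2}$$ and $$E=E^{1};$$ the function $$f(x, y)=x^{2}+3 x y,$$ the point $$\vec{p}=(1,2),$$ and the vector $$\vec{u}=(3,4)$$ are illustrative choices, not taken from the text, and `directional_quotient` is a hypothetical helper name.

```python
def directional_quotient(f, p, u, t):
    """Difference quotient (1/t) * [f(p + t*u) - f(p)] for a scalar-valued f."""
    shifted = [pi + t * ui for pi, ui in zip(p, u)]
    return (f(shifted) - f(p)) / t

def f(x):
    # Illustrative choice (an assumption, not from the text): f(x, y) = x^2 + 3xy
    return x[0] ** 2 + 3 * x[0] * x[1]

p = [1.0, 2.0]
u = [3.0, 4.0]  # deliberately not a unit vector

for t in (1e-1, 1e-3, 1e-5):
    print(t, directional_quotient(f, p, u, t))
```

Here $$f(\vec{p}+t \vec{u})=7+36 t+45 t^{2},$$ so the quotient equals $$36+45 t$$ and the printed values approach $$D_{\vec{u}} f(\vec{p})=36$$ as $$t \rightarrow 0$$.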

Note that, as $$t \rightarrow 0,$$ $$\vec{x}$$ tends to $$\vec{p}$$ over the line $$\vec{x}=\vec{p}+t \vec{u}.$$ Thus $$D_{\vec{u}} f(\vec{p})$$ can be treated as a relative limit over that line. Observe that it depends on both the direction and the length of $$\vec{u}.$$ Indeed, we have the following result.

Corollary 1

Given $$f : E^{\prime} \rightarrow E, \vec{u} \neq \overrightarrow{0},$$ and a scalar $$s \neq 0,$$ we have

$D_{s \vec{u}} f=s D_{\vec{u}} f.$

Moreover, $$D_{s \vec{u}} f(\vec{p})$$ is a genuine derivative iff $$D_{\vec{u}} f(\vec{p})$$ is.

Proof

Set $$t=s \theta$$ in (1) to get

$s D_{\vec{u}} f(\vec{p})=\lim _{\theta \rightarrow 0} \frac{1}{\theta}[f(\vec{p}+\theta s \vec{u})-f(\vec{p})]=D_{s \vec{u}} f(\vec{p}). \quad \square$

In particular, taking $$s=1 /|\vec{u}|,$$ we have

$|s \vec{u}|=\frac{|\vec{u}|}{|\vec{u}|}=1 \text { and } D_{\vec{u}} f=\frac{1}{s} D_{s \vec{u}} f.$

Thus all reduces to the case $$D_{\vec{v}} f,$$ where $$\vec{v}=s \vec{u}$$ is a unit vector. This device, called normalization, is often used, but actually it does not simplify matters.
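The substitution $$t=s \theta$$ used in the proof of Corollary 1 can be mirrored numerically. The sketch below assumes $$E^{\prime}=E^{2}$$ with the Euclidean norm; the function $$f(x, y)=x^{2}+3 x y$$ and the names `quotient`, `theta`, `lhs`, and `rhs` are illustrative assumptions, not part of the text.

```python
import math

def quotient(f, p, u, t):
    """Difference quotient (1/t) * [f(p + t*u) - f(p)]."""
    return (f([pi + t * ui for pi, ui in zip(p, u)]) - f(p)) / t

def f(x):
    # Illustrative choice (an assumption): f(x, y) = x^2 + 3xy
    return x[0] ** 2 + 3 * x[0] * x[1]

p = [1.0, 2.0]
u = [3.0, 4.0]
s = 1 / math.hypot(*u)        # normalization: s = 1/|u| = 1/5
v = [s * ui for ui in u]      # v = s*u is a unit vector

theta = 1e-6
lhs = quotient(f, p, v, theta)          # quotient for D_{su} f with step theta
rhs = s * quotient(f, p, u, s * theta)  # s times the quotient for D_u f with step t = s*theta

print(math.hypot(*v))  # length of v, equal to 1 up to rounding
print(lhs, rhs)        # the two quotients agree up to rounding
```

Both printed quotients are close to $$7.2=\frac{1}{5} \cdot 36,$$ in agreement with $$D_{s \vec{u}} f=s D_{\vec{u}} f$$ for this particular $$f$$.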

If $$E^{\prime}=E^{n}\left(C^{n}\right),$$ then $$f$$ is a function of $$n$$ scalar variables $$x_{k}(k=1, \ldots, n)$$ and $$E^{\prime}$$ has the $$n$$ basic unit vectors $$\vec{e}_{k}.$$ This example leads us to the following definition.

Definition 2

If in formula (1), $$E^{\prime}=E^{n}\left(C^{n}\right)$$ and $$\vec{u}=\vec{e}_{k}$$ for a fixed $$k \leq n,$$ we call $$D_{\vec{u}} f$$ the partially derived function for $$f,$$ with respect to $$x_{k},$$ denoted

$D_{k} f \text { or } \frac{\partial f}{\partial x_{k}},$

and the limit (1) is called the partial derivative of $$f$$ at $$\vec{p},$$ with respect to $$x_{k},$$ denoted

$D_{k} f(\vec{p}), \text { or } \frac{\partial}{\partial x_{k}} f(\vec{p}), \text { or }\left.\frac{\partial f}{\partial x_{k}}\right|_{\vec{x}=\vec{p}}.$

If it exists for all $$\vec{p} \in B,$$ we call $$D_{k} f$$ the partial derivative (briefly, partial) of $$f$$ on $$B,$$ with respect to $$x_{k}$$.

In any case, the derived functions $$D_{k} f(k=1, \ldots, n)$$ are always defined on all of $$E^{n}\left(C^{n}\right).$$

If $$E^{\prime}=E^{3}\left(C^{3}\right),$$ we often write $$x, y, z$$ for $$x_{1}, x_{2}, x_{3},$$ and

$\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \text { for } D_{k} f \quad(k=1,2,3).$

Note 1. If $$E^{\prime}=E^{1},$$ scalars are also "vectors," and $$D_{1} f$$ coincides with $$f^{\prime}$$ as defined in Chapter 5, §1 (except where $$f^{\prime}=\pm \infty$$). Explain!

Note 2. As we have observed, the $$\vec{u}$$-directed derivative (1) is obtained by keeping $$\vec{x}$$ on the line $$\vec{x}=\vec{p}+t \vec{u}.$$

If $$\vec{u}=\vec{e}_{k},$$ the line is parallel to the $$k$$th axis; so all coordinates of $$\vec{x},$$ except $$x_{k},$$ remain fixed $$\left(x_{i}=p_{i}, i \neq k\right),$$ and $$f$$ behaves like a function of one variable, $$x_{k}.$$ Thus we can compute $$D_{k} f$$ by the usual rules of differentiation, treating all $$x_{i}(i \neq k)$$ as constants and $$x_{k}$$ as the only variable.

For example, let $$f(x, y)=x^{2} y.$$ Then

$\frac{\partial f}{\partial x}=D_{1} f(x, y)=2 x y \text { and } \frac{\partial f}{\partial y}=D_{2} f(x, y)=x^{2}.$
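These two partials can be checked against difference quotients along the basic unit vectors. The following is a rough numerical sketch; the helper name `partial`, the sample point $$(3,2),$$ and the step size are assumptions made for illustration.

```python
def f(x, y):
    return x ** 2 * y

def partial(f, point, k, t=1e-6):
    """Difference quotient along the k-th basic unit vector (k = 1 or 2 here)."""
    shifted = list(point)
    shifted[k - 1] += t  # only the k-th coordinate varies; the others stay fixed
    return (f(*shifted) - f(*point)) / t

x, y = 3.0, 2.0
print(partial(f, (x, y), 1), 2 * x * y)  # quotient vs. the formula 2xy
print(partial(f, (x, y), 2), x ** 2)     # quotient vs. the formula x^2
```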

Note 3. More generally, given $$\vec{p}$$ and $$\vec{u} \neq \overrightarrow{0},$$ set

$h(t)=f(\vec{p}+t \vec{u}), \quad t \in E^{1}.$

Then $$h(0)=f(\vec{p});$$ so

\begin{aligned} D_{\vec{u}} f(\vec{p}) &=\lim _{t \rightarrow 0} \frac{1}{t}[f(\vec{p}+t \vec{u})-f(\vec{p})] \\ &=\lim _{t \rightarrow 0} \frac{h(t)-h(0)}{t-0} \\ &=h^{\prime}(0) \end{aligned}

if the limit exists. Thus all reduces to a function $$h$$ of one real variable.

For functions $$f : E^{1} \rightarrow E,$$ the existence of a finite derivative ("differentiability") at $$p$$ implies continuity at $$p$$ (Theorem 1 of Chapter 5, §1). But in the general case, $$f : E^{\prime} \rightarrow E,$$ this may fail even if $$D_{\vec{u}} f(\vec{p})$$ exists for all $$\vec{u} \neq \overrightarrow{0}$$.

Examples

(a) Define $$f : E^{2} \rightarrow E^{1}$$ by

$f(x, y)=\frac{x^{2} y}{x^{4}+y^{2}}, \quad f(0,0)=0.$

Fix a unit vector $$\vec{u}=\left(u_{1}, u_{2}\right)$$ in $$E^{2}.$$ Let $$\vec{p}=(0,0).$$ To find $$D_{\vec{u}} f(\vec{p}),$$ use the $$h$$ of Note 3:

$h(t)=f(\vec{p}+t \vec{u})=f(t \vec{u})=f\left(t u_{1}, t u_{2}\right)=\frac{t u_{1}^{2} u_{2}}{t^{2} u_{1}^{4}+u_{2}^{2}} \text { if } u_{2} \neq 0,$

and $$h=0$$ if $$u_{2}=0.$$ Hence

$D_{\vec{u}} f(\vec{p})=h^{\prime}(0)=\frac{u_{1}^{2}}{u_{2}} \text { if } u_{2} \neq 0,$

and $$h^{\prime}(0)=0$$ if $$u_{2}=0.$$ Thus $$D_{\vec{u}} f(\overrightarrow{0})$$ exists for all $$\vec{u}.$$ Yet $$f$$ is discontinuous at $$\overrightarrow{0}$$ (see Problem 9 in Chapter 4, §3).

(b) Let

$f(x, y)=\left\{\begin{array}{ll}{x+y} & {\text { if } x y=0,} \\ {1} & {\text { otherwise.}}\end{array}\right.$

Then $$f(x, y)=x$$ on the $$x$$-axis; so $$D_{1} f(0,0)=1$$.

Yet $$f$$ is discontinuous at $$\overrightarrow{0}$$ (even relatively so) over any line $$y=a x$$ $$(a \neq 0).$$ For on that line, $$f(x, y)=1$$ if $$(x, y) \neq(0,0);$$ so $$f(x, y) \rightarrow 1$$ but $$f(0,0)=0+0=0$$.

Thus continuity at $$\overrightarrow{0}$$ fails. (But see Theorem 1 below!)
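Both examples can be probed numerically. The sketch below is only an illustration; the names `f_a`, `f_b`, and `quotient_at_origin` are hypothetical. For Example (a), it compares the difference quotient at $$\overrightarrow{0}$$ with $$u_{1}^{2} / u_{2}$$ and then evaluates $$f$$ along the parabola $$y=x^{2},$$ one standard path (not spelled out in this section) on which $$f=\frac{1}{2}$$. For Example (b), it evaluates the quotient along the $$x$$-axis and the values of $$f$$ on the line $$y=2 x$$.

```python
import math

def f_a(x, y):
    """Example (a): x^2 y / (x^4 + y^2), with f(0, 0) = 0."""
    if x == 0 and y == 0:
        return 0.0
    return x ** 2 * y / (x ** 4 + y ** 2)

def f_b(x, y):
    """Example (b): x + y on the coordinate axes, 1 elsewhere."""
    return x + y if x * y == 0 else 1.0

def quotient_at_origin(f, u, t):
    """Difference quotient (1/t) * [f(t*u) - f(0, 0)]."""
    return (f(t * u[0], t * u[1]) - f(0.0, 0.0)) / t

# Example (a): the quotient approaches u1^2 / u2 for a unit vector with u2 != 0.
u = (math.cos(1.0), math.sin(1.0))
for t in (1e-1, 1e-3, 1e-5):
    print(t, quotient_at_origin(f_a, u, t), u[0] ** 2 / u[1])

# Example (a): on the parabola y = x^2 the value stays at 1/2, not f(0, 0) = 0.
for x in (1e-1, 1e-3, 1e-5):
    print(x, f_a(x, x ** 2))

# Example (b): the quotient along the x-axis is 1, while f = 1 on the line y = 2x.
# (Keep t moderate: for extremely small t the product x*y underflows to 0.)
for t in (1e-1, 1e-3, 1e-5):
    print(t, quotient_at_origin(f_b, (1.0, 0.0), t), f_b(t, 2 * t))
```

In Example (b), the quotient along $$\vec{u}=(1,2)$$ would be $$1 / t,$$ which has no finite limit as $$t \rightarrow 0;$$ so only the directions along the axes yield directional derivatives at $$\overrightarrow{0}$$ there.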

Hence, if differentiability is to imply continuity, it must be defined in a stronger manner. We do it in §3. For now, we prove only some theorems on partial and directional derivatives, based on those of Chapter 5.

Theorem 1

If $$f : E^{\prime} \rightarrow E$$ has a $$\vec{u}$$-directed derivative at $$\vec{p} \in E^{\prime},$$ then $$f$$ is relatively continuous at $$\vec{p}$$ over the line

$\vec{x}=\vec{p}+t \vec{u} \quad\left(\overrightarrow{0} \neq \vec{u} \in E^{\prime}\right).$

Proof

Set $$h(t)=f(\vec{p}+t \vec{u}), t \in E^{1}$$.

By Note 3, our assumption implies that $$h$$ (a function on $$E^{1}$$) is differentiable at $$0.$$

By Theorem 1 in Chapter 5, §1, then, $$h$$ is continuous at $$0;$$ so

$\lim _{t \rightarrow 0} h(t)=h(0)=f(\vec{p}),$

i.e.,

$\lim _{t \rightarrow 0} f(\vec{p}+t \vec{u})=f(\vec{p}).$

But this means that $$f(\vec{x}) \rightarrow f(\vec{p})$$ as $$\vec{x} \rightarrow \vec{p}$$ over the line $$\vec{x}=\vec{p}+t \vec{u},$$ for, on that line, $$\vec{x}=\vec{p}+t \vec{u}.$$

Thus, indeed, $$f$$ is relatively continuous at $$\vec{p},$$ as stated. $$\quad \square$$

Note that we actually used the substitution $$\vec{x}=\vec{p}+t \vec{u}.$$ This is admissible since the dependence between $$\vec{x}$$ and $$t$$ is one-to-one (Corollary 2(iii) of Chapter 4, §2). Why?

Theorem 2

Let $$E^{\prime} \ni \vec{u}=\vec{q}-\vec{p}, \vec{u} \neq \overrightarrow{0}$$.

If $$f : E^{\prime} \rightarrow E$$ is relatively continuous on the segment $$I=L[\vec{p}, \vec{q}]$$ and has a $$\vec{u}$$-directed derivative on $$I-Q$$ ($$Q$$ countable), then

$|f(\vec{q})-f(\vec{p})| \leq \sup \left|D_{\vec{u}} f(\vec{x})\right|, \quad \vec{x} \in I-Q. \tag{2}$

Proof

Set again $$h(t)=f(\vec{p}+t \vec{u})$$ and $$g(t)=\vec{p}+t \vec{u}$$.

Then $$h=f \circ g,$$ and $$g$$ is continuous on $$E^{1}.$$ (Why?)

As $$f$$ is relatively continuous on $$I=L[\vec{p}, \vec{q}],$$ so is $$h=f \circ g$$ on the interval $$J=[0,1] \subset E^{1}$$ (cf. Chapter 4, §8, Example (1)).

Now fix $$t_{0} \in J.$$ If $$\vec{x}_{0}=\vec{p}+t_{0} \vec{u} \in I-Q,$$ our assumptions imply the existence of

\begin{aligned} D_{\vec{u}} f\left(\vec{x}_{0}\right) &=\lim _{t \rightarrow 0} \frac{1}{t}\left[f\left(\vec{x}_{0}+t \vec{u}\right)-f\left(\vec{x}_{0}\right)\right] \\ &=\lim _{t \rightarrow 0} \frac{1}{t}\left[f\left(\vec{p}+t_{0} \vec{u}+t \vec{u}\right)-f\left(\vec{p}+t_{0} \vec{u}\right)\right] \\ &=\lim _{t \rightarrow 0} \frac{1}{t}\left[h\left(t_{0}+t\right)-h\left(t_{0}\right)\right] \\ &=h^{\prime}\left(t_{0}\right) . \quad \text {(Explain!)} \end{aligned}

This can fail for at most a countable set $$Q^{\prime}$$ of points $$t_{0} \in J$$ (those for which $$\vec{x}_{0} \in Q).$$

Thus $$h$$ is differentiable on $$J-Q^{\prime};$$ and so, by Corollary 1 in Chapter 5, §4,

$|h(1)-h(0)| \leq \sup _{t \in J-Q^{\prime}}\left|h^{\prime}(t)\right|=\sup _{\vec{x} \in I-Q}\left|D_{\vec{u}} f(\vec{x})\right|.$

Now as $$h(1)=f(\vec{p}+\vec{u})=f(\vec{q})$$ and $$h(0)=f(\vec{p}),$$ formula (2) follows. $$\quad \square$$
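The inequality of Theorem 2 can be illustrated for a vector-valued map. The sketch below assumes $$E^{\prime}=E=E^{2}$$ with the Euclidean norm and takes $$Q=\emptyset;$$ the map $$f(x, y)=(\cos (x+y), \sin (x+y)),$$ the endpoints, and the helper names are illustrative assumptions. The supremum is only estimated by sampling the segment, and the derivative by a difference quotient.

```python
import math

def f(x, y):
    # Illustrative map E^2 -> E^2 (an assumption, not from the text)
    return (math.cos(x + y), math.sin(x + y))

def norm(v):
    return math.hypot(v[0], v[1])

def directional(f, x, u, t=1e-6):
    """Difference quotient approximating D_u f(x), computed componentwise."""
    a = f(x[0] + t * u[0], x[1] + t * u[1])
    b = f(*x)
    return ((a[0] - b[0]) / t, (a[1] - b[1]) / t)

p = (0.0, 0.0)
q = (math.pi / 2, math.pi / 2)
u = (q[0] - p[0], q[1] - p[1])

# Sample points p + s*u, 0 <= s <= 1, on the segment L[p, q].
samples = [(p[0] + k / 200 * u[0], p[1] + k / 200 * u[1]) for k in range(201)]
sup_estimate = max(norm(directional(f, x, u)) for x in samples)

fq, fp = f(*q), f(*p)
lhs = norm((fq[0] - fp[0], fq[1] - fp[1]))
print(lhs, sup_estimate)  # |f(q) - f(p)| and the estimated supremum
```

For this map, $$h(t)=(\cos \pi t, \sin \pi t),$$ so $$\left|h^{\prime}(t)\right|=\pi$$ throughout, while $$|f(\vec{q})-f(\vec{p})|=2;$$ the inequality is strict here.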

Theorem 3

If in Theorem 2, $$E=E^{1}$$ and if $$f$$ has a $$\vec{u}$$-directed derivative at least on the open line segment $$L(\vec{p}, \vec{q}),$$ then

$f(\vec{q})-f(\vec{p})=D_{\vec{u}} f\left(\vec{x}_{0}\right) \tag{3}$

for some $$\vec{x}_{0} \in L(\vec{p}, \vec{q})$$.

Proof

The proof is as in Theorem 2, based on Corollary 3 in Chapter 5, §2 (instead of Corollary 1 in Chapter 5, §4).
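For a concrete real-valued case, the point $$\vec{x}_{0}$$ of Theorem 3 can be located numerically. The sketch below takes $$f(x, y)=x^{2} y,$$ $$\vec{p}=(0,0),$$ and $$\vec{q}=(1,2)$$ as illustrative choices. It approximates $$h^{\prime}$$ by a central difference and uses bisection, which works here only because $$h^{\prime}(t)=6 t^{2}$$ happens to be increasing on $$[0,1];$$ this is an assumption about the example, not a general method.

```python
import math

def f(x, y):
    return x ** 2 * y

p, q = (0.0, 0.0), (1.0, 2.0)
u = (q[0] - p[0], q[1] - p[1])

def h(t):
    """h(t) = f(p + t*u), as in Note 3."""
    return f(p[0] + t * u[0], p[1] + t * u[1])

def h_prime(t, eps=1e-6):
    """Central difference approximating h'(t) = D_u f(p + t*u)."""
    return (h(t + eps) - h(t - eps)) / (2 * eps)

target = h(1.0) - h(0.0)  # f(q) - f(p)

# Bisection for h'(t0) = target on [0, 1]; valid since h' is increasing here.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if h_prime(mid) < target:
        lo = mid
    else:
        hi = mid

t0 = (lo + hi) / 2
x0 = (p[0] + t0 * u[0], p[1] + t0 * u[1])
print(t0, x0, h_prime(t0), target)
```

Here $$h(t)=2 t^{3},$$ so $$f(\vec{q})-f(\vec{p})=2$$ and $$6 t_{0}^{2}=2$$ gives $$t_{0}=1 / \sqrt{3},$$ a point of the open segment, as the theorem asserts.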

Theorems 2 and 3 are often used in "normalized" form, as follows.

Corollary 2

If in Theorems 2 and 3, we set

$r=|\vec{u}|=|\vec{q}-\vec{p}| \neq 0 \text { and } \vec{v}=\frac{1}{r} \vec{u},$

then formulas (2) and (3) can be written as

$|f(\vec{q})-f(\vec{p})| \leq|\vec{q}-\vec{p}| \sup \left|D_{\vec{v}} f(\vec{x})\right|, \quad \vec{x} \in I-Q, \tag{2'}$

and

$f(\vec{q})-f(\vec{p})=|\vec{q}-\vec{p}| D_{\vec{v}} f\left(\vec{x}_{0}\right) \tag{3'}$

for some $$\vec{x}_{0} \in L(\vec{p}, \vec{q})$$.

For by Corollary 1,

$D_{\vec{u}} f=r D_{\vec{v}} f=|\vec{q}-\vec{p}| D_{\vec{v}} f;$

so (2') and (3') follow.