6.3: Differentiable Functions

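As we know, a function \(f : E^{1} \rightarrow E\) (on \(E^{1}\)) is differentiable at \(p \in E^{1}\) iff, with \(\Delta f=f(x)-f(p)\) and \(\Delta x=x-p\),

\[f^{\prime}(p)=\lim _{x \rightarrow p} \frac{\Delta f}{\Delta x} \text { exists (finite).}\]

Setting \(\Delta x=x-p=t\), \(\Delta f=f(p+t)-f(p)\), and \(f^{\prime}(p)=v\), we may write this equation as

\[\lim _{t \rightarrow 0}\left|\frac{\Delta f}{t}-v\right|=0,\]

or

\[\lim _{t \rightarrow 0} \frac{1}{|t|}|f(p+t)-f(p)-v t|=0. \tag{1}\]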

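Now define a map \(\phi : E^{1} \rightarrow E\) by \(\phi(t)=t v\), \(v=f^{\prime}(p) \in E\).

Then \(\phi\) is linear and continuous, i.e., \(\phi \in L\left(E^{1}, E\right)\); so by Corollary 2 in §2, we may express (1) as follows: there is a map \(\phi \in L\left(E^{1}, E\right)\) such that

\[\lim _{t \rightarrow 0} \frac{1}{|t|}|\Delta f-\phi(t)|=0.\]

We adopt this as a definition in the general case, \(f : E^{\prime} \rightarrow E\), as well.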

Definition: Differentiable at a Point

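A function \(f : E^{\prime} \rightarrow E\) (where \(E^{\prime}\) and \(E\) are normed spaces over the same scalar field) is said to be differentiable at a point \(\vec{p} \in E^{\prime}\) iff there is a map

\[\phi \in L\left(E^{\prime}, E\right)\]

such that

\[\lim _{\vec{t} \rightarrow \vec{0}} \frac{1}{|\vec{t}|}|\Delta f-\phi(\vec{t})|=0;\]

that is,

\[\lim _{\vec{t} \rightarrow \vec{0}} \frac{1}{|\vec{t}|}[f(\vec{p}+\vec{t})-f(\vec{p})-\phi(\vec{t})]=0. \tag{2}\]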

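As we show below, \(\phi\) is unique (for a fixed \(\vec{p}\)), if it exists.

We call \(\phi\) the differential of \(f\) at \(\vec{p}\), briefly denoted \(d f\). As it depends on \(\vec{p}\), we also write \(d f(\vec{p} ; \vec{t})\) for \(d f(\vec{t})\) and \(d f(\vec{p} ; \cdot)\) for \(d f\).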
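The limit in Definition 1 can be watched numerically. The following is a minimal sketch, assuming a sample map \(f : E^{2} \rightarrow E^{2}\) and a hand-computed candidate \(\phi\); both are our illustrative choices, not taken from the text. It evaluates the ratio \(\frac{1}{|\vec{t}|}|\Delta f-\phi(\vec{t})|\) for shrinking \(\vec{t}\) along one fixed direction. This only suggests that the limit is 0; it does not prove it.

```python
import math

def f(p):
    """Sample map f: E^2 -> E^2 (an illustrative choice, not from the text)."""
    x, y = p
    return (x * x * y, x + y * y)

def phi(p, t):
    """Candidate differential at p: a linear map in t, computed by hand for this f."""
    x, y = p
    t1, t2 = t
    return (2 * x * y * t1 + x * x * t2, t1 + 2 * y * t2)

def norm(v):
    """Euclidean norm of a tuple."""
    return math.sqrt(sum(c * c for c in v))

def remainder_ratio(p, t):
    """Return |f(p+t) - f(p) - phi(t)| / |t|, the quantity whose limit must be 0."""
    f_p = f(p)
    f_shifted = f(tuple(a + b for a, b in zip(p, t)))
    phi_t = phi(p, t)
    diff = tuple(a - b - c for a, b, c in zip(f_shifted, f_p, phi_t))
    return norm(diff) / norm(t)

p = (1.0, 2.0)
direction = (0.6, -0.8)  # a unit vector, chosen arbitrarily
ratios = [
    remainder_ratio(p, tuple(h * d for d in direction))
    for h in (1e-1, 1e-2, 1e-3)
]
print(ratios)  # the ratios shrink roughly in proportion to |t|
```

For this particular \(f\) the remainder \(\Delta f-\phi(\vec{t})\) consists of terms of degree two and higher in \(\vec{t}\), which is why the ratio decreases about tenfold each time \(|\vec{t}|\) does.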

Some authors write \(f^{\prime}(\vec{p})\) for \(d f(\vec{p} ; \cdot)\) and call it the derivative at \(\vec{p}\), but we shall not do this (see Preface). Following M. Spivak, however, we shall use "\(\left[f^{\prime}(\vec{p})\right]\)" for its matrix, as follows.

Definition: Jacobian matrix

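If \(E^{\prime}=E^{n}\left(C^{n}\right)\) and \(E=E^{m}\left(C^{m}\right)\), and \(f : E^{\prime} \rightarrow E\) is differentiable at \(\vec{p}\), we set

\[\left[f^{\prime}(\vec{p})\right]=[d f(\vec{p} ; \cdot)]\]

and call it the Jacobian matrix of \(f\) at \(\vec{p}\).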

Note 1. In Chapter 5, §6, we did not define \(d f\) as a mapping. However, if \(E^{\prime}=E^{1}\), the function value

\[d f(p ; t)=v t=f^{\prime}(p) \Delta x\]

is as in Chapter 5, §6.

Also, \(\left[f^{\prime}(p)\right]\) is a \(1 \times 1\) matrix with single term \(f^{\prime}(p)\). (Why?) This motivated Definition 2.

Theorem 6.3.1

(uniqueness of \(d f\)). If \(f : E^{\prime} \rightarrow E\) is differentiable at \(\vec{p}\), then the map \(\phi\) described in Definition 1 is unique (dependent on \(f\) and \(\vec{p}\) only).

Proof

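Suppose there is another linear map \(g : E^{\prime} \rightarrow E\) such that

\[\lim _{\vec{t} \rightarrow \vec{0}} \frac{1}{|\vec{t}|}[f(\vec{p}+\vec{t})-f(\vec{p})-g(\vec{t})]=\lim _{\vec{t} \rightarrow \vec{0}} \frac{1}{|\vec{t}|}[\Delta f-g(\vec{t})]=0. \tag{3}\]

Let \(h=\phi-g\). By Corollary 1 in §2, \(h\) is linear.

Also, by the triangle law,

\[|h(\vec{t})|=|\phi(\vec{t})-g(\vec{t})| \leq|\Delta f-\phi(\vec{t})|+|\Delta f-g(\vec{t})|.\]

Hence, dividing by \(|\vec{t}|\),

\[\left|h\left(\frac{\vec{t}}{|\vec{t}|}\right)\right|=\frac{1}{|\vec{t}|}|h(\vec{t})| \leq \frac{1}{|\vec{t}|}|\Delta f-\phi(\vec{t})|+\frac{1}{|\vec{t}|}|\Delta f-g(\vec{t})|.\]

By (3) and (2), the right side expressions tend to 0 as \(\vec{t} \rightarrow \vec{0}\). Thus

\[\lim _{\vec{t} \rightarrow \vec{0}} h\left(\frac{\vec{t}}{|\vec{t}|}\right)=0.\]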

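This remains valid also if \(\vec{t} \rightarrow \vec{0}\) over any line through \(\vec{0}\), so that \(\vec{t} /|\vec{t}|\) remains constant, say \(\vec{t} /|\vec{t}|=\vec{u}\), where \(\vec{u}\) is an arbitrary (but fixed) unit vector.

Then

\[h\left(\frac{\vec{t}}{|\vec{t}|}\right)=h(\vec{u})\]

is constant; so it can tend to 0 only if it equals 0, so \(h(\vec{u})=0\) for any unit vector \(\vec{u}\).

Since any \(\vec{x} \in E^{\prime}\) can be written as \(\vec{x}=|\vec{x}| \vec{u}\), linearity yields

\[h(\vec{x})=|\vec{x}| h(\vec{u})=0.\]

Thus \(h=\phi-g=0\) on \(E^{\prime}\), and so \(\phi=g\) after all, proving the uniqueness of \(\phi\). \(\quad \square\)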

Theorem 6.3.2

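If \(f\) is differentiable at \(\vec{p}\), then

(i) \(f\) is continuous at \(\vec{p}\);

(ii) for any \(\vec{u} \neq \vec{0}\), \(f\) has the \(\vec{u}\)-directed derivative

\[D_{\vec{u}} f(\vec{p})=d f(\vec{p} ; \vec{u}).\]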

Proof

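By assumption, formula (2) holds for \(\phi=d f(\vec{p} ; \cdot)\).

Thus, given \(\varepsilon>0\), there is \(\delta>0\) such that, setting \(\Delta f=f(\vec{p}+\vec{t})-f(\vec{p})\), we have

\[\frac{1}{|\vec{t}|}|\Delta f-\phi(\vec{t})|<\varepsilon \text { whenever } 0<|\vec{t}|<\delta; \tag{4}\]

or, by the triangle law,

\[|\Delta f| \leq|\Delta f-\phi(\vec{t})|+|\phi(\vec{t})| \leq \varepsilon|\vec{t}|+|\phi(\vec{t})|, \quad 0<|\vec{t}|<\delta. \tag{5}\]

Now, by Definition 1, \(\phi\) is linear and continuous; so

\[\lim _{\vec{t} \rightarrow \vec{0}}|\phi(\vec{t})|=|\phi(\vec{0})|=0.\]

Thus, making \(\vec{t} \rightarrow \vec{0}\) in (5), with \(\varepsilon\) fixed, we get

\[\lim _{\vec{t} \rightarrow \vec{0}}|\Delta f|=0.\]

As \(\vec{t}\) is just another notation for \(\Delta \vec{x}=\vec{x}-\vec{p}\), this proves assertion (i).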

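Next, fix any \(\vec{u} \neq \vec{0}\) in \(E^{\prime}\), and substitute \(t \vec{u}\) for \(\vec{t}\) in (4).

In other words, \(t\) is a real variable, \(0<|t|<\delta /|\vec{u}|\), so that \(\vec{t}=t \vec{u}\) satisfies \(0<|\vec{t}|<\delta\).

Multiplying by \(|\vec{u}|\), we use the linearity of \(\phi\) to get

\[\varepsilon|\vec{u}|>\left|\frac{\Delta f}{t}-\frac{\phi(t \vec{u})}{t}\right|=\left|\frac{\Delta f}{t}-\phi(\vec{u})\right|=\left|\frac{f(\vec{p}+t \vec{u})-f(\vec{p})}{t}-\phi(\vec{u})\right|.\]

As \(\varepsilon\) is arbitrary, we have

\[\phi(\vec{u})=\lim _{t \rightarrow 0} \frac{1}{t}[f(\vec{p}+t \vec{u})-f(\vec{p})].\]

But this is simply \(D_{\vec{u}} f(\vec{p})\), by Definition 1 in §1.

Thus \(D_{\vec{u}} f(\vec{p})=\phi(\vec{u})=d f(\vec{p} ; \vec{u})\), proving (ii). \(\quad \square\)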
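Assertion (ii) can be checked numerically on an example. The sketch below assumes a sample real-valued function on \(E^{2}\) and a hand-computed linear map `phi`; both are hypothetical choices made for illustration. It compares the quotient \(\frac{1}{t}[f(\vec{p}+t \vec{u})-f(\vec{p})]\) with \(\phi(\vec{u})\) for small real \(t\) of both signs.

```python
import math

def f(p):
    """Sample function f: E^2 -> E^1 (an illustrative choice, not from the text)."""
    x, y = p
    return x * x * y + math.sin(y)

def phi(p, u):
    """Hand-computed differential of this f at p, applied to u (linear in u)."""
    x, y = p
    return u[0] * (2 * x * y) + u[1] * (x * x + math.cos(y))

def directed_quotient(p, u, t):
    """(1/t) [f(p + t u) - f(p)] for a real t != 0."""
    shifted = (p[0] + t * u[0], p[1] + t * u[1])
    return (f(shifted) - f(p)) / t

p = (1.0, 0.5)
u = (3.0, -2.0)  # any nonzero vector; it need not be a unit vector
for t in (1e-2, -1e-2, 1e-4, -1e-4):
    print(t, directed_quotient(p, u, t), phi(p, u))
```

Taking \(t\) of both signs matters: the \(\vec{u}\)-directed derivative is a two-sided limit in the real variable \(t\).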

Note 2. If \(E^{\prime}=E^{n}\left(C^{n}\right)\), Theorem 2(ii) shows that if \(f\) is differentiable at \(\vec{p}\), it has the \(n\) partials

\[D_{k} f(\vec{p})=d f\left(\vec{p} ; \vec{e}_{k}\right), \quad k=1, \ldots, n.\]

But the converse fails: the existence of the \(D_{k} f(\vec{p})\) does not even imply continuity, let alone differentiability (see §1). Moreover, we have the following result.
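The failure of the converse can be seen on a standard example, \(f(x, y)=x y /\left(x^{2}+y^{2}\right)\) with \(f(0,0)=0\). This example is our choice for illustration; it is not necessarily the one given in §1. A minimal sketch: both partial difference quotients at the origin vanish, yet \(f\) equals \(\frac{1}{2}\) on the diagonal arbitrarily close to the origin, so \(f\) is not continuous there.

```python
def f(x, y):
    """f(x, y) = x y / (x^2 + y^2) away from the origin, and f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

def partial_quotients_at_origin(h):
    """Difference quotients for D_1 f(0,0) and D_2 f(0,0) with step h != 0."""
    d1 = (f(h, 0.0) - f(0.0, 0.0)) / h
    d2 = (f(0.0, h) - f(0.0, 0.0)) / h
    return (d1, d2)

steps = (1e-1, 1e-3, 1e-6)
quotients = [partial_quotients_at_origin(h) for h in steps]
diagonal_values = [f(h, h) for h in steps]

print(quotients)        # all quotients vanish, so both partials at the origin are 0
print(diagonal_values)  # values stay at 1/2 along the diagonal, while f(0, 0) = 0
```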

Corollary 6.3.1

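If \(E^{\prime}=E^{n}\left(C^{n}\right)\) and if \(f : E^{\prime} \rightarrow E\) is differentiable at \(\vec{p}\), then

\[d f(\vec{p} ; \vec{t})=\sum_{k=1}^{n} t_{k} D_{k} f(\vec{p})=\sum_{k=1}^{n} t_{k} \frac{\partial}{\partial x_{k}} f(\vec{p}), \tag{6}\]

where \(\vec{t}=\left(t_{1}, \ldots, t_{n}\right)\).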

Proof

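By definition, \(\phi=d f(\vec{p} ; \cdot)\) is a linear map for a fixed \(\vec{p}\).

If \(E^{\prime}=E^{n}\) or \(C^{n}\), we may use formula (3) of §2, replacing \(f\) and \(\vec{x}\) by \(\phi\) and \(\vec{t}\), and get

\[\phi(\vec{t})=d f(\vec{p} ; \vec{t})=\sum_{k=1}^{n} t_{k} d f\left(\vec{p} ; \vec{e}_{k}\right)=\sum_{k=1}^{n} t_{k} D_{k} f(\vec{p})\]

by Note 2. \(\quad \square\)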

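Note 3. In classical notation, one writes \(\Delta x_{k}\) or \(d x_{k}\) for \(t_{k}\) in (6). Thus, omitting \(\vec{p}\) and \(\vec{t}\), formula (6) is often written as

\[d f=\frac{\partial f}{\partial x_{1}} d x_{1}+\frac{\partial f}{\partial x_{2}} d x_{2}+\cdots+\frac{\partial f}{\partial x_{n}} d x_{n}.\]

In particular, if \(n=3\), we write \(x, y, z\) for \(x_{1}, x_{2}, x_{3}\). This yields

\[d f=\frac{\partial f}{\partial x} d x+\frac{\partial f}{\partial y} d y+\frac{\partial f}{\partial z} d z\]

(a familiar calculus formula).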

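Note 4. If the range space \(E\) in Corollary 1 is \(E^{1}(C)\), then the \(D_{k} f(\vec{p})\) form an \(n\)-tuple of scalars, i.e., a vector in \(E^{n}\left(C^{n}\right)\).

In case \(f : E^{n} \rightarrow E^{1}\), we denote it by

\[\nabla f(\vec{p})=\left(D_{1} f(\vec{p}), \ldots, D_{n} f(\vec{p})\right)=\sum_{k=1}^{n} \vec{e}_{k} D_{k} f(\vec{p}).\]

In case \(f : C^{n} \rightarrow C\), we replace the \(D_{k} f(\vec{p})\) by their conjugates \(\overline{D_{k} f(\vec{p})}\) and set

\[\nabla f(\vec{p})=\sum_{k=1}^{n} \vec{e}_{k} \overline{D_{k} f(\vec{p})}.\]

The vector \(\nabla f(\vec{p})\) is called the gradient of \(f\) ("grad \(f\)") at \(\vec{p}\).

From (6) we obtain

\[d f(\vec{p} ; \vec{t})=\sum_{k=1}^{n} t_{k} D_{k} f(\vec{p})=\vec{t} \cdot \nabla f(\vec{p}) \tag{7}\]

(dot product of \(\vec{t}\) by \(\nabla f(\vec{p})\)), provided \(f : E^{n} \rightarrow E^{1}\) (or \(f : C^{n} \rightarrow C\)) is differentiable at \(\vec{p}\).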
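Formulas (6) and (7) translate directly into a computation in the real case \(f : E^{n} \rightarrow E^{1}\). The sketch below assumes a sample function on \(E^{3}\) (our choice) and estimates the partials by central differences, which is a numerical stand-in rather than the definition. It then forms \(\vec{t} \cdot \nabla f(\vec{p})\) and compares it with the actual increment \(\Delta f\).

```python
def f(p):
    """Sample function f: E^3 -> E^1 (an illustrative choice, not from the text)."""
    x, y, z = p
    return x * y + y * z * z

def partial(f, p, k, h=1e-6):
    """Central-difference estimate of the k-th partial D_k f(p) (k is 0-based)."""
    forward = list(p)
    backward = list(p)
    forward[k] += h
    backward[k] -= h
    return (f(forward) - f(backward)) / (2 * h)

def gradient(f, p):
    """Estimated gradient (D_1 f(p), ..., D_n f(p))."""
    return [partial(f, p, k) for k in range(len(p))]

def differential(f, p, t):
    """df(p; t) = sum_k t_k D_k f(p), i.e. the dot product of t with the gradient."""
    return sum(tk * gk for tk, gk in zip(t, gradient(f, p)))

p = (1.0, 2.0, 3.0)
t = (0.01, -0.02, 0.005)
grad = gradient(f, p)  # by hand: (y, x + z^2, 2 y z) = (2, 10, 12) at p
delta_f = f([a + b for a, b in zip(p, t)]) - f(p)
print(grad)
print(differential(f, p, t), delta_f)
```

The two printed numbers in the last line differ by a quantity of order \(|\vec{t}|^{2}\), in line with Definition 1: \(d f(\vec{p} ; \vec{t})\) approximates \(\Delta f\) with an error that is small compared with \(|\vec{t}|\).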

This leads us to the following result.

Corollary 6.3.2

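A function \(f : E^{n} \rightarrow E^{1}\) (or \(f : C^{n} \rightarrow C\)) is differentiable at \(\vec{p}\) iff

\[\lim _{\vec{t} \rightarrow \vec{0}} \frac{1}{|\vec{t}|}|f(\vec{p}+\vec{t})-f(\vec{p})-\vec{t} \cdot \vec{v}|=0 \tag{8}\]

for some \(\vec{v} \in E^{n}\left(C^{n}\right)\).

In this case, necessarily \(\vec{v}=\nabla f(\vec{p})\) and \(\vec{t} \cdot \vec{v}=d f(\vec{p} ; \vec{t})\), \(\vec{t} \in E^{n}\left(C^{n}\right)\).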

Proof

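If \(f\) is differentiable at \(\vec{p}\), we may set \(\phi=d f(\vec{p} ; \cdot)\) and \(\vec{v}=\nabla f(\vec{p})\).

Then by (7),

\[\phi(\vec{t})=d f(\vec{p} ; \vec{t})=\vec{t} \cdot \vec{v};\]

so by Definition 1, (8) results.

Conversely, if some \(\vec{v}\) satisfies (8), set \(\phi(\vec{t})=\vec{t} \cdot \vec{v}\). Then (8) implies (2), and \(\phi\) is linear and continuous.

Thus by definition, \(f\) is differentiable at \(\vec{p}\); so (7) holds.

Also, \(\phi\) is a linear functional on \(E^{n}\left(C^{n}\right)\). By Theorem 2(ii) in §2, the \(\vec{v}\) in \(\phi(\vec{t})=\vec{t} \cdot \vec{v}\) is unique, as is \(\phi\).

Thus by (7), \(\vec{v}=\nabla f(\vec{p})\) necessarily. \(\quad \square\)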

Corollary 6.3.3 (law of the mean)

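If \(f : E^{n} \rightarrow E^{1}\) (real) is relatively continuous on a closed segment \(L[\vec{p}, \vec{q}]\), \(\vec{p} \neq \vec{q}\), and differentiable on \(L(\vec{p}, \vec{q})\), then

\[f(\vec{q})-f(\vec{p})=(\vec{q}-\vec{p}) \cdot \nabla f\left(\vec{x}_{0}\right)\]

for some \(\vec{x}_{0} \in L(\vec{p}, \vec{q})\).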

Proof

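Let

\[r=|\vec{q}-\vec{p}|, \quad \vec{v}=\frac{1}{r}(\vec{q}-\vec{p}), \text { and } r \vec{v}=(\vec{q}-\vec{p}).\]

By (7) and Theorem 2(ii),

\[D_{\vec{v}} f(\vec{x})=d f(\vec{x} ; \vec{v})=\vec{v} \cdot \nabla f(\vec{x})\]

for \(\vec{x} \in L(\vec{p}, \vec{q})\). Thus by formula (3') of Corollary 2 in §1,

\[f(\vec{q})-f(\vec{p})=r D_{\vec{v}} f\left(\vec{x}_{0}\right)=r \vec{v} \cdot \nabla f\left(\vec{x}_{0}\right)=(\vec{q}-\vec{p}) \cdot \nabla f\left(\vec{x}_{0}\right)\]

for some \(\vec{x}_{0} \in L(\vec{p}, \vec{q})\). \(\quad \square\)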
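For a concrete function the intermediate point \(\vec{x}_{0}\) of the law of the mean can be located numerically. The sketch below assumes a sample function on \(E^{2}\) with a hand-computed gradient (both are our choices). It writes \(\vec{x}_{0}=\vec{p}+s(\vec{q}-\vec{p})\) and searches for \(s \in(0,1)\) by bisection. Bisection is an added assumption here: it needs a sign change of the residual on \([0,1]\), which the corollary itself does not guarantee, so the code checks for it explicitly.

```python
def f(p):
    """Sample function f: E^2 -> E^1 (an illustrative choice, not from the text)."""
    x, y = p
    return x ** 3 + x * y

def grad_f(p):
    """Hand-computed gradient of this f."""
    x, y = p
    return (3 * x * x + y, x)

def dot(a, b):
    """Dot product in E^n."""
    return sum(x * y for x, y in zip(a, b))

def mean_value_point(p, q, tol=1e-12):
    """Find s in (0, 1) with (q - p) . grad f(p + s (q - p)) = f(q) - f(p)."""
    d = tuple(b - a for a, b in zip(p, q))
    target = f(q) - f(p)

    def residual(s):
        x = tuple(a + s * c for a, c in zip(p, d))
        return dot(d, grad_f(x)) - target

    lo, hi = 0.0, 1.0
    if residual(lo) * residual(hi) > 0:
        raise ValueError("no sign change on [0, 1]; bisection does not apply")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if residual(lo) * residual(mid) <= 0:
            hi = mid
        else:
            lo = mid
    s = (lo + hi) / 2
    return s, tuple(a + s * c for a, c in zip(p, d))

p, q = (0.0, 0.0), (1.0, 1.0)
s, x0 = mean_value_point(p, q)
print(s, x0, dot((1.0, 1.0), grad_f(x0)), f(q) - f(p))
```

For this \(f\) the residual is \(3 s^{2}+2 s-2\), so the point found corresponds to \(s=(\sqrt{28}-2) / 6\), which lies strictly between 0 and 1 as the corollary requires.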

As we know, the mere existence of partials does not imply differentiability. But the existence of continuous partials does. Indeed, we have the following theorem.

Theorem 6.3.3

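Let \(E^{\prime}=E^{n}\left(C^{n}\right)\).

If \(f : E^{\prime} \rightarrow E\) has the partial derivatives \(D_{k} f\) (\(k=1, \ldots, n\)) on all of an open set \(A \subseteq E^{\prime}\), and if the \(D_{k} f\) are continuous at some \(\vec{p} \in A\), then \(f\) is differentiable at \(\vec{p}\).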

Proof

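With \(\vec{p}\) as above, let

\[\phi(\vec{t})=\sum_{k=1}^{n} t_{k} D_{k} f(\vec{p}) \text { with } \vec{t}=\sum_{k=1}^{n} t_{k} \vec{e}_{k} \in E^{\prime}.\]

Then \(\phi\) is continuous (a polynomial!) and linear (Corollary 2 in §2).

Thus by Definition 1, it remains to show that

\[\lim _{\vec{t} \rightarrow \vec{0}} \frac{1}{|\vec{t}|}|\Delta f-\phi(\vec{t})|=0;\]

that is,

\[\lim _{\vec{t} \rightarrow \vec{0}} \frac{1}{|\vec{t}|}\left|f(\vec{p}+\vec{t})-f(\vec{p})-\sum_{k=1}^{n} t_{k} D_{k} f(\vec{p})\right|=0. \tag{10}\]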

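To do this, fix \(\varepsilon>0\). As \(A\) is open and the \(D_{k} f\) are continuous at \(\vec{p} \in A\), there is a \(\delta>0\) such that \(G_{\vec{p}}(\delta) \subseteq A\) and simultaneously (explain this!)

\[\left(\forall \vec{x} \in G_{\vec{p}}(\delta)\right) \quad\left|D_{k} f(\vec{x})-D_{k} f(\vec{p})\right|<\frac{\varepsilon}{n}, \quad k=1, \ldots, n.\]

Hence for any set \(I \subseteq G_{\vec{p}}(\delta)\)

\[\sup _{\vec{x} \in I}\left|D_{k} f(\vec{x})-D_{k} f(\vec{p})\right| \leq \frac{\varepsilon}{n}. \quad \text{(Why?)} \tag{11}\]

Now fix any \(\vec{t} \in E^{\prime}\), \(0<|\vec{t}|<\delta\), and let \(\vec{p}_{0}=\vec{p}\),

\[\vec{p}_{k}=\vec{p}+\sum_{i=1}^{k} t_{i} \vec{e}_{i}, \quad k=1, \ldots, n.\]

Then

\[\vec{p}_{n}=\vec{p}+\sum_{i=1}^{n} t_{i} \vec{e}_{i}=\vec{p}+\vec{t},\]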

\(\left|\vec{p}_{k}-\vec{p}_{k-1}\right|=\left|t_{k}\right|\), and all \(\vec{p}_{k}\) lie in \(G_{\vec{p}}(\delta)\), for

\[\left|\vec{p}_{k}-\vec{p}\right|=\left|\sum_{i=1}^{k} t_{i} \vec{e}_{i}\right|=\sqrt{\sum_{i=1}^{k}\left|t_{i}\right|^{2}} \leq \sqrt{\sum_{i=1}^{n}\left|t_{i}\right|^{2}}=|\vec{t}|<\delta,\]

as required.

As \(G_{\vec{p}}(\delta)\) is convex (Chapter 4, §9), the segments \(I_{k}=L\left[\vec{p}_{k-1}, \vec{p}_{k}\right]\) all lie in \(G_{\vec{p}}(\delta) \subseteq A\); and by assumption, \(f\) has all partials there.

Hence by Theorem 1 in §1, \(f\) is relatively continuous on all \(I_{k}\).

All this also applies to the functions \(g_{k}\), defined by

\[\left(\forall \vec{x} \in E^{\prime}\right) \quad g_{k}(\vec{x})=f(\vec{x})-x_{k} D_{k} f(\vec{p}), \quad k=1, \ldots, n. \tag{12}\]

(Why?) Here

\[D_{k} g_{k}(\vec{x})=D_{k} f(\vec{x})-D_{k} f(\vec{p}).\]

(Why?)

Thus by Corollary 2 in §1, and (11) above,

\[\begin{aligned}\left|g_{k}\left(\vec{p}_{k}\right)-g_{k}\left(\vec{p}_{k-1}\right)\right| & \leq\left|\vec{p}_{k}-\vec{p}_{k-1}\right| \sup _{\vec{x} \in I_{k}}\left|D_{k} f(\vec{x})-D_{k} f(\vec{p})\right| \\ & \leq \frac{\varepsilon}{n}\left|t_{k}\right| \leq \frac{\varepsilon}{n}|\vec{t}|, \end{aligned}\]

since

\[\left|\vec{p}_{k}-\vec{p}_{k-1}\right|=\left|t_{k} \vec{e}_{k}\right| \leq|\vec{t}|,\]

by construction.

Combine with (12), recalling that the \(k\)th coordinates \(x_{k}\) for \(\vec{p}_{k}\) and \(\vec{p}_{k-1}\) differ by \(t_{k}\); so we obtain

\[\begin{aligned}\left|g_{k}\left(\vec{p}_{k}\right)-g_{k}\left(\vec{p}_{k-1}\right)\right| &=\left|f\left(\vec{p}_{k}\right)-f\left(\vec{p}_{k-1}\right)-t_{k} D_{k} f(\vec{p})\right| \\ & \leq \frac{\varepsilon}{n}|\vec{t}|. \end{aligned}\]

Also,

\[\begin{aligned} \sum_{k=1}^{n}\left[f\left(\vec{p}_{k}\right)-f\left(\vec{p}_{k-1}\right)\right] &=f\left(\vec{p}_{n}\right)-f\left(\vec{p}_{0}\right) \\ &=f(\vec{p}+\vec{t})-f(\vec{p})=\Delta f \quad \text{(see above).} \end{aligned}\]

Thus,

\[\begin{aligned}\left|\Delta f-\sum_{k=1}^{n} t_{k} D_{k} f(\vec{p})\right| &=\left|\sum_{k=1}^{n}\left[f\left(\vec{p}_{k}\right)-f\left(\vec{p}_{k-1}\right)-t_{k} D_{k} f(\vec{p})\right]\right| \\ & \leq n \cdot \frac{\varepsilon}{n}|\vec{t}|=\varepsilon|\vec{t}|. \end{aligned}\]

As \(\varepsilon\) is arbitrary, (10) follows, and all is proved. \(\quad \square\)
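The construction used in this proof, moving from \(\vec{p}\) to \(\vec{p}+\vec{t}\) one coordinate at a time, is easy to reproduce. The sketch below assumes a sample function on \(E^{3}\) (our choice) and a helper we call `staircase` (a hypothetical name). It builds the points \(\vec{p}_{0}, \ldots, \vec{p}_{n}\) and confirms that the increments of \(f\) along them telescope to \(\Delta f\).

```python
def f(p):
    """Sample function f: E^3 -> E^1 (an illustrative choice, not from the text)."""
    x, y, z = p
    return x * y * z + x * x

def staircase(p, t):
    """Points p_0 = p and p_k = p + sum_{i <= k} t_i e_i, ending at p_n = p + t."""
    points = [tuple(p)]
    current = list(p)
    for k, tk in enumerate(t):
        current[k] += tk  # move along the k-th basic vector only
        points.append(tuple(current))
    return points

p = (1.0, 2.0, 3.0)
t = (0.1, -0.2, 0.05)
points = staircase(p, t)
increments = [f(points[k]) - f(points[k - 1]) for k in range(1, len(points))]
delta_f = f(points[-1]) - f(points[0])

print(points)
print(sum(increments), delta_f)  # the telescoping sum reproduces delta f up to rounding
```

Each consecutive pair \(\vec{p}_{k-1}, \vec{p}_{k}\) differs in the \(k\)th coordinate only, which is what lets the proof apply a one-variable estimate to \(g_{k}\) on the segment \(I_{k}\).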

Theorem 6.3.4

If \(f : E^{n} \rightarrow E^{m}\) (or \(f : C^{n} \rightarrow C^{m}\)) is differentiable at \(\vec{p}\), with \(f=\left(f_{1}, \ldots, f_{m}\right)\), then \(\left[f^{\prime}(\vec{p})\right]\) is an \(m \times n\) matrix,

\[\left[f^{\prime}(\vec{p})\right]=\left[D_{k} f_{i}(\vec{p})\right], \quad i=1, \ldots, m, \quad k=1, \ldots, n. \tag{14}\]

Proof

By definition, \(\left[f^{\prime}(\vec{p})\right]\) is the matrix of the linear map \(\phi=d f(\vec{p} ; \cdot)\), \(\phi=\left(\phi_{1}, \ldots, \phi_{m}\right)\). Here

\[\phi(\vec{t})=\sum_{k=1}^{n} t_{k} D_{k} f(\vec{p})\]

by Corollary 1.

As \(f=\left(f_{1}, \ldots, f_{m}\right)\), we can compute \(D_{k} f(\vec{p})\) componentwise by Theorem 5 of Chapter 5, §1, and Note 2 in §1 to get

\[\begin{aligned} D_{k} f(\vec{p}) &=\left(D_{k} f_{1}(\vec{p}), \ldots, D_{k} f_{m}(\vec{p})\right) \\ &=\sum_{i=1}^{m} e_{i}^{\prime} D_{k} f_{i}(\vec{p}), \quad k=1,2, \ldots, n, \end{aligned}\]

where the \(e_{i}^{\prime}\) are the basic vectors in \(E^{m}\left(C^{m}\right)\). (Recall that the \(\vec{e}_{k}\) are the basic vectors in \(E^{n}\left(C^{n}\right)\).)

Thus

\[\phi(\vec{t})=\sum_{i=1}^{m} e_{i}^{\prime} \phi_{i}(\vec{t}).\]

Also,

\[\phi(\vec{t})=\sum_{k=1}^{n} t_{k} \sum_{i=1}^{m} e_{i}^{\prime} D_{k} f_{i}(\vec{p})=\sum_{i=1}^{m} e_{i}^{\prime} \sum_{k=1}^{n} t_{k} D_{k} f_{i}(\vec{p}).\]

The uniqueness of the decomposition (Theorem 2 in Chapter 3, §§1-3) now yields

\[\phi_{i}(\vec{t})=\sum_{k=1}^{n} t_{k} D_{k} f_{i}(\vec{p}), \quad i=1, \ldots, m, \quad \vec{t} \in E^{n}\left(C^{n}\right).\]

If here \(\vec{t}=\vec{e}_{k}\), then \(t_{k}=1\), while \(t_{j}=0\) for \(j \neq k\). Thus we obtain

\[\phi_{i}\left(\vec{e}_{k}\right)=D_{k} f_{i}(\vec{p}), \quad i=1, \ldots, m, \quad k=1, \ldots, n.\]

Hence,

\[\phi\left(\vec{e}_{k}\right)=\left(v_{1 k}, v_{2 k}, \ldots, v_{m k}\right),\]

where

\[v_{i k}=\phi_{i}\left(\vec{e}_{k}\right)=D_{k} f_{i}(\vec{p}).\]

But by Note 3 of §2, \(v_{1 k}, \ldots, v_{m k}\) (written vertically) is the \(k\)th column of the \(m \times n\) matrix \([\phi]=\left[f^{\prime}(\vec{p})\right]\). Thus formula (14) results indeed. \(\quad \square\)
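Formula (14) says that the \(k\)th column of \(\left[f^{\prime}(\vec{p})\right]\) is \(D_{k} f(\vec{p})=\phi\left(\vec{e}_{k}\right)\). The sketch below assumes a sample map \(f : E^{2} \rightarrow E^{3}\) (our choice) and builds the \(3 \times 2\) matrix column by column from central-difference estimates of the partials. The estimates are a numerical stand-in and the helper names are hypothetical. The result is compared with the entries \(D_{k} f_{i}(\vec{p})\) computed by hand.

```python
import math

def f(p):
    """Sample map f: E^2 -> E^3 (an illustrative choice, not from the text)."""
    x, y = p
    return (x * x * y, x + y * y, math.sin(x * y))

def jacobian_exact(p):
    """Rows i = 1..3, columns k = 1..2: entry D_k f_i(p), computed by hand."""
    x, y = p
    return [
        [2 * x * y, x * x],
        [1.0, 2 * y],
        [y * math.cos(x * y), x * math.cos(x * y)],
    ]

def jacobian_numeric(f, p, h=1e-6):
    """Fill the m x n matrix column by column; column k estimates D_k f(p)."""
    m, n = len(f(p)), len(p)
    matrix = [[0.0] * n for _ in range(m)]
    for k in range(n):
        forward = list(p)
        backward = list(p)
        forward[k] += h
        backward[k] -= h
        f_plus, f_minus = f(forward), f(backward)
        for i in range(m):
            matrix[i][k] = (f_plus[i] - f_minus[i]) / (2 * h)
    return matrix

p = (1.0, 2.0)
exact = jacobian_exact(p)
numeric = jacobian_numeric(f, p)
for row_exact, row_numeric in zip(exact, numeric):
    print(row_exact, row_numeric)
```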

In conclusion, let us stress again that while \(D_{\vec{u}} f(\vec{p})\) is a constant, for a fixed \(\vec{p}\), \(d f(\vec{p} ; \cdot)\) is a mapping

\[\phi \in L\left(E^{\prime}, E\right),\]

especially "tailored" for \(\vec{p}\).

The reader should carefully study at least the "arrowed" problems below.


This page titled 6.3: Differentiable Functions is shared under a CC BY 3.0 license and was authored, remixed, and/or curated by Elias Zakon (The Trilla Group (support by Saylor Foundation)) via source content that was edited to the style and standards of the LibreTexts platform.
