5.1: Derivatives of Functions of One Real Variable

Elias Zakon
University of Windsor via The Trilla Group (support by Saylor Foundation)

In this chapter, "\(E\)" will always denote any one of \(E^{1}, E^{*}, C\) (the complex field), \(E^{n},\) \({ }^{*} C^{n},\) or another normed space. We shall consider functions \(f : E^{1} \rightarrow E\) of one real variable with values in \(E\). Functions \(f : E^{1} \rightarrow E^{*}\) (admitting finite and infinite values) are said to be extended real. Thus \(f : E^{1} \rightarrow E\) may be real, extended real, complex, or vector valued.

Operations in \(E^{*}\) were defined in Chapter 4, §4. Recall, in particular, our conventions \(\left(2^{*}\right)\) there. Due to them, addition, subtraction, and multiplication are always defined in \(E^{*}\) (with sums and products possibly "unorthodox").

To simplify formulations, we shall also adopt the convention that

\[
f(x)=0 \text { unless defined otherwise.}
\]

("0" stands also for the zero-vector in \(E\) if \(E\) is a vector space.) Thus each function \(f\) is defined on all of \(E^{1}\). For convenience, we call \(f(x)\) "finite" if \(f(x) \neq \pm \infty\) (also if it is a vector).

Definition

For each function \(f : E^{1} \rightarrow E,\) we define its derived function \(f^{\prime} : E^{1} \rightarrow E\)
by setting, for every point \(p \in E^{1}\),

\[
f^{\prime}(p)=\left\{\begin{array}{l}{\lim _{x \rightarrow p} \frac{f(x)-f(p)}{x-p} \text { if this limit exists (finite or not); }} \\ {0, \text { otherwise. }}\end{array}\right. \tag{1}
\]

Thus \(f^{\prime}(p)\) is always defined.

If the limit in \((1)\) exists, we call it the derivative of \(f\) at \(p\).

If, in addition, this limit is finite, we say that \(f\) is differentiable at \(p\).

If this holds for each \(p\) in a set \(B \subseteq E^{1},\) we say that \(f\) has a derivative (respectively, is differentiable) on \(B,\) and we call the function \(f^{\prime}\) the derivative of \(f\) on \(B\).

If the limit in \((1)\) is one-sided (with \(x \rightarrow p^{-}\) or \(x \rightarrow p^{+}\)), we call it a one-sided (left or right) derivative at \(p,\) denoted \(f_{-}^{\prime}\) or \(f_{+}^{\prime}\).
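As a small numerical illustration of one-sided derivatives (a sketch of ours, not part of the original text), the snippet below evaluates the difference quotient of \(f(x)=|x|\) at \(p=0\) along the points \(x=p \pm 2^{-k}\). The helper name `difference_quotient` is hypothetical, and a finite list of quotients can only suggest a limit, not prove it.

```python
def difference_quotient(f, p, x):
    """Return (f(x) - f(p)) / (x - p) for x != p."""
    return (f(x) - f(p)) / (x - p)

# Steps of the form 2**-k are exactly representable in binary floating point.
steps = [2.0 ** -k for k in range(1, 11)]

# Quotients of f(x) = |x| at p = 0, approaching from the right and from the left.
right = [difference_quotient(abs, 0.0, 0.0 + h) for h in steps]
left = [difference_quotient(abs, 0.0, 0.0 - h) for h in steps]

print(right[-1], left[-1])
```

Every right-hand quotient equals \(1\) and every left-hand quotient equals \(-1\), in agreement with \(f_{+}^{\prime}(0)=1\) and \(f_{-}^{\prime}(0)=-1\). The two-sided limit does not exist, so \(f^{\prime}(0)=0\) by the convention adopted above.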

Definition

Given a function \(f : E^{1} \rightarrow E,\) we define its \(n\)th derived function (or derived function of order \(n\)), denoted \(f^{(n)} : E^{1} \rightarrow E,\) by induction:

\[
f^{(0)}=f, f^{(n+1)}=\left[f^{(n)}\right]^{\prime}, \quad n=0,1,2, \ldots
\]

Thus \(f^{(n+1)}\) is the derived function of \(f^{(n)} .\) By our conventions, \(f^{(n)}\) is defined on all of \(E^{1}\) for each \(n\) and each function \(f : E^{1} \rightarrow E .\) We have \(f^{(1)}=f^{\prime},\) and we write \(f^{\prime \prime}\) for \(f^{(2)}, f^{\prime \prime \prime}\) for \(f^{(3)},\) etc. We say that \(f\) has \(n\) derivatives at a point \(p\) iff the limits

\[
\lim _{x \rightarrow q} \frac{f^{(k)}(x)-f^{(k)}(q)}{x-q}
\]

exist for all \(q\) in a neighborhood \(G_{p}\) of \(p\) and for \(k=0,1, \ldots, n-2,\) and also

\[
\lim _{x \rightarrow p} \frac{f^{(n-1)}(x)-f^{(n-1)}(p)}{x-p}
\]

exists. If all these limits are finite, we say that \(f\) is \(n\) times differentiable at \(p ;\) similarly for one-sided derivatives.
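To see that the derived functions \(f^{(n)}\) may be defined even where \(f\) fails to "have \(n\) derivatives," consider (as an illustration of ours, not taken from the original text) the function \(f(x)=|x|\). Here \(f^{\prime}(x)=1\) for \(x>0\) and \(f^{\prime}(x)=-1\) for \(x<0,\) while the limit defining the derivative does not exist at \(0,\) so \(f^{\prime}(0)=0\) by convention. Hence

\[
f^{\prime \prime}(0)=\lim _{x \rightarrow 0} \frac{f^{\prime}(x)-f^{\prime}(0)}{x-0}=\lim _{x \rightarrow 0} \frac{\operatorname{sgn} x}{x}=\lim _{x \rightarrow 0} \frac{1}{|x|}=+\infty.
\]

Thus the second derived function takes the value \(+\infty\) at \(0,\) yet \(f\) does not have two derivatives at \(0\) in the sense just defined, since the required limit for \(k=0\) fails to exist at \(q=0\).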

It is an important fact that differentiability implies continuity.

Theorem 1

If a function \(f : E^{1} \rightarrow E\) is differentiable at a point \(p \in E^{1},\) it is continuous at \(p,\) and \(f(p)\) is finite (even if \(E=E^{*}\)).

Proof

Setting \(\Delta x=x-p\) and \(\Delta f=f(x)-f(p),\) we have the identity

\[
|f(x)-f(p)|=\left|\frac{\Delta f}{\Delta x} \cdot(x-p)\right| \quad \text { for } x \neq p. \tag{2}
\]

By assumption,

\[
f^{\prime}(p)=\lim _{x \rightarrow p} \frac{\Delta f}{\Delta x}
\]

exists and is finite. Thus as \(x \rightarrow p,\) the right side of \((2)\) (hence the left side as well) tends to \(0,\) so

\[
\lim _{x \rightarrow p}|f(x)-f(p)|=0, \text { or } \lim _{x \rightarrow p} f(x)=f(p)
\]

proving continuity at \(p\).

Also, \(f(p) \neq \pm \infty,\) for otherwise \(|f(x)-f(p)|=+\infty\) for all \(x,\) and so \(|f(x)-f(p)|\) cannot tend to \(0 . \quad \square\)

Note 1. Similarly, the existence of a finite left (right) derivative at \(p\) implies left (right) continuity at \(p\). The proof is the same.

Note 2. The existence of an infinite derivative does not imply continuity, nor does it exclude it. For example, consider the two cases

(i) \(f(x)=\frac{1}{x},\) with \(f(0)=0,\) and

(ii) \(f(x)=\sqrt[3]{x}\).

Give your comments for \(p=0\).
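For orientation, the difference quotients at \(p=0\) can be computed directly (a worked sketch of ours; the full discussion is still left to the reader). For \(x \neq 0,\)

\[
\text { (i) } \frac{f(x)-f(0)}{x-0}=\frac{1 / x}{x}=\frac{1}{x^{2}} \rightarrow+\infty, \qquad \text { (ii) } \frac{f(x)-f(0)}{x-0}=\frac{\sqrt[3]{x}}{x}=\frac{1}{\sqrt[3]{x^{2}}} \rightarrow+\infty \quad \text { as } x \rightarrow 0.
\]

So \(f^{\prime}(0)=+\infty\) in both cases, although the function in (i) is discontinuous at \(0,\) while the one in (ii) is continuous there.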

Caution: A function may be continuous on \(E^{1}\) without being differentiable anywhere (thus the converse to Theorem 1 fails). The first such function was indicated by Weierstrass. We give an example due to Olmsted (Advanced Calculus).

Example 1

(a) We first define a sequence of functions \(f_{n} : E^{1} \rightarrow E^{1}\) \((n=1,2, \ldots)\) as follows. For each \(k=0, \pm 1, \pm 2, \ldots,\) let

\[
f_{n}(x)=0 \text { if } x=k \cdot 4^{-n}, \text { and } f_{n}(x)=\frac{1}{2} \cdot 4^{-n} \text { if } x=\left(k+\frac{1}{2}\right) \cdot 4^{-n}.
\]

Between \(k \cdot 4^{-n}\) and \(\left(k \pm \frac{1}{2}\right) \cdot 4^{-n},\) \(f_{n}\) is linear (see Figure 21), so it is continuous on \(E^{1}.\) The series \(\sum f_{n}\) converges uniformly on \(E^{1}.\) (Verify!)

Let

\[
f=\sum_{n=1}^{\infty} f_{n}.
\]

Then \(f\) is continuous on \(E^{1}\) (why?), yet it is nowhere differentiable.

To prove this fact, fix any \(p \in E^{1} .\) For each \(n,\) let

\[
x_{n}=p+d_{n}, \text { where } d_{n}=\pm 4^{-n-1},
\]

choosing the sign of \(d_{n}\) so that \(p\) and \(x_{n}\) are in the same half of a "sawtooth" in the graph of \(f_{n}\) (Figure 21). Then

\[
f_{n}\left(x_{n}\right)-f_{n}(p)=\pm d_{n}=\pm\left(x_{n}-p\right) . \quad(\text { Why } ?)
\]

Also,

\[
f_{m}\left(x_{n}\right)-f_{m}(p)=\pm d_{n} \text { if } m \leq n
\]

but vanishes for \(m>n .\) (Why?)

Thus, when computing \(f\left(x_{n}\right)-f(p),\) we may replace

\[
f=\sum_{m=1}^{\infty} f_{m} \text { by } f=\sum_{m=1}^{n} f_{m}.
\]

Since

\[
\frac{f_{m}\left(x_{n}\right)-f_{m}(p)}{x_{n}-p}=\pm 1 \text { for } m \leq n,
\]

the fraction

\[
\frac{f\left(x_{n}\right)-f(p)}{x_{n}-p}
\]

is an integer, odd if \(n\) is odd and even if \(n\) is even (being a sum of \(n\) terms, each equal to \(\pm 1\)). Thus this fraction cannot tend to a finite limit as \(n \rightarrow \infty,\) i.e., as \(d_{n}=\pm 4^{-n-1} \rightarrow 0\) and \(x_{n}=p+d_{n} \rightarrow p .\) A fortiori, this applies to

\[
\lim _{x \rightarrow p} \frac{f(x)-f(p)}{x-p}.
\]

Thus \(f\) is not differentiable at any \(p\).
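The parity argument can be checked in exact rational arithmetic. The sketch below is our own illustration, not part of the original text. It assumes that \(f_{n}(x)\) equals the distance from \(x\) to the nearest multiple of \(4^{-n}\) (which agrees with the piecewise linear definition above), and the names `sawtooth` and `quotient` are hypothetical.

```python
from fractions import Fraction

def sawtooth(n, x):
    """f_n(x): distance from x to the nearest multiple of 4**-n."""
    period = Fraction(1, 4 ** n)
    r = x % period
    return min(r, period - r)

def quotient(p, n, extra_terms=5):
    """Difference quotient (f(x_n) - f(p)) / (x_n - p) with x_n = p + d_n."""
    half = Fraction(1, 2 * 4 ** n)   # length of one half of a sawtooth of f_n
    d = Fraction(1, 4 ** (n + 1))    # |d_n| = 4**(-n-1)
    j = p // half                    # p lies in [j*half, (j+1)*half)
    if p + d > (j + 1) * half:       # stepping right would leave that half
        d = -d
    x = p + d
    # Terms with m > n contribute 0; a few are included to show this.
    total = sum(sawtooth(m, x) - sawtooth(m, p)
                for m in range(1, n + extra_terms + 1))
    return total / d

p = Fraction(1, 3)                   # p must be a Fraction for exact results
quotients = [quotient(p, n) for n in range(1, 9)]
print(quotients)
```

Each printed quotient is an integer with the same parity as \(n,\) so the sequence alternates between odd and even values and cannot converge to a finite limit. The check covers only the sampled \(p\) and \(n;\) the proof above covers all of them.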

The expressions \(f(x)-f(p)\) and \(x-p,\) briefly denoted \(\Delta f\) and \(\Delta x,\) are called the increments of \(f\) and \(x\) (at \(p\)), respectively. We now show that for differentiable functions, \(\Delta f\) and \(\Delta x\) are "nearly proportional" when \(x\) approaches \(p ;\) that is,

\[
\frac{\Delta f}{\Delta x}=c+\delta(x)
\]

with \(c\) constant and \(\lim _{x \rightarrow p} \delta(x)=0\).

Theorem 2

A function \(f : E^{1} \rightarrow E\) is differentiable at \(p,\) and \(f^{\prime}(p)=c,\) iff there is a finite \(c \in E\) and a function \(\delta : E^{1} \rightarrow E\) such that \(\lim _{x \rightarrow p} \delta(x)=\delta(p)=0,\) and such that

\[
\Delta f=[c+\delta(x)] \Delta x \quad \text { for all } x \in E^{1}. \tag{3}
\]

Proof

If \(f\) is differentiable at \(p,\) put \(c=f^{\prime}(p) .\) Define \(\delta(p)=0\) and

\[
\delta(x)=\frac{\Delta f}{\Delta x}-f^{\prime}(p) \text { for } x \neq p.
\]

Then \(\lim _{x \rightarrow p} \delta(x)=f^{\prime}(p)-f^{\prime}(p)=0=\delta(p) .\) Also, \((3)\) follows.

Conversely, if \((3)\) holds, then

\[
\frac{\Delta f}{\Delta x}=c+\delta(x) \rightarrow c \text { as } x \rightarrow p(\text { since } \delta(x) \rightarrow 0).
\]

Thus by definition,

\[
c=\lim _{x \rightarrow p} \frac{\Delta f}{\Delta x}=f^{\prime}(p) \text { and } f^{\prime}(p)=c \text { is finite. } \square
\]
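As a concrete instance (an illustration of ours, not from the original text), take \(f(x)=x^{2}\) on \(E^{1}\). For any \(p\) and all \(x \in E^{1},\)

\[
\Delta f=x^{2}-p^{2}=(x+p)(x-p)=[2 p+(x-p)] \Delta x,
\]

so the theorem applies with \(c=2 p\) and \(\delta(x)=x-p .\) Indeed, \(\delta(p)=0=\lim _{x \rightarrow p} \delta(x),\) and therefore \(f^{\prime}(p)=2 p\).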

Theorem 3

(chain rule). Let the functions \(g : E^{1} \rightarrow E^{1}\) (real) and \(f : E^{1} \rightarrow E\) (real or not) be differentiable at \(p\) and \(q,\) respectively, where \(q=g(p) .\) Then the composite function \(h=f \circ g\) is differentiable at \(p,\) and

\[
h^{\prime}(p)=f^{\prime}(q) g^{\prime}(p).
\]

Proof

Setting

\[
\Delta h=h(x)-h(p)=f(g(x))-f(g(p))=f(g(x))-f(q),
\]

we must show that

\[
\lim _{x \rightarrow p} \frac{\Delta h}{\Delta x}=f^{\prime}(q) g^{\prime}(p) \neq \pm \infty.
\]

Now as \(f\) is differentiable at \(q,\) Theorem 2 yields a function \(\delta : E^{1} \rightarrow E\) such that \(\lim _{x \rightarrow q} \delta(x)=\delta(q)=0\) and such that

\[
\left(\forall y \in E^{1}\right) \quad f(y)-f(q)=\left[f^{\prime}(q)+\delta(y)\right] \Delta y, \Delta y=y-q.
\]

Taking \(y=g(x),\) we get

\[
\left(\forall x \in E^{1}\right) \quad f(g(x))-f(q)=\left[f^{\prime}(q)+\delta(g(x))\right][g(x)-g(p)],
\]

where

\[
g(x)-g(p)=y-q=\Delta y \text { and } f(g(x))-f(q)=\Delta h,
\]

as noted above. Hence

\[
\frac{\Delta h}{\Delta x}=\left[f^{\prime}(q)+\delta(g(x))\right] \cdot \frac{g(x)-g(p)}{x-p} \quad \text { for all } x \neq p.
\]

Let \(x \rightarrow p .\) Then we obtain \(h^{\prime}(p)=f^{\prime}(q) g^{\prime}(p),\) for, by the continuity of \(\delta \circ g\) at \(p\) (Chapter 4, §2, Theorem 3),

\[
\lim _{x \rightarrow p} \delta(g(x))=\delta(g(p))=\delta(q)=0 . \square
\]
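A quick numerical sanity check of the chain rule can be sketched as follows (our own illustration, not from the original text). The choices \(g(x)=x^{2},\) \(f(y)=\sin y,\) and \(p=0.7\) are arbitrary assumptions, and the symmetric quotient used by the hypothetical helper `central_quotient` only approximates the limit in the definition of the derivative.

```python
import math

def central_quotient(fn, p, step=1e-6):
    """Symmetric difference quotient approximating fn'(p)."""
    return (fn(p + step) - fn(p - step)) / (2 * step)

def g(x):
    return x * x              # g'(p) = 2p

def f(y):
    return math.sin(y)        # f'(q) = cos q

def h(x):
    return f(g(x))            # the composite h = f o g

p = 0.7
q = g(p)
numeric = central_quotient(h, p)          # approximate h'(p)
predicted = math.cos(q) * (2 * p)         # f'(q) * g'(p) from the chain rule

print(numeric, predicted)
```

The two printed values agree to within the accuracy of the finite-difference approximation.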

The proofs of the next two theorems are left to the reader.

Theorem 4

If \(f, g,\) and \(h\) are real or complex and are differentiable at \(p,\) so are

\[
f \pm g, h f, \text { and } \frac{f}{h}
\]

(the latter if \(h(p) \neq 0\)), and at the point \(p\) we have

(i) \((f \pm g)^{\prime}=f^{\prime} \pm g^{\prime}\);

(ii) \((h f)^{\prime}=h f^{\prime}+h^{\prime} f ;\) and

(iii) \(\left(\frac{f}{h}\right)^{\prime}=\frac{h f^{\prime}-h^{\prime} f}{h^{2}}\).

All this holds also if \(f\) and \(g\) are vector valued and \(h\) is scalar valued. It also applies to infinite (even one-sided) derivatives, except when the limits involved become indeterminate (Chapter 4, §4).

Note 3. By induction, if \(f, g,\) and \(h\) are \(n\) times differentiable at a point \(p,\) so are \(f \pm g\) and \(h f,\) and, denoting by \(\left(\begin{array}{l}{n} \\ {k}\end{array}\right)\) the binomial coefficients, we have

(i*) \((f \pm g)^{(n)}=f^{(n)} \pm g^{(n)} ;\) and

(ii*) \((h f)^{(n)}=\sum_{k=0}^{n}\left(\begin{array}{l}{n} \\ {k}\end{array}\right) h^{(k)} f^{(n-k)}\).

Formula (ii*) is known as the Leibniz formula; its proof is analogous to that of the binomial theorem. It is symbolically written as \((h f)^{(n)}=(h+f)^{n},\) with the last term interpreted accordingly.
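The Leibniz formula can be verified exactly for polynomials, which are differentiable any number of times. The sketch below is our own illustration, not from the original text. It stores a polynomial as a list of integer coefficients (constant term first), and all helper names are hypothetical.

```python
from math import comb

def deriv(c):
    """Coefficients of the derivative of the polynomial c."""
    return [k * c[k] for k in range(1, len(c))] or [0]

def nth(c, n):
    """Coefficients of the n-th derivative of c."""
    for _ in range(n):
        c = deriv(c)
    return c

def mul(a, b):
    """Coefficients of the product of polynomials a and b."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def evalp(c, x):
    """Value of the polynomial c at the point x."""
    return sum(ck * x ** k for k, ck in enumerate(c))

def leibniz_value(h, f, n, x):
    """Right side of (ii*) at x: sum of C(n,k) * h^(k)(x) * f^(n-k)(x)."""
    return sum(comb(n, k) * evalp(nth(h, k), x) * evalp(nth(f, n - k), x)
               for k in range(n + 1))

h = [1, 2, 3]        # h(x) = 1 + 2x + 3x^2
f = [0, 1, 0, 4]     # f(x) = x + 4x^3

# Compare both sides of (ii*) for several orders n and integer points x.
agree = all(evalp(nth(mul(h, f), n), x) == leibniz_value(h, f, n, x)
            for n in range(7) for x in range(-3, 4))
print(agree)
```

Because the arithmetic is carried out on integers, the comparison is exact rather than approximate. It confirms the formula only for the sampled polynomials and points, of course.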

Theorem 5

(componentwise differentiation). A function \(f : E^{1} \rightarrow E^{n}\left(^{*} C^{n}\right)\) is differentiable at \(p\) iff each of its \(n\) components \(\left(f_{1}, \ldots, f_{n}\right)\) is, and then

\[
f^{\prime}(p)=\left(f_{1}^{\prime}(p), \ldots, f_{n}^{\prime}(p)\right)=\sum_{k=1}^{n} f_{k}^{\prime}(p) \overline{e}_{k},
\]

with \(\overline{e}_{k}\) as in Theorem 2 of Chapter 3, §§1-3.

In particular, a complex function \(f : E^{1} \rightarrow C\) is differentiable iff its real and imaginary parts are, and \(f^{\prime}=f_{\mathrm{re}}^{\prime}+i \cdot f_{\mathrm{im}}^{\prime}\) (Chapter 4, §3, Note 5).

Example 2

(b) Consider the complex exponential

\[
f(x)=\cos x+i \cdot \sin x=e^{x i} \quad \text {(Chapter 4, §3).}
\]

We assume the derivatives of \(\cos x\) and \(\sin x\) to be known (see Problem 8). By Theorem 5, we have

\[
f^{\prime}(x)=-\sin x+i \cdot \cos x=\cos \left(x+\frac{1}{2} \pi\right)+i \cdot \sin \left(x+\frac{1}{2} \pi\right)=e^{\left(x+\frac{1}{2} \pi\right) i}.
\]

Hence by induction,

\[
f^{(n)}(x)=e^{\left(x+\frac{1}{2} n \pi\right) i}, n=1,2, \ldots .(\text { Verify! })
\]
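The componentwise computation in (b) can be checked numerically. The sketch below is our own illustration, not from the original text. It differentiates the real and imaginary parts separately with a symmetric difference quotient, and it also tests the identity \(i^{n} e^{x i}=e^{\left(x+\frac{1}{2} n \pi\right) i},\) which underlies the induction since \(f^{\prime}=i \cdot f .\) The sample point \(x=0.4\) and the helper names are assumptions.

```python
import cmath
import math

def f(x):
    """f(x) = cos x + i sin x."""
    return complex(math.cos(x), math.sin(x))

def componentwise_derivative(x, step=1e-6):
    """Approximate f'(x) by differentiating real and imaginary parts."""
    re = (math.cos(x + step) - math.cos(x - step)) / (2 * step)
    im = (math.sin(x + step) - math.sin(x - step)) / (2 * step)
    return complex(re, im)

x = 0.4
approx = componentwise_derivative(x)
shifted = cmath.exp(1j * (x + math.pi / 2))   # e^{(x + pi/2) i}

# i^n * f(x) should coincide with e^{(x + n*pi/2) i} for each n.
shifts_ok = all(
    abs((1j ** n) * f(x) - cmath.exp(1j * (x + n * math.pi / 2))) < 1e-12
    for n in range(1, 9)
)
print(abs(approx - shifted), shifts_ok)
```

The first printed number is the small discrepancy between the numerical derivative and \(e^{\left(x+\frac{1}{2} \pi\right) i}\).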

(c) Define \(f : E^{1} \rightarrow E^{3}\) by

\[
f(x)=(1, \cos x, \sin x), \quad x \in E^{1}.
\]

Here Theorem 5 yields

\[
f^{\prime}(p)=(0,-\sin p, \cos p), \quad p \in E^{1}.
\]

For a fixed \(p=p_{0},\) we may consider the line

\[
\overline{x}=\overline{a}+t \vec{u},
\]

where

\[
\overline{a}=f\left(p_{0}\right) \text { and } \vec{u}=f^{\prime}\left(p_{0}\right)=\left(0,-\sin p_{0}, \cos p_{0}\right).
\]

This is, by definition, the tangent vector at \(p_{0}\) to the curve \(f\left[E^{1}\right]\) in \(E^{3}\).

More generally, if \(f : E^{1} \rightarrow E\) is differentiable at \(p\) and continuous on some globe about \(p,\) we define the tangent at \(p\) to the curve \(f\left[G_{p}\right]\) to be the line

\[
\overline{x}=f(p)+t \cdot f^{\prime}(p);
\]

\(f^{\prime}(p)\) is its direction vector in \(E,\) while \(t\) is the variable real parameter. For real functions \(f : E^{1} \rightarrow E^{1},\) we usually consider not \(f\left[E^{1}\right]\) but the curve \(y=f(x)\) in \(E^{2},\) i.e., the set

\[
\left\{(x, y) | y=f(x), x \in E^{1}\right\}.
\]

The tangent to that curve at \(p\) is the line through \((p, f(p))\) with slope \(f^{\prime}(p)\).
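For the curve \(f(x)=(1, \cos x, \sin x)\) considered above, the tangent line \(\overline{x}=f\left(p_{0}\right)+t \cdot f^{\prime}\left(p_{0}\right)\) can be evaluated directly. The sketch below is our own illustration, not from the original text. The point \(p_{0}=\pi / 3,\) the parameter value \(t=0.01,\) and the helper names are arbitrary assumptions.

```python
import math

def f(x):
    """The curve f(x) = (1, cos x, sin x) in E^3."""
    return (1.0, math.cos(x), math.sin(x))

def f_prime(p):
    """Componentwise derivative f'(p) = (0, -sin p, cos p)."""
    return (0.0, -math.sin(p), math.cos(p))

def tangent_point(p0, t):
    """Point f(p0) + t * f'(p0) on the tangent line at p0."""
    a, u = f(p0), f_prime(p0)
    return tuple(ai + t * ui for ai, ui in zip(a, u))

p0 = math.pi / 3
t = 0.01
on_curve = tangent_point(p0, 0.0)                       # equals f(p0)
gap = math.dist(tangent_point(p0, t), f(p0 + t))        # distance to the curve

print(on_curve, gap)
```

At \(t=0\) the line passes through \(f\left(p_{0}\right),\) and for small \(t\) the distance between the tangent point and \(f\left(p_{0}+t\right)\) is much smaller than \(|t|,\) as one expects of a tangent.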

In conclusion, let us note that differentiation (i.e., taking derivatives) is a local limit process at some point \(p .\) Hence (cf. Chapter 4, §1, Note 4) the existence and the value of \(f^{\prime}(p)\) are not affected by restricting \(f\) to some globe \(G_{p}\) about \(p\) or by arbitrarily redefining \(f\) outside \(G_{p} .\) For one-sided derivatives, we may replace \(G_{p}\) by its corresponding "half."