5.6: Differentials. Taylor’s Theorem and Taylor’s Series
Recall (Theorem 2 of §1) that a function \(f\) is differentiable at \(p\) iff
\[\Delta f=f^{\prime}(p) \Delta x+\delta(x) \Delta x,\]
with \(\lim _{x \rightarrow p} \delta(x)=\delta(p)=0.\) It is customary to write \(df\) for \(f^{\prime}(p) \Delta x\) and \(o(\Delta x)\) for \(\delta(x) \Delta x\); \(df\) is called the differential of \(f\) (at \(p\) and \(x\)). Thus
\[\Delta f=d f+o(\Delta x);\]
i.e., \(df\) approximates \(\Delta f\) to within \(o(\Delta x)\).
More generally, given any function \(f : E^{1} \rightarrow E\) and \(p, x \in E^{1},\) we define
\[d^{n} f=d^{n} f(p, x)=f^{(n)}(p)(x-p)^{n}, \quad n=0,1,2, \ldots, \tag{1}\]
where \(f^{(n)}\) is the \(n\)th derived function (Definition 2 in §1); \(d^{n} f\) is called the \(n\)th differential, or differential of order \(n,\) of \(f\) (at \(p\) and \(x\)). In particular, \(d^{1} f=f^{\prime}(p) \Delta x=d f.\) By our conventions, \(d^{n} f\) is always defined, as is \(f^{(n)}\).
As we shall see, good approximations of \(\Delta f\) (suggested by Taylor) can often be obtained by using higher differentials (1), as follows:
\[\Delta f=d f+\frac{d^{2} f}{2 !}+\frac{d^{3} f}{3 !}+\cdots+\frac{d^{n} f}{n !}+R_{n}, \quad n=1,2,3, \ldots, \tag{2}\]
where
\[R_{n}=\Delta f-\sum_{k=1}^{n} \frac{d^{k} f}{k !} \quad \text {(the "remainder term")}\]
is the error of the approximation. Substituting the values of \(\Delta f\) and \(d^{k} f\) and transposing \(f(p),\) we have
\[f(x)=f(p)+\frac{f^{\prime}(p)}{1 !}(x-p)+\frac{f^{\prime \prime}(p)}{2 !}(x-p)^{2}+\cdots+\frac{f^{(n)}(p)}{n !}(x-p)^{n}+R_{n}. \tag{3}\]
Formula (3) is known as the \(n\)th Taylor expansion of \(f\) about \(p\) (with remainder term \(R_{n}\) to be estimated). Usually we treat \(p\) as fixed and \(x\) as variable. Writing \(R_{n}(x)\) for \(R_{n}\) and setting
\[P_{n}(x)=\sum_{k=0}^{n} \frac{f^{(k)}(p)}{k !}(x-p)^{k},\]
we have
\[f(x)=P_{n}(x)+R_{n}(x).\]
The function \(P_{n} : E^{1} \rightarrow E\) so defined is called the \(n\)th Taylor polynomial for \(f\) about \(p.\) Thus (3) yields approximations of \(f\) by polynomials \(P_{n}, n=1,2,3, \ldots.\) This is one way of interpreting it. The other (easy to remember) one is (2), which gives approximations of \(\Delta f\) by the \(d^{k} f.\) It remains, however, to find a good estimate for \(R_{n}.\) We do it next.
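The approximation of \(f\) by \(P_n\) is easy to probe numerically. The following Python sketch is illustrative only (the choice \(f=\exp\), \(p=0\), \(x=0.5\), \(n=5\) is mine, not from the text); it evaluates \(P_n(x)\) from a list of derivatives at \(p\), using the fact that every derivative of \(\exp\) at \(0\) equals \(1\):

```python
import math

def taylor_poly(derivs, p, x):
    """Evaluate P_n(x) = sum_{k=0}^n f^(k)(p)/k! * (x - p)^k,
    given derivs = [f(p), f'(p), ..., f^(n)(p)]."""
    return sum(d / math.factorial(k) * (x - p) ** k
               for k, d in enumerate(derivs))

# For f = exp about p = 0, every derivative at p equals 1.
n, p, x = 5, 0.0, 0.5
approx = taylor_poly([1.0] * (n + 1), p, x)
err = math.exp(x) - approx  # this is the remainder R_n(x)

# Lagrange-form bound on [0, x]: |R_n| <= e^x |x|^(n+1) / (n+1)!
bound = math.exp(x) * abs(x) ** (n + 1) / math.factorial(n + 1)
```

The remainder is positive here (all omitted terms are positive) and falls within the Lagrange bound established below.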
Theorem 1. Let the function \(f : E^{1} \rightarrow E\) and its first \(n\) derived functions be relatively continuous and finite on an interval \(I\) and differentiable on \(I-Q\) (\(Q\) countable). Let \(p, x \in I.\) Then formulas (2) and (3) hold, with
\[R_{n}=\frac{1}{n !} \int_{p}^{x} f^{(n+1)}(t) \cdot(x-t)^{n} d t \quad\left(\text {"integral form of } R_{n} \text{"}\right) \tag{$3'$}\]
and
\[\left|R_{n}\right| \leq M_{n} \frac{|x-p|^{n+1}}{(n+1) !} \text { for some real } M_{n} \leq \sup _{t \in I-Q}\left|f^{(n+1)}(t)\right|. \tag{$3''$}\]
Proof.
By definition, \(R_{n}=f-P_{n},\) or
\[R_{n}=f(x)-f(p)-\sum_{k=1}^{n} f^{(k)}(p) \frac{(x-p)^{k}}{k !}.\]
We use the right side as a "pattern" to define a function \(h : E^{1} \rightarrow E.\) This time, we keep \(x\) fixed \((\text { say }, x=a \in I)\) and replace \(p\) by a variable \(t.\) Thus we set
\[h(t)=f(a)-f(t)-\frac{f^{\prime}(t)}{1 !}(a-t)-\cdots-\frac{f^{(n)}(t)}{n !}(a-t)^{n} \text { for all } t \in E^{1}. \tag{4}\]
Then \(h(p)=R_{n}\) and \(h(a)=0.\) Our assumptions imply that \(h\) is relatively continuous and finite on \(I,\) and differentiable on \(I-Q.\) Differentiating (4), we see that everything cancels out except one term:
\[h^{\prime}(t)=-f^{(n+1)}(t) \frac{(a-t)^{n}}{n !}, \quad t \in I-Q. \quad \text {(Verify!)} \tag{5}\]
Hence by Definitions 1 and 2 of §5,
\[-h(t)=\frac{1}{n !} \int f^{(n+1)}(t)(a-t)^{n} d t \quad \text { on } I\]
and
\[\frac{1}{n !} \int_{p}^{a} f^{(n+1)}(t)(a-t)^{n} d t=-h(a)+h(p)=0+R_{n}=R_{n} \quad\left(\text {for } h(p)=R_{n}\right).\]
As \(x=a,\) (3') is proved.
Next, let
\[M=\sup _{t \in I-Q}\left|f^{(n+1)}(t)\right|.\]
If \(M<+\infty,\) define
\[g(t)=M \frac{(t-a)^{n+1}}{(n+1) !} \text { for } t \geq a \text { and } g(t)=-M \frac{(a-t)^{n+1}}{(n+1) !} \text { for } t \leq a.\]
In both cases,
\[g^{\prime}(t)=M \frac{|a-t|^{n}}{n !} \geq\left|h^{\prime}(t)\right| \text { on } I-Q \text { by (5).}\]
Hence, applying Theorem 1 in §4 to the functions \(h\) and \(g\) on the interval \([a, p]\) (or \([p, a]\)), we get
\[|h(p)-h(a)| \leq|g(p)-g(a)|,\]
or
\[\left|R_{n}-0\right| \leq M \frac{|a-p|^{n+1}}{(n+1) !}.\]
Thus (3") follows, with \(M_{n}=M\).
Finally, if \(M=+\infty,\) we put
\[M_{n}=\left|R_{n}\right| \frac{(n+1) !}{|a-p|^{n+1}}<M. \quad \square\]
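The integral form (3') can be verified numerically for a concrete case. In this illustrative sketch (the choice \(f=\exp\), \(p=0\), \(x=1\), \(n=2\) is mine), a midpoint-rule sum approximates the integral and is compared with the directly computed remainder \(e-P_{2}(1)\):

```python
import math

# Integral form of R_n for f = exp, p = 0, x = 1, n = 2:
#   R_2 = (1/2!) * integral from 0 to 1 of e^t (1 - t)^2 dt,
# compared with the direct value R_2 = e - P_2(1) = e - (1 + 1 + 1/2).
n, p, x = 2, 0.0, 1.0
R_direct = math.e - (1 + 1 + 0.5)

# Midpoint rule with N subintervals approximates the integral.
N = 100_000
h = (x - p) / N
mids = (p + (i + 0.5) * h for i in range(N))
integral = sum(math.exp(t) * (x - t) ** n for t in mids) * h
R_integral = integral / math.factorial(n)
```

The two values agree to within the quadrature error, which for the midpoint rule at this step size is far below \(10^{-8}\).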
For real functions, we obtain some additional estimates of \(R_{n}\).
Theorem 1'. If \(f\) is real and \(n+1\) times differentiable on \(I\), then for \(p \neq x\) \((p, x \in I),\) there are \(q_{n}, q_{n}^{\prime}\) in the interval \((p, x)\) (respectively, \((x, p)\)) such that
\[R_{n}=\frac{f^{(n+1)}\left(q_{n}\right)}{(n+1) !}(x-p)^{n+1} \tag{$5'$}\]
and
\[R_{n}=\frac{f^{(n+1)}\left(q_{n}^{\prime}\right)}{n !}(x-p)\left(x-q_{n}^{\prime}\right)^{n}. \tag{$5''$}\]
(Formulas (5') and (5") are known as the Lagrange and Cauchy forms of \(R_{n},\) respectively.)
Proof.
Exactly as in the proof of Theorem 1, we obtain the function \(h\) and formula (5). By our present assumptions, \(h\) is differentiable (hence continuous) on \(I,\) so we may apply to it Cauchy's law of the mean (Theorem 2 of §2) on the interval \([a, p]\) (or \([p, a]\) if \(p<a\)), where \(a=x \in I\).
For this purpose, we shall associate \(h\) with another suitable function \(g\) (to be specified later). Then by Theorem 2 of §2, there is a real \(q \in(a, p)\) (respectively, \(q \in(p, a)\)) such that
\[g^{\prime}(q)[h(a)-h(p)]=h^{\prime}(q)[g(a)-g(p)].\]
Here by the previous proof, \(h(a)=0, h(p)=R_{n},\) and
\[h^{\prime}(q)=-\frac{f^{(n+1)}(q)}{n !}(a-q)^{n}.\]
Thus
\[g^{\prime}(q) \cdot R_{n}=\frac{f^{(n+1)}(q)}{n !}(a-q)^{n}[g(a)-g(p)]. \tag{6}\]
Now define \(g\) by
\[g(t)=a-t, \quad t \in E^{1}.\]
Then
\[g(a)-g(p)=-(a-p) \text { and } g^{\prime}(q)=-1,\]
so (6) yields (5") (with \(q_{n}^{\prime}=q\) and \(a=x)\).
Similarly, setting \(g(t)=(a-t)^{n+1},\) we obtain (5'). (Verify!) Thus all is proved. \(\quad \square\)
Note 1. In (5') and (5"), the numbers \(q_{n}\) and \(q_{n}^{\prime}\) depend on \(n\) and are different in general \(\left(q_{n} \neq q_{n}^{\prime}\right),\) for they depend on the choice of the function \(g\). Since they are between \(p\) and \(x,\) they may be written as
\[q_{n}=p+\theta_{n}(x-p) \text { and } q_{n}^{\prime}=p+\theta_{n}^{\prime}(x-p),\]
where \(0<\theta_{n}<1\) and \(0<\theta_{n}^{\prime}<1.\) (Explain!)
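The Lagrange form (5') can be checked by solving for \(q_n\) explicitly in a case where \(f^{(n+1)}\) is invertible. In this illustrative sketch (the choice \(f=\exp\), \(p=0\), \(x=1\), \(n=2\) is mine), \(f^{(n+1)}=\exp\), so the equation \(R_{2}=e^{q}/3!\) determines \(q\), which must land in \((0,1)\); here \(\theta_{2}=q\) as well, since \(p=0\) and \(x-p=1\):

```python
import math

# Lagrange form for f = exp, p = 0, x = 1, n = 2:
#   R_2 = f'''(q)/3! * (x - p)^3 = e^q / 6   for some q in (0, 1).
R2 = math.e - (1 + 1 + 0.5)   # direct remainder e - P_2(1)
q = math.log(6 * R2)          # solve e^q / 6 = R_2 for q
theta = q                     # q = p + theta*(x - p), so theta_2 = q here
```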
Note 2. For any function \(f : E^{1} \rightarrow E,\) the Taylor polynomials \(P_{n}\) are partial sums of a power series, called the Taylor series for \(f\) (about \(p\)). We say that \(f\) admits such a series on a set \(B\) iff the series converges to \(f\) on \(B;\) i.e.,
\[f(x)=\lim _{n \rightarrow \infty} P_{n}(x)=\sum_{n=0}^{\infty} \frac{f^{(n)}(p)}{n !}(x-p)^{n} \neq \pm \infty \text { for } x \in B. \tag{7}\]
This is clearly the case iff
\[\lim _{n \rightarrow \infty} R_{n}(x)=\lim _{n \rightarrow \infty}\left[f(x)-P_{n}(x)\right]=0 \text { for } x \in B;\]
briefly, \(R_{n} \rightarrow 0.\) Thus
\[f \text { admits a Taylor series (about p) iff } R_{n} \rightarrow 0.\]
Caution: The convergence of the series alone (be it pointwise or uniform) does not suffice. Sometimes the series converges to a sum other than \(f(x);\) then (7) fails. Thus all depends on the necessary and sufficient condition: \(R_{n} \rightarrow 0\).
Definition. We say that \(f\) is of class \(\mathrm{CD}^{n}\), or continuously differentiable \(n\) times, on a set \(B\) iff \(f\) is \(n\) times differentiable on \(B,\) and \(f^{(n)}\) is relatively continuous on \(B.\) Notation: \(f \in \mathrm{CD}^{n}\) (on \(B\)).
If this holds for each \(n \in N,\) we say that \(f\) is infinitely differentiable on \(B\) and write \(f \in \mathrm{CD}^{\infty}\) (on \(B\)).
The notation \(f \in \mathrm{CD}^{0}\) means that \(f\) is finite and relatively continuous (all on \(B\)).
Examples.
(a) Let
\[f(x)=e^{x} \text { on } E^{1}.\]
Then
\[(\forall n) \quad f^{(n)}(x)=e^{x},\]
so \(f \in \mathrm{CD}^{\infty}\) on \(E^{1}.\) At \(p=0, f^{(n)}(p)=1,\) so we obtain by Theorem 1' (using (5') and Note 1)
\[e^{x}=1+\frac{x}{1 !}+\frac{x^{2}}{2 !}+\cdots+\frac{x^{n}}{n !}+\frac{e^{\theta_{n} x}}{(n+1) !} x^{n+1}, \quad 0<\theta_{n}<1.\]
Thus on an interval \([-a, a]\),
\[e^{x} \approx 1+\frac{x}{1 !}+\frac{x^{2}}{2 !}+\cdots+\frac{x^{n}}{n !}\]
to within an error \(R_{n}(>0 \text { if } x>0)\) with
\[\left|R_{n}\right|<e^{a} \frac{a^{n+1}}{(n+1) !},\]
which tends to 0 as \(n \rightarrow+\infty.\) For \(a=1=x,\) we get
\[e=1+\frac{1}{1 !}+\frac{1}{2 !}+\cdots+\frac{1}{n !}+R_{n} \text { with } 0<R_{n}<\frac{e^{1}}{(n+1) !}.\]
Taking \(n=10,\) we have
\[e \approx 2.7182818 | 011463845 \ldots\]
with a nonnegative error of no more than
\[\frac{e}{11 !}=0.00000006809869 \ldots;\]
all digits are correct before the vertical bar.
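The computation of \(e\) above is easy to reproduce. This sketch sums \(1/k!\) for \(k=0,\dots,10\) and checks the remainder against the bound \(e/11!\):

```python
import math

# P_10(1) = sum_{k=0}^{10} 1/k! approximates e from below,
# with 0 < R_10 < e/11!  (x = a = 1, n = 10, as in the text).
s = sum(1 / math.factorial(k) for k in range(11))
R10 = math.e - s
bound = math.e / math.factorial(11)
```

The partial sum reproduces the digits 2.7182818011463845 displayed above, and the error is about \(2.7 \times 10^{-8}\), below the stated bound.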
(b) Let
\[f(x)=e^{-1 / x^{2}} \text { with } f(0)=0.\]
As \(\lim _{x \rightarrow 0} f(x)=0=f(0), f\) is continuous at \(0.\) We now show that \(f \in \mathrm{CD}^{\infty}\) on \(E^{1}.\)
For \(x \neq 0,\) this is clear; moreover, induction yields
\[f^{(n)}(x)=e^{-1 / x^{2}} x^{-3 n} S_{n}(x),\]
where \(S_{n}\) is a polynomial in \(x\) of degree \(2(n-1)\) (this is all we need know about \(S_{n}\)). A repeated application of L'Hôpital's rule then shows that
\[\lim _{x \rightarrow 0} f^{(n)}(x)=0 \text { for each } n.\]
To find \(f^{\prime}(0),\) we have to use the definition of a derivative:
\[f^{\prime}(0)=\lim _{x \rightarrow 0} \frac{f(x)-f(0)}{x-0},\]
or by L'Hôpital's rule,
\[f^{\prime}(0)=\lim _{x \rightarrow 0} \frac{f^{\prime}(x)}{1}=0.\]
Using induction again, we get
\[f^{(n)}(0)=0, \quad n=1,2, \ldots.\]
Thus, indeed, \(f\) has finite derivatives of all orders at each \(x \in E^{1},\) including \(x=0,\) so \(f \in \mathrm{CD}^{\infty}\) on \(E^{1},\) as claimed.
Nevertheless, any attempt to use formula (3) at \(p=0\) yields nothing. As all \(f^{(n)}\) vanish at \(0,\) so do all terms except \(R_{n}.\) Thus no approximation by polynomials results; we only get \(P_{n}=0\) on \(E^{1}\) and \(R_{n}(x)=e^{-1 / x^{2}}\) for \(x \neq 0.\) Here \(R_{n}\) does not tend to 0 except at \(x=0,\) so \(f\) admits no Taylor series about 0 (except on \(B=\{0\}\)).
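The failure can be seen numerically. In the sketch below (the sample point \(x=0.5\) is an arbitrary illustrative choice), every Taylor polynomial of \(f\) about \(0\) is identically \(0\), so the remainder equals \(f(x)\) itself and stays away from \(0\):

```python
import math

def f(x):
    """f(x) = exp(-1/x^2) for x != 0, with f(0) = 0."""
    return math.exp(-1.0 / x**2) if x != 0 else 0.0

# Every Taylor coefficient of f about p = 0 vanishes, so P_n = 0
# identically and R_n(x) = f(x) for all n.
x = 0.5
P_n = 0.0            # the n-th Taylor polynomial about 0, for any n
R_n = f(x) - P_n     # = exp(-4) > 0, independent of n
```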
Taylor's theorem also yields sufficient conditions for maxima and minima, as we see in the following theorem.
Theorem 2. Let \(f : E^{1} \rightarrow E^{*}\) be of class \(\mathrm{CD}^{n}\) on \(G_{p}(\delta)\) for an even number \(n \geq 2,\) and let
\[f^{(k)}(p)=0 \text { for } k=1,2, \ldots, n-1,\]
while
\[f^{(n)}(p)<0 \text { (respectively, } f^{(n)}(p)>0).\]
Then \(f(p)\) is the maximum (respectively, minimum) value of \(f\) on some \(G_{p}(\varepsilon),\) \(\varepsilon \leq \delta.\)
If, however, these conditions hold for some odd \(n \geq 1\) (i.e., the first nonvanishing derivative at \(p\) is of odd order), \(f\) has no maximum or minimum at \(p.\)
Proof.
As
\[f^{(k)}(p)=0, \quad k=1,2, \ldots, n-1,\]
Theorem 1' (with \(n\) replaced by \(n-1\)) yields
\[f(x)=f(p)+f^{(n)}\left(q_{n}\right) \frac{(x-p)^{n}}{n !} \quad \text { for all } x \in G_{p}(\delta),\]
with \(q_{n}\) between \(x\) and \(p\).
Also, as \(f \in \mathrm{CD}^{n}, f^{(n)}\) is continuous at \(p.\) Thus if \(f^{(n)}(p)<0,\) then \(f^{(n)}<0\) on some \(G_{p}(\varepsilon), 0<\varepsilon \leq \delta.\) However, \(x \in G_{p}(\varepsilon)\) implies \(q_{n} \in G_{p}(\varepsilon),\) so
\[f^{(n)}\left(q_{n}\right)<0,\]
while
\[(x-p)^{n} \geq 0 \text { if } n \text { is even.}\]
It follows that
\[f^{(n)}\left(q_{n}\right) \frac{(x-p)^{n}}{n !} \leq 0,\]
and so
\[f(x)=f(p)+f^{(n)}\left(q_{n}\right) \frac{(x-p)^{n}}{n !} \leq f(p) \quad \text { for } x \in G_{p}(\varepsilon),\]
i.e., \(f(p)\) is the maximum value of \(f\) on \(G_{p}(\varepsilon),\) as claimed.
Similarly, in the case \(f^{(n)}(p)>0,\) a minimum would result.
If, however, \(n\) is odd, then \((x-p)^{n}\) is negative for \(x<p\) but positive for \(x>p.\) The same argument then shows that \(f(x)<f(p)\) on one side of \(p\) and \(f(x)>f(p)\) on the other side; thus no local maximum or minimum can exist at \(p.\) This completes the proof. \(\quad \square\)
Examples.
(a') Let
\[f(x)=x^{2} \text { on } E^{1} \text { and } p=0.\]
Then
\[f^{\prime}(x)=2 x \text { and } f^{\prime \prime}(x)=2>0,\]
so
\[f^{\prime}(0)=0 \text { and } f^{\prime \prime}(0)=2>0.\]
By Theorem 2, \(f(p)=0^{2}=0\) is a minimum value.
It turns out to be a minimum on all of \(E^{1}\). Indeed, \(f^{\prime}(x)>0\) for \(x>0\), and \(f^{\prime}<0\) for \(x<0,\) so \(f\) strictly decreases on \((-\infty, 0)\) and increases on \((0,+\infty).\)
Actually, even without using Theorem 2, the last argument yields the answer.
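The dichotomy of Theorem 2 (even versus odd order of the first nonvanishing derivative) can be illustrated numerically with \(x^{4}\) and \(x^{3}\) at \(p=0\); the sample functions and grid below are my own illustrative choices, not from the text:

```python
# Theorem 2 dichotomy at p = 0, sampled on a small grid around p.
xs = [k / 1000 for k in range(-100, 101) if k != 0]

# f(x) = x^4: f', f'', f''' vanish at 0 and f''''(0) = 24 > 0
# (even n = 4), so f(0) = 0 is a local minimum.
min_holds = all(x**4 > 0 for x in xs)

# g(x) = x^3: the first nonvanishing derivative g'''(0) = 6 has odd
# order, so g(0) = 0 is neither a maximum nor a minimum.
sign_change = any(x**3 < 0 for x in xs) and any(x**3 > 0 for x in xs)
```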
(b') Let
\[f(x)=\ln x \text { on }(0,+\infty).\]
Then
\[f^{\prime}(x)=\frac{1}{x}>0 \text { on all of }(0,+\infty).\]
This shows that \(f\) strictly increases everywhere and hence can have no maximum or minimum anywhere. The same follows by the second part of Theorem 2, with \(n=1\).
(b") In Example (b'), consider also
\[f^{\prime \prime}(x)=-\frac{1}{x^{2}}<0.\]
In this case, \(f^{\prime \prime}\) has no bearing on the existence of a maximum or minimum because \(f^{\prime} \neq 0.\) Still, the formula \(f^{\prime \prime}<0\) does have a certain meaning. In fact, if \(f^{\prime \prime}(p)<0\) and \(f \in \mathrm{CD}^{2}\) on \(G_{p}(\delta),\) then (using the same argument as in Theorem 2) the reader will easily find that
\[f(x) \leq f(p)+f^{\prime}(p)(x-p) \quad \text { for } x \text { in some } G_{p}(\varepsilon), 0<\varepsilon \leq \delta.\]
Since \(y=f(p)+f^{\prime}(p)(x-p)\) is the equation of the tangent at \(p\), it follows that \(f(x) \leq y;\) i.e., near \(p\) the curve lies below the tangent at \(p.\)
Similarly, \(f^{\prime \prime}(p)>0\) and \(f \in \mathrm{CD}^{2}\) on \(G_{p}(\delta)\) implies that the curve near \(p\) lies above the tangent.
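The tangent-line inequality can be checked for \(f=\ln\) at \(p=1\), where \(f^{\prime\prime}<0\): the tangent there is \(y=x-1\), and indeed \(\ln x \leq x-1\) (here sampled at a few illustrative points near \(p\); for \(\ln\) the inequality in fact holds on all of \((0,+\infty)\)):

```python
import math

# f = ln has f''(x) = -1/x^2 < 0; the tangent at p = 1 is y = x - 1.
# Near p the curve lies below this tangent.
def tangent(x, p=1.0):
    return math.log(p) + (1.0 / p) * (x - p)   # = x - 1 for p = 1

samples = [0.5, 0.9, 0.99, 1.01, 1.1, 1.5]
gaps = [tangent(x) - math.log(x) for x in samples]  # each should be > 0
```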