1.1: Introduction to Derivatives


The Derivative

Introduction

Calculus can be thought of as the analysis of curved shapes.¹ Its development grew out of attempts to solve physical problems. For example, suppose that an object at rest 100 ft above the ground is dropped. Ignoring air resistance and wind, the object will fall straight down until it hits the ground (see Figure [fig:fall](a)). As will be proved later, $t$ seconds after being dropped the object will be $s = s(t) = -16t^2 + 100$ ft above the ground. The object will thus hit the ground after 2.5 seconds (when $s = 0$). While the object’s path is a straight line, the graph of its position $s$ above the ground as a function of time $t$ is curved, part of a parabola (see Figure [fig:fall](b)).

How fast is the object moving before it hits the ground? This is where calculus comes in. The solution, presented now, will motivate much of this chapter.

First, the object travels 100 ft in 2.5 seconds, so its average speed in that time is

\[\frac{\text{distance traveled}}{\text{time elapsed}} ~=~ \frac{100 \text{ ft}}{2.5 \text{ seconds}} ~=~ 40 \text{ ft/s,} \nonumber \]

and its average velocity in that time is

\[\frac{\text{change in position}}{\text{change in time}} ~=~ \frac{\text{final position} ~-~ \text{initial position}} {\text{end time} ~-~ \text{start time}} ~=~ \frac{0 \text{ ft} ~-~ 100 \text{ ft}}{2.5 \text{ sec} ~-~ 0 \text{ sec}} ~=~ -40 \text{ ft/s.} \nonumber \]

Unlike speed, velocity takes direction into account. Thus, the object’s downward motion means it has negative velocity. Positive velocity implies upward motion.
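The two averages above can be checked numerically. The following is a minimal sketch, assuming the position function $s(t) = -16t^2 + 100$ from the text; the function names `position`, `average_velocity`, and `average_speed` are illustrative choices, not notation from the text.

```python
def position(t):
    """Height in feet, t seconds after release: s(t) = -16 t^2 + 100."""
    return -16.0 * t**2 + 100.0


def average_velocity(s, t_start, t_end):
    """Change in position divided by change in time (signed)."""
    return (s(t_end) - s(t_start)) / (t_end - t_start)


def average_speed(distance, elapsed):
    """Distance traveled divided by time elapsed (never negative)."""
    return distance / elapsed


fall_time = 2.5  # seconds until s = 0

v_avg = average_velocity(position, 0.0, fall_time)

# The object only moves downward, so here the distance traveled equals
# the absolute value of the change in position.
distance = abs(position(fall_time) - position(0.0))
speed_avg = average_speed(distance, fall_time)

print(v_avg, speed_avg)
```

The sign of `v_avg` records the downward direction, while `speed_avg` carries no direction.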

Using the idea of average velocity over an interval of time, there is a natural way to define the object’s instantaneous velocity at a particular instant of time $t$:

Find the average velocity over an interval of time.
Let the interval become smaller and smaller indefinitely, shrinking to a point $t$. If the average velocity over that smaller and smaller interval approaches some value, call that value the instantaneous velocity at time $t$.

Figure [fig:instvel] below shows how to choose the interval: for any time $t$ between 0 and 2.5, use the interval $[t,\,t+\Delta t]$, where $\Delta t$ (pronounced “delta t”) is a small positive number. So $\Delta t$ is the change in time over the interval; denote by $\Delta s$ the change in the position $s$ over that interval.

The average velocity of the object over the interval $[t,\,t+\Delta t]$ is $\frac{\Delta s}{\Delta t}$, so since $s(t) = -16t^2 + 100$:

\[\begin{aligned} \dfrac{\Delta s}{\Delta t} ~~&=~~ \dfrac{s(t + \Delta t) ~-~ s(t)} {\Delta t}\\[8pt]
&=~~ \dfrac{-16(t+\Delta t)^2 ~+~ 100 ~-~ (-16t^2 ~+~ 100)}{\Delta t}\\[8pt]
&=~~ \dfrac{-16t^2 ~-~ 32t\Delta t ~-~ 16(\Delta t)^2 ~+~ 100 ~+~ 16t^2 ~-~ 100}{\Delta t}\\[8pt]
&=~~ \dfrac{-32t\Delta t ~-~ 16(\Delta t)^2}{\Delta t} ~~=~~ \dfrac{\cancel{\Delta t} \,(-32t ~-~ 16\Delta t)} {\cancel{\Delta t}}\\[6pt]
&=~~ -32t ~-~ 16\Delta t ~.\end{aligned} \nonumber \]

Now let the interval $[t,\,t+\Delta t]$ get smaller and smaller indefinitely; that is, let $\Delta t$ get closer and closer to 0. Then the average velocity $\frac{\Delta s}{\Delta t} = -32t - 16\Delta t\,$ gets closer and closer to $-32t - 0 = -32t$. Thus, the object has instantaneous velocity $-32t$ at time $t$. This calculation can be interpreted as taking the limit of $\frac{\Delta s}{\Delta t}\,$ as $\Delta t\,$ approaches $0$, written as follows:

\[\begin{aligned} \text{instantaneous velocity at $t$} ~~&=~~ \text{limit of average velocity over $[t,\,t+\Delta t]$ as $\Delta t$ approaches 0}\\[6pt]
&=~~ \lim_{\Delta t \to 0} ~\frac{\Delta s}{\Delta t}\\[8pt]
&=~~ \lim_{\Delta t \to 0} ~(-32t ~-~ 16\Delta t)\\[6pt]
&=~~ -32t - 16(0)\\[6pt]
&=~~ -32t\end{aligned} \nonumber \]

Notice that $\Delta t$ is not replaced by $0$ in the ratio $\frac{\Delta s}{\Delta t}$ until after doing as much cancellation as possible. Notice also that the instantaneous velocity of the object varies with $t$, as it should (why?). In particular, at the instant when the object hits the ground at time $t = 2.5$ sec, the instantaneous velocity is $-32(2.5) = -80$ ft/s.
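The shrinking-interval idea can also be observed numerically. The sketch below assumes the same position function $s(t) = -16t^2 + 100$ and evaluates the average velocity at $t = 2.5$ for a few decreasing values of $\Delta t$; the name `average_velocity` and the particular values of `dt` are illustrative choices.

```python
def position(t):
    """Height in feet, t seconds after release: s(t) = -16 t^2 + 100."""
    return -16.0 * t**2 + 100.0


def average_velocity(t, dt):
    """Average velocity over the interval [t, t + dt], i.e. delta s / delta t."""
    return (position(t + dt) - position(t)) / dt


t = 2.5
for dt in (1.0, 0.1, 0.01, 0.001):
    # Algebraically each value equals -32*t - 16*dt, so the printed
    # numbers should move toward -32*t = -80 as dt shrinks.
    print(dt, average_velocity(t, dt))
```

Because of floating-point rounding, the printed values agree with $-32t - 16\Delta t$ only approximately, and making $\Delta t$ extremely small would eventually make the rounding error dominate.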

If this makes sense so far, then you understand the essential idea of what a limit is and how to calculate one. The instantaneous velocity $v(t) = -32t$ is called the derivative of the position function $s(t) =-16t^2 + 100$. Calculating derivatives, analyzing their properties, and using them to solve various problems are part of differential calculus.

What does this have to do with curved shapes? Instantaneous velocity is a special case of an instantaneous rate of change of a function; in this case the instantaneous rate of change of the position (height above the ground) of the object. Similar to how the rate of change of a line is its slope, the instantaneous rate of change of a general curve represents the slope of the curve. For example, the parabola $s(t) = -16t^2 + 100$ has slope $-32t$ for all $t$. Note that the slope of this curve varies (as a function of $t$), unlike the slope of a straight line.

Finding the area inside curved regions is another type of problem that calculus can solve. The basic idea is to use simpler regions (rectangles) whose areas are known, then use those to approximate the area inside the curved region. One such method is to draw more and more rectangles of diminishing widths inside the curved region,² so that the sums of their areas approach the area of the curved region. Figure [fig:area] shows an example with four rectangles to approximate the area under a curve $y=f(x)$ over an interval $[a,b]$ on which $f(x) \ge 0$.

The limit of these sums of rectangular areas is called an integral. The study and application of integrals are part of integral calculus. Perhaps the most remarkable result in calculus is that there is a connection between derivatives and integrals: the Fundamental Theorem of Calculus, discovered in the 17th century, independently, by the two men who invented calculus as we know it, the English physicist, astronomer and mathematician Isaac Newton (1642-1727) and the German mathematician and philosopher Gottfried Wilhelm von Leibniz (1646-1716).
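The rectangle method can be sketched in a few lines. The example below is an illustration under stated assumptions: it uses the curve $y = x^2$ on the interval $[0, 1]$ (chosen here for convenience, not taken from the text), and it takes each rectangle's height from the left endpoint of its base, which keeps the rectangles inside the region only because this function is nonnegative and increasing on the interval. The area under this particular curve is known to be $1/3$.

```python
def left_rectangle_sum(f, a, b, n):
    """Sum of the areas of n equal-width rectangles on [a, b].

    Each rectangle's height is f at the left endpoint of its base.
    For a nonnegative, increasing f these rectangles lie inside the
    region under the curve.
    """
    width = (b - a) / n
    return sum(f(a + i * width) * width for i in range(n))


def square(x):
    return x * x


for n in (4, 10, 100, 1000):
    # The sums increase toward 1/3 as the rectangles get narrower.
    print(n, left_rectangle_sum(square, 0.0, 1.0, n))
```

With four rectangles the sum is well below $1/3$; increasing `n` narrows the gap, which is the behavior the limit of these sums captures.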

Calculus makes extensive use of infinite sequences and series. An infinite series is just a sum of an infinite number of terms. For example, it will be shown later in the text that

\[\label{eqn:piseries} \frac{\pi}{4} ~~=~~ 1 ~-~ \frac{1}{3} ~+~ \frac{1}{5} ~-~ \frac{1}{7} ~+~ \frac{1}{9} ~-~ \cdots ~, \]

where the sum on the right involves an infinite number of terms. A power series is a particular type of infinite series applied to functions; it can be thought of as a polynomial of infinite degree. For example, the trigonometric function $\sin\;x$ does not appear to be a polynomial. But it turns out that $\sin\;x$ has a power series representation as

\[\label{eqn:sinseries} \sin\,x ~~=~~ x ~-~ \frac{x^3}{3!} ~+~ \frac{x^5}{5!} ~-~ \frac{x^7}{7!} ~+~ \frac{x^9}{9!} ~-~ \cdots ~, \]

where again the sum continues infinitely, and the formula holds for all $x$ (in radians).
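To see how a finite piece of the power series behaves, the sketch below adds up the first few terms of the series for $\sin\,x$ and compares the result with Python's built-in `math.sin`. The function name `sin_partial_sum` and the sample inputs are illustrative choices.

```python
import math


def sin_partial_sum(x, terms):
    """Sum of the first `terms` terms of x - x^3/3! + x^5/5! - ... (x in radians)."""
    total = 0.0
    for k in range(terms):
        power = 2 * k + 1  # exponents 1, 3, 5, ...
        total += (-1) ** k * x**power / math.factorial(power)
    return total


x = 1.0
for terms in (1, 2, 3, 5):
    approx = sin_partial_sum(x, terms)
    # The difference from math.sin(x) shrinks as more terms are included.
    print(terms, approx, abs(approx - math.sin(x)))
```

For values of $x$ far from 0, more terms are needed before the partial sums settle near $\sin\,x$.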

The idea of replacing a function by its power series played an important role throughout the development of calculus, and is a powerful technique in many applications.

All the functions in this text will be functions of a single real variable—that is, the values that the variable can take are real numbers. Below is some standard notation for commonly-used sets of numbers:

\[\begin{aligned} \mathbb{N} ~~&=~~ \text{the set of all \textbf{natural} numbers, i.e. the set of nonnegative integers: } 0, 1, 2, 3, 4, \ldots\\[6pt]
\mathbb{Z} ~~&=~~ \text{the set of all integers: } 0, \pm 1, \pm 2, \pm 3, \pm 4, \ldots\\[6pt]
\mathbb{Q} ~~&=~~ \text{the set of all \textbf{rational} numbers $\frac{m}{n}$, where $m$ and $n$ are integers, with $n \ne 0$}\\[6pt]
\mathbb{R} ~~&=~~ \text{the set of all real numbers}\end{aligned} \nonumber \]

Note that $\mathbb{N} ~~\subset~~ \mathbb{Z} ~~\subset~~ \mathbb{Q} ~~\subset~~ \mathbb{R}$.

The set of real numbers consists of the rational numbers together with numbers that are not rational, called irrational numbers. For example, $\sqrt{2}$ is irrational. That is, 2 is not the square of a rational number. In fact, if the square of a rational number $q$ were an integer, then $q$ itself would have to be an integer: write $q$ as $m/n$, where $m$ and $n$ are positive integers with no common positive integer divisors other than 1. Since $q^2 = m^2/n^2$ simply duplicates the integer divisors of $m$ and $n$, then $q^2$ can be an integer only if $n=1$, i.e. $q$ is an integer. Clearly 2 is not the square of an integer, and thus it cannot be the square of a rational number. This argument also shows that $\sqrt{3}$, $\sqrt{5}$, $\sqrt{6}$, $\sqrt{7}$, $\sqrt{8}$, $\sqrt{10}$, and so on, are irrational.³

It turns out that there are far more irrational numbers (and hence real numbers) than rational numbers. In fact, whereas the rational numbers can be listed in a sequence (i.e. first, second, third, etc.), the set of real numbers cannot.⁴ For example, in the closed interval $[0,1]$ there is no “next” real number after the number $0$. Thus, some infinite sets are larger than others: $\mathbb{R}$ is larger than $\mathbb{Q}$. Intervals such as $[0,1]$ or $\mathbb{R}$ itself are examples of a continuum of objects, i.e. no gaps exist.⁵ A famous problem in mathematics, the Continuum Hypothesis, asks whether an infinite set exists that is larger in size than $\mathbb{Q}$ but smaller than $\mathbb{R}$; it has been shown that this question cannot be settled using the standard axioms of set theory.

Infinity is an important notion in calculus. Whether it is the idea of infinitely large or infinitesimally small, calculus attempts to give the idea some mathematical meaning (typically by way of limits).⁶ The mathematical use of infinity has been a subject of philosophical debate.⁷

Though several centuries old, calculus was the beginning of modern mathematics. Classical mathematics (e.g. algebra, geometry, trigonometry), whose origins date back to the ancient Babylonians, Egyptians, and Greeks, was concerned mostly with the study of static quantities. Calculus produced a way to analyze dynamic (i.e. changing) quantities. The period from the 17th through the 19th century also saw revolutionary advances in physics, chemistry, biology and other sciences. The birth of calculus was one part of that qualitative leap.

For Exercises 1-4, suppose that an object moves in a straight line such that its position $s$ after time $t$ is the given function $s=s(t)$. Find the instantaneous velocity of the object at a general time $t \ge 0$. You should mimic the earlier example for the instantaneous velocity when $s = -16t^2 + 100$.

$s = t^2$

$s = 9.8t^2$

$s = -16t^2 + 2t$

$s = t^3$

By equation ([eqn:piseries]), $\pi ~=~ 4\,\left(1 ~-~ \frac{1}{3} ~+~ \frac{1}{5} ~-~ \frac{1}{7} ~+~ \frac{1}{9} ~-~ \cdots ~\right)$, where the $n^{th}$ term in the sum inside the parentheses is $\frac{(-1)^{n+1}}{2n-1}$ (starting at $n=1$).⁸ So the first approximation of $\pi$ using this formula is $\pi \approx 4\,(1) = 4.0$, and the second approximation is $\pi \approx 4\,\left(1 - \frac{1}{3}\right) = 8/3 \approx 2.66667$. Continue like this until two consecutive approximations have $3$ as the first digit before the decimal point. How many terms in the sum did this require? Be careful with rounding off in the approximations.
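Hand computations for this exercise can be checked with a short helper. The sketch below evaluates the approximation of $\pi$ obtained from the first `num_terms` terms of the sum, using the $n^{th}$-term formula $\frac{(-1)^{n+1}}{2n-1}$ given above; the function name `pi_approximation` is an illustrative choice.

```python
def pi_approximation(num_terms):
    """4 times the sum of the first num_terms terms (-1)^(n+1) / (2n - 1), n = 1, 2, ..."""
    partial = sum((-1) ** (n + 1) / (2 * n - 1) for n in range(1, num_terms + 1))
    return 4 * partial


# The first two approximations match the values worked out in the exercise.
print(pi_approximation(1))
print(pi_approximation(2))
```

Calling `pi_approximation` with successively larger arguments shows the approximations alternating above and below $\pi$.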

In elementary geometry you learned that the area inside a circle of radius $r>0$ is $\pi r^2$ (that formula will be proved later in the text). So in particular, let $C$ be a circle of radius $1$. Then the area inside $C$ is $\pi$. That area can be approximated by Eudoxus’ method of exhaustion.⁹ The idea is to inscribe regular polygons inside the circle, i.e. the vertices of the polygons touch $C$. Recall from geometry that a polygon is regular if its sides are of equal length. By increasing the number of sides of the polygons, the areas inside the polygons will approach the area ($\pi$) of $C$. This was an early attempt at using what is now called a limit.¹⁰

Inscribe a square inside $C$, as in Figure [fig:insquare]. Show that the area inside the square is $2$. This is a poor approximation of $\pi = 3.14159265...$, obviously.
Inscribe a regular hexagon ($6$-sided) inside $C$, as in Figure [fig:inhexagon]. Show that the area inside the hexagon is $\frac{3\,\sqrt{3}}{2} \approx 2.59807621$. This is a slightly better—though still poor—approximation of $\pi$.
Inscribe a regular dodecagon ($12$-sided) inside $C$. Show that the area inside the dodecagon is $3$. It thus takes $12$ sides for the approximation to get the first digit of $\pi$ correct.
Inscribe a regular $100$-sided polygon inside $C$. Show that the area inside this polygon is approximately $3.13952598$. This is getting closer to $\pi$.
Show that the general formula for the area inside a regular $n$-sided polygon inscribed inside $C$ is $\dfrac{n}{2}\,\sin\,\left( \dfrac{360^\circ} {n}\right)$. (Hint: The double-angle identity $\sin\,2\theta = 2\,\sin\,\theta\;\cos\,\theta$ might help.)
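The numerical values quoted in these parts can be checked against the general formula. The sketch below simply evaluates $\frac{n}{2}\,\sin\left(\frac{360^\circ}{n}\right)$ for several values of $n$; it is a numerical check, not a substitute for the requested derivations, and the function name `inscribed_polygon_area` is an illustrative choice.

```python
import math


def inscribed_polygon_area(n):
    """Area of a regular n-sided polygon inscribed in a circle of radius 1."""
    central_angle = math.radians(360.0 / n)  # angle at the center for one side
    return (n / 2) * math.sin(central_angle)


for n in (4, 6, 12, 100, 1000):
    # The areas increase toward pi as the number of sides grows.
    print(n, inscribed_polygon_area(n))
```

The printed values stay below $\pi$ for every $n$, since each polygon lies inside the circle.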


What is the flaw in the following “proof” that $\pi = 4$?

Step 1: Draw a square around a circle of diameter $d = 1$. The circumference of the circle is thus $\pi\,d = \pi$, and the perimeter of the square is 4.

Step 2: Remove corners from the square as shown in the picture on the right, so that four new corners touch the circle. Notice that the perimeter of the resulting polygon is still 4, since the lengths of the removed corner pieces are duplicated in the new polygon, so that the lengths of all the vertical sides add up to 2 while the lengths of all the horizontal sides add up to 2.

Step 3: Remove corners from the polygon in Step 2, as shown in the picture on the right, so that eight new corners touch the circle. The perimeter of the resulting polygon is again still 4.

Step 4: Continue this procedure indefinitely, with each successive polygon still having a perimeter of 4 and becoming increasingly indistinguishable from the circle. Since the perimeters of the polygons always equal 4 and approach the circle’s circumference ($\pi$), then $\pi$ must equal 4.

An infinite set is countable if its members can be put into a one-to-one correspondence with the members of $\mathbb{N}$, the set of natural numbers ($0, 1, 2, 3, 4, \ldots$). Clearly $\mathbb{N}$ is itself countable. The set $\mathbb{Z}$ of all integers is also countable, by means of the following one-to-one correspondence with $\mathbb{N}$:

Show that $\mathbb{Q}$ (the set of all rational numbers) is countable. (Hint: The above correspondence for $\mathbb{Z}$ is an infinite list in one dimension (the horizontal direction). For $\mathbb{Q}$ think two-dimensionally.)
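For the integers, one possible one-to-one correspondence with the natural numbers (an assumption made here for illustration, not necessarily the pairing intended in the exercise) lists the integers as $0, 1, -1, 2, -2, \ldots$ The sketch below implements that pairing in both directions; the function names are illustrative choices.

```python
def natural_to_integer(n):
    """Map 0, 1, 2, 3, 4, ... to 0, 1, -1, 2, -2, ..."""
    if n % 2 == 1:
        return (n + 1) // 2  # odd naturals go to the positive integers
    return -(n // 2)         # even naturals go to 0 and the negative integers


def integer_to_natural(z):
    """Inverse of natural_to_integer."""
    if z > 0:
        return 2 * z - 1
    return -2 * z


print([natural_to_integer(n) for n in range(7)])
```

Every integer appears exactly once in the list, which is what makes the pairing one-to-one.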

¹ It is more than that, of course, but that definition puts us in good company: the first European textbook on calculus, written by the French mathematician Guillaume de l’Hôpital in 1696, was titled Analyse des Infiniment Petits pour l’Intelligence des Lignes Courbes (which translates as Analysis of the Infinitely Small for Understanding Curved Lines). That book (in French) can be obtained freely in electronic form at https://archive.org
² It will be shown later (in Chapter 5) that the rectangles do not have to be completely inside the region.
³ This argument is due to the British philosopher Bertrand Russell (1872-1970). For an alternative proof that $\sqrt{2}$ is irrational, see pp. 97-98 in Gelfand, I.M. and A. Shen, Algebra, Boston: Birkhäuser, 1993.
⁴ For a proof see Ch. 1 in Kamke, E., Theory of Sets, New York: Dover Publications, Inc., 1950.
⁵ For a study of the structure of the real number system, see Burrill, C.W., Foundations of Real Numbers, New York: McGraw-Hill Book Company, 1967.
⁶ Not everyone agrees that calculus does this satisfactorily. For example, for an alternative development of basically the same material in “standard” calculus but without the use of limits (called infinitesimal analysis), see Keisler, H.J., Elementary Calculus: An Infinitesimal Approach, Boston: Prindle, Weber & Schmidt, 1976.
⁷ For example, see the essays by L. E. J. Brouwer, Hermann Weyl and David Hilbert in Heijenoort, J. van, From Frege to Gödel: A Source Book in Mathematical Logic, 1879-1931, Cambridge, MA: Harvard University Press, 1967.
⁸ This, by the way, is a terrible formula for calculating $\pi$; getting just the 3.14 part requires 119 terms in the sum!
⁹ Originally due to another ancient Greek mathematician, Antiphon (ca. 430 b.c.).
¹⁰ The great ancient Greek mathematician, physicist and astronomer Archimedes (ca. 287-212 b.c.) used this method, together with circumscribed regular polygons, to calculate $\pi$.