# 2.1: Newton and Leibniz Get Started



Skills to Develop

- Explain Leibniz’s approach to the Product Rule
- Explain Newton's approach to the Product Rule

### Leibniz’s Calculus Rules

The rules for calculus were first laid out in Gottfried Wilhelm Leibniz’s 1684 paper *Nova methodus pro maximis et minimis, itemque tangentibus, quae nec fractas nec irrationales quantitates moratur, et singulare pro illis calculi genus* (A New Method for Maxima and Minima as Well as Tangents, Which is Impeded Neither by Fractional Nor by Irrational Quantities, and a Remarkable Type of Calculus for This).

**Figure \(\PageIndex{1}\):** Gottfried Wilhelm Leibniz.

Leibniz started with subtraction. That is, if \(x_1\) and \(x_2\) are very close together then their diﬀerence, \(∆x = x_2 - x_1\), is very small. He expanded this idea to say that if \(x_1\) and \(x_2\) are inﬁnitely close together (but still distinct) then their diﬀerence, \(dx\), is inﬁnitesimally small (but not zero).

This idea is logically very suspect and Leibniz knew it. But he also knew that when he used his calculus diﬀerentialis^{1} he was getting correct answers to some very hard problems. So he persevered.

Leibniz called both \(∆x\) and \(dx\) “diﬀerentials” (Latin for diﬀerence) because he thought of them as, essentially, the same thing. Over time it has become customary to refer to the inﬁnitesimal \(dx\) as a diﬀerential, reserving “diﬀerence” for the ﬁnite case, \(∆x\). This is why calculus is often called “diﬀerential calculus.”

In his paper Leibniz gave rules for dealing with these inﬁnitely small diﬀerentials. Speciﬁcally, given a variable quantity \(x\), \(dx\) represented an inﬁnitesimal change in \(x\). Diﬀerentials are related via the slope of the tangent line to a curve. That is, if \(y = f(x)\), then \(dy\) and \(dx\) are related by

\[dy = \text{(slope of the tangent line)}\cdot dx\]

Leibniz then divided by \(dx\) giving

\[\frac{dy}{dx} = \text{(slope of the tangent line)}\]

The elegant and expressive notation Leibniz invented was so useful that it has been retained through the years despite some profound changes in the underlying concepts. For example, Leibniz and his contemporaries would have viewed the symbol \(\frac{dy}{dx}\) as an actual quotient of inﬁnitesimals, whereas today we deﬁne it via the limit concept ﬁrst suggested by Newton.

As a result the rules governing these diﬀerentials are very modern in appearance:

\[d(\text{constant}) = 0\]

\[d(z - y + w + x) = dz - dy + dw + dx\]

\[d(xv) = xdv + v dx\]

\[d\left ( \frac{v}{y} \right ) = \frac{ydv - vdy}{yy}\]

and, when \(a\) is an integer:

\[d(x^a) = ax^{a - 1} dx\]

Leibniz states these rules without proof: “*. . . the demonstration of all this will be easy to one who is experienced in such matters . . ..*” As an example, mathematicians in Leibniz’s day would be expected to understand intuitively that if \(c\) is a constant, then \(d(c) = c - c = 0\). Likewise, \(d(x + y) = dx + dy\) is really an extension of \((x_2 + y_2) - (x_1 + y_1) = (x_2 - x_1) + (y_2 - y_1)\).

### Leibniz’s Approach to the Product Rule

The explanation of the product rule using diﬀerentials is a bit more involved, but Leibniz expected that mathematicians would be ﬂuent enough to derive it. The product \(p = xv\) can be thought of as the area of the following rectangle

**Figure \(\PageIndex{2}\):** Area of a rectangle.

With this in mind, \(dp = d(xv)\) can be thought of as the change in area when \(x\) is changed by \(dx\) and \(v\) is changed by \(dv\). This can be seen as the \(L\) shaped region in the following drawing.

**Figure \(\PageIndex{3}\):** Change in area when \(x\) is changed by \(dx\) and \(v\) is changed by \(dv\).

By dividing the \(L\) shaped region into \(3\) rectangles we obtain

\[d(xv) = xdv + v dx + dx dv\]

Even though \(dx\) and \(dv\) are inﬁnitely small, Leibniz reasoned that \(dx dv\) is *even more* inﬁnitely small (quadratically inﬁnitely small?) compared to \(xdv\) and \(vdx\) and can thus be ignored leaving

\[d(xv) = xdv + v dx\]

You should feel some discomfort at the idea of simply tossing the product \(dx dv\) aside because it is “*comparatively small.*” This means you have been well trained, and have thoroughly internalized Newton’s dictum [10]: “*The smallest errors may not, in mathematical matters, be scorned.*” It is logically untenable to toss aside an expression just because it is small. Even less so should we be willing to ignore an expression on the grounds that it is “*inﬁnitely smaller*” than another quantity which is itself “*inﬁnitely small.*”

Newton and Leibniz both knew this as well as we do. But they also knew that their methods worked. They gave veriﬁably correct answers to problems which had, heretofore, been completely intractable. It is the mark of their genius that both men persevered in spite of the very evident diﬃculties their methods entailed.
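To get a feel for why Leibniz was willing to drop \(dx\,dv\), it may help to replace the infinitesimals by small finite numbers and watch the sizes of the terms. The following is only a minimal numerical sketch, not anything Leibniz wrote: the function name `product_increment`, the sample values of \(x\) and \(v\), and the choice \(dx = dv = h\) are all assumptions made for illustration.

```python
def product_increment(x, v, dx, dv):
    """Return (exact change in x*v, part kept by the rule, part dropped)."""
    exact = (x + dx) * (v + dv) - x * v   # area of the L-shaped region
    kept = x * dv + v * dx                # the two long rectangles
    dropped = dx * dv                     # the small corner rectangle
    return exact, kept, dropped

# Illustrative values only; any nonzero x and v behave the same way.
x, v = 3.0, 2.0
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    exact, kept, dropped = product_increment(x, v, h, h)
    print(f"h={h:g}  exact={exact:.10g}  kept={kept:.10g}  "
          f"dropped/kept={dropped / kept:.3g}")
```

Each time \(h\) shrinks by a factor of ten, the ratio of the dropped term to the kept terms shrinks by the same factor. This shows that the corner rectangle becomes negligible relative to the others, but it does not remove the logical objection raised above: for any finite \(h\) the dropped term is still not zero.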

### Newton’s Approach to the Product Rule

In the Principia, Newton “proved” the Product Rule as follows: Let \(x\) and \(v\) be “*flowing^{2} quantities*” and consider the rectangle, \(R\), whose sides are \(x\) and \(v\). \(R\) is also a flowing quantity and we wish to find its fluxion (derivative) at any time.

**Figure \(\PageIndex{4}\):** Isaac Newton.

First increment \(x\) and \(v\) by \(\frac{\Delta x}{2}\) and \(\frac{\Delta v}{2}\) respectively. Then the corresponding value of \(R\) is

\[\left ( x + \frac{\Delta x}{2} \right ) \left ( v + \frac{\Delta v}{2} \right ) = xv + x\frac{\Delta v}{2} + v\frac{\Delta x}{2} + \frac{\Delta x \Delta v}{4}\]

Now decrement \(x\) and \(v\) by the same amounts:

\[\left ( x - \frac{\Delta x}{2} \right ) \left ( v - \frac{\Delta v}{2} \right ) = xv - x\frac{\Delta v}{2} - v\frac{\Delta x}{2} + \frac{\Delta x \Delta v}{4}\]

Subtracting the right side of the second equation from the right side of the first gives

\[∆R = x∆v + v∆x\]

which is the total change of \(R = xv\) over the intervals \(∆x\) and \(∆v\) and also recognizably the Product Rule.

This argument is no better than Leibniz’s as it relies heavily on the number \(1/2\) to make it work. If we take any other increments in \(x\) and \(v\) whose total lengths are \(∆x\) and \(∆v\) it will simply not work. Try it and see.
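One way to "try it and see" is to split the total change unevenly: increment by a fraction \(\theta\) of \(\Delta x\) and \(\Delta v\), and decrement by the remaining fraction \(1 - \theta\). The sketch below does this with exact rational arithmetic. The parameter \(\theta\), the function name `newton_difference`, and the sample numbers are assumptions introduced here for illustration; they do not appear in the Principia.

```python
from fractions import Fraction

def newton_difference(x, v, dx, dv, theta):
    """Change in R = x*v between an increment by theta*(dx, dv)
    and a decrement by (1 - theta)*(dx, dv)."""
    upper = (x + theta * dx) * (v + theta * dv)
    lower = (x - (1 - theta) * dx) * (v - (1 - theta) * dv)
    return upper - lower

# Illustrative values, kept as exact fractions to avoid rounding.
x, v = Fraction(3), Fraction(2)
dx, dv = Fraction(1, 10), Fraction(1, 5)
product_rule = x * dv + v * dx

for theta in (Fraction(1, 2), Fraction(1, 3), Fraction(1)):
    leftover = newton_difference(x, v, dx, dv, theta) - product_rule
    # Expanding the products by hand predicts leftover = (2*theta - 1)*dx*dv.
    print(theta, leftover, leftover == (2 * theta - 1) * dx * dv)
```

Only \(\theta = 1/2\) makes the leftover term vanish. Taking \(\theta = 1\) recovers Leibniz's situation, where the full \(\Delta x \Delta v\) term remains and must be argued away.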

In Newton’s defense, he wasn’t really trying to justify his mathematical methods in the Principia. His attention there was on physics, not math, so he was really just trying to give a convincing demonstration of his methods. You may decide for yourself how convincing his demonstration is.

Notice that there is no mention of limits of difference quotients or derivatives. In fact, the term derivative was not coined until 1797, by Lagrange. In a sense, these topics were not necessary at the time, as Leibniz and Newton both assumed that the curves they dealt with had tangent lines and, in fact, Leibniz explicitly used the tangent line to relate two diﬀerential quantities. This was consistent with the thinking of the time and for the duration of this chapter we will also assume that all quantities are diﬀerentiable. As we will see later this assumption leads to diﬃculties.

Both Newton and Leibniz were satisfied that their calculus provided answers that agreed with what was known at the time. For example \(d(x^2)= d(xx) = xdx+xdx = 2xdx\) and \(d(x^3)= d(x^2x)= x^2 dx+xd(x^2)= x^2 dx+x(2xdx) = 3x^2 dx\), results that were essentially derived by others in different ways.

Exercise \(\PageIndex{1}\)

- Use Leibniz’s product rule \(d(xv) = xdv + vdx\) to show that if \(n\) is a positive integer then \(d(x^n) = nx^{n - 1} dx\)
- Use Leibniz’s product rule to derive the quotient rule \[d \left ( \frac{v}{y} \right ) = \frac{ydv - vdy}{yy}\]
- Use the quotient rule to show that if \(n\) is a positive integer, then \[d(x^{-n}) = -nx^{-n - 1} dx\]

Exercise \(\PageIndex{2}\)

Let \(p\) and \(q\) be integers with \(q\neq 0\). Show \[d\left ( x^{\frac{p}{q}} \right ) = \frac{p}{q} x^{\frac{p}{q} - 1} dx\]

Leibniz also provided applications of his calculus to prove its worth. As an example he derived Snell’s Law of Refraction from his calculus rules as follows.

Given that light travels through air at a speed of \(v_a\) and travels through water at a speed of \(v_w\) the problem is to ﬁnd the fastest path from point \(A\) to point \(B\).

**Figure \(\PageIndex{5}\):** Fastest path that light travels from point \(A\) to point \(B\).

According to Fermat’s Principle of Least Time, this fastest path is the one that light will travel.

Using the fact that \(\text{Time} = \text{Distance}/\text{Velocity}\) and the labeling in the picture below we can obtain a formula for the time \(T\) it takes for light to travel from \(A\) to \(B\).

**Figure \(\PageIndex{6}\):** Fermat’s Principle of Least Time.

\[T = \frac{\sqrt{x^2 + a^2}}{v_a} + \frac{\sqrt{(c-x)^2 + b^2}}{v_w}\]

Using the rules of Leibniz’s calculus, we obtain

\[\begin{align*} dT &= \left ( \frac{1}{v_a} \frac{1}{2} (x^2 + a^2)^{-\frac{1}{2}} (2x) + \frac{1}{v_w} \frac{1}{2} ((c-x)^2 + b^2)^{-\frac{1}{2}} (2(c-x)(-1))\right )dx\\ &= \left ( \frac{1}{v_a} \frac{x}{\sqrt{x^2 + a^2}} - \frac{1}{v_w} \frac{c-x}{\sqrt{(c-x)^2 + b^2}} \right )dx \end{align*}\]

Using the fact that at the minimum value for \(T\), \(dT = 0\), we have that the fastest path from \(A\) to \(B\) must satisfy \(\frac{1}{v_a} \frac{x}{\sqrt{x^2 + a^2}} = \frac{1}{v_w} \frac{c-x}{\sqrt{(c-x)^2 + b^2}}\). Inserting the following angles

**Figure \(\PageIndex{7}\):** Fastest path that light travels.

we get that the path that light travels must satisfy

\[\frac{\sin \theta _a}{v_a} = \frac{\sin \theta _w}{v_w}\]

which is Snell’s Law.
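The derivation above can be checked numerically by locating the zero of \(dT\) for particular numbers and comparing the two sides of Snell's Law. The sketch below is a hedged illustration. It assumes that \(\theta_a\) and \(\theta_w\) are measured from the normal to the water's surface, so that \(\sin \theta_a = x/\sqrt{x^2 + a^2}\) and \(\sin \theta_w = (c-x)/\sqrt{(c-x)^2 + b^2}\). The function names, the bisection search, and the sample values of \(a\), \(b\), \(c\), \(v_a\), \(v_w\) are likewise chosen here for illustration.

```python
import math

def travel_time(x, a, b, c, v_a, v_w):
    """Total time T for light crossing the surface at horizontal offset x."""
    return math.hypot(x, a) / v_a + math.hypot(c - x, b) / v_w

def dT_dx(x, a, b, c, v_a, v_w):
    """Coefficient of dx in the expression for dT derived above."""
    return x / (v_a * math.hypot(x, a)) - (c - x) / (v_w * math.hypot(c - x, b))

def fastest_crossing(a, b, c, v_a, v_w, tol=1e-12):
    """Bisection on dT/dx over [0, c]; it is negative at 0 and positive at c."""
    lo, hi = 0.0, c
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dT_dx(mid, a, b, c, v_a, v_w) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative geometry and speeds (arbitrary units).
a, b, c = 1.0, 1.0, 2.0
v_a, v_w = 3.0, 2.25

x_star = fastest_crossing(a, b, c, v_a, v_w)
sin_a = x_star / math.hypot(x_star, a)
sin_w = (c - x_star) / math.hypot(c - x_star, b)
print(x_star, sin_a / v_a, sin_w / v_w)
```

The two printed ratios agree to within the tolerance of the search, and moving the crossing point slightly to either side of `x_star` increases the travel time.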

To compare \(18^{th}\) century and modern techniques we will consider Johann Bernoulli’s solution of the Brachistochrone problem. In 1696, Bernoulli posed, and solved, the Brachistochrone problem; that is, to ﬁnd the shape of a frictionless wire joining points \(A\) and \(B\) so that the time it takes for a bead to slide down under the force of gravity is as small as possible.

**Figure \(\PageIndex{8}\):** Finding shape of a frictionless wire joining points \(A\) and \(B\).

Bernoulli posed this “*path of fastest descent*” problem to challenge the mathematicians of Europe and used his solution to demonstrate the power of Leibniz’s calculus as well as his own ingenuity.

*I, Johann Bernoulli, address the most brilliant mathematicians in the world. Nothing is more attractive to intelligent people than an honest, challenging problem, whose possible solution will bestow fame and remain as a lasting monument. Following the example set by Pascal, Fermat, etc., I hope to gain the gratitude of the whole scientiﬁc community by placing before the ﬁnest mathematicians of our time a problem which will test their methods and the strength of their intellect. If someone communicates to me the solution of the proposed problem, I shall publicly declare him worthy of praise. [11]*

**Figure \(\PageIndex{9}\):** Johann Bernoulli.

In addition to Johann’s, solutions were obtained from Newton, Leibniz, Johann’s brother Jacob Bernoulli, and the Marquis de l’Hopital [15]. At the time there was an ongoing and very vitriolic controversy raging over whether Newton or Leibniz had been the ﬁrst to invent calculus. An advocate of the methods of Leibniz, Bernoulli did not believe Newton would be able to solve the problem using his methods. Bernoulli attempted to embarrass Newton by sending him the problem. However Newton did solve it.

At this point in his life Newton had all but quit science and mathematics and was fully focused on his administrative duties as Master of the Mint. In part due to rampant counterfeiting, England’s money had become severely devalued and the nation was on the verge of economic collapse. The solution was to recall all of the existing coins, melt them down, and strike new ones. As Master of the Mint this job fell to Newton [8]. As you might imagine this was a rather Herculean task. Nevertheless, according to his niece:

*When the problem in 1696 was sent by Bernoulli–Sir I.N. was in the midst of the hurry of the great recoinage and did not come home till four from the Tower very much tired, but did not sleep till he had solved it, which was by four in the morning. (quoted in [2], page 201)*

He is later reported to have complained, “*I do not love ... to be ... teezed by forreigners about Mathematical things [2]*.”

Newton submitted his solution anonymously, presumably to avoid more controversy. Nevertheless the methods used were so distinctively Newton’s that Bernoulli is said to have exclaimed “*Tanquam ex ungue leonem.*”^{3}

Bernoulli’s ingenious solution starts, interestingly enough, with Snell’s Law of Refraction. He begins by considering the stratiﬁed medium in the following ﬁgure, where an object travels with velocities \(v_1, v_2, v_3, ...\) in the various layers.

**Figure \(\PageIndex{10}\):** Bernoulli's solution.

By repeatedly applying Snell’s Law he concluded that the fastest path must satisfy

\[\frac{\sin \theta _1}{v_1} = \frac{\sin \theta _2}{v_2} = \frac{\sin \theta _3}{v_3} = \cdots\]

In other words, the ratio of the sine of the angle that the curve makes with the vertical and the speed remains constant along this fastest path.

If we think of a continuously changing medium as stratiﬁed into inﬁnitesimal layers and extend Snell’s law to an object whose speed is constantly changing,

**Figure \(\PageIndex{11}\):** Snell's law for an object changing speed continuously.

then along the fastest path, the ratio of the sine of the angle that the curve’s tangent makes with the vertical, \(α\), and the speed, \(v\), must remain constant.

\[\frac{\sin \alpha }{v} = c\]

If we include axes and let \(P\) denote the position of the bead at a particular time then we have the following picture.

**Figure \(\PageIndex{12}\):** Path traveled by the bead.

In the above figure, \(s\) denotes the length that the bead has traveled down to point \(P\) (that is, the arc length of the curve from the origin to that point) and \(a\) denotes the tangential component of the acceleration due to gravity \(g\). Since the bead travels only under the influence of gravity, \(\frac{dv}{dt} = a\).

To get a sense of how physical problems were approached using Leibniz’s calculus we will use the above equation to show that \(v = \sqrt{2gy}\).

By similar triangles we have \(\frac{a}{g} = \frac{dy}{ds}\). As a student of Leibniz, Bernoulli would have regarded \(\frac{dy}{ds}\) as a fraction so

\[a ds = gdy\]

and since acceleration is the rate of change of velocity we have

\[\frac{dv}{dt} ds= gdy\]

Again, \(18^{th}\) century European mathematicians regarded \(dv\), \(dt\), and \(ds\) as inﬁnitesimally small numbers which nevertheless obey all of the usual rules of algebra. Thus we can rearrange the above to get

\[\frac{ds}{dt} dv= gdy\]

Since \(\frac{ds}{dt}\) is the rate of change of position with respect to time it is, in fact, the velocity of the bead. That is

\[v dv = g dy\]

Bernoulli would have interpreted this as a statement that two rectangles of height \(v\) and \(g\), with respective widths \(dv\) and \(dy\) have equal area. Summing (integrating) all such rectangles we get:

\[\int v dv = \int g dy\]

\[\frac{v^2}{2} = gy\]

or

\[v = \sqrt{2gy}\]

You are undoubtedly uncomfortable with the cavalier manipulation of infinitesimal quantities you’ve just witnessed, so we’ll pause for a moment now to compare a modern derivation of \(v = \sqrt{2gy}\) to Bernoulli’s. As before we begin with the equation:

\[\frac{a}{g} = \frac{dy}{ds}\]

\[a = g\frac{dy}{ds}\]

Moreover, since acceleration is the derivative of velocity this is the same as:

\[\frac{dv}{dt} = g\frac{dy}{ds}\]

Now observe that by the Chain Rule \(\frac{dv}{dt} = \frac{dv}{ds} \frac{ds}{dt}\). The physical interpretation of this formula is that velocity will depend on \(s\), how far down the wire the bead has moved, but that the distance traveled will depend on how much time has elapsed. Therefore

\[\frac{dv}{ds} \frac{ds}{dt} = g\frac{dy}{ds}\]

or

\[\frac{ds}{dt} \frac{dv}{ds} = g\frac{dy}{ds}\]

and since \(\frac{ds}{dt} = v\)

\[v\frac{dv}{ds} = g\frac{dy}{ds}\]

Integrating both sides with respect to \(s\) gives:

\[\int v\frac{dv}{ds} ds = g\int \frac{dy}{ds} ds\]

\[\int vdv = g\int dy\]

and integrating gives

\[\frac{v^2}{2} = gy\]

as before.
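Both derivations arrive at \(\frac{v^2}{2} = gy\), and the result can be tested on a concrete wire by integrating \(\frac{ds}{dt} = v\), \(\frac{dv}{dt} = g\frac{dy}{ds}\) step by step. The sketch below is an illustration under assumptions that are not part of Bernoulli's argument: the wire is taken to be a quarter circle with depth \(y(s) = \sin s\) (so \(\frac{dy}{ds} = \cos s\)), the value of \(g\) is illustrative, and a standard fourth-order Runge-Kutta step is used for the integration.

```python
import math

G = 9.8  # gravitational acceleration (illustrative value)

def slide(s_end, dt=1e-3):
    """Integrate ds/dt = v, dv/dt = G*dy/ds from rest until s >= s_end.

    The wire is assumed to have depth y(s) = sin(s), so dy/ds = cos(s).
    """
    def accel(s):
        return G * math.cos(s)

    s, v = 0.0, 0.0
    while s < s_end:
        # Classical fourth-order Runge-Kutta step for the pair (s, v).
        k1s, k1v = v, accel(s)
        k2s, k2v = v + 0.5 * dt * k1v, accel(s + 0.5 * dt * k1s)
        k3s, k3v = v + 0.5 * dt * k2v, accel(s + 0.5 * dt * k2s)
        k4s, k4v = v + dt * k3v, accel(s + dt * k3s)
        s += dt * (k1s + 2 * k2s + 2 * k3s + k4s) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return s, v

s, v = slide(1.0)
y = math.sin(s)  # depth reached at arc length s
print(v, math.sqrt(2 * G * y))
```

The simulated speed and \(\sqrt{2gy}\) agree closely, even though the simulation never uses the formula itself, only the relation \(\frac{dv}{dt} = g\frac{dy}{ds}\).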

In eﬀect, in the modern formulation we have traded the simplicity and elegance of diﬀerentials for a comparatively cumbersome repeated use of the Chain Rule. No doubt you noticed when taking Calculus that in the diﬀerential notation of Leibniz, the Chain Rule looks like “*canceling*” an expression in the top and bottom of a fraction: \(\frac{dy}{du} \frac{du}{dx} = \frac{dy}{dx}\). This is because for 18th century mathematicians, this is exactly what it was.

To put it another way, \(18^{th}\) century mathematicians wouldn’t have recognized a need for what we call the Chain Rule because this operation was a triviality for them. Just reduce the fraction. This raises the question: Why did we abandon such a clear, simple interpretation of our symbols in favor of the comparatively more cumbersome modern interpretation? This is one of the questions we will try to answer in this course.

Returning to the Brachistochrone problem we observe that \(\frac{\sin \alpha }{v} = c\) and since \(\sin \alpha = \frac{dx}{ds}\) we see that

\[\frac{\frac{dx}{ds}}{\sqrt{2gy}} = c\]

\[\frac{dx}{\sqrt{2gy(ds)^2}} = c\]

\[\frac{dx}{\sqrt{2gy\left [ (dx)^2 + (dy)^2 \right ]}} = c\]

Bernoulli was then able to solve this diﬀerential equation.

Exercise \(\PageIndex{3}\)

Show that the equations \(x = \frac{t - \sin t}{4gc^2}\), \(y = \frac{1 - \cos t}{4gc^2}\) satisfy the differential equation above. Bernoulli recognized this solution to be an inverted cycloid, the curve traced by a fixed point on a circle as the circle rolls along a horizontal surface.
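A numerical spot check is no substitute for the exercise, but it can confirm that you are on the right track. The sketch below takes the cycloid in the form \(x = \frac{t - \sin t}{4gc^2}\), \(y = \frac{1 - \cos t}{4gc^2}\) and evaluates \(\frac{dx}{\sqrt{2gy\left[(dx)^2 + (dy)^2\right]}}\) at a few values of \(t\), using \(dx = \frac{dx}{dt}dt\) and \(dy = \frac{dy}{dt}dt\) so that the common factor \(dt\) cancels. The function name `snell_ratio` and the sample values of \(g\) and \(c\) are assumptions made for illustration.

```python
import math

def snell_ratio(t, g, c):
    """Evaluate dx / sqrt(2*g*y*(dx^2 + dy^2)) on the cycloid at parameter t."""
    k = 4 * g * c**2
    y = (1 - math.cos(t)) / k
    dx_dt = (1 - math.cos(t)) / k   # derivative of (t - sin t)/k
    dy_dt = math.sin(t) / k         # derivative of (1 - cos t)/k
    return dx_dt / math.sqrt(2 * g * y * (dx_dt**2 + dy_dt**2))

# Illustrative constants; t must avoid multiples of 2*pi, where y = 0.
g, c = 9.8, 0.3
for t in (0.5, 1.0, 2.0, 3.0):
    print(t, snell_ratio(t, g, c))
```

Every value of \(t\) returns the same number, namely the constant \(c\) that was supplied, up to rounding.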

This illustrates the state of calculus in the late 1600’s and early 1700’s; the foundations of the subject were a bit shaky but there was no denying its power.

### References

^{1} This translates, loosely, as the calculus of differences.

^{2} Newton’s approach to calculus, his ‘Method of Fluxions’, depended fundamentally on motion. That is, he viewed his variables (fluents) as changing (flowing or fluxing) in time. The rate of change of a fluent he called a fluxion. As a foundation both Leibniz’s and Newton’s approaches have fallen out of favor, although both are still universally used as a conceptual approach, a “way of thinking,” about the ideas of calculus.

^{3} I know the lion by his claw.

### Contributors

Eugene Boman (Pennsylvania State University) and Robert Rogers (SUNY Fredonia)