Skip to main content
Mathematics LibreTexts

C.1 Newton's Method

  • Page ID
  • \( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

    \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

    \( \newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\)

    ( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\)

    \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

    \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\)

    \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

    \( \newcommand{\Span}{\mathrm{span}}\)

    \( \newcommand{\id}{\mathrm{id}}\)

    \( \newcommand{\Span}{\mathrm{span}}\)

    \( \newcommand{\kernel}{\mathrm{null}\,}\)

    \( \newcommand{\range}{\mathrm{range}\,}\)

    \( \newcommand{\RealPart}{\mathrm{Re}}\)

    \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

    \( \newcommand{\Argument}{\mathrm{Arg}}\)

    \( \newcommand{\norm}[1]{\| #1 \|}\)

    \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

    \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\AA}{\unicode[.8,0]{x212B}}\)

    \( \newcommand{\vectorA}[1]{\vec{#1}}      % arrow\)

    \( \newcommand{\vectorAt}[1]{\vec{\text{#1}}}      % arrow\)

    \( \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

    \( \newcommand{\vectorC}[1]{\textbf{#1}} \)

    \( \newcommand{\vectorD}[1]{\overrightarrow{#1}} \)

    \( \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} \)

    \( \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} \)

    \( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

    \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

    \(\newcommand{\avec}{\mathbf a}\) \(\newcommand{\bvec}{\mathbf b}\) \(\newcommand{\cvec}{\mathbf c}\) \(\newcommand{\dvec}{\mathbf d}\) \(\newcommand{\dtil}{\widetilde{\mathbf d}}\) \(\newcommand{\evec}{\mathbf e}\) \(\newcommand{\fvec}{\mathbf f}\) \(\newcommand{\nvec}{\mathbf n}\) \(\newcommand{\pvec}{\mathbf p}\) \(\newcommand{\qvec}{\mathbf q}\) \(\newcommand{\svec}{\mathbf s}\) \(\newcommand{\tvec}{\mathbf t}\) \(\newcommand{\uvec}{\mathbf u}\) \(\newcommand{\vvec}{\mathbf v}\) \(\newcommand{\wvec}{\mathbf w}\) \(\newcommand{\xvec}{\mathbf x}\) \(\newcommand{\yvec}{\mathbf y}\) \(\newcommand{\zvec}{\mathbf z}\) \(\newcommand{\rvec}{\mathbf r}\) \(\newcommand{\mvec}{\mathbf m}\) \(\newcommand{\zerovec}{\mathbf 0}\) \(\newcommand{\onevec}{\mathbf 1}\) \(\newcommand{\real}{\mathbb R}\) \(\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}\) \(\newcommand{\laspan}[1]{\text{Span}\{#1\}}\) \(\newcommand{\bcal}{\cal B}\) \(\newcommand{\ccal}{\cal C}\) \(\newcommand{\scal}{\cal S}\) \(\newcommand{\wcal}{\cal W}\) \(\newcommand{\ecal}{\cal E}\) \(\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}\) \(\newcommand{\gray}[1]{\color{gray}{#1}}\) \(\newcommand{\lgray}[1]{\color{lightgray}{#1}}\) \(\newcommand{\rank}{\operatorname{rank}}\) \(\newcommand{\row}{\text{Row}}\) \(\newcommand{\col}{\text{Col}}\) \(\renewcommand{\row}{\text{Row}}\) \(\newcommand{\nul}{\text{Nul}}\) \(\newcommand{\var}{\text{Var}}\) \(\newcommand{\corr}{\text{corr}}\) \(\newcommand{\len}[1]{\left|#1\right|}\) \(\newcommand{\bbar}{\overline{\bvec}}\) \(\newcommand{\bhat}{\widehat{\bvec}}\) \(\newcommand{\bperp}{\bvec^\perp}\) \(\newcommand{\xhat}{\widehat{\xvec}}\) \(\newcommand{\vhat}{\widehat{\vvec}}\) \(\newcommand{\uhat}{\widehat{\uvec}}\) \(\newcommand{\what}{\widehat{\wvec}}\) \(\newcommand{\Sighat}{\widehat{\Sigma}}\) \(\newcommand{\lt}{<}\) \(\newcommand{\gt}{>}\) \(\newcommand{\amp}{&}\) \(\definecolor{fillinmathshade}{gray}{0.9}\)

    Newton's method 1, also known as the Newton-Raphson method, is another technique for generating numerical approximate solutions to equations of the form \(f(x)=0\text{.}\) For example, one can easily get a good approximation to \(\sqrt{2}\) by applying Newton's method to the equation \(x^2-2=0\text{.}\) This will be done in Example C.1.2, below.

    Here is the derivation of Newton's method. We start by simply making a guess for the solution. For example, we could base the guess on a sketch of the graph of \(f(x)\text{.}\) Call the initial guess \(x_1\text{.}\) Next recall, from Theorem 2.3.4, that the tangent line to \(y=f(x)\) at \(x=x_1\) is \(y=F(x)\text{,}\) where

    \[ F(x) = f(x_1) + f'(x_1)\,(x-x_1) \nonumber \]

    Usually \(F(x)\) is a pretty good approximation to \(f(x)\) for \(x\) near \(x_1\text{.}\) So, instead of trying to solve \(f(x)=0\text{,}\) we solve the linear equation \(F(x)=0\) and call the solution \(x_2\text{.}\)

    \begin{align*} 0=F(x)=f(x_1) + f'(x_1)\,(x-x_1) & \iff x-x_1=-\frac{f(x_1)}{f'(x_1)} \\ & \iff x= x_2= x_1 -\frac{f(x_1)}{f'(x_1)} \end{align*}

    Note that if \(f(x)\) were a linear function, then \(F(x)\) would be exactly \(f(x)\) and \(x_2\) would solve \(f(x)=0\) exactly.


    Now we repeat, but starting with the (second) guess \(x_2\) rather than \(x_1\text{.}\) This gives the (third) guess \(x_3= x_2 -\frac{f(x_2)}{f'(x_2)}\text{.}\) And so on. By way of summary, Newton's method is

    1. Make a preliminary guess \(x_1\text{.}\)
    2. Define \(x_2=x_1-\frac{f(x_1)}{f'(x_1)}\text{.}\)
    3. Iterate. That is, for each natural number \(n\text{,}\) once you have computed \(x_n\text{,}\) define
      Equation C.1.1 (Newton's method).

      \[ x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)} \nonumber \]

    Example C.1.2 (Approximating \(\sqrt{2}\)).

    In this example we compute, approximately, the square root of two. We will of course pretend that we do not already know that \(\sqrt{2}=1.41421\cdots\text{.}\) So we cannot find it by solving, approximately, the equation \(f(x)=x-\sqrt{2}=0\text{.}\) Instead we apply Newton's method to the equation

    \[ f(x)=x^2-2=0 \nonumber \]

    Since \(f'(x)=2x\text{,}\) Newton's method says that we should generate approximate solutions by iteratively applying

    \[ x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)}=x_n-\frac{x_n^2-2}{2x_n} =\frac{x_n}{2} +\frac{1}{x_n} \nonumber \]

    We need a starting point. Since \(1^2=1\lt 2\) and \(2^2=4\gt 2\text{,}\) the square root of two must be between \(1\) and \(2\text{,}\) so let's start Newton's method with the initial guess \(x_1=1.5\text{.}\) Here goes 2 The following computations have been carried out in double precision, which is computer speak for about 15 significant digits. We are displaying each \(x_n\) rounded to 10 significant digits (9 decimal places). So each displayed \(x_n\) has not been impacted by roundoff error, and still contains more decimal places than are usually needed.:

    \begin{align*} x_1& =1.5\\ x_2& =\frac{1}{2} x_1+\frac{1}{x_1} =\frac{1}{2}(1.5)+\frac{1}{1.5}\\ & =1.416666667 \\ x_3& =\frac{1}{2} x_2+\frac{1}{x_2} =\frac{1}{2}(1.416666667)+\frac{1}{1.416666667} \\ & =1.414215686 \\ x_4& =\frac{1}{2} x_3+\frac{1}{x_3} =\frac{1}{2}(1.414215686)+\frac{1}{1.414215686} \\ & =1.414213562 \\ x_5& =\frac{1}{2} x_4+\frac{1}{x_4} =\frac{1}{2}(1.414213562)+\frac{1}{1.414213562}\\ & =1.414213562 \end{align*}

    It looks like the \(x_n\)'s, rounded to nine decimal places, have stabilized to \(1.414213562\text{.}\) So it is reasonable to guess that \(\sqrt{2}\text{,}\) rounded to nine decimal places, is exactly \(1.414213562\text{.}\) Recalling that all numbers \(1.4142135615 \le y \lt 1.4142135625\) round to \(1.414213562\text{,}\) we can check our guess by evaluating \(f(1.4142135615)\) and \(f(1.4142135625)\text{.}\) Since \(f(1.4142135615)=-2.5\times 10^{-9}\lt 0\) and \(f(1.4142135625)=3.6\times 10^{-10}\gt 0\) the square root of two must indeed be between \(1.4142135615\) and \(1.4142135625\text{.}\)

    Example C.1.3 (Approximating \(\pi\)).

    In this example we compute, approximately, \(\pi\) by applying Newton's method to the equation

    \[ f(x)=\sin x=0 \nonumber \]

    starting with \(x_1=3\text{.}\) Since \(f'(x)=\cos x\text{,}\) Newton's method says that we should generate approximate solutions by iteratively applying

    \[ x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)}=x_n-\frac{\sin x_n}{\cos x_n} =x_n-\tan x_n \nonumber \]

    Here goes

    \begin{align*} x_1& =3\\ x_2& =x_1-\tan x_1 =3-\tan 3\\ & =3.142546543\\ x_3& =3.142546543-\tan 3.142546543\\ & =3.141592653\\ x_4& =3.141592653-\tan 3.141592653\\ & =3.141592654\\ x_5& =3.141592654-\tan 3.141592654\\ & =3.141592654 \end{align*}

    Since \(f(3.1415926535)=9.0\times 10^{-11}\gt 0\) and \(f(3.1415926545)=-9.1\times 10^{-11}\lt 0\text{,}\) \(\pi\) must be between \(3.1415926535\) and \(3.1415926545\text{.}\) Of course to compute \(\pi\) in this way, we (or at least our computers) have to be able to evaluate \(\tan x\) for various values of \(x\text{.}\) Taylor expansions can help us do that. See Example 3.4.22.

    Example C.1.4 wild instability.

    This example illustrates how Newton's method can go badly wrong if your initial guess is not good enough. We'll try to solve the equation

    \[ f(x)=\arctan x=0 \nonumber \]

    starting with \(x_1=1.5\text{.}\) (Of course the solution to \(f(x)=0\) is just \(x=0\text{;}\) we chose \(x_1=1.5\) for demonstration purposes.) Since the derivative \(f'(x)=\frac{1}{1+x^2} \text{,}\) Newton's method gives

    \[ x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)}=x_n-(1+x_n^2)\arctan x_n \nonumber \]

    So 3 Once again, the following computations have been carried out in double precision. This time, it is clear that the \(x_n\)'s are growing madly as \(n\) increases. So there is not much point to displaying many decimal places and we have not done so.

    \begin{align*} x_1& =1.5\\ x_2& =1.5-(1+1.5^2)\arctan 1.5=-1.69\\ x_3& =-1.69-(1+1.69^2)\arctan (-1.69)=2.32\\ x_4& =2.32-(1+2.32^2)\arctan (2.32)=-5.11\\ x_5& =-5.11-(1+5.11^2)\arctan (-5.11)=32.3\\ x_6& =32.3-(1+32.3^2)\arctan (32.3)=-1575\\ x_7& =3,894,976 \end{align*}

    Looks pretty bad! Our \(x_n\)'s are not settling down at all!

    The figure below shows what went wrong. In this figure, \(y=F_1(x)\) is the tangent line to \(y=\arctan x\) at \(x=x_1\text{.}\) Under Newton's method, this tangent line crosses the \(x\)-axis at \(x=x_2\text{.}\) Then \(y=F_2(x)\) is the tangent to \(y=\arctan x\) at \(x=x_2\text{.}\) Under Newton's method, this tangent line crosses the \(x\)-axis at \(x=x_3\text{.}\) And so on.

    The problem arose because the \(x_n\)'s were far enough from the solution, \(x=0\text{,}\) that the tangent line approximations, while good approximations to \(f(x)\) for \(x\approx x_n\text{,}\) were very poor approximations to \(f(x)\) for \(x\approx 0\text{.}\) In particular, \(y=F_1(x)\) (i.e. the tangent line at \(x=x_1\)) was a bad enough approximation to \(y=\arctan x\) for \(x\approx0\) that \(x=x_2\) (i.e. the value of \(x\) where \(y=F_1(x)\) crosses the \(x\)-axis) is farther from the solution \(x=0\) than our original guess \(x=x_1\text{.}\)


    If we had started with \(x_1=0.5\) instead of \(x_1=1.5\text{,}\) Newton's method would have succeeded very nicely:

    \begin{gather*} x_1=0.5\qquad x_2=-0.0796\qquad x_3=0.000335\qquad x_4=-2.51\times 10^{-11} \end{gather*}

    Example C.1.5 interest rate.

    A car dealer sells a new car for $23,520. He also offers to finance the same car for payments of $420 per month for five years. What interest rate is this dealer charging?

    Solution. By way of preparation, we'll start with a simpler problem. Suppose that you will have to make a single $420 payment \(n\) months in the future. The simpler problem is to determine how much money you have to deposit now in an account that pays an interest rate of \(100 r\%\) per month, compounded monthly 4 “Compounded monthly”, means that, each month, interest is paid on the accumulated interest that was paid in all previous months., in order to be able to make the $420 payment in \(n\) months.

    Let's denote by \(P\) the initial deposit. Because the interest rate is \(100 r\%\) per month, compounded monthly,

    • the first month's interest is \(P\times r\text{.}\) So at the end of month #1, the account balance is \(P+P\,r=P(1+r)\text{.}\)
    • The second month's interest is \([P(1+r)]\times r\text{.}\) So at the end of month #2, the account balance is \(P(1+r)+P(1+r)\,r=P(1+r)^2\text{.}\)
    • And so on.
    • So at the end of \(n\) months, the account balance is \(P(1+r)^n\text{.}\)

    In order for the balance at the end of \(n\) months, \(P(1+r)^n\text{,}\) to be $420, the initial deposit has to be \(P=420(1+r)^{-n}\text{.}\) That is what is meant by the statement “The present value 5 Inflation means that prices of goods (typically) increase with time, and hence $100 now is worth more than $100 in 10 years time. The term “present value” is widely used in economics and finance to mean “the current amount of money that will have a specified value at a specified time in the future”. It takes inflation into account. If the money is invested, it takes into account the rate of return of the investment. We recommend that the interested reader do some search-engining to find out more. of a $420 payment made \(n\) months in the future, when the interest rate is \(100 r\%\) per month, compounded monthly, is \(420(1+r)^{-n}\text{.}\)”

    Now back to the original problem. We will be making 60 monthly payments of $420. The present value of all 60 payments is 6 Don't worry if you don't know how to evaluate such sums. They are called geometric sums, and will be covered in the CLP-2 text. (See (1.1.3) in the CLP-2 text. In any event, you can check that this is correct, by multiplying the whole equation by \(1-(1+r)^{-1}\text{.}\) When you simplify the left hand side, you should get the right hand side.

    \begin{align*} & 420(1+r)^{-1}+420(1+r)^{-2}+\cdots +420(1+r)^{-60} \\ & \hskip1in =420\frac{(1+r)^{-1}-(1+r)^{-61} }{1-(1+r)^{-1} } \\ & \hskip1in=420\frac{1-(1+r)^{-60} }{(1+r)-1}\\ & \hskip1in =420\frac{1-(1+r)^{-60} }{r} \end{align*}

    The interest rate \(100r\%\) being charged by the car dealer is such that the present value of 60 monthly payments of $420 is $23520. That is, the monthly interest rate being charged by the car dealer is the solution of

    \begin{align*} 23520=420\frac{1-(1+r)^{-60} }{r} \qquad& \text{or}\qquad 56=\frac{1-(1+r)^{-60} }{r} \\ & \text{or}\qquad 56r=1-(1+r)^{-60} \\ & \text{or}\qquad 56r(1+r)^{60}=(1+r)^{60}-1 \\ & \text{or}\qquad (1-56r)(1+r)^{60}=1 \end{align*}

    Set \(f(r)=(1-56r)(1+r)^{60}-1\text{.}\) Then

    \[ f'(r)=-56(1+r)^{60}+60(1-56r)(1+r)^{59} \nonumber \]


    \[ f'(r)=\big[-56(1+r)+60(1-56r)\big](1+r)^{59} =(4-3416r)(1+r)^{59} \nonumber \]

    Apply Newton's method with an initial guess of \(r_1=.002\text{.}\) (That's \(0.2\)% per month or 2.4% per year.) Then

    \begin{align*} r_2& =r_1-\frac{(1-56r_1)(1+r_1)^{60}-1}{(4-3416r_1)(1+r_1)^{59}}=0.002344\\ \ r_3& =r_2-\frac{(1-56r_2)(1+r_2)^{60}-1}{(4-3416r_2)(1+r_2)^{59}}=0.002292\\ r_4& =r_3-\frac{(1-56r_3)(1+r_3)^{60}-1}{(4-3416r_3)(1+r_3)^{59}}=0.002290\\ r_5& =r_4-\frac{(1-56r_4)(1+r_4)^{60}-1}{(4-3416r_4)(1+r_4)^{59}}=0.002290 \end{align*}

    So the interest rate is 0.229% per month or 2.75% per year.

    This page titled C.1 Newton's Method is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Joel Feldman, Andrew Rechnitzer and Elyse Yeager via source content that was edited to the style and standards of the LibreTexts platform.

    • Was this article helpful?