
4.3: Numerical Approximation of Roots of Functions


    When finding critical points of a function \(f\), you encounter the problem of solving the equation \(f'(x)=0\). The examples and exercises so far were set up carefully so that solutions to that equation could be found in a simple closed form. But in practice this will not always be the case—in fact it is almost never the case. For example, finding the critical points of \(f(x) = \sin\,x \;-\; \frac{x^2}{2}\) entails solving the equation \(f'(x) = \cos\,x \;-\; x ~=~ 0\), for which there is no solution in a closed-form expression.

What should you do in such a situation? One possibility is to use the bisection method mentioned in Section 3.3. In fact, in an example in that section the solution to the equation \(\cos\,x = x\) (i.e. \(\cos\,x \;-\; x \;=\; 0\)) was shown to exist in the interval \([0,1]\), and a demonstration of the bisection method was given to find that solution.

The bisection method is one of many numerical methods for finding roots of a function (i.e. points where the function is zero). Finding the critical points of a function means finding the roots of its derivative. Though the bisection method could be used for that purpose, it is not efficient—convergence to the root is slow. A far more efficient method is Newton’s method, whose geometric interpretation is shown in Figure [fig:newton] below.

The idea behind Newton’s method is simple: to find a root \(\bar{x}\) of a function \(f\), choose an initial guess \(x_0\) and then go up—or down—to the curve \(y=f(x)\) and draw the tangent line to the curve at the point \((x_0,f(x_0))\). Let \(x_1\) be where that tangent line intersects the \(x\)-axis, as shown above; repeat this procedure on \(x_1\) to get the next number \(x_2\), repeat on \(x_2\) to get \(x_3\), and so on. The resulting sequence of numbers \(x_0\), \(x_1\), \(x_2\), \(x_3\), \(\ldots\) will approach the root \(\bar{x}\); convergence can be proved under certain conditions. The general formula for the number \(x_n\) obtained after \(n \ge 1\) iterations of Newton’s method can be determined by considering the formula for \(x_1\). First, the tangent line to \(y=f(x)\) at the point \((x_0,f(x_0))\) has slope \(f'(x_0)\), so the equation of the line is

    \[\label{eqn:newton} y ~-~ f(x_0) ~=~ f'(x_0)\,(x ~-~ x_0) ~. \]

    The point \((x_1,0)\) is (by design) also on that line, so that

    \[0 ~-~ f(x_0) ~=~ f'(x_0)\,(x_1 ~-~ x_0) \quad\Rightarrow\quad x_1 ~=~ x_0 ~-~ \frac{f(x_0)}{f'(x_0)} \nonumber \]

provided that \(f'(x_0) \ne 0\). The general formula for \(x_n\) is obtained by applying the same construction at each step.
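Replacing \(x_0\) by \(x_{n-1}\) and \(x_1\) by \(x_n\) in the formula above gives the iteration

\[x_n ~=~ x_{n-1} ~-~ \frac{f(x_{n-1})}{f'(x_{n-1})} \qquad\text{for } n \ge 1, \nonumber \]

valid as long as \(f'(x_{n-1}) \ne 0\) at each step.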

To implement this algorithm in a programming language (for which Newton’s method is well-suited), the structure is the same in any language: evaluate \(f\) and \(f'\) at the current iterate, apply the formula above, and repeat. The Java program in the following example can serve as a guide.

Example \(\PageIndex{1}\): newt1

Use Newton’s method to find the root of \(f(x) = \cos\,x - x\).

Solution

Since the root is already known to be in the interval \([0,1]\), choose \(x_0 = 1\) as the initial guess. The numbers \(x_n\) for \(n \ge 1\) can be computed with a hand-held scientific calculator, but the process is tedious and error-prone. Using a computer is far more efficient and allows more flexibility.

    For example, the algorithm is easily implemented in the Java programming language. Save this code in a plain text file as newton.java:

    public class newton {
       public static void main(String[] args) {
          int N = Integer.parseInt(args[0]); //Number of iterations
          double x = 1.0; //initial guess
          System.out.println("n=0: " + x);
          for (int i = 1; i <= N; i++) {
             x =  x - f(x)/derivf(x);
             System.out.println("n=" + i + ": " + x);
          }
       }
     
       //Define the function f(x)
       public static double f(double x) {
          return Math.cos(x) - x;
       }
    
       //Define the derivative f'(x)
       public static double derivf(double x) {
          return -Math.sin(x) - 1.0;
       }
    }

Though knowledge of Java would help, it should not be too difficult to figure out what the above code is doing. The number of iterations \(N\) is passed as a command-line parameter to the program, and \(x_n\) is computed and printed for \(n=0\), \(1\), \(2\), \(\ldots\), \(N\). Note that the derivative of \(f(x)\) is “hard-coded” into the program. There is also no error checking for the derivative being zero at any \(x_n\); since floating-point division by zero in Java does not raise an error, the program would simply start printing Infinity or NaN instead of meaningful approximations.
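One simple way to add such a check, as a rough sketch (the exact-zero test and the choice to just print a message and stop are arbitrary choices here; a small tolerance could be used instead), is to replace the loop in newton.java with:

   for (int i = 1; i <= N; i++) {
      double dfx = derivf(x);
      if (dfx == 0.0) { //tangent line is horizontal, so the next iterate is undefined
         System.out.println("f'(x) = 0 at x = " + x + "; stopping.");
         break;
      }
      x = x - f(x)/dfx;
      System.out.println("n=" + i + ": " + x);
   }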

    Compile the code, then run the program with \(10\) iterations:

    javac newton.java
    java newton 10

    The output is shown below:

    n=0: 1.0
    n=1: 0.7503638678402439
    n=2: 0.7391128909113617
    n=3: 0.739085133385284
    n=4: 0.7390851332151607
    n=5: 0.7390851332151607
    n=6: 0.7390851332151607
    n=7: 0.7390851332151607
    n=8: 0.7390851332151607
    n=9: 0.7390851332151607
    n=10: 0.7390851332151607

    Note that the solution \(\bar{x} = 0.7390851332151607\) was found after only 4 iterations; the numbers \(x_n\) repeat for \(n \ge 5\). This is much faster than the bisection method.

    Another root-finding numerical method similar to Newton’s method is the secant method, whose geometric interpretation is shown in Figure [fig:secant] below:

The idea behind the secant method is simple: to find a root \(\bar{x}\) of a function \(f\), choose two initial guesses \(x_0\) and \(x_1\), then go up—or down—to the curve \(y=f(x)\) and draw the secant line through the points \((x_0,f(x_0))\) and \((x_1,f(x_1))\) on the curve. Let \(x_2\) be where that secant line intersects the \(x\)-axis, as shown above; repeat this procedure on \(x_1\) and \(x_2\) to get the next number \(x_3\), and keep repeating in this way. The resulting sequence of numbers \(x_0\), \(x_1\), \(x_2\), \(x_3\), \(\ldots\) will approach the root \(\bar{x}\) under the right conditions. Since the secant line through \((x_0,f(x_0))\) and \((x_1,f(x_1))\) has slope \(\frac{f(x_1) - f(x_0)}{x_1 - x_0}\), the equation of that secant line is:

    \[y ~-~ f(x_1) ~=~ \frac{f(x_1) - f(x_0)}{x_1 - x_0}\,(x ~-~ x_1) \nonumber \]

    The point \((x_2,0)\) is on that line, so that

    \[0 ~-~ f(x_1) ~=~ \frac{f(x_1) - f(x_0)}{x_1 - x_0}\,(x_2 ~-~ x_1) \quad\Rightarrow\quad x_2 ~=~ x_1 ~-~ \frac{(x_1 ~-~ x_0) \cdot f(x_1)}{f(x_1) ~-~ f(x_0)} \nonumber \]

provided that \(f(x_1) \ne f(x_0)\) (which in particular requires \(x_1 \ne x_0\)). The general formula for \(x_n\) is obtained by applying the same construction at each step.
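Replacing \(x_0\), \(x_1\), \(x_2\) by \(x_{n-2}\), \(x_{n-1}\), \(x_n\) in the formula above gives the iteration

\[x_n ~=~ x_{n-1} ~-~ \frac{(x_{n-1} ~-~ x_{n-2}) \cdot f(x_{n-1})}{f(x_{n-1}) ~-~ f(x_{n-2})} \qquad\text{for } n \ge 2, \nonumber \]

valid as long as \(f(x_{n-1}) \ne f(x_{n-2})\) at each step.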

One difference you might have noticed between the secant method and Newton’s method is that the secant method does not use derivatives. The secant method replaces the derivative in Newton’s method with the slope of a secant line, which approximates the derivative (recall that the slope of the tangent line is the limit of slopes of secant lines). This might seem like a drawback, perhaps giving a “less accurate” slope than the tangent line, but in practice it is not really a problem. In fact, in many cases the secant method is preferable, since computing derivatives can often be quite complicated.
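In other words, the secant update is exactly Newton’s update with \(f'(x_{n-1})\) replaced by the difference quotient

\[f'(x_{n-1}) ~\approx~ \frac{f(x_{n-1}) ~-~ f(x_{n-2})}{x_{n-1} ~-~ x_{n-2}} ~. \nonumber \]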

Example \(\PageIndex{2}\): secant1

Use the secant method to find the root of \(f(x) = \cos\,x - x\).

Solution

Since the root is already known to be in the interval \([0,1]\), choose \(x_0 = 0\) and \(x_1 = 1\) as the two initial guesses. The algorithm is easily implemented in the Java programming language. Save this code in a plain text file as secant.java:

    public class secant {
       public static void main(String[] args) {
          int N = Integer.parseInt(args[0]); //Number of iterations
          double x0 = 0.0; //first initial guess
          double x1 = 1.0; //second initial guess
          double f0 = f(x0);
          double f1;
          double x = 0.0;
          for (int i = 2; i <= N; i++) {
             f1 = f(x1);
             x = x1 - (x1 - x0)*f1/(f1 - f0);
             x0 = x1;
             f0 = f1; //Re-use f1 as f0 in the next iteration
             x1 = x;
             System.out.println("n=" + i + ": " + x);
          }
       }
    
       //Define the function f(x)
       public static double f(double x) {
          return Math.cos(x) - x;
       }
    }

    Compile the code, then run the program:

    javac secant.java
    java secant 10

    The output is shown below:

    n=2: 0.6850733573260451
    n=3: 0.736298997613654
    n=4: 0.7391193619116293
    n=5: 0.7390851121274639
    n=6: 0.7390851332150012
    n=7: 0.7390851332151607
    n=8: 0.7390851332151607
    n=9: NaN
    n=10: NaN

Notice that the root was found after 6 iterations (\(n=7\)). The undefined value NaN (which stands for “Not a Number”) was returned starting with the eighth iteration (\(n=9\)) because \(x_7 = x_8\), so that both \(x_8 - x_7 = 0\) and \(f(x_8) - f(x_7) = 0\), producing the indeterminate form \(0/0\) (which Java evaluates to NaN for floating-point numbers) in the term

    \[x_9 ~=~ x_8 ~-~ \frac{(x_8 ~-~ x_7) \cdot f(x_8)}{f(x_8) ~-~ f(x_7)} ~. \nonumber \]

    For the function \(f(x) = \cos\,x \,-\, x\), the table below summarizes the results of 10 iterations of the bisection method, Newton’s method and the secant method:

    Term Bisection Newton Secant
    \(x_0\) 0.5 1.0 0.0
    \(x_1\) 0.75 0.7503638678402439 1.0
    \(x_2\) 0.625 0.7391128909113617 0.6850733573260451
    \(x_3\) 0.6875 0.739085133385284 0.736298997613654
    \(x_4\) 0.71875 0.7390851332151607 0.7391193619116293
    \(x_5\) 0.734375 0.7390851332151607 0.7390851121274639
    \(x_6\) 0.7421875 0.7390851332151607 0.7390851332150012
    \(x_7\) 0.73828125 0.7390851332151607 0.7390851332151607
    \(x_8\) 0.740234375 0.7390851332151607 0.7390851332151607
    \(x_9\) 0.7392578125 0.7390851332151607 undefined
    \(x_{10}\) 0.73876953125 0.7390851332151607 undefined

    Newton’s method found the root after 4 iterations, while the secant method needed 6 iterations. After 10 iterations the bisection method had yet to find the root to the same level of precision as the other methods—it would take 52 iterations (that is, \(x_{52}\)) to achieve similar accuracy to 16 decimal places.

    In general Newton’s method requires fewer iterations to find a root than the secant method does, but this does not necessarily mean that it will always be faster. Depending on the complexity of the function and its derivative, Newton’s method could involve more “expensive” operations (i.e. computing values, as opposed to assigning values) than the secant method, so that the few more iterations possibly required by the secant method are made up for by fewer total computations.

To see this, notice that Newton’s method always requires that both \(f(x_{n-1})\) and \(f'(x_{n-1})\) be computed for the \(n\)th term \(x_n\) in the sequence. The secant method needs \(f(x_{n-1})\) and \(f(x_{n-2})\) for the \(n\)th term, but a good programmer would save the value of \(f(x_{n-1})\) so that it can be re-used (and hence not re-computed) as \(f(x_{n-2})\) in the next iteration, as the line f0 = f1; does in secant.java above, resulting in potentially fewer total computations for the secant method.

There are occasional pitfalls in using Newton’s method. For example, if \(f'(x_n) = 0\) for some \(n \ge 0\) then Newton’s method fails, due to division by 0 in the algorithm. The geometric reason is clear: the tangent line to the curve at that point would be parallel to the \(x\)-axis and hence would not intersect it (assuming \(f(x_n) \ne 0\)). There would be no “next number” \(x_{n+1}\) in the iteration! See Figure [fig:newtonpitfalls](a) below.

Another possible problem is that Newton’s method might move you away from the root, i.e. not get closer, typically because of a poor choice of \(x_0\). See Figure [fig:newtonpitfalls](b) above. In some extreme cases, it is possible that Newton’s method simply loops back and forth endlessly between the same two numbers, as in Figure [fig:newtonloop]; a concrete instance of this looping behavior is worked out below.
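For instance (this particular function does not appear in the figures; it is just a standard illustration of the problem), take \(f(x) = x^3 - 2x + 2\), so that \(f'(x) = 3x^2 - 2\), and choose \(x_0 = 0\). Then

\[x_1 ~=~ 0 ~-~ \frac{f(0)}{f'(0)} ~=~ 0 ~-~ \frac{2}{-2} ~=~ 1 \qquad\text{and}\qquad x_2 ~=~ 1 ~-~ \frac{f(1)}{f'(1)} ~=~ 1 ~-~ \frac{1}{1} ~=~ 0 ~, \nonumber \]

so the iterates simply alternate between 0 and 1 forever and never approach the actual root (which is near \(x = -1.77\)).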

In most cases a different choice for the initial guess \(x_0\) will fix such problems. Most textbooks on the subject of numerical analysis discuss these issues. There are conditions under which Newton’s method is guaranteed to work, and convergence is fast. Newton’s method has a quadratic rate of convergence, meaning roughly that the error terms—the differences between approximate roots and the actual root—are being squared in the long term. More precisely, if the numbers \(x_n\) for \(n \ge 0\) converge to a root \(\bar{x}\), then the error terms \(\epsilon_n = x_n - \bar{x}\) for \(n \ge 0\) satisfy the limit

\[\lim_{n \to \infty} ~\dfrac{\left|\epsilon_{n+1}\right|}{\left|\epsilon_n\right|^2} ~=~ C \nonumber \]

for some constant \(C\). Squared error terms might sound like a bad thing, but the \(x_n\) terms are converging to the root, making the error terms closer to 0 for large \(n\). Squaring a number \(\epsilon_n\) when \(\left|\epsilon_n\right| < 1\) results in a smaller number, not a larger one.
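This can be seen in the Newton iterates computed above for \(f(x) = \cos\,x - x\) with \(x_0 = 1\) (the error values below are rounded from those iterates): \(\left|\epsilon_1\right| \approx 1.13 \times 10^{-2}\), \(\left|\epsilon_2\right| \approx 2.78 \times 10^{-5}\), and \(\left|\epsilon_3\right| \approx 1.70 \times 10^{-10}\), so that

\[\frac{\left|\epsilon_2\right|}{\left|\epsilon_1\right|^2} ~\approx~ 0.218 \qquad\text{and}\qquad \frac{\left|\epsilon_3\right|}{\left|\epsilon_2\right|^2} ~\approx~ 0.221 ~, \nonumber \]

consistent with a quadratic rate of convergence with \(C \approx 0.22\).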

    The numerical methods that you have learned will make it possible to sketch the graphs of many more functions, since finding local minima and maxima involves finding roots of \(f'\), and finding inflection points involves finding roots of \(f''\). You now know some methods for finding those roots.

    Finally, despite being slower, the bisection method has the nice advantage of always working. The speed of modern computers makes the difference in algorithmic efficiency negligible in many cases.
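For comparison, here is a minimal sketch of the bisection method for the same equation \(\cos\,x - x = 0\) on \([0,1]\), written in the same style as newton.java above (the class name bisect and the choice to print the midpoints as the approximations are just illustrative):

public class bisect {
   public static void main(String[] args) {
      int N = Integer.parseInt(args[0]); //Number of iterations
      double a = 0.0; //left endpoint of an interval containing the root
      double b = 1.0; //right endpoint of an interval containing the root
      for (int i = 0; i <= N; i++) {
         double m = (a + b)/2.0; //midpoint of the current interval
         System.out.println("n=" + i + ": " + m);
         if (f(a)*f(m) < 0.0)
            b = m; //sign change in [a,m], so the root is there
         else
            a = m; //otherwise the root is in [m,b]
      }
   }

   //Define the function f(x)
   public static double f(double x) {
      return Math.cos(x) - x;
   }
}

Compiling it (javac bisect.java) and running it with 10 iterations (java bisect 10) reproduces the bisection column of the table above.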

Exercises

1. Use Newton’s method to find the root of \(f(x) = \cos\,x - 2x\).

2. Use Newton’s method to find the positive root of \(f(x) = \sin\,x - x/2\).

3. Use Newton’s method to find the solution of the equation \(e^{-x} = x\).

4. Use Newton’s method to find the solution of the equation \(e^{-x} = x^2\).

5. Use Newton’s method and \(f(x) = x^2 - 2\) to approximate \(\sqrt{2}\) accurate to six decimal places.

6. Use Newton’s method to approximate \(\sqrt{3}\) accurate to six decimal places.

7. Repeat Exercise 1 with the secant method.

8. Repeat Exercise 3 with the secant method.

9. Repeat Exercise 5 with the secant method.

10. Repeat Exercise 6 with the secant method.

11. Cosmic microwave background radiation is described by a function similar to \(f(x) = \frac{x^3}{-1 + e^x}\) for \(x \ge 0\). Use Newton’s method to find the global maximum of \(f\) accurate to four decimal places.

12. Would a different choice for \(x_0\) in either graph in Figure [fig:newtonloop] eliminate the infinite loop? Explain.

13. Draw a graph without any symmetry that has the infinite loop problem for Newton’s method.

14. The time \(t(p)\) required for a \(p\)-way merge of a file on a single disk drive into memory is

\[t(p) ~=~ \frac{pa + m}{\ln\,p} ~, \nonumber \]

where \(p>1\), \(a\) is the disk access time, and \(m\) is the time to read in one segment of the size of memory. Find the integer closest to the value of \(p\) that minimizes \(t(p)\) when \(a = 24.3\) ms and \(m = 3500.3\) ms.


    This page titled 4.3: Numerical Approximation of Roots of Functions is shared under a GNU General Public License 3.0 license and was authored, remixed, and/or curated by Michael Corral.
