2: Root Finding
The problem is to solve $f(x) = 0$ for $x$ when an explicit analytical solution is impossible. All methods compute a sequence $x_0, x_1, x_2, \dots$ that converges to the root $x = r$ satisfying $f(r) = 0$.
Bisection Method
The bisection method is conceptually the simplest method and almost always works. However, it is also the slowest method, and it should usually be avoided.
We first choose $x_0$ and $x_1$ such that $x_0 < r < x_1$. We say that $x_0$ and $x_1$ bracket the root. With $f(r) = 0$, we want $f(x_0)$ and $f(x_1)$ to be of opposite sign, so that $f(x_0) f(x_1) < 0$. We then assign $x_2$ to be the midpoint of $x_0$ and $x_1$, that is $x_2 = (x_1 + x_0)/2$, or
\[
x_2 = x_1 - \frac{x_1 - x_0}{2}.
\]
The sign of $f(x_2)$ is then determined, and the value of $x_3$ is chosen as either the midpoint of $x_2$ and $x_0$ or as the midpoint of $x_2$ and $x_1$, depending on whether $x_2$ and $x_0$ bracket the root, or $x_2$ and $x_1$ bracket the root. The root, therefore, stays bracketed at all times. The algorithm proceeds in this fashion and is typically stopped when the absolute value of the increment to the last best approximation to the root (above, given by $|x_1 - x_0|/2$) is smaller than some required precision.
Estimate $\sqrt{2} = 1.41421\dots$ using $x_0 = 1$ and $x_1 = 2$
Now $\sqrt{2}$ is the zero of the function $f(x) = x^2 - 2$. We iterate with $x_0 = 1$ and $x_1 = 2$. We have
\[
x_2 = 2 - \frac{2 - 1}{2} = \frac{3}{2} = 1.5.
\]
Now, $f(x_2) = 9/4 - 2 = 1/4 > 0$ so that $x_2$ and $x_0$ bracket the root. Therefore,
\[
x_3 = \frac{3}{2} - \frac{\frac{3}{2} - 1}{2} = \frac{5}{4} = 1.25.
\]
Now, $f(x_3) = 25/16 - 2 = -7/16 < 0$ so that $x_3$ and $x_2$ bracket the root. Therefore,
\[
x_4 = \frac{5}{4} - \frac{\frac{5}{4} - \frac{3}{2}}{2} = \frac{11}{8} = 1.375,
\]
and so on.
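As an illustration of the algorithm just described, here is a short Python sketch of bisection; the function name bisection and the tolerance argument tol are illustrative choices rather than anything prescribed above.

def bisection(f, x0, x1, tol=1e-8):
    """Find a root of f bracketed by x0 and x1 by repeated halving."""
    if f(x0) * f(x1) >= 0:
        raise ValueError("x0 and x1 must bracket the root")
    while abs(x1 - x0) / 2 > tol:
        x2 = x1 - (x1 - x0) / 2              # midpoint of the current bracket
        if f(x2) * f(x0) < 0:                # x0 and x2 bracket the root
            x1 = x2
        else:                                # x2 and x1 bracket the root
            x0 = x2
    return x1 - (x1 - x0) / 2

# Applied to f(x) = x^2 - 2 with the bracket [1, 2], the successive midpoints
# are 1.5, 1.25, 1.375, ... as in the example above.
print(bisection(lambda x: x**2 - 2, 1.0, 2.0))   # approximately 1.41421356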
Newton’s Method
This is the fastest method, but requires analytical computation of the derivative of $f(x)$. If the derivative is known, then this method should be used, although convergence to the desired root is not guaranteed.
We can derive Newton's Method from a Taylor series expansion. We again want to construct a sequence $x_0, x_1, x_2, \dots$ that converges to the root $x = r$. Consider the $x_{n+1}$ member of this sequence, and Taylor series expand $f(x_{n+1})$ about the point $x_n$. We have
\[
f(x_{n+1}) = f(x_n) + (x_{n+1} - x_n) f'(x_n) + \dots .
\]
To determine $x_{n+1}$, we drop the higher-order terms in the Taylor series, and assume $f(x_{n+1}) = 0$. Solving for $x_{n+1}$, we have
\[
x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.
\]
Starting Newton's Method requires a guess for $x_0$, to be chosen as close as possible to the root $x = r$.
Estimate $\sqrt{2}$ using $x_0 = 1$
Again, we solve $f(x) = 0$, where $f(x) = x^2 - 2$. To implement Newton's Method, we use $f'(x) = 2x$. Therefore, Newton's Method is the iteration
\[
x_{n+1} = x_n - \frac{x_n^2 - 2}{2 x_n}.
\]
With our initial guess $x_0 = 1$, we have
\[
x_1 = 1 - \frac{-1}{2} = \frac{3}{2} = 1.5, \quad
x_2 = \frac{3}{2} - \frac{\frac{9}{4} - 2}{3} = \frac{17}{12} = 1.416667, \quad
x_3 = \frac{17}{12} - \frac{\left(\frac{17}{12}\right)^2 - 2}{\frac{17}{6}} = \frac{577}{408} = 1.41422.
\]
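A minimal Python sketch of this iteration (the function name newton and the fixed number of iterations are illustrative choices):

def newton(f, fprime, x0, n_iter=5):
    """Newton's Method: repeatedly apply x -> x - f(x)/f'(x)."""
    x = x0
    for _ in range(n_iter):
        x = x - f(x) / fprime(x)
    return x

# Estimate sqrt(2) from the initial guess x0 = 1; the first iterates are
# 1.5, 1.416667, 1.414216, ... as computed above.
print(newton(lambda x: x**2 - 2, lambda x: 2 * x, 1.0))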
Secant Method
The Secant Method is second fastest to Newton's Method, and is most often used when it is not possible to take an analytical derivative of the function $f(x)$. We write in place of $f'(x_n)$,
\[
f'(x_n) \approx \frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}}.
\]
Starting the Secant Method requires a guess for both $x_0$ and $x_1$. These values need not bracket the root, but convergence is not guaranteed.
Estimate $\sqrt{2}$ using $x_0 = 1$ and $x_1 = 2$
Again, we solve $f(x) = 0$, where $f(x) = x^2 - 2$. The Secant Method iterates
\[
x_{n+1} = x_n - \frac{(x_n^2 - 2)(x_n - x_{n-1})}{x_n^2 - x_{n-1}^2} = x_n - \frac{x_n^2 - 2}{x_n + x_{n-1}}.
\]
With $x_0 = 1$ and $x_1 = 2$, we have
\[
x_2 = 2 - \frac{4 - 2}{2 + 1} = \frac{4}{3} = 1.33333, \quad
x_3 = \frac{4}{3} - \frac{\frac{16}{9} - 2}{\frac{4}{3} + 2} = \frac{21}{15} = 1.4, \quad
x_4 = \frac{21}{15} - \frac{\left(\frac{21}{15}\right)^2 - 2}{\frac{21}{15} + \frac{4}{3}} = \frac{174}{123} = 1.41463.
\]
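A Python sketch of the general Secant Method follows (the function name secant and the fixed iteration count n_iter are illustrative choices):

def secant(f, x0, x1, n_iter=6):
    """Secant Method: Newton's Method with f'(x_n) replaced by a secant slope."""
    for _ in range(n_iter):
        x0, x1 = x1, x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0))
    return x1

# Estimate sqrt(2) with x0 = 1 and x1 = 2; the first iterates are
# 1.33333, 1.4, 1.41463, ... as computed above.
print(secant(lambda x: x**2 - 2, 1.0, 2.0))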
A fractal from Newton’s Method
Consider the complex roots of the equation $f(z) = 0$, where
\[
f(z) = z^3 - 1.
\]
These roots are the three cubic roots of unity. With
\[
e^{i 2\pi n} = 1, \quad n = 0, 1, 2, \dots,
\]
the three unique cubic roots of unity are given by
\[
1, \quad e^{i 2\pi/3}, \quad e^{i 4\pi/3}.
\]
With
\[
e^{i\theta} = \cos\theta + i \sin\theta,
\]
and $\cos(2\pi/3) = -1/2$, $\sin(2\pi/3) = \sqrt{3}/2$, the three cubic roots of unity are
\[
r_1 = 1, \quad r_2 = -\frac{1}{2} + \frac{\sqrt{3}}{2} i, \quad r_3 = -\frac{1}{2} - \frac{\sqrt{3}}{2} i.
\]
The interesting idea here is to determine which initial values $z_0$ in the complex plane converge, under Newton's method, to which of the three cube roots of unity.
Newton's method generalized to the complex plane is
\[
z_{n+1} = z_n - \frac{z_n^3 - 1}{3 z_n^2}.
\]
If the iteration converges to $r_1$, we color $z_0$ red; if to $r_2$, blue; if to $r_3$, green. The result will be shown in lecture.
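Since the resulting picture is only shown in lecture, the following Python sketch (using numpy and matplotlib, which are assumptions here, as are the grid resolution and iteration count) computes which root each starting point converges to and colors the plane accordingly:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

# Grid of initial values z0 in the complex plane.
x = np.linspace(-2, 2, 400)
y = np.linspace(-2, 2, 400)
Z = x[np.newaxis, :] + 1j * y[:, np.newaxis]

# Apply Newton's method for f(z) = z^3 - 1 to every grid point at once.
for _ in range(40):
    Z = Z - (Z**3 - 1) / (3 * Z**2)

# The three cube roots of unity, and the index of the nearest root:
# 0 -> r1 (red), 1 -> r2 (blue), 2 -> r3 (green).
roots = np.array([1, -0.5 + 0.5j * np.sqrt(3), -0.5 - 0.5j * np.sqrt(3)])
basin = np.argmin(np.abs(Z[..., np.newaxis] - roots), axis=-1)

plt.imshow(basin, extent=(-2, 2, -2, 2), origin="lower",
           cmap=ListedColormap(["red", "blue", "green"]))
plt.xlabel("Re z0")
plt.ylabel("Im z0")
plt.show()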
Order of convergence
Let $r$ be the root and $x_n$ be the $n$th approximation to the root. Define the error as
\[
\epsilon_n = r - x_n.
\]
If for large $n$ we have the approximate relationship
\[
|\epsilon_{n+1}| = k |\epsilon_n|^p,
\]
with $k$ a positive constant and $p \geq 1$, then we say the root-finding numerical method is of order $p$. Larger values of $p$ correspond to faster convergence to the root. The order of convergence of bisection is one: the error is reduced by approximately a factor of 2 with each iteration so that
\[
|\epsilon_{n+1}| = \frac{1}{2} |\epsilon_n|.
\]
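As a small numerical illustration (the snippet below tracks the error bound $|x_1 - x_0|/2$ rather than the true error $r - x_n$, a simplification made here), the bisection bracket for $f(x) = x^2 - 2$ is cut exactly in half at every step:

# Track the bisection error bound (half the bracket width) for f(x) = x^2 - 2.
f = lambda x: x**2 - 2
x0, x1 = 1.0, 2.0
for n in range(6):
    print(n, abs(x1 - x0) / 2)           # halves every step: 0.5, 0.25, 0.125, ...
    x2 = x1 - (x1 - x0) / 2
    if f(x2) * f(x0) < 0:
        x1 = x2
    else:
        x0 = x2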
We now find the order of convergence for Newton’s Method and for the Secant Method.
Newton’s Method
We start with Newton’s Method
\[
x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)},
\]
where the root $r$ satisfies $f(r) = 0$. Subtracting both sides from $r$, we have
\[
r - x_{n+1} = r - x_n + \frac{f(x_n)}{f'(x_n)},
\]
or
\[
\epsilon_{n+1} = \epsilon_n + \frac{f(x_n)}{f'(x_n)}.
\]
We use Taylor series to expand the functions $f(x_n)$ and $f'(x_n)$ about the root $r$, using $f(r) = 0$. We have
\[
\begin{aligned}
f(x_n) &= f(r) + (x_n - r) f'(r) + \tfrac{1}{2}(x_n - r)^2 f''(r) + \dots \\
&= -\epsilon_n f'(r) + \tfrac{1}{2}\epsilon_n^2 f''(r) + \dots; \\
f'(x_n) &= f'(r) + (x_n - r) f''(r) + \tfrac{1}{2}(x_n - r)^2 f'''(r) + \dots \\
&= f'(r) - \epsilon_n f''(r) + \tfrac{1}{2}\epsilon_n^2 f'''(r) + \dots.
\end{aligned}
\]
To make further progress, we will make use of the following standard Taylor series:
\[
\frac{1}{1 - \epsilon} = 1 + \epsilon + \epsilon^2 + \dots,
\]
which converges for $|\epsilon| < 1$. Substituting the Taylor series expansions for $f(x_n)$ and $f'(x_n)$ into the expression for $\epsilon_{n+1}$, and using the geometric series, yields
\[
\begin{aligned}
\epsilon_{n+1} &= \epsilon_n + \frac{f(x_n)}{f'(x_n)} \\
&= \epsilon_n + \frac{-\epsilon_n f'(r) + \tfrac{1}{2}\epsilon_n^2 f''(r) + \dots}{f'(r) - \epsilon_n f''(r) + \tfrac{1}{2}\epsilon_n^2 f'''(r) + \dots} \\
&= \epsilon_n + \frac{-\epsilon_n + \tfrac{1}{2}\epsilon_n^2 \frac{f''(r)}{f'(r)} + \dots}{1 - \epsilon_n \frac{f''(r)}{f'(r)} + \dots} \\
&= \epsilon_n + \left(-\epsilon_n + \tfrac{1}{2}\epsilon_n^2 \frac{f''(r)}{f'(r)} + \dots\right)\left(1 + \epsilon_n \frac{f''(r)}{f'(r)} + \dots\right) \\
&= \epsilon_n + \left(-\epsilon_n + \epsilon_n^2\left(\tfrac{1}{2}\frac{f''(r)}{f'(r)} - \frac{f''(r)}{f'(r)}\right) + \dots\right) \\
&= -\frac{1}{2}\frac{f''(r)}{f'(r)}\epsilon_n^2 + \dots
\end{aligned}
\]
Therefore, we have shown that
\[
|\epsilon_{n+1}| = k |\epsilon_n|^2
\]
as $n \to \infty$, with
\[
k = \frac{1}{2}\left|\frac{f''(r)}{f'(r)}\right|,
\]
provided $f'(r) \neq 0$. Newton's method is thus of order 2 at simple roots.
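As a quick numerical check of this result (an illustration added here, with $f(x) = x^2 - 2$ chosen to match the earlier example), the ratio $|\epsilon_{n+1}|/|\epsilon_n|^2$ settles at $k = \frac{1}{2}|f''(r)/f'(r)| = 1/(2\sqrt{2}) \approx 0.354$:

import math

# Newton's Method for f(x) = x^2 - 2: the error is squared (up to the
# constant k) at every step, confirming second-order convergence.
r = math.sqrt(2)
x = 1.0
prev_err = abs(r - x)
for n in range(1, 5):                      # further steps hit double-precision limits
    x = x - (x**2 - 2) / (2 * x)
    err = abs(r - x)
    print(n, err, err / prev_err**2)       # last column approaches k = 0.354
    prev_err = err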
Secant Method
Determining the order of the Secant Method proceeds in a similar fashion. We start with
\[
x_{n+1} = x_n - \frac{(x_n - x_{n-1}) f(x_n)}{f(x_n) - f(x_{n-1})}.
\]
We subtract both sides from $r$ and make use of
\[
x_n - x_{n-1} = (r - x_{n-1}) - (r - x_n) = \epsilon_{n-1} - \epsilon_n
\]
and the Taylor series
\[
\begin{aligned}
f(x_n) &= -\epsilon_n f'(r) + \tfrac{1}{2}\epsilon_n^2 f''(r) + \dots, \\
f(x_{n-1}) &= -\epsilon_{n-1} f'(r) + \tfrac{1}{2}\epsilon_{n-1}^2 f''(r) + \dots,
\end{aligned}
\]
so that
\[
\begin{aligned}
f(x_n) - f(x_{n-1}) &= (\epsilon_{n-1} - \epsilon_n) f'(r) + \tfrac{1}{2}\left(\epsilon_n^2 - \epsilon_{n-1}^2\right) f''(r) + \dots \\
&= (\epsilon_{n-1} - \epsilon_n)\left(f'(r) - \tfrac{1}{2}(\epsilon_{n-1} + \epsilon_n) f''(r) + \dots\right)
\end{aligned}
\]
We therefore have
\[
\begin{aligned}
\epsilon_{n+1} &= \epsilon_n + \frac{-\epsilon_n f'(r) + \tfrac{1}{2}\epsilon_n^2 f''(r) + \dots}{f'(r) - \tfrac{1}{2}(\epsilon_{n-1} + \epsilon_n) f''(r) + \dots} \\
&= \epsilon_n - \epsilon_n \frac{1 - \tfrac{1}{2}\epsilon_n \frac{f''(r)}{f'(r)} + \dots}{1 - \tfrac{1}{2}(\epsilon_{n-1} + \epsilon_n)\frac{f''(r)}{f'(r)} + \dots} \\
&= \epsilon_n - \epsilon_n \left(1 - \tfrac{1}{2}\epsilon_n \frac{f''(r)}{f'(r)} + \dots\right)\left(1 + \tfrac{1}{2}(\epsilon_{n-1} + \epsilon_n)\frac{f''(r)}{f'(r)} + \dots\right) \\
&= -\frac{1}{2}\frac{f''(r)}{f'(r)}\epsilon_{n-1}\epsilon_n + \dots
\end{aligned}
\]
or to leading order
\[
|\epsilon_{n+1}| = \frac{1}{2}\left|\frac{f''(r)}{f'(r)}\right| |\epsilon_{n-1}| |\epsilon_n|.
\]
The order of convergence is not yet obvious from this equation, but should be less than quadratic because $|\epsilon_{n-1}| > |\epsilon_n|$. To determine the scaling law we look for a solution of the form
\[
|\epsilon_{n+1}| = k |\epsilon_n|^p.
\]
From this ansatz, we also have
\[
|\epsilon_n| = k |\epsilon_{n-1}|^p,
\]
and therefore
\[
|\epsilon_{n+1}| = k^{p+1} |\epsilon_{n-1}|^{p^2}.
\]
Substitution into the leading-order error relation above results in
\[
k^{p+1} |\epsilon_{n-1}|^{p^2} = \frac{k}{2}\left|\frac{f''(r)}{f'(r)}\right| |\epsilon_{n-1}|^{p+1}.
\]
Equating the coefficient and the power of $\epsilon_{n-1}$ results in
\[
k^p = \frac{1}{2}\left|\frac{f''(r)}{f'(r)}\right|,
\]
and
\[
p^2 = p + 1.
\]
The order of convergence of the Secant Method, given by $p$, is therefore the positive root of the quadratic equation $p^2 - p - 1 = 0$, or
\[
p = \frac{1 + \sqrt{5}}{2} \approx 1.618,
\]
which, coincidentally, is the famous irrational number known as the Golden Ratio, denoted by the symbol $\Phi$. We see that the Secant Method has an order of convergence lying between those of the Bisection Method and Newton's Method.
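As an added numerical check (with $f(x) = x^2 - 2$ and the estimator $p \approx \log(|\epsilon_{n+1}|/|\epsilon_n|)/\log(|\epsilon_n|/|\epsilon_{n-1}|)$, both choices made here for illustration), the measured order of the Secant Method tends toward the Golden Ratio:

import math

# Estimate the order of convergence of the Secant Method for f(x) = x^2 - 2.
r = math.sqrt(2)
x0, x1 = 1.0, 2.0
errors = []
for _ in range(6):
    x0, x1 = x1, x1 - (x1**2 - 2) * (x1 - x0) / (x1**2 - x0**2)
    errors.append(abs(r - x1))

# Successive estimates of p from triples of consecutive errors.
for e0, e1, e2 in zip(errors, errors[1:], errors[2:]):
    if e2 > 0:
        print(math.log(e2 / e1) / math.log(e1 / e0))   # tends toward p = 1.618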