Given a set of data, polynomial interpolation is a method of finding a polynomial function that fits a set of data points exactly. Though there are several methods for finding this polynomial, the polynomial itself is unique, which we will prove later.
3.2.1: Lagrange Polynomial
One of the most common ways to perform polynomial interpolation is by using the Lagrange polynomial. To motivate this method, we begin by constructing a polynomial that goes through 2 data points (x0,y0) and x1,y1. We use two equations from college algebra.
y−y1=m(x−x1)andm=y1−y0x1−x0
Combining these, we end up with:
y=y1−y0x1−x0(x−x1)+y1.
Now, to derive a formula similar to that used for the Lagrange polynomial, we perform some algebra. We begin by swapping the terms y1−y0 and x−x1:
y=x−x1x1−x0(y1−y0)+y1.
Distributing the fraction:
y=x−x1x1−x0y1−x−x1x1−x0y0+y1.
Multiplying the right-most y1 term by x1−x0x1−x0=1:
y=x−x1x1−x0y1−x−x1x1−x0y0+x1−x0x1−x0y1.
Combining to the y1 terms:
y=−x−x1x1−x0y0+x−x0x1−x0y1.
and finally flipping the denominator of the first term to get rid of the negative:
y=x−x1x0−x1y0+x−x0x1−x0y1.
In the above, we can observe that, when x=x1, it follows that the first term cancels out with a zero on top and the second term ends up as 1⋅y1=y1, as desired. Similarly, if x=x0, then the first term ends up as 1⋅y0=y0 and the second term cancels out with a zero on top, causing the entire expression to be y0, as desired.
For three data points (x0,y0),(x1,y1),and(x2,y2), we can derive a polynomial that behaves similarly:
y=(x−x1)(x−x2)(x0−x1)(x0−x2)y0+(x−x0)(x−x2)(x1−x0)(x1−x2)y1+(x−x0)(x−x1)(x2−x0)(x2−x1)y2.
In the above, plugging in, for example, x=x1 has the first and third terms cancelling out with zero and the middle term turning into 1⋅y1, as desired. We can follow this pattern to arrive at the full Lagrange polynomial.
THE LAGRANGE POLYNOMIAL
Given n+1 data points (x0,y0),(x1,y1),…,(xn,yn) with x0<x1<⋯<xn, the Lagrange polynomial is the nth degree polynomial passing through each of these points and written as: y=(x−x1)(x−x2)⋯(x−xn)(x0−x1)(x0−x2)⋯(x0−xn)y0+(x−x0)(x−x2)⋯(x−xn)(x1−x0)(x1−x2)⋯(x1−xn)y1+⋯+(x−x0)(x−x1)⋯(x−xn−1)(xn−x0)(xn−x1)⋯(xn−xn−1)yn Equivalently, we can use product notation: y=n∑i=0(yin∏j=0j≠ix−xjxi−xj)
Note that in the above polynomial, each numerator is written such that, for x=xi, each coefficient vanishes except for the coefficient to yi, which evaluates to yi. Thus the above polynomial passes through each of the desired data points and, as can be checked, is of degree n.
Now, let's work with Python. To do this, we use function blocks for the first time. We have used functions in the past in Python, such as sin(x) or cos(x). Here, we create our own function using the def keyword. Below, we define f(o), where o is the dynamic variable. At the end of a function block, we "return", using the return keyword, the result of the function calculation. In the code below, the variable o represents the variable x in the Lagrange Polynomial definition above. We do this since x is already used for our data points.
02 | import matplotlib.pyplot as plot |
10 | prange = np.linspace(x[ 0 ],x[pts], 50 ) |
12 | plot.plot(x,y,marker = 'o' , color = 'r' , ls = '',markersize = 10 ) |
15 | z = ((o - x[ 1 ]) / (x[ 0 ] - x[ 1 ])) * y[ 0 ] + ((o - x[ 0 ]) / (x[ 1 ] - x[ 0 ])) * y[ 1 ] |
18 | plot.plot(prange,f(prange)) |
output:

Now, we do the same as the above except with 3 data points. Notice how the only changes are the data point lists and the function. Below, we again use the "\" symbol to let Python know to continue on the next line. This is done to make the polynomial easier to read.
02 | import matplotlib.pyplot as plot |
09 | prange = np.linspace(x[ 0 ],x[n], 50 ) |
11 | plot.plot(x,y,marker = 'o' , color = 'r' , ls = '', markersize = 10 ) |
15 | (((o - x[ 1 ]) * (o - x[ 2 ])) / ((x[ 0 ] - x[ 1 ]) * (x[ 0 ] - x[ 2 ]))) * y[ 0 ] + \ |
16 | (((o - x[ 0 ]) * (o - x[ 2 ])) / ((x[ 1 ] - x[ 0 ]) * (x[ 1 ] - x[ 2 ]))) * y[ 1 ] + \ |
17 | (((o - x[ 0 ]) * (o - x[ 1 ])) / ((x[ 2 ] - x[ 0 ]) * (x[ 2 ] - x[ 1 ]))) * y[ 2 ] |
20 | plot.plot(prange,f(prange)) |
output:

Using the product notation of Lagrange Polynomials, we can even come up with a program that accepts any number of data points. y=n∑i=0(yin∏j=0j≠ix−xjxi−xj) In the code below, i and j represent the i and j in the definition above.
02 | import matplotlib.pyplot as plot |
04 | x = [ 1 , 3 , 5 , 7 , 8 , 9 , 10 , 12 , 13 ] |
05 | y = [ 50 , - 30 , - 20 , 20 , 5 , 1 , 30 , 80 , - 10 ] |
08 | prange = np.linspace( min (x), max (x), 500 ) |
10 | plot.plot(x,y,marker = 'o' , color = 'r' , ls = '', markersize = 10 ) |
18 | prod = prod * (o - x[j]) / (x[i] - x[j]) |
22 | plot.plot(prange,f(prange)) |
output:

While Lagrange polynomials are among the easiest methods to understand intuitively and are efficient for calculating a specific y(x), they fail when when attempting to either find an explicit formula y=a0+a1x+⋯+anxn or when performing incremental interpolation, that is, adding data points after the initial interpolation is performed. For incremental interpolation, we would need to complete re-perform the entire evaluation.
3.2.2: Newton interpolation
Newton interpolation is an alternative to the Lagrange polynomial. Though it appears more cryptic, it allows for incremental interpolation and provides an efficient way to find an explicit formula y=a0+a1x+⋯+anxn.
Newton interpolation is all about finding coefficients and then using those coefficients to calculate subsequent coefficients. Since an important part of Newton interpolation is that it can be used for incremental interpolation, let's start with a single data point and show the calculations as we add points.
With one data point (x0,y0), the calculation is simple:
b0=y0
and the polynomial is
y=b0.
Let's add a new data point (x1,y1). The next coefficient b1 is usually denoted by [y0,y1]:
b1=[y0,y1]=y1−y0x1−x0
and the polynomial is
y=b0+b1(x−x0).
Notice how we did not have to re-calculate the entire polynomial, only the new coefficient to (x−x0).
With a third data point (x2,y2), we find the coefficient b2=[y0,y1,y2]:
b2=[y0,y1,y2]=[y1,y2]−[y0,y1]x2−x0=y2−y1x2−x1−y1−y0x1−x0x2−x0
with polynomial
y=b0+b1(x−x0)+b2(x−x0)(x−x1).
You may have noticed the recursive nature of the previous definitions. This continues for Newton interpolation in general.
NEWTON INTERPOLATION
Given n+1 data points (x0,y0),(x1,y1),…,(xn,yn) with x0<x1<⋯<xn, the Newton interpolation polynomial is the nth degree polynomial passing through each of these points and written as: y=b0+b1(x−x0)+b2(x−x0)(x−x1)+⋯+bn(x−x0)(x−x1)⋯(x−xn−1) where b0=y0b2=[y0,y1]=y1−y0x1−x0b2=[y0,y1,y2]=y2−y1x2−x1−y1−y0x1−x0x2−x0⋮bn=[y0,y2,…,yn]=[y1,y2,…,yn]−[y0,y1,…,yn−1]xn−x0
Let's begin by finding a polynomial with three data points using Newton's Method.
Example: Use Newton's method to calculating a polynomial through the points (1,3),(5,7) and (8,0).
For this example, we will use a "tableau" to help organize our data.
x0 |
y0 |
|
|
x1 |
y1 |
[y1,y0] |
|
x2 |
y2 |
[y2,y1] |
[y0,y1,y2] |
We start by filling in our data points:
1 |
3 |
|
|
5 |
7 |
[y1,y0] |
|
8 |
0 |
[y2,y1] |
[y0,y1,y2] |
Then we calculate: [y1,y0]=7−35−1=1 [y2,y1]=0−78−5=−73 and fill in this data:
1 |
3 |
|
|
5 |
7 |
1 |
|
8 |
0 |
−73 |
[y0,y1,y2] |
Now, we calculate the remaining item, using the previously-calculated terms: [y0,y1,y2]=[y1,y2]−[y0,y1]x2−x0=−73−18−1=−1021
Note, now, that b0,b1, and b2 are written on the top diagonal. Thus our final polynomial is: y=b0+b1(x−x0)+b2(x−x0)(x−x1)=3+1(x−1)−1021(x−1)(x−5)
Now, let's program this for three data points to check our work!
02 | import matplotlib.pyplot as plot |
08 | prange = np.linspace( min (x), max (x), 500 ) |
10 | plot.plot(x,y,marker = 'o' , color = 'r' , ls = '', markersize = 10 ) |
14 | b1 = (y[ 1 ] - y[ 0 ]) / (x[ 1 ] - x[ 0 ]) |
15 | b1p = (y[ 2 ] - y[ 1 ]) / (x[ 2 ] - x[ 1 ]) |
16 | b2 = (b1p - b1) / (x[ 2 ] - x[ 0 ]) |
17 | poly = b0 + b1 * (o - x[ 0 ]) + b2 * (o - x[ 0 ]) * (o - x[ 1 ]) |
20 | plot.plot(prange,f(prange)) |
output:

A general program can be made by using a recursive function.
02 | import matplotlib.pyplot as plot |
04 | x = [ - 5 , - 4 , - 3 , - 2 , - 1 , 0 , 1 , 2 , 3 , 4 , 5 ] |
05 | y = [ 0 , 0 , 0.1 , 0.3 , 0.7 , 1 , 0.7 , 0.3 , 0.1 , 0 , 0 ] |
07 | plot.plot(x,y,marker = 'o' , color = 'r' , ls = '',markersize = 10 ) |
11 | return (grad(a - 1 ,b) - grad(a - 1 ,a - 1 )) / (x[b] - x[a - 1 ]) |
16 | for p in range ( len (x)): |
22 | prange = nm.linspace(x[ 0 ],x[ - 1 ], 200 ) |
23 | plot.plot(prange,f(prange)) |
output:

Note that above, we have something strange happening near the end points. This is known as Runge's phenomenon and mainly occurs at the endges of an interval when using polynomial interpolation with polynomials of high degree over a set of points that are equally spaced. Because of Runge's phenomenon, it is often the case that going to higher degrees does not always improve accuracy. To combat this, we can use a method such as cubic splines, which we discuss here.
3.2.3: Cubic Splines
A spline is a function defined piecewise by polynomials. Instead of having a single polynomial that goes through each data point, as we have done so far, we instead define several polynomials between each of the points. While defining several polynomials might take more work, it is usually preferred since it gives similar results while avoiding Runge's phenomenon.
A linear spline is created by simply drawing lines between each of the data points. Using our data points from earlier, we can create a reasonable approximation of our underlying function.

While simplistic, this approximation is likely better than that generated when using the same data points and Newton's method. This is due to the absence of Runge's phenomenon. To make the picture look even better, we can replace the lines by cubic functions. When doing so, we will end up with a number of cubic functions equal to one less than the number of data points - that is, one for every interval between the data points. We call this function s(x), which can be defined as follows.
\boldsymbol{s(x) = \left\{ \begin{array}{lr} s_0(x)=a_0x^3+b_0x^2+c_0x+d_0 & \text{if } x_0\leq x\leq x_1\\ s_1(x)=a_1x^3+b_1x^2+c_1x+d_1 & \text{if } x_1\leq x\leq x_2\\ \vdots\\ s_{n-1}(x)=a_{n-1}x^3+b_{n-1}x^2+c_{n-1}x+d_{n-1} & \text{if } x_{n-1}\leq x\leq x_n\\ \end{array} \right}