Polynomials - Mat 6570 notes

Polynomial interpolation and Approximation
Dr. John Travis

Weierstrass Approximation Theorem: If f ∈ C[a,b], then for every ε > 0 there exists a polynomial P(x) such that |P(x) - f(x)| < ε for all x in [a,b].

Pf: WLOG, assume [a,b] = [0,1].
Indeed, if not, we can apply the theorem to the function g(x) = f(a+x(b-a)), which is continuous on [0,1] with g(0)=f(a) and g(1)=f(b).
Further, WLOG assume that f(0) = f(1) = 0.
Indeed, if not, we can apply the theorem to the function h(x) = f(x) - f(a) - [f(b) - f(a)](x-a)/(b-a), which satisfies h(a) = h(b) = 0 and differs from f only by a polynomial.
In either case, we can then convert the resulting polynomial P(x) back to the original setting.
Define Q_n(x) = c_n(1 - x²)ⁿ, for n = 1, 2, 3, ..., where c_n is chosen so that

∫_{-1}^{1} Q_n(x) dx = 1.

Lemma: (1 - x²)ⁿ ≥ 1 - nx², for 0 ≤ x ≤ 1.
Pf of Lemma: Consider h(x) = (1 - x²)ⁿ - (1 - nx²).
Notice h(0) = 0 and h'(x) = 2nx(1 - (1-x²)ⁿ⁻¹) ≥ 0 for 0 < x < 1.
        => h(x) is nondecreasing on [0,1], with minimum h(0) = 0
        => h(x) is non-negative for 0 ≤ x ≤ 1.
Note,

∫_{-1}^{1} (1-x²)ⁿ dx = 2 ∫_{0}^{1} (1-x²)ⁿ dx
                      ≥ 2 ∫_{0}^{1/√n} (1-x²)ⁿ dx
                      ≥ 2 ∫_{0}^{1/√n} (1 - nx²) dx   (by the Lemma)
                      = 4/(3√n)
                      > 1/√n.
Therefore, c_n < √n.
Now, for δ ≤ |x| ≤ 1 with δ > 0, Q_n(x) = c_n(1 - x²)ⁿ ≤ √n (1 - δ²)ⁿ, and so Q_n(x) -> 0 uniformly for δ ≤ |x| ≤ 1.
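Extend f to the whole real line by setting f(x) = 0 outside [0,1]; since f(0) = f(1) = 0, the extended f is still continuous.
For 0 ≤ x ≤ 1, consider

P_n(x) = ∫_{-1}^{1} f(x+t) Q_n(t) dt = ∫_{-x}^{1-x} f(x+t) Q_n(t) dt = ∫_{0}^{1} f(t) Q_n(t-x) dt,

where the second equality holds because f(x+t) = 0 unless -x ≤ t ≤ 1-x, and the third uses a simple change of variable.
Notice that the last integral is a polynomial in x: the variable t is integrated out, and x appears only in Q_n(t-x), which is a polynomial in x whose coefficients depend on t.
Thus {P_n} is a sequence of polynomials.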
Since f (extended by zero outside [0,1]) is uniformly continuous, given ε > 0 we can choose δ > 0 such that |t| < δ => |f(x+t) - f(x)| < ε/2 for every x. Let M = sup |f(x)|.
Then, for 0 ≤ x ≤ 1, noting that we have chosen Q_n so that it integrates to 1 and that Q_n ≥ 0 on [-1,1],

| P_n(x) - f(x) | = | ∫_{-1}^{1} f(x+t) Q_n(t) dt - f(x) ∫_{-1}^{1} Q_n(t) dt |
                  ≤ ∫_{-1}^{1} |f(x+t) - f(x)| Q_n(t) dt, by pulling the absolute value inside
                  ≤ 2M ∫_{-1}^{-δ} Q_n(t) dt + (ε/2) ∫_{-δ}^{δ} Q_n(t) dt + 2M ∫_{δ}^{1} Q_n(t) dt
                  ≤ 4M √n (1 - δ²)ⁿ + ε/2, noting that Q_n(t) is even, so the first and last integrals are equal
                  < ε, for all n large enough, say n ≥ N.
This bound does not depend on x, so P_N(x) is a polynomial of finite degree within ε of f(x) on all of [0,1], as desired.
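
As a rough numerical illustration of the construction above (not part of the proof), the following Python sketch evaluates P_n(x) = ∫_{0}^{1} f(t) Q_n(t-x) dt with a midpoint rule. The test function f(t) = sin(πt), the helper names, and the number of quadrature panels are assumptions chosen for illustration. The constant c_n is computed from the closed form ∫_{-1}^{1} (1-x²)ⁿ dx = 2^(2n+1) (n!)² / (2n+1)!.

```python
import math

def c_n(n):
    """Normalizing constant: 1 / (integral of (1 - x^2)^n over [-1, 1])."""
    # Closed form of the integral: 2^(2n+1) * (n!)^2 / (2n+1)!
    return math.factorial(2 * n + 1) / (2 ** (2 * n + 1) * math.factorial(n) ** 2)

def approx_poly(f, n, x, panels=2000):
    """Approximate P_n(x) = int_0^1 f(t) Q_n(t - x) dt by the midpoint rule."""
    c = c_n(n)
    h = 1.0 / panels
    total = 0.0
    for i in range(panels):
        t = (i + 0.5) * h
        total += f(t) * c * (1.0 - (t - x) ** 2) ** n
    return total * h

def max_error(f, n, samples=21):
    """Largest |P_n(x) - f(x)| over equally spaced sample points in [0, 1]."""
    xs = [i / (samples - 1) for i in range(samples)]
    return max(abs(approx_poly(f, n, x) - f(x)) for x in xs)

def f(t):
    # Assumed test function: continuous on [0, 1] with f(0) = f(1) = 0
    return math.sin(math.pi * t)

for n in (5, 20, 80, 200):
    # Columns: n, whether c_n < sqrt(n), sampled maximum error
    print(n, c_n(n) < math.sqrt(n), max_error(f, n))
```

The sampled error shrinks as n grows, but slowly, which is consistent with the remark that the polynomial may need a very high degree.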

Remark: This is an existence theorem. In practice, finding the polynomial might be very difficult and of very high degree.

Taylor Polynomials: Given f ∈ Cⁿ⁺¹[a,b] and some value c with a<c<b, then for each x in [a,b],

f(x) = f(c) + f'(c)(x-c) + f''(c)(x-c)²/2! + f'''(c)(x-c)³/3! + ... + f⁽ⁿ⁾(c)(x-c)ⁿ/n! + R_n(x), where R_n(x) = f⁽ⁿ⁺¹⁾(z)(x-c)ⁿ⁺¹/(n+1)! and z is some point between c and x.

Pf: Write the polynomial part above as P_n(x). So we want to prove f(x) = P_n(x) + R_n(x).
Certainly for x=c, the statement holds.
Assume x is not equal to c and define R_n(x) = f(x) - P_n(x).
Define g(t) = f(x) - f(t) - f'(t)(x-t) - f''(t)(x-t)²/2! - f'''(t)(x-t)³/3! - ... - f⁽ⁿ⁾(t)(x-t)ⁿ/n! - R_n(x)(x-t)ⁿ⁺¹/(x-c)ⁿ⁺¹.
Notice that the derivatives of the sum telescope, so g'(t) = - f⁽ⁿ⁺¹⁾(t)(x-t)ⁿ/n! + (n+1)R_n(x)(x-t)ⁿ/(x-c)ⁿ⁺¹, for all t between c and x.
Moreover, for the fixed x, g(c) = f(x) - P_n(x) - R_n(x) = 0 and g(x) = 0.
Thus, by the Mean Value Theorem (actually, the special case known as Rolle's Theorem), there is a number z strictly between c and x such that g'(z) = 0.
Substituting this into the expression above yields 0 = - f⁽ⁿ⁺¹⁾(z)(x-z)ⁿ/n! + (n+1)R_n(x)(x-z)ⁿ/(x-c)ⁿ⁺¹, and since x-z is nonzero, solving yields R_n(x) = f⁽ⁿ⁺¹⁾(z)(x-c)ⁿ⁺¹/(n+1)!.
Finally, since R_n(x) = f(x) - P_n(x) by definition, f(x) = P_n(x) + R_n(x) with the remainder in the stated form.

Taylor polynomials are created by using Taylor's theorem and throwing away the remainder term.

One must know the actual function value as well as its derivative values at x=c before this can be used. Often this requires too much, and so the method may not be useful.
The Taylor polynomial is generally only accurate in a small interval about x=c.
This is not such a great method to use for real-life examples where the derivative values would most likely not be given or even easily approximated.
Generally, Taylor polynomials are only used for theoretical purposes.

Bad Ex: Approximate f(x)=1/x about c=1 using higher and higher degree Taylor polynomials. The Taylor series is 1 - (x-1) + (x-1)² - ..., which converges only for 0 < x < 2; for x > 2 the approximation gets worse and worse as the degree of the polynomial gets larger.
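
To see this numerically, here is a small Python sketch. It relies on the fact that f⁽ᵏ⁾(1)/k! = (-1)ᵏ for f(x) = 1/x, so the degree-n Taylor polynomial about c=1 is the sum of (-1)ᵏ(x-1)ᵏ for k = 0, ..., n. The function name and the sample points x = 1.5 and x = 3 are illustrative choices, not part of the notes.

```python
def taylor_reciprocal(x, n):
    """Degree-n Taylor polynomial of 1/x about c = 1, evaluated at x."""
    # f^(k)(1) / k! = (-1)^k for f(x) = 1/x
    return sum((-1) ** k * (x - 1) ** k for k in range(n + 1))

for n in (2, 4, 8, 16):
    err_near = abs(taylor_reciprocal(1.5, n) - 1 / 1.5)   # close to c = 1
    err_far = abs(taylor_reciprocal(3.0, n) - 1 / 3.0)    # far from c = 1
    print(n, err_near, err_far)
```

Close to c the error shrinks with the degree, while at x = 3 it grows without bound.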

Lagrange Polynomials: Given n+1 values of the function (x₀,f(x₀)), (x₁,f(x₁)), ..., (x_n,f(x_n)) with distinct x_k, find the polynomial of degree at most n which passes through all these points.

Derivation:

Utilize a general nth degree polynomial and plug in the data values to get a system of equations in the coefficients.
Use Lagrange cardinal formulas. Writing these using a Horner-type arrangement gives a faster implementation.
Newton's Divided Differences...generally used when actually computing the polynomial.

Ex: Approximate f(x)=1/x using x₀=1, x₁=2, etc. for a few values. This gives a much better approximation than the Taylor polynomials did.
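
A minimal Python sketch of the cardinal-formula approach applied to this example follows. The function name `lagrange_eval`, the choice of four nodes 1, 2, 3, 4, and the evaluation point 2.5 are assumptions made for illustration; the loop is the direct (not Horner-type) evaluation.

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        # Cardinal polynomial: equals 1 at xs[i] and 0 at every other node
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total

xs = [1.0, 2.0, 3.0, 4.0]
ys = [1 / x for x in xs]
print(lagrange_eval(xs, ys, 2.5), 1 / 2.5)
```

With these four nodes the interpolant differs from 1/x by roughly 0.01 at x = 2.5.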

Lagrange Error Bound Theorem: Let x₀, x₁, ..., x_n be distinct numbers in the interval [a,b] and let f ∈ Cⁿ⁺¹[a,b]. If P(x) is the Lagrange Interpolating Polynomial, then for each x in [a,b], there exists a number z in (a,b) such that

f(x) = P(x) + f⁽ⁿ⁺¹⁾(z) (x-x₀) (x-x₁) ... (x-x_n)/(n+1)!.

Pf: If x = x_k for one of the given x values, then f(x) = P(x) and the remainder is zero, so any z works.
So, suppose x is a fixed value different from all of the given data values {x_k}. Consider g(t) = f(t) - P(t) - [f(x) - P(x)] (t-x₀)(t-x₁)...(t-x_n) / [(x-x₀)(x-x₁)...(x-x_n)]. Certainly g has n+1 continuous derivatives.
Now, for t = x_k, g(x_k) = 0, since f(x_k) = P(x_k) and the product (t-x₀)(t-x₁)...(t-x_n) vanishes there.
Moreover, g(x) = 0.
Thus, g(t) has n+2 distinct zeros in the interval [a,b].
So, by the generalized Rolle's Theorem, there exists z in (a,b) such that
g⁽ⁿ⁺¹⁾(z) = 0, or
0 = f⁽ⁿ⁺¹⁾(z) - P⁽ⁿ⁺¹⁾(z) - [f(x) - P(x)] D⁽ⁿ⁺¹⁾[ (t-x₀) (t-x₁) ... (t-x_n) ] / [(x-x₀) (x-x₁) ... (x-x_n)].
Note, P(t) is a polynomial of degree n so its (n+1)st derivative will be zero.
Further, the term (t-x₀) (t-x₁) ... (t-x_n) is a polynomial of degree n+1 so its (n+1)st derivative will be a constant. Indeed, the leading term in the product is tⁿ⁺¹ whose (n+1)st derivative is (n+1)!
Therefore, the last equation becomes
0 = f⁽ⁿ⁺¹⁾(z) - [f(x) - P(x)] (n+1)! / [(x-x₀) (x-x₁) ... (x-x_n)].
Solving for f(x) yields the desired result.
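
The theorem can be sanity-checked numerically. The sketch below assumes f(x) = sin x, for which every derivative is bounded by 1 in absolute value, so the theorem gives |f(x) - P(x)| ≤ |(x-x₀)(x-x₁)...(x-x_n)|/(n+1)!. The nodes 0, 0.5, 1, 1.5 and the helper names are illustrative assumptions.

```python
import math

def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total

def node_product(xs, x):
    """The product (x - x_0)(x - x_1)...(x - x_n)."""
    prod = 1.0
    for xk in xs:
        prod *= x - xk
    return prod

xs = [0.0, 0.5, 1.0, 1.5]
ys = [math.sin(v) for v in xs]
n = len(xs) - 1

for x in (0.25, 0.75, 1.25):
    actual = abs(math.sin(x) - lagrange_eval(xs, ys, x))
    bound = abs(node_product(xs, x)) / math.factorial(n + 1)  # uses |f^(4)| <= 1
    print(x, actual, bound)
```

In each row the actual error should not exceed the bound, since the bound replaces f⁽ⁿ⁺¹⁾(z) by its largest possible size.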

Lagrange Uniqueness Theorem: Any two polynomials of degree at most n which agree at n+1 distinct points are equal.

Pf: Suppose f(x), g(x) are of degree at most n and agree at n+1 distinct points.
Then r(x)=f(x)-g(x) is of degree at most n with n+1 distinct roots, which forces r(x)=0 identically.

Ex: Construct a parabola passing through (1,4), (-1,0) and (2,9). Then p(x)= ax² + bx + c and by plugging in the points,

4 = a + b + c
0 = a - b + c
9 = 4a + 2b + c

which is a 3x3 system of equations to solve. If the polynomial p(x) were of higher degree, the system would be correspondingly larger and harder to solve.

But if we write p(x)=c + b(x-1) + a(x-1)(x+1), then plugging in the points yields (solving as we go along)

4 = c
0 = c - 2b, or b=2
9 = c + b + 3a, or a=1,

which is very easy to solve successively. This second form is called Newton's form.
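
Both routes in this example can be checked with a few lines of Python. The sketch below solves the 3x3 system by Gauss-Jordan elimination in exact fractions and compares the result with the Newton form. The helper `solve_linear` is written here purely for illustration and is not meant as a general-purpose solver.

```python
from fractions import Fraction

def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with exact Fractions."""
    n = len(b)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Swap in a row with a nonzero pivot
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row, then clear the column in every other row
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [row[n] for row in M]

points = [(1, 4), (-1, 0), (2, 9)]
# Rows are [x^2, x, 1] for the standard form a x^2 + b x + c
a, b, c = solve_linear([[x * x, x, 1] for x, _ in points], [y for _, y in points])

def newton_form(x):
    # Coefficients 4, 2, 1 found successively in the example
    return 4 + 2 * (x - 1) + 1 * (x - 1) * (x + 1)

print(a, b, c)
```

Note that the letters a, b, c mean different things in the two forms: the standard-form coefficients come out of the linear solve, while the Newton-form coefficients are the ones found one at a time in the example. Both describe the same parabola.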

Newton's Divided Differences: Successively build up the polynomial by finding the coefficients as above.

(x₀,f₀) gives p₀(x) = f₀, passes through the one point.
(x₁,f₁) gives p₁(x) = p₀(x) + (x-x₀)b₁, passes through the first two points if b₁=(f₁-f₀)/(x₁-x₀).
(x₂,f₂) gives p₂(x) = p₁(x) + (x-x₀)(x-x₁)b₂, passes through the first three points if b₂=?.

Continuing in this manner shows the coefficients b_i are given by the divided difference table, page 63, etc.

f[x_j,...,x_k] = (f[x_{j+1},...,x_k] - f[x_j,...,x_{k-1}]) / (x_k - x_j), for j<k, with f[x_j] = f(x_j).

If x₀ < x₁ < ... < x_n, this is called Newton's Forward Divided Difference Polynomial.
If x₀ > x₁ > ... > x_n, this is called Newton's Backward Divided Difference Polynomial.

If Δx is constant, the Divided Difference Formulas can be written in a closed form involving the binomial coefficients.

Program Subroutine To Determine the Newton's Divided Differences:

c_j = f(x_j) for j = 0, 1, ..., n

for k = 1:n
    for j = n:-1:k
        % after pass k, c_j holds f[x_{j-k},...,x_j] for j >= k
        c_j := (c_j - c_{j-1}) / (x_j - x_{j-k})
    end
end

Then, the Newton's Interpolation polynomial is given by (with the c_j from above):
p(x) = c₀ + c₁(x-x₀) + c₂(x-x₀)(x-x₁) + c₃(x-x₀)(x-x₁)(x-x₂) + ... + c_n(x-x₀)(x-x₁)...(x-x_{n-1})
     = c₀ + (x-x₀){c₁ + (x-x₁){c₂ + (x-x₂){c₃ + ... + (x-x_{n-2}){c_{n-1} + c_n(x-x_{n-1})}...}}}.
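
The subroutine and the nested evaluation can be sketched in Python as follows. The function names are illustrative, and the data are the three points from the parabola example, so the coefficients should reproduce the values found there.

```python
def divided_differences(xs, fs):
    """Return Newton coefficients c[j] = f[x_0, ..., x_j], overwriting a copy of fs."""
    c = list(fs)
    n = len(xs) - 1
    for k in range(1, n + 1):
        # Sweep from the bottom so c[j-1] still holds the previous column
        for j in range(n, k - 1, -1):
            c[j] = (c[j] - c[j - 1]) / (xs[j] - xs[j - k])
    return c

def newton_eval(xs, c, x):
    """Evaluate the Newton form at x using the nested (Horner-type) arrangement."""
    result = c[-1]
    for j in range(len(c) - 2, -1, -1):
        result = c[j] + (x - xs[j]) * result
    return result

xs = [1, -1, 2]
fs = [4, 0, 9]
c = divided_differences(xs, fs)
print(c, newton_eval(xs, c, 3))
```

The nested form needs only one multiplication and two additions per coefficient, which is why it is preferred for evaluation.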

Homework: Go to some busy local store and collect some data regarding the number of customers that enter the store during a given amount of time. Carefully count the cumulative number that enter the store within 2, 5, 7 and 10 minutes (or other similar, non-uniform time steps but no more than five points). Using this data:

Create the Lagrange polynomial that interpolates this data. Use all three approaches illustrated above and simplify to demonstrate that the answer is the same polynomial regardless of which method is used to create it.
Use your model to predict the number of customers that would enter in an hour's time.
Discuss drawbacks of your model and ways that you could improve it.

Piecewise Interpolation: Instead of using one polynomial (possibly of very high degree) for the entire interval of interest, use several polynomials each defined on small intervals. Then, to obtain the y-value for a given x-value, one must first determine which formula to use. Then, evaluate that formula for the given x-value. If x₀ < x₁ < ... < x_n, and x₀ < x < x_n, then one can determine which interval x belongs to by considering the iteration:

k = 0
Repeat
    k = k+1
Until x < x_k
Return k    (then use the kth formula, which is valid on [x_{k-1}, x_k])
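
In Python the same lookup can be sketched with the standard-library `bisect` module, which swaps the linear scan above for a binary search. The function name `find_interval` and the decision to raise an error outside the data range are assumptions for illustration.

```python
from bisect import bisect_right

def find_interval(nodes, x):
    """Return k with nodes[k-1] <= x < nodes[k], for nodes[0] <= x < nodes[-1]."""
    if not nodes[0] <= x < nodes[-1]:
        raise ValueError("x lies outside the interpolation interval")
    # bisect_right counts the nodes that are <= x, which is exactly k
    return bisect_right(nodes, x)

nodes = [0.0, 1.0, 2.0, 4.0]
print(find_interval(nodes, 0.5), find_interval(nodes, 3.9))
```

For a handful of nodes the linear scan is just as good; the binary search only pays off when there are many intervals.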

Notice, if we desire to evaluate this piecewise approximation for a fixed number of equally spaced points in each sub-interval, we can evaluate the basis polynomials once and store the results in vectors. These then can be reused for each interval.

Linear Splines: Piece linear segments S_k(x) = a_k + b_k(x - x_{k-1}), for x_{k-1} ≤ x ≤ x_k, together to interpolate the y-values at the ends of each interval:

S_k(x_{k-1}) = y_{k-1}, which yields a_k = y_{k-1}.
S_k(x_k) = y_k, which yields b_k = (y_k - y_{k-1})/(x_k - x_{k-1}), i.e. the slope!
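
A minimal Python sketch of the linear spline, using the two formulas above for a_k and b_k. The function name, the sample data, and the use of `bisect_right` to pick the interval are illustrative assumptions.

```python
from bisect import bisect_right

def linear_spline(xs, ys):
    """Return a function evaluating the piecewise-linear interpolant on [xs[0], xs[-1]]."""
    n = len(xs) - 1
    # a[k], b[k] for k = 1..n; index 0 is unused so the indices match the notes
    a = [None] + [ys[k - 1] for k in range(1, n + 1)]
    b = [None] + [(ys[k] - ys[k - 1]) / (xs[k] - xs[k - 1]) for k in range(1, n + 1)]

    def S(x):
        if not xs[0] <= x <= xs[-1]:
            raise ValueError("x lies outside the interpolation interval")
        k = min(bisect_right(xs, x), n)  # the right endpoint uses the last piece
        return a[k] + b[k] * (x - xs[k - 1])

    return S

S = linear_spline([0.0, 1.0, 3.0], [1.0, 3.0, -1.0])
print(S(0.5), S(2.0))
```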

Cubic Splines: Piece cubics S_k(x) = a_k + b_k(x - x_{k-1}) + c_k(x - x_{k-1})² + d_k(x - x_{k-1})³, for x_{k-1} ≤ x ≤ x_k, together in such a way that the resulting approximation is continuous and has 2 continuous derivatives (no visible kinks). Writing Δx_k = x_k - x_{k-1}, we impose the conditions:

S_k(x_{k-1}) = y_{k-1}, which yields n equations, for k=1 to n,

a_k = y_{k-1}

S_k(x_k) = y_k, which yields n equations, for k=1 to n,

a_k + b_kΔx_k + c_kΔx_k² + d_kΔx_k³ = y_k

S_k'(x_k) = S_{k+1}'(x_k), which yields n-1 equations, for k=1 to n-1,

b_k + 2c_kΔx_k + 3d_kΔx_k² = b_{k+1}

S_k''(x_k) = S_{k+1}''(x_k), which yields n-1 equations, for k=1 to n-1,

2c_k + 6d_kΔx_k = 2c_{k+1}

The derivation for these equations...

Hence, we have 4n unknowns (the vectors of coefficients a, b, c and d) and 4n-2 equations. By adding two additional equations, we can uniquely solve for the unknown vectors.

End Conditions:

Free spline - Assume S''(a) = S''(b) = 0. This means the curve "loses" its concavity as you approach the endpoints.
Clamped spline - For given slopes m₀ and m₁, set S'(a)=m₀, S'(b)=m₁. This means the curve is forced to go in a given direction at each end.
Bessel ends - Set the end slopes automatically by using the tangent to the parabola which passes through the first three data points (and similarly the last three at the other end).
Quadratic ends - For a=x₀, set S''(a)=S''(x₁) and similarly for the other end.
Not-a-knot - Force the cubic polynomials on the first two intervals to actually be the same cubic, and likewise on the last two intervals.
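
For the free spline, one common way to solve the resulting equations is to take the second derivatives m_i = S''(x_i) as the unknowns, which reduces the conditions above to a tridiagonal system. This reduction is a standard route assumed here; it is not spelled out in these notes. The Python sketch below follows it and returns the coefficients a_k, b_k, c_k, d_k in the notation of the notes. The function name and the sample data are illustrative.

```python
def free_cubic_spline(xs, ys):
    """Coefficients (a, b, c, d) of the free spline; lists are indexed k = 1..n."""
    n = len(xs) - 1
    h = [None] + [xs[k] - xs[k - 1] for k in range(1, n + 1)]  # h[k] = Δx_k
    slope = [None] + [(ys[k] - ys[k - 1]) / h[k] for k in range(1, n + 1)]

    # Unknowns m[i] = S''(x_i); the free end conditions fix m[0] = m[n] = 0.
    m = [0.0] * (n + 1)
    # Tridiagonal system for i = 1..n-1:
    #   h[i] m[i-1] + 2 (h[i] + h[i+1]) m[i] + h[i+1] m[i+1] = 6 (slope[i+1] - slope[i])
    diag = [0.0] * (n + 1)
    rhs = [0.0] * (n + 1)
    for i in range(1, n):
        diag[i] = 2.0 * (h[i] + h[i + 1])
        rhs[i] = 6.0 * (slope[i + 1] - slope[i])
    # Forward elimination; the sub-diagonal entry in row i is h[i]
    for i in range(2, n):
        w = h[i] / diag[i - 1]
        diag[i] -= w * h[i]
        rhs[i] -= w * rhs[i - 1]
    # Back substitution; the super-diagonal entry in row i is h[i+1]
    for i in range(n - 1, 0, -1):
        m[i] = (rhs[i] - h[i + 1] * m[i + 1]) / diag[i]

    a = [None] + [ys[k - 1] for k in range(1, n + 1)]
    c = [None] + [m[k - 1] / 2.0 for k in range(1, n + 1)]
    d = [None] + [(m[k] - m[k - 1]) / (6.0 * h[k]) for k in range(1, n + 1)]
    b = [None] + [slope[k] - h[k] * (2.0 * m[k - 1] + m[k]) / 6.0 for k in range(1, n + 1)]
    return a, b, c, d

xs = [0.0, 1.0, 2.5, 4.0]
ys = [0.0, 1.0, 0.5, 2.0]
a, b, c, d = free_cubic_spline(xs, ys)
print(b[1:], c[1:], d[1:])
```

The other end conditions change only the first and last rows of the system; they are not implemented in this sketch.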
