4.3: Polynomials

Bob Dumas and John E. McCarthy
University of Washington and Washington University in St. Louis

We now use the machinery developed in Section \(4.2\) to undertake a modest mathematical program. As we indicated in the first chapter of this book, most of you, until now, have used mathematical results to solve problems in computation. Here we are interested in proving a result with which you may be familiar.

This result concerns polynomials with real coefficients (i.e., coefficients that are real numbers). You have spent a good deal of your mathematical life investigating polynomials, and undoubtedly can make many interesting and truthful claims about them. But how confident are you that these claims are true? It is possible that your belief in these claims is, by and large, mere confidence in the claims and beliefs of experts in the field. In practice, one can do worse than to acquiesce to the assertions of specialists, and practical limitations generally compel us to accept many claims on faith. Of course, this practice carries risks. For hundreds of years, the assertions of Aristotle were broadly accepted, often in spite of empirical evidence to the contrary. Naturally, we continue to accept many claims on faith. In the case of modern science, we generally do not have firsthand access to the primary evidence on which modern scientific theories are based. Mathematics is different from every other field of intellectual endeavor because you have the opportunity to verify virtually every mathematical claim you encounter. You are now at the point in your mathematical career at which you can directly confirm mathematical results.

The theorem we wish to prove is that the number of real roots of a real polynomial of degree at least one is at most the degree of the polynomial. You may be familiar with this claim, but uncertain of why it holds. This result is interesting, in part, because it guarantees that the graph of a non-constant polynomial will cross any horizontal line only finitely many times: the graph of \(p\) meets the line \(y=k\) exactly at the roots of the polynomial \(p(x)-k\), which has the same degree as \(p\). Put another way, level sets of non-constant polynomials cannot have more elements than the degree of the polynomial.

Notation. \(\mathbb{R}[x]\) is the set of polynomials with real coefficients in the variable \(x\).

THEOREM 4.10. Let \(N \in \mathbb{N}\) and \(p \in \mathbb{R}[x]\) have degree \(N \geq 1\). Then \(p\) has at most \(N\) real roots.
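The statement of Theorem 4.10 can be illustrated, though of course not proved, by a small computation. The following is a minimal sketch in Python; the helper name `evaluate` and the sample cubic are assumptions chosen for illustration and do not come from the text. A polynomial is stored as its list of coefficients \(a_{0}, \ldots, a_{N}\), and a finite window of integers is searched for roots.

```python
def evaluate(coeffs, x):
    """Evaluate sum(coeffs[n] * x**n) using Horner's rule."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result


# Sample polynomial (an assumption for illustration):
# p(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3), stored as a_0, ..., a_3.
p = [-6, 11, -6, 1]
degree = len(p) - 1

# Search a finite window of integers for roots. This only illustrates the
# bound in the theorem; it does not prove it.
integer_roots = [x for x in range(-100, 101) if evaluate(p, x) == 0]

print(integer_roots)                 # [1, 2, 3]
print(len(integer_roots) <= degree)  # True
```

The search finds three roots for a polynomial of degree three, so the bound in the theorem is attained by this example.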

Discussion. This result is sufficiently difficult that we shall have to prove three preliminary results. These lemmas \({ }^{2}\) are proved within the argument for the theorem. Throughout the argument we shall be investigating a general polynomial, \(p\), of degree \(N\).

\({ }^{2}\) A lemma is an auxiliary result that one uses in the proof of a theorem - sort of like a subroutine. In German, a theorem is called "Satz" and a lemma is called "Hilfssatz", a "helper theorem".

Proof. We prove first that the distributive property generalizes to an arbitrary number of summands.

LEMMA 4.11. Let \(N \in \mathbb{N}^{+}\) and, for \(0 \leq n \leq N\), \(a_{n} \in \mathbb{R}\). If \(c \in \mathbb{R}\), then \[\sum_{n=0}^{N} c a_{n}=c\left(\sum_{n=0}^{N} a_{n}\right) .\]

Discussion. This result generalizes the distributive property to more than two summands. We are assuming the distributive property of real numbers: for \(a, b, c \in \mathbb{R}\), \[c \cdot(a+b)=c a+c b .\] We prove the lemma by induction. It is surprising that a claim that seems so obvious uses the powerful machinery of induction. But remember that we are proving this for all finite sums of arbitrarily many summands. Of course, you may feel that the lemma is altogether obvious. If so, you should try to produce your own proof, or read this one for practice in mathematical induction in a context where the mathematical content is easy.

We shall argue by induction on the number of terms in the sum. The base case is for sums with two summands - this is just the distributive property. In the induction step we prove the conditional result that if the lemma holds for all sums with \(N\) terms, then it holds for all sums with \(N+1\) terms. At each step of the argument (base and induction steps) we are arguing for infinitely many concrete claims by arguing for a single abstract claim.
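Before the formal proof, the identity in Lemma 4.11 can be checked on a concrete sum. The sketch below is only a sanity check under stated assumptions: the function names `scaled_sum` and `sum_then_scale` and the sample values are hypothetical, and exact rational arithmetic (Python's `fractions.Fraction`) stands in for the real numbers so that rounding does not interfere.

```python
from fractions import Fraction


def scaled_sum(c, terms):
    """Left-hand side of Lemma 4.11: the sum of c * a_n over all terms."""
    return sum(c * a for a in terms)


def sum_then_scale(c, terms):
    """Right-hand side of Lemma 4.11: c times the sum of the a_n."""
    return c * sum(terms)


# Sample data (assumptions for illustration): c = 3/7 and a_n = n / (n + 1)
# for n = 0, ..., 5, so N = 5 in the notation of the lemma.
c = Fraction(3, 7)
terms = [Fraction(n, n + 1) for n in range(6)]

lhs = scaled_sum(c, terms)
rhs = sum_then_scale(c, terms)
print(lhs == rhs)  # True
```

A check like this confirms one instance of the claim; the induction argument is what establishes it for every \(N\) at once.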

Proof. We argue by induction on \(N\).

Base case: \(N=1\).

Let \(c, a_{0}, a_{1} \in \mathbb{R}\). By the distributive property, \[\begin{aligned} \sum_{n=0}^{1} c a_{n} &=c a_{0}+c a_{1} \\ &=c\left(a_{0}+a_{1}\right) \\ &=c\left(\sum_{n=0}^{1} a_{n}\right) . \end{aligned}\]

Induction step:

Let \(c \in \mathbb{R}\) and \(a_{n} \in \mathbb{R}\), for \(0 \leq n \leq N+1\). We assume \[\sum_{n=0}^{N} c a_{n}=c\left(\sum_{n=0}^{N} a_{n}\right) .\] We have \[\begin{aligned} \sum_{n=0}^{N+1} c a_{n} &=\left(\sum_{n=0}^{N} c a_{n}\right)+c a_{N+1} \\ &\stackrel{\mathrm{IH}}{=} c\left(\sum_{n=0}^{N} a_{n}\right)+c a_{N+1} , \end{aligned}\] where IH marks the step that uses the induction hypothesis. By the distributive law (for two summands) \[\begin{aligned} c\left(\sum_{n=0}^{N} a_{n}\right)+c a_{N+1} &=c\left(\sum_{n=0}^{N} a_{n}+a_{N+1}\right) \\ &=c\left(\sum_{n=0}^{N+1} a_{n}\right) . \end{aligned}\] Therefore, \[\sum_{n=0}^{N+1} c a_{n}=c\left(\sum_{n=0}^{N+1} a_{n}\right) .\] By the induction principle the result holds for all \(N \in \mathbb{N}^{+}\).

LEMMA 4.12. If \(x, y \in \mathbb{R}\) and \(n \in \mathbb{N}^{+}\), then \[\begin{aligned} x^{n}-y^{n} &=(x-y)\left(x^{n-1}+x^{n-2} y+\cdots+x y^{n-2}+y^{n-1}\right) \\ &=(x-y)\left(\sum_{\substack{i, j \in \mathbb{N} \\ i+j=n-1}} x^{i} y^{j}\right) . \end{aligned}\]

Discussion. The notation in the last line of the lemma means that the sum is taken over all natural numbers \(i\) and \(j\) that have the property that \(i+j=n-1\).
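The factorization in Lemma 4.12 can be spot-checked on integers. The following is a minimal sketch; the function name `difference_of_powers` and the ranges tested are assumptions made for illustration. The sum over \(i+j=n-1\) is written as a single loop over \(i\), with \(j=n-1-i\).

```python
def difference_of_powers(x, y, n):
    """Return (x - y) times the sum of x**i * y**j over i + j = n - 1, i, j >= 0."""
    cofactor = sum(x**i * y**(n - 1 - i) for i in range(n))
    return (x - y) * cofactor


# Compare against x**n - y**n on a small grid of integers (exact arithmetic).
checks = [
    difference_of_powers(x, y, n) == x**n - y**n
    for x in range(-5, 6)
    for y in range(-5, 6)
    for n in range(1, 8)
]
print(all(checks))  # True
```

For \(n=1\) the sum has the single term \(x^{0} y^{0}=1\), so the identity reduces to \(x-y=x-y\).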

Proof. By Lemma 4.11, \[\begin{aligned} (x-y)\left(\sum_{\substack{i, j \in \mathbb{N} \\ i+j=n-1}} x^{i} y^{j}\right) &=x\left(\sum_{\substack{i, j \in \mathbb{N} \\ i+j=n-1}} x^{i} y^{j}\right)-y\left(\sum_{\substack{i, j \in \mathbb{N} \\ i+j=n-1}} x^{i} y^{j}\right) \\ &=\sum_{\substack{i, j \in \mathbb{N} \\ i+j=n-1}} x^{i+1} y^{j}-\sum_{\substack{i, j \in \mathbb{N} \\ i+j=n-1}} x^{i} y^{j+1} \\ &=x^{n}-y^{n} . \end{aligned}\] In the last step, every term \(x^{k} y^{n-k}\) with \(1 \leq k \leq n-1\) appears once in each sum and cancels, leaving only \(x^{n}\) from the first sum and \(y^{n}\) from the second.

The next lemma associates roots of polynomials and linear factors.

LEMMA 4.13. Let \(p\) be a polynomial of degree \(N \geq 1\). A real number, \(c\), is a root of \(p\) iff \[p(x)=(x-c) q(x),\] where \(q(x)\) is a polynomial of degree \(N-1\).
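As a concrete instance of Lemma 4.13, the polynomial \(q\) can be computed from the explicit formula \(q(x)=\sum_{n=1}^{N} a_{n} q_{n}(x)\), with \(q_{n}(x)=\sum_{i+j=n-1} x^{i} c^{j}\), that appears in the proof of the lemma. The sketch below is an illustration under stated assumptions: the function names `cofactor` and `multiply_by_linear` and the sample cubic are hypothetical, and polynomials are stored as coefficient lists \(a_{0}, \ldots, a_{N}\).

```python
def cofactor(coeffs, c):
    """Coefficients of q = sum_{n=1}^{N} a_n q_n, where
    q_n(x) is the sum of x**i * c**j over i + j = n - 1."""
    N = len(coeffs) - 1
    q = [0] * N
    for n in range(1, N + 1):
        for i in range(n):  # j = n - 1 - i
            q[i] += coeffs[n] * c ** (n - 1 - i)
    return q


def multiply_by_linear(q, c):
    """Coefficients of (x - c) * q(x)."""
    product = [0] * (len(q) + 1)
    for i, b in enumerate(q):
        product[i + 1] += b   # contribution of x * (b x^i)
        product[i] -= c * b   # contribution of -c * (b x^i)
    return product


# Sample polynomial (an assumption for illustration):
# p(x) = x^3 - 6x^2 + 11x - 6, which has the root c = 2.
p = [-6, 11, -6, 1]
c = 2
q = cofactor(p, c)

print(q)                              # [3, -4, 1], i.e. q(x) = x^2 - 4x + 3
print(multiply_by_linear(q, c) == p)  # True, because c is a root of p
```

If \(c\) is not a root, the same construction gives \((x-c) q(x)=p(x)-p(c)\) instead, so the final comparison fails; this matches the biconditional in the lemma.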

Discussion. This lemma is a biconditional statement. That is, the lemma is propositionally equivalent to the conjunction of two conditional statements. We prove the conditional statements independently. One of the conditional statements is obvious (can you determine which?). The more difficult conditional statement will use Lemma 4.12. When proving a biconditional, \(P \Longleftrightarrow Q\), by proving the conditional statements \(P \Rightarrow Q\) and \(Q \Rightarrow P\), we often use \((\Rightarrow)\) and \((\Leftarrow)\) to identify the conditional statement under consideration.

Proof. Let \(p\) be a polynomial of degree \(N\). Then there are \(a_{0}, a_{1}, \ldots, a_{N} \in \mathbb{R}\), \(a_{N} \neq 0\), such that \[p(x)=\sum_{n=0}^{N} a_{n} x^{n} .\]

\((\Leftarrow)\) Assume that there is \(c \in \mathbb{R}\) and a polynomial \(q\) of degree \(N-1\) such that \[p(x)=(x-c) q(x) .\] Then \[p(c)=(c-c) q(c)=0 .\] So \(c\) is a root of \(p\).

\((\Rightarrow)\) Let \(c \in \mathbb{R}\) be a root of \(p\). Since \(p(c)=0\), \[\begin{aligned} p(x) &=p(x)-p(c) \\ &=a_{0}-a_{0}+\sum_{n=1}^{N} a_{n}\left(x^{n}-c^{n}\right) \\ &=\sum_{n=1}^{N} a_{n}\left(x^{n}-c^{n}\right) . \end{aligned}\] By Lemma 4.12, for \(n \geq 1\), \[x^{n}-c^{n}=(x-c) q_{n}(x)\] where \[q_{n}(x)=x^{n-1}+c x^{n-2}+\cdots+c^{n-2} x+c^{n-1}=\sum_{\substack{i, j \in \mathbb{N} \\ i+j=n-1}} x^{i} c^{j} .\] By Lemma 4.11, \[p(x)=\sum_{n=1}^{N} a_{n}\left(x^{n}-c^{n}\right)=(x-c) \sum_{n=1}^{N} a_{n} q_{n}(x) .\] Let \[q(x)=\sum_{n=1}^{N} a_{n} q_{n}(x) .\] For all \(n\) between 1 and \(N\), \(q_{n}(x)\) has degree \(n-1\). So the degree of \(q(x)\) is at most \(N-1\). Moreover, only \(q_{N}(x)\) contributes a term in \(x^{N-1}\), so the coefficient of \(x^{N-1}\) in \(q(x)\) is \(a_{N}\), and \(a_{N} \neq 0\) by assumption. So the degree of \(q(x)\) is \(N-1\), and \[p(x)=(x-c) q(x) .\]

We now complete the proof of the theorem. Let \(p\) be a polynomial of degree \(N\). We argue by induction on the degree of \(p\).

Base case: \(N=1\).

If \(p\) is a polynomial of degree 1, then it is of the form \[p(x)=a_{1} x+a_{0},\] with \(a_{1} \neq 0\), and the only root is \(-a_{0} / a_{1}\).

Induction step:

Assume that the theorem holds for \(N \in \mathbb{N}^{+}\). Let \(p\) have degree \(N+1\). If \(p\) has no roots, the theorem holds for \(p\). So assume that \(p\) has a real root, \(c \in \mathbb{R}\). By Lemma 4.13, \[p(x)=(x-c) q(x), \tag{4.14}\] where \(q\) is of degree \(N\). By the induction hypothesis, \(q\) has at most \(N\) real roots. If \(x\) is a root of \(p\), then by (4.14) the product \((x-c) q(x)\) is zero, so either \(x\) is a root of \(q\) or \(x=c\). Therefore \(p\) has at most \(N+1\) roots, proving the induction step.
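The induction step can be mirrored computationally: each time a root \(c\) is found, the factor \((x-c)\) is divided out and the degree drops by one, so the process can succeed at most \(N\) times. The following is a minimal sketch of that idea, not the argument of the text itself. It uses synthetic division (a standard technique swapped in here for the formula in Lemma 4.13), and the names `divide_by_linear` and `roots_by_deflation`, the sample cubic, and the candidate range are all assumptions for illustration.

```python
def divide_by_linear(coeffs, c):
    """Synthetic division of p by (x - c).

    Returns (quotient coefficients, remainder); the remainder equals p(c).
    """
    quotient = [0] * (len(coeffs) - 1)
    carry = 0
    for n in range(len(coeffs) - 1, 0, -1):
        carry = coeffs[n] + c * carry
        quotient[n - 1] = carry
    remainder = coeffs[0] + c * carry
    return quotient, remainder


def roots_by_deflation(coeffs, candidates):
    """Strip off one linear factor for each distinct root among the candidates."""
    found = []
    current = list(coeffs)
    for c in candidates:
        if len(current) == 1:  # degree 0: a non-zero constant has no roots
            break
        quotient, remainder = divide_by_linear(current, c)
        if remainder == 0:     # c is a root, so current(x) = (x - c) * quotient(x)
            found.append(c)
            current = quotient
    return found, current


# Sample polynomial (an assumption for illustration): (x - 1)(x - 2)(x - 3).
p = [-6, 11, -6, 1]
found, leftover = roots_by_deflation(p, range(-10, 11))

print(found)                     # [1, 2, 3]
print(leftover)                  # [1], a constant: no further roots are possible
print(len(found) <= len(p) - 1)  # True
```

Each candidate is tested once, so the sketch counts distinct roots, which is what Theorem 4.10 bounds.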

As a function, a polynomial in a particular variable is the same as a polynomial with the same coefficients in a different variable. Let \(p \in \mathbb{R}[x]\) be \[p(x)=\sum_{n=0}^{N} a_{n} x^{n},\] and \(q \in \mathbb{R}[y]\) be \[q(y)=\sum_{n=0}^{N} a_{n} y^{n} .\] Then as real functions \(p\) and \(q\) are the same function. That is, \[\operatorname{graph}(p)=\operatorname{graph}(q) .\] As algebraic objects, however, one might occasionally wish to distinguish between polynomials in distinct variables.

We end this section by proving that polynomials are equal as functions if and only if they have the same coefficients.

COROLLARY 4.15. Let \(p, q \in \mathbb{R}[x]\). The coefficients of \(p\) and \(q\) are equal iff \[(\forall x \in \mathbb{R}) \quad p(x)=q(x) .\]

Proof. \((\Rightarrow)\) If the coefficients of \(p\) and \(q\) are all equal, then, letting \(a_{0}, \ldots, a_{N}\) denote the common coefficients, we have \[(\forall x \in \mathbb{R}) \quad p(x)=\sum_{n=0}^{N} a_{n} x^{n}=q(x) .\]

\((\Leftarrow)\) Suppose \((\forall x \in \mathbb{R})\ p(x)=q(x)\). Then every real number is a root of \(p-q\), so \(p-q\) has infinitely many roots. If \(p\) and \(q\) disagreed on any coefficient, then \(p-q\) would be a non-zero polynomial and would have a degree. If that degree were 0, then \(p-q\) would be a non-zero constant with no roots; if it were at least 1, then by Theorem 4.10 \(p-q\) would have only finitely many roots. Either case contradicts the fact that \(p-q\) has infinitely many roots. Therefore, \(p\) and \(q\) must agree on all coefficients.
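The mechanism behind Corollary 4.15 can be seen in a small example: two polynomials with different coefficients can agree only at the roots of their difference, and there are finitely many of those. The sketch below is an illustration only; the helper `evaluate`, the two sample polynomials, and the search window are assumptions that do not come from the text.

```python
def evaluate(coeffs, x):
    """Evaluate sum(coeffs[n] * x**n) using Horner's rule."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result


# Sample polynomials (assumptions for illustration), stored as a_0, a_1, a_2.
p = [0, 0, 1]  # p(x) = x^2
q = [0, 1, 0]  # q(x) = x, padded with a zero coefficient

# Coefficients of p - q = x^2 - x, a non-zero polynomial of degree 2.
difference = [a - b for a, b in zip(p, q)]

# Integers in a finite window where p and q take the same value.
agreement_points = [x for x in range(-50, 51) if evaluate(p, x) == evaluate(q, x)]

print(difference)        # [0, -1, 1]
print(agreement_points)  # [0, 1]
```

The two functions agree at exactly two points, which is within the bound given by the degree of \(p-q\), so they cannot be equal as functions.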