3.1: The Fundamental Theorem of Algebra

The aim of this section is to provide a proof of the Fundamental Theorem of Algebra using concepts that should be familiar to you from your study of Calculus, and so we begin by providing an explicit formulation.

Theorem 3.1.1. Given any positive integer \(n \in \mathbb{Z}_{+}\) and any choice of complex numbers \(a_0, a_1, \ldots, a_n \in \mathbb{C}\) with \(a_n\neq 0\), the polynomial equation

\[ a_n z^n + \cdots + a_1 z + a_0 = 0 \tag{3.1.1} \]

has at least one solution \(z\in\mathbb{C}\).

This is a remarkable statement. No analogous result guarantees a real solution to Equation (3.1.1) if we restrict the coefficients \(a_0, a_1, \ldots, a_n\) to be real numbers. For example, there does not exist a real number \(x\) satisfying an equation as simple as \(\pi x^2 + e = 0\). Similarly, the consideration of polynomial equations having integer (resp. rational) coefficients quickly forces us to consider solutions that cannot possibly be integers (resp. rational numbers). Thus, the complex numbers are special in this respect.
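
As a quick check of the contrast drawn above, the same equation does have solutions once complex numbers are allowed. Solving \(\pi z^2 + e = 0\) for \(z\in\mathbb{C}\) gives

\[ z^2 = -\frac{e}{\pi} \quad\Longrightarrow\quad z = \pm\, i\sqrt{\frac{e}{\pi}}, \]

and indeed \(\pi\bigl(\pm i\sqrt{e/\pi}\bigr)^2 + e = -e + e = 0\).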

The statement of the Fundamental Theorem of Algebra can also be read as follows: Any non-constant complex polynomial function defined on the complex plane \(\mathbb{C}\) (when thought of as \(\mathbb{R}^{2}\)) has at least one root, i.e., vanishes in at least one place. It is in this form that we will provide a proof for Theorem 3.1.1.

Given how long the Fundamental Theorem of Algebra has been around, you should not be surprised that there are many proofs of it. There have even been entire books devoted solely to exploring the mathematics behind various distinct proofs. Different proofs arise from attempting to understand the statement of the theorem from the viewpoint of different branches of mathematics. This quickly leads to many non-trivial interactions with such fields of mathematics as Real and Complex Analysis, Topology, and (Modern) Abstract Algebra. The diversity of proof techniques available is yet another indication of how fundamental and deep the Fundamental Theorem of Algebra really is.

To prove the Fundamental Theorem of Algebra using Differential Calculus, we will need the Extreme Value Theorem for real-valued functions of two real variables, which we state without proof. In particular, we formulate this theorem in the restricted case of functions defined on the closed disk \(D\) of radius \(R>0\) and centered at the origin, i.e.,

\[ D=\{(x_1,x_2)\in\mathbb{R}^2 \mid x_1^2 + x_2^2 \leq R^2\}. \]

Theorem 3.1.2. Let \(f: D\to \mathbb{R}\) be a continuous function on the closed disk \(D\subset \mathbb{R}^2\). Then \(f\) is bounded and attains its minimum and maximum values on \(D\). In other words, there exist points \(x_m, x_M \in D\) such that

\[ f(x_m) \leq f(x) \leq f(x_M) \]

for every possible choice of point \(x\in D\).

If we define a polynomial function \(f:\mathbb{C}\to\mathbb{C}\) by setting \(f(z) = a_n z^n + \cdots + a_1 z + a_0\) as in Equation (3.1.1), then note that we can regard \((x,y)\mapsto \vert f(x+i y)\vert\) as a function \(\mathbb{R}^2\to \mathbb{R}\). By a mild abuse of notation, we denote this function by \(\vert f(\,\cdot\,)\vert\) or \(|f|\). As it is a composition of continuous functions (polynomials and the square root), we see that \(\vert f\vert\) is also continuous.
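
For instance, taking the illustrative polynomial \(f(z) = z^2 + 1\) (chosen here only as an example), we have \((x+iy)^2 = x^2 - y^2 + 2ixy\), and so

\[ \vert f(x+iy)\vert = \sqrt{(x^2-y^2+1)^2 + (2xy)^2}, \]

which is visibly the square root of a non-negative real polynomial in \(x\) and \(y\).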

Lemma 3.1.3. Let \(f:\mathbb{C}\to\mathbb{C}\) be any polynomial function. Then there exists a point \(z_0\in \mathbb{C}\) where the function \(\vert f\vert\) attains its minimum value in \(\mathbb{R}\).

Proof.

If \(f\) is a constant polynomial function, then the statement of the Lemma is trivially true since \(|f|\) attains its minimum value at every point in \(\mathbb{C}\). We may therefore choose, for instance, \(z_{0} = 0\).

If \(f\) is not constant, then the degree of the polynomial defining \(f\) is at least one. In this case, we can denote \(f\) explicitly as in Equation (3.1.1). That is, we set

\[ f(z) = a_n z^n + \cdots + a_1 z + a_0 \]

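with \(a_n\neq 0\). Now assume \(\vert z\vert > 1\), and set \(A=\max\{\vert a_0 \vert, \ldots,\vert a_{n-1}\vert\}\). Using the reverse triangle inequality, the bounds \(\vert a_j\vert \leq A\), and the geometric series \(\sum_{k=1}^\infty \vert z\vert^{-k} = 1/(\vert z\vert - 1)\), we can obtain a lower bound for \(\vert f(z)\vert\) as follows: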

\begin{eqnarray*}
\vert f(z)\vert
&=& \vert a_n\vert \,\vert z\vert^n \,
\bigl\vert 1 +\frac{a_{n-1}}{a_n} \frac{1}{z} +\cdots+
\frac{a_{0}}{a_n} \frac{1}{z^n}\bigr\vert\\
&\geq& \vert a_n\vert \, \vert z\vert^n \,
\bigl(1-\frac{A}{\vert a_n\vert}\sum_{k=1}^\infty\frac{1}{ \vert z\vert^k}\bigr)
= \vert a_n\vert \, \vert z\vert^n
\bigl(1-\frac{A}{\vert a_n\vert} \frac{1}{\vert z \vert -1}\bigr).
\end{eqnarray*}

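For all \(z\in \mathbb{C}\) such that \(\vert z\vert \geq 2\), we have \(\vert z\vert - 1 \geq \vert z\vert/2\) and hence \(1/(\vert z\vert - 1) \leq 2/\vert z\vert\), so we can further simplify this expression and obtain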

\begin{eqnarray*}
\vert f(z)\vert
&\geq& \vert a_n\vert \, \vert z\vert^n
\bigl(1-\frac{2A}{\vert a_n\vert \vert z\vert}\bigr).
\end{eqnarray*}

Since the right-hand side of this inequality tends to \(+\infty\) as \(\vert z\vert\to\infty\), there is an \(R>0\) such that \(\vert f(z) \vert > \vert f(0)\vert\) for all \(z \in \mathbb{C}\) satisfying \(\vert z\vert > R\). Let \(D\subset \mathbb{R}^2\) be the closed disk of radius \(R\) centered at \(0\), and define a function \(g:D\to\mathbb{R}\) by

\[ g(x,y)=\vert f(x+i y)\vert. \]

Since \(g\) is continuous, we can apply Theorem 3.1.2 to obtain a point \((x_0,y_0)\in D\) at which \(g\) attains its minimum on \(D\). By the choice of \(R\), for every \(z\in\mathbb{C}\setminus D\) we have \(\vert f(z)\vert > \vert f(0)\vert = g(0,0)\geq g(x_0,y_0)\). Therefore, \(\vert f\vert\) attains its minimum over all of \(\mathbb{C}\) at \(z=x_0+i y_0\).
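
To illustrate how the radius \(R\) in this proof can be chosen, consider the polynomial \(f(z) = z^2 + 2z + 3\), which is picked here purely as an example and does not come from the text. Then \(a_2 = 1\), \(A = \max\{\vert 3\vert, \vert 2\vert\} = 3\), and \(\vert f(0)\vert = 3\). For \(\vert z\vert \geq 2\), the lower bound from the proof reads

\[ \vert f(z)\vert \geq \vert z\vert^2\Bigl(1 - \frac{6}{\vert z\vert}\Bigr) = \vert z\vert\,(\vert z\vert - 6). \]

If \(\vert z\vert > 7\), then \(\vert z\vert\,(\vert z\vert - 6) > 7\cdot 1 = 7 > 3 = \vert f(0)\vert\). Hence \(R = 7\) is one admissible (though certainly not optimal) choice, and the minimum of \(\vert f\vert\) over \(\mathbb{C}\) must be attained inside the closed disk of radius \(7\).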

We now prove the Fundamental Theorem of Algebra.

Proof of Theorem 3.1.1.

Let \(f\) denote the polynomial function \(f(z) = a_n z^n + \cdots + a_1 z + a_0\) from Equation (3.1.1). By Lemma 3.1.3, the function \(\vert f\vert\) attains its minimum value; let \(z_0\in \mathbb{C}\) be a point where the minimum is attained. We will show that if \(f(z_0)\neq 0\), then \(\vert f\vert\) does not attain its minimum at \(z_0\). By contraposition, it follows that \(f(z_0)=0\), so that \(z_0\) is a solution of Equation (3.1.1).

If \(f(z_0)\neq 0\), then we can define a new function \(g:\mathbb{C}\to\mathbb{C}\) by setting

\[ g(z)=\frac{f(z+z_0)}{f(z_0)}, \mbox{ for all } z\in \mathbb{C}. \]

Note that \(g\) is a polynomial of degree \(n\), and that the minimum of \(\vert f\vert\) is attained at \(z_0\) if and only if the minimum of \(\vert g\vert\) is attained at \(z=0\). Moreover, it is clear that \(g(0)=1\).

More explicitly, \(g\) is given by a polynomial of the form

\[ g(z) = b_n z^n + \cdots + b_k z^k +1, \]

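with \(n\geq 1\) and \(b_k\neq 0\), where \(1\leq k\leq n\) is the smallest positive index whose coefficient is non-zero (such a \(k\) exists because \(g\) has degree \(n\geq 1\)). Write \(b_k=\vert b_k\vert e^{i\theta}\), and consider \(z\) of the form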

\begin{equation}
z=r\vert b_k\vert^{-1/k} e^{i(\pi - \theta)/k},
\label{eqn:z r} \tag{3.1.2}
\end{equation}

with \(r>0\).

For \(z\) of this form we have

\[ g(z) = 1-r^k + r^{k+1} h(r), \]

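where \(h\) is a polynomial. Indeed, \(b_k z^k = \vert b_k\vert e^{i\theta}\, r^k \vert b_k\vert^{-1} e^{i(\pi-\theta)} = -r^k\), and each remaining term \(b_j z^j\) with \(j>k\) contains a factor of \(r^{k+1}\). Then, for \(r<1\), we have by the triangle inequality that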

\[ \vert g(z)\vert \leq 1-r^k + r^{k+1}\vert h(r)\vert. \]

For \(r > 0\) sufficiently small we have \(r\vert h(r)\vert <1\), by the continuity of the function \(r h(r)\) and the fact that it vanishes at \(r=0\). Hence

\[ \vert g(z)\vert \leq 1-r^k(1- r\vert h(r)\vert)<1, \]

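for every \(z\) having the form in Equation (3.1.2) with \(r\in(0,r_0)\), where \(r_0\in(0,1]\) is sufficiently small. But then the minimum of the function \(\vert g\vert:\mathbb{C}\to\mathbb{R}\) cannot possibly be equal to \(1 = \vert g(0)\vert\), which contradicts the fact that \(\vert g\vert\) attains its minimum at \(z=0\). Hence \(f(z_0)=0\).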
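
To see the construction in this proof at work, here is a small worked example in which both the polynomial \(f(z) = z^2 + 1\) and the point \(z_0 = 1\) are illustrative choices that do not come from the text. Since \(f(1) = 2 \neq 0\), we can form

\[ g(z) = \frac{f(z+1)}{f(1)} = \frac{(z+1)^2 + 1}{2} = 1 + z + \tfrac{1}{2} z^2, \]

so that \(k = 1\), \(b_1 = 1\), and \(\theta = 0\). Equation (3.1.2) then gives \(z = r e^{i\pi} = -r\), and

\[ g(-r) = 1 - r + \tfrac{1}{2} r^2 = 1 - r\bigl(1 - \tfrac{1}{2} r\bigr) < 1 \quad \text{for } 0 < r < 1, \]

with \(h(r) = \tfrac{1}{2}\) in the notation of the proof. Equivalently, \(\vert f(1-r)\vert = 2 - 2r + r^2 < 2 = \vert f(1)\vert\), so \(\vert f\vert\) does not attain its minimum at \(z_0 = 1\).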

Contributors

Both hardbound and softbound versions of this textbook are available online at WorldScientific.com.
