# 4.2: Principle of Induction

We begin by proving a theorem that is equivalent to the principle of induction.

THEOREM 4.3. If

(1) \(X \subseteq \mathbb{N}\)

(2) \(0 \in X\)

(3) \((\forall n \in \mathbb{N}) n \in X \Rightarrow(n+1) \in X\),

then \[X=\mathbb{N} .\]

Discussion. We shall argue by contradiction. We assume that \(X \neq \mathbb{N}\). Let \(Y\) be the complement of \(X\) in \(\mathbb{N}\). Since \(Y\) is non-empty, it has a least element. The third hypothesis of the theorem does not permit \(Y\) to have a least element other than 0, and the second hypothesis rules out 0. Therefore \(Y\) is necessarily empty.

Proof. Let \(X\) satisfy the hypotheses of the theorem. Let \[Y=\mathbb{N} \backslash X .\] We assume \(Y\) is non-empty. Since \(Y \subseteq \mathbb{N}\), \(Y\) is well-ordered by \(\leq\). Let \(a \in Y\) be the least element of \(Y\). We note that \(a\) is not 0, since \(0 \in X\). Therefore \(a \geq 1\) and is a successor, so \(a-1\) is in \(\mathbb{N}\). Since \(a-1<a\) and \(a\) is the least element of \(Y\), \(a-1\) is not in \(Y\). Hence \(a-1\) is in \(X\). But then by hypothesis (3) of the theorem, \((a-1)+1=a \in X\), which contradicts \(a \in Y\). Therefore \(Y\) is empty and \(X=\mathbb{N}\).
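The least-element argument in the proof can be traced on a finite initial segment of \(\mathbb{N}\). The following is a minimal Python sketch, not part of the original text; the names `least_missing` and `failed_hypothesis` and the search bound are assumptions made for illustration. Given a membership test for \(X\), it looks for the least element \(a\) of \(Y\) below the bound and reports which hypothesis of Theorem 4.3 must fail there.

```python
from typing import Callable, Optional


def least_missing(in_X: Callable[[int], bool], bound: int) -> Optional[int]:
    """Return the least n < bound with n not in X, or None if there is none."""
    for n in range(bound):
        if not in_X(n):
            return n
    return None


def failed_hypothesis(in_X: Callable[[int], bool], bound: int) -> str:
    """Report which hypothesis of Theorem 4.3 fails, judged below `bound`."""
    a = least_missing(in_X, bound)
    if a is None:
        return "no counterexample below bound"
    if a == 0:
        # 0 is not in X, so hypothesis (2) fails.
        return "hypothesis (2) fails: 0 is not in X"
    # a is least in Y, so a - 1 is in X while a is not: hypothesis (3) fails.
    assert in_X(a - 1) and not in_X(a)
    return f"hypothesis (3) fails at n = {a - 1}"


# X = even numbers: 0 is in X, but 0 in X does not give 1 in X.
print(failed_hypothesis(lambda n: n % 2 == 0, 100))
# X = positive numbers: closed under successor, but 0 is missing.
print(failed_hypothesis(lambda n: n >= 1, 100))
```

A finite search like this can only illustrate the argument; the proof itself relies on the well-ordering of \(\mathbb{N}\), which no bounded check replaces.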

REMARK. We will occasionally include informal, labelled discussions in our proofs in order to guide you in your reading. This is not a usual practice. You should not include such discussions in your proofs unless your instructor requests it.

Theorem \(4.3\) is more easily applied in the following form.

COROLLARY 4.4 (Principle of induction). Let \(P(x)\) be a formula in one variable. If

(1) \(P(0)\)

(2) \((\forall x \in \mathbb{N}) P(x) \Rightarrow P(x+1)\),

then \[(\forall x \in \mathbb{N}) P(x) .\]

Proof. Let \[\chi_{P}=\{x \in \mathbb{N} \mid P(x)\} .\] We wish to show that \(\chi_{P}=\mathbb{N}\). By assumption (1), \(P(0)\), so \(0 \in \chi_{P}\). Assume that \(n \in \chi_{P}\). Then \(P(n)\). By assumption (2) \[P(n) \Rightarrow P(n+1) .\] Therefore \(P(n+1)\) and \(n+1 \in \chi_{P}\). Since \(n\) is arbitrary, \[(\forall n \in \mathbb{N}) n \in \chi_{P} \Rightarrow n+1 \in \chi_{P} .\] By Theorem 4.3, \(\chi_{P}=\mathbb{N}\) and \[(\forall x \in \mathbb{N}) P(x) .\]

Suppose that you wish to show that a formula \(P(x)\) holds for all natural numbers. When arguing by induction, the author must show that the hypotheses for the theorem are satisfied. Typically, the author first proves that \(P(0)\). This is called the base case of the proof by induction. It is very often an easy, even trivial, conclusion. Nonetheless, it is necessary to prove a base case in order to argue by induction (can you demonstrate this?).

Having proved the base case, the author will then prove the second hypothesis, namely, that the claim being true for an arbitrary natural number implies that it is true at the successor of that natural number. This is the induction step. The induction step requires proving a conditional statement, which is often proved directly. It is important to understand that the author is not claiming that \(P\) holds at an arbitrary natural number, otherwise the argument would be circular and invalid. Rather, the author will demonstrate that if the result were true at an arbitrary natural number, then it would be true for the subsequent natural number. The assumption that \(P\) holds at a fixed and arbitrary natural number is called the induction hypothesis. If the author successfully proves the base case and the induction step, then the assumptions of Corollary \(4.4\) are satisfied, and \(P\) holds at all natural numbers.

Proposition 4.5. Let \(N \in \mathbb{N}\). Then \[\sum_{n=0}^{N} n=\frac{N(N+1)}{2} .\]

Discussion. This is a good first example of a proof by induction. The argument is a straightforward application of the technique and the result is of historical and practical interest. We argue by induction on the upper index of the sum. That is, the formula we are proving for all natural numbers is \[P(x): \sum_{n=0}^{x} n=\frac{x(x+1)}{2} .\] It is important to identify the quantity over which you are applying the principle of induction, but some authors who are writing an argument for readers who are familiar with induction may not explicitly state the formula.

We prove a base case, \(N=0\), that corresponds to the sum with the single term 0 . We then argue the induction step. This is our first argument using the principle of induction. Pay close attention to the structure of this proof. You should strive to follow the conventions for proofs by induction that we establish in this book.

Proof. Base case: \(N=0\).

Discussion. Note that the base case is the statement \(P(0)\).

Since \[\sum_{n=0}^{0} n=0=\frac{(0)(1)}{2},\] \(P(0)\) holds.

Induction step:

Discussion. We prove the universal statement \[(\forall x \in \mathbb{N}) P(x) \Rightarrow P(x+1)\] by showing that for an arbitrary natural number \(N\) \[P(N) \Rightarrow P(N+1) .\] Thus we reduce proving a universal statement to proving an abstract conditional statement. We prove the resulting conditional statement directly. That is, we assume \(P(N)\) and derive \(P(N+1)\). We remind the reader that we are not claiming the result holds at \(N\); that is, we do not claim \(P(N)\). Rather, we are proving the conditional statement by assuming the antecedent, the induction hypothesis, and deriving the consequent. If you do not use the induction hypothesis, you are not arguing by induction. Of course, in the body of the argument this is transparent, without reference to the underlying logical principles.

Let \(N \in \mathbb{N}\) and assume that \[\sum_{n=0}^{N} n=\frac{N(N+1)}{2} .\] Then \[\begin{aligned} \sum_{n=0}^{N+1} n &=\left(\sum_{n=0}^{N} n\right)+N+1 \\ &=_{IH} \frac{N(N+1)}{2}+N+1 \end{aligned}\] by the induction hypothesis.

Discussion. It is a good habit, and a consideration for your reader, to identify when you are invoking the induction hypothesis. We will use the subscript \({}_{IH}\) to indicate where we invoke the induction hypothesis.

So \[\begin{aligned} \sum_{n=0}^{N+1} n &=\frac{N(N+1)}{2}+N+1 \\ &=\frac{N(N+1)}{2}+\frac{2 N+2}{2} \\ &=\frac{N^{2}+3 N+2}{2} \\ &=\frac{(N+1)((N+1)+1)}{2} . \end{aligned}\] Therefore, \[(\forall N \in \mathbb{N}) P(N) \Rightarrow P(N+1) .\] By the principle of induction, the proposition follows.
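As a sanity check on the algebra above, the formula can be tested numerically. This is a minimal Python sketch, not part of the original text; the function names and the cutoff `BOUND` are assumptions chosen for illustration. It checks the base case, the identity used in the induction step, and the full formula for each \(N\) up to the cutoff. Such a check supports the formula but does not prove it for all \(N\); that is what the induction argument does.

```python
def partial_sum(N: int) -> int:
    """Compute 0 + 1 + ... + N by direct addition."""
    return sum(range(N + 1))


def closed_form(N: int) -> int:
    """Right-hand side of Proposition 4.5; N(N+1) is always even."""
    return N * (N + 1) // 2


BOUND = 1000  # arbitrary cutoff chosen for this sketch

# Base case: P(0).
assert partial_sum(0) == closed_form(0) == 0

# Identity behind the induction step: adding N + 1 to the closed form
# at N gives the closed form at N + 1.
for N in range(BOUND):
    assert closed_form(N) + (N + 1) == closed_form(N + 1)

# Direct comparison of both sides for each N up to the bound.
assert all(partial_sum(N) == closed_form(N) for N in range(BOUND + 1))

print(closed_form(100))  # prints 5050
```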

Proposition 4.6. Let \(N \in \mathbb{N}\). Then \[\sum_{n=0}^{N} n^{2}=\frac{N(N+1)(2 N+1)}{6} . \tag{4.7}\]

Proof. The assertion \(P(N)\) is that the equation (4.7) holds. The base case, \(N=0\), is obvious: \[\sum_{n=0}^{0} n^{2}=0=\frac{0(0+1)(2 \cdot 0+1)}{6} .\]

Induction step:

Assume that \(N \in \mathbb{N}\) and \[\sum_{n=0}^{N} n^{2}=\frac{N(N+1)(2 N+1)}{6} .\] We prove that \[\sum_{n=0}^{N+1} n^{2}=\frac{(N+1)(N+2)(2 N+3)}{6} .\] Indeed \[\begin{aligned} \sum_{n=0}^{N+1} n^{2} &=\left(\sum_{n=0}^{N} n^{2}\right)+(N+1)^{2} \\ &=_{IH} \frac{N(N+1)(2 N+1)}{6}+(N+1)^{2} \\ &=\frac{2 N^{3}+9 N^{2}+13 N+6}{6} \\ &=\frac{(N+1)(N+2)(2(N+1)+1)}{6} . \end{aligned}\] The proposition follows from the principle of induction.

Discussion. The proof of Proposition \(4.6\) is very similar to the proof of Proposition 4.5. You may wish to confirm the algebraic identities in the latter portion of the proof, since they are not obvious. Just enough detail is included to guide you through the proof of the implication. The author of a proof by induction will assume that you are comfortable with the technique, and thereby may provide less detail than you would like.
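One way to confirm the algebraic identities is to expand both sides as polynomials in \(N\) and compare coefficients. The following Python sketch is an illustration only and is not part of the original text; the helpers `poly_mul`, `poly_add`, and `product` are hypothetical names, and polynomials are represented as coefficient lists with the constant term first. Both sides are multiplied by 6 to avoid fractions.

```python
from typing import List

Poly = List[int]  # coefficients, lowest degree first


def poly_mul(p: Poly, q: Poly) -> Poly:
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(p: Poly, q: Poly) -> Poly:
    """Add two polynomials, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def product(factors: List[Poly]) -> Poly:
    """Multiply a list of polynomials together."""
    result = [1]
    for f in factors:
        result = poly_mul(result, f)
    return result


# Six times the expression after the induction hypothesis is applied:
# N(N+1)(2N+1) + 6(N+1)^2.
lhs = poly_add(
    product([[0, 1], [1, 1], [1, 2]]),
    product([[6], [1, 1], [1, 1]]),
)
# Six times the target: (N+1)(N+2)(2N+3).
rhs = product([[1, 1], [2, 1], [3, 2]])

print(lhs)  # coefficients of 6 + 13N + 9N^2 + 2N^3
assert lhs == rhs == [6, 13, 9, 2]
```

Because two polynomials with the same coefficients are equal for every \(N\), this comparison confirms the identity itself, not just finitely many instances of it.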

REMARK. There is more to Propositions \(4.5\) and \(4.6\) than just the proofs. There are also the formulas. Indeed, one use of induction is that if you guess a formula, you can use induction to prove your formula is correct. See Exercises \(4.12\) and 4.16.

Why is a base case necessary? Consider the following argument for the false claim \(\sum_{n=0}^{N} n<\frac{N(N+1)}{2}\). Let \(N \in \mathbb{N}\) and assume \(P(N)\), where \(P(N)\) is the statement \[\sum_{n=0}^{N} n<\frac{N(N+1)}{2} .\] Then \[\begin{aligned} \sum_{n=0}^{N+1} n &=\left(\sum_{n=0}^{N} n\right)+N+1 \\ &<_{IH} \frac{N(N+1)}{2}+N+1 \\ &=\frac{N^{2}+3 N+2}{2} \\ &=\frac{(N+1)((N+1)+1)}{2} . \end{aligned}\] Hence, \[(\forall N \in \mathbb{N}) P(N) \Rightarrow P(N+1) .\]

Of course the inequality \(P(N)\) is easily demonstrated to be false. What went wrong? Without a base case, proving \[(\forall N \in \mathbb{N}) P(N) \Rightarrow P(N+1)\] is not sufficient to prove \((\forall N \in \mathbb{N}) P(N)\). If \(P(0)\) were true, then \(P(1)\) would be true, and if \(P(1)\) were true, then \(P(2)\) would be, and so on. Indeed, if we are able to prove \(P(N)\) for any \(N \in \mathbb{N}\), then we know \(P(M)\) for any natural number \(M>N\). But the sequence of statements \(\langle P(0), P(1), P(2), \ldots\rangle\) never gets started. \(P(N)\) fails for all \(N\).
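The situation described above can be observed numerically. In this minimal Python sketch (an illustration under assumed names `P`, `implies`, and `BOUND`, none of which appear in the original text), every instance of the conditional \(P(N) \Rightarrow P(N+1)\) that is checked holds, yet \(P(N)\) itself holds for no \(N\) in the range. Each conditional is true only because its antecedent is false.

```python
def P(N: int) -> bool:
    """The false claim 0 + 1 + ... + N < N(N+1)/2, compared without fractions."""
    return 2 * sum(range(N + 1)) < N * (N + 1)


def implies(a: bool, b: bool) -> bool:
    """Truth value of the conditional a => b."""
    return (not a) or b


BOUND = 200  # arbitrary cutoff chosen for this sketch

# The induction step holds at every N checked ...
step_holds = all(implies(P(N), P(N + 1)) for N in range(BOUND))
# ... but the claim holds at none of them, including the base case.
claim_holds_somewhere = any(P(N) for N in range(BOUND + 1))

print(step_holds)             # True
print(claim_holds_somewhere)  # False
print(P(0))                   # False: the base case fails
```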

Another way to think of induction is in terms of guarantees. Suppose you decide to buy a car. First you go to Honest Bob’s. Bob guarantees that any car he sells will go at least one mile. You buy a car, drive it off the lot, and after 3 miles it breaks down and can’t be fixed. You walk back angrily, but Bob won’t give you your money back because the car lived up to the guarantee.

Then you cross the road to Honest John’s. John guarantees that if he sells you a car, once it starts it will never stop. This sounds pretty good, so you buy a car, put the keys in the ignition, and ... nothing. The car won’t start. John won’t give you your money back either, because the car did not fail to do what he claimed.

Feeling desperate, you end up at Honest Stewart’s. Stewart’s cars come with two guarantees:

(1) The car will start and go at least one mile.

(2) No matter how far the car has gone, it can always be driven an extra mile.

You think this over, and eventually decide that the car will go forever. Best of all, the lease is only \(\$ 1\) a month for the first two months. You sign the lease, and drive home rather pleased with yourself.\({ }^{1}\)

\({ }^{1}\) You are correct that the Principle of Induction guarantees that your car will drive forever. However, as your mother points out when you show her the lease, after the first two months your payment each month is the sum of your payments in the previous two months. How much will you be paying after 5 years?

There are many handy generalizations of the principle of induction. The first we discuss is called strong induction. It is so named because the induction hypothesis is stronger than the induction hypothesis in standard induction, and hence the induction step is sometimes easier to prove in an argument by strong induction.

COROLLARY 4.8 (Strong induction). Let \(P(x)\) be a formula such that

(1) \(P(0)\)

(2) for each \(n \in \mathbb{N}\), \[[(\forall x<n) P(x)] \Rightarrow[P(n)] .\]

Then \[(\forall x \in \mathbb{N}) P(x) .\]

Intuitively this is not very different from basic induction. You start at a base case, and once started you can continue through the remainder of the natural numbers. The distinction is just in the number of assumptions you use when proving something by strong induction. In practice, it gives the advantage that in the induction step you can reduce case \(N\) to any previous case, rather than only the immediately preceding case, \(N-1\). In particular this simplifies arguments about divisibility and integers.
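To see why reducing to any previous case helps with divisibility, consider the claim that every integer \(n \geq 2\) is a product of primes. The following Python sketch is an illustration only, not the book's argument; the function name `prime_factors` is an assumption, and the claim starts at 2 rather than 0. When \(n\) splits as \(d \cdot (n / d)\), the two factors are smaller than \(n\) but usually not equal to \(n-1\), so the recursive calls correspond to a strong induction hypothesis rather than the ordinary one.

```python
from typing import List


def prime_factors(n: int) -> List[int]:
    """Return a list of primes whose product is n, for n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    # Look for a divisor d with 1 < d < n.
    d = 2
    while d * d <= n:
        if n % d == 0:
            # n = d * (n // d) with both factors strictly between 1 and n.
            # The recursive calls play the role of the strong induction
            # hypothesis: the claim is assumed for every smaller case.
            return prime_factors(d) + prime_factors(n // d)
        d += 1
    # No divisor found, so n is prime and is its own factorization.
    return [n]


print(prime_factors(360))  # [2, 2, 2, 3, 3, 5]
```

The recursion terminates because each call is made on a strictly smaller integer that is still at least 2, which is the same reason the strong induction argument is sound.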

Discussion. We reduce the principle of strong induction to the principle of induction. We accomplish this by introducing a formula, \(Q(x)\), which says, "\(P(y)\) is true for all \(y \leq x\)". Strong induction on \(P(x)\) is equivalent to basic induction on \(Q(x)\).

Proof. Assume that \(P(x)\) satisfies the hypotheses of the corollary. Let \(Q(x)\) be the formula \[(\forall y \leq x) P(y)\] where the universe of \(y\) is \(\mathbb{N}\). Then \(Q(0) \equiv P(0)\), so \(Q(0)\) is true. Let \(N \in \mathbb{N}\) and assume \(Q(N)\). So \[(\forall y \leq N) P(y),\] that is, \((\forall y<N+1) P(y)\), and therefore \(P(N+1)\) by hypothesis (2). Hence \[(\forall y \leq N+1) P(y)\] and thus \(Q(N+1)\). Therefore \[(\forall x \in \mathbb{N}) Q(x) \Rightarrow Q(x+1) .\] By the principle of induction, \[(\forall x \in \mathbb{N}) Q(x) .\] However, for any \(N \in \mathbb{N}\), \(Q(N) \Rightarrow P(N)\), so \[(\forall x \in \mathbb{N}) P(x) .\]

Strong induction is particularly useful when proving claims about division. There are examples of the technique throughout Chapter 7. The results in Chapter 7 do not require Chapter 5 and Chapter 6, so you may easily skip ahead. See for example Section 7.1, where the Fundamental Theorem of Arithmetic is proved using strong induction.

Induction does not have to start at 0, or even at a natural number.

COROLLARY 4.9. Let \(k \in \mathbb{Z}\), and let \(P(x)\) be a formula in one variable such that

(1) \(P(k)\)

(2) \((\forall x \geq k) P(x) \Rightarrow P(x+1)\).

Then \[(\forall x \in \mathbb{Z}) x \geq k \Rightarrow P(x) .\]

Discussion. This can be proved by defining a new formula that can be proved with standard induction. Can you define the formula?
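A claim that calls for a later starting point is \(n!>2^{n}\), which fails for \(n=0,1,2,3\) and holds from \(k=4\) onward. This example does not appear in the original text; it is a hypothetical illustration of Corollary 4.9, and the names `P`, `K`, and `BOUND` in the Python sketch below are assumptions. The sketch checks the base case \(P(4)\) and the induction step on a finite range, which illustrates the hypotheses of the corollary but does not replace the proof.

```python
from math import factorial


def P(n: int) -> bool:
    """Hypothetical example claim: n! > 2^n."""
    return factorial(n) > 2 ** n


K = 4        # starting point k of the induction
BOUND = 300  # arbitrary cutoff chosen for this sketch

# The claim fails below k, so induction starting at 0 is not available.
assert not any(P(n) for n in range(K))

# Base case P(k): 4! = 24 > 16 = 2^4.
assert P(K)

# Induction step for x >= k, checked numerically: if x! > 2^x then
# (x+1)! = (x+1) * x! > 2 * 2^x = 2^(x+1), since x + 1 > 2.
assert all((not P(x)) or P(x + 1) for x in range(K, BOUND))

print([n for n in range(8) if P(n)])  # [4, 5, 6, 7]
```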