# 1.5: Induction

- Page ID
- 9682

You are familiar, no doubt, with proofs by induction. They are the bane of most mathematics students from their first introduction in high school through the college years. It is our goal in this section to discuss the proofs by induction that you know so well, put them in a different light, and then generalize that notion of induction to a setting that will allow us to use induction to prove things about terms and formulas rather than just the natural numbers.

Just to remind you of the general form of a proof by induction on the natural numbers, let us state and prove a familiar theorem, assuming for the moment that the set of natural numbers is \(\{1, 2, 3, \ldots \}\).

**Theorem 1.4.1.** *For every natural number \(n\),*

\[1 + 2 + \cdots + n = \frac{n \left( n + 1 \right)}{2}.\]

*Proof.* If \(n = 1\), simple computation shows that the equality holds. For the inductive case, fix \(k \geq 1\) and assume that

\[1 + 2 + \cdots + k = \frac{k \left( k + 1 \right)}{2}.\]

If we add \(k + 1\) to both sides of this equation, we get

\[1 + 2 + \cdots + k + \left( k + 1 \right) = \frac{k \left( k + 1 \right)}{2} + \left( k + 1 \right),\]

and simplifying the right-hand side of this equation shows that

\[1 + 2 + \cdots + \left( k + 1 \right) = \frac{\left( k + 1 \right) \left( \left( k + 1 \right) + 1 \right)}{2},\]

finishing the inductive step, and the proof.

As you look at the proof of this theorem, you notice that there is a base case, when \(n = 1\), and an inductive case. In the inductive step of the proof, we prove the implication

If the formula holds for \(k\), then the formula holds for \(k + 1\).

We prove this implication by assuming the antecedent, that the theorem holds for a (fixed, but unknown) number \(k\), and from that assumption proving the consequent, that the theorem hods for the next number, \(k + 1\). Notice that this is *not* the same as assuming the theorem that we are trying to prove. The theorem is a universal statement - it claims that a certain formula holds for every natural number.

Looking at this from a slightly different angle, what we have done is to construct a set of numbers with a certain property. If we let \(S\) stand for the set of numbers for which our theorem holds, in our proof by induction we show the following facts about S:

- The number 1 is an element of \(S\). We prove this explicitly in the base case of the proof.
- If the number \(k\) is an element of \(S\), then the number \(k + 1\) is an element of \(S\). This is the content of the inductive step of the proof.

But now, notice that we know that the collection of natural numbers can be defined as the smallest set such that:

- The number 1 is a natural number.
- If \(k\) is a natural number, then \(k + 1\) is a natural number.

So \(S\), the collection of numbers for which the theorem holds, is identical with the set of natural numbers, thus the theorem holds for every natural number \(n\), as needed. (If you caught the slight lie here, just substitute "superset" where appropriate.)

So what makes a proof by induction work is the fact that the natural numbers can be defined recursively. There is a base case, consisting of the smallest natural number ("1 is a natural number"), and there is a recursive case, showing how to construct bigger natural numbers from smaller ones ("If \(k\) is a natural number, then \(k + 1\) is a natural number").

Now, let us look at Definition 1.3.3, the definition of a formula. Notice that the five clauses of the definition can be separated into two groups. The first two clauses, the atomic formulas, are explicitly defined: For example, the first case says that anything that is of the form \(= t_1 t_2\) is a formula if \(t_1\) and \(t_2\) are terms. These first two clauses form the base case of the definition. The last three clauses are the recursive case, showing how if \(\alpha\) and \(\beta\) are formulas, they can be used to build more complex formulas, such as \(\left( \alpha \lor \beta \right)\) or \(\left( \forall v \right) \left( \alpha \right)\).

Now since the collection of formulas is defined recursively, we can use an inductive-style proof when we want to prove that something is true about *every* formula. The inductive proof will consist of two parts, a base case and an inductive case. In the base case of the proof we will verify that the theorem is true about every atomic formula - about every string that is known to be a formula from the base case of the definition. In the inductive step of the proof, we assume that the theorem is true about simple formulas (\(\alpha\) and \(\beta\)), and use that assumption to prove that the theorem holds a more complicated formula \(\phi\) that is generated by a recursive clause of the definition. This method of proof is called *induction on the complexity of the formula*, or *induction on the structure of the formula*.

There are (at least) two ways to think about the word "simple" in the last paragraph. One way in which a formula \(\alpha\) might be simpler than a complicated formula \(\phi\) is if \(\alpha\) is a subformula of \(\phi\). The following theorem, although mildly interesting in its own right, is included here mostly so that you can see an example of a proof by induction in this setting:

**Theorem 1.4.2.** *Suppose that \(\phi\) is a formula in the language \(\mathcal{L}\). Then the number of left parentheses occurring in \(\phi\) is equal to the number of right parentheses occurring in \(\phi\).*

*Proof.* We will present this proof in a fair bit of detail, in order to emphasize the proof technique. As you become accustomed to proving theorems by induction on complexity, not so much detail is needed.'

**Base Case.** We begin our inductive proof with the base case, as you would expect. Our theorem makes an assertion about all formulas, and the simplest formulas are the atomic formulas. They constitute our base case. Suppose that \(\phi\) is an atomic formula. There are two varieties of atomic formulas: Either \(\phi\) begins with an equals sign followed by two terms, or \(\phi\) begins with a relation symbol followed by several terms. As there are no parentheses in any term (we are using the official definition of term, here), there are no parentheses in \(\phi\). Thus, there are as many left parentheses as right parentheses in \(\phi\), and we have established the theorem if \(\phi\) is an atomic formula.

**Inductive Case.** The inductive step of a proof by induction on complexity of a formula takes the following form: Assume that \(\phi\) is a formula by virtue of clause (3), (4), or (5) of Definition 1.3.3. Also assume that the statement of the theorem is true when applied to the formulas \(\alpha\) and \(\beta\). With those assumptions we will prove that the statement of the theorem is true when applied to the formula \(\phi\). Thus, as every formula is a formula either by virtue of being an atomic formula or by application of clause (3), (4), or (5) of the definition, we will have shown that the statement of the theorem is true when applied to any formula, which has been our goal.

So, assume that \(\alpha\) and \(\beta\) are formulas that contain equal numbers of left and right parentheses. Suppose that there are \(k\) left parentheses and \(k\) right parentheses in \(\alpha\) and \(l\) left parentheses and \(l\) right parentheses in \(\beta\).

If \(\phi\) is a formula by virtue of clause (3) of the definition, then \(\phi : \equiv \left( \neg \alpha \right)\). We observe that there are \(k + 1\) left parentheses and \(k + 1\) right parentheses in \(\phi\), and thus \(\phi\) has an equal number of left and right parentheses, as needed.

If \(\phi\) is a formula because of clause (4), then \(\phi : \equiv \left( \alpha \lor \beta \right)\), and \(\phi\) contains \(k + l + 1\) left and right parentheses, an equal number of each type.

Finally, if \(\phi : \equiv \left( \forall v \right) \left( \alpha \right)\), then \(\phi\) contains \(k + 2\) left parentheses and \(k + 2\) right parentheses, as needed.

This concludes the possibilities for the inductive case of the proof, so we have established that in every formula, the number of left parentheses is equal to the number of right parentheses.

A second way in which we might structure a proof by induction on the structure of the formula is to say that \(\alpha\) is simpler than \(\phi\) if the number of connectives/quantifiers in \(\alpha\) is less than the number in \(\phi\). In this case one could argue that the induction argument is really an ordinary induction on the natural numbers. Here is an outline of how such a proof might proceed:

*Proof.* We argue by induction on the structure of \(\phi\).

**Base Case.** Assume \(\phi\) has 0 connectives/quantifiers. This means that \(\phi\) is an atomic formula. {Insert argument establishing the theorem for atomic formulas.}

**Inductive Case.** Assume that \(\phi\) has \(k + 1\) connectives/quantifiers. Then either \(\phi : \equiv \neg \alpha\), or \(\phi : \equiv \alpha \lor \beta\) or \(\phi : \equiv \left( \forall x \right) \alpha\), and we can assume that the theorem holds for *every* formula that has \(k\) or fewer connectives/quantifiers. We now argue that the theorem holds for the formula \(\phi\). {Insert argument for the three inductive cases.}

Between the base case and the inductive case we have established that the theorem holds for \(\phi\) no matter how many connectives/quantifiers the formula \(\phi\) contains, so by induction on the structure of \(\phi\), we have established that the theorem holds for all formulas \(\phi\).

This might be a bit confusing on first glance, but the power of this proof technique will become very evident as you work through the following exercises and when we discuss the semantics of our language.

Notice also that the definition of a term (Definition 1.3.1) is also a recursive definition, so we can use induction on the complexity of a term to prove that a theorem holds for every term.

## Exercises

- Prove, by ordinary induction on the natural numbers, that

\[1^2 + 2^2 + \cdots + n^2 = \frac{n \left( n + 1 \right) \left( 2n + 1 \right)}{6}.\] - Prove, by induction, that the sum of the interior angles in a convex \(n\)-gon is \(\left( n - 2 \right) 180^\text{o}\). (A convex \(n\)-gon is a polygon with \(n\) sides, where the interior angles are all less than \(180^\text{o}\).)
- Prove by induction that if \(A\) is a set consisting of \(n\) elements, then \(A\) has \(2^n\) subsets.
- Suppose that \(\mathcal{L}\) is \(\{0, f, g\}\), where 0 is a constant symbol, \(f\) is a binary function symbol, and \(g\) is a 4-ary function symbol. Use induction on complexity to show that every \(\mathcal{L}\)-term has an odd number of symbols.
- If \(\mathcal{L}\) is \(\{0, <\}\), where 0 is a constant symbol and \(<\) is a binary relation symbol, show that the number of symbols in any formula is divisible by 3.
- If \(s\) and \(t\) are strings, we say that \(s\) is an
*initial segment*of \(t\) if there is a nonempty string \(u\) such that \(t : \equiv su\), where \(su\) is the string \(s\) followed by the string \(u\). For example, \(KUMQ\) is an initial segment of \(KUMQUAT\) and \(+24\) is an initial segment of \(+24 u - v\). Prove, by induction on the complexity of \(s\), that if \(s\) and \(t\) are terms, then \(s\) is not an initial segment of \(t\). [*Suggestion:*The base case, when \(s\) is either a variable or a constant symbol, should be easy. Then suppose that \(s\) is an initial segment of \(t\) and \(s : \equiv f t_1 t_2 \ldots t_n\), where you know that each \(t_i\) is not an initial segment of any other term. Look for a contradiction.] - A language is said to satisfy unique readability for terms if, for each term \(t\), \(t\) is in exactly one of the following categories:

(a) Variable

(b) Constant symbol

(c) Complex term

and furthermore, if \(t\) is a complex term, then there is a unique function symbol \(f\) and a unique sequence of terms \(t_1, t_2, \ldots, t_n\) such that \(t : \equiv f t_1 t_2 \ldots t_n\). Prove that our languages satisfy unique readability for terms. [*Suggestion:*You mostly have to worry about uniqueness - for example, suppose that \(t\) is \(c\), a constant symbol. How do you know that \(t\) is not also a complex term? Suppose that \(t\) is \(f t_1 t_2 \ldots t_n\). How do you show that the \(f\) and the \(t_i\)'s are unique? You may find Exercise 6 useful.] - To say that a language satisfies unique readability for formulas is to say that every formula \(\phi\) is in exactly one of the following categories:

(a) Equality (if \(\phi : \equiv = t_1 t_2\))

(b) Other atomic (if \(\phi : \equiv R t_1 t_2 \ldots t_n\) for an \(n\)-ary relation symbol \(R\))

(c) Negation

(d) Disjunction

(e) Quantified

Also, it must be that if \(\phi\) is both \(= t_1 t_2\) and \(= t_3 t_4\), then \(t_1\) is identical to \(t_3\) and \(t_2\) is identical to \(t_4\), and similarly for other atomic formulas. Furthermore, if (for example) \(\phi\) is a negation \(\left( \neg \alpha \right)\), then it must be the case that there is not another formula \(\beta\) such that \(\phi\) is also \(\left( \neg \beta \right)\), and similarly for disjunctions and quantified formulas. You will want to look at, and use, Exercise 7. You may have to prove an analog of Exercise 6, in which it may be helpful to think about the parentheses in an initial segment of a formula, in order to prove that no formula is an initial segment of another formula. - Take the proof of Theorem 1.4.2 and write it out in the way that you would present it as part of a homework assignment. Thus, you should cut out all of the inessential motivation and present only what is needed to make the proof work.