1.4: Terms and Formulas

Last updated
Save as PDF

Page ID: 9681

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

Suppose that \(\mathcal{L}\) is the language \(\{ 0, +, < \}\), and we are going to use \(\mathcal{L}\) to discuss portions of arithmetic. If we were to write down the string of symbols from \(\mathcal{L}\),

\[\left( v_1 + 0 \right) < v_1,\]

and the string

\[v_{17} ) ( \forall + + ( ( ( 0,\]

you would probably agree that the first string conveyed some meaning, even if that meaning were incorrect, while the second string was meaningless. It is our goal in this section to carefully define which strings of symbols of \(\mathcal{L}\) we will use. In other words, we will select the strings that will have meaning.

Now, the point of having a language is to be able to make statements about certain kinds of mathematical systems. Thus, we will want the statements in our language to have the ability to refer to objects in the mathematical structures under consideration. So we will need some of the strings in our language to refer to those objects. Those strings are called the terms of \(\mathcal{L}\).

Definition 1.3.1. If \(\mathcal{L}\) is a language, a term of \(\mathcal{L}\) is a nonempty finite string \(t\) of symbols from \(\mathcal{L}\) such that either:

\(t\) is a variable, or
\(t\) is a constant symbol, or
\(t : \equiv f t_1 t_2 \ldots t_n\), where \(f\) is an \(n\)-ary function symbol of \(\mathcal{L}\) and each of the \(t_i\) is a term of \(\mathcal{L}\).

A couple of things about this definition need to be pointed out. First, there is the symbol \(: \equiv\) in the third clause. The symbol \(: \equiv\) is not a part of the language \(\mathcal{L}\). Rather it is a meta-linguistic symbol that means that the strings of \(\mathcal{L}\)-symbols on each side of the \(: \equiv\) are identical. Probably the best natural way to read clause 3 would be to say that "\(t\) is \(f t_1 t_2 \ldots t_n\)".

The other thing to notice about Definition 1.3.1 is that this is a definition by recursion, since in the third clause of the definition, \(t\) is a tern if it contains substrings that are terms. Since the substrings of \(t\) are shorter (contain fewer symbols) than \(t\), and as none of the symbols of \(\mathcal{L}\) are made up of other symbols of \(\mathcal{L}\), this causes no problems.

Example 1.3.2. Let \(\mathcal{L}\) be the language \(\{ \bar{0}, \bar{1}, \bar{2}, \ldots, +, \cdot\}\), with one constant symbol for each natural number and two binary function symbols. Here are some of the terms of \(\mathcal{L}\): \(\bar{714}\), \(+ \bar{3} \bar{2}\), \(\cdot + \bar{3} \bar{2} \bar{4}\). Notice that \(\bar{1} \bar{2} \bar{3}\) is not a term of \(\mathcal{L}\), but rather is a sequence of three terms in a row.

Chaff: The term \(+ \bar{3} \bar{2}\) looks pretty annoying at this point, but we will use this sort of notation (called Polish notation) for functions rather than the infix notation \(\left( \bar{3} + \bar{2} \right)\) that you are used to. We are not really being odd here: You have certainly seen some functions written in Polish notation: \(\sin \left( x \right)\) and \(f \left( x, y, z \right)\) come to mind. We are just being consistent in treating addition in the same way. What makes it difficult is that it is hard to remember that addition really is just another function of two variables. But we are sure that by the end of this book, you will be very comfortable with that idea and with the notation that we are using.

A couple of points are probably worth emphasizing, just this once. Notice that in the application of the function symbols, there are no parentheses and no commas. Also notice that all of our functions are written with the operator on the left. So instead of \(\bar{3} + \bar{2}\), we write \(+ \bar{3} \bar{2}\). The reason for this is for consistency and to make sure that we can parse our expressions.

Let us give an example. Suppose that, in some language or other, we wrote down the string of symbols \(\heartsuit \yen \uparrow \diamondsuit \# \# \int\). Assume that two of our colleagues, Humphrey and Ingrid, were waiting in the hall while we wrote down the string. If Humphrey came into the room and announced that our string was a 3-ary function symbol followed by three terms, whereas Ingrid proclaimed that the string was really a 4-ary relation symbol followed by two terms, this would be rather confusing. It would be really confusing if they were both correct! So we need to make sure that the strings that we write down can be interpreted in only one way. This property, called unique readability, is addressed in Exercise 7 of Section 1.4.

Chaff: Unique readability is one of those things that, in the opinion of the authors, is important to know, interesting to prove, and boring to read. Thus the proof is placed in (we do not mean "relegated to") the exercises.

Suppose that we look more carefully at the term \(\cdot + \bar{3} \bar{2} \bar{4}\). Assume for now that the symbols in this term are supposed to be interpreted in the usual way, so that \(\cdot\) means multiply, \(+\) means add, and \(\bar{3}\) means three. Then if we add some parentheses to the term in order to clarify its meaning, we get

\[\cdot \left( + \bar{3} \bar{2} \right) \bar{4},\]

which ought to have the same meaning as \(\cdot \bar{5} \bar{4}\), which is \(\bar{20}\), just as you suspected.

Rest assured that we will continue to use infix notation, commas, and parentheses as seem warranted to increase the readability (by humans) of this text. So \(f t_1 t_2 \ldots t_n\) will be written \(f \left( t_1, t_2, \ldots, t_n \right)\) and \(+ \bar{3} \bar{2}\) will be written \(\bar{3} + \bar{2}\), with the understanding that this is shorthand and that our official version is the version given in Definition 1.3.1.

The terms of \(\mathcal{L}\) play the role of the nouns of the language. To make meaningful mathematical statements about some mathematical structure, we will want to be able to make assertions about the objects of the structure. These assertions will be the formulas of \(\mathcal{L}\).

Definition 1.3.3. If \(\mathcal{L}\) is a first-order language, a formula of \(\mathcal{L}\) is a nonempty finite string \(\phi\) of symbols from \(\mathcal{L}\) such that either:

\(\phi : \equiv = t_1 t_2\), where \(t_1\) and \(t_2\) are terms of \(\mathcal{L}\), or
\(\phi : \equiv R t_1 t_2 \ldots t_n\), where \(R\) is an \(n\)-ary relation symbol of \(\mathcal{L}\) and \(t_1, t_2, \ldots, t_n\) are all terms of \(\mathcal{L}\), or
\(\phi : \equiv \left( \neg \alpha \right)\), where \(\alpha\) is a formula of \(\mathcal{L}\), or
\(\phi : \equiv \left( \alpha \lor \beta \right)\), where \(\alpha\) and \(\beta\) are formulas of \(\mathcal{L}\), or
\(\phi : \equiv \left( \forall v \right) \left( \alpha \right)\), where \(v\) is a variable and \(\alpha\) is a formula of \(\mathcal{L}\).

If a formula \(\psi\) contains the subformula \(\left( \forall v \right) \left( \alpha \right)\) [meaning that the string of symbols that constitute the formula \(\left( \forall v \right) \left( \alpha \right)\) is a substring of the string of symbols that make up \(\psi\)], we will say that the scope of the quantifier \(\forall\) is \(\alpha\). Any symbol in \(\alpha\) will be said to lie within the scope of the quantifier \(\forall\). Notice that a formula \(\psi\) can have several different occurrences of the symbol \(\forall\), and each occurrence of the quantifier will have its own scope. Also notice that one quantifier can lie within the scope of another.

The atomic formulas of \(\mathcal{L}\) are those formulas that satisfy clause (1) or (2) of Definition 1.3.3.

You have undoubtedly noticed that there are no parentheses or commas in the atomic formulas, and you have probably decided that we will continue to use both commas and infix notation as seems appropriate. You are correct on both counts. So, instead of writing the official version

\[< SSSSS0SS0\]

in a language containing constant symbol 0, unary function symbol \(S\), and binary relation symbol <, we will write

\[SSSSS0 < SS0\]

or (after some preliminary definitions)

\[\bar{5} < \bar{2}\]

Also notice that we are using infix notation for the binary logical connective \(\lor\). We hope that this will make your life somewhat easier.

You will be asked in Exercise 8 in Section 1.4 to prove that unique readability holds for formulas as well as terms. We will, in our exposition, use different-size parentheses, different shapes of delimiters, and omit parentheses in order to improve readability without (we hope) introducing confusion on your part.

Notice that a term is not a formula! If the terms are the nouns of the language, the formulas will be the statements. Statements can be either true or false. Nouns cannot. Much confusion can be avoided if you keep this simple dictum in mind.

For example, suppose that you are looking at a string of symbols and you notice that the string does not contain either the symbol = or any other relation symbol from the language. Such a string cannot be a formula, as it makes no claim that can be true or false. The string might be a term, it might be nonsense, but it cannot be a formula.

Chaff: We do hope that you have noticed that we are dealing only with the syntax of our language here. We have not mentioned that the symbol \(\neg\) will be used for denial, or that \(\lor\) will mean "or", or even that \(\forall\) means "for every". Don't worry, they will mean what you think they should mean. Similarly, do not worry about the fact that the definition of a formula left out symbols for conjunctions, implications, and biconditionals. We will get to them in good time.

Exercises

Suppose that the language \(\mathcal{L}\) consists of two constant symbols, \(\diamondsuit\) and \(\heartsuit\), a unary relation symbol \(\yen\), a binary function symbol \(\flat\), and a 3-ary function symbol \(\sharp\). Write down at least three distinct terms of the language \(\mathcal{L}\). Write down a couple of nonterms that look like they might be terms and explain why they are not terms. Write a couple of formulas and a couple of nonformulas that look like they ought to be formulas.
The fact that we write all of our operations on the left is important for unique readability. Suppose, for example, that we wrote our binary operations in the middle (and did not allow the use of parentheses). If our language included the binary function symbol \(\#\), then the term
\[u \# v \# w\]
could be interpreted two ways. This can make a difference: Suppose that the operation associated with the function symbol \(\#\) is "subtract". Find three real numbers \(u\), \(v\), and \(w\) such that the two different interpretations of \(u \# v \# w\) lead to different answers. Any nonassociative binary function will yield another counterexample to unique readability. Can you think of three such functions?
The language of number theory is
\[\mathcal{L}_{NT} \: \text{is} \: \{0, S, +, \cdot, E, <\},\]
where the intended meanings of the symbols are as follows: 0 stands for the number zero, \(S\) is the successor function \(S \left( x \right) = x + 1\), the symbols \(+\), \(\cdot\), and \(<\) mean what you expect, and \(E\) stands for exponentiation, so \(E \left( 3, 2 \right) = 9\). Assume that \(\mathcal{L}_{NT}\)-formulas will be interpreted with respect to the nonnegative integers and write an \(\mathcal{L}_{NT}\)-formula to express the claim that \(p\) is a prime number. Can you write the statement of Lagrange's Theorem, which states that every natural number is the sum of four squares?
Write a formula stating that there is no largest prime number. How would we express the Goldbach Conjecture, that every even number greater than two can be expressed as the sum of two primes?
What is the formal statement of the Twin Primes Conjecture, which says that there are infinitely many pairs \(\left( x, y \right)\) such that \(x\) and \(y\) are both prime and \(y = x + 2\)? The Bounded Gap Theorem, proven in 2013, says that there are infinitely many pairs of prime numbers that differ by 70,000,000 or less. Write a formal statement of that theorem.
Use shorthand in your answers to this problem. For example, after you have found the formula which says that \(p\) is prime, call the formula Prime\(\left( p \right)\) and use Prime\(\left( p \right)\) in your later answers.
Suppose that our language has infinitely many constant symbols of the form \(','', ''', \ldots\) and no function or relation symbols other than =. Explain why this situation leads to problems by looking at the formula \(=''''''\). Where in our definitions do we outlaw this sort of problem?