1.9: Substitutions and Substitutability

Last updated
Save as PDF

Page ID: 9977

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

Suppose you knew that the sentence \(\forall x \phi \left( x \right)\) was true in a particular structure \(\mathfrak{A}\). Then, if \(c\) is a constant symbol in the language, you would certainly expect \(\phi \left( c \right)\) to be true in \(\mathfrak{A}\) as well. What we have done is substitute the constant symbol \(c\) for the variable \(x\). This seems perfectly reasonable, although there are times when you do have to be careful.

Suppose that \(\mathfrak{A} \models \forall x \exists y \neg \left( x = y \right)\). This sentence is, in fact, true in any structure \(\mathfrak{A}\) such that \(A\) has at least two elements. If we then proceed to replace the variable \(x\) by the variable \(u\), we get the statement \(\exists y \neg \left( u = y \right)\), which will still be true in \(\mathfrak{A}\), no matter what value we give to the variable \(u\). If, however, we take our original formula and replace \(x\) by \(y\), then we find ourselves looking at \(\exists y \neg \left( y = y \right)\), which will be false in any structure. So by a poor choice of substituting variable, we have changed the truth value of our formula. The rules of substitutability that we will discuss in this section are designed to help us avoid this problem, the problem of attempting to substitute a term inside a quantifier that binds a variable involved in the term.

We begin by defining exactly what we mean when we substitute a term \(t\) for a variable \(x\) in either a term \(u\) or a formula \(\phi\).

Definition 1.8.1. Suppose that \(u\) is a term, \(x\) is a variable, and \(t\) is a term. We define the term \(u_t^x\) (read "\(u\) with \(x\) replaced by \(t\)") as follows:

If \(u\) is a variable not equal to \(x\), then \(u_t^x\) is \(u\).
If \(u\) is \(x\), then \(u_t^x\) is \(t\).
If \(u\) is a constant symbol, then \(u_t^x\) is \(u\).
If \(u : \equiv f u_1 u_2 \ldots u_n\), where \(f\) is an \(n\)-ary function symbol and the \(u_i\) are terms, then \(u_t^x\) is \(f \left( u_1 \right)_t^x \left( u_2 \right)_t^x \ldots \left( u_n \right)_t^x.\)

Chaff: In the fourth clause of the definition above and in the first two clauses of the next definition, the parentheses are not really there. However, we believe that no one can look at \(u_1 \:_t^x\) and figure out what it is supposed to mean. So the parentheses have been added in the interest of readability.

For example, if we let \(t\) be \(g \left( c \right)\) and we let \(u\) be \(f \left( x, y \right) + h \left( z, x, g \left( x \right) \right)\), then \(u_t^x\) is

\[f \left( g \left( c \right), y \right) + h \left( z, g \left( c \right) , g \left( g \left( c \right) \right) \right).\]

The definition of substitution into a formula is also by recursion:

Definition 1.8.2. Suppose that \(\phi\) is an \(\mathcal{L}\)-formula, \(t\) is a term, and \(x\) is a variable. We define the formula \(\phi_t^x\) (read "\(\phi\) with \(x\) replaced by \(t\)") as follows:

If \(\phi : \equiv = u_1 u_2\), then \(\phi_t^x\) is \(= \left( u_1 \right)_t^x \left( u_2 \right)_t^x\).
If \(\phi : \equiv R u_1 u_2 \ldots u_n\), then \(\phi_t^x\) is \(R \left( u_1 \right)_t^x \left( u_2 \right)_t^x \ldots \left( u_n \right)_t^x\).
If \(\phi : \equiv \neg \left( \alpha \right)\), then \(\phi_t^x\) is \(\neg \left( \alpha_t^x \right)\).
If \(\phi : \equiv \left( \alpha \lor \beta \right)\), then \(\phi_t^x\) is \(\left( \alpha_t^x \lor \beta_t^x \right)\).
If \(\phi : \equiv \left( \forall y \right) \left( \alpha \right)\), then
\[\phi_t^x = \begin{cases} \phi \: \: \: \: \: \: \: \: \: \: \: \: \: \: \: \: \: \: \: \: \: \text{if} \: x \: \text{is} \: y \\ \left( \forall y \right) \left( \alpha_t^x \right) \: \: \: \text{otherwise} \end{cases}\]

As an example, suppose that \(\phi\) is the formula

\[P \left( x, y \right) \rightarrow \left[ \left( \forall x \right) Q \left( g \left( x \right), z \right) \right) \lor \left( \forall y \right) \left( R \left( x, h \left( x \right) \right) \right].\]

Then, if \(t\) is the term \(g \left( c \right)\), we get

\[\phi_t^x \: \text{is} P \left( g \left( c \right), y \right) \rightarrow \left[ \left( \forall x \right) \left( Q \left( g \left( x \right), z \right) \right) \right) \lor \left( \forall y \right) \left( R \left( g \left( c \right), h \left( g \left( c \right) \right) \right) \right].\]

Having defined what we mean when we substitute a term for a variable, we will now define what it means for a term to be substitutable for a variable in a formula. The idea is that if \(t\) is substitutable for a variable in a formula. The idea is that if \(t\) is substitutable for \(x\) in \(\phi\), we will not run into the problems discussed at the beginning of this section - we will not substitute a term in such a way that a variable contained in that term is inadvertently bound by a quantifier.

Definition 1.8.3. Suppose that \(\phi\) is an \(\mathcal{L}\)-formula, \(t\) is a term, and \(x\) is a variable. We way that \(t\) is substitutable for \(x\) in \(\phi\) if

\(\phi\) is atomic, or
\(\phi : \equiv \neg \left( \alpha \right)\) and \(t\) is substitutable for \(x\) in \(\alpha\), or
\(\phi : \equiv \left( \alpha \lor \beta \right)\) and \(t\) is substitutable for \(x\) in both \(\alpha\) and \(\beta\), or
\(\phi : \equiv \left( \forall y \right) \left( \alpha \right)\) and either
(a) \(x\) is not free in \(\phi\), or
(b) \(y\) does not occur in \(t\) and \(t\) is substitutable for \(x\) in \(\alpha\)

Notice that \(\phi_t^x\) is defined whether or not \(t\) is substitutable for \(x\) in \(\phi\). Usually we will not want to do a substitution unless we check for substitutability, but we have the ability to substitute whether or not it is a good idea. In the next chapter, however, you will often see that certain operations are allowed only if \(t\) is substitutable for \(x\) in \(\phi\). That restriction is there for good reason, as we will be concerned with preserving the truth of formulas after performing substitutions.

Exercises

For each of the following, write out \(u_t^x\):
(a) \(u : \equiv \cos x\), \(t\) is \(\sin y\).
(b) \(u : \equiv y\), \(t\) is \(Sy\).
(c) \(u : \equiv \sharp \left( x, y , z \right)\), \(t\) is \(423 - w\).
For each of the following, first write out \(\phi_t^x\), then decide if \(t\) is substitutable for \(x\) in \(\phi\), and then (if you haven't already) use the definition of substitutability to justify your conclusions.
(a) \(\phi : \equiv \forall x \left( x = y \rightarrow Sx = Sy \right)\), \(t\) is \(S0\).
(b) \(\phi : \equiv \forall y \left( x = y \rightarrow Sx = Sy \right)\), \(t\) is \(Sy\).
(c) \(\phi : \equiv x = y \rightarrow \left( \forall x \right) \left( Sx = Sy \right)\), \(t\) is \(Sy\).
Show that if \(t\) is variable-free, then \(t\) is always substitutable for \(x\) in \(\phi\).
Show that \(x\) is always substitutable for \(x\) in \(\phi\).
Prove that if \(x\) is not free in \(\psi\), then \(\psi_t^x\) is \(\psi\).
You might think that \(\left( \phi_y^x \right)_x^y\) is \(\phi\), but a moment's thought will give you an example to show that this doesn't always work. (What if \(y\) is free in \(\phi\)?) Find an example that shows that even if \(y\) is not free in \(\phi\), we can still have \(\left( \phi_y^x \right)_x^y\) different from \(\phi\). Under what conditions do we know that \(\left( \phi_y^x \right)_x^y\) is \(\phi\)?
Write a computer program (in your favorite language, or in pseudo-code) that accepts as input a formula \(\phi\), a variable \(x\), and a term \(t\) and outputs "yes" or "no" depending on whether or not \(t\) is substitutable for \(x\) in \(\phi\).