
9.1: Proof concepts


    A proof is essentially a chain of reasoning, in which each step can be logically deduced from the ones that preceded it. It’s a way of putting your thought process on display so it can be scrutinized to make sure it holds water. Any step of your reasoning which was unwarranted will be exposed, and perhaps reveal that the conclusion you thought was true isn’t necessarily dependable after all.

    Here’s an example from everyday life. I’m driving home from work one afternoon, and I believe that my wife and children will be gone when I arrive. I’ll be coming home to an empty house.

    Now why do I believe this? Well, if I unravel my reasoning, it goes like this. First, today is Wednesday. On Wednesday nights, my wife and children normally go to church for dinner and service. Second, my wife likes to call me ahead of time if this plan changes. My cell phone is in my pocket, and has not rung, and so I conclude that the plan has not changed. I look at my watch, and it reads 5:17pm, which is after the time they normally leave, so I know I’m not going to catch them walking out the door. This is, roughly speaking, my thought process that justifies the conclusion that the house will be empty when I pull into the garage.

    Notice, however, that this prediction depends precariously on several facts. What if I spaced out the day of the week, and this is actually Thursday? All bets are off. What if my cell phone battery has run out of charge? Then perhaps she did try to call me but couldn’t reach me. What if I set my watch wrong and it’s actually 4:17pm? Etc. Just like a chain is only as strong as its weakest link, a whole proof falls apart if even one step isn’t reliable.

    Knowledge bases in artificial intelligence systems are designed to support these chains of reasoning. They contain statements expressed in formal logic that can be examined to deduce only the new facts that logically follow from the old. Suppose, for instance, that we had a knowledge base that currently contained the following facts:

    1. A\(\Rightarrow\)C

    2. \(\neg\)(C\(\wedge\)D)

    3. (F\(\vee \neg\)E)\(\Rightarrow\)D

    4. A\(\vee\)B

    These facts are stated in propositional logic, and we have no idea what any of the propositions really mean, but then neither does the computer, so hey. Fact #1 tells us that if proposition A (whatever that may mean) is true, then we know C is true as well. Fact #2 tells us that we know C\(\wedge\)D is false, which means at least one of the two must be false. And so on. Large knowledge bases can contain thousands or even millions of such expressions. It’s a complete record of everything the system “knows."
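
    To make this concrete, here is a minimal sketch (not from the text) of how a program might store these four facts so they can be evaluated against a particular truth assignment. The representation as Python functions is purely illustrative; real knowledge bases use more sophisticated data structures.

```python
# Hypothetical encoding of facts 1-4: each fact is a function from a truth
# assignment (a dict mapping proposition names to booleans) to True/False.

def implies(p, q):
    # "p implies q" is false only when p is true and q is false
    return (not p) or q

knowledge_base = [
    lambda v: implies(v["A"], v["C"]),                  # 1. A => C
    lambda v: not (v["C"] and v["D"]),                  # 2. not (C and D)
    lambda v: implies(v["F"] or not v["E"], v["D"]),    # 3. (F or not E) => D
    lambda v: v["A"] or v["B"],                         # 4. A or B
]

# One truth assignment that happens to satisfy every fact:
v = {"A": True, "B": False, "C": True, "D": False, "E": True, "F": False}
print(all(fact(v) for fact in knowledge_base))   # True
```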

    Now suppose we learn an additional fact: \(\neg\)B. In other words, the system interacts with its environment and comes to the conclusion that proposition B must be false. What else, if anything, can now be safely concluded from this?

    It turns out that we can now conclude that F is also false. How do we know this? Here’s how:
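
    One way to convince yourself of this claim without any clever reasoning at all is brute force: enumerate every possible truth assignment to the six propositions, and check that in every one satisfying all four facts plus \(\neg\)B, the proposition F comes out false. The sketch below (the names and structure are illustrative, not part of the text) does exactly that.

```python
# Brute-force entailment check: does the knowledge base plus "not B" force F
# to be false in every possible world?
from itertools import product

PROPS = ["A", "B", "C", "D", "E", "F"]

def implies(p, q):
    return (not p) or q

def kb_holds(v):
    return (implies(v["A"], v["C"])                     # 1. A => C
            and not (v["C"] and v["D"])                 # 2. not (C and D)
            and implies(v["F"] or not v["E"], v["D"])   # 3. (F or not E) => D
            and (v["A"] or v["B"]))                     # 4. A or B

entailed = all(
    not v["F"]                                   # F must be false...
    for combo in product([False, True], repeat=len(PROPS))
    for v in [dict(zip(PROPS, combo))]
    if kb_holds(v) and not v["B"]                # ...whenever the KB and "not B" hold
)
print(entailed)   # True: F is false in every such world
```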

    1. Fact #4 says that either A or B (or both) is true. But we just discovered that B was false. So if it ain’t B, it must be A, and therefore we conclude that A must be true. (For the curious, this rule of common sense is called a “disjunctive syllogism.")

    2. Now if A is true, we know that C must also be true, because fact #1 says that A implies C. So we conclude that C is true. (This one goes by the Latin phrase “modus ponens.")

    3. Fact #2 says that C\(\wedge\)D must be false. But we just found out that C was true, so it must be D that’s false in order to make the conjunction false. So we conclude that D is false. (This is a disjunctive syllogism in disguise, combined with De Morgan’s law.)

    4. Finally, fact #3 tells us that if F were true or E were false, then D would be true. But we just found out that D is false. Therefore, neither F nor \(\neg\)E can be true. (This step combines “modus tollens" with De Morgan’s law.) So we conclude that F must be false. Q.E.D.

    (The letters “Q.E.D." at the end of a proof stand for a Latin phrase meaning, “we just proved what we set out to prove." It’s kind of a way to flex your muscles as you announce that you’re done.)
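
    The same four steps can also be phrased as a tiny forward-chaining program. The sketch below (again illustrative, not the chapter’s code) keeps two sets, the propositions known to be true and those known to be false, and applies each rule in the order used above.

```python
# A rough sketch of the derivation above as forward chaining.
known_true = set()            # propositions established as true
known_false = {"B"}           # the newly learned fact: not B

# Step 1: disjunctive syllogism on fact 4 (A or B) with "not B" gives A.
if "B" in known_false:
    known_true.add("A")

# Step 2: modus ponens on fact 1 (A => C) with A gives C.
if "A" in known_true:
    known_true.add("C")

# Step 3: from fact 2, not (C and D), together with C, conclude "not D"
# (De Morgan's law plus a disjunctive syllogism).
if "C" in known_true:
    known_false.add("D")

# Step 4: modus tollens on fact 3, (F or not E) => D, with "not D" gives
# not (F or not E), which De Morgan turns into "not F" and E.
if "D" in known_false:
    known_false.add("F")
    known_true.add("E")

print("F" in known_false)   # True: F has been shown false
```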

    Not all proofs are performed in formal logic like this; some use algebra, set theory, or just plain English. But the idea is the same: start with what you know, proceed to derive new knowledge using only legal operations, and end with your conclusion.

    The things we’re allowed to start with are called axioms (or postulates). An axiom is a presupposition or definition that is given to be true, and so it is legal grounds from which to start. A proof can’t even get off the ground without axioms. For instance, in step 1 of the above proof, we noted that either A or B must be true, and so if B isn’t true, then A must be. But we couldn’t have taken this step without knowing that disjunctive syllogism is a valid form of reasoning. It’s not important to know all the technical names of the rules that I included in parentheses. But it is important to see that we made use of an axiom of reasoning on every step, and that if any of those axioms were incorrect, it could lead to a faulty conclusion.

    When you create a valid proof, the result is a new bit of knowledge called a theorem, which can be used in future proofs. Think of a theorem like a subroutine in programming: a separate bit of code that does a job and can be invoked at will in the course of doing other things. One theorem we learned in an earlier chapter was the distributive property of sets; that is, that X \(\cap\) (Y \(\cup\) Z) = (X \(\cap\) Y) \(\cup\) (X \(\cap\) Z). This can be proven through the use of Venn diagrams, but once you’ve proven it, it’s accepted to be true, and can be used as a “given" in future proofs.
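
    A theorem like the distributive property can also be spot-checked on particular sets. This is no substitute for the Venn diagram proof, which covers every possible choice of sets, but it makes a quick sanity check. The sets below are made up purely for illustration.

```python
# Checking X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z) on some example sets.
X = {1, 2, 3, 4}
Y = {3, 4, 5}
Z = {4, 6}

left = X & (Y | Z)            # X ∩ (Y ∪ Z)
right = (X & Y) | (X & Z)     # (X ∩ Y) ∪ (X ∩ Z)
print(left, right, left == right)   # {3, 4} {3, 4} True
```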


    This page titled 9.1: Proof concepts is shared under a not declared license and was authored, remixed, and/or curated by Stephen Davies (allthemath.org) .
