The choice of postulates in Stephen Cole Kleene’s Introduction to Metamathematics

The book introduces a list of postulates for a family of formal systems without a immediate explanation on numbering scheme and the reasons behind that particular choice. These are my notes on the subject.

Postulates

There are three groups of axioms (p82), that can be used to build a variety of formal systems, mainly in an additive fashion (e.g. group A1 and A2 for predicate calculus).

Postulates
Group A1. Postulates for the propositional calculus
$$1a.$$ $$A \to (B \to A)$$ $$2.$$ $${A, A \to B \over B}$$
$$1b.$$ $$(A \to B) \to ((A \to (B \to C)) \to (A \to C))$$
$$3.$$ $$A \to (B \to A \& B)$$ $$4a.$$ $$A \& B \to A$$
$$4b.$$ $$A \& B \to B$$
$$5a.$$ $$A \to A \lor B$$ $$6.$$ $$(A \to C) \to ((B \to C)\\ \to (A \lor B \to C))$$
$$5b.$$ $$B \to A \lor B$$
$$7.$$ $$(A \to B) \to ((A \to \lnot B) \to \lnot A)$$ $$8^\circ.$$ $$\lnot \lnot A \to A$$
Group A2. Additional postulates for the predicate calculus
$$9.$$ $${C \to A(x) \over C \to \forall x A(x)}$$ $$10.$$ $$\forall x A(x) \to A(t)$$
$$11.$$ $$A(t) \to \exists x A(x)$$ $$12.$$ $${A(x) \to C \over \exists x A(x) \to C}$$
Group B. Additional postulates for number theory
$$13.$$ $$A(0) \& \forall x (A(x) \to A(x')) \to A(x)$$
$$14.$$ $$a' = b' \to a = b$$ $$15.$$ $$\lnot a' = 0$$
$$16.$$ $$a = b \to (a = c \to b = c)$$ $$17.$$ $$a = b \to a' = b'$$
$$18.$$ $$a + 0 = a$$ $$19.$$ $$a + b' = (a + b)'$$
$$20.$$ $$a \cdot 0 = 0$$ $$21.$$ $$a \cdot b' = a \cdot b + a$$

Restrictions:

  • for \(9\) and \(12\): \(C\) does not contain \(x\) free
  • for \(10\) and \(11\): \(t\) is a term free for \(x\) in \(A(x)\)

The postulates give a list of:

  • (plain) axioms: a formula, e.g. \(a + 0 = a\)
  • axiom schemas: formula rules, e.g. \(A \& B \to A\), where \(A\) and \(B\) can be any formula
  • rules of inference: e.g. \({A, A \to B \over B}\) where the formula below the line is called immediate consequence and from the formulas above the line is called a premise (they still follow the pattern of axiom schemas, i.e. \(A\) and \(B\) can be any formula)

Axiom schemas is a metamathematical device of specifying an infinite class of axioms.

The postulates can be used to build deductions. Given a set of formulas called assumptions, deductions are (non-empty) sequences of formulas where each formula is either an assumption, an axiom, matches an axiom schema or matches immediate consequence for premise formulas preceding it in the sequence. The deduction is said to be a deduction for the last formula in the sequence.

Proofs are deductions with no assumptions (i.e. only using the postulates).

Propositional calculus

The first group, A1, are Hilbert-style postulates for propositional calculus.

It uses a variety of logical operations. In fact:

  • 1a and 1b relate to implication \(\to\) rules
  • 3, 4a and 4b relate to conjunction \(\&\)
  • 5a, 5b and 6 relate to disjunction \(\lor\)
  • 7 and \(8^\circ\) relate to negation \(\lnot\)

One could use less of the logical operators, even a single primitive operator symbol (p139). There are two such single primitive operator choices: alternative denial (NAND) or joint denial (NOR). The first observation is that using a single primitive operator, would make a verbose set of axioms even more low level. For another reason see intuitionism below.

Of the logic operators, the truth table for the (material) implication is not intuitive:

$$A$$ $$B$$ $$A \to B$$
False False True
False True True
True False False
True True True

In particular when \(A\) is false, for statements like “if 1 + 1 = 0 then Paris is the capital of France”, it’s not clear intuitively that they should be true.

But when looking at statements like “for all \(x\) if \(x\) is a positive odd number, then \(x^2\) is a positive odd number”, then:

  • providing an \(x\) that is not a positive odd number (a false statement) cannot be used to refute.
  • the only refutation by example is to provide a positive odd number for which the square is not a positive odd.

There is only one rule of inference, 2. This is called modus ponens, the history of which goes back to antiquity.

\[\tag{2} {A, A \to B \over B}\]

It means that in a deduction if you have \(A\), and you also have \(A \to B\) then you also have \(B\). Later he introduces the notation \(A, A \to B \vdash B\) for that.

In the system described here there is a close linked between the material implication logical operator (\(\to\)) and yields (\(\vdash\)): sometimes the word implication is used for either of them.

I find it easier to grasp it’s logic when the implication is phrased instead in terms of negation and logical OR.

\[\tag{2'} {A, \lnot A \lor B \over B}\]

You could have a formal system without rules of inference (rule 2 in this case). However deduced formulas will only get longer.

Instead of rule \(1b\): \(\tag{1b} (A \to B) \to ((A \to (B \to C)) \to (A \to C))\)

other Hilbert-style systems use other variations e.g.:

\[\tag{1b'} (A \to (B \to C)) \to ((A \to B) \to (A \to C))\]

but they have the same meaning: if you have \(A \to B\) and \(A \to (B \to C)\) then by rule 2 (modus ponens) you can deduce \(A \to C\)

Postulates \(5a\) is the dual-converse of \(4a\). This is explained in the 2nd part of the Corollary Theorem 8 (p124): “If \(\vdash E \to F\) then \(\vdash F' \to E'\) (where \(E'\) and \(F'\) are obtained from \(E\) and \(F\) by interchanging \(\&\) and \(\lor\) throughout)”. Same for \(5b\) and \(4b\).

Deduction theorem

Very high level the deduction theorem says that “implies implies implies”.

There are two symbols for which we used the word “imply”:

  • \(\to\) which is part of the formal system, is a symbol like \(\&\) and \(\lor\) for which eventually one can write a truth table, with the catch that false values for the left side result in true for the implication formula.
  • \(\vdash\) is used in the meta discussion, means that there is a deduction from the formulas on the left to the formulas on the right, with the catch that when talking about a deduction one has to specify the formal system (and therefore the allowable deduction rules). This is sometimes done by using as a subscript for \(\vdash\) the name of the specific formal system that a deduction refers to (e.g. for a formal system named \(K\) use \(\vdash_K\).

The role of 1a, 1b and 2 is to prove the deduction theorem for the propositional calculus formal system.

Concretely one can prove:

DEDUCTION THEOREM: For the propositional calculus, if \(\Gamma, A \vdash B\) > then \(\Gamma \vdash A \to B\).

In more words: if there is a deduction from a set of propositions and A reaching B, then there is a deduction from that set of proposition reaching \(A \to B\).

The proof of this theorem is done by induction on the length of the subsidiary deduction and it takes case by case the question “how did we prove the last formula in the deduction, by which rule?”.

\(\Gamma, A \vdash B\) is a subsidiary deduction, it is different and does not necessarily appear within the final deduction. Adding new axioms impacts subsidiary deductions rules.

A direct deduction rule remains correct when a system is extended by adding new axioms (p94). That’s because a direct deduction rule only states that certain deduction can be constructed and new axioms only change the situation by providing additional means of constructing the same deduction.

But for for a rule that employs a subsidiary deduction, such as the deduction theorem, adding new axioms can create cases of subsidiary deductions and raise the question if they can be matched in the resulting deduction (p95).

That’s why the deduction theorem is proved for propositional logic, but when the system is extended it with more postulates the proof needs to be revisited.

In practice there are many cases of postulates that cause no problem, but rules of inference add cases to the proof of the deduction theorem, which has to be then revisited.

That explains why in group A1 above for the propositional calculus there is a single rule of inference (rule 2).

A deduction theorem for the predicate calculus (which also includes group A2), where there are more rules of inference, is possible, but it comes with some additional restrictions.

DEDUCTION THEOREM: For the predicate calculus, if \(\Gamma, A \vdash B\), with the free variables not varied (i.e. held constant) for the assumption formula \(A\), then \(\Gamma \vdash A \to B\).

See here for the definition of varied.

There are of course weaker variants e.g. requiring \(A\) to be closed dispenses with the more complex “not varied” requirement.

There is choice for a postulate system that uses a substitution rule instead of the axiom schemas used in Kleene’s system (though rules of inference like modus ponens would still remain an axiom schema). He claims that a consequence of using the substitution rule as a postulate is that even for the propositional calculus the deduction theorem comes with restrictions similar to the ones for the deduction theorem for the predicate calculus (p140).

Link to natural deduction

The name “natural deduction” is misleading: there is nothing natural about it. It’s just that decomposing the formal system to low level postulates makes proofs quite long, the natural deduction rules are higher level and generally allow for shorter proofs.

The weird numbering scheme \(1a\), \(1b\) etc. was chosen to match the layout for the “natural deduction” rules (p98), shown below with numbering. The explanation for this numbering scheme is alluded to in Lemma 11 (p106).

Natural deduction rules derived from the postulates
Rules for the propositional calculus
Introduction Elimination
Implication \(1.\) If \(\Gamma, A \vdash B\)
then \(\Gamma \vdash A \to B\)
\(2.\) \(A, A \to B \vdash B\)
Modus ponens
Conjunction \(3.\) \(A, B \vdash A \& B\) \(4.\) \(A \& B \vdash A\)
\(A \& B \vdash B \)
Disjunction \(5.\) \(A \vdash A \lor B\)
\(B \vdash A \lor B\)
\(6.\) If \(\Gamma, A \vdash C\) and \(\Gamma, B \vdash C\)
then \(\Gamma \vdash A \lor B \vdash C\)
Proof by cases
Negation \(7.\) If \(\Gamma, A \vdash B\) and \(\Gamma, A \vdash \lnot B\)
then \(\Gamma \vdash \lnot A\)
Reductio ad absurdum
\(8^\circ.\) \(\lnot \lnot A \vdash A\)
Discharge of double negation
Additional rules for the propositional calculus
Introduction Elimination
Generality \(9.\) \(A(x) \vdash^x \forall x A(x)\) \(10.\) \(\forall x A(x) \vdash A(t)\)
Existence \(11.\) \(A(t) \vdash \exists x A(x)\) \(12.\) If \(\Gamma(x), A(x) \vdash C\)
then \(\Gamma(x), \exists x A(x) \vdash^x C\)

Restrictions:

  • the superscript as in \(\vdash^x\) in generality-introduction (\(9\)) and in existence-elimination (\(12\)) that variation might happen for \(x\) (i.e. \(x\) is not necessarily held constant) in constructing the resulting deduction
  • for predicate calculus in subsidiary deductions the free variables must not be varied (i.e. held constant) for the assumption formula to be discharged. Subsidiary deductions are employed for implication-introduction (\(1\)), disjunction-elimination (\(6\)), negation-introduction (\(7\)), and for existence-elimination (\(12\)). These restrictions are related to the restriction for the deduction rule for predicate calculus (that happens to be the implication-introduction \(1\) above).

The postulate numbering scheme makes sense from the link to the “natural deduction”-like rules as follows:

  • \(1\) in the natural deduction is the deduction theorem, requiring postulates \(1a\), \(1b\) and \(2\) for it’s proof
  • \(2\) in the natural deduction is postulate \(2\)
  • \(3\) in the natural deduction also requires postulate \(3\)
  • \(4\) in the natural deduction also requires postulates \(4a\) and \(4b\)
  • \(5\) in the natural deduction also requires postulates \(5a\) and \(5b\)
  • etc.

though in general \(1a\), \(1b\) and \(2\) are additionally used for most of the natural deduction rules.

In Kleene’s system, the “natural deduction”-like rules are deduced from the postulates above.

When we looked at the deduction theorem it turns out that induction rules in particular cause issues with it’s proof, but other the other postulates it’s easy to check that they do not. If the natural deduction rules are taken as a basis, as some logic course do, this issue of the impact of adding more rules is not as clear.

Choices for postulate 8 on negation and intuitionism

To start with, “\(8^\circ\): \(\lnot \lnot A \to A\)” is the choice made by Kleene for this formal system (the classical system), presumably for similarities with the natural deduction rule for the discharge of double negation (“\(\lnot \lnot A \vdash A\)”).

Another choice could have been “\(A \lor \lnot A\)” (named “the law of excluded middle” which is equivalent to \(8^\circ\) (see p120): either could have been chosen for his classical system.

Postulate 8 has a mark \(^\circ\) to indicate further choices for formal systems. It is related to intuitionism.

Kleene provides a sympathetic view to intuitionism and in IM he explores the consequences of the inuitionistic choices. Intuitionism is a misleading term because it sounds mystical and illogical, it not about intuition in general, actually it turns out that specific claims can be identified such as the rejection of the negation rules above (e.g. due to objections to extending rules that apply to finite sets to infinite ones).

Intuitionists reject the rule \(8^\circ\), but accept the following rule “\(8^I\): \(\lnot A \to (A \to B)\) (“from falsehood, anything follows”). If in the classical system we replace rule \(8^\circ\) with \(8^I\) (while keeping the other rules) we obtain a different formal system: the (corresponding) intuitionistic system.

Rule \(8^I\) can be deduced in the classical system accepting \(8^\circ\), but not the other way around (see p101).

When using rule \(8^I\) in the intuitionistic system, the equivalent natural deduction rule is “\(A, \lnot A \vdash B\)” called “the rule of weak \(\lnot\)-elimination”, which reads: any formula is deducible from a contradiction. The rule of weak \(\lnot\)-elimination can be proven in the classical system as well.

The importance of the weak \(\lnot\)-elimination in formal systems is as follows.

A formal systems having a negation symbol, e.g. \(\lnot\), is said to be (simply) consistent) if for no formula both \(A\) and \(\lnot A\) are provable in the system; and to be (simply) inconsistent if for some formula both \(\vdash A\) and \(\vdash \lnot A\).

For formal systems that have \(\&\)-elimination and weak \(\lnot\)-elimination (either as postulates or as derived rules) this is equivalent to (simply) consistent if there is some unprovable formula; (simply) inconsistent if every formula is provable (see p124-125).

Intuitionists accept however the axiom schema \(7: (A \to B) \to ((A \to \lnot B) \to \lnot A)\)” (“the law of contradiction”).

Another choice linked to intuitionism is the number of logical operators in propositional calculus (e.g. many or two or a single primitive one as discussed above). It turns out that it the intuitionist propositional calculus none of the four operators can be expressed in terms of the remaining ones (p141).

Predicate calculus

This requires postulates in group A1 and A2. It introduces two quantifiers: \(\exists\) and \(\forall\).

Theoretically one could use only one of them, but similarly to the boolean operators it would choose to go for a lower level outside the scope of the book. Also both of them are required for the inuitionist logic, where one does not suffice.

This is a first-order logic, which means that the quantifiers apply to variables of values \(x\) as opposed also predicates/properties.

It introduces two rules of inference, \(9\) and \(12\), with consequences for the deduction theorem touched above.

Under the (heuristic) interpretation that \(\forall\) is a conjunction over the domain and \(\exists\) is a disjunction over the domain, the postulates 3 - 6 for \(\&\) and \(\lor\) suggest the postulates 9 - 12 for \(\forall\) and \(\exists\), allowing for certain differences in detail. The analogy is clearer in the “natural deduction”-like rules (see p179).

Similar to the propositional calculus there is the option base the predicate calculus on a substitution rule (see note on p179 and the relatively complex ch34, Theorem 15 in particular).

Number theory

The postulates in group A3 follow a Peano style. The tick sign is the successor (i.e. “plus one”).

  • Postulate \(13\) is used for the induction rule (see below)
  • Postulates \(14\) to \(17\) relate to equality
  • Postulates \(18\) and \(19\) relate to addition
  • Postulates \(20\) and \(21\) relate to multiplication

Addition and multiplication are defined recursive/inductive way that includes a base case, for when \(b\) is \(0\), and definition of the operation for \(b'\) in terms of the same operation for just \(b\).

Induction

Induction is used at two levels. It is used informally as for proving metamathematical theorems. Then it appears as a metamathematical rule for the formal system above as:

INDUCTION RULE: Let \(x\) be a variable, \(A(x)\) be a formula, and \(\Gamma\) be a list of formulas not containing \(x\) free. If \(\Gamma \vdash A(0)\) and \(\Gamma, A(x) \vdash A(x')\) with the free variables held constant for A(x), then \(\Gamma \vdash A(x)\).

The induction rule is proven as such:

  1. \(\Gamma, A(x) \vdash A(x')\): given
  2. \(\Gamma \vdash A(x) \to A(x')\): from 1 by implication introduction (x held constant for A(x) in 1
  3. \(A(x) \to A(x') \vdash^x \forall x (A(x) \to A(x')\): by generality introduction
  4. \(\Gamma \vdash \forall x (A(x) \to A(x'))\): from 2 and 3 (superscript discarded as \(\Gamma\) does not contain \(x\) free)
  5. \(\Gamma \vdash A(0)\): given
  6. \(\Gamma \vdash A(0) \& \forall x (A(x) \to A(x'))\): from 4 and 5 by conjunction introduction
  7. \(A(0) \& \forall x (A(x) \to A(x')) \to A(x)\): Postulate \(13\)
  8. \(\Gamma \vdash A(x)\): from 6 and 7 by implication elimination

References

Stephen Cole Kleene: Introduction to Metamathematics (Ishi Press: 2009 reprint)

Thanks to Alex Kruckman for answer on StackExchange

Stanford Encyclopedia of Philosophy: Intuitionistic Logic