Professional

The Shen
Logic Lab: the Fast Theorem
Prover
The logic of FTP is firstorder
logic. fol.wff is a recognised type in this package. The FTP is a theorem prover for firstorder logic. FTP has two parts; a preprover P that breaks down logic problems and a main engine SPTTP that compiles firstorder logic formulae to Prolog code. The main engine SPTTP (Sequent Based Prolog Technlogy Theorem Prover) is based on Stickel's Prolog Technology Theorem Prover of the 80s but adapted for sequents. SPTTP is programmed to perform bounded depthfirst search up to a bound provided by the user. The performance of SPTTP is about 750,000 inferences per second per GHz. FTP is designed to work with single conclusion sequents  that is, a set of assumptions and a conclusion. More precisely it accepts a. A list of fol.wffs L which may
be empty. The preprover P is actually a little theorem prover itself, is complete for propositional calculus, and has the capacity to solve problems independently of E. It essentially works by converting all formulae in L and W to prenex form and demodulating if demodulation rules are provided (see demodulation). The formulae of L are skolemised to eliminate the existental quantifiers and W is 'reverse skolemised' to eliminate the universal quantifiers. Remaining quantifiers are eliminated in favour of free Prolog variables. What remains is in CNF. The conclusion remains in CNF. The conjunctions in the hypothesis list are decomposed into their constituent clauses. Therefore the problem becomes Prove L _{D} W where L is a list of clauses and W is a formula in CNF. Since v and & are associative we take both as polyadic prefix operators taking (& P_{1},...P_{n}) to mean the conjunction of P_{1},...P_{n} (every element of P_{1},...P_{n }is true) and (v P_{1},...P_{n}) to mean the disjunction of P_{1},...P_{n} (at least one of P_{1},...P_{n }is true) . We write [... P_{n}] + L to indicate the result of appending the elements of the list [... P_{n}] to L. We write neg([..., P_{n}]) to indicate the list of the complements of [..., P_{n}]. We refer to as 'problems' all proof obligations generated from the process. Here is the subsequent preprover proof rules. 1. If W = (v P_{1},...P_{n})
then prove neg([..., P_{n}]) + L _{D} P_{1}. These rules follow well known logical principles; 1. is simply a generalisation of the rule L  (P v Q) if L + [(~ Q)]  P known as the Law of Disjunctive Syllogism. 2. is the Law of & Introduction. 3. is a degenerate case; the conjunction is empty so every element is true. 4. is a generalisation of reasoning by cases. 5. is a degenerate case; the disjunction is false because in 'at least one of P_{1},...P_{n }is true' because there are no elements (i.e. (v) is therefore necessarily false just as (&) is necessarily true). 6. is a generalised instance of proof by hypothesis modulo unification. 7. is a generalised case of proof by contradiction modulo unification. 8. is a form of paramodulation. 9. is a version of the structural exchange rule that allows us to change the ordering of formulae in L by placing the first formula to the end of the remaining assumptions. The rules are applied by the preprover to a fixpoint with the exchange rule 9 being used to rotate formulae if no rules can be applied to the leading formula in the hypothesis list but the hypothesis list L is not a list of literals. The problems are either all solved and true is returned or a nonempty list R of problems is generated of the form L' _{D} P where P is a literal and every element of L is a literal but neither 6. nor 7. can be used to solve the problem. For each such case the problem is then sent to the engine SPTTP for processing to determine if the problem can be solved relative to the background theory D. If the predicate in P is unknown to SPTTP, then the search is refused and SPTTP fails the proof obligation returning false. If the search fails to produce a solution within a depth set by the user, SPTTP returns false. If all elements in R are provable by SPTTP then true is returned. It is optional to add the rule prove L _{D} W by proving [(all x (x = x))] + L _{D} W at the first step of proof. Several of the rules of the preprover are nondeterministic; specifically 6., 7. and 8. In each case there can be choices as to which literals to unify and the wrong choice can lead to an insoluble problem. These rules thus represent choice points in the computation. The conventional wisdom would dictate that FTP should in that event backtrack and search for another solution. Much of FTP is written in a dialect of Prolog and in point of fact, rules 2. and 4. have to be added to the list of choice points. For example if we code rule 2. as: preprover(L, [&, P  Ps]) : ! preprover(L, P), !, preprover(L, [&  Ps]). then the second cut will force FTP to commit to the bindings found in preprover(L, P). So this cut must be omitted if completeness is desired. A similar consideration affects rule 4. FTP in fact contains 4 flags for controlling the degree of determinacy with each of these rules with the default being total 'don't care' determinacy. This default gives fast results at the cost returning false negatives (incompleteness). The default settings are hypdet 1 (determinacy wrt rule 6.), p&~pdet 1 (determinacy wrt rule 7.), &det 1 (determinacy wrt rule 2) and vdet 1 (determinacy wrt rule 4). The variables in the preprover are rigid variables that retain their values. In some cases solutions are lost because variables need to be instantiated in different ways, as when, for example, a universally quantified assumption requires to be used twice with different instantiations. The accepted solution is to amplify these assumptions by creating copies that are alpha variants. FTP can do this by replacing (all x P) by ((all x P) & (all x P)) (amplification level 2 or amplification 2) and in the conclusion (exists x P) by (exists x P) v (exists x P). The proof process then draws in the extra copies. However it is known that there is no decision procedure for determining the level of amplification needed to compute any arbitrary solution. A complete strategy for FOL without equality FTP sets hypdet, p&~pdet, &det and vdet to zero and iteratively increases amplification. Such a process can be nonterminating and in any case quite quickly exhausts memory. It is therefore not much used. The paramodulation rule in FTP is very deterministic. It is complete providing all terms in the problem are ground, which may not always be true. However catering for the nonground case raises combinatorial problems. FTP copes with this by arranging clauses so that identities and their surrounding formulae are more likely to be ground by delaying paramodulation till late in the proof. This is achieved by ordering assumptions according to a weighting system. Clauses are ordered in L by length, shortest first, with the exception of nonground equalities which are placed last. The top level functions are 1. compilefol of type (list fol.wff) > boolean which compiles a background theory D to efficient code. 2. ftp of type ((list fol.wff) > fol.wff > (list (symbol * number)) > boolean). ftp takes a list L of fol.wffs and an fol.wff W and attempts to prove L _{D} W. If D is undefined then L  W is attempted. By virtue of a macro, the optional list of settings can be submitted as a series. The BNF is
3. fol.demodulate of type fol.wff > fol.wff; this is defined by the user. 4. undemod of type (> boolean), which removes all demodulation code from the system. PC Examples from Pelletier
THe SPTTP uses the same contrapositive compilation that Stickel uses for the PTTP but uses bounded search. The rule L  P can be proved if [(~ P) + L]  P can be proved is used in place of framing and ancestor resolution to exactly the same effect. The SPTTP carries around the assumption list as an extra argument. Rule 6. 6. If P is an element of L and P and W are unifiable, then L _{D} W is solved and the MGU s is used to dereference all remaining problems. is retained. Loop detection is built into the SPTTP. Here is an example of the SPTTP on set theory. The st function gives some axioms of naive set theory; m means member of, sub subset etc. You can probably work out the rest.
Let's compile this down. This is done on my 1.3GHz laptop.
Two problems from the original set were not solved  by experiment these can be solved at depths 8 and 7.
However sadly SBCL died with
memory problems on these two. Only by cutting the axiom
set down to those axioms needed to solve the problem was
FTP able to solve both problems. The preprover P can apply demodulation (rewriting) via a userdefined function fol.demodulate to simplify a problem. Suitably programmed P can actually solve certain problems faster than E. Demodulation or rewriting to a normal form gives significant advantages in dealing with equivalences that are simply definitions. So here I enter the equivalences as definitions; I generate a fresh variable V here to avoid variable capture.
Off we go.
