Professional

Home
Learn Shen
Videos
Wiki
Community
OS Kernel
OS Library
Shen Professional

The Shen Logic Lab: the Fast Theorem Prover
 

This is a portal for help in learning about components of the Shen Logic Lab.   The logic lab is a cloud based system which is maintained over the web; as with all SP components you do not have reinstall updates which flow into the system month by month.  

This page deals with the FTP (Fast Theorem Prover). This program sits in the fol package. FTP has been tested under a 4.7 GHz AMD processor as delivering over 3 million inferences per second.

The FTP is terminating and therefore incomplete.

The Language of FTP
How FTP Works
Choice Points
The Functions of FTP
PC Examples from Pelletier
The SPTTP
Demodulation

The Language of FTP

The logic of FTP is first-order logic.

1. A logical constant is
v, ~, &, =>, <=>, exists, all, =.
2. A recognised symbol is a symbol other than a logical constant.
3. A recognised symbol is a term.
4. A list composed of a recognised symbol followed by terms is a term.
5. A predicate is a recognised symbol.
6. A list composed of a predicate followed by terms or a recognised symbol (propositional case) is an atom.
7. If T and T' are terms then [T
= T'] is an atom.
8. An atom is a fol.wff.
9. If P and Q are fol.wffs and V is a recognised symbol then [
~ P], [P v Q], [P & Q], [P => Q], [P <=> Q], [all V P], [exists V P] are fol.wffs.

fol.wff is a recognised type in this package.

How FTP Works

The FTP is a theorem prover for first-order logic. FTP has two parts; a preprover P that breaks down logic problems and a main engine SPTTP that compiles first-order logic formulae to Prolog code. The main engine SPTTP (Sequent Based Prolog Technlogy Theorem Prover) is based on Stickel's Prolog Technology Theorem Prover of the 80s but adapted for sequents. SPTTP is programmed to perform bounded depth-first search up to a bound provided by the user. The performance of SPTTP is about 750,000 inferences per second per GHz.

FTP is designed to work with single conclusion sequents - that is, a set of assumptions and a conclusion. More precisely it accepts

a. A list of fol.wffs L which may be empty.
b. An fol.wff W which is the conclusion.
c. A optional list of settings which may be empty.

FTP attempts to prove

L |-
D W i.e. L entails W under D.

The preprover P is actually a little theorem prover itself, is complete for propositional calculus, and has the capacity to solve problems independently of E. It essentially works by converting all formulae in L and W to prenex form and demodulating if demodulation rules are provided (see demodulation). The formulae of L are skolemised to eliminate the existental quantifiers and W is 'reverse skolemised' to eliminate the universal quantifiers. Remaining quantifiers are eliminated in favour of free Prolog variables. What remains is in CNF. The conclusion remains in CNF. The conjunctions in the hypothesis list are decomposed into their constituent clauses. Therefore the problem becomes

Prove L |-D W where L is a list of clauses and W is a formula in CNF.

Since v and & are associative we take both as polyadic prefix operators taking (& P1,...Pn) to mean the conjunction of P1,...Pn (every element of P1,...Pn is true) and (v P1,...Pn) to mean the disjunction of P1,...Pn (at least one of P1,...Pn is true) . We write [... Pn] + L to indicate the result of appending the elements of the list [... Pn] to L. We write neg([..., Pn]) to indicate the list of the complements of [..., Pn]. We refer to as 'problems' all proof obligations generated from the process.

Here is the subsequent preprover proof rules.

1. If W = (v P1,...Pn) then prove neg([..., Pn]) + L |-D P1.
2. If W = (& P1,...Pn) then prove L |-
D P and L |-D (& ...Pn).
3. If W = (&) then L |-
D W is proved.
4. If L = [(v P1,...Pn)] + L' then prove [P] + L' |-
D W and [(v ...Pn)] + L' |-D W.
5. If L = [(v)] + L' then L |-
D W is solved.
6. If P is an element of L and P and W are unifiable, then L |-
D W is solved and the MGU s is used to dereference all remaining problems.
7. If P and (~ Q) are elements of L where P and Q are unifiable then the MGU
s is used to dereference all remaining problems.
8. If L = [(x = y)] + L' and x and y are not syntactically identical then prove
s(L'x/y |-D Wx/y) where L'x/y signifies the result of replacing x by y throughout L' using unification where the MGU is s. This MGU is used to dereference all remaining problems.
9. If L = [P] + L' then prove L' + [P] |-
D W.

These rules follow well known logical principles; 1. is simply a generalisation of the rule L |- (P v Q) if L + [(~ Q)] |- P known as the Law of Disjunctive Syllogism. 2. is the Law of & Introduction. 3. is a degenerate case; the conjunction is empty so every element is true. 4. is a generalisation of reasoning by cases. 5. is a degenerate case; the disjunction is false because in 'at least one of P1,...Pn is true' because there are no elements (i.e. (v) is therefore necessarily false just as (&) is necessarily true). 6. is a generalised instance of proof by hypothesis modulo unification. 7. is a generalised case of proof by contradiction modulo unification. 8. is a form of paramodulation. 9. is a version of the structural exchange rule that allows us to change the ordering of formulae in L by placing the first formula to the end of the remaining assumptions.

The rules are applied by the preprover to a fixpoint with the exchange rule 9 being used to rotate formulae if no rules can be applied to the leading formula in the hypothesis list but the hypothesis list L is not a list of literals.

The problems are either all solved and true is returned or a non-empty list R of problems is generated of the form L' |-D P where P is a literal and every element of L is a literal but neither 6. nor 7. can be used to solve the problem. For each such case the problem is then sent to the engine SPTTP for processing to determine if the problem can be solved relative to the background theory D. If the predicate in P is unknown to SPTTP, then the search is refused and SPTTP fails the proof obligation returning false. If the search fails to produce a solution within a depth set by the user, SPTTP returns false. If all elements in R are provable by SPTTP then true is returned.

It is optional to add the rule

prove L |-D W by proving [(all x (x = x))] + L |-D W

at the first step of proof.

Choice Points

Several of the rules of the preprover are non-deterministic; specifically 6., 7. and 8. In each case there can be choices as to which literals to unify and the wrong choice can lead to an insoluble problem. These rules thus represent choice points in the computation. The conventional wisdom would dictate that FTP should in that event backtrack and search for another solution. Much of FTP is written in a dialect of Prolog and in point of fact, rules 2. and 4. have to be added to the list of choice points. For example if we code rule 2. as:

preprover(L, [&, P | Ps]) :- ! preprover(L, P), !, preprover(L, [& | Ps]).

then the second cut will force FTP to commit to the bindings found in preprover(L, P). So this cut must be omitted if completeness is desired. A similar consideration affects rule 4.

FTP in fact contains 4 flags for controlling the degree of determinacy with each of these rules with the default being total 'don't care' determinacy. This default gives fast results at the cost returning false negatives (incompleteness). The default settings are -hypdet 1 (determinacy wrt rule 6.), -p&~pdet 1 (determinacy wrt rule 7.), -&det 1 (determinacy wrt rule 2) and -vdet 1 (determinacy wrt rule 4).

The variables in the preprover are rigid variables that retain their values. In some cases solutions are lost because variables need to be instantiated in different ways, as when, for example, a universally quantified assumption requires to be used twice with different instantiations. The accepted solution is to amplify these assumptions by creating copies that are alpha variants. FTP can do this by replacing (all x P) by ((all x P) & (all x P)) (amplification level 2 or -amplification 2) and in the conclusion (exists x P) by (exists x P) v (exists x P). The proof process then draws in the extra copies. However it is known that there is no decision procedure for determining the level of amplification needed to compute any arbitrary solution. A complete strategy for FOL without equality FTP sets -hypdet, -p&~pdet, -&det and -vdet to zero and iteratively increases amplification. Such a process can be non-terminating and in any case quite quickly exhausts memory. It is therefore not much used.

The paramodulation rule in FTP is very deterministic. It is complete providing all terms in the problem are ground, which may not always be true. However catering for the non-ground case raises combinatorial problems. FTP copes with this by arranging clauses so that identities and their surrounding formulae are more likely to be ground by delaying paramodulation till late in the proof. This is achieved by ordering assumptions according to a weighting system. Clauses are ordered in L by length, shortest first, with the exception of non-ground equalities which are placed last.

The Functions of FTP

The top level functions are

1. compile-fol of type (list fol.wff) --> boolean which compiles a background theory D to efficient code.

2. ftp of type ((list fol.wff) --> fol.wff --> (list (symbol * number)) --> boolean). ftp takes a list L of fol.wffs and an fol.wff W and attempts to prove L |-D W. If D is undefined then L |- W is attempted. By virtue of a macro, the optional list of settings can be submitted as a series. The BNF is

(ftp <list of fol.wffs> <fol.wff> <options>) or (ftp <list of fol.wffs> <fol.wff>)

where

<options> := <option> <options>
<option> :=
-depth <nn> | -amplify <nn> | -hypdet <flag> | -p&~pdet <flag> | -verbose <0-2> | -id <flag> | -vdet <flag> | -&det <flag>
<flag> := 0 | 1
<0-2> := 0 | 1 | 2
<nn> := a positive integer.

-depth sets the maximum search depth of SPTTP. The default is 5.
-hypdet, -p&~pdet, -&det 1 and -vdet, -amplify as explained set by default to 1.
-verbose at value 1 prints times together with the inferences expended. At 2 the preprover P shows its steps (this is for debugging). The default is 0.
-id 1
adds the preprover rule

prove L |-D W by proving [(all x (x = x))] + L |-D W

at the first step of proof. The default is 0.

3. fol.demodulate of type fol.wff --> fol.wff; this is defined by the user.

4. undemod of type (--> boolean), which removes all demodulation code from the system.

PC Examples from Pelletier

All these are solved by the preprover.

(23-) (ftp [] [[~ [~ p]] => p] -verbose 1)

run time: 0.0 secs
28 inferences
true

(24-) (ftp [] [[~ [p => q]] => [q => p]] -verbose 1)

run time: 0.0 secs
45 inferences
true

(25-) (ftp [] [[[p v q] => [p v r]] => [p v [q => r]]] -verbose 1)

run time: 0.0 secs
161 inferences
true

(26-) (ftp [] [p v [~ p]] -verbose 1)

run time: 0.0 secs
28 inferences
true

(27-) (ftp [] [p v [~ [~ [~ p]]]] -verbose 1)

run time: 0.0 secs
28 inferences
true

(28-) (ftp [] [[[p => q] => p] => p] -verbose 1)

run time: 0.0 secs
58 inferences
true

(29-) (ftp [[p v q] [[~ p] v q] [p v [~ q]]] [~ [[~ p] v [~ q]]] -verbose 1)

run time: 0.0 secs
717 inferences
true

(30-) (ftp [[q => r] [r => [p & q]] [p => [q v r]]] [p <=> q] -verbose 1)

run time: 0.0 secs
2709 inferences
true

(31-) (ftp [] [[[p <=> q] <=> r] <=> [p <=> [q <=> r]]] -verbose 1)

run time: 0.030975341796875 secs
12161 inferences
true

(32-) (ftp [] [[p v [q & r]] <=> [[p v q] & [p v r]]] -verbose 1)

run time: 0.0 secs
507 inferences
true

(33-) (ftp [] [[p <=> q] <=> [[q v [~ p]] & [[~ q] v p]]] -verbose 1)

run time: 0.0 secs
721 inferences
true

(34-) (ftp [] [[[p & [q => r]] => s] <=> [[[~ p] v [q v s]] & [[~ p] v [[~ r]]]] -verbose 1)

run time: 0.0 secs
1235 inferences
true

The SPTTP

THe SPTTP uses the same contrapositive compilation that Stickel uses for the PTTP but uses bounded search. The rule L |- P can be proved if [(~ P) + L] |- P can be proved is used in place of framing and ancestor resolution to exactly the same effect. The SPTTP carries around the assumption list as an extra argument. Rule 6.

6. If P is an element of L and P and W are unifiable, then L |-D W is solved and the MGU s is used to dereference all remaining problems.

is retained. Loop detection is built into the SPTTP.

Here is an example of the SPTTP on set theory. The st function gives some axioms of naive set theory; m means member of, sub subset etc. You can probably work out the rest.

(define st
{--> (list fol.wff)}
-> [[all x [~ [m x e]]]
[all x [all y [[[sub x y] & [sub y x]] <=> [eq x y]]]]
[all x [all y [[sub x y] <=> [all z [[m z x] => [m z y]]]]]]
[all x [all y [[pow x y] <=> [all z [[m z x] <=> [sub z y]]]]]]
[all x [all y [[com x y] <=> [all z [[m z x] <=> [~ [m z y]]]]]]]
[all x [all y [all z [[prod x y z] <=> [all w [[m w x] <=> [[pair w] & [[m [fst w] y] & [m [snd w] z]]]]]]]]]
[all x [all y [all z [[int x y z] <=> [all w [[m w x] <=> [[m w y] & [m w z]]]]]]]]
[all x [all y [all z [[un x y z] <=> [all w [[m w x] <=> [[m w y] v [m w z]]]]]]]] ])

Let's compile this down. This is done on my 1.3GHz laptop.

(20+) (compile-fol (st))
107 Horn clauses generated
true

(21+) (ftp [] [un a a a] -verbose 1)

run time: 0.0149993896484375 secs
17654 inferences
true : boolean

(22+) (ftp [] [int a a a] -verbose 1)

run time: 0.016002655029296875 secs
18807 inferences
true : boolean

(23+) (ftp [] [all x [all y [[pow x y] => [m y x]]]] -verbose 1)

run time: 0.03099822998046875 secs
42741 inferences
true : boolean

(24+) (ftp [] [all x [all y [[prod x y e] => [eq x e]]]] -verbose 1)

run time: 0.0 secs
1574 inferences
true : boolean

(25+) (ftp [] [all x [all y [[sub x y] => [un y x y]]]] -verbose 1)

run time: 0.046001434326171875 secs
41113 inferences
true : boolean

(26+) (ftp [] [all x [~ [pow e x]]] -verbose 1)

run time: 0.0 secs
1935 inferences
true : boolean

Two problems from the original set were not solved - by experiment these can be solved at depths 8 and 7.

(ftp [] [all x [all y [all z [[un x y z] <=> [un x z y]]]]] -verbose 1 -depth 8)

(ftp [] [all x [all y [all z [[sub z y] => [[un x y z] => [un x y y]]]]]] -verbose 1 -depth 7)

However sadly SBCL died with memory problems on these two. Only by cutting the axiom set down to those axioms needed to solve the problem was FTP able to solve both problems.

However we can choose to set up these problems differently. All the equivalences in
st apart from the first are really implicit definitions. Therefore we can demodulate all of these problems and use the first axiom as part of the explicit theory. When we do that the preprover actually solves the problems itself at great speed. This is because treating equivalences clausally is inefficient (107 clauses is a lot!). Let's see this in action.

Demodulation

The preprover P can apply demodulation (rewriting) via a user-defined function fol.demodulate to simplify a problem. Suitably programmed P can actually solve certain problems faster than E. Demodulation or rewriting to a normal form gives significant advantages in dealing with equivalences that are simply definitions. So here I enter the equivalences as definitions; I generate a fresh variable V here to avoid variable capture.

(define fol.demodulate
{fol.wff --> fol.wff}
[eq X Y] -> [[sub X Y] & [sub Y X]]
[sub X Y] -> (let V (gensym v) [all V [[m V X] => [m V Y]]])
[pow X Y] -> (let V (gensym v) [all V [[m V X] <=> [sub V Y]]])
[com X Y] -> (let V (gensym v) [all V [[m V X] <=> [~ [m V Y]]]])
[prod X Y Z] -> (let V (gensym v)
[all V [[m V X]
<=> [[pair V] & [[m [fst V] Y] & [m [snd V] Z]]]]])
[int X Y Z] -> (let V (gensym v) [all V [[m V X] <=> [[m V Y] & [m V Z]]]])
[un X Y Z] -> (let V (gensym v) [all V [[m V X] <=> [[m V Y] v [m V Z]]]])
P -> P)

(define e
{--> (list fol.wff)}
-> [[all x [~ [m x e]]]])

Off we go.

(54+) (ftp (e) [un a a a] -verbose 1)

run time: 0.0159912109375 secs
410 inferences
true : boolean

(55+) (ftp (e) [int a a a] -verbose 1)

run time: 0.0 secs
407 inferences
true : boolean

(56+) (ftp (e) [all x [all y [all z [[un x y z] <=> [un x z y]]) -verbose 1)

run time: 0.0 secs
4332 inferences
true : boolean

(57+) (ftp (e) [all x [all y [[pow x y] => [m y x]]]] -verbose 1)

run time: 0.0 secs
1188 inferences
true : boolean

(58+) (ftp (e) [all x [all y [[prod x y e] => [eq x e]]]] -verbose 1)

run time: 0.014984130859375 secs
6665 inferences
true : boolean

(59+) (ftp (e) [all x [all y [[sub x y] => [un y x y]]]] -verbose 1)

run time: 0.0 secs
848 inferences
true : boolean

(60+) (ftp (e)[all x [all y [all z [[sub z y] => [[un x y z] => [un x y y]]]]]] -verbose 1)

run time: 0.0159912109375 secs
4114 inferences
true : boolean

(61+) (ftp (e) [all x [~ [pow e x]]] -verbose 1)

run time: 0.0 secs
551 inferences
true : boolean