TD1: Automated reasoning techniques in OCaml

Your first practical / homework is to write a SAT-solver, i.e. a solver for mathematical problems expresses in propositional logic.

The SAT problem consists in deciding, given a formula in Conjunctive Normal Form, if it is satisfiable. We recall the basic definitions:

- a
*literal*`l`

is a variable`x`

or its negation`x̅`

- a
*clause*`C`

is a disjunction of literals`l`

_{1}∨ ... ∨ l_{n} - a
*formula in CNF*is a conjunction of clauses`C`

_{1}∧ ... ∧ C_{m} - a
*valuation*is a function assigning`⊤`

or`⊥`

to each variable - a formula is
*satisfiable*if there exists a valuation that satisfies it; it is*unsatisfiable*otherwise

`Exercise.ml`

.
We give examples of different difficulties to - By far, the simplest is to edit the file
`Exercise.ml`

with Emacs+Tuareg+Merlin.In the Emacs+Tuareg+Merlin working environment, the keyboard shortcut ctr-x-e (also available via the menus Tuareg-Interactive Mode-Evaluate Phrase) will feed your current OCaml command to the OCaml interpreter.

The result of the run should appear in a new emacs buffer. - Under Linux and probably Mac, you can also compile your file by running the script build.sh. It will produce the executable file
`Exercise.opt`

, which can for instance be run from the command line.

`⊤`

or `⊥`

.
`dpll.ml`

.

For testing purposes, we provide a DIMACS parser (DIMACS is the standard format for SAT-soslving problems):
`DIMACS.ml`

,
as well as a couple of tests `tests.zip`

(but you can download other DIMACS files from the Internet).

`DPLLT.zip`

.

It contains a file `dpllT.ml`

which is very similar
to `dpll.ml`

.

The main difference is that literals may be taken as literals
from a particular *theory* T, for which we have a
**decision procedure** (a function of type (Lit.t list ->
bool) deciding whether a conjunction of literals is consistent
(outputs true) or inconsistent (outputs false) with the theory T.

This decision procedure is passed as an extra argument to
function `solve`

.

You need to complete the function solve, which now takes the decision procedure as an extra argument, and which is now of type (solutionType -> bool) -> cnfType -> solutionType

In order to test your algorithm, we have included, in the subdirectory LRA of the archive

- another implementation
`LRALit`

of the module type`LitType`

, implementing the literals of Linear Rational Arithmetic (LRA); - a decision procedure for Linear Rational Arithmetic (function
`decproc`

).

Note: When reaching the test code, the simple OCaml interpreter available from emacs may not be able to fetch the code in subdirectory LRA.
A more robust approach is to use `ocamlbuild`

from the command line, to compile a multi-file project:
In the directory with files `dpllT.ml`

and `_tags`

, run the command `ocamlbuild dpllT.native`

(compiling to native code) or
`ocamlbuild dpllT.byte`

(compiling to bytecode). If no error occurs, it will produce a file `dpllT.native`

or `dpllT.byte`

which you can run on your OS.

Among the tests, we have modelled the 2016 American elections and expressed as a CNF (built on LRA literals) the question "There is a tie between Clinton and Trump". Determining whether this CNF is SAT answers the question of whether a tie is possible. We added a piece of code to count the number of solutions (if any). For this particular question, there should be 5 solutions.

`all_form.ml`

.

Some explanations about the provided code:

Implementing `∀x p(x)`

as we write it by hand is not convenient
because we have to manipulate formulae as equivalence classes of trees,
the equivalence being that generated by the renaming of bound variables.
This is not impossible, e.g. working on canonical representatives in the
classes, but is quite complicated, error-prone and actually slightly
inefficient.

Most implementations use the solution proposed in the file:
A term `t`

is not considered independently from the context in which it is
being used. If it is considered as a term occurring in a formula, in
such a way that one of its variables "`x`

" is bound by a quantifier
`∀x`

or `∃x`

further up in that formula, then in the
implementation, the quantifier `∀x`

or `∃x`

is simply encoded
as `∀`

or `∃`

, forgetting the name of the bound variable; at the
same time, every occurrence of `x`

in the term `t`

is encoded as `(Bound i)`

,
where `i`

is an integer: the number of quantifiers that one has to cross
when going from `x`

to the quantifier that binds `x`

in the formula tree.
For instance, `∀x (x=x)`

is implemented as
`∀(Bound 0 = Bound 0)`

, because there are no quantifiers to cross
when going from an occurrence of `x`

to `∀x`

.
On the other hand `∀x ∃y (x=y)`

is implemented as
`∀∃(Bound 1 = Bound 0)`

because from `y`

to its binder, there are no quantifiers to cross, but
from `x`

to its binder, there is 1 quantifier to cross (that which binds
`y`

). Note that sometimes, not all occurrences of `x`

gets the same `i`

, e.g.
in `∀x ((x=x) ∧ ∃y (x=y))`

, which is implemented as
`∀((Bound 0 = Bound 0) ∧ ∃(Bound 1 = Bound 0))`

.

Now if the term `t`

is considered in a context that does not bind `x`

, so `x`

is really a free variable, not just in the term `t`

, but also free in the
context in which `t`

is being used, then `x`

is implemented as `(Free "x")`

.

Note that the manipulation of bound variables, managing the integers i
of the `(Bound i)`

constructs, is not something you should worry about,
because the only point of a variable `x`

bound by `∀x`

or `∃x`

is at some point to be substituted by a term `t'`

, and we give you the
substitution function:
If you see the formula `(∀A)`

implementing `∀x A(x)`

, and you
want get into `A`

itself, you need to say with which `x`

you are doing this,
i.e. with which term `t'`

you want to substitute `x`

. In that case you call
`(subst t' 0 A)`

to have what you would write on paper as `A(t')`

.
In case you don't want to substitute it by any interesting term and
really want to work on `A(x)`

, where `x`

is really a free variable, you can
just call `(subst (Free "x") 0 A)`

.