Objects: Inheritance and Overriding

INF 441 - TD 5

 Login :  Mot de passe :

Context: Presburger arithmetic

This TD is designed to introduce you to objects in OCaml, particularly in the use of inheritance and overriding to achieve a high degree of concision in repetitive code. The problem we will be looking at is building a decision procedure for Presburger arithmetic, which is a formal language for expressing mathematical statements about integers. The particular assertions we consider are built from:

  1. Integer expressions built from integer constants ($-10$, $0$, $42$ etc.) and integer-valued variables ($x$, $y$, etc.) that are combined together with:
    1. Addition ($+$), subtraction ($-$), and integer negation ($\sim$)
    2. Multiplication ($\cdot$) of any expression by integer constants. Note that any arbitrary pair of expressions cannot be multiplied, just multiplication with constants.
  2. Boolean expressions, which we sometimes call assertions, that are built from:
    1. Equalities and inequalities between integer expressions ($<$, $\le$, $>$, $\ge$, $=$, and $\neq$).
    2. Conjunction ($\wedge$) and disjunction ($\vee$) of boolean expressions
    3. Truth ($\top$) and falsehood ($\bot$)
    4. Implication ($\Rightarrow$) between boolean expressions
    5. Negation ($\lnot$) of a boolean expression
    6. Universal ($\forall$) and existential ($\exists$) quantification over boolean expressions

Here is an example of an assertion in this language:

$\forall x. x \ge 5 \Rightarrow \exists y. \exists z. x = 2y + 3z$.

It asserts that every integer larger than 5 can be built out of a combination of integer multiples of 2 and 3. If we had only 2 and 3 centime coins, for instance, we could assemble any integer amount of money larger than 5 centimes. The program we are going to build will take any such assertion that is closed, meaning that every variable is explicitly quantified, and decide whether it is true or false, i.e., whether or not it describes a mathematical property of integers.

Warmup: representing the syntax and printing

Before charging into solving the problem, let us first show how to represent integer and boolean expressions. Start a new module called Syntax in syntax.ml

(** identifiers *)
type ident = string

(** integer valued expressions *)
type intex =
  | Const of int                         (** constants *)
  | Var   of ident                       (** variables *)
  | Add   of intex * intex
  | Sub   of intex * intex
  | Times of intex * intex
  | Neg   of intex                       (** integer negation *)

(** relations between integer expressions *)
type rel = Eq | Lt | Leq | Gt | Geq | Neq

(** boolean valued expressions, aka assertions *)
type boolex =
  | Rel    of rel * intex * intex        (** relations *)
  | And    of boolex * boolex | True     (** conjunction and truth *)
  | Or     of boolex * boolex | False    (** disjunction and falsehood *)
  | Impl   of boolex * boolex            (** implication *)
  | Not    of boolex                     (** boolean negation *)
  | Forall of ident * boolex             (** universal *)
  | Exists of ident * boolex             (** existential *)

Unfortunately, this syntax is rather difficult for us to read. What we would like to do is to convert it into a string so that we can print it out. However, when we do this we have to be careful to preserve the meaning. For example, the expression Times (Add (Var "x", Var "y"), Var "z") should be printed as: "(x + y) * z" and not as "x + y * z".

Print Trees

We will do this by building print trees which will be used to store the hierarchical structure of things to be printed together with priorities. A print tree is, abstractly, a tree of node instances, but we will not explicitly build these trees. In this warmup exercise we will just consider the creation of objects and invoking methods.

The basic idea is as follows: we will endow each node with a priority, which is a (signed) integer. Any node with a negative priority is already well-bracketed, meaning that no matter where it occurs in a string its meaning does not change. For non-negative priorities, whenever a low priority node occurs as a child of a high priority node, when printing it out the low priority node must be surrounded in parentheses ("(" and ")"). For example, if Times is converted to a node of priority 3 and Add to a node of priority 2, then Times (Add (Var "x", Var "y"), Var "z") is printed as "(x + y) * z", because the + is a lower priority node than the *.

We go one step further and also deal with (left-) associative binary operators as follows: whenever we have two nodes $m$ and $n$ with the same non-zero priority, both associative, and $m$ is a binary operator node with $n$ its left child, then we avoid parenthesizing $n$. For example, we would convert $(x + y) + z$ as "x + y + z", since $+$ is left-associative, but for $x + (y + z)$ we would print it as "x + (y + z)". For a non-associative operator such as $-$, we would not omit the parentheses in such cases.

Put this in print_tree.mli and then complete the corresponding module in print_tree.ml.

class type node = object
  (** is it (left-) associative? *)
  method is_assoc  : bool
  (** what is its priority? *)
  method priority  : int
  (** what is its stringified form? *)
  method to_string : string
end

val exact   : string -> node
val unop    : int -> string -> node -> node
val binop   : int -> bool -> string -> node -> node -> node

Warning: calls to to_string should not create new objects!

Here are a few tests that you can put at the end of the file.

module Test = struct
  let x = exact "x"
  let y = exact "y"
  let z = exact "z"
  let xpy = binop 1 true " + " x y
  let () = assert (xpy#to_string = "x + y")
  let xpytz = binop 2 true " * " xpy z
  let () = assert (xpytz#to_string = "(x + y) * z")
  let xty = binop 2 true " * " x y
  let () = assert (xty#to_string = "x * y")
  let xtypz = binop 1 true " + " xty z
  let () = assert (xtypz#to_string = "x * y + z")
  let xtytz = binop 2 true " * " xty z
  let () = assert (xtytz#to_string = "x * y * z")
  let xmynz = binop 2 false " m " x (binop 3 false " n " y z)
  let () = assert (xmynz#to_string = "x m y n z")
  let xnymz = binop 3 false " n " x (binop 2 false " m " y z)
  let () = assert (xnymz#to_string = "x n (y m z)")
end
Submit print_tree.ml

Le nom du fichier à déposer
Il faut se connecter avant de pouvoir déposer

Printing expressions

  1. Using the Print_tree module, write the following two functions in Syntax.
    val intex_to_node  : intex -> node
    val boolex_to_node : boolex -> node
    Use the following table.
    operator kind priority string form
    $\forall x.$, $\exists x.$ unary 0 "\\A x. ", "\\E x. "
    $\Rightarrow$ binary-nonassociative 1 " => "
    $\lor$ binary-associative 2 " | "
    $\land$ binary-associative 3 " & "
    $=$, $\neq$, $<$, $\leq$, $>$, $\geq$ binary-nonassociative 4 " = ", " != ", " < ", " <= ", " > ", " >= "
    $\lnot$ unary 5 " ! "
    $+$ binary-associative 6 " + "
    $-$ binary-nonassociative 7 " - "
    $\cdot$ binary-associative 8 " * "
    $\sim$ unary 9 "~ "
    In addition, print $\top$ as "true" and $\bot$ as "false".
  2. Hence derive the following functions.
    val intex_to_string  : intex -> string
    val boolex_to_string : boolex -> string
Submit syntax.ml

Le nom du fichier à déposer
Il faut se connecter avant de pouvoir déposer

Traversal

A very common operation on assertions is to find all the free variables in the assertion, which is to say all the variables that are not explicitly quantified. Here are a few examples.

assertion free variables
$x + y = z$ $\{ x, y, z\}$
$\exists y. x + y = z$ $\{ x, z \}$
$\forall x. \forall z. \exists y. x + y = z$ $\{ \}$

To compute this set we need to traverse the entire syntax tree in order, accumulating the variables as we go. It turns out that objects and inheritance in OCaml is ideally suited for writing all such traversal functions. The basic idea is to write a basic traversal class that computes the identity function over the whole tree. Then, for any other traversal function that is basically the same as the identity except for a small change to particular cases, we merely inherit the default traverser.

  1. In syntax.ml finish the implementation of the following class.
    class ['acc] traverse = object (self)
      method intex (acc : 'acc) (e : intex) : 'acc =
        match e with
        | Const k -> acc
        | Add (e1, e2) ->
            let acc = self#intex acc e1 in
            let acc = self#intex acc e2 in
            acc
        (* ... *)
    
      method boolex (acc : 'acc) (e : boolex) : 'acc =
        (* ... *)
    end
  2. Use the traverse class to finish the following object.
    exception Is_free
    
    let check_nonfree x = object (self)
      inherit [unit] traverse as super
    
      method! intex acc e =
        match e with
        | Var y when x = y -> raise Is_free
        (* ... *)
    
      method! boolex acc e =
        (* ... *)
    end
    In the overridden methods intex and booleax of check_nonfree, make sure that you handle the case of the following constructors specially: Var, Forall, and Exists. None of the other constructors are relevant for variables, so you can just appeal to the corresponding methods of the superclass, which you can invoke using super#intex and super#boolex.
  3. In a similar vein, write the following object.
    module Idts : Set.S with type elt := ident = Set.Make (String)
    
    let find_vars : Idts.t traverse = (* ... *)
    You may assume that all Forall and Exists nodes in a boolex introduce globally unique identifiers. Thus, you will never have to deal with assertions such as $(x = y) \land (\exists x. x = z)$. Insead, you will be given $(x = y) \land (\exists x'. x' = z)$.
  4. Finally, write the following object:
    exception Malformed_times of intex
    
    let check_times : unit traverse = (* ... *)
    The purpose of this object is to find all the Times nodes and make sure that at least one of the arguments to the node is a constant. If the condition fails, raise the Malformed_times exception with one of the invalid Times nodes as its argument.
Submit syntax.ml

Le nom du fichier à déposer
Il faut se connecter avant de pouvoir déposer

Mapping

The traversal class you have written does not allow you to modify your syntax, only extract information from it. To perform a modification you need to create a different kind of traverser called a mapper that extends the functionality of the traverser.

  1. Complete the implementation of the following class.
    class ['acc] map = object (self)
      method intex (acc : 'acc) (e : intex) : 'acc * intex =
        match e with
        | Add (e1, e2) ->
            let (acc, e1) = self#intex acc e1 in
            let (acc, e2) = self#intex acc e2 in
            (acc, Add (e1, e2))
        (* ... *)
    
      method boolex (acc : 'acc) (e : boolex) : 'acc * boolex =
        (* ... *)
    end
  2. If an arithmetic expression contains no variables, it can be computed using ordinary integer operations. Likewise, if an assertion contains no variables, it can simply be evaluated to yield either $\top$ or $\bot$. Modify the map class that you wrote above to perform this simplification eagerly. In other words, make sure that you never create a node other than Const, True, or False if it can be avoided.
  3. Miniscoping is a transformation that tries to make the scope of $\forall$ and $\exists$ as small as possible by making use of the following equivalences.
    $\forall x. (e \land f)$$\equiv$$(\forall x. e) \land (\forall x. f)$ $\exists x. (e \lor f)$$\equiv$$(\exists x. e) \lor (\exists x. f)$
    $\forall x. (e \lor g)\ \equiv\ \forall x. (g \lor e)$$\equiv$$(\forall x. e) \lor g$ $\exists x. (e \land g)\ \equiv\ \exists x. (g \land e)$$\equiv$$(\exists x. e) \land g$
    $\forall x. g$$\equiv$g $\exists x. g$$\equiv$g
    where in the last two rows, $g$ is such that $x$ is not free in $g$. Use the map and check_nonfree classes to write the following object that performs miniscoping.
    val miniscope : unit map
  4. Propositional simplification is a transformation of an assertion that makes use of the following equivalences.
    $e \land \top \equiv \top \land e$$\equiv$$e$ $e \land \bot \equiv \bot \land e$$\equiv$$\bot$
    $e \lor \bot \equiv \bot \lor e$$\equiv$$e$ $e \lor \top \equiv \top \lor e$$\equiv$$\top$
    $\top \Rightarrow e$$\equiv$$e$ $\bot \Rightarrow e\ \equiv\ e \Rightarrow \top$$\equiv$$\top$ $e \Rightarrow \bot$$\equiv$$\lnot e$
    Use these equivalences to write the following object.
    val prop_simplify : unit map
  5. Strength reduction is a transformation that changes every relation into $<$ by means of the following transformations.
    $e = f$$\equiv$$(e < f + 1) \land (f < e + 1)$ $e \neq f$$\equiv$$(e < f) \lor (f < e)$
    $e \leq f$$\equiv$$e < f + 1$ $e \geq f$$\equiv$$f < e + 1$
    $e > f$$\equiv$$f < e$
    Use them to write the following object:
    val strength_reduce : unit map
  6. Negation-normalizing is a transformation of an assertion that removes all Not nodes with the following equivalences:
    $\lnot (e = f)$$\equiv$$e \neq f$ $\lnot (e < f)$$\equiv$$e \geq f$ $\lnot (e > f)$$\equiv$$e \leq f$
    $\lnot (e \neq f)$$\equiv$$e = f$ $\lnot (e \leq f)$$\equiv$$e > f$ $\lnot (e \geq f)$$\equiv$$e < f$
    $\lnot (e \land f)$$\equiv$$\lnot e \lor \lnot f$ $\lnot (e \lor f)$$\equiv$$\lnot e \land \lnot f$ $\lnot (e \Rightarrow f)$$\equiv$$e \land \lnot f$
    $\lnot \top$$\equiv$$\bot$ $\lnot \bot$$\equiv$$\top$ $\lnot \lnot e$$\equiv$$e$
    $\lnot (\forall x. e)$$\equiv$$\exists x. \lnot e$ $\lnot (\exists x. e)$$\equiv$$\forall x. \lnot e$
    Use this information to write this mapper:
    val negation_normalize : bool map
    Hint: the bool type parameter can be used to track the parity of the transform, i.e.. the number of $\lnot$s that you have seen so far.
  7. Write the following function that attempts all of the above transformations until the assertion does not change any further.
    val simplify : boolex -> boolex
    You can use OCaml structural equality (=) to compare assertions.
Submit syntax.ml

Le nom du fichier à déposer
Il faut se connecter avant de pouvoir déposer

Optional: Adding divisibility

Before we can take the final steps to implement the main decision procedure for Presburger arithmetic, we need to make a slight modification to the language: adding a new relation of divisibility by a constant. To see why this is needed, consider an assertion such as $\exists x. \exists y. \exists z. 4x = 3y + 7z$. To decide this we may need to consider all possible integers for $x, y, z$. However, we can rewrite this to the following equivalent assertion:

$\exists x. \exists y. \exists z. (x = y + z) \land (4 \mid x) \land (3 \mid y) \land (7 \mid z)$

where the form $k \mid e$ stands for “$k$ (a constant) divides $e$ (an expression)”. This change allows us to check only every fourth integer for $x$, every third for $y$, and every seventh for $z$, cutting down the possibilities by a lot. Of course for all integers this doesn't really improve much, but we will see later a different way to make the set of possibilities for each variable finite, where this reduction in the search space really helps. Moreover, such divisibility constraints are critical to finitizing the domains.

We will now add divisibility ($\mid$) and non-divisibility ($\nmid$) to the relation symbols. This will require you to modify code you have already written. Fortunately, the OCaml type-checker will helpfully point out exactly what bits of code must adapt to this change.

  1. Extend the type rel to:
    (** relations between integer expressions *)
    type rel = Eq | Lt | Leq | Gt | Geq | Neq | Div | Ndiv
  2. Modify the printer in the Syntax module following this table.
    operator kind priority string form
    $\mid$, $\nmid$ binary-nonassociative 4 " : ", " % "
  3. Update both traverse and map classes to match. Remember that map should eagerly evaluate assertions that are free of variables, so for any assertion of the form $k \mid e$ or $k \nmid e$ where both $k$ and $e$ are already integer constants, you should evaluate it $\top$ or $\bot$.
  4. Add the following traverser that makes sure that the first argument to a Div or Ndiv relation is an integer constant different from zero.
    exception Malformed_div of boolex
    
    let check_div : unit traverse = (* ... *)
  5. Your strength reducer does not need to change. However, you need to update the negation normalizer by incorporating the following equivalences.
    $\lnot (k \mid e)$$\equiv$$k \nmid e$ $\lnot (k \nmid e)$$\equiv$$k \mid e$
Submit syntax.ml

Le nom du fichier à déposer
Il faut se connecter avant de pouvoir déposer

Optional: Normalizing arithmetic

Suppose we have a variable $x$ and a quantifier-free assertion $e$ where $x$ is free, which we will write as $e[x]$. If we want to solve for $x$, it would be very useful if we could make sure that every arithmetic expression where $x$ occurs is put in one of the following four forms.

$x < e$ $e < x$ $k \mid x + e$ $k \nmid x + e$

Collecting terms involving $x$

We will get there in two steps. In the first step we will rewrite $e[x]$ so that all relations involving $x$ are in one of the following four forms.

$j\cdot x < e$ $e < j\cdot x$ $k \mid j\cdot x + e$ $k \nmid j\cdot x + e$

where $j$ is some integer constant $> 0$. Write this as the following function.

val collect_terms : ident map

You should use the traverse and map objects you defined earlier, possibly writing new ones if needed. Here is a suggestion for how to proceed for every Rel node:

Submit syntax.ml

Le nom du fichier à déposer
Il faut se connecter avant de pouvoir déposer

Resetting the coefficient of $x$ to $1$

Next, we need to change the coefficient of $x$ to 1 throughout $e[x]$. Here is the procedure.

  1. Let $j_1, \dotsc, j_n$ be all the coefficients of $x$ that occur in $e[x]$. Set $\delta = lcm(j_1, \dotsc, j_n)$ to be the least common multiple of the $j_i$, i.e., the smallest integer that is divisible by all the $j_i$.
  2. Change the relations so that the coefficients of $x$ become $\delta$. For instance, you would change $j\cdot x < e$ to $\delta\cdot x < (\delta / j) \cdot e$. Call this expression $e'[\delta\cdot x]$, that is, it is an expression where every occurrence of $x$ has coefficient $\delta$.
  3. Finally, change $e'[\delta\cdot x]$ to $e'[x] \land (\delta \mid x)$. to get an expression equivalent to $e[x]$, but where every coefficient of $x$ is 1.

Here is an example:

$(2 x < z + 6) \land (y - 1 < 3x) \land (4 \mid 5x + 1)$
becomes after step 2 (because $lcm(2,3,5) = 30$):
$(30 x < 15 z + 90) \land (10y - 10 < 30x) \land (24 \mid 30x + 6)$
becomes after step 3:
$(x < 15 z + 90) \land (10y - 10 < x) \land (24 \mid x + 6) \land (30 \mid x)$

Write this as the following function:

val reset_coefficient : ident map

Optional: Cooper's algorithm

We now have all the tools we need to solve the original problem of deciding whether a closed assertion of Presburger arithmetic is true or false. To build the full solution, let us first consider the case of just deciding $\exists x. e[x]$ where $e[x]$ is a quantifier-free assertion where $x$ is free. (If $x$ is not free, then miniscoping would get rid of the $\exists$.)

First rewrite $e[x]$ using the normalization routines in the previous section into an equivalent form $f[x]$ where all relations involving $x$ are of one of the following forms, which we label as forms A ... D.

(A) (B) (C) (D)
$x < e$ $e < x$ $k \mid x + e$ $k \nmid x + e$

Let $\delta = lcm\{k : k \text{ is the divisor in a (C) or (D) form relation in } f[x] \}$.

Case: $f[x]$ is satisfiable with infinitely many decreasing $x$

In other words, if $f[k]$ is true for some $k$, then there is a $j < k$ such that $f[j]$ is also true.

  1. Let $f_\infty[x]$ be formed from $f[x]$ by replacing:
    • relations of form (A) with $\top$
    • relations of form (B) with $\bot$
  2. Let $f_1$ be the formula:

    $f_\infty[1] \lor f_\infty[2] \lor \dotsm \lor f_\infty[\delta]$

    Note that $f_1$ is a formula where $x$ does not occur any more.

Compute $f_1$ from $f[x]$ in terms of this function:

val no_smallest : ident -> boolex -> boolex
Submit syntax.ml

Le nom du fichier à déposer
Il faut se connecter avant de pouvoir déposer

Case: there is a smallest value for $x$ satisfying $f[x]$

  1. Let $B = \{ e : e < x \text{ is a (B) relation in } f[x] \}$.
  2. Let $g[j]$ be the formula:

    $\bigvee_{b \in B} f[b + j]$

  3. Let $f_2$ be the formula:

    $g[1] \lor g[2] \lor \dotsm \lor g[\delta]$

Compute $f_2$ from $f[x]$ in terms of this function:

val smallest : ident -> boolex -> boolex
Submit syntax.ml

Le nom du fichier à déposer
Il faut se connecter avant de pouvoir déposer

Putting it together

Now, the main procedure is an instance of map where we remove all quantifiers starting from the innermost quantifiers and going outwards.

  1. For a boolean expression $\exists x. e_0$, first rewrite (recursively) $e_0$ into a quantifier-free form $e[x]$. Then, replace the whole expression with: $f_1 \lor f_2$, where $f_1$ and $f_2$ are computed as in the two cases above.
  2. For a boolean expression $\forall x. e_0$, first change it to $\lnot \exists x. \lnot e_0$, then handle the inner $\exists$ as in the previous case.

Write the procedure in terms of the following object and the derived function

val eliminate_quantifiers : unit map

val decide : boolex -> bool

Note: you will have to eagerly apply all the normalization and rewriting functions to make sure that the preconditions required to compute $f_1$ and $f_2$ are met.

Submit syntax.ml

Le nom du fichier à déposer
Il faut se connecter avant de pouvoir déposer

A walk through of an example

Here are the steps of the procedure for an example $\exists x. e[x]$ where $e[x]$ is:

$((3x + 1 < 10) \lor (7x - 6 > 7)) \land (2 \mid x)$

First, we collect all the occurrences of $x$ and set its coefficients to the same value, giving:

$((21x < 63) \lor (39 < 21x)) \land (42 \mid 21 x)$

After setting all coefficients of $x$ to 1 we get:

$((x < 63) \lor (39 < x)) \land (42 \mid x) \land (21 \mid x)$

From this, we see that $f_\infty[x]$ is:

$(\top \lor \bot) \land (42 \mid x) \land (21 \mid x)\ \equiv\ (42 \mid x) \land (21 \mid x)$

Note that $\delta = lcm(21, 42) = 42$, so we have:

$f_1 = \bigvee_{j = 1}^{42} ((42 \mid j) \land (21 \mid j))$

On the other hand, the set $B$ is $\{39\}$, so we have:

$f_2 = \bigvee_{j = 1}^{42} ((39 + j < 63) \lor (39 < 39 + j)) \land (42 \mid 39 + j) \land (21 \mid 39 + j)$

Hence, since $\exists x. e[x] \equiv f_1 \lor f_2$, and because $f_1$ is obviously true since $(42 \mid 42) \land (21 \mid 42)$, it is the case that $\exists x. e[x]$ is true.