Thomas Letan

PhD Student in Computer Science

Playing Around With Program and Dependent Types

For the past few weeks, I have been playing around with Program. The recent publications[1] [2] of Robert C. Martin (author of The Clean Code) have convinced me to write here something about it. I have been trying hard enough to use Program to build myself an “intuition”, meaning I am not randomly typing tactics until C-c C-n works. This text is an attempt to restore what I’ve learned.

Note: the code works with Coq 8.6, but not with Coq 8.5 (at least, the extract definition does not). The complete gist can be found here (8.5) or here (8.6). I want to apologize if this text lacks explanation sometimes. I found it very difficult to write, probably because I have struggled a lot of this matter and tried a lot of different strategies. I really wanted to finish it, even if it is not perfect, because I think it can be a good starting point for a conversation.

The Theory

If I had to explain Program, I would say Program is the heir of the refine tactic. It gives you a convenient way to embed proofs within functional programs that are supposed to fade away during code extraction. But what do I mean when I say “embed proofs” within functional programs? I found two ways to do it.

Invariant

First, we can define a record with one or more fields of type Prop. By doing so, we can constrain the values of other fields. Put another way, we can specify invariant for our type. For instance, in SpecCert, I have defined the memory controller’s SMRAMC register as follows:

(* see: specs/intel/5th-gen-core-family-datasheet-vol-2.pdf *)
Record SmramcRegister := {
  d_open: bool;
  d_lock: bool;

  (* following proofs are justified by:
   * “When D_LCK=1, then D_OPEN is reset to 0 and all
   * writeable fields in this register are locked (become RO).”
   *)
  lock_is_close: d_lock = true -> d_open = false;
}.

So lock_is_closed is an invariant I know each instance of SmramcRegister will have to comply with, because every time I will construct a new instance, I will have to prove lock_is_closed holds true. For instance:

Definition lock
           (reg: SmramcRegister)
  : SmramcRegister.
  refine ({| d_open := false
           ; d_lock := true
           |}).
  (* goal: true = true -> false = false *)
  trivial.
Defined.

Using tactics to define functions is not straightforward. In particular, as gasche noticed on reddit, it is hard to know what the extracted code will look like. Also, you end up mixing definition and proposition proofs.

From that perspective, Program helps. Indeed, the lock function can also be defined as follows:

Require Import Program.

Program Definition lock'
        (reg: SmramcRegister)
  : SmramcRegister :=
  {| d_open := false
   ; d_lock := true
   |}.

Pre and Post Conditions

Another way to “embed proofs in a program” is by specifying pre- and post-conditions for its component. In Coq, this is done using Sigma-types. I have already written a while ago about Sigma-types.

On the one hand, a precondition is a proposition a function input has to satisfy in order for the function to be applied. For instance, a precondition for head : list A -> A the function that returns the first element of a list l requires l to contain at least one element. We can write that using a Sigma-type.

Definition head
           {A: Type}
           (l: list A | l <> nil)
  : A.

On the other hand, a post condition is a proposition a function output has to satisfy in order for the function to be correctly implemented. In this way, head should in fact return the first element of l and not something else.

We can write such a specification without Program.

Definition head
           {A: Type}
           (l: list A | l <> nil)
  : { a: A | exists l', cons a l' = (proj1_sig l) }.

Because { l: list A | l <> nil } is not the same as list A, we cannot just compare l with cons a l'. One benefit on Program is to deal with it using implicit coercion, so we can write:

Program Definition head
           {A: Type}
           (l: list A | l <> nil)
  : { a: A | exists l', cons a l' = l }.

Note that for its type inference, it uses the left operand of = so you cannot write l = cons a l'.

Now that head have been specified, we have to implement it. Without Program, one can use refine, whereas using Program we can just write regular ML-like code:

Program Definition head
        {A: Type}
        (l: list A | l <> nil)
  : { a: A | exists l', cons a l' = l } :=
  match l with
  | cons a l' => a
  | nil       => !
  end.
Next Obligation.
  exists l'.
  reflexivity.
Qed.

I want to highlight several things here:

  • We return a (of type A) rather than a Sigma-type, then Program is smart enough to wrap it. To do so, it tries to prove the post condition and because it fails, we have to do it ourselves (this is the Obligation we solve after the function definition.)
  • The nil case is absurd regarding the precondition, we tell Coq that using the bang (!) symbol.

We can have a look at the extracted code:

(** val head : 'a1 list -> 'a1 **)

let head = function
| Nil -> assert false (* absurd case *)
| Cons (a, _) -> a

The implementation is pretty straightforward, but the pre- and post conditions have faded away. Also, the absurd case is discarded using an assertion. This means one thing: head should not be used directly from the Ocaml world. “Interface” functions have to be total.

The Practice

I have challenged myself to build a strongly specified library. My goal was to define a type vector : nat -> Type -> Type such as vector A n is a list of n instance of A.

Inductive vector (A: Type) : nat -> Type :=
| vcons  {n:nat} : A -> vector A n -> vector A (S n)
| vnil : vector A 0.

Arguments vcons [A n] _ _.
Arguments vnil [A].

I had three functions in mind: take, drop and extract. I learned few lessons. The first one is maybe the most important: do not use Sigma-types, Program and dependent-types together. From my point of view, Coq is not yet ready for this. Maybe it is possible to make those three work together, but I have to admit I did not find out how. As a consequence, my preconditions are defined as extra arguments.

To be able to specify the post conditions my three functions and some others, I first defined nth:

Program Fixpoint nth
        {A: Type}
        {n: nat}
        (v: vector A n)
        (i: nat)
        {struct v}
  : option A :=
  match v with
  | vcons x r => match i with
                 | 0 => Some x
                 | S i' => nth r i'
                 end
  | vnil => None
  end.

With nth, it is possible to give a very precise definition of take:

Program Fixpoint take
        {A:      Type}
        {n:      nat}
        (v:      vector A n)
        (e:      nat)
        (Hbound: e <= n)
        {struct v}
  : { v': vector A e | forall i : nat, i < e -> nth v' i = nth v i } :=
  match e with
  | 0 => vnil
  | S e' => match v with
            | vcons x r => vcons x (take r e' _)
            | vnil => !
            end
  end.
Next Obligation.
  apply (Nat.nlt_0_r i) in H.
  destruct H.
Qed.
Next Obligation.
  apply (le_S_n e' wildcard').
  exact Hbound.
Qed.
Next Obligation.
  induction i.
  + reflexivity.
  + apply e0.
    apply lt_S_n in H.
    exact H.
Defined.
Next Obligation.
  apply Nat.nle_succ_0 in Hbound.
  exact Hbound.
Qed.

I’ve learned another lesson here. I wanted to define the post condition as follows: { v': vector A e | forall (i : nat | i < e), nth v' i = nth v i }. I quickly realized it is a very bad idea. The goals and hypotheses become very hard to read and to use.

(* Extraction Implicit take [A n].
   Extraction take *)

(** val take : 'a1 vector -> nat -> 'a1 vector **)

let rec take v = function
| O -> Vnil
| S e' ->
  (match v with
   | Vcons (_, x, r) -> Vcons (e', x, (take r e'))
   | Vnil -> assert false (* absurd case *))

Then I could tackle drop in a very similar manner:

Program Fixpoint drop
        {A:   Type}
        {n:   nat}
        (v:  vector A n)
        (b:   nat)
        (Hbound: b <= n)
  : { v': vector A (n - b) | forall i, i < n - b -> nth v' i = nth v (b + i) } :=
  match b with
  | 0 => v
  | S n => (match v with
           | vcons _ r => (drop r n _)
           | vnil => !
           end)
  end.
Next Obligation.
  rewrite <- minus_n_O.
  reflexivity.
Qed.
Next Obligation.
  induction n;
    rewrite <- eq_rect_eq;
    reflexivity.
Qed.
Next Obligation.
  apply le_S_n in Hbound.
  exact Hbound.
Qed.
Next Obligation.
  apply Nat.nle_succ_0 in Hbound.
  destruct Hbound.
Defined.

The proofs are easy to write, and the extracted code is exactly what one might want it to be:

(* Extraction Implicit drop [A n].
   Extraction drop *)
(** val drop : 'a1 vector -> nat -> 'a1 vector **)

let rec drop v = function
| O -> v
| S n ->
  (match v with
   | Vcons (_, _, r) -> drop r n
   | Vnil -> assert false (* absurd case *))

But Program really shone when it comes to implementing extract. I just had to combine take and drop.

Program Definition extract
        {A:      Type}
        {n:      nat}
        (v:      vector A n)
        (b e:    nat)
        (Hbound: b <= e <= n)
  : { v': vector A (e - b) | forall i, i < (e - b) -> nth v' i = nth v (b + i) } :=
  take (drop v b _) (e - b) _.
Next Obligation.
  omega.
Qed.
Next Obligation.
  omega.
Qed.
Next Obligation.
  remember (drop v b (extract_obligation_1 n b e (conj l l0))) as vd.
  remember (take (`vd) (e - b) (extract_obligation_2 n b e (conj l l0))) as vt.
  induction vd as [v' Hv'].
  rewrite <- (Hv' i).
  simpl in vt.
  induction vt as [v'' Hv''].
  simpl.
  apply Hv''.
  exact H.
  omega.
Qed.

To be honest, I think the last proof can be better, but anyway, it is easy to convince ourselves from the extract definition that our implementation is bug-free.

I was pretty happy, so I tried some more. Each time, using nth, I managed to write a precise post condition and to prove it holds true. For instance, given map to apply a function f to each element of a vector v:

Program Fixpoint map
        {A B: Type}
        {n:   nat}
        (v:   vector A n)
        (f:   A -> B)
  : { v': vector B n | forall i, nth v' i = option_map f (nth v i) } :=
  match v with
  | vnil => vnil
  | vcons a v => vcons (f a) (map v f)
  end.
Next Obligation.
  induction i.
  + reflexivity.
  + apply e.
Defined.

Once again, the extracted code is what I wanted it to be once you tell Coq that the vector size is implicit.

I also managed to specify and write append:

Program Fixpoint append
        {A:   Type}
        {n m: nat}
        (v:   vector A n)
        (v':  vector A m)
  : { v'' : vector A (n + m) | forall i, (i < n /\ nth v'' i = nth v i)
                                      \/ (n <= i /\ nth v'' i = nth v' (i - n))} :=
  match v with
  | vnil => v'
  | vcons a v => vcons a (append v v')
  end.
Next Obligation.
  destruct (lt_dec i 0).
  * omega.
  * right.
    split; [ omega |].
    rewrite Nat.sub_0_r.
    reflexivity.
Qed.
Next Obligation.
  destruct (lt_dec i (S wildcard')).
  left.
  split; [ exact l |].
  induction i.
  + reflexivity.
  + assert (i < wildcard' /\ nth x i = nth v i
         \/ wildcard' <= i /\ nth x i = nth v' (i - wildcard')) by (apply (o i)).
    destruct H as [[Hi Hn]|[Hi Hn]].
    * exact Hn.
    * omega.
  + right.
    split.
    omega.
    induction i; [ omega |].
    assert (i < wildcard' /\ nth x i = nth v i
         \/ wildcard' <= i /\ nth x i = nth v' (i - wildcard')) by (apply (o i)).
    destruct H as [[Hi Hn]|[Hi Hn]].
    * omega.
    * rewrite Hn.
      rewrite Nat.sub_succ.
      reflexivity.
Defined.

This time, the proofs are bigger, but it is not the main issue. Because, this time, I was not happy with the extracted code, as it was not possible to make the size of the second vector implicit. It is quite normal, because m is used to construct the resulting vector. It is hidden by Coq, but vcons take the size of the inner vector, so when we append two vectors, we need to add m to each vcons of v. I think it is possible to find our way out. For instance, we could probably define length such that:

Program Definition length
        {A: Type}
        {n: nat}
        (v: vector A n)
  : { n': nat | n = n' } :=
  match v with
  | @vcons _ m _ _ => S m
  | vnil => 0
  end.

Then, we should never use the vector size directly, but rather length v.

Finally, I tried to implement map2 that takes a vector of A, a vector of B (both of the same size) and a function f: A -> B -> C and returns a vector of C.

I struggled a lot but could not find a way to use Program to implement it. I had to use tactics to achieve my ends, so I am not very happy about it. However, using a proper configuration, the extracted code looks quite good.

Definition option_app
           {A B: Type}
           (opf: option (A -> B))
           (opx: option A)
  : option B :=
  match opf, opx with
  | Some f, Some x => Some (f x)
  | _, _ => None
end.

Program Fixpoint map2
        {A B C: Type}
        {n:     nat}
        (v:     vector A n)
        (v':    vector B n)
        (f:     A -> B -> C)
        {struct v}
  : { v'': vector C n | forall i, nth v'' i = option_app (option_map f (nth v i)) (nth v' i) } := _.
Next Obligation.
  dependent induction v; dependent induction v'.
  + remember (IHv v' f) as v''.
    inversion v''.
    refine (exist _ (vcons (f a a0) x) _).
    intros i.
    induction i.
    * cbv.
      reflexivity.
    * simpl.
      apply (H i).
  + refine (exist _ vnil _).
    cbv.
    reflexivity.
Qed.

And the extracted code:

(* Extraction Inline map2_obligation_1 solution_left solution_right simplification_heq.
   Extraction map2. *)
(** val map2 : nat -> 'a1 vector -> 'a2 vector -> ('a1 -> 'a2 -> 'a3) -> 'a3 vector **)

let rec map2 n v v' f =
  let rec f0 _ v0 v'0 f1 =
    match v0 with
    | Vcons (n0, y, v1) ->
      (match v'0 with
       | Vcons (_, y0, v2) -> let v'' = f0 n0 v1 v2 f1 in Vcons (n0, (f1 y y0), v'')
       | Vnil -> assert false (* absurd case *))
    | Vnil ->
      (match v'0 with
       | Vcons (_, _, _) -> assert false (* absurd case *)
       | Vnil -> Vnil)
  in f0 n v v' f

Is It Usable?

This post mostly gives the “happy ends” for each function. I think I tried to hard for what I got in return and therefore I am convinced Program is not ready (at least for a dependent type, I cannot tell for the rest). For instance, I found at least one bug in Program logic (I still have to report it). Have a look at the following code:

Program Fixpoint map2
        {A B C: Type}
        {n:     nat}
        (v:     vector A n)
        (v':    vector B n)
        (f:     A -> B -> C)
        {struct v}
  : vector C n :=
  match v with
  | _ => vnil
  end.

It gives the following error:

Error: Illegal application:
The term "@eq" of type "forall A : Type, A -> A -> Prop"
cannot be applied to the terms
 "nat" : "Set"
 "0" : "nat"
 "wildcard'" : "vector A n'"
The 3rd term has type "vector A n'" which should be coercible to "nat".

That being said, it is very promising and I look forward to seeing what it will become in the future. Coming back to the blog posts of Uncle Bob, I am sure this is what he would call “a very Dark Path. »I never read deeply enough a Coq program to see if they use Program and similar approach. I know the only consequent Coq program I ever wrote is SpecCert and it is not a program but a model and its proofs. But I am eager to learn more and to find an opportunity to make this leap.

Related Threads