Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Heapster] Changes pre 9/3/21 demo #1448

Merged
merged 5 commits into from
Sep 3, 2021
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
301 changes: 300 additions & 1 deletion heapster-saw/README.md
Original file line number Diff line number Diff line change
@@ -1,2 +1,301 @@
# heapster-saw
Implementation of the Heapster type system of separation types inside SAW, including a translation to SAW core

Implementation of the Heapster type system of separation types inside SAW,
including a translation to SAW core.

This remainder of this README contains general information about using
m-yac marked this conversation as resolved.
Show resolved Hide resolved
Heapster about the `examples` directory contained here.

## Building

You will need to follow the instructions in the top-level README to download
or build a SAW binary, of which Heapster is a part.

If you intend to interact with any of Heapster's Coq output, you will also need
to follow the instructions in the README in the `saw-core-coq` subdirectory.
Specifically, after installing the dependencies, you will need to run the
following (from this directory):
```bash
cd ../saw-core-coq/coq
make
```

## Using Heapster

This section will walk through the process of using Heapster on some code in a
C file. This will involve generating LLVM bitcode from your file, writing a SAW
script to type-check your code with Heapster, and writing a Coq file to prove
things about the generated functional specification(s).

### Generating LLVM bitcode

The input to Heapster is an LLVM bitcode (`.bc`) file. To generate LLVM
bitcode from a C file, run the following:
```bash
clang -emit-llvm -g -c my_file.c
```
Be aware that the resulting bitcode may depend on your `clang` version and your
operating system. In turn, this means the Heapster commands in your SAW script
and the proofs in your Coq file may also be dependent on how and where the
bitcode is generated. For this reason, we provide bitcode files for every
example in the `examples` directory.

### Type-checking using a SAW script

To use Heapster on your generated bitcode, you can either write a SAW script
(e.g. `my_file.saw`) or start a SAW interactive session. This document will
use writing a SAW script as an example, but the commands are the same in
either case.

To see the documentation for any of the commands mentioned here, type `:help`
followed by the name of the command in a SAW interactive session.

In order to use Heapster commands, you must begin with:
```
enable_experimental;
```
You can then load your example bitcode file into Heapster using the following:
```
env <- heapster_init_env "my_file" "my_file.bc";
```
This file will not go into detail about the process of actually type-checking a
function with Heapster. Instead, we will briefly discuss a few of the main
commands used, as well as some examples.

One of the simplest Heapster commands is `heapster_define_perm`, which defines
a new named permission which can then be used in Heapster types. As an
example, here's a permission which describes an 64-bit integer value:
m-yac marked this conversation as resolved.
Show resolved Hide resolved
```
heapster_define_perm
env
"int64"
" "
"llvmptr 64"
"exists x:bv 64.eq(llvmword(x))";
```
It's first argument is the Heapster environment, the second is its name, the
m-yac marked this conversation as resolved.
Show resolved Hide resolved
third is its arguments (of which there are none), the fourth is its type, and
m-yac marked this conversation as resolved.
Show resolved Hide resolved
the fifth is its definition.

To define permissions which can describe unbounded data structures, you can use
the `heapster_define_recursive_perm` command. As an example, here is how to
describe a linked list of 64-bit words using this command:
```
heapster_define_recursive_perm
env
"List64"
"rw:rwmodality"
"llvmptr 64"
["eq(llvmword(0))", "ptr((rw,0) |-> int64<>) * ptr((rw,8) |-> List64<rw>)"]
"List (Vec 64 Bool)"
"foldList (Vec 64 Bool)"
"unfoldList (Vec 64 Bool)";
```
Its first four arguments are the same as for `heapster_define_perm`, its
fifth argument contains its different inductive cases (in this case, a `List64`
is either a null pointer, or a pointer to an `Int64` and another `List64`),
and its final three arguments are its translation into SAW core. Here the
SAW core definitions used are from the SAW core prelude, but if you need new
SAW core definitions, you will need to use the following command instead of
`heapster_init_env`:
```
env <- heapster_init_env_from_file "my_file.sawcore" "my_file.bc";
```

Finally, to actually type-check a function you can use
`heapster_typecheck_fun`. Here's an example using the `is_elem` function from
m-yac marked this conversation as resolved.
Show resolved Hide resolved
`examples/linked_list.c` and the permissions we defined above:
```
heapster_typecheck_fun
env
"is_elem"
"().arg0:int64<>, arg1:List64<R> -o arg0:true, arg1:true, ret:int64<>";
```
The heapster type given has three parts, the context of ghosts, the input
m-yac marked this conversation as resolved.
Show resolved Hide resolved
permissions, and the output permissions. Here there are no ghosts used, so the
context is empty (the `()` at the start). The input permissions state that the
two arguments to `is_elem` are an `int64` and a read-only `List64`,
respectively. The output permissions state that the two arguments are
unconstrained after returning (the vacuous `true` permissions) and that the
returned value is an `int64`.

Note that for more complicated examples, usually examples involving loops,
the `heapster_block_entry_hint` command will also need to be used in order for
the `heapster_typecheck_fun` command to succeed. In the case of functions with
loops, this hint corresponds to a loop invariant. Additionally, such examples
will also often require your unbounded data structure to be defined as a
reachability permission, using `heapster_define_reachability_perm`, instead of
just as a recursive permission. See `examples/iter_linked_list.saw` for some
examples of using the commands mentioned in this paragraph.

Once you're finished, use the following command to export all the type-checked
functions in the current environment as functional specifications in Coq. By
convention, we add a `_gen` suffix to the filename.
```
heapster_export_coq env "my_file_gen.v";
```

### Verifying in Coq

To interact with the generated Coq code, you will first need to set up your Coq
environment. Make a file named `_CoqProject` with the following contents,
where `PATH_TO_SAW` is replaced by the path to the top-level `saw-script`
directory:
```
-Q PATH_TO_SAW/saw-core-coq/coq/generated/CryptolToCoq CryptolToCoq
-Q PATH_TO_SAW/saw-core-coq/coq/handwritten/CryptolToCoq CryptolToCoq

my_file_gen.v
```
This file is already set up for the examples in the `examples` directory.

By convention, the file which contains proofs about functions in your file
has the `_proofs` suffix added (e.g. `my_file_proofs.v`). This file should
also be added to your `_CoqProject`.

In your `_proofs` file, you will want to import at least the following:
```coq
From CryptolToCoq Require Import SAWCorePrelude.
From CryptolToCoq Require Import CompMExtra.
```

You can then either load your file using `Load` (e.g. `Load "my_file_gen.v"`)
or make `-Q . Namespace`, where `Namespace` is whatever string you want, the
third line of your `_CoqProject` and use `Import` (e.g.
`Require Import Namespace.my_file_gen`). You will then need to import the
module from your generated file as well as the SAW core prelude module – all
together these lines should look like:
```coq
Load "my_file_gen.v".
Import my_file.
Import SAWCorePrelude.
```

Often the first thing you want to verify is that the generated specification
has no errors. Errors can appear because of errors in the LLVM bitcode or
because of errors in the type-checking process. Having errors of the first kind
in the generated specification is not an issue, but it must be proved that they
are never reached. Sometimes, a precondition and/or loop invariant must be
added in order for such a proof to be completed, see
`examples/arrays_proofs.v` for an example. In a generated spec, these errors
often look like the following:
```
errorM "Failed Assert at arrays.c:19:14"
```
Seeing an error of the second kind in your generated specification means you
need to revise the types you wrote in your SAW script. These errors are
usually quite distinctive in the generated Coq code, for example:
```
errorM "At is_elem.c:26:12 ($10 = call $9($3, $7);)
Regs: $9 = fn @ , $3 = ptr @ , $7 = ptr4 @
Input perms: top_ptr:eq(LLVMword ghost_bv),
top_ptr1:ptr((R,0) |-> int64<>)*ptr((R,8) |-> List64<R>),
ghost_frm:llvmframe [C[&l]:8, C[&x]:8, local_ptr:8],
fn:().
arg1:int64<>, arg:List64_nonnull<R>
-o
arg1:true, arg:true, ret:int64<>, ptr:eq(LLVMword ghost_bv),
ptr4:eq(ptr3), local_ptr:ptr((W,0) |-> true),
C[&x]:ptr((W,0) |-> eq(ptr)), C[&l]:ptr((W,0) |-> eq(ptr1)),
ghost_bv:true, ptr3:List64<R>, ptr1:eq(top_ptr1)
Could not prove (). ptr:int64<>, ptr4:List64_nonnull<R>

proveVarImplH: Could not prove
ptr3:eq(LLVMword 0)
-o
(). ptr((R,0) |-> int64<>)*ptr((R,8) |-> List64<R>)"
```

To prove no-errors you will prove that your generated specification **refines**
the specification `noErrorsSpec`, which is defined as follows:
```coq
Definition noErrorsSpec : CompM A := existsM (fun x => returnM x).
```

For example, the statement of no-errors for `is_elem` is the following:
```coq
Lemma no_errors_is_elem : refinesFun is_elem (fun _ _ => noErrorsSpec).
```

You can think of specifications in the `CompM` monad (such as `noErrorsSpec`
and everything Heapster generates) as sets of possible executions, and the
refinement relation as the subset relation on these sets. In this way,
`noErrorsSpec` represents the set of all computations which return some pure
value (and thus cannot contain any calls to `errorM : string -> CompM A`), and
thus, proving that a specification refines `noErrorsSpec` represents proving
that it always returns some pure value.

For the proof of `no_errors_is_elem`, we simply need to unfold both sides of
the refinement, then call the automated tactic `prove_refinement`, imported
from `CompMExtra`:
```coq
Proof.
unfold is_elem, is_elem__tuple_fun, noErrorsSpec.
prove_refinement.
Qed.
```

The `prove_refinement` tactic is not guaranteed to solve all goals. Sometimes,
the goals it leaves over can be completed with simple Coq tactics, and other
times the left over goals can help you discover that the lemma you're trying
to prove is false, and therefore you need to return to type-checking, or
add/revise your precondition and/or loop invariant. In this case, the tactic
was able to solve all goals.

Note that when the generated specification has functions bound by a `letRecM`,
there must be a `letRecM` with matching shape on the right of a refinement.
To help set this up, you can use the `prove_refinement_match_letRecM_l` tactic,
which will generate as many goals as there are functions needed for the
`letRecM` on the right hand side. As an example, here is an excerpt from
`examples/iter_linked_list_proofs.v`, where a single matching `letRecM`
function is needed:
```coq
Lemma no_errors_is_elem : refinesFun is_elem (fun _ _ => noErrorsSpec).
Proof.
unfold is_elem, is_elem__tuple_fun.
prove_refinement_match_letRecM_l.
- exact (fun _ _ => noErrorsSpec).
unfold noErrorsSpec.
prove_refinement.
Qed.
```
It is good practice to defer unfolding the right hand side until after
the `letRecM` functions have been added to avoid `prove_refinement` getting
ahead of itself.

Check out the examples directory for more examples of what sort of things you
can prove about generated specifications.

## Examples

The `examples` directory contains lots of varied examples of the entire process
described above. To run all of the SAW scripts and Coq proofs, you can simply
run `make`, assuming that all `*_gen.v` files that may have been left from a
previous run have been deleted (alternately, you can first run `touch *.saw`).

Here is a brief overview of the different examples.
- `linked_list` - This is a good set of examples to look at first.
`linked_list.saw` introduces the basics of Heapster type-checking, and
`linked_list_proofs.v` contains lots of varied proofs all of which are
relatively very easy to understand.
- `iter_linked_list` - This is a good set of examples to look at after the
above, as they are variants of the above, just written with loops instead of
recursion. These examples introduce reachability permissions, block entry
hints, and preconditions in Coq proofs.
- `loops` - Some more examples of functions with loops, but which do not
involve reachability permissions or preconditions. This set of examples
introduces functions which depend on other functions, see `loops_proofs.v`.
- `arrays` - This set of examples involves using multiple types of hints during
type-checking as well as preconditions, loop invariants, and lots of user
input post-`prove_refinement` on the Coq side.
- `mbox` - This is a set of examples based on "real-world" code, i.e. code
not written for the intent of testing Heapster. As such, not every function
is complete, and the most complicated examples can be found in this file.
- `iso_recursive` - This set of examples uses an experimental feature where
the SAW core definitions used when defining recursive permissions are
set automatically.

Additionally, `clearbufs`, `xor_swap`, `memcpy`, and `string_set` contain some
minimal examples of type-checking various simple functions.

Not included in this list are any of the rust examples.
4 changes: 4 additions & 0 deletions heapster-saw/examples/iter_linked_list.saw
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,10 @@ heapster_define_perm env "int64" " " "llvmptr 64" "exists x:bv 64.eq(llvmword(x)

heapster_define_reachability_perm env "ListF" "X:perm(llvmptr 64), l:lifetime, rw:rwmodality, y:llvmptr 64" "llvmptr 64" "[l]ptr((rw,0) |-> X) * [l]ptr((rw,8) |-> ListF<X,l,rw,y>)" "List_def" "foldList" "unfoldList" "appendList";

heapster_block_entry_hint env "is_elem" 3 "top_ptr:llvmptr 64, top_ptr1:llvmptr 64" "ghost_frm:llvmframe 64, ghost_ptr:llvmptr 64" "top_ptr:int64<>, top_ptr1:true, arg0:ptr((W,0) |-> true), arg1:ptr((W,0) |-> eq(top_ptr)), arg2:ptr((W,0) |-> eq(ghost_ptr)), ghost_ptr:ListF<int64<>,always,R,llvmword(0)>, ghost_frm:llvmframe [arg2:8, arg1:8, arg0:8]";

heapster_typecheck_fun env "is_elem" "().arg0:int64<>, arg1:ListF<int64<>,always,R,llvmword(0)> -o arg0:true, arg1:true, ret:int64<>";

heapster_block_entry_hint env "incr_list" 3 "top1:llvmptr 64" "frm:llvmframe 64,ghost:llvmptr 64" "top1:ListF<(exists y:(bv 64).eq(llvmword(y))),always,W,ghost>, arg0:ptr((W,0) |-> eq(ghost)),ghost:ListF<(exists y:(bv 64).eq(llvmword(y))),always,W,llvmword(0)>,frm:llvmframe [arg0:8]";

heapster_typecheck_fun env "incr_list" "().arg0:ListF<(exists y:(bv 64).eq(llvmword(y))),always,W,llvmword(0)> -o arg0:ListF<(exists y:(bv 64).eq(llvmword(y))),always,W,llvmword(0)>, ret:true";
Expand Down
68 changes: 68 additions & 0 deletions heapster-saw/examples/iter_linked_list_proofs.v
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,74 @@ Proof.
simpl; f_equal; eauto.
Qed.

Hint Rewrite appendList_Nil_r : refinesM.


Lemma no_errors_is_elem :
refinesFun is_elem (fun _ _ => noErrorsSpec).
Proof.
unfold is_elem, is_elem__tuple_fun.
prove_refinement_match_letRecM_l.
- exact (fun _ _ => noErrorsSpec).
unfold noErrorsSpec.
prove_refinement.
Qed.

Definition is_elem_pure (x:bitvector 64) (l:list (bitvector 64))
: bitvector 64 :=
list_rect (fun _ => bitvector 64)
(intToBv 64 0)
(fun y l' rec => if bvEq 64 y x then intToBv 64 1 else rec) l.

Lemma is_elem_pure_ref :
refinesFun is_elem (fun x l => returnM (is_elem_pure x l)).
Proof.
unfold is_elem, is_elem__tuple_fun.
prove_refinement_match_letRecM_l.
- exact (fun x l => returnM (is_elem_pure x l)).
unfold is_elem_pure.
prove_refinement.
rewrite appendList_Nil_r; reflexivity.
Qed.

Definition is_elem_spec (x:bitvector 64) (l:list (bitvector 64))
: CompM (bitvector 64) :=
orM (assertM (List.In x l) >> returnM (intToBv 64 1))
(assertM (~ List.In x l) >> returnM (intToBv 64 0)).

Lemma is_elem_spec_ref : refinesFun is_elem is_elem_spec.
Proof.
unfold is_elem, is_elem__tuple_fun, is_elem_spec.
prove_refinement_match_letRecM_l.
- exact is_elem_spec.
unfold is_elem_spec.
prove_refinement.
(* The a0 = [] case. *)
- continue_prove_refinement_right.
easy.
(* The a0 = (s1 :: a1) case where a = s1. *)
- continue_prove_refinement_left.
left; easy.
(* The a0 = (s1 :: a1) case where a <> s1, and we inductively assume
the left assertion of our specification (in the letRec) *)
- continue_prove_refinement_left.
rewrite appendList_Nil_r in e_assert.
right; easy.
(* The a0 = (s1 :: a1) case where a <> s1, and we inductively assume
the right assertion of our specification (in the letRec) *)
- continue_prove_refinement_right.
rewrite appendList_Nil_r in e_assert.
intros []; easy.
(* The a0 = (s1 :: a1) case where a <> s1, and we inductively assume
the left assertion of our specification (out of the letRec) *)
- continue_prove_refinement_left.
rewrite appendList_Nil_r in e_assert; easy.
(* The a0 = (s1 :: a1) case where a <> s1, and we inductively assume
the right assertion of our specification (out of the letRec) *)
- continue_prove_refinement_right.
rewrite appendList_Nil_r in e_assert; easy.
Qed.


Definition incr_list_invar :=
@list_rect (bitvector 64) (fun _ => Prop) True
Expand Down