Skip to content

Commit

Permalink
mir_find_adt, mir_struct_value, and friends
Browse files Browse the repository at this point in the history
This adds supports for MIR struct types, struct values, and ADT definitions in
the MIR backend, piggybacking on the existing support for struct `SetupValue`s.
MIR structs are a fair bit more involved than LLVM structs:

* Unlike in the LLVM backend, where structs are only identified up to their
  field types, MIR structs each have a distinct name. This means that two
  structs with the same field types can still be distinct, and SAW must be
  aware of this difference. As a result, the `XSetupStruct` instance for `MIR`
  uses an `Adt` (algebraic data type) to encode the information about the
  struct name, as well as the types of its fields.
* Building an ADT is somewhat tricky, since each monomorphic instantiation of
  a struct in MIR has its own distinct, autogenerated identifier. The new
  `mir_find_adt` command allows users to look up these autogenerated names by
  supplying a human-readable name for the struct, along with the types used to
  instantiate the type parameters of the struct.
* The `mir_adt` function builds a struct type from an ADT, and the
  `mir_struct_value` function builds a struct value from an ADT and the field
  values. These behave much like `llvm_struct` and `llvm_struct_value` in the
  LLVM backend. Note that we call it "`mir_adt`" rather than "`mir_struct`" to
  be forward-compatible with enums, another type of ADT that we will add
  support for in a future patch.

Some careful handling of `#[repr(transparent)]` structs is required, as they
have a different `TypeShape` than other structs. See the `TransparentShape`
code added in this patch, as well as the `test_mir_verify_struct` test case
which includes a transparent struct example.

This checks off one box in #1859.
  • Loading branch information
RyanGlScott committed Sep 2, 2023
1 parent 4c0985c commit 04b0de5
Show file tree
Hide file tree
Showing 36 changed files with 1,134 additions and 229 deletions.
75 changes: 74 additions & 1 deletion doc/manual/manual.md
Original file line number Diff line number Diff line change
Expand Up @@ -2113,7 +2113,7 @@ implemented include the following:

* MIR specifications that use overrides (i.e., the `[MIRSpec]` argument to
`mir_verify` must always be the empty list at present)
* The ability to construct MIR `struct` or `enum` values in specifications
* The ability to construct MIR `enum` values in specifications
* The ability to specify the layout of slice values

The `String` supplied as an argument to `mir_verify` is expected to be a
Expand Down Expand Up @@ -2229,6 +2229,7 @@ Java types are built up using the following functions:

MIR types are built up using the following functions:

* `mir_adt : MIRAdt -> MIRType`
* `mir_array : Int -> MIRType -> MIRType`
* `mir_bool : MIRType`
* `mir_char : MIRType`
Expand Down Expand Up @@ -2627,6 +2628,78 @@ construct compound values:
* `mir_array_value : MIRType -> [SetupValue] -> SetupValue` constructs an array
of the given type whose elements consist of the given values. Supplying the
element type is necessary to support length-0 arrays.
* `mir_struct_value : MIRAdt -> [MIRValue] -> MIRValue` construct a struct
with the given list of values as elements. The `MIRAdt` argument determines
what struct type to create.

See the "Finding MIR alegraic data types" section for more information on how to
compute a `MIRAdt` value to pass to `mir_struct_value`.

### Finding MIR alegraic data types

We collectively refer to MIR `struct`s and `enum`s together as _algebraic data
types_, or ADTs for short. ADTs have identifiers to tell them apart, and a
single ADT declaration can give rise to multiple identifiers depending on how
the declaration is used. For example:

~~~~ .rs
pub struct S<A, B> {
pub x: A,
pub y: B,
}

pub fn f() -> S<u8, u16> {
S {
x: 1,
y: 2,
}
}

pub fn g() -> S<u32, u64> {
S {
x: 3,
y: 4,
}
}
~~~~

This program as a single `struct` declaration `S`, which is used in the
functions `f` and `g`. Note that `S`'s declaration is _polymorphic_, as it uses
type parameters, but the uses of `S` in `f` and `g` are _monomorphic_, as `S`'s
type parameters are fully instantiated. Each unique, monomorphic instantiation
of an ADT gives rise to its own identifier. In the example above, this might
mean that the following identifiers are created when this code is compiled with
`mir-json`:

* `S<u8, u16>` gives rise to `example/abcd123::S::_adt456`
* `S<u32, u64>` gives rise to `example/abcd123::S::_adt789`

The suffix `_adt<number>` is autogenerated by `mir-json` and is typically
difficult for humans to guess. For this reason, we offer a command to look up
an ADT more easily:

* `mir_find_adt : MIRModule -> String -> [MIRType] -> MIRAdt` consults the
given `MIRModule` to find an algebraic data type (`MIRAdt`). It uses the given
`String` as an identifier and the given MIRTypes as the types to instantiate
the type parameters of the ADT. If such a `MIRAdt` cannot be found in the
`MIRModule`, this will raise an error.

Note that the `String` argument to `mir_find_adt` does not need to include the
`_adt<num>` suffix, as `mir_find_adt` will discover this for you. The `String`
is expected to adhere to the identifier conventions described in the "Running a
MIR-based verification" section. For instance, the following two lines will
look up `S<u8, u16>` and `S<u32, u64>` from the example above as `MIRAdt`s:

~~~~
m <- mir_load_module "example.linked-mir.json";
s_8_16 <- mir_find_adt m "example::S" [mir_u8, mir_u16];
s_32_64 <- mir_find_adt m "example::S" [mir_u32, mir_u64];
~~~~

The `mir_adt` command (for constructing a struct type) and `mir_struct_value`
(for constructing a struct value) commands in turn take a `MIRAdt` as an
argument.

### Bitfields

Expand Down
13 changes: 13 additions & 0 deletions intTests/test_mir_verify_structs/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
all: test.linked-mir.json

test.linked-mir.json: test.rs
saw-rustc $<
$(MAKE) remove-unused-build-artifacts

.PHONY: remove-unused-build-artifacts
remove-unused-build-artifacts:
rm -f test libtest.mir libtest.rlib

.PHONY: clean
clean: remove-unused-build-artifacts
rm -f test.linked-mir.json
1 change: 1 addition & 0 deletions intTests/test_mir_verify_structs/test.linked-mir.json

Large diffs are not rendered by default.

56 changes: 56 additions & 0 deletions intTests/test_mir_verify_structs/test.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
pub struct S1 {
pub x1: u32,
pub y1: u32,
}

pub fn f1(s: S1) -> S1 {
S1 {
x1: s.y1.wrapping_add(1),
y1: s.x1.wrapping_add(2),
}
}

pub fn g(s: &S1) -> S1 {
S1 {
x1: s.y1.wrapping_add(1),
y1: s.x1.wrapping_add(2),
}
}

pub fn h(s: &mut S1) {
let x1 = s.x1;
let y1 = s.y1;
s.x1 = y1.wrapping_add(1);
s.y1 = x1.wrapping_add(2);
}

// Polymorphism

pub struct S2<A, B> {
pub x2: A,
pub y2: B,
}

pub fn f2(s: S2<u32, u32>) -> S2<u32, u32> {
S2 {
x2: s.y2.wrapping_add(1),
y2: s.x2.wrapping_add(2),
}
}

pub struct S3(u32, u32);

pub fn f3(s: S3) -> S3 {
match s {
S3(x3, y3) => S3(y3.wrapping_add(1), x3.wrapping_add(2)),
}
}

#[repr(transparent)]
pub struct S4(u32);

pub fn f4(s: S4) -> S4 {
match s {
S4(x4) => S4(x4.wrapping_add(2)),
}
}
98 changes: 98 additions & 0 deletions intTests/test_mir_verify_structs/test.saw
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
enable_experimental;

m <- mir_load_module "test.linked-mir.json";

let s1_adt = mir_find_adt m "test::S1" [];
let s2_adt = mir_find_adt m "test::S2" [mir_u32, mir_u32];
let s3_adt = mir_find_adt m "test::S3" [];
let s4_adt = mir_find_adt m "test::S4" [];

let f_spec adt = do {
x1 <- mir_fresh_var "x1" mir_u32;
y1 <- mir_fresh_var "y1" mir_u32;
let s = mir_struct_value
adt
[ mir_term x1
, mir_term y1
];

mir_execute_func [s];

let s' = mir_struct_value
adt
[ mir_term {{ y1 + 1 }}
, mir_term {{ x1 + 2 }}
];
mir_return s';
};


let f1_spec = f_spec s1_adt;
let f2_spec = f_spec s2_adt;
let f3_spec = f_spec s3_adt;

let f4_spec = do {
x4 <- mir_fresh_var "x4" mir_u32;
let s = mir_struct_value s4_adt [mir_term x4];

mir_execute_func [s];

let s' = mir_struct_value s4_adt [mir_term {{ x4 + 2 }}];
mir_return s';
};

let g_spec = do {
s_ptr <- mir_alloc (mir_adt s1_adt);
x1 <- mir_fresh_var "x1" mir_u32;
y1 <- mir_fresh_var "y1" mir_u32;
let s = mir_struct_value
s1_adt
[ mir_term x1
, mir_term y1
];
mir_points_to s_ptr s;

mir_execute_func [s_ptr];

let s' = mir_struct_value
s1_adt
[ mir_term {{ y1 + 1 }}
, mir_term {{ x1 + 2 }}
];
mir_return s';
};

let h_spec = do {
s_ptr <- mir_alloc_mut (mir_adt s1_adt);
x1 <- mir_fresh_var "x1" mir_u32;
y1 <- mir_fresh_var "y1" mir_u32;
let s = mir_struct_value
s1_adt
[ mir_term x1
, mir_term y1
];
mir_points_to s_ptr s;

mir_execute_func [s_ptr];

let s' = mir_struct_value
s1_adt
[ mir_term {{ y1 + 1 }}
, mir_term {{ x1 + 2 }}
];
mir_points_to s_ptr s';
};

mir_verify m "test::f1" [] false f1_spec z3;
mir_verify m "test::f2" [] false f2_spec z3;
mir_verify m "test::f3" [] false f3_spec z3;
mir_verify m "test::f4" [] false f4_spec z3;
mir_verify m "test::g" [] false g_spec z3;
mir_verify m "test::h" [] false h_spec z3;

fails (
mir_verify m "test::f1" [] false f2_spec z3
);
fails (
mir_verify m "test::f2" [] false f1_spec z3
);
3 changes: 3 additions & 0 deletions intTests/test_mir_verify_structs/test.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
set -e

$SAW test.saw
8 changes: 8 additions & 0 deletions saw-remote-api/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,8 @@

* The `SAW/MIR/load module` command loads a MIR JSON file into SAW.
* The `SAW/MIR/verify` command performs verification of a MIR function.
* The `SAW/MIR/find ADT` command looks up an algebraic data type (ADT) name in
a MIR module.

See the [remote API
documentation](https://github.com/GaloisInc/saw-script/blob/master/saw-remote-api/docs/SAW.rst#sawmirload-module-command)
Expand All @@ -17,6 +19,12 @@
LLVM verification, this field is optional. For MIR verification, this field
is required if the `"elements"` value is empty and optional if the
`"elements"` value is non-empty.
* The `"tuple"` `setup value` has been renamed to `"struct"`. This better
reflects its intended purpose of representing struct values, and it paves the
way to allow MIR tuples, which are a different kind of value in MIR.
* The API for `"struct"` `setup value`s now has a `"MIR ADT"` field. For
MIR verification, this field is required. For LLVM and JVM verification,
this field must be `null`, or else an error will be raised.

## 1.0.0 -- 2023-06-26

Expand Down
36 changes: 36 additions & 0 deletions saw-remote-api/docs/SAW.rst
Original file line number Diff line number Diff line change
Expand Up @@ -532,6 +532,42 @@ No return fields



SAW/MIR/find ADT (command)
~~~~~~~~~~~~~~~~~~~~~~~~~~

Consult the a MIR module to find an algebraic data type (ADT)with the supplied identifier and type parameter substitutions.If such an ADT cannot be found in the module, this will raisean error.

Parameter fields
++++++++++++++++


``module``
The server name of the MIR module containing the ADT.



``ADT original name``
The original (pre-monomorphized) ADT name.



``type substitutions``
The types to substitute the ADT's type parameters with.



``ADT server name``
The server name to refer to the ADT by later.



Return fields
+++++++++++++

No return fields



SAW/Yosys/import (command)
~~~~~~~~~~~~~~~~~~~~~~~~~~

Expand Down
7 changes: 7 additions & 0 deletions saw-remote-api/python/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,9 @@

* The `mir_load_module` function loads a MIR JSON file into SAW.
* The `mir_verify` function performs verification of a MIR function.
* The `mir_find_adt` function looks up an algebraic data type (ADT) name in a
MIR module.

* The `saw_client.mir` module contains utility functions for constructing
MIR types.

Expand All @@ -15,6 +18,10 @@
* The `array()` function now takes an additional `element_type` argument, which
defaults to `None`. If constructing a MIR array with no elements, then the
`element_type` must be specified. Otherwise, this argument is optional.
* The `struct()` function now takes an additional `mir_adt` argument, which
defaults to `None`. If building a MIR struct, this `mir_adt` argument is
required. Passing a `mir_adt` argument when building an LLVM struct will raise
an error.

## 1.0.1 -- YYYY-MM-DD

Expand Down
17 changes: 17 additions & 0 deletions saw-remote-api/python/saw_client/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@
from . import connection
from argo_client.connection import ServerConnection
from . import llvm
from . import mir
from . import exceptions
from . import option
from . import proofscript
Expand Down Expand Up @@ -753,6 +754,22 @@ def mir_verify(module: MIRModule,
return result


def mir_find_adt(module: MIRModule,
adt_orig_name: str,
*tys: mir.MIRType,
adt_server_name_hint: Optional[str] = None) -> mir.MIRAdt:
"""Consult the given MIR module (``module_server_name``) to find an
algebraic data type (ADT) with ``adt_orig_name`` as its identifier and
``tys`` as the types used to instantiate the type parameters. If such an
ADT cannot be found in the module, this will raise an error.
"""
if adt_server_name_hint is None:
adt_server_name_hint = adt_orig_name
adt_server_name = __fresh_server_name(adt_server_name_hint)
__get_designated_connection().mir_find_adt(module.server_name, adt_orig_name, list(tys), adt_server_name).result()
return mir.MIRAdt(adt_orig_name, adt_server_name)


def prove(goal: cryptoltypes.CryptolJSON,
proof_script: proofscript.ProofScript) -> ProofResult:
"""Atempts to prove that the expression given as the first argument, `goal`, is
Expand Down
Loading

0 comments on commit 04b0de5

Please sign in to comment.