Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add llvm_points_to_bitfield and enable_lax_loads_and_stores #1539

Merged
merged 3 commits into from
Dec 22, 2021
Merged

Conversation

RyanGlScott
Copy link
Contributor

This adds support for writing specifications that talk about bitfields in LLVM code by way of the new llvm_points_to_bitfield command. Broadly speaking, llvm_points_to_bitfield ptr fieldName rhs is like llvm_points_to (llvm_field ptr fieldName) rhs, except that fieldName is required to be the name of a field within a bitfield. The salient details are:

  • LLVM bitcode itself does not a built-in concept of bitfields, but LLVM's debug metadata does. Support for retrieving bitfield-related metadata was added to llvm-pretty in Support structs with bitfields in Text.LLVM.DebugUtils llvm-pretty#90, so this patch bumps the llvm-pretty submodule to incorporate it. This patch also updates the crucible submodule to incorporate corresponding changes in Bump llvm-pretty{-bc-parser} submodules, adapt to ValPoison crucible#936.

  • The LLVMPointsTo data type now has a new LLVMPointsToBitfield data constructor that stores all of the necessary information related to the llvm_points_to_bitfield command. As a result, the changes in this patch are fairly insulated from the rest of SAW, as most of the new code involves adding additional cases to handle LLVMPointsToBitfield.

  • Two of the key new functions are storePointsToBitfieldValue and matchPointsToBitfieldValue, which implement the behavior of llvm_points_to_bitfield in pre- and post-conditions. These functions implement the necessary bit-twiddling to store values in and retrieve values out of a bitfield. I have left extensive comments in each function describing how all of this works.

  • Accompanying llvm_points_to_bitfield is a new set of {enable,disable}_lax_loads_and_stores command, which toggles the Crucible-side option of the same name. When enable_lax_loads_and_stores is on, reading from uninitialized memory will return a symbolic value rather than failing outright. This is essential to be able to deal with LLVM bitcode involving bitfields, as reading a field from a bitfield involves reading the entire bitfield at once, which may include parts of the struct that have not been initialized yet.

  • There are various test_bitfield_* test cases under intTests to test examples of bitfield-related specifications that should and should not verify.

  • I have also updated saw-remote-api and saw-client to handle bitfields as well, along with a Python-specific test case.

Fixes #1461.

@RyanGlScott RyanGlScott added subsystem: crucible-llvm Issues related to LLVM bitcode verification with crucible-llvm topics: bitfields Issues related to SAW's support for bitfields labels Dec 8, 2021
src/SAWScript/Crucible/LLVM/X86.hs Show resolved Hide resolved
src/SAWScript/Crucible/LLVM/Override.hs Outdated Show resolved Hide resolved
src/SAWScript/Crucible/LLVM/MethodSpecIR.hs Show resolved Hide resolved
src/SAWScript/Crucible/LLVM/Override.hs Show resolved Hide resolved
@RyanGlScott RyanGlScott changed the title Draft: Add llvm_points_to_bitfield and enable_lax_loads_and_stores Add llvm_points_to_bitfield and enable_lax_loads_and_stores Dec 10, 2021
@RyanGlScott RyanGlScott marked this pull request as ready for review December 10, 2021 20:06
RyanGlScott added a commit to RyanGlScott/s2n-tls that referenced this pull request Dec 15, 2021
This uses SAW's new `llvm_points_to_bitfield` command (introduced in
GaloisInc/saw-script#1539) to simplify specifications that involve the
bitfields in `s2n_config` and `s2n_connection`. In particular, there is no
longer any need to manually compute the size of the bitfields or the indices of
the fields within the bitfields, as these details are handled automatically
by `llvm_points_to_bitfield`. The positions of the fields within bitfields are
also no longer important.

This, along with a subsequent commit that will restore the bitfields' field
order to what they were before aws#3709, will address a review comment in
aws#3079 (comment).
Copy link
Contributor

@robdockins robdockins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't understood all of the details, but I think this looks good. I always appreciate comments and tests, so looks good there.

Couple of minor comments; otherwise looks good.

src/SAWScript/Crucible/LLVM/Override.hs Outdated Show resolved Hide resolved
src/SAWScript/Crucible/LLVM/Override.hs Outdated Show resolved Hide resolved
src/SAWScript/Crucible/LLVM/Override.hs Outdated Show resolved Hide resolved
src/SAWScript/Crucible/LLVM/Override.hs Outdated Show resolved Hide resolved
src/SAWScript/Crucible/LLVM/Override.hs Outdated Show resolved Hide resolved
src/SAWScript/Crucible/LLVM/ResolveSetupValue.hs Outdated Show resolved Hide resolved
This adds support for writing specifications that talk about bitfields in LLVM
code by way of the new `llvm_points_to_bitfield` command. Broadly speaking,
`llvm_points_to_bitfield ptr fieldName rhs` is like
`llvm_points_to (llvm_field ptr fieldName) rhs`, except that `fieldName` is
required to be the name of a field within a bitfield. The salient details are:

* LLVM bitcode itself does not a built-in concept of bitfields, but LLVM's
  debug metadata does. Support for retrieving bitfield-related metadata was
  added to `llvm-pretty` in GaloisInc/llvm-pretty#90, so this patch bumps the
  `llvm-pretty` submodule to incorporate it. This patch also updates the
  `crucible` submodule to incorporate corresponding changes in
  GaloisInc/crucible#936.

* The `LLVMPointsTo` data type now has a new `LLVMPointsToBitfield` data
  constructor that stores all of the necessary information related to the
  `llvm_points_to_bitfield` command. As a result, the changes in this patch
  are fairly insulated from the rest of SAW, as most of the new code involves
  adding additional cases to handle `LLVMPointsToBitfield`.

* Two of the key new functions are `storePointsToBitfieldValue` and
  `matchPointsToBitfieldValue`, which implement the behavior of
  `llvm_points_to_bitfield` in pre- and post-conditions. These functions
  implement the necessary bit-twiddling to store values in and retrieve values
  out of bitfield. I have left extensive comments in each function describing
  how all of this works.

* Accompanying `llvm_points_to_bitfield` is a new set of
  `{enable,disable}_lax_loads_and_stores` command, which toggles the
  Crucible-side option of the same name. When `enable_lax_loads_and_stores` is
  on, reading from uninitialized memory will return a symbolic value rather
  than failing outright. This is essential to be able to deal with LLVM bitcode
  involving bitfields, as reading a field from a bitfield involves reading the
  entire bitfield at once, which may include parts of the struct that have not
  been initialized yet.

* There are various `test_bitfield_*` test cases under `intTests` to test
  examples of bitfield-related specifications that should and should not
  verify.

* I have also updated `saw-remote-api` and `saw-client` to handle bitfields as
  well, along with a Python-specific test case.

Fixes #1461.
@RyanGlScott RyanGlScott added the PR: ready to merge Magic flag for pull requests to ask Mergify to merge given an approval and a successful CI run label Dec 21, 2021
@mergify mergify bot merged commit 1c6470d into master Dec 22, 2021
@mergify mergify bot deleted the T1461 branch December 22, 2021 00:49
Copy link
Contributor

@andreistefanescu andreistefanescu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for the late comments

-- Finally, truncate the bitfield such that only the
-- bits for the field remain. In the `y` example, this
-- means truncating 0b000????Y to `0bY`.
bfBV'' <- liftIO $ W4.bvTrunc sym rhsWidth bfBV'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be W4.bvSelect sym bfOffset_repr rhsWidth bfBV.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good suggestion. I've submitted #1547 to implement this.

-- well use bitwise-OR, but bitwise-XOR plays nicer with
-- how What4 represents bitvectors). In our running
-- example, `bfBV''` is 0b????1???.
bfBV'' <- W4.bvXorBits sym bfBV' rhsBV''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be expressed more naturally in terms of bvConcat and bvSelect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think the XOR formulation will be better, because the What4 AND/XOR semiring structure ought to mean that subsequent updates to the same field will cancel, and update to disjoint fields will commute, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have strong opinions on which approach is better per se, but I did try to implement your alternative approach in terms of bvConcat and bvSelect, and it wasn't entirely straightforward to do so. Unless I'm missing something, you have to have three cases:

  1. The field occurs at the very LSBs of the bitfield. In that case, bvSelect the range from rhsWidth to bfWidth - rhsWidth, and then bvConcat that to the rhsBV.
  2. The field occurs at the very MSBs of the bitfield. In that case, bvSelect the range from 0 to bfFieldOffset, and then bvConcat that to the rhsBV.
  3. The field is in the middle of the bitfield, with other bits surrounding it on both sides. In that case, call bvSelect twice to obtain the surrounding bits, and then call bvConcat twice to concatenate these bitvectors to the rhsBV.

What's more, there's a fair bit of tiresome NatRepr manipulation needed to satisfy the inequalities demanded by bvSelect's type signature. I've implemented it locally, and it is possible to do so, but the resulting code is not much more natural to look at than it is currently.

RyanGlScott added a commit to RyanGlScott/s2n-tls that referenced this pull request Jan 4, 2022
This uses SAW's new `llvm_points_to_bitfield` command (introduced in
GaloisInc/saw-script#1539) to simplify specifications that involve the
bitfields in `s2n_config` and `s2n_connection`. In particular, there is no
longer any need to manually compute the size of the bitfields or the indices of
the fields within the bitfields, as these details are handled automatically
by `llvm_points_to_bitfield`. The positions of the fields within bitfields are
also no longer important.

This, along with a subsequent commit that will restore the bitfields' field
order to what they were before aws#3709, will address a review comment in
aws#3079 (comment).
RyanGlScott added a commit to RyanGlScott/s2n-tls that referenced this pull request Jan 10, 2022
This uses SAW's new `llvm_points_to_bitfield` command (introduced in
GaloisInc/saw-script#1539) to simplify specifications that involve the
bitfields in `s2n_config` and `s2n_connection`. In particular, there is no
longer any need to manually compute the size of the bitfields or the indices of
the fields within the bitfields, as these details are handled automatically
by `llvm_points_to_bitfield`. The positions of the fields within bitfields are
also no longer important.

This addresses a review comment in
aws#3079 (comment).
RyanGlScott added a commit to RyanGlScott/s2n-tls that referenced this pull request Jan 24, 2022
This uses SAW's new `llvm_points_to_bitfield` command (introduced in
GaloisInc/saw-script#1539) to simplify specifications that involve the
bitfields in `s2n_config` and `s2n_connection`. In particular, there is no
longer any need to manually compute the size of the bitfields or the indices of
the fields within the bitfields, as these details are handled automatically
by `llvm_points_to_bitfield`. The positions of the fields within bitfields are
also no longer important.

This addresses a review comment in
aws#3079 (comment).
RyanGlScott added a commit to RyanGlScott/s2n-tls that referenced this pull request Jan 26, 2022
This uses SAW's new `llvm_points_to_bitfield` command (introduced in
GaloisInc/saw-script#1539) to simplify specifications that involve the
bitfields in `s2n_config` and `s2n_connection`. In particular, there is no
longer any need to manually compute the size of the bitfields or the indices of
the fields within the bitfields, as these details are handled automatically
by `llvm_points_to_bitfield`. The positions of the fields within bitfields are
also no longer important.

This addresses a review comment in
aws#3079 (comment).
RyanGlScott added a commit to RyanGlScott/s2n-tls that referenced this pull request Jan 26, 2022
This uses SAW's new `llvm_points_to_bitfield` command (introduced in
GaloisInc/saw-script#1539) to simplify specifications that involve the
bitfields in `s2n_config` and `s2n_connection`. In particular, there is no
longer any need to manually compute the size of the bitfields or the indices of
the fields within the bitfields, as these details are handled automatically
by `llvm_points_to_bitfield`. The positions of the fields within bitfields are
also no longer important.

This addresses a review comment in
aws#3079 (comment).
RyanGlScott added a commit to RyanGlScott/s2n-tls that referenced this pull request Jan 27, 2022
This uses SAW's new `llvm_points_to_bitfield` command (introduced in
GaloisInc/saw-script#1539) to simplify specifications that involve the
bitfields in `s2n_config` and `s2n_connection`. In particular, there is no
longer any need to manually compute the size of the bitfields or the indices of
the fields within the bitfields, as these details are handled automatically
by `llvm_points_to_bitfield`. The positions of the fields within bitfields are
also no longer important.

This addresses a review comment in
aws#3079 (comment).
lrstewart pushed a commit to aws/s2n-tls that referenced this pull request Jan 29, 2022
This uses SAW's new `llvm_points_to_bitfield` command (introduced in
GaloisInc/saw-script#1539) to simplify specifications that involve the
bitfields in `s2n_config` and `s2n_connection`. In particular, there is no
longer any need to manually compute the size of the bitfields or the indices of
the fields within the bitfields, as these details are handled automatically
by `llvm_points_to_bitfield`. The positions of the fields within bitfields are
also no longer important.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR: ready to merge Magic flag for pull requests to ask Mergify to merge given an approval and a successful CI run subsystem: crucible-llvm Issues related to LLVM bitcode verification with crucible-llvm topics: bitfields Issues related to SAW's support for bitfields
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow using bitfields in SAW specifications
3 participants