Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor overflow checks in intrinsics code #1131

Merged
merged 10 commits into from
May 3, 2022

Conversation

adpaco-aws
Copy link
Contributor

Description of changes:

Refactors the overflow checks added in #1094, #1099 and #1102 into a function that returns the result and the assertions associated with it. When working on this I realized the expected failure for ptr_offset_from (bytes) was actually a failure coming from offset. This was a rabbit hole because it's actually possible to trigger the failure by "wrapping around" a pointer, so I added a comment there to avoid confusion in the future.

Resolved issues:

Resolves #1116

Call-outs:

Testing:

  • How is this change tested? Updates one test.

  • Is this a refactor change? No (but it should have been).

Checklist

  • Each commit message has a non-empty body, explaining why the change was made
  • Methods or procedures are documented
  • Regression or unit tests are included, or existing tests cover the modified code
  • My PR is restricted to a single feature or bugfix

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

@adpaco-aws adpaco-aws requested a review from a team as a code owner April 28, 2022 22:43
@adpaco-aws adpaco-aws changed the title Refactor overflows Refactor overflow checks in intrinsics code Apr 28, 2022
loc,
);
let (_offset_bytes, overflow_check) =
self.count_in_bytes(offset.clone(), ty, Type::ssize_t(), "ptr_offset_from", loc);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understood the documentation correctly (https://doc.rust-lang.org/std/primitive.pointer.html#method.offset_from), I think we need to divide (and not multiply) the offset between the two pointers (offset) by the size of ty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a typo there and the documentation should say:

The returned value is in units of T: the distance in bytes divided by mem::size_of::<T>()

The offset between two pointers is usually given in units of T (compare with the documentation for offset), which you can convert to bytes by multiplying
offset * mem::size_of<T> (what we do here).

I've opened this PR to correct the typo.

Copy link
Contributor

@zhassan-aws zhassan-aws May 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the offset computed using:

  let offset = cast_dst_ptr.sub(cast_src_ptr);

in terms of bytes? If so, then to convert it to be in terms of units of T, we need to divide by size of T since "offset in terms of bytes = offset in terms of T x size of T".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, offset computation is done in terms of T. See the test case in tests/kani/PointerOffset/offset_from.rs:

    let a = [0; 5];
    let b = [0; 5];
    let ptr1: *const i32 = &a[1];
    let ptr2: *const i32 = &a[3];
    let ptr3: *const i64 = &b[1];
    let ptr4: *const i64 = &b[3];
    unsafe {
        assert!(ptr2.offset_from(ptr1) == 2);
        assert!(ptr1.offset_from(ptr2) == -2);
        assert!(ptr4.offset_from(ptr3) == 2);
        assert!(ptr3.offset_from(ptr4) == -2);
    }

Here, assertion 1 and 3 (or 2 and 4) return the same offset value. An offset computed in terms of bytes would return 8 in the first one (2 * 4 bytes for each i32 value) and 16 on assertion 3 (2 * 8 bytes for each i64 value).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I was missing is that the return value is computed separately on line 937/924, and is not computed from count_in_bytes. I'm not sure I understand why an overflow check is needed here though. I'm assuming that a pointer subtraction in CBMC would involve a division operation and not a multiplication, i.e. x-y translates to (x-y)/sizeof(*x)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overflow check is done because it's one of the safety conditions listed in the documentation:

The distance between the pointers, in bytes, cannot overflow an isize.

I don't know for sure how CBMC handles pointer arithmetic internally.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. But is this what the code is doing? I guess here's my source of confusion: what does offset represent in the lines below?

        let dst_ptr = fargs.remove(0);
        let src_ptr = fargs.remove(0);
        // Compute the offset with standard substraction using `isize`
        let cast_dst_ptr = dst_ptr.clone().cast_to(Type::ssize_t());
        let cast_src_ptr = src_ptr.clone().cast_to(Type::ssize_t());
        let offset = cast_dst_ptr.sub(cast_src_ptr);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zhassan-aws pointed out in a discussion that, because of these conversions, the computed offset would be in bytes and not units of T. So the overflow check should be based on this sub operation, and using count_in_bytes does not proceed here.

Thanks @zhassan-aws !

Stmt::block(vec![align_check, overflow_check, memset_call.as_stmt(loc)], loc)
}

/// Computes (multiplies) the equivalent of a memory-related number (e.g., an offset) in bytes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice documentation!

// possible to assert that the value coming from `offset_from` is equal to
// `high_offset`. But CBMC's memory model is going to cause a "wrapping around"
// behavior in `v_wrap`, so any values that depend on it are going to show a
// strange behavior as well.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tracking ticket

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created and added: #1150

@@ -1227,18 +1213,41 @@ impl<'tcx> GotocCtx<'tcx> {
loc,
);

// Check that computing `bytes` would not overflow
// Check that computing `count` in bytes would not overflow
let (count_bytes, overflow_check) =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This catches the Rust definition of UB, but will still cause problems in CBMC if we overflow the index bits, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and there is now a separate test for CBMC's wrap-around behavior.

@adpaco-aws
Copy link
Contributor Author

This PR is ready for review again:

  • Reviewed the overflow check in ptr_offset_from (see discussion above)
  • Changed the offset-from-bytes-overflow expected output to look for overflows in both offset operations
  • Added offset-wraps-around to test the "wrapping-around" behavior in CBMC.
  • Created High offsets cause wrapping-around behavior in CBMC #1150 and linked in offset-related overflow checks.

Copy link
Contributor

@zhassan-aws zhassan-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@adpaco-aws adpaco-aws merged commit 1a61de2 into model-checking:main May 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Overflows checks in intrinsics should be refactored
4 participants