-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor overflow checks in intrinsics code #1131
Refactor overflow checks in intrinsics code #1131
Conversation
loc, | ||
); | ||
let (_offset_bytes, overflow_check) = | ||
self.count_in_bytes(offset.clone(), ty, Type::ssize_t(), "ptr_offset_from", loc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understood the documentation correctly (https://doc.rust-lang.org/std/primitive.pointer.html#method.offset_from), I think we need to divide (and not multiply) the offset between the two pointers (offset
) by the size of ty
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a typo there and the documentation should say:
The returned value is in units of T: the distance in bytes divided by
mem::size_of::<T>()
The offset between two pointers is usually given in units of T
(compare with the documentation for offset
), which you can convert to bytes by multiplying
offset * mem::size_of<T>
(what we do here).
I've opened this PR to correct the typo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the offset computed using:
let offset = cast_dst_ptr.sub(cast_src_ptr);
in terms of bytes? If so, then to convert it to be in terms of units of T
, we need to divide by size of T
since "offset in terms of bytes = offset in terms of T x size of T".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, offset computation is done in terms of T. See the test case in tests/kani/PointerOffset/offset_from.rs
:
let a = [0; 5];
let b = [0; 5];
let ptr1: *const i32 = &a[1];
let ptr2: *const i32 = &a[3];
let ptr3: *const i64 = &b[1];
let ptr4: *const i64 = &b[3];
unsafe {
assert!(ptr2.offset_from(ptr1) == 2);
assert!(ptr1.offset_from(ptr2) == -2);
assert!(ptr4.offset_from(ptr3) == 2);
assert!(ptr3.offset_from(ptr4) == -2);
}
Here, assertion 1 and 3 (or 2 and 4) return the same offset value. An offset computed in terms of bytes would return 8 in the first one (2 * 4 bytes for each i32
value) and 16 on assertion 3 (2 * 8 bytes for each i64
value).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I was missing is that the return value is computed separately on line 937/924, and is not computed from count_in_bytes
. I'm not sure I understand why an overflow check is needed here though. I'm assuming that a pointer subtraction in CBMC would involve a division operation and not a multiplication, i.e. x-y
translates to (x-y)/sizeof(*x)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The overflow check is done because it's one of the safety conditions listed in the documentation:
The distance between the pointers, in bytes, cannot overflow an isize.
I don't know for sure how CBMC handles pointer arithmetic internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. But is this what the code is doing? I guess here's my source of confusion: what does offset
represent in the lines below?
let dst_ptr = fargs.remove(0);
let src_ptr = fargs.remove(0);
// Compute the offset with standard substraction using `isize`
let cast_dst_ptr = dst_ptr.clone().cast_to(Type::ssize_t());
let cast_src_ptr = src_ptr.clone().cast_to(Type::ssize_t());
let offset = cast_dst_ptr.sub(cast_src_ptr);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zhassan-aws pointed out in a discussion that, because of these conversions, the computed offset would be in bytes and not units of T. So the overflow check should be based on this sub
operation, and using count_in_bytes
does not proceed here.
Thanks @zhassan-aws !
Stmt::block(vec![align_check, overflow_check, memset_call.as_stmt(loc)], loc) | ||
} | ||
|
||
/// Computes (multiplies) the equivalent of a memory-related number (e.g., an offset) in bytes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice documentation!
// possible to assert that the value coming from `offset_from` is equal to | ||
// `high_offset`. But CBMC's memory model is going to cause a "wrapping around" | ||
// behavior in `v_wrap`, so any values that depend on it are going to show a | ||
// strange behavior as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tracking ticket
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created and added: #1150
@@ -1227,18 +1213,41 @@ impl<'tcx> GotocCtx<'tcx> { | |||
loc, | |||
); | |||
|
|||
// Check that computing `bytes` would not overflow | |||
// Check that computing `count` in bytes would not overflow | |||
let (count_bytes, overflow_check) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This catches the Rust
definition of UB, but will still cause problems in CBMC if we overflow the index bits, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and there is now a separate test for CBMC's wrap-around behavior.
This PR is ready for review again:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Description of changes:
Refactors the overflow checks added in #1094, #1099 and #1102 into a function that returns the result and the assertions associated with it. When working on this I realized the expected failure for
ptr_offset_from
(bytes) was actually a failure coming fromoffset
. This was a rabbit hole because it's actually possible to trigger the failure by "wrapping around" a pointer, so I added a comment there to avoid confusion in the future.Resolved issues:
Resolves #1116
Call-outs:
Testing:
How is this change tested? Updates one test.
Is this a refactor change? No (but it should have been).
Checklist
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.