Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use unchecked mul to compute slice sizes #98078

Merged
merged 1 commit into from
Jun 15, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 6 additions & 1 deletion compiler/rustc_codegen_ssa/src/glue.rs
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,12 @@ pub fn size_and_align_of_dst<'a, 'tcx, Bx: BuilderMethods<'a, 'tcx>>(
// The info in this case is the length of the str, so the size is that
// times the unit size.
(
bx.mul(info.unwrap(), bx.const_usize(unit.size.bytes())),
// All slice sizes must fit into `isize`, so this multiplication cannot (signed) wrap.
// NOTE: ideally, we want the effects of both `unchecked_smul` and `unchecked_umul`
// (resulting in `mul nsw nuw` in LLVM IR), since we know that the multiplication
// cannot signed wrap, and that both operands are non-negative. But at the time of writing,
// `BuilderMethods` can't do this, and it doesn't seem to enable any further optimizations.
bx.unchecked_smul(info.unwrap(), bx.const_usize(unit.size.bytes())),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any specific reason to use signed multiplication here?
Both length and element size are unsigned.

Copy link
Contributor Author

@erikdesjardins erikdesjardins Jun 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because mul/unchecked_smul/unchecked_umul are all "half width" multiplies (the result is the same number of bits as the input), there is no behavior difference between signed and unsigned multiplication--for the same bitpattern in the inputs, they return the same result. This is why the previous code was just bx.mul, and not bx.smul or bx.umul (which don't exist).

The difference between unchecked_umul/unchecked_smul, then, is that unchecked_umul tells the backend that the multiplication can't wrap around the unsigned boundary (0/usize::MAX), and unchecked_smul tells the backend that it can't wrap around the signed boundary (isize::MIN/isize::MAX). Ideally, we would have both (resulting in mul nsw nuw in LLVM IR), since we know both inputs are non-negative. But there's no existing unchecked_nsw_nuw_mul, and it's not necessary for this use case, so I don't think it's worth adding.

Because both unchecked_umul and unchecked_smul are correct here, the choice is arbitrary. I picked unchecked_smul here because the upper bound on the result is lower (isize::MAX instead of usize::MAX), though I have no particular reason to prefer that over a tighter lower bound. The added test case passes with either.

Added a comment.

bx.const_usize(unit.align.abi.bytes()),
)
}
Expand Down
29 changes: 29 additions & 0 deletions src/test/codegen/issue-96497-slice-size-nowrap.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
// This test case checks that LLVM is aware that computing the size of a slice cannot wrap.
// The possibility of wrapping results in an additional branch when dropping boxed slices
// in some situations, see https://github.com/rust-lang/rust/issues/96497#issuecomment-1112865218

// compile-flags: -O
// min-llvm-version: 14.0

#![crate_type="lib"]

// CHECK-LABEL: @simple_size_of_nowrap
#[no_mangle]
pub fn simple_size_of_nowrap(x: &[u32]) -> usize {
// Make sure the shift used to compute the size has a nowrap flag.

// CHECK: [[A:%.*]] = shl nsw {{.*}}, 2
// CHECK-NEXT: ret {{.*}} [[A]]
core::mem::size_of_val(x)
}

// CHECK-LABEL: @drop_write
#[no_mangle]
pub fn drop_write(mut x: Box<[u32]>) {
// Check that this write is optimized out.
// This depends on the size calculation not wrapping,
// since otherwise LLVM can't tell that the memory is always deallocated if the slice len > 0.

// CHECK-NOT: store i32 42
x[1] = 42;
}