Fix mod_inv termination for the last iteration #103378

Merged
merged 1 commit into from
Nov 19, 2022

Conversation

nagisa
Member

@nagisa nagisa commented Oct 22, 2022

On `usize=u64` platforms, the 4th iteration would overflow the `mod_gate` back to 0. Similarly for `usize=u32` platforms, the 3rd iteration would overflow much the same way.

I tested various approaches to resolving this, including approaches with `saturating_mul` and `widening_mul` to a double usize. Turns out LLVM likes `mul_with_overflow` the best. In fact, now that LLVM can see the iteration count is limited, it will happily unroll the loop into a nice linear sequence.

You will also notice that the code around the loop got simplified somewhat. Now that LLVM is handling the loop nicely, there is no longer any reason to manually unroll the first iteration out of the loop (though looking at the code today, I’m not sure all that complexity was necessary in the first place).

Fixes #103361
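
For readers following along, here is a minimal sketch of the termination logic described above. It is not the exact libcore code: `mod_inv_sketch` is a hypothetical name, and the stable `usize::overflowing_mul` method stands in for the `mul_with_overflow` intrinsic. The idea is that `mod_gate` is squared each round, so after a few rounds the square no longer fits in `usize`; detecting that overflow and breaking avoids the wrap to 0 that previously kept the `mod_gate >= m` exit check from ever succeeding.

```rust
/// Inverses of the odd residues 1, 3, ..., 15 modulo 16.
const INV_TABLE_MOD_16: [u8; 8] = [1, 11, 13, 7, 9, 3, 5, 15];
const INV_TABLE_MOD: usize = 16;

/// Sketch: multiplicative inverse of an odd `x` modulo `m`,
/// where `m` is assumed to be a power of two.
pub fn mod_inv_sketch(x: usize, m: usize) -> usize {
    debug_assert!(m.is_power_of_two() && x % 2 == 1);
    // Start with an inverse that is correct modulo 16.
    let mut inverse = INV_TABLE_MOD_16[(x & (INV_TABLE_MOD - 1)) >> 1] as usize;
    let mut mod_gate = INV_TABLE_MOD;
    while mod_gate < m {
        // Each Newton step doubles the number of correct low bits.
        inverse = inverse.wrapping_mul(2usize.wrapping_sub(x.wrapping_mul(inverse)));
        // Squaring the gate eventually exceeds usize::MAX. At that point the
        // inverse is already correct modulo 2^usize::BITS, so stop instead of
        // letting the gate wrap around to 0.
        let (next_gate, overflowed) = mod_gate.overflowing_mul(mod_gate);
        if overflowed {
            break;
        }
        mod_gate = next_gate;
    }
    inverse & (m - 1)
}
```

With a 64-bit `usize`, the gate takes the values 16, 2^8, 2^16 and 2^32, and the fourth squaring is the one that overflows, matching the iteration count mentioned in the description.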

@rustbot rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Oct 22, 2022
@rust-highfive
Collaborator

r? @scottmcm

(rust-highfive has picked a reviewer for you, use r? to override)

@rustbot
Collaborator

rustbot commented Oct 22, 2022

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 22, 2022
let table_inverse = INV_TABLE_MOD_16[(x & (INV_TABLE_MOD - 1)) >> 1] as usize;
// SAFETY: `m` is required to be a power-of-two, hence non-zero.
let m_minus_one = unsafe { unchecked_sub(m, 1) };
Member

Sadly `unchecked` is useless here today -- LLVM turns `sub nuw %m, 1` into `add %m, -1` during normalization :(

(Doesn't need to change here, though. I'm just sad about llvm/llvm-project#53377.)
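
As a small illustration of why the two forms compute the same value, subtracting 1 and adding the all-ones constant (-1 in two's complement) agree for every input in wrapping arithmetic. The function names below are hypothetical, and the sketch says nothing about LLVM itself; it only shows the arithmetic identity behind the rewrite. The "no unsigned wrap" fact from the original subtraction is what cannot be expressed on the add form.

```rust
/// `m - 1`, written as a subtraction.
pub fn mask_via_sub(m: usize) -> usize {
    m.wrapping_sub(1)
}

/// The same value, written as an addition of -1 (all bits set).
pub fn mask_via_add(m: usize) -> usize {
    m.wrapping_add(usize::MAX)
}
```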

@scottmcm
Member

Thanks! Really nice to hear that LLVM is smart enough to realize that this lets it fully unroll the loop.

@bors r+

@bors
Contributor

bors commented Nov 15, 2022

📌 Commit a3c3f72 has been approved by scottmcm

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 15, 2022
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this pull request Nov 16, 2022
…tmcm

bors added a commit to rust-lang-ci/rust that referenced this pull request Nov 18, 2022
…earth

Rollup of 8 pull requests

Successful merges:

 - rust-lang#102977 (remove HRTB from `[T]::is_sorted_by{,_key}`)
 - rust-lang#103378 (Fix mod_inv termination for the last iteration)
 - rust-lang#103456 (`unchecked_{shl|shr}` should use `u32` as the RHS)
 - rust-lang#103701 (Simplify some pointer method implementations)
 - rust-lang#104047 (Diagnostics `icu4x` based list formatting.)
 - rust-lang#104338 (Enforce that `dyn*` coercions are actually pointer-sized)
 - rust-lang#104498 (Edit docs for `rustc_errors::Handler::stash_diagnostic`)
 - rust-lang#104556 (rustdoc: use `code-header` class to format enum variants)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 6b09d60 into rust-lang:master Nov 19, 2022
@rustbot rustbot added this to the 1.67.0 milestone Nov 19, 2022
#[cfg(target_pointer_width = "16")]
const SIZE: usize = 1 << 13;
struct HugeSize([u8; SIZE - 1]);
let _ = (SIZE as *const HugeSize).align_offset(SIZE);
Member

We usually prefer the strict provenance APIs in libcore -- #104632

Not sure if the lint against int2ptr casts ever got implemented? If yes we should probably enable it here.

Member Author

Ah, I basically just copy-pasted over the reproducer from the issue…
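
A rough sketch of what the reproducer could look like without the integer-to-pointer cast is shown below. It assumes a toolchain where `std::ptr::without_provenance` is stable (Rust 1.84 or later; at the time of this PR the equivalent API was unstable and had a different name). The `SIZE` value is a hypothetical small one chosen so the snippet builds on common targets, so it does not exercise the overflow that caused the original hang.

```rust
use std::ptr;

// Hypothetical size for illustration; the test in the PR picks a
// target-dependent value instead.
const SIZE: usize = 1 << 20;

#[allow(dead_code)]
struct HugeSize([u8; SIZE - 1]);

/// Computes the same kind of offset as the reproducer, but builds the
/// pointer from an address instead of using an `as` cast.
pub fn align_offset_without_cast() -> usize {
    // A pointer with the given address and no provenance; it is never
    // dereferenced, only used for address arithmetic.
    let p: *const HugeSize = ptr::without_provenance(SIZE);
    p.align_offset(SIZE)
}
```

Since the address is already a multiple of the requested alignment, the call is expected to return promptly rather than loop.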

Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Development

Successfully merging this pull request may close these issues.

align_offset infinite loop
7 participants