Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: use compare_bytes for more slice element types #128495

Merged
merged 3 commits into from
Sep 1, 2024

Conversation

joboet
Copy link
Member

@joboet joboet commented Aug 1, 2024

bool, NonZero<u8>, Option<NonZero<u8>> and ascii::Char can be compared the same way as u8.

@rustbot
Copy link
Collaborator

rustbot commented Aug 1, 2024

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Aug 1, 2024
@rust-log-analyzer

This comment has been minimized.

@rustbot
Copy link
Collaborator

rustbot commented Aug 1, 2024

The Miri subtree was changed

cc @rust-lang/miri

@oli-obk
Copy link
Contributor

oli-obk commented Aug 1, 2024

slices of all of these seem fairly rare. Not sure how I feel about adding more specialization stuff for them. Is there any motivation for this beyond "can be done"?

@joboet
Copy link
Member Author

joboet commented Aug 1, 2024

It's mostly "cause I can" but I can imagine that this is really beneficial when comparing ASCII-only strings.

@saethlin
Copy link
Member

saethlin commented Aug 1, 2024

I think it would be a great shame if ascii::Char can't get the same optimizations as u8 and i8 (which probably sneaks in a fair bit as ffi::c_char).

But also I wonder if this optimization is better delegated to the compiler. The strategy here does improve ascii::Char, but it leaves out all types in user crates.

@scottmcm
Copy link
Member

scottmcm commented Aug 2, 2024

(Note that since this is talking about Ord for slices it can't do i8, as that compares differently.)

I don't know how feasible this would be to do in LLVM instead. Certainly it'll replace loops with calls to memset already; dunno if it'd be able to detect dynamic-length memcmp. There's probably silly reasons it's can't, like the byte-by-byte loop not asserting readability of any bytes after the last one, which memcmp might actually try to read...

@saethlin
Copy link
Member

saethlin commented Aug 2, 2024

I wasn't thinking of LLVM specifically, just that it would be better if we could automate the detection of types eligible for this optimization by inspecting their layout, instead of attempting to enumerate all of them.

@scottmcm
Copy link
Member

scottmcm commented Aug 3, 2024

detection of types eligible for this optimization by inspecting their layout

I think the problem is that we need more than just their layout, because cmp::Reverse<u8> has the same layout as u8, but we can't do this. That's why I was thinking LLVM, because "detect a loop with these exact semantics" feels like an LLVM thing more than a rustc thing.

But maybe we could detect when it's from derive(Ord) on something where that works? I guess if slice-ord was directly an intrinsic and we could then detect cases that can call compare_bytes, otherwise leave it to fallback MIR?

@Mark-Simulacrum
Copy link
Member

r? scottmcm

I tend to agree that a new specialization trait doesn't make me feel enthusiastic about this.

@rustbot rustbot assigned scottmcm and unassigned Mark-Simulacrum Aug 4, 2024
Copy link
Member

@scottmcm scottmcm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small thing above, then I think this is good to go.

I'm less convinced about things like [bool], but being able to get a [NonZero<u8>] to the before-the-terminator in a CString for example would make sense, and it would be unfortunate if that meant a worse Ord.

So once we're doing it for two types, might as well do it for bool (and ascii::Char and …) as well.

@@ -182,19 +183,31 @@ impl<A: Ord> SliceOrd for A {
}
}

// The type should be treated as an unsigned byte for comparisons.
#[rustc_specialization_trait]
unsafe trait UnsignedByte {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's a module-private thing, but can you elaborate on the name and add # Safety for the trait? Just "unsigned byte" could accidentally be interpreted as being more a layout thing, where say Rev<u8> is arguably an "unsigned byte" but definitely shouldn't be implementing this.

I see above it's BytewiseEq, so maybe UnsignedBytewiseOrd here or something?

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 8, 2024
@joboet
Copy link
Member Author

joboet commented Aug 12, 2024

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 12, 2024
@scottmcm
Copy link
Member

Thanks!

@bors r+

@bors
Copy link
Contributor

bors commented Aug 31, 2024

📌 Commit 8301dd4 has been approved by scottmcm

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 31, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 1, 2024
…iaskrgr

Rollup of 8 pull requests

Successful merges:

 - rust-lang#128495 (core: use `compare_bytes` for more slice element types)
 - rust-lang#128641 (refactor: standardize duplicate processes in parser)
 - rust-lang#129207 (Lint that warns when an elided lifetime ends up being a named lifetime)
 - rust-lang#129493 (Create opaque definitions in resolver.)
 - rust-lang#129619 (Update stacker to 0.1.17)
 - rust-lang#129672 (Make option-like-enum.rs UB-free and portable)
 - rust-lang#129780 (add crashtests for several old unfixed ICEs)
 - rust-lang#129832 (Remove stray dot in `std::char::from_u32_unchecked` documentation)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit a073004 into rust-lang:master Sep 1, 2024
6 checks passed
@rustbot rustbot added this to the 1.82.0 milestone Sep 1, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Sep 1, 2024
Rollup merge of rust-lang#128495 - joboet:more_memcmp, r=scottmcm

core: use `compare_bytes` for more slice element types

`bool`, `NonZero<u8>`, `Option<NonZero<u8>>` and `ascii::Char` can be compared the same way as `u8`.
@cuviper cuviper modified the milestones: 1.82.0, 1.83.0 Sep 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants