Enum should prefer discriminant zero for niche #87794
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @jackh726 (or someone else) soon. Please see the contribution instructions for more information.
CI failure looks unrelated. Once this passes CI, we should do a perf run on it.
That will be interesting. There are still some things that I want input on before merging. Both problems can be seen in the output from this code:
#[repr(usize)]
pub enum Size {
    One = 1,
    Two = 2,
    Three = 3,
}
pub fn handle(x: Option<Size>) -> usize {
    match x {
        None => 0,
        Some(size) => size as usize,
    }
}
mov rax, rdi
test rdi, rdi // this instruction is redundant - should not be part of output
ret
test dil, dil
je .LBB0_1
movzx eax, dil
ret
.LBB0_1:
xor eax, eax
ret
instead of:
movzx eax, dil
ret
My guess is that neither of them is strongly connected to the choice of niche zero.
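One way to check which value the compiler actually picked for None is to look at the raw bits of Option<Size>. The following is a minimal sketch and not part of the PR: it reuses the snippet above, `none_bits` is a hypothetical helper, and the printed niche value depends on the compiler version because enum layout is not a stable guarantee.

```rust
use std::mem::{size_of, transmute};

#[repr(usize)]
#[derive(Clone, Copy)]
pub enum Size {
    One = 1,
    Two = 2,
    Three = 3,
}

pub fn handle(x: Option<Size>) -> usize {
    match x {
        None => 0,
        Some(size) => size as usize,
    }
}

/// Hypothetical helper: raw bit pattern used to encode `None`.
/// It only compiles while `Option<Size>` is exactly as wide as `usize`,
/// i.e. while the niche optimization applies.
pub fn none_bits() -> usize {
    // The layout of `Option<Size>` is unspecified, so treat the result
    // as a diagnostic only and never rely on it.
    unsafe { transmute::<Option<Size>, usize>(None) }
}

fn main() {
    // The niche lets `Option<Size>` reuse a value that `Size` never holds.
    println!("size_of Size = {}", size_of::<Size>());
    println!("size_of Option<Size> = {}", size_of::<Option<Size>>());
    println!("None is stored as {}", none_bits());
    println!("handle(Some(Two)) = {}", handle(Some(Size::Two)));
}
```

If None is stored as 0, `handle` can in principle be a plain move of the input, which is the codegen the comment above is hoping for.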
The MIR seems agnostic with regard to the niche; the output is identical between nightly and my PR.
1: In LLVM IR we have:
So there is a comparison for equality with zero (None) and then another comparison for 0 <= x < 4.
2: The difference between repr(isize) and repr(usize) isn't that big.
I think this selects the value conditionally and is therefore our culprit. For both 1 and 2, I think all conditional checks should be optimized away at some stage. IMHO this is probably not a blocker for this PR, but it would be nice to fix.
I am a bit worried that this change would reserve a much larger range than it currently does. For example, if I have a
Another idea would be to just try to creep towards zero instead of forcing zero to be used. We can extend the valid ranges just by
Or maybe we can have some heuristics that avoid leaving too many gaps in the representation while preferring 0 if possible.
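To make the trade-off concrete, here is a small sketch of the arithmetic. It is a hypothetical model and not rustc's layout code: `ValidRange`, `force_zero` and `creep_towards_zero` are names invented for illustration, the tag is assumed to be one byte wide, and the range is assumed not to wrap around.

```rust
/// Hypothetical model, not rustc's data structure: an inclusive,
/// non-wrapping range of bit patterns that a one-byte tag may hold.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ValidRange {
    pub start: u8,
    pub end: u8,
}

impl ValidRange {
    /// Number of byte values still free for further niches.
    /// Assumes `start <= end`.
    pub fn free_values(self) -> u16 {
        256 - (self.end as u16 - self.start as u16 + 1)
    }

    /// Extend the range downwards so that 0 becomes the new niche.
    /// Every value between 0 and the old start is consumed as well.
    pub fn force_zero(self) -> ValidRange {
        ValidRange { start: 0, ..self }
    }

    /// Extend the range by a single value, moving the start down when
    /// possible and otherwise moving the end up.
    pub fn creep_towards_zero(self) -> Option<ValidRange> {
        if self.start > 0 {
            Some(ValidRange { start: self.start - 1, ..self })
        } else {
            self.end.checked_add(1).map(|end| ValidRange { end, ..self })
        }
    }
}

fn main() {
    // Hypothetical tag whose valid values are 100..=102.
    let tag = ValidRange { start: 100, end: 102 };
    println!("before: {} free", tag.free_values());
    println!("force zero: {} free", tag.force_zero().free_values());
    if let Some(r) = tag.creep_towards_zero() {
        println!("creep: {} free", r.free_values());
    }
}
```

In this model, forcing zero on a range that starts far from zero uses up many values for a single niche, while creeping uses exactly one.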
You are totally right, I made a stupid mistake. I consider it very important to allocate zero for the first level of nesting (much more likely to have an impact). I can think of two alternatives.
Unanswered question: What is the cutoff point for aggressively allocating zero vs losing nesting capacity?
Because of my limited expertise with the compiler, I prefer route 1 at this time. I will think about it some more and do a little prototyping tomorrow.
Force-pushed from a64325f to dc51008.
@nbdd0121 I did the simple prototype without any heuristics; it just grows towards zero. I am a bit unsure what good heuristics would be because of the lack of data.
Force-pushed from ea9c6f5 to d67d32f.
I don't understand this failure; from what I can see it shouldn't even use niches.
There is
Oh right, derive...
It could be related to the LLVM version. Rust builds and uses LLVM 12 by default, but LLVM 10 and 11 are still supported. The CI is using LLVM 10 according to its name.
Thanks for the help.
@bors try @rust-timer queue
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf
⌛ Trying commit b8333b42b90c0c0963b90265e98004e7a7057ecd with merge e57f7bdd064435d08c84f9c60751acf77d396c1d...
☀️ Try build successful - checks-actions
Queued e57f7bdd064435d08c84f9c60751acf77d396c1d with parent 04c9901, future comparison URL.
So I think this failed during a non-optimizing build.
Perhaps
Force-pushed from 307a1bb to 4d66fbc.
@bors r+
📌 Commit 4d66fbc has been approved by
⌛ Testing commit 4d66fbc with merge 9a4d180a01eadf0aa4ff80dbbc8410a229db10e9...
💔 Test failed - checks-actions
@bors retry
☀️ Test successful - checks-actions
Finished benchmarking commit (9f85cd6): comparison url. Summary: This benchmark run did not return any relevant changes. If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. @rustbot label: -perf-regression
This seems to have caused #90038
Given an enum with an unassigned zero discriminant, Rust should prefer it for niche selection.
Zero as the discriminant for Option<Enum> makes it possible for LLVM to optimize the resulting asm:
test eax, eax instead of cmp eax, ?
Example:
In this case discriminant zero is available as a niche.
Above example on nightly:
PR:
I created this PR because I had a performance regression when I tried to use an enum to represent legal grapheme byte lengths for UTF-8.
Using an enum instead of NonZeroU8 here resulted in a performance regression of about 5%.
I consider this to be a somewhat realistic benchmark.
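For readers who want to picture the two representations being compared, the following is a rough sketch under stated assumptions. It is not the benchmarked code: `Utf8Len`, `len_enum` and `len_nonzero` are hypothetical names, and the mapping from leading byte to length is only one plausible way such an enum might be used.

```rust
use std::mem::size_of;
use std::num::NonZeroU8;

/// Hypothetical enum: byte length of one UTF-8 encoded scalar value.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Utf8Len {
    One = 1,
    Two = 2,
    Three = 3,
    Four = 4,
}

/// Length implied by a leading byte, or `None` for a continuation
/// byte or a byte that can never start a valid sequence.
pub fn len_enum(lead: u8) -> Option<Utf8Len> {
    match lead {
        0x00..=0x7F => Some(Utf8Len::One),
        0xC2..=0xDF => Some(Utf8Len::Two),
        0xE0..=0xEF => Some(Utf8Len::Three),
        0xF0..=0xF4 => Some(Utf8Len::Four),
        _ => None,
    }
}

/// Same information carried by `NonZeroU8`, where `None` is stored as 0.
pub fn len_nonzero(lead: u8) -> Option<NonZeroU8> {
    len_enum(lead).and_then(|l| NonZeroU8::new(l as u8))
}

fn main() {
    // Both options fit in one byte; they may differ in which bit
    // pattern encodes `None`, depending on the compiler's niche choice.
    println!("Option<Utf8Len>: {} byte(s)", size_of::<Option<Utf8Len>>());
    println!("Option<NonZeroU8>: {} byte(s)", size_of::<Option<NonZeroU8>>());
    println!("{:?} {:?}", len_enum(0xE2), len_nonzero(0xE2));
}
```

With NonZeroU8 the None case is always zero, so checking it is a comparison against zero; the enum only gets the same treatment if the compiler also picks zero as its niche.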
Thanks to @ogoffart for pointing me in the right direction!
Edit: Updated description