-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Poor codegen for Debug
impls of C-like enums with many variants
#114106
Comments
It was even worse before #106884 We can probably get much better codegen and compile times by expanding to indexing into a |
@wesleywiser There is also another important difference between the two versions: all variants in the second enum have the default discriminant, meaning all valid values are 0..128 which Rustc uses to optimize the string lookup by creating a lookup table. You can see that this optimization vanishes if you set any discriminant manually: godbolt |
I wonder if part of the reason LLVM is having trouble here is the usage of an i8. Looking at the emitted IR you can see the values in the switch wrap around to -128. Could that be confusing its check for "continuousness"? Would casting the discriminant up to a bigger size so everything can stay positive change anything? Of course it'd also be great if we had a solution to improve the situation here for non-continuous but still fieldless c-like enums. Would using something like a phf or some other lookup structure be an option? |
Actually I think I might be on to something. The optimization reappears at 256 variants https://godbolt.org/z/jd5WoqEWK Putting a |
I think this should be optimized to miss some edge cases. |
Wow, this must be Karma, as I just experimented with a very similar case just yesterday, and documented all my finding on https://swatinem.de/blog/optimizing-enums/ The usecase I was looking at was enums with sparse discriminants, see rust-minidump/rust-minidump#847. In my example, using sparse discriminants works up to a certain threshold (haven’t figured out which one exactly though) // creates a lookup table with some empty/duplicate rows:
pub enum EnumLookup {
A = 0,
BB = 4,
CCC,
ABCD,
}
// creates a chain of conditional jumps:
pub enum EnumCmp {
A = 0,
BB = 200,
CCC,
ABCD,
} As in my example the discriminants are very sparse, I was experimenting successfully with using a perfect hash function to transform the sparse discriminant into an dense index for lookup, see rust-minidump/rust-minidump#861 |
@rustbot claim |
Take for instance
nix::errno::Errno
which has 132 variants. When passing-Copt-level=3
:llvm-ir
(godbolt)
Of note, the first case in the switch branches directly to
bb134
while all others branch to their own basic block which immediately branches tobb134
. I'm guessing some analysis in LLVM gives up on large switch terminators as removing 4 variants (so that there are exactly 128 cases) results in much more optimal IR:(godbolt)
The impact on codesize is significant with the 132-variant version of
<Errno as Debug>::fmt
weighing in at 2,650 bytes and the 128-variant version occupying only 40 bytes.The text was updated successfully, but these errors were encountered: