Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Enums with more than 256 variants #45

Open
ZJaume opened this issue Sep 23, 2024 · 4 comments
Open

Support for Enums with more than 256 variants #45

ZJaume opened this issue Sep 23, 2024 · 4 comments

Comments

@ZJaume
Copy link

ZJaume commented Sep 23, 2024

Are there any plans on supporting enums larger than u8? I'm currently developing a language identifier that has 238 variants for the language codes and will probably surpass that amount in the future when more languages are added.

Also, I found the error a bit confusing

error: enums with more than 256 variants are not supported
  --> src/main.rs:10:10
   |
10 | pub enum Lang {
   |          ^^^^

because it does not say anything about what is causing the error. So I did not know if it was a limitation from the language, the strum derives or the bitcode derives.

Thank you very much for your library. It was one of the main optimizations I did for my tool and was able to cut down the model loading time more than a half.

@caibear
Copy link
Collaborator

caibear commented Sep 23, 2024

The main reason I limited enums to a 256 variants is because bitcode 0.6 requires calculating a histogram of all the variants of each enum while deserializing. With 256 variants a [usize; 256] can be used to calculate the histogram which is quite fast. For an arbitrary number of variants a HashMap<u32, usize> is required which isn't fast.

@finnbear
Copy link
Member

finnbear commented Sep 23, 2024

requires calculating a histogram

Perhaps an exception can be made for C-style enums, i.e. those that lack fields in their variants, which don't need a histogram.

@ZJaume
Copy link
Author

ZJaume commented Sep 24, 2024

Is a [usize; 65536] too much for that histogram?

@caibear
Copy link
Collaborator

caibear commented Sep 24, 2024

Is a [usize; 65536] too much for that histogram?

Yes, because that would require iterating all 65536 elements to see which ones are non zero which is significant overhead.
This would only be faster than a HashMap<u32, usize> for calculating a histogram of a very large message where the 1 time cost of iteration does not dominate the runtime.

@finnbear does bring up a good point that histograms are only required non C-style enums so we could support u16 C-style enums easier.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants