Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compile-time optimization #219

Closed
Trass3r opened this issue Nov 4, 2022 · 11 comments
Closed

compile-time optimization #219

Trass3r opened this issue Nov 4, 2022 · 11 comments
Assignees

Comments

@Trass3r
Copy link

Trass3r commented Nov 4, 2022

The code is quite heavy on compile-time and could use some optimizations (e.g. every enum_cast takes 1s to instantiate in my test).
clang got improved recently to cover constexpr evaluations in time traces which will come in handy: https://reviews.llvm.org/D136022.

Half of the parsing time alone is spent on constexpr_switch for example, and is_flags_enum is very heavy to instantiate because it calls values.
Here's the test-cpp17 time trace generated with -ftime-trace -ftime-trace-granularity=100, can be viewed with https://ui.perfetto.dev.

@Neargye
Copy link
Owner

Neargye commented Nov 4, 2022

Thank you for your analysis, now I am at least splitting the header into several smaller ones, then I will look at how to optimize functions

@Neargye Neargye self-assigned this Nov 4, 2022
@Trass3r
Copy link
Author

Trass3r commented Nov 5, 2022

Seems like all these calls quickly add up:

constexpr bool valid[sizeof...(I)] = {is_valid<E, ualue<E, Min, IsFlags>(I)>()...};

The variant used for customization also has some overhead.
And that pretty_name function is also a good optimization target since it's called so often. Not sure why it's doing a complicated identifier search, in my toy code I just searched for the =/<. I guess is_valid can't be optimized much since enum identifiers can end in a number, but if n returned a string_view the copy to static_string could be avoided in that case.

@Neargye
Copy link
Owner

Neargye commented Nov 7, 2022

Good catch
I changed the code a little, now is_valid work with string_view not static_string
For default constexpr_switch is disabled, it can be enabled by build options MAGIC_ENUM_ENABLE_HASH

@Trass3r
Copy link
Author

Trass3r commented Nov 7, 2022

And that pretty_name function is also a good optimization target since it's called so often. Not sure why it's doing a complicated identifier search, in my toy code I just searched for the =/<.

Though it's probably faster to just hard-code the offsets: https://godbolt.org/z/exqMMq5G3
A little less generic but the whole trick is already highly compiler- and implementation-dependent anyway.
But some benchmarking needs to be done.

@Neargye
Copy link
Owner

Neargye commented Nov 8, 2022

It seems to me that the optimization pretty_name is not so critical, could you restart the profiler and see if it got better after update?

@Trass3r
Copy link
Author

Trass3r commented Nov 9, 2022

It is cause it gets called max-min times via is_valid for each enum after all. Even after my optimization n still takes ~300us, times 240 = at least 80ms per enum. But there's additional overhead so values takes more than 300ms per enum.
The parsing overhead for constexpr_switch is gone though. Now the parsing is dominated by the variant include and its instantiation inside enum_name.

@Neargye
Copy link
Owner

Neargye commented Nov 9, 2022

I'll try to replace the variant with something like tuple, I need to think about it.

@Trass3r
Copy link
Author

Trass3r commented Nov 9, 2022

The profile was a bit skewed though due to other includes in the test.cpp code.
A good base benchmark for the "fixed" costs is to compile

#include <magic_enum.hpp>
int main(){}

Parsing took 400ms in this run, (166ms of which is for which would often already be included like in the test code):
https://gist.github.com/Trass3r/e193286d4c2d040324140a843d007578

Still, variant is a less commonly used header and if it can be avoided all the better.
But tuple is also expensive btw, esp. in libstdc++ IIRC since they implemented it in a sub-optimal way and now can't change it cause of backward-compatibility. In general it shouldn't be used if the parameters are fixed and known.

@Neargye
Copy link
Owner

Neargye commented Nov 9, 2022

Yes, I thought it would be better to take std::pair

@Neargye
Copy link
Owner

Neargye commented Nov 30, 2022

I don't have new idea what can be optimized yet.
Considering the messages from #207, compile time has become better, I think we can close this.
If there are new ideas for improvement, be sure to let me know in the new issues.

@Neargye Neargye closed this as completed Nov 30, 2022
@Trass3r
Copy link
Author

Trass3r commented Feb 6, 2023

Parsing times are better after #221 but as soon as you actually use magic_enum template instantiations are still the main issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants