optimize template instantiations #239

Closed
Trass3r opened this issue Feb 6, 2023 · 8 comments

Trass3r commented Feb 6, 2023

Template instantiations still cause long compile times, even for simple projects, as already identified in #219 (comment).

https://devblogs.microsoft.com/cppblog/profiling-template-metaprograms-with-cpp-build-insights/

vcperf -start -level3 MyVCSession
vcperf -stop -templates MyVCSession build.etl
wpa build.etl

This is using current master, so it includes #227.
The ideas mentioned there (a custom string_view class and a single function instantiation) are worth pursuing, and potentially also hard-coded offsets (#219 (comment)).
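
To make the custom string_view idea concrete, here is a minimal sketch, assuming C++14 or later. std::string_view is an alias for the class template std::basic_string_view<char>, so using it pulls a fairly large template into every translation unit; a plain struct with a pointer and a length avoids that. The names str_view, from_cstr and equal are hypothetical and are not taken from magic_enum or from #227.

```cpp
#include <cstddef>

namespace sketch {

// Hypothetical minimal string view: a plain struct, not a class template,
// offering only what name extraction would need. Not magic_enum's real class.
struct str_view {
  const char* ptr = nullptr;
  std::size_t len = 0;

  constexpr const char* data() const noexcept { return ptr; }
  constexpr std::size_t size() const noexcept { return len; }
  constexpr char operator[](std::size_t i) const noexcept { return ptr[i]; }
};

// Build a view over a null-terminated string by counting characters.
constexpr str_view from_cstr(const char* s) noexcept {
  std::size_t n = 0;
  while (s[n] != '\0') { ++n; }
  return str_view{s, n};
}

// Character-wise comparison, usable in constant expressions.
constexpr bool equal(str_view a, str_view b) noexcept {
  if (a.size() != b.size()) { return false; }
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i] != b[i]) { return false; }
  }
  return true;
}

// Compile-time checks that the view behaves as intended.
static_assert(from_cstr("Red").size() == 3, "size counts characters");
static_assert(equal(from_cstr("Red"), from_cstr("Red")), "same text compares equal");
static_assert(!equal(from_cstr("Red"), from_cstr("Green")), "different text differs");

}  // namespace sketch
```

Whether this actually reduces instantiation time would have to be confirmed with the vcperf measurements above.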

@Neargye Neargye self-assigned this Feb 6, 2023

Trass3r commented Feb 6, 2023

It's hard to get good benchmarking results. It's probably a good idea to run the compiler directly, several times, to filter out noise, with something like this (WPA profile):

@rem build PCH etc
devenv test.sln /build
@rem ensure warm file cache
cl /c -Yucmake_pch.hxx -Fpcmake_pch.pch -FIcmake_pch.hxx -w test.cpp

vcperf -start -level3 MyVCSession
for /L %%i in (1,1,5) do cl /c -Yucmake_pch.hxx -Fpcmake_pch.pch -FIcmake_pch.hxx -w test.cpp
vcperf -stop -templates MyVCSession build.etl
wpaexporter build.etl -profile instantiations.wpaProfile -delimiter "\t"
type "C++_Build_Insights_-_Template_Instantiations_My_Stats.csv"

For example, baseline:

Primary Template Name              Duration Sum (s)  Duration Avg (ms)  Duration Min (ms)  Duration Max (ms)  Count
magic_enum::detail::values         48.617529000      173.634032         38.813000          394.312000         280
magic_enum::detail::names_v        35.162485000      502.321214         447.216000         877.686000         70
magic_enum::detail::count_v        33.023892000      471.769885         442.146000         658.392000         70
magic_enum::detail::values_v       33.018903000      471.698614         442.081000         658.321000         70
magic_enum::detail::is_flags_v     30.420861000      434.583728         408.380000         591.376000         70
magic_enum::detail::is_flags_enum  27.474068000      392.486685         369.967000         517.662000         70
magic_enum::detail::is_valid       14.821388000      0.758709           0.671000           4.289000           19535
magic_enum::detail::n              7.343306000       0.375905           0.315000           1.908000           19535
magic_enum::detail::names          1.995899000       28.512842          3.888000           215.031000         70
magic_enum::detail::enum_name_v    1.838301000       1.598522           0.874000           5.631000           1150
magic_enum::detail::enum_name      0.977682000       0.850158           0.583000           3.598000           1150

n can be optimized a little by using hard-coded prefix lengths, but the bigger gains need to be found at a higher level:

Primary Template Name              Duration Sum (s)  Duration Avg (ms)  Duration Min (ms)  Duration Max (ms)  Count
magic_enum::detail::values         44.450258000      158.750921         34.668000          356.904000         280
magic_enum::detail::names_v        29.914679000      427.352557         400.139000         559.726000         70
magic_enum::detail::count_v        28.340313000      404.861614         392.682000         483.843000         70
magic_enum::detail::values_v       28.336586000      404.808371         392.638000         483.782000         70
magic_enum::detail::is_flags_v     26.508303000      378.690042         367.016000         456.854000         70
magic_enum::detail::is_flags_enum  24.472120000      349.601714         338.823000         427.931000         70
magic_enum::detail::is_valid       14.244608000      0.729183           0.662000           3.397000           19535
magic_enum::detail::n              6.659712000       0.340911           0.304000           2.640000           19535
magic_enum::detail::names          1.425296000       20.361371          3.251000           143.296000         70
magic_enum::detail::enum_name_v    1.259279000       1.090284           0.811000           3.312000           1155
magic_enum::detail::enum_name      0.693120000       0.600103           0.527000           1.629000           1155
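
For readers unfamiliar with the hard-coded prefix length idea, a rough sketch follows. It does not use magic_enum's code: the string in sample is only an illustrative stand-in for what __PRETTY_FUNCTION__ or __FUNCSIG__ might produce, and the real format differs per compiler. The names sample, prefix_len, suffix_len, slice, value_part and unqualified are all hypothetical. The point is that when the text around the value is fixed, the value can be cut out by arithmetic rather than by searching for delimiters.

```cpp
#include <cstddef>

namespace sketch {

// Illustrative only: an assumed shape for a compiler-generated signature.
// Real strings come from __PRETTY_FUNCTION__ / __FUNCSIG__ and vary by compiler.
constexpr char sample[] = "const char *n() [V = Color::Red]";

// Hard-coded lengths of the fixed text around the value (valid only for the
// assumed sample format above).
constexpr std::size_t prefix_len = sizeof("const char *n() [V = ") - 1;
constexpr std::size_t suffix_len = sizeof("]") - 1;

struct slice {
  const char* ptr;
  std::size_t len;
};

// Cut out the value by pointer arithmetic instead of scanning for delimiters.
constexpr slice value_part(const char* sig, std::size_t sig_len) noexcept {
  return slice{sig + prefix_len, sig_len - prefix_len - suffix_len};
}

// A short backward scan is still needed to drop the "Color::" qualifier.
constexpr slice unqualified(slice s) noexcept {
  std::size_t start = 0;
  for (std::size_t i = s.len; i > 0; --i) {
    if (s.ptr[i - 1] == ':') { start = i; break; }
  }
  return slice{s.ptr + start, s.len - start};
}

constexpr slice name = unqualified(value_part(sample, sizeof(sample) - 1));
static_assert(name.len == 3, "expected the three characters of Red");
static_assert(name.ptr[0] == 'R' && name.ptr[1] == 'e' && name.ptr[2] == 'd',
              "expected Red");

}  // namespace sketch
```

The numbers above show a modest improvement for n, which fits this picture: the prefix search is only a small part of the per-value work.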


Neargye commented May 20, 2023

  • n was optimized
  • is_flags_v and is_flags_enum were optimized

@Neargye Neargye assigned Neargye and unassigned Neargye May 20, 2023

Neargye commented May 21, 2023

What else can we optimize?
As I recall, adding our own lightweight string_view implementation.

@Neargye Neargye added this to the v0.9.1 milestone May 21, 2023

Trass3r commented May 21, 2023

Indeed, see #227.


Neargye commented May 21, 2023

Another optimization: removed unnecessary array copying.

@Neargye Neargye closed this as completed Jun 2, 2023

Neargye commented Jun 2, 2023

Our own lightweight string_view is now implemented.


Trass3r commented Jul 29, 2023

Someone posted an interesting idea: using debug-mode code size as a proxy metric for optimizing template instantiations. Something like that could also be tracked easily via GitHub Actions.


Neargye commented Jul 29, 2023

Wow! Thanks for sharing
