[MicroBenchmarks/LoopVectorization] Add Microbenchmark for Epilogue Vectorization #165
} \
BENCHMARK(benchReductionAutoVecFor##Ty##ForLoopTC##Itr);

ADD_BENCHMARK(uint8_t, 65)
Function specialization has an option to specialize on constants, but it is off by default. If it were enabled, however, I wonder whether an iteration count of 65 is small enough to trigger the full unroller, in which case we would no longer be testing what we want to test. I know this is all a bit hypothetical and a nit, but to future-proof this a little, perhaps just increase the iteration count a bit?
It might be worth covering all possible trip counts for the epilogue, using something like
BENCHMARK_TEMPLATE(uint8_t)->DenseRange(65, 83, 1);
Ah, neat, thanks. I was wondering if there was a better way; the latest version now uses BENCHMARK_TEMPLATE(...)->DenseRange(...) to get all trip counts.
Now also running only a small set of tests by default to avoid potentially overloading build bots that run the test suite.
…ilogue Vectorization
… for Epilogue Vectorization Only run a reduced set of tests by default
…chmarks for Epilogue Vectorization
LGTM, thanks!
This microbenchmark measures the impact of epilogue vectorization across varying epilogue lengths and vector widths, for loops with and without reductions.
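To make "varying epilogue lengths" concrete, here is a rough sketch of how a trip count could be divided between the main vector loop, a vectorized epilogue, and the scalar remainder. It is a simplified model, not the vectorizer's actual logic: `TripCountSplit`, `splitTripCount`, and the example step sizes (a main loop step of 64 and an epilogue width of 8) are hypothetical and do not come from the PR, and the model ignores cases where the vectorizer forces at least one scalar iteration or skips the vector loops because of minimum-iteration checks.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the PR: a simplified model of how a loop's
// trip count is divided once epilogue vectorization is applied.
struct TripCountSplit {
  std::uint64_t MainIters;     // iterations covered by the main vector loop
  std::uint64_t EpilogueIters; // iterations covered by the vectorized epilogue
  std::uint64_t ScalarIters;   // iterations left for the scalar remainder loop
};

// MainStep is the number of scalar iterations consumed per main-loop
// iteration (vector width times interleave count); EpilogueVF is the vector
// width of the epilogue loop. Both are assumed to be non-zero.
constexpr TripCountSplit splitTripCount(std::uint64_t TC,
                                        std::uint64_t MainStep,
                                        std::uint64_t EpilogueVF) {
  // The main loop takes as many full steps as fit into the trip count.
  const std::uint64_t Main = (TC / MainStep) * MainStep;
  const std::uint64_t Rest = TC - Main;
  // The vector epilogue takes as many full vectors as fit into what is left.
  const std::uint64_t Epilogue = (Rest / EpilogueVF) * EpilogueVF;
  // Whatever remains falls through to the scalar loop.
  return {Main, Epilogue, Rest - Epilogue};
}
```

Under these assumed widths, a trip count of 65 leaves a single scalar iteration and never enters the vector epilogue, while a trip count of 83 runs two epilogue vectors (16 iterations) plus 3 scalar iterations. Sweeping a dense range of trip counts, as suggested in the review, exercises each of these combinations.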