Optimize vectorscan for aarch64 by using shrn instruction #113

Merged: 5 commits, Jul 20, 2022


danlark1

This optimization is based on the thread
https://twitter.com/Danlark1/status/1539344279268691970 and uses
the "shift right and narrow by 4" instruction (SHRN): https://developer.arm.com/documentation/ddi0596/2020-12/SIMD-FP-Instructions/SHRN--SHRN2--Shift-Right-Narrow--immediate--

To achieve that, I needed to slightly redesign movemask into comparemask
and add an extra step to mask iteration. Our benchmarks
showed a 10-15% improvement on average for long matches.
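
For readers unfamiliar with the trick, here is a minimal sketch of the idea in portable scalar C++. It is not the code from this PR: the names `comparemask16`, `ctz64`, and `match_positions` are hypothetical, and the NEON compare plus SHRN #4 step is emulated with plain integer arithmetic so the example runs on any platform. The sketch assumes a 16-byte block, where each byte ends up as one 4-bit nibble of a 64-bit mask, and the "additional step" is modeled as keeping one bit per nibble before iterating.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: scalar emulation of a byte-wise "compare equal"
// followed by SHRN #4 over 8 x 16-bit lanes. Byte i of the block maps to
// nibble i of the result: 0xF on a match, 0x0 otherwise.
inline std::uint64_t comparemask16(const std::uint8_t *block, std::uint8_t needle) {
    assert(block != nullptr);
    std::uint64_t mask = 0;
    for (std::size_t lane = 0; lane < 8; ++lane) {
        // Comparison result for the two bytes of this 16-bit lane (little-endian).
        std::uint16_t lo = (block[2 * lane] == needle) ? 0xFF : 0x00;
        std::uint16_t hi = (block[2 * lane + 1] == needle) ? 0xFF : 0x00;
        std::uint16_t wide = static_cast<std::uint16_t>(lo | (hi << 8));
        // SHRN #4: shift the 16-bit lane right by 4 and keep the low 8 bits.
        std::uint8_t narrow = static_cast<std::uint8_t>(wide >> 4);
        mask |= static_cast<std::uint64_t>(narrow) << (8 * lane);
    }
    return mask;
}

// Portable count-trailing-zeros; the caller must pass a non-zero value.
inline unsigned ctz64(std::uint64_t x) {
    unsigned n = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        ++n;
    }
    return n;
}

// Extra iteration step: keep a single bit per nibble so that clearing the
// lowest set bit advances by exactly one match, then divide the bit index
// by 4 to recover the byte position.
inline std::vector<std::size_t> match_positions(std::uint64_t mask) {
    std::vector<std::size_t> positions;
    mask &= 0x8888888888888888ULL;
    while (mask != 0) {
        positions.push_back(ctz64(mask) >> 2);
        mask &= mask - 1;  // clear the lowest set bit
    }
    return positions;
}
```

On aarch64 the loop in `comparemask16` would correspond to a vector compare followed by a single narrowing shift, which is what makes this cheaper than building an x86-style one-bit-per-byte movemask.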
@danlark1 danlark1 changed the title DevelopOptimize vectorscan for aarch64 by using shrn instruction Optimize vectorscan for aarch64 by using shrn instruction Jun 26, 2022

danlark1 commented Jul 1, 2022

Friendly ping


markos commented Jul 1, 2022

Apologies, I am in the process of reconfiguring our jenkins setup and I can't really merge anything until that is fixed, just in case it breaks anything on other platforms/configurations. It will get merged, but please be patient.

@markos markos closed this Jul 20, 2022
@markos markos reopened this Jul 20, 2022

markos commented Jul 20, 2022

hi @danlark1 jenkins is finally back online, the PR fails in one AVX512 supervector test, looks like a simple fix:

https://jenkins.vectorcamp.gr/blue/organizations/jenkins/VectorCamp%2Fvectorscan-ci/detail/vectorscan-ci/95/pipeline/208

@markos markos merged commit 19947f7 into VectorCamp:develop Jul 20, 2022

markos commented Jul 20, 2022

@danlark1 Thanks for your contribution and apologies for the delay in merging this!
