Optimize vectorscan for aarch64 by using shrn instruction #113

danlark1 · 2022-06-26T23:05:41Z

This optimization is based on the thread https://twitter.com/Danlark1/status/1539344279268691970 and uses the SHRN (shift right and narrow) instruction with a shift of 4: https://developer.arm.com/documentation/ddi0596/2020-12/SIMD-FP-Instructions/SHRN--SHRN2--Shift-Right-Narrow--immediate--

To achieve that, I needed to slightly redesign movemask into comparemask and add an extra step to mask iteration. Our benchmarks showed a 10-15% improvement on average for long matches.
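For readers unfamiliar with the trick, here is a minimal sketch of the idea, not the code from this PR. A byte-wise comparison yields 0x00 or 0xFF per lane; shifting each 16-bit lane right by 4 and narrowing packs the 16-byte result into 64 bits, with one 4-bit nibble per input byte. Iteration then keeps one bit per nibble and walks the set bits. The names `compare_mask16` and `match_positions` are hypothetical, the scalar fallback exists only so the sketch builds on non-aarch64 machines, and `__builtin_ctzll` assumes GCC or Clang.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

// Hypothetical helper: compares 16 bytes against `needle` and returns a
// 64-bit mask in which nibble i is 0xF when data[i] == needle, else 0x0.
static uint64_t compare_mask16(const uint8_t *data, uint8_t needle) {
#if defined(__aarch64__)
    // Each lane of `eq` is 0xFF on a match and 0x00 otherwise.
    uint8x16_t eq = vceqq_u8(vld1q_u8(data), vdupq_n_u8(needle));
    // SHRN by 4: view as eight 16-bit lanes, shift right by 4, keep the
    // low 8 bits. Each output byte holds one nibble from each input byte.
    uint8x8_t narrowed = vshrn_n_u16(vreinterpretq_u16_u8(eq), 4);
    return vget_lane_u64(vreinterpret_u64_u8(narrowed), 0);
#else
    // Scalar emulation producing the same nibble-per-byte layout.
    uint64_t mask = 0;
    for (int i = 0; i < 16; ++i) {
        if (data[i] == needle) {
            mask |= 0xFULL << (4 * i);
        }
    }
    return mask;
#endif
}

// Hypothetical helper: returns the byte offsets of all matches in a
// 16-byte block, in increasing order.
static std::vector<int> match_positions(const uint8_t *data, uint8_t needle) {
    assert(data != nullptr);
    std::vector<int> positions;
    // Extra iteration step: keep one bit per nibble so that clearing the
    // lowest set bit advances by exactly one input byte.
    uint64_t mask = compare_mask16(data, needle) & 0x1111111111111111ULL;
    while (mask != 0) {
        // Bit index 4*i maps back to byte offset i.
        positions.push_back(__builtin_ctzll(mask) >> 2);
        mask &= mask - 1; // clear the lowest set bit
    }
    return positions;
}
```

Calling `match_positions(buf, 'x')` on a 16-byte buffer returns the offsets at which `'x'` occurs. Compared with an x86-style movemask, which yields one bit per byte, the mask here is four times wider, which is why the iteration needs the extra masking step and the shift by 2.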

danlark1 · 2022-07-01T11:25:09Z

Friendly ping

markos · 2022-07-01T11:32:09Z

Apologies, I am in the process of reconfiguring our Jenkins setup, and I can't merge anything until that is fixed, in case it breaks something on other platforms or configurations. It will get merged, but please be patient.

markos · 2022-07-20T07:52:59Z

Hi @danlark1, Jenkins is finally back online. The PR fails in one AVX512 supervector test; it looks like a simple fix:

https://jenkins.vectorcamp.gr/blue/organizations/jenkins/VectorCamp%2Fvectorscan-ci/detail/vectorscan-ci/95/pipeline/208

markos · 2022-07-20T13:42:24Z

@danlark1 Thanks for your contribution and apologies for the delay in merging this!

danilak-G added 4 commits June 26, 2022 22:55

Fix formatting of a couple files

8a49e20

Minor fix

8498467

Fix ppc64el debug

7e7f604

danlark1 changed the title ~~DevelopOptimize vectorscan for aarch64 by using shrn instruction~~ Optimize vectorscan for aarch64 by using shrn instruction Jun 26, 2022

markos closed this Jul 20, 2022

markos reopened this Jul 20, 2022

Fix avx512 movemask call

db52ce6

markos merged commit 19947f7 into VectorCamp:develop Jul 20, 2022

rschu1ze mentioned this pull request Sep 13, 2022

Bump vectorscan to 5.4.8 ClickHouse/ClickHouse#41270

Merged
