Cleanup for single vector sort/bitonic merge (and minor cleanup for argsort/argselect) #152
This patch rewrites all of the single-vector sorting and bitonic merging code to use swizzle ops and generic masks, which reduces code duplication. It also centralizes this logic in one file. I also did a small cleanup of the argsort code.
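To illustrate the general idea, here is a minimal scalar sketch of a bitonic network in which every stage is the same compare-exchange, parameterized only by a swizzle (lane permutation) and a mask. This is not the code from this patch: `Vec`, `swizzle`, `cmp_exchange`, `stage`, `bitonic_merge`, and `sort_single_vector` are hypothetical names, and a `std::array` of 8 ints stands in for a SIMD register.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t N = 8;            // lanes in the modeled "vector" (assumption)
using Vec = std::array<int, N>;
using Perm = std::array<std::size_t, N>;
using Mask = std::array<bool, N>;

// Scalar model of a permute: lane i receives the value held in lane perm[i].
Vec swizzle(const Vec& v, const Perm& perm) {
    Vec out{};
    for (std::size_t i = 0; i < N; ++i) out[i] = v[perm[i]];
    return out;
}

// Compare every lane with its swizzled partner, then blend by mask:
// lanes with take_max set keep the larger value, the rest keep the smaller.
Vec cmp_exchange(const Vec& v, const Perm& perm, const Mask& take_max) {
    const Vec partner = swizzle(v, perm);
    Vec out{};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = take_max[i] ? std::max(v[i], partner[i])
                             : std::min(v[i], partner[i]);
    }
    return out;
}

// One network stage: partner distance j, inside sorted runs of length k.
// The swizzle and the mask are computed generically from (j, k), so no
// stage needs hand-written code of its own.
Vec stage(const Vec& v, std::size_t j, std::size_t k) {
    Perm perm{};
    Mask take_max{};
    for (std::size_t i = 0; i < N; ++i) {
        perm[i] = i ^ j;                        // partner lane
        const bool upper = (i & j) != 0;        // upper lane of the pair
        const bool descending = (i & k) != 0;   // run sorted in reverse
        take_max[i] = (upper != descending);
    }
    return cmp_exchange(v, perm, take_max);
}

// Bitonic merge: sorts a vector that is already bitonic, ascending.
Vec bitonic_merge(Vec v) {
    for (std::size_t j = N / 2; j > 0; j /= 2) v = stage(v, j, N);
    return v;
}

// Full single-vector sort: build bitonic runs of growing length k.
Vec sort_single_vector(Vec v) {
    for (std::size_t k = 2; k <= N; k *= 2)
        for (std::size_t j = k / 2; j > 0; j /= 2) v = stage(v, j, k);
    return v;
}
```

In this sketch the merge is just the final run of stages of the sort, which is the kind of sharing that generic swizzles and masks make possible. A real SIMD version would presumably replace the loops in `swizzle` and `cmp_exchange` with permute, min/max, and masked-blend instructions.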
This should have no impact on performance at all, and my testing seems to confirm this:
AVX512 1 million performance
AVX512 1k performance
AVX512 16 value performance
I also deduplicated the logic shared by argsort and argselect, and cleaned up some naming there.
Note that the test suite currently does not test the single-vector sorting logic for qsort. The key-value versions are effectively tested by the kvsort tests, but nothing ends up testing the non-key-value versions.
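One way to close that gap would be a check that runs a sort over every small input length and compares the result against `std::sort`. The sketch below is only an assumption about what such a test could look like: `matches_std_sort` is a hypothetical helper, and the callable it takes stands in for whichever sort entry point is under test.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Returns true if sort_fn(ptr, len) agrees with std::sort for every
// length in [0, max_len]. Small lengths are the interesting ones here,
// since they are the inputs that fit in a single vector.
template <typename SortFn>
bool matches_std_sort(SortFn sort_fn, std::size_t max_len,
                      std::uint32_t seed = 42) {
    std::mt19937 rng(seed);  // fixed seed keeps failures reproducible
    std::uniform_int_distribution<int> dist(-1000, 1000);
    for (std::size_t len = 0; len <= max_len; ++len) {
        std::vector<int> data(len);
        for (int& x : data) x = dist(rng);

        std::vector<int> expected = data;
        std::sort(expected.begin(), expected.end());

        sort_fn(data.data(), data.size());  // sort in place
        if (data != expected) return false;
    }
    return true;
}
```

A test would pass the sort under test as a callable taking a pointer and a length. Every length up to the vector width should be covered, so that partially filled vectors are exercised as well as full ones.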