Implement SIMD comparison operations for types with less than 4 lanes (i128) #1146
Conversation
Codecov Report
@@           Coverage Diff            @@
##           master    #1146    +/- ##
==========================================
- Coverage   82.55%   82.55%   -0.01%
==========================================
  Files         169      169
  Lines       50456    50456
==========================================
- Hits        41655    41653       -2
- Misses       8801     8803       +2
Continue to review full report at Codecov.
LGTM.
Co-authored-by: Paddy Horan <5733408+paddyhoran@users.noreply.github.com>
@@ -2723,8 +2743,6 @@ mod tests {
         );
     }
-
-    // Fails when simd is enabled: https://github.com/apache/arrow-rs/issues/1136
👍
Thanks @jhorstmann, fyi @tustvold
Which issue does this PR close?
Implements comparison for simd types with less than 8 lanes.
Closes #1136.
What changes are included in this PR?
This PR changes the comparison kernel so that the simd portion can always append 64 bits at a time. Since the simd types are 512 bits wide, the inner comparison is unrolled: for example 8 times for Float64 (8 x 8 lanes) or 4 times for Float32 (4 x 16 lanes). For Int8 types it does not get unrolled, since a single comparison already yields 64 bits.
This should even speed up the comparison kernel a bit for common types, because there is less loop overhead.
On my laptop the simd version for `i128` / `MonthDayNano` types is not actually faster than the scalar version; on a more modern or server-class machine there should be a slight speedup.

Unrelated to this change, I also noticed that the code generation for non-avx512 machines is sub-optimal, since the compiler has to emulate the 512-bit wide operations using smaller vector registers, and for the bitmap-generating code this has some overhead.
Are there any user-facing changes?