@alamb alamb commented Nov 16, 2025

…thods, deprecate old methods

Which issue does this PR close?

Rationale for this change

  1. bitwise_bin_op_helper and bitwise_unary_op_helper are somewhat hard to find and use,
    as explained in #8807 (WIP: special case bitwise ops when buffers are u64 aligned)

  2. I want to optimize bitwise operations even more heavily (see #8807), so I want the implementations centralized in one place where I can focus that effort

Also, I think these APIs cover the use case explained by @jorstmann on #8561:

Building a new buffer by starting from an empty state and incrementally appending new bits (append_value, append_slice, append_packed_range and similar methods).

By creating a method on Buffer directly, it is easier to find, and it is clearer that
a new Buffer is being created.

What changes are included in this PR?

Changes:

  1. Add Buffer::from_bitwise_unary and Buffer::from_bitwise_binary methods that do the same thing as bitwise_unary_op_helper and bitwise_bin_op_helper but are easier to find and use
  2. Deprecate bitwise_unary_op_helper and bitwise_bin_op_helper in favor
    of the new Buffer methods
  3. Document the new methods, with examples (specifically, that the bitwise operations
    operate on bits, not bytes, and shouldn't do any cross-byte operations)
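
To illustrate what "operate on bits, not bytes" means here, the following is a minimal standalone sketch, not the arrow-rs implementation. The names bitwise_binary_sketch and bitwise_unary_sketch, their parameter order, and the helper functions are hypothetical; the sketch assumes LSB-first bit packing and a closure that receives up to 64 bits at a time as a u64. Because the bit offsets need not be byte aligned, the closure should only combine bits position by position (and, or, not, xor) and should not shift or otherwise move bits around.

```rust
/// Read bit `i` of an LSB-first bit-packed byte slice.
fn get_bit(bytes: &[u8], i: usize) -> bool {
    (bytes[i / 8] >> (i % 8)) & 1 == 1
}

/// Gather `n <= 64` bits starting at bit `start` into a u64 (bit 0 = first bit).
fn read_chunk(bytes: &[u8], start: usize, n: usize) -> u64 {
    (0..n).fold(0u64, |acc, i| acc | ((get_bit(bytes, start + i) as u64) << i))
}

/// Write the low `n` bits of `v` into `out` at bit `done` (a multiple of 64).
/// Bits past `n` are masked to zero so trailing padding stays clean.
fn write_chunk(out: &mut [u8], done: usize, n: usize, v: u64) {
    let v = if n < 64 { v & ((1u64 << n) - 1) } else { v };
    for (k, byte) in v.to_le_bytes().iter().enumerate().take((n + 7) / 8) {
        out[done / 8 + k] = *byte;
    }
}

/// Hypothetical stand-in for a "new buffer from a bitwise binary op" constructor.
fn bitwise_binary_sketch<F>(
    left: &[u8],
    left_offset_in_bits: usize,
    right: &[u8],
    right_offset_in_bits: usize,
    len_in_bits: usize,
    mut op: F,
) -> Vec<u8>
where
    F: FnMut(u64, u64) -> u64,
{
    let mut out = vec![0u8; (len_in_bits + 7) / 8];
    let mut done = 0;
    while done < len_in_bits {
        let n = (len_in_bits - done).min(64);
        let l = read_chunk(left, left_offset_in_bits + done, n);
        let r = read_chunk(right, right_offset_in_bits + done, n);
        write_chunk(&mut out, done, n, op(l, r));
        done += n;
    }
    out
}

/// Hypothetical stand-in for the unary variant.
fn bitwise_unary_sketch<F>(src: &[u8], offset_in_bits: usize, len_in_bits: usize, mut op: F) -> Vec<u8>
where
    F: FnMut(u64) -> u64,
{
    let mut out = vec![0u8; (len_in_bits + 7) / 8];
    let mut done = 0;
    while done < len_in_bits {
        let n = (len_in_bits - done).min(64);
        write_chunk(&mut out, done, n, op(read_chunk(src, offset_in_bits + done, n)));
        done += n;
    }
    out
}

fn main() {
    // Byte-aligned AND over 8 bits.
    let and = bitwise_binary_sketch(&[0b1010_1100], 0, &[0b0110_0101], 0, 8, |a, b| a & b);
    println!("{:#010b}", and[0]); // prints 0b00100100

    // NOT over 4 bits starting at bit offset 2; the result starts at bit 0.
    let not = bitwise_unary_sketch(&[0b0000_1111], 2, 4, |a| !a);
    println!("{:#010b}", not[0]); // prints 0b00001100
}
```

The output always starts at bit 0 regardless of the input offsets, which is why a closure that shifts or reorders bits would give results that depend on how the input happened to be aligned.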

Are these changes tested?

Yes, new doc tests

Are there any user-facing changes?

New APIs; the old helper functions are deprecated

@github-actions github-actions bot added the arrow Changes to the arrow crate label Nov 16, 2025
@alamb alamb force-pushed the alamb/bitwise_ops branch from 3c68505 to 69e68a1 Compare November 16, 2025 14:02
@alamb alamb force-pushed the alamb/bitwise_ops branch from 69e68a1 to d5a3604 Compare November 16, 2025 14:04

alamb commented Nov 16, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/bitwise_ops (d5a3604) to ca4a0ae diff
BENCH_NAME=boolean_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench boolean_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_bitwise_ops
Results will be posted here when complete


alamb commented Nov 16, 2025

🤖: Benchmark completed

group         alamb_bitwise_ops                      main
-----         -----------------                      ----
and           1.00    272.6±1.27ns        ? ?/sec    1.00    272.7±0.86ns        ? ?/sec
and_sliced    1.00   1096.3±7.89ns        ? ?/sec    1.00   1094.7±3.34ns        ? ?/sec
not           1.00    213.1±0.25ns        ? ?/sec    1.00    214.2±1.06ns        ? ?/sec
not_sliced    1.01    965.5±1.32ns        ? ?/sec    1.00    960.6±3.89ns        ? ?/sec
or            1.01    255.1±0.63ns        ? ?/sec    1.00    253.8±1.86ns        ? ?/sec
or_sliced     1.00   1228.0±7.56ns        ? ?/sec    1.00  1227.8±18.85ns        ? ?/sec


alamb commented Nov 16, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/bitwise_ops (d5a3604) to ca4a0ae diff
BENCH_NAME=buffer_bit_ops
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench buffer_bit_ops
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_bitwise_ops
Results will be posted here when complete


alamb commented Nov 16, 2025

🤖: Benchmark completed

group                                alamb_bitwise_ops                      main
-----                                -----------------                      ----
buffer_binary_ops/and                1.00    259.6±0.56ns    55.1 GB/sec    1.00    258.9±2.00ns    55.2 GB/sec
buffer_binary_ops/and_with_offset    1.12   1486.1±2.12ns     9.6 GB/sec    1.00   1322.8±9.40ns    10.8 GB/sec
buffer_binary_ops/or                 1.00    239.3±0.60ns    59.8 GB/sec    1.07    256.3±1.96ns    55.8 GB/sec
buffer_binary_ops/or_with_offset     1.00   1355.4±2.50ns    10.6 GB/sec    1.10  1484.8±14.40ns     9.6 GB/sec
buffer_unary_ops/not                 1.14    257.5±0.71ns    37.0 GB/sec    1.00    225.9±3.19ns    42.2 GB/sec
buffer_unary_ops/not_with_offset     1.00    868.1±2.51ns    11.0 GB/sec    1.34  1160.1±14.15ns     8.2 GB/sec
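
For readers unfamiliar with this table layout: the first number in each column appears to be the time relative to the faster of the two branches, so 1.00 marks the faster side. That reading is an assumption based on the numbers shown, not something the benchmark script states. A minimal sketch of the arithmetic, with a hypothetical helper name:

```rust
/// Hypothetical helper: express two timings relative to the faster one.
fn relative(branch_ns: f64, main_ns: f64) -> (f64, f64) {
    let fastest = branch_ns.min(main_ns);
    (branch_ns / fastest, main_ns / fastest)
}

fn main() {
    // buffer_binary_ops/and_with_offset: 1486.1 ns on the branch, 1322.8 ns on main
    let (branch, main_branch) = relative(1486.1, 1322.8);
    println!("{:.2} {:.2}", branch, main_branch); // prints "1.12 1.00"

    // buffer_unary_ops/not_with_offset: 868.1 ns on the branch, 1160.1 ns on main
    let (branch, main_branch) = relative(868.1, 1160.1);
    println!("{:.2} {:.2}", branch, main_branch); // prints "1.00 1.34"
}
```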


alamb commented Nov 16, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/bitwise_ops (d5a3604) to ca4a0ae diff
BENCH_NAME=boolean_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench boolean_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_bitwise_ops
Results will be posted here when complete


alamb commented Nov 16, 2025

🤖: Benchmark completed

group         alamb_bitwise_ops                      main
-----         -----------------                      ----
and           1.00    272.4±1.45ns        ? ?/sec    1.00    273.1±1.36ns        ? ?/sec
and_sliced    1.00   1096.0±1.60ns        ? ?/sec    1.00   1095.1±2.77ns        ? ?/sec
not           1.00    213.8±0.29ns        ? ?/sec    1.00    214.0±0.40ns        ? ?/sec
not_sliced    1.00    965.6±9.77ns        ? ?/sec    1.00    961.8±5.75ns        ? ?/sec
or            1.00    254.1±0.66ns        ? ?/sec    1.01    255.6±0.41ns        ? ?/sec
or_sliced     1.00   1225.5±2.12ns        ? ?/sec    1.00   1226.9±7.43ns        ? ?/sec


alamb commented Nov 16, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/bitwise_ops (d5a3604) to ca4a0ae diff
BENCH_NAME=buffer_bit_ops
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench buffer_bit_ops
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_bitwise_ops
Results will be posted here when complete


alamb commented Nov 16, 2025

🤖: Benchmark completed

group                                alamb_bitwise_ops                      main
-----                                -----------------                      ----
buffer_binary_ops/and                1.00    259.7±0.55ns    55.1 GB/sec    1.00    259.3±4.36ns    55.2 GB/sec
buffer_binary_ops/and_with_offset    1.13   1486.2±3.20ns     9.6 GB/sec    1.00   1320.5±3.78ns    10.8 GB/sec
buffer_binary_ops/or                 1.00    239.2±0.34ns    59.8 GB/sec    1.07    256.2±0.89ns    55.8 GB/sec
buffer_binary_ops/or_with_offset     1.00   1355.8±4.32ns    10.6 GB/sec    1.09   1483.7±4.32ns     9.6 GB/sec
buffer_unary_ops/not                 1.13    257.1±0.97ns    37.1 GB/sec    1.00    226.6±1.72ns    42.1 GB/sec
buffer_unary_ops/not_with_offset     1.00    863.6±3.06ns    11.0 GB/sec    1.32   1139.4±2.91ns     8.4 GB/sec
