Consolidate bitwise operation implementations #8806

@alamb

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I spent quite a while reading the code for various bitwise kernels while working
with @rluvaton on

I found there are many different, somewhat overlapping functions spread across the codebase.

This makes it:

  1. Hard to find the appropriate API
  2. Hard to optimize the right operations, as it is not always clear where
    the relevant code is located

I also think my experience in #8793 suggests there is significant room for optimization,
and that it is important to have two sets of functions: one that modifies a buffer in place and one that returns a new buffer.

I think the current state is partly due to history: some of these functions predate the Buffer APIs.

Describe the solution you'd like
I would like the implementation for the "core" APIs to be clear:

  1. Create a new Buffer from a unary / binary bitwise operation
  2. Apply a unary/binary bitwise operation to an existing Mutable buffer in place

After #8619 we have 2

Describe alternatives you've considered

Thus, I propose we do the following:

  1. Add new Buffer::bitwise_unary and Buffer::bitwise_binary functions (that do the same thing as bitwise_bin_op_helper and bitwise_unary_op_helper) but are easier to find and use, and consistently named with PrimitiveArray::unary and PrimitiveArray::binary functions
  2. Add BooleanArray::binary and BooleanArray::unary functions that use the new Buffer functions internally
  3. Deprecate bitwise_bin_op_helper and bitwise_unary_op_helper in favor of the new Buffer methods
  4. Deprecate special-purpose methods such as buffer_bin_or in favor of using the new Buffer methods directly
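
To make the shape of the proposed "create a new buffer" API concrete, here is a minimal, standalone sketch. It does not use the arrow crate: `new_from_unary` and `new_from_binary` are hypothetical stand-ins for the proposed Buffer::bitwise_unary / Buffer::bitwise_binary, a plain `Vec<u8>` stands in for Buffer, and the assumption that the closure sees bits packed LSB-first into u64 words is mine, not something this issue specifies.

```rust
/// Read `len` (<= 64) bits starting at bit `offset`, packed LSB-first into a u64.
/// Helper for this sketch only; a real kernel would read whole words at once.
fn read_bits(bytes: &[u8], offset: usize, len: usize) -> u64 {
    let mut out = 0u64;
    for i in 0..len {
        let bit = offset + i;
        if (bytes[bit / 8] >> (bit % 8)) & 1 == 1 {
            out |= 1u64 << i;
        }
    }
    out
}

/// Append the low `len` bits of `word` to `out`, zeroing any padding bits.
fn push_word(out: &mut Vec<u8>, word: u64, len: usize) {
    let masked = if len < 64 { word & ((1u64 << len) - 1) } else { word };
    out.extend_from_slice(&masked.to_le_bytes()[..(len + 7) / 8]);
}

/// Hypothetical stand-in for the proposed `Buffer::bitwise_unary`:
/// applies `op` to 64-bit chunks of the input and returns a NEW buffer.
pub fn new_from_unary<F>(input: &[u8], offset: usize, len: usize, mut op: F) -> Vec<u8>
where
    F: FnMut(u64) -> u64,
{
    let mut out = Vec::with_capacity((len + 7) / 8);
    let mut done = 0;
    while done < len {
        let n = (len - done).min(64);
        push_word(&mut out, op(read_bits(input, offset + done, n)), n);
        done += n;
    }
    out
}

/// Hypothetical stand-in for the proposed `Buffer::bitwise_binary`:
/// combines two bit ranges (each with its own bit offset) into a NEW buffer.
pub fn new_from_binary<F>(
    left: &[u8],
    left_offset: usize,
    right: &[u8],
    right_offset: usize,
    len: usize,
    mut op: F,
) -> Vec<u8>
where
    F: FnMut(u64, u64) -> u64,
{
    let mut out = Vec::with_capacity((len + 7) / 8);
    let mut done = 0;
    while done < len {
        let n = (len - done).min(64);
        let l = read_bits(left, left_offset + done, n);
        let r = read_bits(right, right_offset + done, n);
        push_word(&mut out, op(l, r), n);
        done += n;
    }
    out
}
```

With this shape, a caller would write something like `new_from_binary(a, 0, b, 0, len, |x, y| x & y)` rather than hunting for a dedicated AND helper, which is the discoverability point made above.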

Then we'll basically have two core APIs:

  1. Apply bitwise operations in place (via apply_bitwise_unary / apply_bitwise_binary, introduced in #8619, "feat: add apply_unary_op and apply_binary_op bitwise operations")
  2. Create a new buffer with the result of bitwise operations (via Buffer::bitwise_unary / Buffer::bitwise_binary)

All other APIs will then be thin wrappers around the core APIs, and we can focus optimization and testing efforts on these two core APIs.
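
The layering described above can be sketched for the in-place side as follows. This is a standalone illustration, not the code from #8619: `apply_unary_in_place` and `apply_binary_in_place` are hypothetical stand-ins for the in-place core functions, `&mut [u8]` stands in for a mutable buffer, and `and_in_place`, `or_in_place`, and `not_in_place` are made-up names showing how special-purpose helpers could reduce to one-line wrappers.

```rust
/// Read `len` (<= 64) bits starting at bit `offset`, packed LSB-first into a u64.
fn read_bits(bytes: &[u8], offset: usize, len: usize) -> u64 {
    let mut out = 0u64;
    for i in 0..len {
        let bit = offset + i;
        if (bytes[bit / 8] >> (bit % 8)) & 1 == 1 {
            out |= 1u64 << i;
        }
    }
    out
}

/// Write the low `len` (<= 64) bits of `word` at bit `offset`,
/// leaving every bit outside that range untouched.
fn write_bits(bytes: &mut [u8], offset: usize, len: usize, word: u64) {
    for i in 0..len {
        let bit = offset + i;
        let mask = 1u8 << (bit % 8);
        if (word >> i) & 1 == 1 {
            bytes[bit / 8] |= mask;
        } else {
            bytes[bit / 8] &= !mask;
        }
    }
}

/// Hypothetical in-place core: target[range] = op(target[range]).
pub fn apply_unary_in_place<F>(target: &mut [u8], offset: usize, len: usize, mut op: F)
where
    F: FnMut(u64) -> u64,
{
    let mut done = 0;
    while done < len {
        let n = (len - done).min(64);
        let word = op(read_bits(target, offset + done, n));
        write_bits(target, offset + done, n, word);
        done += n;
    }
}

/// Hypothetical in-place core: target[range] = op(target[range], other[range]).
pub fn apply_binary_in_place<F>(
    target: &mut [u8],
    target_offset: usize,
    other: &[u8],
    other_offset: usize,
    len: usize,
    mut op: F,
) where
    F: FnMut(u64, u64) -> u64,
{
    let mut done = 0;
    while done < len {
        let n = (len - done).min(64);
        let t = read_bits(target, target_offset + done, n);
        let o = read_bits(other, other_offset + done, n);
        write_bits(target, target_offset + done, n, op(t, o));
        done += n;
    }
}

// Thin wrappers: all the real work (and any future optimization) lives in the core.
pub fn and_in_place(target: &mut [u8], other: &[u8], len: usize) {
    apply_binary_in_place(target, 0, other, 0, len, |a, b| a & b);
}

pub fn or_in_place(target: &mut [u8], other: &[u8], len: usize) {
    apply_binary_in_place(target, 0, other, 0, len, |a, b| a | b);
}

pub fn not_in_place(target: &mut [u8], len: usize) {
    apply_unary_in_place(target, 0, len, |a| !a);
}
```

The bit-by-bit helpers here are deliberately naive. The point of the layering is that replacing them with word-at-a-time reads and writes inside the two core functions would speed up every wrapper at once.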

Additional context

Labels: arrow, enhancement