Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
This came up in the context of this PR in DataFusion:
In that case we are applying some operations to a PrimitiveArray and would like to reuse the allocation if possible.
However, the current API of PrimitiveArray::unary_mut and similar functions makes this awkward, because the caller must handle the case where the allocation cannot be reused:
    // want to apply an operation to arr, reusing the allocation if possible
    let arr: PrimitiveArray<UInt64Type> = ...;
    // to do so we call unary_mut, but must also handle the case where the allocation is shared
    let new_arr = match arr.unary_mut(|a| a + 1) {
        Ok(arr) => arr,
        Err(old_arr) => old_arr.unary(|a| a + 1),
    };

This can be done, but it is hard to use.
I proposed the following function in DataFusion:

    /// Applies the unary operation in place if possible, or clones the array if not
    fn try_unary_mut_or_clone<F>(
        array: PrimitiveArray<Int64Type>,
        op: F,
    ) -> Result<PrimitiveArray<Int64Type>>
    where
        F: Fn(i64) -> Result<i64>,
    {
        match array.try_unary_mut(&op) {
            Ok(result) => result,
            // the allocation is shared, so make a new array
            Err(array) => array.try_unary(op),
        }
    }

but quoting @findepi on https://github.com/apache/datafusion/pull/18360/files#r2475557450:
can this be made more flexible with a more generous use of generics?
perhaps it could even be in arrow-rs. it makes try_unary_mut significantly more approachable
Describe the solution you'd like
I would like it to be easier to apply unary and binary operations on PrimitiveArrays and reuse the allocation if possible.
Describe alternatives you've considered
One alternative would be to follow the API of Arc::unwrap_or_clone
So that would mean functions something like:

    PrimitiveArray::unary_mut_or_clone
    PrimitiveArray::try_unary_mut_or_clone
    PrimitiveArray::binary_mut_or_clone
    PrimitiveArray::try_binary_mut_or_clone
These would be implemented like the function above.
I think this would make it much easier to use these APIs.
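
To make the intended semantics concrete, here is a minimal std-only sketch of the "mutate in place if unique, otherwise allocate" pattern. `ToyArray` and its methods are hypothetical stand-ins written for this illustration and are not the arrow-rs types; the sketch only assumes that the underlying buffer is reference counted, and uses `Arc::get_mut` to detect sharing.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a primitive array: the values live in a
/// reference-counted buffer that may be shared between several arrays.
#[derive(Debug, Clone)]
struct ToyArray<T> {
    values: Arc<Vec<T>>,
}

impl<T: Copy> ToyArray<T> {
    fn new(values: Vec<T>) -> Self {
        Self { values: Arc::new(values) }
    }

    /// Always allocates a new buffer and leaves `self` untouched.
    fn unary<F: Fn(T) -> T>(&self, op: F) -> Self {
        Self::new(self.values.iter().map(|v| op(*v)).collect())
    }

    /// Mutates in place when the buffer is uniquely owned; otherwise
    /// hands the unchanged array back as `Err`.
    fn unary_mut<F: Fn(T) -> T>(mut self, op: F) -> Result<Self, Self> {
        match Arc::get_mut(&mut self.values) {
            Some(values) => {
                values.iter_mut().for_each(|v| *v = op(*v));
                Ok(self)
            }
            None => Err(self),
        }
    }

    /// Sketch of the proposed helper: in place if possible, new buffer if not.
    fn unary_mut_or_clone<F: Fn(T) -> T>(self, op: F) -> Self {
        match self.unary_mut(&op) {
            Ok(array) => array,
            Err(array) => array.unary(op),
        }
    }
}

fn main() {
    // Uniquely owned buffer: the allocation is reused.
    let unique = ToyArray::new(vec![1u64, 2, 3]);
    let ptr_before = unique.values.as_ptr();
    let out = unique.unary_mut_or_clone(|a| a + 1);
    assert_eq!(*out.values, vec![2, 3, 4]);
    assert_eq!(out.values.as_ptr(), ptr_before);

    // Shared buffer: falls back to a fresh allocation, and the other
    // handle still sees the original values.
    let shared = ToyArray::new(vec![1u64, 2, 3]);
    let other = shared.clone();
    let out = shared.unary_mut_or_clone(|a| a + 1);
    assert_eq!(*out.values, vec![2, 3, 4]);
    assert_eq!(*other.values, vec![1, 2, 3]);
}
```

With a helper like this the caller no longer writes the `match` at every call site, which is the ergonomic gain the proposed `*_mut_or_clone` functions are after.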