-
Notifications
You must be signed in to change notification settings - Fork 1.8k
fix: Support Dictionary[Int32, Binary] for bitmap count spark function #18273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Jefffrey
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to consider other key & value types? Since this only implements for i32 x Binary
| Dictionary(k, v) if k.as_ref() == &DataType::Int32 && v.as_ref() == &Binary => { | ||
| let dict_array = as_dictionary_array::<Int32Type>(input_array); | ||
| let binary_array = as_binary_array(dict_array.values())?; | ||
|
|
||
| let result: Int64Array = dict_array | ||
| .keys() | ||
| .iter() | ||
| .map(|key| { | ||
| key.and_then(|k| { | ||
| binary_count_ones(Some(binary_array.value(k as usize))) | ||
| }) | ||
| }) | ||
| .collect(); | ||
|
|
||
| Ok(result) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Dictionary(k, v) if k.as_ref() == &DataType::Int32 && v.as_ref() == &Binary => { | |
| let dict_array = as_dictionary_array::<Int32Type>(input_array); | |
| let binary_array = as_binary_array(dict_array.values())?; | |
| let result: Int64Array = dict_array | |
| .keys() | |
| .iter() | |
| .map(|key| { | |
| key.and_then(|k| { | |
| binary_count_ones(Some(binary_array.value(k as usize))) | |
| }) | |
| }) | |
| .collect(); | |
| Ok(result) | |
| } | |
| Dictionary(k, v) if k.as_ref() == &DataType::Int32 && v.as_ref() == &Binary => { | |
| let dict_array = as_dictionary_array::<Int32Type>(input_array); | |
| let array = dict_array.downcast_dict::<BinaryArray>().unwrap(); | |
| Ok(array | |
| .into_iter() | |
| .map(binary_count_ones) | |
| .collect::<Int64Array>()) | |
| } |
Using TypedDictionaryArray
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed.
| } | ||
|
|
||
| #[test] | ||
| fn test_dictionary_encoded_bitmap_count_invoke() -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add this as an SLT instead? Can make use of arrow_cast(value, 'Dictionary(...)') to ensure the values are dictionaries
|
Thanks @kazantsev-maksim |
apache#18273) ## Which issue does this PR close? Closes apache#18058 ## Rationale for this change When adding the bitmap_count function to Comet, we get the following error - org.apache.comet.CometNativeException: Error from DataFusion: bitmap_count expects Binary/BinaryView/FixedSizeBinary/LargeBinary as argument, got Dictionary(Int32, Binary). ## Are these changes tested? Added new UT --------- Co-authored-by: Kazantsev Maksim <mn.kazantsev@gmail.com>
apache#18273) ## Which issue does this PR close? Closes apache#18058 ## Rationale for this change When adding the bitmap_count function to Comet, we get the following error - org.apache.comet.CometNativeException: Error from DataFusion: bitmap_count expects Binary/BinaryView/FixedSizeBinary/LargeBinary as argument, got Dictionary(Int32, Binary). ## Are these changes tested? Added new UT --------- Co-authored-by: Kazantsev Maksim <mn.kazantsev@gmail.com>
Which issue does this PR close?
Closes #18058
Rationale for this change
When adding the bitmap_count function to Comet, we get the following error - org.apache.comet.CometNativeException: Error from DataFusion: bitmap_count expects Binary/BinaryView/FixedSizeBinary/LargeBinary as argument, got Dictionary(Int32, Binary).
Are these changes tested?
Added new UT