Skip to content

Conversation

@kazantsev-maksim
Copy link
Contributor

Which issue does this PR close?

Closes #18225

Rationale for this change

After adding the bit_count function in Comet, we got different results from Spark. (apache/datafusion-comet#2553)

Are these changes tested?

Tested with existing unit tests

@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) spark labels Oct 28, 2025
}

// Here’s the equivalent Rust implementation of the bitCount function (similar to Apache Spark's bitCount for LongType)
fn bit_count(i: i64) -> i32 {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we enhance the comment here with this link + potentially link to original source code this actual function is copied from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, added

let value_array = value_array[0].as_ref();
match value_array.data_type() {
DataType::Int8 => {
DataType::Int8 | DataType::Boolean => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spark supports only signed int types

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment seems misplaced as the code adds support for boolean 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think this code will now panic if you pass in a Boolean array

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to note that Spark only supports signed integer and boolean types as input.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an additional test for BooleanArray.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

spark sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bit_count in spark create not fuly compatible with spark

3 participants