@andygrove commented Nov 5, 2025

Which issue does this PR close?

Closes #666

Rationale for this change

This PR is based on #2595

What changes are included in this PR?

The differences since #2595 are:

  • Revert test changes
  • Implement new tests
  • Fall back to Spark for unsupported types DayTimeIntervalType and YearMonthIntervalType
  • Fix a potential bug with null scalar handling (although I am not sure if this code path even gets used with the Spark integration)

How are these changes tested?

@andygrove marked this pull request as ready for review November 5, 2025 02:28
Comment on lines -1433 to -1434
// https://github.com/apache/datafusion-comet/issues/666
ignore("abs") {
Member Author

I implemented new tests using the fuzz data generator

@andygrove
Member Author

@hsiang-c could you review?

@codecov-commenter commented Nov 5, 2025

Codecov Report

❌ Patch coverage is 87.50000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.32%. Comparing base (f09f8af) to head (a8c1d0a).
⚠️ Report is 661 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...k/src/main/scala/org/apache/comet/serde/math.scala | 85.71% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files:
@@             Coverage Diff              @@
##               main    #2689      +/-   ##
============================================
+ Coverage     56.12%   59.32%   +3.20%     
- Complexity      976     1448     +472     
============================================
  Files           119      147      +28     
  Lines         11743    13816    +2073     
  Branches       2251     2370     +119     
============================================
+ Hits           6591     8197    +1606     
- Misses         4012     4393     +381     
- Partials       1140     1226      +86     


| ScalarValue::UInt32(_)
| ScalarValue::UInt64(_) => Ok(args[0].clone()),
ScalarValue::Int8(a) => match a {
None => Ok(args[0].clone()),
Contributor

Good catch, thanks!

#[test]
fn test_abs_i8_scalar() {
with_fail_on_error(|fail_on_error| {
let args = ColumnarValue::Scalar(ScalarValue::Int8(Some(i8::MIN)));
Contributor

We might need a scalar test for the None case.
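
The `i8::MIN` input in the test above is the interesting case because its absolute value (128) does not fit in an `i8`. The following is a minimal sketch of how the null case and the overflow case could be handled together. It assumes that `fail_on_error = true` should surface overflow as an error and that `false` should wrap, returning `i8::MIN` unchanged; `abs_i8` is a hypothetical helper using plain `Option<i8>`, not the code in this PR.

```rust
/// Hypothetical helper, not taken from the PR: absolute value of an optional i8.
/// Assumption: `fail_on_error = true` reports overflow as an error, while
/// `false` wraps, so `i8::MIN` is returned unchanged.
fn abs_i8(value: Option<i8>, fail_on_error: bool) -> Result<Option<i8>, String> {
    match value {
        // A null input stays null (the None case mentioned above).
        None => Ok(None),
        Some(v) => match v.checked_abs() {
            // The common case: the result fits in an i8.
            Some(abs) => Ok(Some(abs)),
            // Only i8::MIN reaches these two arms.
            None if fail_on_error => Err(format!("abs overflow for value {}", v)),
            // wrapping_abs(i8::MIN) == i8::MIN
            None => Ok(Some(v.wrapping_abs())),
        },
    }
}
```

A scalar test for the None case would then only need to check that `abs_i8(None, _)` returns `Ok(None)` in both modes.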

case _: NumericType =>
Compatible()
case _ =>
// Spark supports NumericType, DayTimeIntervalType, and YearMonthIntervalType
Contributor

LGTM

StructField("c1", DataTypes.ShortType, nullable = true),
StructField("c2", DataTypes.IntegerType, nullable = true),
StructField("c3", DataTypes.LongType, nullable = true),
StructField("c4", DataTypes.FloatType, nullable = true),
Contributor

Are NaNs included in the test data?

Member Author

Yes, the fuzz generator can optionally generate edge cases such as NaN, Infinity, and -0.0.

Contributor

@hsiang-c left a comment

LGTM, thanks Andy.

f.dataType == DataTypes.FloatType || f.dataType == DataTypes.DoubleType)) {
val col = field.name
checkSparkAnswerAndOperator(
s"SELECT $col, abs($col) FROM tbl WHERE signum($col) < 0 ORDER BY $col")
Member

If the value of $col is -0.0 (a negative zero) then signum($col) would return 0.
Just checking whether this is the intended behavior because the test is about negative zero specifically.
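
The point can be illustrated outside of Spark: under IEEE 754, negative zero compares equal to positive zero, so a "less than zero" filter never selects it, and the sign bit has to be inspected instead. A small sketch with hypothetical helper names follows. It deliberately avoids Rust's own `f64::signum`, which returns -1.0 for -0.0 and so would not mirror the behavior described in this comment.

```rust
/// Returns true only for IEEE 754 negative zero.
fn is_negative_zero(x: f64) -> bool {
    // -0.0 == 0.0 is true, so the sign bit is what distinguishes the two zeros.
    x == 0.0 && x.is_sign_negative()
}

/// Mirrors a "value < 0" style filter: negative zero is NOT selected,
/// because -0.0 compares equal to 0.0.
fn less_than_zero(x: f64) -> bool {
    x < 0.0
}
```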

Member Author

Thanks @martin-g. I updated the test. Results:

+----+-------+
|c4  |abs(c4)|
+----+-------+
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
|-0.0|0.0    |
+----+-------+

Ok(())
}
Err(e) => {
if fail_on_error {
Member

Is it possible that it would overflow for unsigned integers?
IMO it should always panic here, i.e. leave only the else clause body.

Member Author

updated, thanks


fn with_fail_on_error<F: Fn(bool) -> Result<()>>(test_fn: F) {
for fail_on_error in [true, false] {
let _ = test_fn(fail_on_error);
Member

This would ignore the returned Result.

Suggested change
- let _ = test_fn(fail_on_error);
+ let _ = test_fn(fail_on_error)?;

Member Author

Fixed. I couldn't add ? but added .expect
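
For context, `?` is not available here because `with_fail_on_error` returns `()` rather than a `Result`, so there is nothing to propagate the error into. A minimal sketch of the `.expect` approach is shown below; `TestResult` is a stand-in alias assumed for this example (the real tests use their own `Result` type), and the panic message is illustrative.

```rust
/// Stand-in for the `Result` alias used by the tests; an assumption for this sketch.
type TestResult = std::result::Result<(), String>;

/// Runs the closure once per mode. The function returns `()`, so `?` cannot be
/// used inside it; `expect` turns an unexpected `Err` into a test failure instead
/// of silently discarding it.
fn with_fail_on_error<F: Fn(bool) -> TestResult>(test_fn: F) {
    for fail_on_error in [true, false] {
        test_fn(fail_on_error).expect("test should pass in both modes");
    }
}
```

With this shape, a closure that returns `Err` makes the surrounding test panic, whereas `let _ = ...` would have let it pass.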

false
};

match &args[0] {
Member

There is no check that the function is called with at least one argument.

Member Author

added
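
The kind of guard being discussed could look roughly like the sketch below. The `Arg` enum is a hypothetical stand-in for the real argument type, and the error message is illustrative; the check actually added in the PR may differ.

```rust
/// Hypothetical stand-in for a columnar argument; not the real argument type.
#[derive(Debug, Clone, PartialEq)]
enum Arg {
    Int32(Option<i32>),
}

/// Validates the argument count before indexing, so an empty slice yields an
/// error rather than an out-of-bounds panic on `args[0]`.
fn first_arg(args: &[Arg]) -> Result<&Arg, String> {
    args.first()
        .ok_or_else(|| "abs expects at least one argument, got 0".to_string())
}
```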

},
},
ScalarValue::Int16(a) => match a {
None => Ok(args[0].clone()),
Member

nit: The code for handling Int16/32/64 is very similar to Int8.
It could be extracted to a declarative macro.
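
As a rough illustration of that suggestion, the per-variant null and overflow handling can be written once in a `macro_rules!` macro and expanded for each integer width. Everything here is a sketch under assumptions: `Scalar` is a hypothetical stand-in enum, `abs_signed!` and `abs_scalar` are invented names, and the wrap-on-overflow behavior when `fail_on_error` is false is assumed rather than taken from the PR.

```rust
/// Hypothetical stand-in for the signed integer scalar variants.
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    Int8(Option<i8>),
    Int16(Option<i16>),
    Int32(Option<i32>),
    Int64(Option<i64>),
}

/// Expands to the shared null / overflow handling for one integer variant.
macro_rules! abs_signed {
    ($variant:ident, $value:expr, $fail_on_error:expr) => {
        match $value {
            // Null in, null out.
            None => Ok(Scalar::$variant(None)),
            Some(v) => match v.checked_abs() {
                Some(abs) => Ok(Scalar::$variant(Some(abs))),
                // Only the minimum value of each type overflows.
                None if $fail_on_error => Err(format!("abs overflow for {}", v)),
                None => Ok(Scalar::$variant(Some(v.wrapping_abs()))),
            },
        }
    };
}

/// Dispatches on the variant; each arm is a single macro invocation.
fn abs_scalar(value: &Scalar, fail_on_error: bool) -> Result<Scalar, String> {
    match value {
        Scalar::Int8(a) => abs_signed!(Int8, *a, fail_on_error),
        Scalar::Int16(a) => abs_signed!(Int16, *a, fail_on_error),
        Scalar::Int32(a) => abs_signed!(Int32, *a, fail_on_error),
        Scalar::Int64(a) => abs_signed!(Int64, *a, fail_on_error),
    }
}
```

The trade-off is the usual one for declarative macros: less duplication across the four arms, at the cost of slightly harder-to-read error messages when the macro body fails to compile.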

#[test]
fn test_abs_f32_array() {
with_fail_on_error(|fail_on_error| {
let input = Float32Array::from(vec![Some(-1f32), Some(f32::MIN), Some(f32::MAX), None]);
Contributor

Can we also have NaN and infinities?

Member Author

added
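
For reference, the expected results for these inputs follow directly from the standard library's `f32::abs`: NaN stays NaN, both infinities map to positive infinity, and nulls pass through. The sketch below uses a plain slice of `Option<f32>` instead of an Arrow array, and `abs_all` is a hypothetical helper. Because NaN never compares equal to itself, a test has to check it with `is_nan()` rather than with an equality assertion.

```rust
/// Applies `abs` element-wise, keeping nulls (None) as nulls.
/// Hypothetical helper operating on a plain slice rather than an Arrow array.
fn abs_all(values: &[Option<f32>]) -> Vec<Option<f32>> {
    values.iter().map(|v| v.map(f32::abs)).collect()
}
```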

andygrove and others added 4 commits November 5, 2025 12:40
Co-authored-by: Oleks V <comphead@users.noreply.github.com>
Co-authored-by: Oleks V <comphead@users.noreply.github.com>
Co-authored-by: Oleks V <comphead@users.noreply.github.com>
}

#[test]
fn test_abs_f32_array() {
Contributor

Would be nice to have this edge case

scala> spark.sql("select abs(-0.0), abs(0.0)").show(false)
+--------+--------+
|abs(0.0)|abs(0.0)|
+--------+--------+
|0.0     |0.0     |
+--------+--------+

Because Rust

fn main() {
    println!("{}", -0f32.abs());
}

-0
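
One caveat on the Rust snippet above: method calls bind tighter than unary minus, so `-0f32.abs()` is parsed as `-(0f32.abs())`, and the printed `-0` comes from the negation rather than from `abs` itself. With parentheses, `abs` on negative zero clears the sign bit. The edge case is still worth testing, but the sketch below (with a hypothetical function name) shows the two parses side by side.

```rust
/// Returns (`-0f32.abs()`, `(-0f32).abs()`) to contrast the two parses.
fn negative_zero_abs() -> (f32, f32) {
    // Method calls bind tighter than unary minus, so this is -(0f32.abs()).
    let negated_after_abs = -0f32.abs();
    // Parentheses make `abs` apply to negative zero itself.
    let abs_of_negative_zero = (-0f32).abs();
    (negated_after_abs, abs_of_negative_zero)
}
```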

Member Author

Added. We also have a specific Spark test for the negative zero case.

Contributor

@comphead left a comment

Thanks @andygrove and @hsiang-c for supporting abs

@andygrove merged commit fc3e6e9 into apache:main Nov 6, 2025
175 of 178 checks passed
@andygrove deleted the enable_abs branch November 6, 2025 00:17

Development

Successfully merging this pull request may close these issues.

abs returns incorrect value in some cases

5 participants