Add isnan and iszero #7274

sarutak · 2023-08-12T18:47:22Z

Which issue does this PR close?

Rationale for this change

In DataFusion, +NaN and -NaN are distinguished from each other, as are +0.0 and -0.0.
For example, asin(10) returns NaN, but 'inf'::DOUBLE / 'inf'::DOUBLE returns -NaN.
Similarly, 3.0 * 0 returns 0.0, but -3.0 * 0 returns -0.0.

So users currently need to check for NaN (or 0.0) of either sign explicitly, like this:

SELECT ... WHERE x = 'NaN'::DOUBLE OR x = -'NaN'::DOUBLE;

I think it would be nice to avoid such boilerplate.
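The sign distinction described above is visible at the bit level of an IEEE 754 double. The following is a minimal Rust sketch, not code from this PR: the constant names and helper functions are hypothetical and exist only for illustration. It builds one NaN with the sign bit clear and one with it set, and shows that a negative number times zero yields -0.0.

```rust
/// Bit pattern of a quiet NaN with the sign bit clear (an illustrative choice of payload).
pub const POS_QUIET_NAN_BITS: u64 = 0x7ff8_0000_0000_0000;
/// The same NaN payload with the sign bit set.
pub const NEG_QUIET_NAN_BITS: u64 = 0xfff8_0000_0000_0000;

/// Returns true if the IEEE 754 sign bit of `x` is set.
/// Unlike `x < 0.0`, this also gives a meaningful answer for NaN and -0.0.
pub fn sign_bit_set(x: f64) -> bool {
    (x.to_bits() >> 63) == 1
}

/// Multiplying a negative number by +0.0 yields -0.0 under IEEE 754.
pub fn negative_times_zero() -> f64 {
    -3.0_f64 * 0.0
}
```

Both NaN bit patterns satisfy `f64::is_nan`, and `negative_times_zero()` compares equal to `0.0` under `==` even though its sign bit is set, which is why a sign-insensitive predicate is convenient.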

What changes are included in this PR?

This PR adds isnan and iszero, which test whether a value is NaN or 0.0, respectively, regardless of its sign.
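The intended semantics can be sketched in plain Rust as follows. This is an illustration of the behavior described above, not the implementation in this PR. The function names mirror the SQL names, while `isnan_column` is a hypothetical helper that assumes NULL inputs produce NULL outputs (the usual SQL convention, which the description above does not confirm).

```rust
/// True for any NaN, whatever its sign bit or payload.
pub fn isnan(x: f64) -> bool {
    x.is_nan()
}

/// True for both +0.0 and -0.0, because IEEE 754 equality ignores the sign of zero.
pub fn iszero(x: f64) -> bool {
    x == 0.0
}

/// Hypothetical column-wise variant: `None` models a SQL NULL.
/// Assumption: a NULL input yields a NULL output.
pub fn isnan_column(values: &[Option<f64>]) -> Vec<Option<bool>> {
    values.iter().map(|v| v.map(isnan)).collect()
}
```

With predicates like these, the two-sided comparison in the WHERE clause shown earlier collapses to a single call.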

Are these changes tested?

Added new tests.

Are there any user-facing changes?

Yes, but they don't break compatibility.

viirya · 2023-08-12T19:31:41Z

docs/source/user-guide/expressions.md

+| isnan(x)              | predicate determining whether NaN or not          |
+| iszero(x)             | predicate determining whether 0.0 or not          |


viirya · 2023-08-12T22:17:41Z

datafusion/expr/src/built_in_function.rs

+            | BuiltinScalarFunction::Isnan
+            | BuiltinScalarFunction::Iszero => {


I don't think Isnan and Iszero are math expressions like the others listed here. The reason for giving f64 high priority doesn't apply to these two expressions either, I think.

For that reason, it may be less confusing to put them in a separate pattern.

sarutak · 2023-08-14

Sorry for the late reply.
Yes, f64 doesn't need high priority.
I've changed it.

viirya

Looks good to me. Spark has these two expressions too, although it seems not all similar systems support them.

jimexist · 2023-08-14T16:08:20Z

datafusion/expr/src/expr_fn.rs

+    Isnan,
+    isnan,
+    num,
+    "returns true if a given number is +NaN or -NaN otherwise returns false"


I didn't know there were +NaN and -NaN. I thought there was only one kind of NaN and that NaNs are unordered, unlike +0 and -0, which are distinct.

tustvold · 2023-08-14

Any value with an exponent field of all 1s and a non-zero mantissa is a NaN, so there are 2^N - 1 distinct values of NaN, where N is the number of mantissa bits. The same is true of -NaN.

Within arrow-rs we follow the IEEE 754 total order predicate, which establishes an ordering for NaNs (and infinities, etc.).
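To make the two points above concrete, here is a small Rust sketch using only the standard library. It is not arrow-rs code, and the constant and function names are hypothetical. `is_nan_bits` encodes the rule that a double is a NaN when its exponent field is all 1s and its mantissa is non-zero (an all-ones exponent with a zero mantissa encodes infinity). `sort_total_order` uses `f64::total_cmp`, the standard library's implementation of the IEEE 754 total order, under which negative NaNs sort first, positive NaNs sort last, and -0.0 sorts before +0.0.

```rust
use std::cmp::Ordering;

/// Illustrative quiet-NaN bit patterns with the sign bit clear and set.
pub const QNAN_POS: u64 = 0x7ff8_0000_0000_0000;
pub const QNAN_NEG: u64 = 0xfff8_0000_0000_0000;

/// A 64-bit pattern is a NaN when the 11 exponent bits are all 1s
/// and the 52 mantissa bits are not all 0s. The sign bit is ignored.
pub fn is_nan_bits(bits: u64) -> bool {
    const EXP_MASK: u64 = 0x7ff0_0000_0000_0000;
    const MANT_MASK: u64 = 0x000f_ffff_ffff_ffff;
    (bits & EXP_MASK) == EXP_MASK && (bits & MANT_MASK) != 0
}

/// Sorts values using the IEEE 754 total order instead of `partial_cmp`,
/// so NaNs and signed zeros all get a well-defined position.
pub fn sort_total_order(mut values: Vec<f64>) -> Vec<f64> {
    values.sort_by(|a, b| a.total_cmp(b));
    values
}

/// Compares -0.0 and +0.0 both ways: with `==` and with the total order.
pub fn zeros_compare() -> (bool, Ordering) {
    (-0.0_f64 == 0.0_f64, (-0.0_f64).total_cmp(&0.0))
}
```

The design point is that ordinary comparison treats NaN as unordered and the two zeros as equal, while the total order assigns every bit pattern a position, which is what a sort or a grouping key needs.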

sarutak · 2023-08-15T00:49:42Z

This CI failure seems to be caused by #7270. I opened #7286 to fix it.

alamb · 2023-08-15T18:06:07Z

docs/source/user-guide/sql/scalar_functions.md

@@ -283,6 +285,32 @@ gcd(expression_x, expression_y)
 - **expression_y**: Second numeric expression to operate on.
  Can be a constant, column, or function, and any combination of arithmetic operators.

+### `isnan`


alamb · 2023-08-15T18:06:59Z

I took the liberty of merging this branch up to resolve some conflicts.

I think it can be merged when the CI passes.

Thank you @sarutak

liukun4515 · 2023-08-16T08:24:03Z

datafusion/sqllogictest/test_files/math.slt

@@ -99,3 +99,15 @@ query RRR
 SELECT nanvl(asin(10), 1.0), nanvl(1.0, 2.0), nanvl(asin(10), asin(10))
 ----
 1 1 NaN
+
+# isnan
+query BBBB


liukun4515

LGTM, will merge it.

Add isnan and iszero.

944f6b6

github-actions bot added the logical-expr, physical-expr, core, and sqllogictest labels Aug 12, 2023

viirya reviewed Aug 12, 2023

Modified doc.

0e18b21

viirya reviewed Aug 12, 2023

viirya approved these changes Aug 12, 2023

Dandandan approved these changes Aug 14, 2023

jimexist reviewed Aug 14, 2023

sarutak added 2 commits August 15, 2023 07:17

f64 doesn't need high priority.

e5a986b

Merge branch 'main' of https://github.com/apache/arrow-datafusion int…

ba1b83b

…o nan-zero

sarutak and others added 2 commits August 15, 2023 16:16

Merge branch 'main' of https://github.com/apache/arrow-datafusion int…

6148c56

…o nan-zero

Merge remote-tracking branch 'apache/main' into nan-zero

4b342e1

alamb reviewed Aug 15, 2023

liukun4515 reviewed Aug 16, 2023

liukun4515 approved these changes Aug 16, 2023

liukun4515 merged commit cf152af into apache:main Aug 16, 2023
