-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add isnan and iszero #7274
Add isnan and iszero #7274
Conversation
| isnan(x) | predicate determining whether NaN or not | | ||
| iszero(x) | predicate determining whether 0.0 or not | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| isnan(x) | predicate determining whether NaN or not | | |
| iszero(x) | predicate determining whether 0.0 or not | | |
| isnan(x) | predicate determining whether NaN/-NaN or not | | |
| iszero(x) | predicate determining whether 0.0/-0.0 or not | |
| BuiltinScalarFunction::Isnan | ||
| BuiltinScalarFunction::Iszero => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would consider Isnan
and Iszero
not math expressions as listed here. The reason to give f64 high priority is also not applied too for these two expressions, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on above reason, maybe it is less confused to have them in a separate pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late reply.
Yes, f64 doesn't need high priority.
I've changed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. They are also in Spark expressions too, although seems not all similar systems support these two expressions.
Isnan, | ||
isnan, | ||
num, | ||
"returns true if a given number is +NaN or -NaN otherwise returns false" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i didn't know there are +NaN
and -NaN
. i thought there's only one type of NaN and (they) are unordered. unlike +0 and -0 which are distinct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any value with an exponent field of all 1s is a Nan, and so there are 2^N distinct values of NaN
, where N is the number of mantissa bits. The same is true of -NaN
Within arrow-rs we follow the IEEE 754 total order predicate which establishes an ordering for NaNs, (and infinity, etc...)
@@ -283,6 +285,32 @@ gcd(expression_x, expression_y) | |||
- **expression_y**: Second numeric expression to operate on. | |||
Can be a constant, column, or function, and any combination of arithmetic operators. | |||
|
|||
### `isnan` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
I took the liberty of merging this branch up to resolve some conflicts. I think it can be merged when the CI passes. Thank you @sarutak |
@@ -99,3 +99,15 @@ query RRR | |||
SELECT nanvl(asin(10), 1.0), nanvl(1.0, 2.0), nanvl(asin(10), asin(10)) | |||
---- | |||
1 1 NaN | |||
|
|||
# isnan | |||
query BBBB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, will merge it.
Which issue does this PR close?
Closes #7273
Rationale for this change
In Datafusion,
+NaN
and-NaN
are distinguished as well as+0.0
and-0.0
.For example,
asin(10)
returnsNaN
but'inf'::DOUBLE / 'inf'::DOUBLE
returns-NaN
.3.0 * 0
returns0.0
but-3.0 * 0
returns-0.0
.So, users need to determine
NaN
and0.0
like as follows.I think it's nice to avoid such boilerplate code.
What changes are included in this PR?
This PR aims to add
isnan
andiszero
which determineNaN
and0.0
ignoring their sign.Are these changes tested?
Added new tests.
Are there any user-facing changes?
Yes but it doesn't break compatibility.