-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Interval analysis for OR
expressions
#7884
Conversation
@@ -1240,6 +1270,32 @@ mod tests { | |||
Ok(()) | |||
} | |||
|
|||
#[test] | |||
fn or_test() -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I was writing this test I found there appears to be something wrong with boolean interval handling.
Specifically, open_open(.., ..)
always creates the interval Interval { lower: IntervalBound { value: Boolean(true), open: false }, upper: IntervalBound { value: Boolean(false), open: false } }
regardless of the arguments:
I tracked it down to the code in https://github.com/apache/arrow-datafusion/blob/4591d82ccba6f52db535a377486362a35bfd84fb/datafusion/physical-expr/src/intervals/interval_aritmetic.rs#L250-L249
And I will file a follow on ticket about this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you create an interval with an open bound, if it is a lower bound, it can be (false
or (true
. (false
can be represented as [true
. For an upper bound case, true)
rounds down to false]
. That's why this result comes in that way.
But of course, you are right. We should not give the chance to create an invalid interval. Especially for boolean intervals, open_open case always gives an invalid interval, meaning that it needs to return an error regardless of the interval values (false or true). I will have solved all of these problems with the PR I am currently working on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will have solved all of these problems with the PR I am currently working on.
The PR you are working on sounds great 🎉 . If it is possible, can you possibly break it up into smaller parts (e.g perhaps we could break the "make Interval
fields not pub
in one PR, and rework how propagate_constraints
works a a follow on PR?)
@@ -344,8 +344,8 @@ impl PhysicalExpr for BinaryExpr { | |||
let right_interval = children[1]; | |||
|
|||
let (left, right) = if self.op.is_logic_operator() { | |||
// TODO: Currently, this implementation only supports the AND operator | |||
// and does not require any further propagation. In the future, | |||
// TODO: Currently, this implementation only supports the AND/OR operator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would appreciate any suggestions if any changes are needed here in propagate_constraints as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm slightly changing the API here. It will be more understandable and easy to add new supports. Since the current version is written by overfitting the AND operator, adding OR support may be a challenge. Undoubtedly, it will be easier to add it after my refactoring PR arrives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good -- I don't fully understand what propagate constraints is used for. I think postponing any extra work until your refactoring is complete sounds like a good idea to me
Co-authored-by: Alex Huang <huangweijun1001@gmail.com>
// ((lhs_lower, lhs_upper), (rhs_lower, rhs_upper), (lower_result, upper_result)) | ||
let cases = vec![ | ||
// Note there are only three valid boolean intervals: [false, false], [false, true] and [true, true], | ||
((false, false), (false, false), (false, false)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test might be easier to read if I rewrote it in terms of CERTAINLY_TRUE
, UNCERTAIN, and
CERTAINLY_FALSE`. I'll try and do that next week
Marking as a draft until @berkaysynnada 's refactor PR has landed |
Superseded by #8276 |
Which issue does this PR close?
Closes #7883
Rationale for this change
See #7883
What changes are included in this PR?
OR
in interval analysisAre these changes tested?
Yes
Are there any user-facing changes?
Probably not