filter for run end array #5573
    new_run_ends.truncate(i);

    if values_filter.is_empty() {
        new_run_ends.clear();
    }
If you allocate `new_run_ends` as just an empty `Vec` but with `run_ends.len()` capacity and push to it, you probably won't need this part.
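The suggestion above can be sketched roughly as follows. This is a minimal standalone illustration on plain slices, not the code from this PR: the function name `filter_run_ends_push` and the `&[bool]` mask are assumptions made for the example, and the mask is assumed to cover exactly the logical rows described by `run_ends`.

```rust
/// Hypothetical helper (not part of arrow-rs): computes the run ends that
/// remain after filtering a run-end encoded array with a per-row mask.
/// `run_ends` holds cumulative end offsets; `mask.len()` is assumed to equal
/// the last run end.
fn filter_run_ends_push(run_ends: &[i64], mask: &[bool]) -> Vec<i64> {
    // Reserve capacity up front, but start with length 0 so that nothing
    // has to be truncated or cleared afterwards.
    let mut new_run_ends = Vec::with_capacity(run_ends.len());
    let mut start = 0usize;
    let mut count = 0i64;
    for &end in run_ends {
        let end = end as usize;
        // Number of logical rows of this run that pass the filter.
        let kept = mask[start..end].iter().filter(|&&b| b).count() as i64;
        if kept > 0 {
            count += kept;
            new_run_ends.push(count);
        }
        start = end;
    }
    new_run_ends
}
```

With this shape, a filter that keeps nothing simply yields an empty `Vec`, at the cost of one branch per run.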
I'm using this trick

    new_run_ends[i] = count;
    i += keep as usize;

to make it branchless; I can't do the same thing with `push`.
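For readers unfamiliar with the trick, here is a minimal standalone sketch of it, written against plain slices rather than the arrow-rs types. The function name `filter_run_ends_branchless` and the `&[bool]` mask are assumptions for illustration; only the two-line write-then-advance pattern comes from the comment above.

```rust
/// Hypothetical helper (not part of arrow-rs): returns the filtered run ends
/// and a per-run flag saying whether that run survived the filter.
/// `mask.len()` is assumed to equal the last run end.
fn filter_run_ends_branchless(run_ends: &[i64], mask: &[bool]) -> (Vec<i64>, Vec<bool>) {
    // Pre-size the output so every iteration can write unconditionally.
    let mut new_run_ends = vec![0i64; run_ends.len()];
    let mut values_filter = Vec::with_capacity(run_ends.len());
    let mut start = 0usize;
    let mut i = 0usize;
    let mut count = 0i64;
    for &end in run_ends {
        let end = end as usize;
        let kept = mask[start..end].iter().filter(|&&b| b).count() as i64;
        count += kept;
        let keep = kept > 0;
        // Always write into slot `i`; the slot is only "committed" when
        // `keep` is true, because only then does `i` advance.
        new_run_ends[i] = count;
        i += keep as usize;
        values_filter.push(keep);
        start = end;
    }
    // Drop the slots that were never committed.
    new_run_ends.truncate(i);
    (new_run_ends, values_filter)
}
```

The write index never exceeds the number of runs already visited, so the unconditional store stays in bounds. The final `truncate(i)` is the price of pre-sizing the buffer instead of pushing.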
    #[allow(clippy::type_complexity)]
    fn filter_run_end_array_generic<R: RunEndIndexType>(
        re_arr: &RunArray<R>,
        pred: &FilterPredicate,
I guess we can leave a TODO item to utilize the `IterationStrategy` within `FilterPredicate` for a potential performance benefit, and keep this PR as more of an initial version of filter for run end arrays?
I could handle the `None` case in this PR. I suspect the index-based ones are a poor fit for REE unless the selectivity of the filter is high. I'd need a benchmark, but in short I would prefer to leave it as a TODO.
I think the `None` case is already handled by the parent (arrow-rs/arrow-select/src/filter.rs, line 318 in 77a3132):

    IterationStrategy::None => Ok(new_empty_array(values.data_type())),
👍
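To make the TODO from this thread more concrete, the sketch below shows one way a slice-based strategy could drive the run-end computation. It is only an illustration under stated assumptions, not a proposal for the actual arrow-rs code: the function name `filter_run_ends_by_slices` is hypothetical, and the input is assumed to be a sorted, non-overlapping list of half-open `(start, end)` ranges of kept logical rows, all within the array bounds.

```rust
/// Hypothetical helper (not part of arrow-rs): computes filtered run ends
/// from ranges of kept rows instead of a per-row mask.
fn filter_run_ends_by_slices(run_ends: &[i64], slices: &[(usize, usize)]) -> Vec<i64> {
    let mut out: Vec<i64> = Vec::with_capacity(run_ends.len());
    // Index of the run that produced the last entry in `out`, if any.
    let mut last_run: Option<usize> = None;
    let mut count = 0i64;
    let mut run = 0usize;
    for &(start, end) in slices {
        let mut pos = start;
        while pos < end {
            // Skip runs that end at or before the current row.
            while (run_ends[run] as usize) <= pos {
                run += 1;
            }
            // Consume rows up to the end of the slice or of the run.
            let stop = end.min(run_ends[run] as usize);
            count += (stop - pos) as i64;
            if last_run == Some(run) {
                // Same run as the previous entry: extend it in place.
                *out.last_mut().unwrap() = count;
            } else {
                out.push(count);
                last_run = Some(run);
            }
            pos = stop;
        }
    }
    out
}
```

The work here scales with the number of slices plus the number of runs rather than the number of rows. Whether that beats the mask-based loop in practice would need the benchmark mentioned above.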
Which issue does this PR close?
Related to #3520
Rationale for this change
Attempt at adding filter support for RunArray for i64 run_ends
What changes are included in this PR?
Support for filtering RunArray
Are there any user-facing changes?
The `filter` function now works for `RunArray`.