-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify and encapsulate window function state management #6621
Conversation
memoize_nth_value(state, nth_value_state)?; | ||
evaluator.set_state(&evaluator_state)?; | ||
} | ||
evaluator.memoize(state)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is actually more readable as well as cleaning up the trait
@@ -100,14 +99,6 @@ pub trait PartitionEvaluator: Debug + Send { | |||
false | |||
} | |||
|
|||
/// Returns the internal state of the window function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of the PR is to get of the functions on this trait that serialized / set state (which has a single use)
@@ -327,16 +327,7 @@ pub struct LeadLagState { | |||
pub idx: usize, | |||
} | |||
|
|||
#[derive(Debug, Clone, Default)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turns out this structure is not used anywhere else
@@ -127,13 +117,16 @@ pub trait PartitionEvaluator: Debug + Send { | |||
Ok(()) | |||
} | |||
|
|||
/// Sets the internal state for window function | |||
/// When the window frame has a fixed beginning (e.g UNBOUNDED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While writing this explanation it occurred to me that the term memoize
, while technically accurate is not the entire story here. I wonder if there is a better name like adjust_window_state
🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think adjust_window_state
feels like a more general functionality. However, this function is a special case for nth_value
when window frame boundary has UNBOUNDED PRECEDING
. Hence I think memoize
is better for this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will leave it named memoize
if let BuiltinWindowState::NthValue(nth_value_state) = state { | ||
self.state = nth_value_state.clone() | ||
/// When the window frame has a fixed beginning (e.g UNBOUNDED | ||
/// PRECEDING), some functions such as FIRST_VALUE, LAST_VALUE and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can replace this line with PRECEDING), for some functions such as FIRST_VALUE, LAST_VALUE and
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the tip
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @alamb. I have left some minor comments. Other than these this PR is LGTM!.
fn memoize(&mut self, state: &mut WindowAggState) -> Result<()> { | ||
let out = &state.out_col; | ||
let size = out.len(); | ||
let (is_prunable, new_prunable) = match self.state.kind { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that is is not introduced here (and introduced by me) but while reading the code maybe instead of new_prunable
we can use is_last
flag. I think, that way intent will be more clear.
NthValueKind::First => { | ||
let n_range = | ||
state.window_frame_range.end - state.window_frame_range.start; | ||
(n_range > 0 && size > 0, true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in case we use is_last
flag, second flag should be false
state.window_frame_range.end - state.window_frame_range.start; | ||
(n_range > 0 && size > 0, true) | ||
} | ||
NthValueKind::Last => (true, false), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in case we use is_last
flag, second flag should be true
NthValueKind::Nth(n) => { | ||
let n_range = | ||
state.window_frame_range.end - state.window_frame_range.start; | ||
(n_range >= (n as usize) && size >= (n as usize), true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case we use is_last
flag, second flag should be false
} | ||
}; | ||
if is_prunable { | ||
if self.state.finalized_result.is_none() && new_prunable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in case we use is_last
flag, condition should be if self.state.finalized_result.is_none() && !is_last {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree the intent is clearer using the term is_last
. Thank you for that suggestion . Done in 4542799
Which issue does this PR close?
Related to #5781
Rationale for this change
PartitionEvaluator
currently has some methods that expose an enumBuiltinWindowState
specific to the built in functions. If we choose to exposePartitionEvaluator
externally it needs to not include details specifically about the built i window functionsIt turns out there is exactly one use for this state, which is memoization (for nth value calculation). We can move that functionality into the trait and avoid the need for
BuiltinWindowState
entirelyWhat changes are included in this PR?
PartitionEvaluator::get_state
,PartitionEvaluator::set_state()
andBuiltinWindowState
Are these changes tested?
existing tests
Are there any user-facing changes?
No these are all (currently) internal APIs