-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Hybrid planning when the root plan has local preference #2183
Conversation
crates/datafusion_ext/src/vars.rs
Outdated
@@ -95,6 +95,7 @@ impl SessionVars { | |||
is_cloud_instance: bool, | |||
dialect: Dialect, | |||
enable_experimental_scheduler: bool, | |||
prefer_local_execution: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- is this really "prefer" and not "force"
- are there meaningfully cases in local/embedded nodes where there's more-than-one session? My instinct was that local sessions (which are the things where hybrid execution could have happened anyway,) were always single user, and had one copy of the glaredb instance loaded/running.
- seems like we should (maybe) have this be an option on the operation itself rather than on the session (which means that the state of "should this operation be remote or local" is stateful on the session, and could be weird in multi-threaded contexts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer_local_execution
might not be the best name for this, but it's really more about setting the default location of where to run parts of the query. Currently we default to "remote" at the root and selectively pull up "local" parts as much as possible. This variable really just changes it to default "local" at the root.
I suggested a session variable since changing the default could break something that we don't know about, and so having a way to go back to the other behavior seemed reasonable and lets us play with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just have this be "execution_mode" and have it be an enum?
should_hybrid_execution_default_local
if it defaults to local at the root will any parts be remote?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just have this be "execution_mode" and have it be an enum?
We could, but I don't know if it's worth doing here. What we're aiming to solve here is very specific to hybrid exec, which only really cares about is thing running locally, or is thing not running locally. It won't (or shouldn't) care about how the thing is running remotely once we have dist exec.
should_hybrid_execution_default_local
Seems reasonable to me. I don't see this being a variable that a user really cares about unless they hit an edge case that would require them to set this, so the verbosity makes sense.
if it defaults to local at the root will any parts be remote?
Yes, runtime group effect everything below it in the tree unless there's another runtime group.
I will do my best to draw a picture (assuming local at root):
RuntimeGroup (local)
|
Project
|
Join
|
--------------
| |
RuntimeGroup (remote) RuntimeGroup (remote)
| |
TableScan TableScan
What this is saying is both table scans should run remotely, the results of both be pulled down locally, joined locally, and projected locally.
We have an optimizer rule that will pull up runtime groups as far as possible to minimize how much we need to pull down locally. After the rule, the tree ends up being:
RuntimeGroup (remote)
|
Project
|
Join
|
--------------
| |
TableScan TableScan
The table scans, join, and project now all run remotely, and the results of the entire tree are then returned to the local client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am working on the same, as Sean mentioned. Right now, the "remote" isn't pulled up; it must also be configurable. I will mark it ready for review once it is done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[sounds good to me! sorry if I complicated things, but this feels clearer!]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am working on the same, as Sean mentioned. Right now, the "remote" isn't pulled up; it must also be configurable. I will mark it ready for review once it is done.
Oh, I assumed it was already doing that. If we can have it pull up matching preferences, that would be ideal (and would be useful once extended for dist exec too).
crates/datafusion_ext/src/vars.rs
Outdated
@@ -95,6 +95,7 @@ impl SessionVars { | |||
is_cloud_instance: bool, | |||
dialect: Dialect, | |||
enable_experimental_scheduler: bool, | |||
prefer_local_execution: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just have this be "execution_mode" and have it be an enum?
We could, but I don't know if it's worth doing here. What we're aiming to solve here is very specific to hybrid exec, which only really cares about is thing running locally, or is thing not running locally. It won't (or shouldn't) care about how the thing is running remotely once we have dist exec.
should_hybrid_execution_default_local
Seems reasonable to me. I don't see this being a variable that a user really cares about unless they hit an edge case that would require them to set this, so the verbosity makes sense.
if it defaults to local at the root will any parts be remote?
Yes, runtime group effect everything below it in the tree unless there's another runtime group.
I will do my best to draw a picture (assuming local at root):
RuntimeGroup (local)
|
Project
|
Join
|
--------------
| |
RuntimeGroup (remote) RuntimeGroup (remote)
| |
TableScan TableScan
What this is saying is both table scans should run remotely, the results of both be pulled down locally, joined locally, and projected locally.
We have an optimizer rule that will pull up runtime groups as far as possible to minimize how much we need to pull down locally. After the rule, the tree ends up being:
RuntimeGroup (remote)
|
Project
|
Join
|
--------------
| |
TableScan TableScan
The table scans, join, and project now all run remotely, and the results of the entire tree are then returned to the local client.
d0c0269
to
73dad61
Compare
NOTE: Trying to fix copy-to. There's still some issue with it being executed on remote. EDIT: The optimization is causing some issues. Thinking about it a bit more. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this makes sense.
[TODO]: Need to add more comments.
Yes, we definitely need to. I think we'll also want to have tests asserting pre and post plan transformations, which would make this stuff a bit clearer too.
Wrapping custom execs in the RuntimeGroupExec seems to make sense here, and makes it a bit clearer on expectations for where something is ran.
Fixes: #2104 Signed-off-by: Vaibhav <vrongmeal@gmail.com>
73dad61
to
f699c40
Compare
/// incorrect. | ||
// TODO: Figure out how to "pull-up" remote more accurately. One solution | ||
// might be to specialize cases where we know we can pull up, i.e., know | ||
// that the node has to execute on local. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really appreciate this comment.
Push down "remote" exec as far down as possible and replace any "local"
execs that are found (children to the remote). Finally, pack the
remote exec in a [
SendRecvJoinExec
] so the plan transformation lookssomething like:
Original plan:
Transformed plan:
We don't "pull-up" the remote exec as much as possible because we want
to stop pulling once a "local" node is met but there's no certain way of
knowing that the node is to be run locally (since that information lies
with its n'th parent, i.e., [
RuntimeGroupExec
]).For example, take the following plan in consideration which tries to
copy contents from a remote table in a local file:
If we were to pull "remote" up until the "local", we would end up with
something like:
This ends up running the
CopyToExec
on remote which is completelyincorrect.
Fixes: #2104