-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: implement ConstructScan in the new factory #49682
Conversation
6a90265
to
27924b8
Compare
85fbc92
to
5112560
Compare
cc @rytaft |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r1, 4 of 4 files at r2, 6 of 6 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto and @yuzefovich)
pkg/sql/distsql_physical_planner.go, line 911 at r1 (raw file):
} func getColsForScanToTableOrdinalMap(
would be good to have a small comment at least
pkg/sql/distsql_physical_planner.go, line 1037 at r1 (raw file):
// scanNodes can have columns set up in a few different ways, depending on the // colCfg. The heuristic planner always creates scanNodes with all public // columns (even if some of them aren't even in the index we are scanning).
We don't have the heuristic planner anymore so you can take this sentence out :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto and @rytaft)
pkg/sql/distsql_physical_planner.go, line 911 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
would be good to have a small comment at least
Done.
pkg/sql/distsql_physical_planner.go, line 1037 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
We don't have the heuristic planner anymore so you can take this sentence out :)
Good point. I examined the code a little bit more and added another commit to this PR to clean this up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 6 files at r6, 6 of 6 files at r7.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto and @yuzefovich)
pkg/sql/exec_factory_util.go, line 90 at r7 (raw file):
// Set visibility=execinfra.ScanVisibilityPublicAndNotPublic, since all // columns in the "cols" set should be projected, regardless of whether // they're public or non- public. The caller decides which columns to
[nit] extra space in "non- public"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR! I see that Alfonso was also typing comments, so I'll wait for his feedback as well.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto and @rytaft)
pkg/sql/exec_factory_util.go, line 90 at r7 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
[nit] extra space in "non- public"
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! However, seeing this makes me think of how we should test it. I think something we should definitely do is differential testing of execFactory
+ distsql physical planner
vs distsqlSpecFactory
.
Reviewed 1 of 1 files at r1, 9 of 9 files at r4, 4 of 4 files at r5.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @rytaft and @yuzefovich)
pkg/sql/distsql_physical_planner.go, line 911 at r1 (raw file):
} func getColsForScanToTableOrdinalMap(
nit: can this function name be clearer? maybe just getColOrdinals
?
pkg/sql/distsql_physical_planner.go, line 1063 at r6 (raw file):
func (dsp *DistSQLPlanner) planTableReaders( planCtx *PlanningCtx,
These are a lot of arguments. What do you think of creating some argument struct and maybe documenting the arguments better?
pkg/sql/distsql_spec_exec_factory.go, line 60 at r6 (raw file):
rowCount float64, locking *tree.LockingItem, ) (exec.Node, error) {
Composing the planning like this might open new doors for testing. I don't think much testing was ever added for distsql physical planning. Can you think of some tests that might be worth adding?
pkg/sql/distsql_spec_exec_factory.go, line 74 at r6 (raw file):
recommendation := canDistribute // ------- begin execFactory.ConstructScan -------
Are these comments convention in execFactory
? s/execFactory/distSQLSpecExecFactory
pkg/sql/distsql_spec_exec_factory.go, line 100 at r6 (raw file):
// TODO(yuzefovich): scanNode adds "parallel" attribute in walk.go when // scanNode.canParallelize() returns true. We should plumb that info from // here somehow as well.
What information is missing in this scope to do that? I think you have maxResults
, limit information and you would just need to check the cluster setting
pkg/sql/exec_factory_util.go, line 34 at r5 (raw file):
instrumentation: planner.curPlan.instrumentation, } assignPlan := func(plan *planMaybePhysical, node exec.Node) {
Instead of passing a pointer to planMaybePhysical
, it might be nicer to just take a *planTop
here, pass in res
, and have this function set the main
field
pkg/sql/exec_factory_util.go, line 36 at r5 (raw file):
assignPlan := func(plan *planMaybePhysical, node exec.Node) { if usesPlanNodeRepresentation { plan.planNode = node.(planNode)
If we're doing a type assertion, I think we can probably remove the boolean parameter and test for whatever type we want.
pkg/sql/plan_opt.go, line 182 at r6 (raw file):
bld *execbuilder.Builder ) dsp := p.DistSQLPlanner()
When would dsp
be nil?
pkg/sql/plan_opt.go, line 184 at r6 (raw file):
dsp := p.DistSQLPlanner() if mode := p.SessionData().ExperimentalDistSQLPlanningMode; dsp != nil && mode != sessiondata.ExperimentalDistSQLPlanningOff { bld = execbuilder.New(newDistSQLSpecExecFactory(p, dsp), execMemo, &opc.catalog, root, p.EvalContext())
One thing that I'm wondering is why we create a new exec factory for every query. Is that expensive? Maybe we'd want a sync.pool, but not in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, differential testing requires that both approaches produce exactly the same physical plans, but I think this will not be the case because we're changing the "distribution" strategy - in a follow-up PR I want to introduce the notion of "partially distributed" plans. And the old path would produce "local" plans while the new path will be producing "partially distributed" plans, so the differential testing is not applicable in general.
I think the best testing strategy will be to invest into a comprehensive suite of logic tests that use EXPLAIN (distsql)
. A nice addition to this approach would be introducing a textual representation of the diagram so that it would be easier to examine (similar to what we have with EXPLAIN (vec)
). I think we might have an issue for that.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto and @rytaft)
pkg/sql/distsql_physical_planner.go, line 911 at r1 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
nit: can this function name be clearer? maybe just
getColOrdinals
?
Renamed.
pkg/sql/distsql_physical_planner.go, line 1063 at r6 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
These are a lot of arguments. What do you think of creating some argument struct and maybe documenting the arguments better?
I extracted a utility struct, but I don't think adding some documentation is that useful because most of the fields are simply extracted from scanNode
s, so I referred the reader to the documentation in there. Or is your comment about improving that documentation in general as well?
pkg/sql/distsql_spec_exec_factory.go, line 60 at r6 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
Composing the planning like this might open new doors for testing. I don't think much testing was ever added for distsql physical planning. Can you think of some tests that might be worth adding?
Hm, having some kind of unit tests doesn't seem that much useful, but using EXPLAIN (distsql)
more to test the physical planning is probably the way to go. I think we could expand the "execbuilder" logic tests (i.e. testoptlogic
) to include the physical planning once the new factory is significantly implemented.
pkg/sql/distsql_spec_exec_factory.go, line 74 at r6 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
Are these comments convention in
execFactory
?s/execFactory/distSQLSpecExecFactory
I left them mostly for the ease of reviewing (and for myself when writing the code). I don't think they provide much benefit when the PR is ready to be merged, so I removed them.
pkg/sql/distsql_spec_exec_factory.go, line 100 at r6 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
What information is missing in this scope to do that? I think you have
maxResults
, limit information and you would just need to check the cluster setting
We do have all the necessary information in this scope, the complication is plumbing that information into EXPLAIN
code which is currently not supported on the new representation (because it assumes walking over planNode
tree).
pkg/sql/exec_factory_util.go, line 34 at r5 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
Instead of passing a pointer to
planMaybePhysical
, it might be nicer to just take a*planTop
here, pass inres
, and have this function set themain
field
This function is also used for subqueries and cascades which are planMaybePhysical
and not planTop
, so I don't think we can do that.
pkg/sql/exec_factory_util.go, line 36 at r5 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
If we're doing a type assertion, I think we can probably remove the boolean parameter and test for whatever type we want.
Done.
pkg/sql/plan_opt.go, line 182 at r6 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
When would
dsp
be nil?
Indeed, it seems like it is always set, so it was an unnecessary "defense" mechanism, removed.
pkg/sql/plan_opt.go, line 184 at r6 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
One thing that I'm wondering is why we create a new exec factory for every query. Is that expensive? Maybe we'd want a sync.pool, but not in this PR.
Hm, interesting point, I don't see a reason why we couldn't use a pool for this, but I'm also not sure whether it'll provide noticeable performance change. Left a TODO to look into this.
This commit extract most of the logic of `createTableReaders` into a separate method that performs the actual physical planning of table readers given already created `TableReaderSpec`s. This will allow for reuse in the follow-up work. This commit additionally slightly refactors a few other things for reuse. Release note: None
This commit extracts the logic of `execFactory.ConstructPlan` into the function to be used by both factories. It also extracts out another small helper. Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, could we run differential testing without the distribution variable (i.e. always plan locally)? I think we need to come up with some way to make sure that we're not introducing bugs with this new planning approach.
I'm not sure I understand the EXPLAIN (distsql)
approach, wouldn't the distribution still be an issue? Or do you mean that we wouldn't compare it with anything and write new tests for this new representation? That could work, and might be good to have, but it seems like a pretty heavy investment and doesn't mean we can't do some type of differential testing.
Reviewed 10 of 10 files at r8, 5 of 5 files at r9, 6 of 6 files at r10, 5 of 5 files at r11.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @rytaft and @yuzefovich)
pkg/sql/distsql_physical_planner.go, line 1063 at r6 (raw file):
Previously, yuzefovich wrote…
I extracted a utility struct, but I don't think adding some documentation is that useful because most of the fields are simply extracted from
scanNode
s, so I referred the reader to the documentation in there. Or is your comment about improving that documentation in general as well?
My comment was also about improving documentation in general as well but this LGTM.
pkg/sql/distsql_spec_exec_factory.go, line 74 at r6 (raw file):
Previously, yuzefovich wrote…
I left them mostly for the ease of reviewing (and for myself when writing the code). I don't think they provide much benefit when the PR is ready to be merged, so I removed them.
I actually kind of liked the general comments describing the stages minus the format
pkg/sql/distsql_spec_exec_factory.go, line 100 at r6 (raw file):
Previously, yuzefovich wrote…
We do have all the necessary information in this scope, the complication is plumbing that information into
EXPLAIN
code which is currently not supported on the new representation (because it assumes walking overplanNode
tree).
Ok. Just to confirm, the tableReader still parallelizes scans when it can with the current code?
This commit adds an implementation of `distSQLSpecExecFactory.ConstructScan` which combines the logic that performs `scanNode` creation of `execFactory.ConstructScan` and physical planning of table readers of `DistSQLPlanner.createTableReaders`. I tried to refactor the code so that there is not much duplication going on. Notably, simple projections, renders, and filters are not yet implemented. Release note: None
After the examining the code closer, I see that now in all code paths we populate `scanColumnsConfig.wantedColumns` to ask for the specific columns to be scanned. Previously, this map could be left as `nil` which would mean that all columns of the table should be scanned. This was used only by the heuristic planner which has been removed, so we can now update the assumption of the scan columns config. This allows us to clean up the comments a bit. This commit also moves `makeColumnsConfig` function into the shared util file because it is used by both factories. Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, could we run differential testing without the distribution variable (i.e. always plan locally)? I think we need to come up with some way to make sure that we're not introducing bugs with this new planning approach.
Yes, we could. To be clear, my comment about inability to differential testing was "in general" (I was thinking about creating some randomized queries and comparing the plans on them), but I do intend to use the differential testing approach when applicable. Namely, I'm thinking that we might want to introduce a separate cluster setting for testing purposes that would prohibit partially distributed plans - whenever the new factory creates a partially distributed plan, if that setting is set, then we would fallback to the old path, but local and fully-distributed plans should be exactly the same in the old and the new paths. I also think that we will introduce non-default logic test mode to be used for different already existing "explain" logic tests.
Or do you mean that we wouldn't compare it with anything and write new tests for this new representation?
Yes, I think that we will need to write separate tests for partially distributed plans because we haven't had such precedent before, so there is nothing to reuse. However, we will be able to reuse already existing tests if they don't end up with a partially distributed plan.
TL;DR is that I am actively thinking about the testing, and the follow-up PR implements some EXPLAIN variants in the new factory which makes testing possible.
TFTRs!
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @asubiotto and @rytaft)
pkg/sql/distsql_physical_planner.go, line 1063 at r6 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
My comment was also about improving documentation in general as well but this LGTM.
Done.
pkg/sql/distsql_spec_exec_factory.go, line 74 at r6 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
I actually kind of liked the general comments describing the stages minus the format
Ok, added the modified comments back.
pkg/sql/distsql_spec_exec_factory.go, line 100 at r6 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
Ok. Just to confirm, the tableReader still parallelizes scans when it can with the current code?
Yes, the "parallelization" logic is done by DistSQLPlanner
in planTableReaders
which is shared with the old path, and this comment refers to only not being able to show the "parallel" attribute in the output of EXPLAIN.
Merge conflict (retrying...) |
Build succeeded |
sql: extract out planning of table readers into a separate method
This commit extract most of the logic of
createTableReaders
intoa separate method that performs the actual physical planning of table
readers given already created
TableReaderSpec
s. This will allow forreuse in the follow-up work. This commit additionally slightly refactors
a few other things for reuse.
Release note: None
sql: implement ConstructPlan by new factory
This commit extracts the logic of
execFactory.ConstructPlan
into thefunction to be used by both factories. It also extracts out another
small helper.
Release note: None
sql: implement ConstructScan in distsql spec factory
This commit adds an implementation of
distSQLSpecExecFactory.ConstructScan
which combines the logic thatperforms
scanNode
creation ofexecFactory.ConstructScan
and physicalplanning of table readers of
DistSQLPlanner.createTableReaders
.I tried to refactor the code so that there is not much duplication going
on.
Notably, simple projections, renders, and filters are not yet
implemented.
Closes: #47474.
Release note: None
sql: enforce the assumption that
wantedColumns
is not nil for scansAfter the examining the code closer, I see that now in all code paths we
populate
scanColumnsConfig.wantedColumns
to ask for the specificcolumns to be scanned. Previously, this map could be left as
nil
whichwould mean that all columns of the table should be scanned. This was
used only by the heuristic planner which has been removed, so we can now
update the assumption of the scan columns config. This allows us to
clean up the comments a bit.
This commit also moves
makeColumnsConfig
function into the shared utilfile because it is used by both factories.
Release note: None