opt,sql: use paired joins for left outer spatial joins #55216

sumeerbhola · 2020-10-04T15:56:13Z

The GenerateInvertedJoins rule now fires for left outer joins
(I haven't yet removed the old transformation, but it does not
get picked). GenerateInvertedJoins in custom_funcs.go builds
the two RelExprs and the continuation col ID.
execbuilder.Builder makes the adjustments for outputting the
continuation column for the inverted join (the first join in
the pair). There are similar changes in
execFactory.ConstructInvertedJoin and in
DistSQLPlanner.createPlanForInvertedJoin.

I am not happy that changes are needed in all these places
(and also about the correctness of these changes), but I
could not figure out a better way since the continuation column
is synthesized and not part of the input or the scan of
the right side.

Additionally, TestLogic fails for a few cases where a sort
gets interposed between the first and second join causing an
incorrect computation since the group boundaries are no longer
correctly defined by the continuation column. The fragility
worries me. I am wondering whether to go back to the idea of
having a pairedJoinExpr and in execFactory split into the
two joins. This will need costing changes since the
pairedJoinExpr will need its own special costing code. It will
reduce the number of places in the code that need to be know
about the continuation column. Another option is to push down
the paired-join even further into DistSQLPlanner -- that
will ensure that there is a one-to-one relationship between
the first join processor and the second join processor.
This would ensure that changes to DistSQLPlanner in the future
don't inadvertently cause more than one first join processor
feeding into a second join processor (which would also lead
to incorrect results).

The change to joinreader.go is due to a bug for the case
when it is the second join. I'll send out a separate PR
with that fix.

In addition to being more efficient than the current
transformation for left outer joins, I noticed in the
output in inverted_join_geospatial_dist that the current
transformation was not distributed while the paired joins
are distributed.

This PR does not include left semi and left anti joins which
will be in later PRs.

Release note (performance improvement): more efficient plan
for execution of left outer spatial joins.

cockroach-teamcity · 2020-10-04T15:56:20Z

This change is

rytaft

Cool!

To fix the issue with sort getting pushed in between the operators, I think you can update this function here: https://github.com/cockroachdb/cockroach/blob/bdf910ac4984277fdd411f1362192a85ba4311d6/pkg/sql/opt/ordering/lookup_join.go#L20. If the lookup join is part of a paired join, you'll need to check if the child can provide the ordering using ordering.CanProvide.

Reviewed 17 of 17 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola and @yuzefovich)

pkg/sql/distsql_physical_planner.go, line 2018 at r1 (raw file):

	numInputNodeCols, planToStreamColMap, post, types :=
		mappingHelperForLookupJoins(plan, n.input, n.table, false)

[nit] add /* addContinuationCol */ after false

pkg/sql/distsql_spec_exec_factory.go, line 628 at r1 (raw file):

	lookupCols exec.TableColumnOrdinalSet,
	onCond tree.TypedExpr,
	secondJoinInPairedJoiner bool,

[nit] For these boolean params, I think it would be clearer if it were called isSecondJoinInPairedJoiner (likewise for isFirstJoinInPairedJoiner)

pkg/sql/opt_exec_factory.go, line 577 at r1 (raw file):

	lookupCols exec.TableColumnOrdinalSet,
	onCond tree.TypedExpr,
	secondJoinInPairedJoiner bool,

Same comment here -- I'd call this isSecondJoinInPairedJoiner

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_explain, line 199 at r1 (raw file):

           └── project              ·                      ·                                        (lk, geom1, rk1, rk2, cont)                     ·
                │                   estimated row count    10000 (missing stats)                    ·                                               ·
                └── inverted join   ·                      ·                                        (lk, geom1, rk1, rk2, geom_inverted_key, cont)  ·

would be great to print the type of join here (left outer) similar to the lookup join

pkg/sql/opt/exec/factory.opt, line 220 at r1 (raw file):

    LookupCols exec.TableColumnOrdinalSet
    OnCond tree.TypedExpr
    SecondJoinInPairedJoiner bool

[nit] IsSecondJoinInPairedJoiner

pkg/sql/opt/exec/factory.opt, line 242 at r1 (raw file):

    LookupCols exec.TableColumnOrdinalSet
    OnCond tree.TypedExpr
    FirstJoinInPairedJoiner bool

[nit] IsFirstJoinInPairedJoiner

pkg/sql/opt/exec/execbuilder/relational.go, line 1530 at r1 (raw file):

		maxValue, ok := allCols.MaxValue()
		if !ok {
			maxValue = 0

shouldn't this be -1?

could also use the function numOutputColsInMap which already takes care of this logic

pkg/sql/opt/norm/join_funcs.go, line 584 at r1 (raw file):

// column ID for the paired-joiners used for left joins when the first join
// generates false positives (due to inverted index or non-covering index).
func (c *CustomFuncs) ConstructContinuationColumnForPairedLeftJoin() opt.ColumnID {

This should be in xform (unexported) since (I think) that's the only place it's used.

pkg/sql/opt/ops/relational.opt, line 340 at r1 (raw file):

    # SecondJoinInPairedJoiner is true if this is the second join of a
    # paired-joiner used for left joins.
    SecondJoinInPairedJoiner bool

[nit] IsSecondJoinInPairedJoiner

pkg/sql/opt/ops/relational.opt, line 388 at r1 (raw file):

    # FirstJoinInPairedJoiner is true if this is the first join of a
    # paired-joiner used for left joins.
    FirstJoinInPairedJoiner bool

[nit] IsFirstJoinInPairedJoiner

pkg/sql/opt/xform/custom_funcs.go, line 2038 at r1 (raw file):

		indexCols := pkCols.ToSet()

		continuationCol := opt.ColumnID(-1)

0 can be used as the default, since we don't allow any columns with ID 0

pkg/sql/opt/xform/rules/join.opt, line 272 at r1 (raw file):

# indexes (of the Scan table) which allow it. See the GenerateInvertedJoins
# custom function for more details.
# TODO(rytaft): Add support for LeftJoin, SemiJoin, and AntiJoin. Currently

[nit] update this TODO

pkg/sql/opt/xform/rules/join.opt, line 290 at r1 (raw file):

# applies when the input is a Select.
[GenerateInvertedJoinsFromSelect, Explore]
(InnerJoin

add LeftJoin here too

pkg/sql/opt/xform/testdata/rules/join, line 3907 at r1 (raw file):


# Anti-joins are supported by converting them to a left join join wrapped in a
# select. The left join is later converted to an inner join.

Update comment about left join

pkg/sql/opt/xform/testdata/rules/join, line 3954 at r1 (raw file):


# TODO: why is this not causing the paired joins -- we can evaluate the
# n.name LIKE 'Upper%' in the lookup join.

You need to add LeftJoin to the GenerateInvertedJoinsFromSelect rule too (I added a comment there)

pkg/sql/opt/xform/testdata/rules/join, line 4046 at r1 (raw file):


# Left joins are supported by converting them to an inner join wrapped in a
# left join.

Update comment

pkg/sql/rowexec/joinreader.go, line 843 at r1 (raw file):

	matched bool
	// The last row index in the group. Only valid when doGrouping = true.
	// TODO: add test that exercises the bug.

What's the bug?

sumeerbhola

TFTR!

I tried this suggestion, and now it prefers to do a hash join (see the different output in inverted_join_geospatial) and sort the result instead of doing the inverted join followed by the lookup join. Is there a way to look at the cost comparison to debug what is going on?

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rytaft, @sumeerbhola, and @yuzefovich)

pkg/sql/opt/exec/factory.opt, line 220 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] IsSecondJoinInPairedJoiner

in a recent slack thread you used the term upper/lower. Do you think isUpperJoinInPairedJoin would be better?

pkg/sql/rowexec/joinreader.go, line 843 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

What's the bug?

#55252

yuzefovich

I am not happy that changes are needed in all these places
(and also about the correctness of these changes), but I
could not figure out a better way since the continuation column
is synthesized and not part of the input or the scan of
the right side.

To me the amount of changes seems reasonable. I think we haven't had a precedent like this (that we need to somehow synchronize specifications of two processor stages) before, but I definitely expect that some changes need to be made in DistSQLPhysicalPlanner.

However, I don't like the idea of

Another option is to push down the paired-join even further into DistSQLPlanner

in isolation.

The way I think of it is that the optimizer produces a logical plan which is then converted by the physical planner into a physical plan; we've introduced a new logical concept of "paired joiners", so the logical plan must change to encompass that concept. Similarly, the physical planner must also change to support that concept.

By "in isolation" I mean that only changing the physical planner seems wrong to me, but I do like the idea of enforcing that certain assumptions (that there is a 1-to-1 relationship between processors of the first and the second stages) in the physical planner. We don't have any infrastructure to do that though, so it'll likely to be not pretty as well.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rytaft, @sumeerbhola, and @yuzefovich)

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial, line 1 at r1 (raw file):

# LogicTest: local

I think we should try to keep all the default configs here (i.e. not specify LogicTest directive) and move EXPLAIN (DISTSQL) queries which have variable output to opt/exec/execbuilder/testdata (i.e. in optlogic tests).

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial, line 232 at r1 (raw file):

3  12
3  16
4  NULL

Why did this change? Is it also because of the sorter issue?

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_explain, line 187 at r1 (raw file):

----
·                                   distribution           local                                    ·                                               ·
·                                   vectorized             true                                     ·                                               ·

Hm, I wonder why it became vectorized - because the stats are missing, it shouldn't be.

sumeerbhola

I do like the idea of enforcing that certain assumptions (that there is a 1-to-1 relationship between processors of the first and the second stages) in the physical planner. We don't have any infrastructure to do that though, so it'll likely to be not pretty as well.

And the DistSQLPlanner can't assume that lookupJoinNode.input is an invertedJoinNode for this paired-joiner case, since there is a renderNode in between the two (to project away the inverted column). I guess I could cast the planNode to a renderNode and reach into renderNode.source.plan but that isn't pretty. Any suggestions here?

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rytaft, @sumeerbhola, and @yuzefovich)

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial, line 1 at r1 (raw file):

Previously, yuzefovich wrote…

I think we should try to keep all the default configs here (i.e. not specify LogicTest directive) and move EXPLAIN (DISTSQL) queries which have variable output to opt/exec/execbuilder/testdata (i.e. in optlogic tests).

I should have clarified this is just a temporary change to debug problems.
I've made a note to add tests to opt/exec/execbuilder/testdata/inverted_index -- it does not currently have any spatial left joins.

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial, line 232 at r1 (raw file):

Previously, yuzefovich wrote…

Why did this change? Is it also because of the sorter issue?

correct

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_explain, line 187 at r1 (raw file):

Previously, yuzefovich wrote…

Hm, I wonder why it became vectorized - because the stats are missing, it shouldn't be.

I don't think we disabled vectorization for invertedJoins (we did for invertedFilterer with some code in colbuilder/execplan.go I think). The inner joins earlier in this file are also saying vectorized true.
I either have forgotten or never knew what this actually meant given that the inverted join and lookup join are themselves not vectorized.

yuzefovich

Reaching into the insides of the planNodes (especially if there are other things in-between) is not pretty, I agree, so we shouldn't do that. I was thinking of performing this assertion check on PhysicalPlan, but after looking into the code, I see that we are already using PhysicalPlan.AddNoGroupingStage in createPlanForLookupJoin which means that joinReaders are planned only and on all the same nodes on which the processors of the input stage are planned, so I actually think we already have the desired behavior and that should be sufficient (i.e. no need to add anything to do the assertion).

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rytaft and @sumeerbhola)

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial, line 1 at r1 (raw file):

Previously, sumeerbhola wrote…

I should have clarified this is just a temporary change to debug problems.
I've made a note to add tests to opt/exec/execbuilder/testdata/inverted_index -- it does not currently have any spatial left joins.

Cool, thanks.

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_explain, line 187 at r1 (raw file):

Previously, sumeerbhola wrote…

I don't think we disabled vectorization for invertedJoins (we did for invertedFilterer with some code in colbuilder/execplan.go I think). The inner joins earlier in this file are also saying vectorized true.
I either have forgotten or never knew what this actually meant given that the inverted join and lookup join are themselves not vectorized.

Yeah, we only disabled the vectorization if core.InvertedFilterer is present in the flow. Even if some processors don't have a vectorized implementation (for example, inverted joiner), we still can a vectorized flow in which we will have a wrapped row-execution processor (we have adapters that convert from row-by-row to vectorized and vice versa). If you're interested to see how it looks, you could run EXPLAIN (VEC) on the query, and there should be rowexec.invertedJoiner in the tree, if you do EXPLAIN (VEC, VERBOSE), you'll also see that there are Materializer and Columnarizer around that inverted joiners (these are the adapters).

I see why it became vectorized true - the plan has changed, and there is no longer a subquery (a subquery that is handled via core.LocalPlanNode prohibits the vectorization of the flow).

rytaft

Is there a way to look at the cost comparison to debug what is going on?

You could try forcing the index, although I'm not 100% sure if that will work. We may need to add a new join hint like LEFT INVERTED JOIN similar to our LEFT LOOKUP JOIN hint. Let me know if you think that's needed and I can open a PR. Another option might be to use the exploretrace debugging tool. Happy to discuss more offline.

Reviewed 2 of 2 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)

pkg/sql/opt/exec/factory.opt, line 220 at r1 (raw file):

Previously, sumeerbhola wrote…

in a recent slack thread you used the term upper/lower. Do you think isUpperJoinInPairedJoin would be better?

I'm fine with either one -- whichever you prefer

pkg/sql/rowexec/joinreader.go, line 843 at r1 (raw file):

Previously, sumeerbhola wrote…

#55252

If that's merged is there still a bug? Either way, would be good to update the TODO itself with more info if you plan to leave it in.

sumeerbhola

You could try forcing the index, although I'm not 100% sure if that will work.

@rytaft forcing worked.

I see that we are already using PhysicalPlan.AddNoGroupingStage in createPlanForLookupJoin which means that joinReaders are planned only and on all the same nodes on which the processors of the input stage are planned, so I actually think we already have the desired behavior and that should be sufficient (i.e. no need to add anything to do the assertion).

@yuzefovich I've added a comment where we do this to state that this is also needed for correctness.

Thanks for the review @rytaft @yuzefovich !
I've removed the WIP from the PR since this is now looking cleaner.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rytaft and @yuzefovich)

pkg/sql/distsql_physical_planner.go, line 2018 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] add /* addContinuationCol */ after false

Done.

pkg/sql/distsql_spec_exec_factory.go, line 628 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] For these boolean params, I think it would be clearer if it were called isSecondJoinInPairedJoiner (likewise for isFirstJoinInPairedJoiner)

Done.

pkg/sql/opt_exec_factory.go, line 577 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Same comment here -- I'd call this isSecondJoinInPairedJoiner

Done.

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial, line 1 at r1 (raw file):

Previously, yuzefovich wrote…

Cool, thanks.

I've reverted this change.
But I did add forcing of the geom_index here since there were a couple of places that needed it for some reason, and I don't want end-to-end correctness testing to be compromised by inadvertently not testing the inverted joiner and paired joins.

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_dist, line 100 at r3 (raw file):

SELECT url FROM [EXPLAIN (DISTSQL)
SELECT lk, rk FROM ltable LEFT JOIN rtable@geom_index
ON ST_Intersects(rtable.geom, ltable.geom1) OR ST_DWithin(ltable.geom1, rtable.geom, 2) ORDER BY (lk, rk)]

This and the next one were the ones that were using the hash joins and forcing the index worked.

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_explain, line 187 at r1 (raw file):

Previously, yuzefovich wrote…

Yeah, we only disabled the vectorization if core.InvertedFilterer is present in the flow. Even if some processors don't have a vectorized implementation (for example, inverted joiner), we still can a vectorized flow in which we will have a wrapped row-execution processor (we have adapters that convert from row-by-row to vectorized and vice versa). If you're interested to see how it looks, you could run EXPLAIN (VEC) on the query, and there should be rowexec.invertedJoiner in the tree, if you do EXPLAIN (VEC, VERBOSE), you'll also see that there are Materializer and Columnarizer around that inverted joiners (these are the adapters).

I see why it became vectorized true - the plan has changed, and there is no longer a subquery (a subquery that is handled via core.LocalPlanNode prohibits the vectorization of the flow).

Thanks for the explanation! I've added an inverted_join_geospatial_dist_vec (like the existing dist_vectorize) to that prints out the EXPLAIN (VEC, VERBOSE) output.

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_explain, line 199 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

would be great to print the type of join here (left outer) similar to the lookup join

Done.

pkg/sql/opt/exec/factory.opt, line 220 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I'm fine with either one -- whichever you prefer

I remembered the confusion that lower/upper causes in Pebble, and left this as first/second. First being the join that executes first is easier to explain.

pkg/sql/opt/exec/factory.opt, line 242 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] IsFirstJoinInPairedJoiner

Done.

pkg/sql/opt/exec/execbuilder/relational.go, line 1530 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

shouldn't this be -1?

could also use the function numOutputColsInMap which already takes care of this logic

ok should always be true. I've changed this code to return an error if that is not the case.
I considered using numOutputColsInMap but the comment "returns the number of slots required to fill in all of the columns referred to by this ColMap" suggests a different meaning, so I was worried that the implementation match was not necessarily a semantic match.

pkg/sql/opt/norm/join_funcs.go, line 584 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

This should be in xform (unexported) since (I think) that's the only place it's used.

I tried that first. I can't access norm.CustomFuncs.f from xform and instead of exporting f I thought a narrow function would be cleaner.

pkg/sql/opt/ops/relational.opt, line 340 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] IsSecondJoinInPairedJoiner

Done.

pkg/sql/opt/ops/relational.opt, line 388 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] IsFirstJoinInPairedJoiner

Done.

pkg/sql/opt/xform/custom_funcs.go, line 2038 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

0 can be used as the default, since we don't allow any columns with ID 0

Done.

pkg/sql/opt/xform/rules/join.opt, line 272 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] update this TODO

Done.

pkg/sql/opt/xform/rules/join.opt, line 290 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

add LeftJoin here too

Done.

pkg/sql/opt/xform/testdata/rules/join, line 3907 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Update comment about left join

Done.

pkg/sql/opt/xform/testdata/rules/join, line 3954 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

You need to add LeftJoin to the GenerateInvertedJoinsFromSelect rule too (I added a comment there)

Thanks. Done.
It didn't fix this problem, which suggests to me that the stats were indicating that using ConvertLeftToInnerJoin was better, which does not sound correct.
I went ahead and deleted ConvertLeftToInnerJoin and that fixed this problem.

pkg/sql/opt/xform/testdata/rules/join, line 4046 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Update comment

Done.

pkg/sql/rowexec/joinreader.go, line 843 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

If that's merged is there still a bug? Either way, would be good to update the TODO itself with more info if you plan to leave it in.

Yes, it is no longer a bug. Rebased and removed.

sumeerbhola

added a few more test cases for the logictest with a table with multiple columns in the primary key, that sandwich the geometry column, just to make sure the various adjustments to the columns (due to the continuation column) are working.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rytaft and @yuzefovich)

yuzefovich

Execution changes

Reviewed 8 of 19 files at r3, 3 of 5 files at r4.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @rytaft and @sumeerbhola)

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial, line 1 at r1 (raw file):

Previously, sumeerbhola wrote…

I've reverted this change.
But I did add forcing of the geom_index here since there were a couple of places that needed it for some reason, and I don't want end-to-end correctness testing to be compromised by inadvertently not testing the inverted joiner and paired joins.

Makes sense.

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_dist, line 90 at r4 (raw file):

https://cockroachdb.github.io/distsqlplan/decode.html#eJzElVFP2zAQx9_3Kax7opu71klaIE-ZRqZ16lrWIg0JVSg0J8gIcWY7Ewj1u09JKtKG1k4Ho49J7uf7-ey_8gjydwwuTP2h__mMZCImXybj7-TCPz8dfhqMyMHJYHo2_TFskWVJfFtWxCq4ipH8_OpPfOKf51XkYFnzviwRqyVSXUaJQiFxruRBSX-8Rn5n0WVh8dRqzYBCwkMcBXcowb0ABhQsoGDDjEIq-Byl5CL_9FgUDsJ7cLsUoiTNVP56RmHOBYL7CCpSMYILZ3mDCQYhik4XKISogiguli9VvFREd4F4AArTNEikS9qdvOk4Uy7xGPVsmC0o8ExVTaQKrhFctqDNRQbJHxQKw288SlB07HWXchBePojLKAnxHugT4d-nojZFz6LEc1qrmpbO1NrFNDdcTqy_0bKa2JDz2ywlv3iUEJ64xMu3NR5tsu2t225VtXdRPYmkipK56hyvi3r51RmLEAWGecNat2qBqwdyE8ibZ_RsURk5W42qdXjZq77Oh3KhrdrMMng_jWzrvHoN7LJkk99GtRFv87TDerXKzb37a71Z8ySyZknsWO0iKDtn0aBSy6KzxywaTFeyeLjvLBpUq0vdfbMwslcNo_3KYbSaB8JqGAi7_S9xMIjU4tDbYxwMpitxONp3HAyq1bVibxYH61Xj4PzHf9OGxhOUKU8kNvrzdHN1DK-x3KrkmZjjqeDzok35OC644kWIUpVfWfkwSMpPueAqzLSwpYctLWzrYbsOs1XYWYPZbjDrvoju6WlHu2sD3NMfVl8_s76WPtTDh1r4SA8faeFjPXz8kqPWw6ajNtCG02L6bJlofbiYIV1MHy9myBd7dsvXcceAP7vmuxyagTadmgk3DV6fsjo9W7z7GwAA__9Kzz7E

# Left joins are converted to paired joins by the optimizer.

Looking at the DistSQL diagram I think it might be good to show the fact that we have a paired joiner in there (you would need to update flow_diagram.go). What do you think?

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_dist_vec, line 1 at r4 (raw file):

# LogicTest: 5node 5node-disk

nit: I don't think 5node-disk config gives us any extra coverage. Also, since this file contains only the "plan" tests (i.e. only EXPLAIN queries), it'd be nice to move it into execbuilder/testdata folder. (We haven't been very good at this, but I think it is a good hygiene.)

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_dist_vec, line 47 at r4 (raw file):


query T
EXPLAIN (VEC, VERBOSE) SELECT lk, rk FROM ltable JOIN rtable@geom_index

nit: I'd probably not use VERBOSE since I think the goal is to make sure that the paired joiners are setup correctly in the vectorized flow which can be seen without verbose.

rytaft

Awesome! It might be worth adding a check inside memo/check_expr.go that any time a lookup join has IsSecondJoinInPairedJoiner=true, its input is an inverted join with IsFirstJoinInPairedJoiner=true. That will help root out any other cases where the two operators get separated.

Reviewed 16 of 19 files at r3, 5 of 5 files at r4.
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @sumeerbhola)

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_dist, line 100 at r3 (raw file):

Previously, sumeerbhola wrote…

This and the next one were the ones that were using the hash joins and forcing the index worked.

Do you think we need to make some stats changes? Or is this just due to the fact that we're not injecting any stats in this file, so all tables are assumed to have 1000 rows (and therefore we shouldn't be too concerned)?

pkg/sql/opt/norm/join_funcs.go, line 584 at r1 (raw file):

Previously, sumeerbhola wrote…

I tried that first. I can't access norm.CustomFuncs.f from xform and instead of exporting f I thought a narrow function would be cleaner.

you shouldn't need to access norm.CustomFuncs.f. Inside xform/custom_funcs.go, you can use c.e.f to access the factory.

pkg/sql/opt/xform/testdata/rules/join, line 3907 at r4 (raw file):


# Anti-joins are supported by converting them to a left join join wrapped in a
# select. The left join is later converted to paired-joins consisting of an

[nit] an -> a

The GenerateInvertedJoins rule now fires for left outer joins GenerateInvertedJoins in custom_funcs.go builds the two RelExprs and the continuation col ID. execbuilder.Builder makes the adjustments for outputting the continuation column for the inverted join (the first join in the pair). There are similar changes in execFactory.ConstructInvertedJoin and in DistSQLPlanner.createPlanForInvertedJoin. I could not figure out a simpler way that did not require changes in all these places since the continuation column is synthesized, and not part of the input or the scan of the right side. To prevent a sort from being interposed between the first and second join, there is a change to lookupOrIndexJoinCanProvideOrdering. This is currently the only known case where the optimizer can interpose an operation that would break the behavior of the continuation column. DistSQLPlanner always uses PhysicalPlan.AddNoGroupingStage when planning the second join, which ensures that the second join processors are on the same node as the first join processors, so there is no danger of breaking the one-to-one relationship needed between the two processors. In addition to being more efficient than the current transformation for left outer joins, I noticed in the output in inverted_join_geospatial_dist that the current transformation was not distributed while the paired joins are distributed. This PR does not include left semi and left anti joins which will be in later PRs. Release note (performance improvement): more efficient plan for execution of left outer spatial joins.

sumeerbhola

It might be worth adding a check inside memo/check_expr.go that any time a lookup join has IsSecondJoinInPairedJoiner=true, its input is an inverted join with IsFirstJoinInPairedJoiner=true. That will help root out any other cases where the two operators get separated.

Done

Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @rytaft, @sumeerbhola, and @yuzefovich)

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_dist, line 100 at r3 (raw file):

Do you think we need to make some stats changes? Or is this just due to the fact that we're not injecting any stats in this file, so all tables are assumed to have 1000 rows (and therefore we shouldn't be too concerned)?

I honestly don't know. I had overlooked that we don't have stats even though we've populated the tables.
This relates to an earlier question I had about how to investigate the costs being generated -- if you can suggest how to go about that, I can take a look.

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_dist, line 90 at r4 (raw file):

Previously, yuzefovich wrote…

Looking at the DistSQL diagram I think it might be good to show the fact that we have a paired joiner in there (you would need to update flow_diagram.go). What do you think?

Good idea. Done.

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_dist_vec, line 1 at r4 (raw file):

Previously, yuzefovich wrote…

nit: I don't think 5node-disk config gives us any extra coverage. Also, since this file contains only the "plan" tests (i.e. only EXPLAIN queries), it'd be nice to move it into execbuilder/testdata folder. (We haven't been very good at this, but I think it is a good hygiene.)

Done

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_dist_vec, line 47 at r4 (raw file):

Previously, yuzefovich wrote…

nit: I'd probably not use VERBOSE since I think the goal is to make sure that the paired joiners are setup correctly in the vectorized flow which can be seen without verbose.

Done

pkg/sql/opt/norm/join_funcs.go, line 584 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

you shouldn't need to access norm.CustomFuncs.f. Inside xform/custom_funcs.go, you can use c.e.f to access the factory.

Done.

pkg/sql/opt/xform/testdata/rules/join, line 3907 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] an -> a

Done.

rytaft

Reviewed 9 of 9 files at r5.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @sumeerbhola and @yuzefovich)

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_dist, line 100 at r3 (raw file):

Previously, sumeerbhola wrote…

Do you think we need to make some stats changes? Or is this just due to the fact that we're not injecting any stats in this file, so all tables are assumed to have 1000 rows (and therefore we shouldn't be too concerned)?

I honestly don't know. I had overlooked that we don't have stats even though we've populated the tables.
This relates to an earlier question I had about how to investigate the costs being generated -- if you can suggest how to go about that, I can take a look.

You can use EXPLAIN (opt, verbose) with both plans (forcing and not forcing the index) to see the cost and stats. But you can also feel free to merge this PR and we can look at the stats as a separate PR if you prefer.

If you want to play around with injecting stats, you can use ALTER TABLE ltable INJECT STATISTICS (or rtable). If you grep for INJECT STATISTICS in the repo you can see what the injected JSON looks like in other tests.

sumeerbhola

Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @rytaft, @sumeerbhola, and @yuzefovich)

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_dist, line 100 at r3 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

You can use EXPLAIN (opt, verbose) with both plans (forcing and not forcing the index) to see the cost and stats. But you can also feel free to merge this PR and we can look at the stats as a separate PR if you prefer.

If you want to play around with injecting stats, you can use ALTER TABLE ltable INJECT STATISTICS (or rtable). If you grep for INJECT STATISTICS in the repo you can see what the injected JSON looks like in other tests.

Without forcing

                    sort
                     ├── columns: lk:1 rk:6
                     ├── immutable
                     ├── stats: [rows=333333.333, distinct(1)=1000, null(1)=0]
                     ├── cost: 150977.656
                     ├── key: (1,6)
                     ├── ordering: +1,+6
                     ├── prune: (1,6)
                     ├── interesting orderings: (+1) (+6)
                     └── project
                          ├── columns: lk:1 rk:6
                          ├── immutable
                          ├── stats: [rows=333333.333, distinct(1)=1000, null(1)=0]
                          ├── cost: 15551.4033
                          ├── key: (1,6)
                          ├── prune: (1,6)
                          ├── interesting orderings: (+1) (+6)
                          └── left-join (cross)
                               ├── columns: lk:1 geom1:2 rk:6 geom:7
                               ├── immutable
                               ├── stats: [rows=333333.333, distinct(1)=1000, null(1)=0, distinct(7)=100, null(7)=3333.33333]
                               ├── cost: 12218.06
                               ├── key: (1,6)
                               ├── fd: (1)-->(2), (6)-->(7)
                               ├── interesting orderings: (+1) (+6)
                               ├── scan ltable
                               │    ├── columns: lk:1 geom1:2
                               │    ├── stats: [rows=1000, distinct(1)=1000, null(1)=0, distinct(2)=100, null(2)=10]
                               │    ├── cost: 1104.02
                               │    ├── key: (1)
                               │    ├── fd: (1)-->(2)
                               │    ├── prune: (1,2)
                               │    ├── interesting orderings: (+1)
                               │    └── unfiltered-cols: (1-5)
                               ├── scan rtable
                               │    ├── columns: rk:6 geom:7
                               │    ├── stats: [rows=1000, distinct(7)=100, null(7)=10]
                               │    ├── cost: 1084.02
                               │    ├── key: (6)
                               │    ├── fd: (6)-->(7)
                               │    ├── prune: (6,7)
                               │    ├── interesting orderings: (+6)
                               │    └── unfiltered-cols: (6-10)
                               └── filters
                                    └── st_intersects(geom:7, geom1:2) OR st_dwithin(geom1:2, geom:7, 2.0) [outer=(2,7), immutable, constraints=(/2: (/NULL - ]; /7: (/NULL - ])]

With forcing

                    sort (segmented)
                     ├── columns: lk:1 rk:6
                     ├── immutable
                     ├── stats: [rows=333333.333, distinct(1)=1000, null(1)=0]
                     ├── cost: 168642.882
                     ├── key: (1,6)
                     ├── ordering: +1,+6
                     ├── prune: (1,6)
                     ├── interesting orderings: (+1) (+6)
                     └── project
                          ├── columns: lk:1 rk:6
                          ├── immutable
                          ├── stats: [rows=333333.333, distinct(1)=1000, null(1)=0]
                          ├── cost: 106037.393
                          ├── key: (1,6)
                          ├── ordering: +1
                          ├── prune: (1,6)
                          ├── interesting orderings: (+1) (+6)
                          └── left-join (lookup rtable)
                               ├── columns: lk:1 geom1:2 rk:6 geom:7
                               ├── key columns: [6] = [6]
                               ├── lookup columns are key
                               ├── immutable
                               ├── stats: [rows=333333.333, distinct(1)=1000, null(1)=0, distinct(7)=100, null(7)=3333.33333]
                               ├── cost: 102704.05
                               ├── key: (1,6)
                               ├── fd: (1)-->(2), (6)-->(7)
                               ├── ordering: +1
                               ├── interesting orderings: (+1) (+6)
                               ├── left-join (inverted-lookup rtable@geom_index)
                               │    ├── columns: lk:1 geom1:2 rk:6 continuation:12
                               │    ├── inverted-expr
                               │    │    └── st_intersects(geom1:2, geom:7) OR st_dwithin(geom1:2, geom:7, 2.0)
                               │    ├── stats: [rows=10000, distinct(1)=1000, null(1)=0]
                               │    ├── cost: 42004.03
                               │    ├── key: (1,6)
                               │    ├── fd: (1)-->(2), (6)-->(12)
                               │    ├── ordering: +1
                               │    ├── scan ltable
                               │    │    ├── columns: lk:1 geom1:2
                               │    │    ├── stats: [rows=1000, distinct(1)=1000, null(1)=0, distinct(2)=100, null(2)=10]
                               │    │    ├── cost: 1104.02
                               │    │    ├── key: (1)
                               │    │    ├── fd: (1)-->(2)
                               │    │    ├── ordering: +1
                               │    │    ├── prune: (1,2)
                               │    │    ├── interesting orderings: (+1)
                               │    │    └── unfiltered-cols: (1-5)
                               │    └── filters (true)
                               └── filters
                                    └── st_intersects(geom:7, geom1:2) OR st_dwithin(geom1:2, geom:7, 2.0) [outer=(2,7), immutable, constraints=(/2: (/NULL - ]; /7: (/NULL - ])]

Is 1000 rows the default when there are no stats?
Does the stats: [rows=10000, distinct(1)=1000, null(1)=0] for the inverted-lookup mean that it is using a selectivity of 0.01, since 1000 * 1000 * 0.01 = 10000?
The stats for the lookup join stats: [rows=333333.333, distinct(1)=1000, null(1)=0, distinct(7)=100, null(7)=3333.33333] seem broken. Each left row has the PK for the right table, so the output can't be more rows than the input. Does it think this is a general one-to-many lookup join?

rytaft · 2020-10-12T16:14:55Z

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_dist, line 100 at r3 (raw file):

Previously, sumeerbhola wrote…

Without forcing

                    sort
                     ├── columns: lk:1 rk:6
                     ├── immutable
                     ├── stats: [rows=333333.333, distinct(1)=1000, null(1)=0]
                     ├── cost: 150977.656
                     ├── key: (1,6)
                     ├── ordering: +1,+6
                     ├── prune: (1,6)
                     ├── interesting orderings: (+1) (+6)
                     └── project
                          ├── columns: lk:1 rk:6
                          ├── immutable
                          ├── stats: [rows=333333.333, distinct(1)=1000, null(1)=0]
                          ├── cost: 15551.4033
                          ├── key: (1,6)
                          ├── prune: (1,6)
                          ├── interesting orderings: (+1) (+6)
                          └── left-join (cross)
                               ├── columns: lk:1 geom1:2 rk:6 geom:7
                               ├── immutable
                               ├── stats: [rows=333333.333, distinct(1)=1000, null(1)=0, distinct(7)=100, null(7)=3333.33333]
                               ├── cost: 12218.06
                               ├── key: (1,6)
                               ├── fd: (1)-->(2), (6)-->(7)
                               ├── interesting orderings: (+1) (+6)
                               ├── scan ltable
                               │    ├── columns: lk:1 geom1:2
                               │    ├── stats: [rows=1000, distinct(1)=1000, null(1)=0, distinct(2)=100, null(2)=10]
                               │    ├── cost: 1104.02
                               │    ├── key: (1)
                               │    ├── fd: (1)-->(2)
                               │    ├── prune: (1,2)
                               │    ├── interesting orderings: (+1)
                               │    └── unfiltered-cols: (1-5)
                               ├── scan rtable
                               │    ├── columns: rk:6 geom:7
                               │    ├── stats: [rows=1000, distinct(7)=100, null(7)=10]
                               │    ├── cost: 1084.02
                               │    ├── key: (6)
                               │    ├── fd: (6)-->(7)
                               │    ├── prune: (6,7)
                               │    ├── interesting orderings: (+6)
                               │    └── unfiltered-cols: (6-10)
                               └── filters
                                    └── st_intersects(geom:7, geom1:2) OR st_dwithin(geom1:2, geom:7, 2.0) [outer=(2,7), immutable, constraints=(/2: (/NULL - ]; /7: (/NULL - ])]

With forcing

                    sort (segmented)
                     ├── columns: lk:1 rk:6
                     ├── immutable
                     ├── stats: [rows=333333.333, distinct(1)=1000, null(1)=0]
                     ├── cost: 168642.882
                     ├── key: (1,6)
                     ├── ordering: +1,+6
                     ├── prune: (1,6)
                     ├── interesting orderings: (+1) (+6)
                     └── project
                          ├── columns: lk:1 rk:6
                          ├── immutable
                          ├── stats: [rows=333333.333, distinct(1)=1000, null(1)=0]
                          ├── cost: 106037.393
                          ├── key: (1,6)
                          ├── ordering: +1
                          ├── prune: (1,6)
                          ├── interesting orderings: (+1) (+6)
                          └── left-join (lookup rtable)
                               ├── columns: lk:1 geom1:2 rk:6 geom:7
                               ├── key columns: [6] = [6]
                               ├── lookup columns are key
                               ├── immutable
                               ├── stats: [rows=333333.333, distinct(1)=1000, null(1)=0, distinct(7)=100, null(7)=3333.33333]
                               ├── cost: 102704.05
                               ├── key: (1,6)
                               ├── fd: (1)-->(2), (6)-->(7)
                               ├── ordering: +1
                               ├── interesting orderings: (+1) (+6)
                               ├── left-join (inverted-lookup rtable@geom_index)
                               │    ├── columns: lk:1 geom1:2 rk:6 continuation:12
                               │    ├── inverted-expr
                               │    │    └── st_intersects(geom1:2, geom:7) OR st_dwithin(geom1:2, geom:7, 2.0)
                               │    ├── stats: [rows=10000, distinct(1)=1000, null(1)=0]
                               │    ├── cost: 42004.03
                               │    ├── key: (1,6)
                               │    ├── fd: (1)-->(2), (6)-->(12)
                               │    ├── ordering: +1
                               │    ├── scan ltable
                               │    │    ├── columns: lk:1 geom1:2
                               │    │    ├── stats: [rows=1000, distinct(1)=1000, null(1)=0, distinct(2)=100, null(2)=10]
                               │    │    ├── cost: 1104.02
                               │    │    ├── key: (1)
                               │    │    ├── fd: (1)-->(2)
                               │    │    ├── ordering: +1
                               │    │    ├── prune: (1,2)
                               │    │    ├── interesting orderings: (+1)
                               │    │    └── unfiltered-cols: (1-5)
                               │    └── filters (true)
                               └── filters
                                    └── st_intersects(geom:7, geom1:2) OR st_dwithin(geom1:2, geom:7, 2.0) [outer=(2,7), immutable, constraints=(/2: (/NULL - ]; /7: (/NULL - ])]

Is 1000 rows the default when there are no stats?
Does the stats: [rows=10000, distinct(1)=1000, null(1)=0] for the inverted-lookup mean that it is using a selectivity of 0.01, since 1000 * 1000 * 0.01 = 10000?
The stats for the lookup join stats: [rows=333333.333, distinct(1)=1000, null(1)=0, distinct(7)=100, null(7)=3333.33333] seem broken. Each left row has the PK for the right table, so the output can't be more rows than the input. Does it think this is a general one-to-many lookup join?

Yea, does seem kind of broken. I think this is due to the OR expression, since the stats code isn't currently smart enough to know that an inverted join condition could include an OR expression. What happens if you change the OR to an AND?

Feel free to open an issue and assign to me.

rytaft · 2020-10-12T16:15:54Z

pkg/sql/logictest/testdata/logic_test/inverted_join_geospatial_dist, line 100 at r3 (raw file):

Is 1000 rows the default when there are no stats?
Does the stats: [rows=10000, distinct(1)=1000, null(1)=0] for the inverted-lookup mean that it is using a selectivity of 0.01, since 1000 * 1000 * 0.01 = 10000?

Yes, that's correct

sumeerbhola · 2020-10-12T17:20:33Z

What happens if you change the OR to an AND?

The rows stay at 10,000

Feel free to open an issue and assign to me.

#55448

sumeerbhola · 2020-10-12T17:20:52Z

bors r+

craig · 2020-10-12T19:01:14Z

Build failed (retrying...):

GitHub CI (Cockroach)

craig · 2020-10-12T21:25:51Z

Build succeeded:

GitHub CI (Cockroach)

sumeerbhola requested review from rytaft and yuzefovich October 4, 2020 15:56

sumeerbhola requested a review from a team as a code owner October 4, 2020 15:56

rytaft reviewed Oct 6, 2020

View reviewed changes

sumeerbhola force-pushed the left_join2 branch from 4775784 to cdd23c1 Compare October 6, 2020 19:43

sumeerbhola commented Oct 6, 2020

View reviewed changes

yuzefovich reviewed Oct 6, 2020

View reviewed changes

sumeerbhola commented Oct 6, 2020

View reviewed changes

yuzefovich reviewed Oct 7, 2020

View reviewed changes

rytaft reviewed Oct 7, 2020

View reviewed changes

sumeerbhola changed the title ~~[WIP] opt,sql: use paired joins for left outer spatial joins~~ opt,sql: use paired joins for left outer spatial joins Oct 8, 2020

sumeerbhola force-pushed the left_join2 branch 3 times, most recently from c3e6be6 to 8384b1f Compare October 8, 2020 17:20

sumeerbhola commented Oct 8, 2020

View reviewed changes

sumeerbhola force-pushed the left_join2 branch from 8384b1f to 91d81ed Compare October 8, 2020 20:29

sumeerbhola commented Oct 8, 2020

View reviewed changes

yuzefovich approved these changes Oct 8, 2020

View reviewed changes

rytaft approved these changes Oct 9, 2020

View reviewed changes

sumeerbhola force-pushed the left_join2 branch from 91d81ed to 22fdf47 Compare October 10, 2020 00:07

sumeerbhola force-pushed the left_join2 branch from 22fdf47 to 051263f Compare October 10, 2020 00:27

sumeerbhola commented Oct 10, 2020

View reviewed changes

rytaft approved these changes Oct 12, 2020

View reviewed changes

sumeerbhola commented Oct 12, 2020

View reviewed changes

sumeerbhola mentioned this pull request Oct 12, 2020

opt: use paired joins for left joins with non-covering indices #55452

Open

craig bot merged commit 03e8d32 into cockroachdb:master Oct 12, 2020

sumeerbhola mentioned this pull request Oct 27, 2020

opt, sql: LEFT OUTER and ANTI spatial joins cannot be index-accelerated #53576

Closed

sumeerbhola deleted the left_join2 branch November 2, 2020 12:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

opt,sql: use paired joins for left outer spatial joins #55216

opt,sql: use paired joins for left outer spatial joins #55216

sumeerbhola commented Oct 4, 2020

cockroach-teamcity commented Oct 4, 2020

rytaft left a comment

sumeerbhola left a comment

yuzefovich left a comment

sumeerbhola left a comment

yuzefovich left a comment

rytaft left a comment

sumeerbhola left a comment

sumeerbhola left a comment

yuzefovich left a comment

rytaft left a comment

sumeerbhola left a comment

rytaft left a comment

sumeerbhola left a comment

rytaft commented Oct 12, 2020

rytaft commented Oct 12, 2020

sumeerbhola commented Oct 12, 2020

sumeerbhola commented Oct 12, 2020

craig bot commented Oct 12, 2020

craig bot commented Oct 12, 2020

opt,sql: use paired joins for left outer spatial joins #55216

opt,sql: use paired joins for left outer spatial joins #55216

Conversation

sumeerbhola commented Oct 4, 2020

cockroach-teamcity commented Oct 4, 2020

rytaft left a comment

Choose a reason for hiding this comment

sumeerbhola left a comment

Choose a reason for hiding this comment

yuzefovich left a comment

Choose a reason for hiding this comment

sumeerbhola left a comment

Choose a reason for hiding this comment

yuzefovich left a comment

Choose a reason for hiding this comment

rytaft left a comment

Choose a reason for hiding this comment

sumeerbhola left a comment

Choose a reason for hiding this comment

sumeerbhola left a comment

Choose a reason for hiding this comment

yuzefovich left a comment

Choose a reason for hiding this comment

rytaft left a comment

Choose a reason for hiding this comment

sumeerbhola left a comment

Choose a reason for hiding this comment

rytaft left a comment

Choose a reason for hiding this comment

sumeerbhola left a comment

Choose a reason for hiding this comment

rytaft commented Oct 12, 2020

rytaft commented Oct 12, 2020

sumeerbhola commented Oct 12, 2020

sumeerbhola commented Oct 12, 2020

craig bot commented Oct 12, 2020

craig bot commented Oct 12, 2020