Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: modify limit pushdown rule to support inner joins #49802

Merged
merged 1 commit into from
Jun 6, 2020

Conversation

DrewKimball
Copy link
Collaborator

Previously, the optimizer could not push a limit into an InnerJoin.
This patch replaces PushLimitIntoLeftJoin with two rules which
perform the same function as well as handle the InnerJoin case.
A limit can be pushed into a given side of an InnerJoin when rows
from that side are guaranteed to be preserved by the join.

Release note (sql change): improve performance for queries with a
limit on a join that is guaranteed to preserve input rows.

@DrewKimball DrewKimball requested a review from a team as a code owner June 2, 2020 16:31
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)


pkg/sql/opt/norm/rules/limit.opt, line 128 at r1 (raw file):

# PushLimitIntoLeftJoin pushes a Limit into the left input of a join when rows
# from the left input are guaranteed to (1) be preserved by the join and (2) to
# not be null-extended by the join. Since the join creates an output row for

Why does null-extension matter? (I don't think the rule actually checks that, does it?)


pkg/sql/opt/norm/rules/limit.opt, line 168 at r1 (raw file):

# Why can we not match a LeftJoin (or FullJoin)?
#
#   SELECT * FROM ab LEFT JOIN xy ON a = x LIMIT 10

For this example, do we assume that x is a FK reference to a?


pkg/sql/opt/norm/testdata/rules/limit, line 18 at r1 (raw file):


exec-ddl
CREATE TABLE fk(k INT PRIMARY KEY, v INT, r INT NOT NULL REFERENCES uv(u))

[nit] we found it useful when reading queries to make sure the table name has the column names, maybe fk_kvr or kvr_fk


pkg/sql/opt/norm/testdata/rules/limit, line 740 at r1 (raw file):

 ├── key: (1)
 ├── fd: (1)-->(2,3), (4)-->(5), (3)==(4), (4)==(3)
 ├── inner-join (hash)

Not strictly related to this change, but we should be able to figure out that the cardinality of this join is [0-10]. We'd need to check that the equality columns form a key in the right-hand side (in joinPropsHelper.cardinality).

Copy link
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde)


pkg/sql/opt/norm/rules/limit.opt, line 128 at r1 (raw file):

Previously, RaduBerinde wrote…

Why does null-extension matter? (I don't think the rule actually checks that, does it?)

As with the join elimination rule, we implicitly check this by only matching on InnerJoins and LeftJoins.

I intended the comment on PushLimitIntoJoinRight to explain this; I'll try and make it more clear that this is the case.


pkg/sql/opt/norm/rules/limit.opt, line 168 at r1 (raw file):

Previously, RaduBerinde wrote…

For this example, do we assume that x is a FK reference to a?

Yes, I'll change those to be more clear. Done.


pkg/sql/opt/norm/testdata/rules/limit, line 18 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] we found it useful when reading queries to make sure the table name has the column names, maybe fk_kvr or kvr_fk

Done.


pkg/sql/opt/norm/testdata/rules/limit, line 740 at r1 (raw file):

Previously, RaduBerinde wrote…

Not strictly related to this change, but we should be able to figure out that the cardinality of this join is [0-10]. We'd need to check that the equality columns form a key in the right-hand side (in joinPropsHelper.cardinality).

Yeah, doing this would allow the external limit to be removed in some cases. #49822 will make this easier, since JoinMultiplicity will be calculated at join construction instead of lazily, so it will be available during logical props building.

@DrewKimball DrewKimball force-pushed the limit_pushdown branch 2 times, most recently from fe882d0 to c3b8a13 Compare June 3, 2020 05:56
Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball and @RaduBerinde)


pkg/sql/opt/norm/rules/limit.opt, line 128 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

As with the join elimination rule, we implicitly check this by only matching on InnerJoins and LeftJoins.

I intended the comment on PushLimitIntoJoinRight to explain this; I'll try and make it more clear that this is the case.

Ah, I read it as referring to null-extension of the other side. Usually we'd just phrase this as "input of an inner or left join where ...". Usually it's pretty clear why it won't work for FullJoin, plus you're already explaining that in detail below.


pkg/sql/opt/norm/rules/limit.opt, line 126 at r2 (raw file):

)

# PushLimitIntoLeftJoin pushes a Limit into the left input of a join when rows

IntoJoinLeft


pkg/sql/opt/norm/rules/limit.opt, line 132 at r2 (raw file):

# do this if the limit ordering refers only to the left input columns. We also
# check that the cardinality of the left input is more than the limit, to
# prevent repeated applications of the rule. FullJoins and RightJoins cannot be

[nit] This can be a bit confusing because we wouldn't have rules on RightJoins to begin with (they get normalized to LeftJoins). Maybe just talk about FULL JOIN.


pkg/sql/opt/norm/testdata/rules/limit, line 740 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Yeah, doing this would allow the external limit to be removed in some cases. #49822 will make this easier, since JoinMultiplicity will be calculated at join construction instead of lazily, so it will be available during logical props building.

Awesome!

Copy link
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball and @RaduBerinde)


pkg/sql/opt/norm/rules/limit.opt, line 128 at r1 (raw file):

Previously, RaduBerinde wrote…

Ah, I read it as referring to null-extension of the other side. Usually we'd just phrase this as "input of an inner or left join where ...". Usually it's pretty clear why it won't work for FullJoin, plus you're already explaining that in detail below.

Done.


pkg/sql/opt/norm/rules/limit.opt, line 126 at r2 (raw file):

Previously, RaduBerinde wrote…

IntoJoinLeft

Done.


pkg/sql/opt/norm/rules/limit.opt, line 132 at r2 (raw file):

Previously, RaduBerinde wrote…

[nit] This can be a bit confusing because we wouldn't have rules on RightJoins to begin with (they get normalized to LeftJoins). Maybe just talk about FULL JOIN.

Done.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 3 files at r1, 1 of 2 files at r2, 1 of 1 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @RaduBerinde)


pkg/sql/opt/norm/rules/limit.opt, line 137 at r3 (raw file):

#
#   CREATE TABLE t_x (x INT PRIMARY KEY)
#   CREATE TABLE t_r (r INT NOT NULL REFERENCES x(x))

[nit] x(x) -> t_x(x)

Copy link
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @RaduBerinde, and @rytaft)


pkg/sql/opt/norm/rules/limit.opt, line 137 at r3 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] x(x) -> t_x(x)

Whoops, done.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @DrewKimball and @RaduBerinde)

Previously, the optimizer could not push a limit into an InnerJoin.
This patch replaces PushLimitIntoLeftJoin with two rules which
perform the same function as well as handle the InnerJoin case.
A limit can be pushed into a given side of an InnerJoin when rows
from that side are guaranteed to be preserved by the join.

Release note (sql change): improve performance for queries with a
limit on a join that is guaranteed to preserve input rows.
@DrewKimball
Copy link
Collaborator Author

bors r+

@craig
Copy link
Contributor

craig bot commented Jun 6, 2020

Build succeeded

@craig craig bot merged commit ef4eec0 into cockroachdb:master Jun 6, 2020
@DrewKimball DrewKimball deleted the limit_pushdown branch June 18, 2020 04:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants