-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: implement WITH RECURSIVE with UNION #71685
Conversation
pkg/sql/opt/optbuilder/testdata/with, line 1184 at r1 (raw file):
This diff looks like out of left field, but before this change we didn't even try to build it as a recursive CTE. Now we do and then realize it's not really recursive. The column IDs now match the case above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! I only have nits, so
Reviewed 15 of 15 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @RaduBerinde)
pkg/sql/apply_join.go, line 254 at r1 (raw file):
} // runPlanInsidePlan is used to run a plan and gather the results in a row
nit: the comment should now mention the resultWriter.
pkg/sql/buffer_util.go, line 37 at r1 (raw file):
} func (c *rowContainerHelper) init(
nit: do you think it'd be cleaner to keep the signature of init
unchanged but to introduce initWithDeduplication
that would take the new argument and init
would call it passing in false
?
pkg/sql/buffer_util.go, line 75 at r1 (raw file):
} // addRowWithDedup adds a given row if not already present in the container.
super nit: I think s/a given row/the given row/
.
pkg/sql/buffer_util.go, line 79 at r1 (raw file):
func (c *rowContainerHelper) addRowWithDedup( ctx context.Context, row tree.Datums, ) (ok bool, _ error) {
super nit: maybe do s/ok/added/
to be more explicit about the meaning of the boolean?
pkg/sql/recursive_cte.go, line 179 at r1 (raw file):
// // If we are deduplicating, each row is either discarded if it has a duplicate // in the allRows container or added to both alRows and workingRows otherwise.
nit: s/alRows/allRows/
.
pkg/sql/opt/exec/factory.opt, line 693 at r1 (raw file):
# - the plan is executed; the results are emitted and also saved in a new # buffer for the next iteration. If Deduplicate is true, only rows that # haven't been returned yet are saved.
nit: maybe s/are saved/are returned and saved/
.
pkg/sql/opt/optbuilder/with.go, line 237 at r1 (raw file):
initial, recursive, isUnionAll, ok := b.splitRecursiveCTE(cte.Stmt) // We don't currently support the UNION form (only UNION ALL).
nit: needs an update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome!! It would be nice to include the variant in ordinary EXPLAIN
as well, unless I'm missing something this will still just say recursive-cte
and not include the detail about UNION
vs UNION ALL
.
9a1f644
to
297cfe1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs! Added a field to EXPLAIN VERBOSE.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @yuzefovich)
pkg/sql/buffer_util.go, line 37 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: do you think it'd be cleaner to keep the signature of
init
unchanged but to introduceinitWithDeduplication
that would take the new argument andinit
would call it passing infalse
?
Done. I also added a commit that capitalizes the API.
pkg/sql/buffer_util.go, line 75 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
super nit: I think
s/a given row/the given row/
.
Done.
pkg/sql/buffer_util.go, line 79 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
super nit: maybe do
s/ok/added/
to be more explicit about the meaning of the boolean?
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the second commit is missing from the PR description
Reviewed 7 of 15 files at r1, 10 of 10 files at r2, 7 of 7 files at r3, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner, @RaduBerinde, and @yuzefovich)
pkg/sql/buffer_util.go, line 64 at r2 (raw file):
} c.rows.Init( ordering, typs, &evalContext.EvalContext,
Does DoDeDuplicate require a non-empty ordering? It's unclear in the comments for DiskBackedRowContainer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @yuzefovich)
pkg/sql/buffer_util.go, line 64 at r2 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Does DoDeDuplicate require a non-empty ordering? It's unclear in the comments for DiskBackedRowContainer.
AFAICT it deduplicates on the columns in the ordering
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner, @RaduBerinde, and @yuzefovich)
pkg/sql/buffer_util.go, line 64 at r2 (raw file):
Previously, RaduBerinde wrote…
AFAICT it deduplicates on the columns in the ordering
I see. Maybe add a comment here to explain that?
This change implements the UNION variant of WITH RECURSIVE, where rows are deduplicated. We achieve this by storing all rows in a deduplicating container and inserting in that container first, detecting if the row is a duplicate. Fixes cockroachdb#46642. Release note (sql change): The WITH RECURSIVE variant that uses UNION (as opposed to UNION ALL) is now supported.
This change capitalizes the API of `rowContainerHelper` and `rowContainerIterator` to make it more clear what functions are not internal to the structure. Release note: None
297cfe1
to
ac476c3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner and @yuzefovich)
pkg/sql/buffer_util.go, line 64 at r2 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
I see. Maybe add a comment here to explain that?
Done, also added to DoDeDuplicate
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 8 of 8 files at r4, 7 of 7 files at r5, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner and @yuzefovich)
pkg/sql/buffer_util.go, line 64 at r2 (raw file):
Previously, RaduBerinde wrote…
Done, also added to
DoDeDuplicate
.
Thanks!
bors r+ |
Build failed: |
bors r+ |
Build succeeded: |
sql: implement WITH RECURSIVE with UNION
This change implements the UNION variant of WITH RECURSIVE, where rows
are deduplicated. We achieve this by storing all rows in a
deduplicating container and inserting in that container first,
detecting if the row is a duplicate.
Fixes #46642.
Release note (sql change): The WITH RECURSIVE variant that uses UNION
(as opposed to UNION ALL) is now supported.
sql: capitalize rowContainerHelper API
This change capitalizes the API of
rowContainerHelper
androwContainerIterator
to make it more clear what functions are notinternal to the structure.
Release note: None