[DO NOT MERGE][17972][SQL] Another try of PR #15517 #15565
Closed
What changes were proposed in this pull request?
This is another try of PR #15517, which aims to solve the exponential slowdown of query planning time. It's still a PoC. I'm opening this PR to check whether Jenkins complains.
This PR adds a new method, `Dataset.cached`, which returns a new Dataset with a cached version of the logical plan of the current Dataset, so that we can truncate the cached sub-plan tree.
The existing `Dataset.cache()` method doesn't fit because it mutates the internal state of the current Dataset.
The microbenchmark results are basically the same as the ones described in #15517.
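To illustrate the difference between a mutating `cache()` and a `cached` that returns a truncated plan, here is a minimal toy model in plain Python. It is not Spark code and not the implementation in this PR: `Plan`, `ToyDataset`, `self_union`, and `plan_size_after` are hypothetical names made up for this sketch, and the assumption that every step doubles the plan is only there to make the growth easy to see.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Plan:
    """Toy logical plan node (hypothetical, not Spark's LogicalPlan)."""
    name: str
    children: Tuple["Plan", ...] = ()

    def size(self) -> int:
        # Number of nodes a full traversal of this plan would visit.
        return 1 + sum(child.size() for child in self.children)


class ToyDataset:
    """Toy stand-in for a Dataset that only tracks its logical plan."""

    def __init__(self, plan: Plan) -> None:
        self.plan = plan
        self.is_cached = False

    def self_union(self) -> "ToyDataset":
        # Assumed workload: each step references the previous plan twice.
        return ToyDataset(Plan("Union", (self.plan, self.plan)))

    def cache(self) -> "ToyDataset":
        # Mutates this object and returns it; the plan tree is unchanged.
        self.is_cached = True
        return self

    def cached(self) -> "ToyDataset":
        # Leaves this object alone and returns a new dataset whose plan
        # is a single leaf standing in for the already-cached subtree.
        return ToyDataset(Plan("CachedRelation"))


def plan_size_after(iterations: int, truncate: bool) -> int:
    """Build an iterative plan and report how many nodes it contains."""
    ds = ToyDataset(Plan("Scan"))
    for _ in range(iterations):
        ds = ds.self_union()
        if truncate:
            ds = ds.cached()   # plan collapses to one leaf
        else:
            ds.cache()         # plan keeps growing
    return ds.plan.size()


if __name__ == "__main__":
    print(plan_size_after(10, truncate=False))  # 2047
    print(plan_size_after(10, truncate=True))   # 1
```

In this sketch the untruncated plan has 2^(n+1) - 1 nodes after n iterations, while the truncated one stays at a single leaf, which is the effect the proposed method is meant to have on planning time.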
How was this patch tested?
N/A. Existing tests should be enough.