Fix incorrect $bucket for mismatch bucket queries #11885

Merged

Conversation

oraclechang
Member

Comparable change: prestodb/presto#12429

This change allows queries to execute when the following condition holds. Currently, such queries raise an exception.

if (tableBucketCount != readBucketCount && bucketFilter.isPresent())

A more specific example can be seen in the testMismatchedBucketWithBucketPredicate() test.
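To make the condition concrete, here is a minimal sketch of one way a `$bucket` filter could be applied when the read bucket count is smaller than the table bucket count. It is not the code from this PR: the class `BucketMismatchSketch` and both method names are hypothetical, and it assumes the table bucket count is a multiple of the read bucket count, so that a file written to table bucket `b` is read as bucket `b % readBucketCount`. The guard mirrors the check in the diff that rejects a read bucket count higher than the table bucket count; the exception type used here is only illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names come from the Trino code base.
final class BucketMismatchSketch
{
    private BucketMismatchSketch() {}

    // Maps the bucket a file was written into to the bucket number the query reads it as.
    // Assumes readBucketCount divides the table bucket count evenly.
    static int toReadBucket(int tableBucket, int readBucketCount)
    {
        return tableBucket % readBucketCount;
    }

    // Returns the table buckets whose files survive a filter expressed in read bucket numbers.
    static List<Integer> eligibleTableBuckets(int tableBucketCount, int readBucketCount, Set<Integer> bucketsToKeep)
    {
        if (readBucketCount > tableBucketCount) {
            // Illustrative exception type; the real check throws a TrinoException
            throw new IllegalArgumentException("read bucket count must not exceed table bucket count");
        }
        if (tableBucketCount % readBucketCount != 0) {
            // Assumption of this sketch, not a statement about the connector
            throw new IllegalArgumentException("table bucket count must be a multiple of read bucket count");
        }
        List<Integer> eligible = new ArrayList<>();
        for (int tableBucket = 0; tableBucket < tableBucketCount; tableBucket++) {
            if (bucketsToKeep.contains(toReadBucket(tableBucket, readBucketCount))) {
                eligible.add(tableBucket);
            }
        }
        return eligible;
    }

    public static void main(String[] args)
    {
        // A table with 32 buckets read as 16 buckets, with a filter keeping read bucket 1
        System.out.println(eligibleTableBuckets(32, 16, Set.of(1))); // prints [1, 17]
    }
}
```

With 32 table buckets read as 16, a predicate on read bucket 1 has to keep the files of table buckets 1 and 17, which is why the filter cannot simply be compared against the table bucket number.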

@cla-bot cla-bot bot added the cla-signed label Apr 8, 2022
@findepi findepi requested a review from electrum April 9, 2022 08:33
@oraclechang oraclechang force-pushed the support_mismatch_bucket_query2 branch from 3580a93 to f78d3a1 Compare April 12, 2022 00:30
assertUpdate(
"CREATE TABLE test_mismatch_bucketing32\n" +
"WITH (bucket_count = 32, bucketed_by = ARRAY['key32']) AS\n" +
"SELECT custkey key32, comment value32 FROM orders",
Member

Is this testable with smaller tables?
Like nation with bucketing on nationkey?

Member Author

Thanks!

assertQuery(withMismatchOptimization, query, "SELECT 130361");
assertQuery(withoutMismatchOptimization, query, "SELECT 130361");
}
finally {
Member

nit: the try-finally isn't really needed in this class, since all the test data is ephemeral anyway

Member Author

Removed the try-finally

15000);

Session withMismatchOptimization = Session.builder(getSession())
.setSystemProperty(COLOCATED_JOIN, "true")
Member

Does it matter? Document why.

Member Author

Removed

.build();

@Language("SQL") String query = "SELECT count(*) AS count\n" +
"FROM (\n" +
Member

nit: the \n are redundant; replace them with spaces so that it's more readable

Member Author

Removed

@oraclechang oraclechang force-pushed the support_mismatch_bucket_query2 branch 2 times, most recently from 3ce8645 to 185deca Compare April 15, 2022 00:39
@oraclechang oraclechang requested a review from findepi April 18, 2022 17:14
@oraclechang
Member Author

oraclechang commented Apr 25, 2022

Hi @findepi, @electrum

may I have a review please?
Thanks!

@oraclechang oraclechang requested a review from arhimondr May 3, 2022 21:06
@oraclechang
Member Author

Hi @arhimondr,

Can you please have a look?
Thanks!

@@ -219,6 +219,14 @@ public ConnectorSplitSource getSplits(
// sort partitions
partitions = Ordering.natural().onResultOf(HivePartition::getPartitionId).reverse().sortedCopy(partitions);

if (bucketHandle.isPresent()) {
if (bucketHandle.get().getReadBucketCount() > bucketHandle.get().getTableBucketCount()) {
throw new TrinoException(
Contributor

Could you please open a follow-up PR dropping HivePartitioningHandle#maxCompatibleBucketCount? It is now effectively unused, since a read bucket count higher than the table bucket count is no longer supported.

Member Author

sure

@arhimondr
Contributor

@oraclechang Could you please rebase and resubmit to make sure it compiles on trunk? I will merge once the build is green.

@oraclechang oraclechang force-pushed the support_mismatch_bucket_query2 branch from 185deca to 9adf088 Compare May 6, 2022 23:36
@arhimondr arhimondr merged commit 4dd11db into trinodb:master May 7, 2022
@github-actions github-actions bot added this to the 381 milestone May 7, 2022
@oraclechang oraclechang deleted the support_mismatch_bucket_query2 branch May 12, 2022 01:09
3 participants