Conversation

@freakyzoidberg
Member

Summary

  • Fixes incorrect results from ROLLUP/CUBE/GROUPING SETS queries when using multiple partitions
  • The subset satisfaction optimization was incorrectly allowing hash partitioning on fewer columns to satisfy requirements that include __grouping_id
  • This caused partial aggregates from different partitions to be finalized independently, producing duplicate grand totals
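The failure mode described above can be reproduced in a toy Python simulation. This is not DataFusion code: the sample rows, the helper names (`partial_aggregate`, `finalize_in_place`, `finalize_after_repartition`, `toy_hash`), the two-partition layout, and the `__grouping_id` values 0, 1, and 3 are all assumptions chosen for illustration. It assumes the input is already split by `channel`, so a plan that treats that layout as good enough for the final aggregate leaves one grand-total state in every partition.

```python
import zlib
from collections import defaultdict

# Two input partitions, laid out as if already hash-partitioned on `channel`.
input_partitions = [
    [("web", "a", 10), ("web", "b", 20)],
    [("store", "a", 5), ("store", "b", 7)],
]


def partial_aggregate(rows):
    """ROLLUP(channel, brand): expand each row into three grouping sets and pre-sum."""
    acc = defaultdict(int)
    for channel, brand, total in rows:
        acc[(None, None, 3)] += total      # grand total (illustrative grouping id)
        acc[(channel, None, 1)] += total   # per-channel subtotal
        acc[(channel, brand, 0)] += total  # detail row
    return dict(acc)


def toy_hash(key):
    """Deterministic stand-in for the engine's hash function (an assumption)."""
    return zlib.crc32(repr(key).encode("utf-8"))


def finalize_in_place(partials):
    """Buggy plan: no shuffle, each partition finalizes its own partial states."""
    return [(key, value) for part in partials for key, value in part.items()]


def finalize_after_repartition(partials, n):
    """Fixed plan: shuffle partial states on the full group key, then finalize."""
    shuffled = [defaultdict(int) for _ in range(n)]
    for part in partials:
        for key, value in part.items():
            shuffled[toy_hash(key) % n][key] += value
    return [(key, value) for part in shuffled for key, value in part.items()]


def grand_totals(result):
    """Collect the values of every grand-total row (grouping id 3)."""
    return sorted(value for key, value in result if key[2] == 3)


partials = [partial_aggregate(rows) for rows in input_partitions]
buggy = finalize_in_place(partials)
fixed = finalize_after_repartition(partials, n=2)

print(grand_totals(buggy))  # [12, 30]: two grand-total rows, one per partition
print(grand_totals(fixed))  # [42]: a single merged grand total
```

The per-channel and detail rows come out the same in both variants, because the input layout on `channel` still holds for them. Only the rows whose `channel` was replaced by NULL lose that guarantee, which is why the symptom is a duplicated grand total.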

Closes #19849

@github-actions github-actions bot added the optimizer (Optimizer rules) and sqllogictest (SQL Logic Tests (.slt)) labels Jan 16, 2026
@gabotechs
Contributor

gabotechs commented Jan 16, 2026

cc @gene-bordegaray, do these changes look good to you?

@gabotechs
Contributor

I can confirm that this change does fix the issue. I'll let @gene-bordegaray chime in as the one assigned to the original issue.

Unless there are any relevant comments, I'll merge this in as soon as CI passes.

@gene-bordegaray
Contributor

gene-bordegaray commented Jan 16, 2026

Yes, this looks correct, thanks @freakyzoidberg. Let me unassign myself, and comment `take` on the issue.

The only thing might be adding plans to the sqllogictests 😄

cc: @gabotechs

Comment on lines +145 to +147
04)------AggregateExec: mode=FinalPartitioned, gby=[channel@0 as channel, brand@1 as brand, __grouping_id@2 as __grouping_id], aggr=[sum(sub.total)]
05)--------RepartitionExec: partitioning=Hash([channel@0, brand@1, __grouping_id@2], 4), input_partitions=4
06)----------AggregateExec: mode=Partial, gby=[(NULL as channel, NULL as brand), (channel@0 as channel, NULL as brand), (channel@0 as channel, brand@1 as brand)], aggr=[sum(sub.total)]
Contributor

👍 I imagine before this PR the RepartitionExec would not be there, right?

Member Author

Correct
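The planner-side rule implied by this exchange can be sketched roughly as follows. This is a hypothetical Python model, not DataFusion's actual API: the function name `hash_partitioning_satisfies`, its parameters, and the representation of partitioning keys as lists of column names are all assumptions. It only illustrates the idea stated in the summary, namely that an existing hash partitioning on a subset of the required columns should not count as sufficient once the requirement includes `__grouping_id`.

```python
GROUPING_ID = "__grouping_id"


def hash_partitioning_satisfies(existing_keys, required_keys):
    """Return True if data hashed on `existing_keys` may skip a repartition
    for an operator that requires hashing on `required_keys` (toy model)."""
    existing, required = list(existing_keys), list(required_keys)

    # An exact match always satisfies the requirement.
    if existing == required:
        return True

    # Grouping-set aggregates rewrite key columns to NULL, so a partitioning
    # established on a subset of the keys no longer co-locates equal groups.
    if GROUPING_ID in required:
        return False

    # Otherwise a non-empty subset is enough: rows equal on all required
    # columns are also equal on the subset, so they share a partition.
    return bool(existing) and set(existing) <= set(required)


# Plain GROUP BY channel, brand: partitioning on channel alone is accepted.
print(hash_partitioning_satisfies(["channel"], ["channel", "brand"]))  # True

# ROLLUP: the same subset is rejected, so a repartition has to be planned.
print(hash_partitioning_satisfies(
    ["channel"], ["channel", "brand", GROUPING_ID]))  # False
```

Under this model the second call is what forces the `RepartitionExec` on `[channel, brand, __grouping_id]` seen in the plan above.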

Contributor

@gabotechs gabotechs left a comment

👍 Looks good! Thanks @freakyzoidberg and @gene-bordegaray

@gabotechs gabotechs added this pull request to the merge queue Jan 16, 2026
Merged via the queue into apache:main with commit 1ab7e41 Jan 16, 2026
32 checks passed
gabotechs pushed a commit to gabotechs/datafusion that referenced this pull request Jan 16, 2026
@alamb
Contributor

alamb commented Jan 16, 2026

Love it -- thank you

@alamb
Contributor

alamb commented Jan 16, 2026

Thank you @freakyzoidberg and @gabotechs

Do we know what change in 52 introduced this problem?

@gabotechs
Contributor

You have the full conclusion here: #19849

gabotechs added a commit that referenced this pull request Jan 16, 2026
Brings #19853 into `branch-52`

Co-authored-by: Pierre Lacave <pierre.lacave@datadoghq.com>
@alamb
Contributor

alamb commented Jan 16, 2026

Thanks @gabotechs -- I'll ask some follow up questions there

Contributor

@NGA-TRAN NGA-TRAN left a comment

Nice!

Labels

optimizer (Optimizer rules), sqllogictest (SQL Logic Tests (.slt))

Development

Successfully merging this pull request may close these issues.

Different results in TPC-DS q14 depending on the number of partitions

5 participants