Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow configurable initial capacity for IndexedTable #14620

Merged

Conversation

Jackie-Jiang
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang commented Dec 7, 2024

One of the bottleneck for combining group-by results is to grow the IndexedTable as we are adding groups. To reduce the re-hashing, we can initialize it with a larger capacity:

  • The size of IndexedTable won't go over trimThreshold
  • The size of IndexedTable should at least be able to hold the groups from the first map to be merged
  • Allow user to config a min capacity by query option or instance config

Release Notes

Min initial indexed table capacity is 128 by default, and can be configured with:

  • Query option: minInitialIndexedTableCapacity
  • Server instance config: pinot.server.query.executor.min.init.indexed.table.capacity
  • Broker instance config: pinot.broker.min.init.indexed.table.capacity

@Jackie-Jiang Jackie-Jiang added enhancement documentation Configuration Config changes (addition/deletion/change in behavior) query performance labels Dec 7, 2024
@Jackie-Jiang Jackie-Jiang requested a review from xiangfu0 December 7, 2024 02:02
@Jackie-Jiang Jackie-Jiang force-pushed the improve_indexed_table_initial_capacity branch from 9566078 to 0de93d3 Compare December 9, 2024 05:53
@Jackie-Jiang Jackie-Jiang force-pushed the improve_indexed_table_initial_capacity branch from 0de93d3 to e57a785 Compare December 9, 2024 07:06
@codecov-commenter
Copy link

codecov-commenter commented Dec 9, 2024

Codecov Report

Attention: Patch coverage is 77.94118% with 30 lines in your changes missing coverage. Please review.

Project coverage is 64.06%. Comparing base (59551e4) to head (e57a785).
Report is 1442 commits behind head on master.

Files with missing lines Patch % Lines
...pinot/core/plan/maker/InstancePlanMakerImplV2.java 54.16% 8 Missing and 3 partials ⚠️
.../java/org/apache/pinot/core/util/GroupByUtils.java 91.37% 1 Missing and 4 partials ⚠️
...va/org/apache/pinot/query/runtime/QueryRunner.java 50.00% 2 Missing and 3 partials ⚠️
...inot/core/query/reduce/StreamingReduceService.java 0.00% 4 Missing ⚠️
...org/apache/pinot/core/data/table/IndexedTable.java 80.00% 0 Missing and 1 partial ⚠️
...e/operator/blocks/results/GroupByResultsBlock.java 75.00% 0 Missing and 1 partial ⚠️
...tor/streaming/StreamingGroupByCombineOperator.java 0.00% 1 Missing ⚠️
...e/pinot/core/query/reduce/BrokerReduceService.java 75.00% 0 Missing and 1 partial ⚠️
...not/core/query/reduce/GroupByDataTableReducer.java 80.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #14620      +/-   ##
============================================
+ Coverage     61.75%   64.06%   +2.30%     
- Complexity      207     1574    +1367     
============================================
  Files          2436     2687     +251     
  Lines        133233   147827   +14594     
  Branches      20636    22662    +2026     
============================================
+ Hits          82274    94698   +12424     
- Misses        44911    46191    +1280     
- Partials       6048     6938     +890     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) ⬆️
integration 100.00% <ø> (+99.99%) ⬆️
integration1 100.00% <ø> (+99.99%) ⬆️
integration2 0.00% <ø> (ø)
java-11 64.03% <77.94%> (+2.32%) ⬆️
java-21 63.94% <77.94%> (+2.31%) ⬆️
skip-bytebuffers-false 64.05% <77.94%> (+2.30%) ⬆️
skip-bytebuffers-true 63.90% <77.94%> (+36.17%) ⬆️
temurin 64.06% <77.94%> (+2.30%) ⬆️
unittests 64.05% <77.94%> (+2.30%) ⬆️
unittests1 56.19% <77.94%> (+9.30%) ⬆️
unittests2 34.57% <4.41%> (+6.84%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@xiangfu0 xiangfu0 merged commit f9655b9 into apache:master Dec 9, 2024
20 of 21 checks passed
@Jackie-Jiang Jackie-Jiang deleted the improve_indexed_table_initial_capacity branch December 9, 2024 18:22
davecromberge pushed a commit to davecromberge/pinot that referenced this pull request Dec 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Configuration Config changes (addition/deletion/change in behavior) documentation enhancement performance query
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants