Adapt group trim threshold to group trim size rather than vice versa #13514

Merged

yashmayya
Collaborator

  • The grouping algorithm and group limit configurations are described here: https://docs.pinot.apache.org/users/user-guide-query/query-syntax/grouping-algorithm
  • Currently, if the trim size is larger than half the trim threshold, it is overridden to half the trim threshold in order to avoid excessive trimming.
  • However, this can produce inaccurate results when the trim size has intentionally been set to a high value to suit the data pattern.
  • A better approach in this case is to double the trim threshold instead; this can increase memory usage but respects the desired accuracy level.
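
To make the difference between the two behaviors concrete, here is a minimal, self-contained sketch. It is not the code in `IndexedTable.java`: the class `GroupTrimConfig` and the methods `capTrimSize` and `raiseThreshold` are hypothetical names used only for illustration. It also assumes that "double the trim threshold" means raising the threshold to twice the trim size, and the overflow guard is an added assumption.

```java
// Hypothetical illustration only; not the actual Pinot implementation.
final class GroupTrimConfig {
  final int trimSize;
  final int trimThreshold;

  private GroupTrimConfig(int trimSize, int trimThreshold) {
    this.trimSize = trimSize;
    this.trimThreshold = trimThreshold;
  }

  // Previous behavior as described above: a trim size larger than half the
  // trim threshold is overridden to half the trim threshold.
  static GroupTrimConfig capTrimSize(int trimSize, int trimThreshold) {
    int effectiveTrimSize = Math.min(trimSize, trimThreshold / 2);
    return new GroupTrimConfig(effectiveTrimSize, trimThreshold);
  }

  // Proposed behavior (assumed reading): keep the configured trim size and
  // raise the threshold to twice the trim size when it would be too small.
  static GroupTrimConfig raiseThreshold(int trimSize, int trimThreshold) {
    int effectiveThreshold = trimThreshold;
    if (trimSize > trimThreshold / 2) {
      // Assumed guard: avoid int overflow when doubling a very large trim size.
      effectiveThreshold =
          trimSize > Integer.MAX_VALUE / 2 ? Integer.MAX_VALUE : trimSize * 2;
    }
    return new GroupTrimConfig(trimSize, effectiveThreshold);
  }
}
```

With a trim size of 8000 and a trim threshold of 10000, `capTrimSize` yields a trim size of 5000 with the threshold unchanged, whereas `raiseThreshold` keeps the trim size at 8000 and raises the threshold to 16000. A trim size of 3000 with the same threshold is left untouched by both.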

@yashmayya yashmayya force-pushed the adaptive-group-trim-threshold branch from 279367e to 386aa50 Compare July 1, 2024 14:25
codecov-commenter commented Jul 1, 2024

Codecov Report

Attention: Patch coverage is 70.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 62.06%. Comparing base (59551e4) to head (386aa50).
Report is 699 commits behind head on master.

Files                                                   Patch %   Lines
...org/apache/pinot/core/data/table/IndexedTable.java   70.00%    0 Missing and 3 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #13514      +/-   ##
============================================
+ Coverage     61.75%   62.06%   +0.30%     
+ Complexity      207      198       -9     
============================================
  Files          2436     2559     +123     
  Lines        133233   141387    +8154     
  Branches      20636    21943    +1307     
============================================
+ Hits          82274    87746    +5472     
- Misses        44911    46982    +2071     
- Partials       6048     6659     +611     
Flag                     Coverage Δ
custom-integration1      <0.01% <0.00%> (-0.01%) ⬇️
integration              <0.01% <0.00%> (-0.01%) ⬇️
integration1             <0.01% <0.00%> (-0.01%) ⬇️
integration2             0.00% <0.00%> (ø)
java-11                  62.01% <70.00%> (+0.30%) ⬆️
java-21                  61.94% <70.00%> (+0.31%) ⬆️
skip-bytebuffers-false   62.04% <70.00%> (+0.30%) ⬆️
skip-bytebuffers-true    61.90% <70.00%> (+34.17%) ⬆️
temurin                  62.06% <70.00%> (+0.30%) ⬆️
unittests                62.05% <70.00%> (+0.30%) ⬆️
unittests1               46.70% <70.00%> (-0.19%) ⬇️
unittests2               27.55% <0.00%> (-0.18%) ⬇️

@yashmayya yashmayya marked this pull request as ready for review July 1, 2024 15:33
@Jackie-Jiang Jackie-Jiang merged commit 897376e into apache:master Jul 2, 2024
20 checks passed