RFC: Demonstrate new GroupHashAggregate stream approach (runs more than 2x faster!) #6800
Closed
98 commits
9b22745 POC: Demonstrate new GroupHashAggregate stream approach (alamb)
4ce6671 complete accumulator (alamb)
5694190 touchups (alamb)
a58b006 Add comments (alamb)
73cb33f Update comments and simplify code (alamb)
0b5d74f factor out accumulate (alamb)
c30874d split nullable/non nullable handling (alamb)
2370220 Refactor out accumulation in average (alamb)
26570f9 Move accumulator to their own function (alamb)
bed990e update more comments (alamb)
25787a0 Begin writing tests for accumulate (alamb)
8433d6f more tets (alamb)
7e9b92e more tests (alamb)
bb37e77 comments (alamb)
add7b36 Implement fuzz testing (alamb)
53aa18b Clarify the required order from GroupsAccumulator (alamb)
00aac24 Zero copy into array (alamb)
d760a5f fix spelling of indices (alamb)
8811fa6 implement filtering for easy path (alamb)
93a4e6f Implement filtering (alamb)
966d3d0 Add null handling in avg (alamb)
316c781 WIP count
754a9ff WIP count
e708723 Sketch out the adapter interface (alamb)
677160e More new adapter interface (alamb)
689e51b WIP sum
7b20155 WIP sum
6275a9f Use `Rows` API
8902c91 Update adapter (alamb)
6cab205 Add docs, refactor (alamb)
587dc0e Merge branch 'alamb/hash_agg_spike' of github.com:alamb/arrow-datafus… (alamb)
7683350 Merge remote-tracking branch 'origin/main' into hash_agg_spike2
52c62ec Merge
1684916 WIP count
a94c346 WIP count
1ba625a WIP count
c2f955d WIP count
9ff91cb Support sum
180903b Complete adapter (alamb)
5d8bb35 Instantiate all types (alamb)
51b0243 Implement memory accounting (alamb)
68f62d1 cleanup memory accounting (alamb)
ad6d4f3 Fix sum accumulator with filtering, consolidate null handling (alamb)
87b54c9 Add float support for sum
eb919a9 Merge remote-tracking branch 'origin/main' into hash_agg_spike2
917c050 Simplify count aggregate, clean up aggregates cleanup, fuzz almost pa… (alamb)
9eb6822 Merge branch 'alamb/hash_agg_spike' of github.com:alamb/arrow-datafus… (alamb)
c041ecc fix fmt (alamb)
f973a65 Fix clippy (alamb)
24abb14 Fix docs (alamb)
6e740a4 Min/Max for primitives
9d2c7bf Min/Max for primitives
ecc980d Min/Max initialization
fede032 Min/Max initialization
5076245 Initial min/max support for primitive
8de4ada Refactor
09b9329 Clippy
ea0ce25 Clippy
be8a1e2 Cleanup
890b517 Fmt
ffd5cbe Merge remote-tracking branch 'origin/main' into hash_agg_spike2
6846970 Speed up avg
2f4907a Fmt
7ecf148 Add clickbench queries to sqllogictest coverage (#6836) (alamb)
9adcf97 feat: implement posgres style `encode`/`decode` (#6821) (ozgrakkurt)
4aa1656 chore(deps): update rstest requirement from 0.17.0 to 0.18.0 (#6847) (dependabot[bot])
c02d4e4 [minior] support serde for some function (#6846) (liukun4515)
e044b5c Support fixed_size_list for make_array (#6759) (jayzhan211)
e8d5c17 Improve median performance. (#6837) (vincev)
cf72ea0 Mismatch in MemTable of Select Into when projecting on aggregate wind… (berkaysynnada)
aab9103 feat: column support for `array_append`, `array_prepend`, `array_posi… (izveigor)
0fb5de7 MINOR: Fix ordering of the aggregate_source_with_order table (#6852) (mustafasrepo)
5705b3a Return error when internal multiplication overflowing in decimal divi… (viirya)
e324e9f Deprecate ScalarValue::and, ScalarValue::or (#6842) (#6844) (tustvold)
dec1b97 chore(deps): update bigdecimal requirement from 0.3.0 to 0.4.0 (#6848) (dependabot[bot])
49fc6c1 Update tests, and fix memory accounting (alamb)
b326b68 Merge branch 'alamb/hash_agg_spike' of github.com:alamb/arrow-datafus… (alamb)
b137df6 fix doc comments (alamb)
1d3185c add ticket referece (alamb)
d9cca24 Only make code for average types that can be instantiated (alamb)
c68c39b Improve aggregate_fuzz output (alamb)
0127917 Fix fuzz tests by emulating retractable batch (alamb)
e36a972 Fix and simplify min/max (alamb)
a96c3a0 Merge remote-tracking branch 'apache/main' into alamb/hash_agg_spike (alamb)
b6bde8d Improve memory accounting (alamb)
cb5b8cb feat: Add graphviz display format for execution plan. (#6726) (liurenjie1024)
07f8d77 Fix (another) logical conflict (#6882) (alamb)
4dcac2a Implement groups accumulators for bit operations (alamb)
5d6f815 Almost there (alamb)
60ee2ef it compiles (alamb)
f2fc450 Reuse hashes buffer
b781910 Complete BoolAnd and BoolOr accumulators (alamb)
aebe77f Fix doc (alamb)
7c17638 Merge remote-tracking branch 'apache/main' into alamb/hash_agg_spike (alamb)
0a5a749 Merge branch 'alamb/hash_agg_spike' of github.com:alamb/arrow-datafus… (alamb)
f684ae8 clippy (alamb)
e798074 Performance: Use a specialized sum accumulator for retractable aggreg… (alamb)
afcab34 Simplify sum and make it faster (alamb)
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -70,4 +70,4 @@ lto = false | |
opt-level = 3 | ||
overflow-checks = false | ||
panic = 'unwind' | ||
rpath = false | ||
rpath = false |
❤️ this is a good change -- thanks @Dandandan. Pretty soon there will be no allocations while processing each batch (aka the hot loop) 🥳 -- I think with #6888 we can get rid of the counts in the sum accumulator.
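For readers unfamiliar with the pattern, here is a minimal sketch of what "no allocations while processing each batch" can look like when a hashes buffer is reused. It is not the code from this PR: `GroupState`, `hash_batch`, and `toy_hash` are hypothetical names invented for illustration, and plain `u64` keys stand in for real group columns.

```rust
/// Hypothetical grouping state (not a DataFusion type) that owns a scratch
/// buffer for per-row hashes and reuses it for every input batch.
pub struct GroupState {
    pub hashes_buffer: Vec<u64>,
}

impl GroupState {
    pub fn new() -> Self {
        Self { hashes_buffer: Vec::new() }
    }

    /// Hash one batch of keys into the reused buffer.
    /// `clear` keeps the existing capacity, so once the buffer has grown to
    /// the largest batch size seen, `resize` does not allocate again.
    pub fn hash_batch(&mut self, keys: &[u64]) -> &[u64] {
        self.hashes_buffer.clear();
        self.hashes_buffer.resize(keys.len(), 0);
        for (slot, key) in self.hashes_buffer.iter_mut().zip(keys) {
            *slot = toy_hash(*key);
        }
        &self.hashes_buffer
    }
}

/// Placeholder hash function, used only to keep the sketch self-contained.
pub fn toy_hash(key: u64) -> u64 {
    key.wrapping_mul(0x9E37_79B9_7F4A_7C15).rotate_left(31)
}
```

The alternative, building a fresh `Vec<u64>` inside `hash_batch` on every call, would allocate once per batch in the hot loop. Here the cost is paid only when a batch is larger than any seen before.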
Note that this change was made to the existing row_hash (not the new one). I will port the change to the new one as part of #6904.