[SPARK-53266][TESTS] Regenerate benchmark results #52005
Thank you, @HyukjinKwon.
| Compression 10000 times at level 1 with buffer pool | 580 | 581 | 0 | 0.0 | 58038.0 | 1.1X |
| Compression 10000 times at level 2 with buffer pool | 612 | 615 | 3 | 0.0 | 61246.1 | 1.1X |
| Compression 10000 times at level 3 with buffer pool | 721 | 734 | 11 | 0.0 | 72106.4 | 0.9X |
| Compression 10000 times at level 1 without buffer pool | 265 | 267 | 1 | 0.0 | 26513.9 | 1.0X |
This becomes much faster.
| Compression 4 times at level 1 with buffer pool | 2568 | 2571 | 4 | 0.0 | 642087500.8 | 1.0X |
| Compression 4 times at level 2 with buffer pool | 4211 | 4212 | 1 | 0.0 | 1052833529.0 | 0.6X |
| Compression 4 times at level 3 with buffer pool | 6290 | 6291 | 2 | 0.0 | 1572505716.0 | 0.4X |
| Compression 4 times at level 1 without buffer pool | 2764 | 2764 | 1 | 0.0 | 690899114.5 | 1.0X |
Although it's a known issue, this is still slower than Java 17.
| arrayOfAnyAsObject | 6 | 6 | 0 | 1611.8 | 0.6 | 1.0X |
| arrayOfAnyAsSeq | 174 | 175 | 1 | 57.5 | 17.4 | 0.0X |
| arrayOfInt | 393 | 395 | 1 | 25.4 | 39.3 | 0.0X |
| arrayOfIntAsObject | 419 | 419 | 1 | 23.9 | 41.9 | 0.0X |
The relationship between arrayOfInt and arrayOfIntAsObject is switched.
| Common Codecs | 4821 | 4894 | 64 | 0.2 | 4820.6 | 1.0X |
| Java | 2565 | 2572 | 10 | 0.4 | 2564.8 | 1.9X |
| Spark | 3811 | 3812 | 1 | 0.3 | 3810.7 | 1.3X |
| Spark Binary | 2758 | 2759 | 1 | 0.4 | 2757.9 | 1.7X |
The relationship between Java and Spark Binary is switched. Now, Java is faster than Spark.
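For readers comparing the last column, here is a minimal sketch of how the relative figures in the quoted rows can be reproduced from the best times. The formula (baseline best time divided by each case's best time, rounded to one decimal) is an assumption inferred from these numbers, and the helper name `relative` is hypothetical.

```python
# Best times in ms, copied from the four rows quoted above.
best_ms = {
    "Common Codecs": 4821,
    "Java": 2565,
    "Spark": 3811,
    "Spark Binary": 2758,
}


def relative(times, baseline):
    """Assumed formula: baseline best time / case best time, one decimal."""
    base = times[baseline]
    return {name: round(base / ms, 1) for name, ms in times.items()}


# "Common Codecs" is the 1.0X row, so it is treated as the baseline here.
rel = relative(best_ms, "Common Codecs")
for name, value in rel.items():
    print(f"{name}: {value}X")
```

Under that assumption the output matches the quoted column (1.0X, 1.9X, 1.3X, 1.7X), and the Java row (2565 ms) comes out ahead of Spark Binary (2758 ms).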
| Without bloom filter, blocksize: 16777216 | 790 | 794 | 5 | 126.6 | 7.9 | 1.0X |
| With bloom filter, blocksize: 16777216 | 792 | 798 | 9 | 126.3 | 7.9 | 1.0X |
| Without bloom filter, blocksize: 16777216 | 827 | 835 | 10 | 120.9 | 8.3 | 1.0X |
| With bloom filter, blocksize: 16777216 | 536 | 542 | 5 | 186.5 | 5.4 | 1.5X |
This becomes faster and is consistent with the Java 21 result.
| Without bloom filter, blocksize: 33554432 | 427 | 430 | 3 | 234.2 | 4.3 | 1.0X |
| With bloom filter, blocksize: 33554432 | 508 | 520 | 12 | 196.9 | 5.1 | 0.8X |
| Without bloom filter, blocksize: 33554432 | 507 | 520 | 10 | 197.1 | 5.1 | 1.0X |
| With bloom filter, blocksize: 33554432 | 444 | 465 | 32 | 225.3 | 4.4 | 1.1X |
This becomes faster, which is the expected result.
| UTF-16 | 52085 | 52137 | 74 | 0.2 | 5208.5 | 0.6X |
| UTF-8 | 30150 | 30156 | 9 | 0.3 | 3015.0 | 1.1X |
| UTF-32 | 56295 | 56403 | 153 | 0.2 | 5629.5 | 1.0X |
| UTF-16 | 50644 | 50653 | 13 | 0.2 | 5064.4 | 1.1X |
UTF-32 suddenly becomes the slowest of all. Previously, UTF-16 was the slowest in both Java 17 and 21.
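Observations like this one can be checked mechanically. The sketch below parses whitespace-separated result rows and picks the case with the largest best time. It assumes each row ends with six numeric columns (best, average, and standard deviation in ms, rate, per-row time in ns, relative), which is how the quoted rows read. The names `Row` and `parse_row` are hypothetical, and the two sample rows are the UTF-32 and UTF-16 rows quoted above.

```python
from typing import NamedTuple


class Row(NamedTuple):
    name: str
    best_ms: float
    avg_ms: float
    stdev_ms: float
    rate: float
    per_row_ns: float
    relative: float


def parse_row(line: str) -> Row:
    """Split one result row; assumes the last six fields are numeric."""
    # Case names may contain spaces and digits, so split from the right.
    name, *cols = line.strip().rsplit(None, 6)
    if len(cols) != 6:
        raise ValueError(f"unexpected row format: {line!r}")
    best, avg, stdev, rate, per_row, rel = cols
    return Row(name, float(best), float(avg), float(stdev),
               float(rate), float(per_row), float(rel.rstrip("X")))


# The two rows from the same hunk quoted above.
lines = [
    "UTF-32 56295 56403 153 0.2 5629.5 1.0X",
    "UTF-16 50644 50653 13 0.2 5064.4 1.1X",
]
rows = [parse_row(line) for line in lines]
slowest = max(rows, key=lambda r: r.best_ms)
print(slowest.name, slowest.best_ms)
```

Splitting from the right keeps names such as "Compression 10000 times at level 1 with buffer pool" intact, since only the trailing six fields are treated as numbers.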
| columnar deserialization + columnar-to-row | 179 | 220 | 36 | 5.6 | 178.7 | 1.0X |
| row-based deserialization | 171 | 219 | 70 | 5.9 | 170.5 | 1.0X |
| columnar deserialization + columnar-to-row | 222 | 257 | 41 | 4.5 | 222.3 | 1.0X |
| row-based deserialization | 140 | 178 | 63 | 7.2 | 139.8 | 1.6X |
Row-based deserialization becomes relatively much faster.
I'm going to merge this as a new baseline for further work on the above observations.
What changes were proposed in this pull request?
This PR aims to regenerate benchmark results.
Why are the changes needed?
We have 3 goals.
RecursiveCTEBenchmark-jdk21-results.txt
spark/sql/core/benchmarks/RecursiveCTEBenchmark-results.txt, line 6 in ebf4dd1
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Manual review.
Was this patch authored or co-authored using generative AI tooling?
No.