[SPARK-42483][TESTS] Regenerate benchmark results #40072
| Compression 10000 times at level 1 without buffer pool 605 812 220 0.0 60521.0 1.0X | ||
| Compression 10000 times at level 2 without buffer pool 665 678 20 0.0 66512.5 0.9X | ||
| Compression 10000 times at level 3 without buffer pool 890 903 20 0.0 88961.3 0.7X | ||
| Compression 10000 times at level 1 with buffer pool 829 839 11 0.0 82940.2 0.7X |
I'll take a look at this after this PR.
Java 8 and Java 17 don't have this regression.
| Use HashSet 4 4 0 226.9 4.4 1.0X | ||
| Use EnumSet 1 1 0 737.3 1.4 3.2X | ||
| Use HashSet 0 1 0 2440.2 0.4 1.0X | ||
| Use EnumSet 1 1 0 884.8 1.1 0.4X |
We need to investigate this reversed ratio.
HashSet seems to have improved in this case ("contains use empty Set"). The other cases look to be within a reasonable range.
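To make the "reversed ratio" concrete, here is a minimal sketch that recomputes the relative speed from the per-row times quoted in the rows above (4.4 vs 1.4, and 0.4 vs 1.1). It assumes the Relative column is the baseline time divided by the candidate time, which is consistent with the rows shown; the function name and labels are hypothetical. Because the inputs are already rounded, the results only approximate the printed 3.2X and 0.4X.

```python
def relative(baseline_ns: float, candidate_ns: float) -> float:
    """Speed of the candidate relative to the baseline.

    Assumption: Relative = baseline time / candidate time,
    so a value above 1 means the candidate is faster.
    """
    return baseline_ns / candidate_ns


# (label, HashSet per-row ns, EnumSet per-row ns), copied from the quoted rows.
pairs = [
    ("first pair", 4.4, 1.4),
    ("second pair", 0.4, 1.1),
]

for label, hashset_ns, enumset_ns in pairs:
    ratio = relative(hashset_ns, enumset_ns)
    verdict = "EnumSet faster" if ratio > 1 else "HashSet faster"
    print(f"{label}: {ratio:.2f}X ({verdict})")
```

In the first pair EnumSet wins; in the second the ratio drops below 1 because the HashSet time fell sharply while the EnumSet time barely moved.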
| Use HashSet 5 5 0 209.4 4.8 1.0X | ||
| Use EnumSet 2 2 0 459.8 2.2 2.2X | ||
| Use HashSet 1 1 1 1972.0 0.5 1.0X | ||
| Use EnumSet 2 2 0 444.0 2.3 0.2X |
ditto.
| interpreted version 4933 4935 2 108.8 9.2 1.0X | ||
| codegen version 5135 5141 9 104.6 9.6 1.0X | ||
| codegen version 64-bit 5071 5079 10 105.9 9.4 1.0X | ||
| codegen HiveHash version 4326 4326 0 124.1 8.1 1.1X |
Now, this is the fastest.
| To non-nullable StructTypes using performant method 5520 5639 168 0.0 Infinity 1.0X | ||
| To nullable StructTypes using performant method 2657 2708 72 0.0 Infinity 2.1X | ||
| To non-nullable StructTypes using performant method 3126 3150 34 0.0 Infinity 1.0X | ||
| To nullable StructTypes using performant method 3136 4768 2309 0.0 Infinity 1.0X |
This looks like a regression in Java 8. We need to take a look at this later.
| Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz | ||
| TPCDS Snappy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative | ||
| ------------------------------------------------------------------------------------------------------------------------ | ||
| q3 718 759 41 4.1 241.8 1.0X | ||
| q3 996 1035 55 3.0 335.3 1.0X |
Maybe, slower?
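As a rough way to size the difference between the two q3 rows above, the sketch below computes the percentage change in best and average time. It assumes the first row (718 / 759 ms) is the earlier result and the second (996 / 1035 ms) is the regenerated one; the diff markers are not visible here, so that ordering is a guess, and the helper name is hypothetical.

```python
def pct_change(before_ms: float, after_ms: float) -> float:
    """Percentage change from before to after; positive means slower."""
    return (after_ms - before_ms) / before_ms * 100.0


# Values copied from the two q3 rows; the before/after ordering is assumed.
best_before, best_after = 718, 996
avg_before, avg_after = 759, 1035

print(f"best time: {pct_change(best_before, best_after):+.1f}%")
print(f"avg time:  {pct_change(avg_before, avg_after):+.1f}%")
```

Under that assumption both figures come out well above the 41 to 55 ms standard deviations reported in the rows, so run-to-run noise alone would not obviously explain the gap.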
| radix sort one byte 197 197 0 127.0 7.9 61.5X | ||
| radix sort two bytes 371 372 0 67.4 14.8 32.6X | ||
| radix sort eight bytes 1391 1397 8 18.0 55.7 8.7X | ||
| radix sort key prefix array 1914 1951 52 13.1 76.6 6.3X |
In this benchmark, all Java 17 results are faster than Java 8.
| SQL ORC MR 1654 1661 9 9.5 105.2 6.3X | ||
| OpenJDK 64-Bit Server VM 1.8.0_352-b08 on Linux 5.15.0-1023-azure | ||
| SQL CSV 13143 13363 311 1.2 835.6 1.0X |
CSV seems to become 30% slower.
Hmm, it's significant.
When you have some time, could you review this, @viirya? I want to merge this so that further investigation can proceed.
Thank you so much, as always, for your help, @viirya!
What changes were proposed in this pull request?
This aims to regenerate benchmark results on the master branch as a baseline for Spark 3.5.0 and as a way to compare against the Apache Spark 3.4.0 branch.
Why are the changes needed?
These are reference values with minor changes.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Manual review.