
Conversation

@davies
Contributor

davies commented Oct 30, 2015

After aggregation, the dataset could be smaller than the inputs, so it's better to do hash-based aggregation on all the inputs first, then use sort-based aggregation to merge the partial results.
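
For illustration, a minimal sketch of this hash-then-sort strategy in plain Python (not the actual Spark implementation; merge, max_map_size, and the spill handling here are illustrative assumptions):

import heapq

def aggregate(records, merge, max_map_size):
    # Phase 1: hash-based aggregation; spill a sorted run whenever the map is full.
    spills = []  # each spill is a list of (key, value) pairs sorted by key
    table = {}
    for key, value in records:
        if key in table:
            table[key] = merge(table[key], value)
        elif len(table) < max_map_size:
            table[key] = value
        else:
            spills.append(sorted(table.items()))
            table = {key: value}
    spills.append(sorted(table.items()))
    # Phase 2: sort-based aggregation; merging the sorted runs makes equal
    # keys adjacent, so each group is combined with O(1) state.
    result, cur_key, cur_val = [], object(), None
    for key, value in heapq.merge(*spills, key=lambda kv: kv[0]):
        if key == cur_key:
            cur_val = merge(cur_val, value)
        else:
            if cur_val is not None:
                result.append((cur_key, cur_val))
            cur_key, cur_val = key, value
    if cur_val is not None:
        result.append((cur_key, cur_val))
    return result

For example, aggregate([("a", 1), ("b", 2), ("a", 3)], lambda x, y: x + y, max_map_size=2) returns [("a", 4), ("b", 2)] without ever holding more than max_map_size groups in memory.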

@SparkQA

SparkQA commented Oct 30, 2015

Test build #44698 has finished for PR 9383 at commit 5707f5b.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 30, 2015

Test build #44700 has finished for PR 9383 at commit 764f540.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 30, 2015

Test build #44708 has finished for PR 9383 at commit 55c47ed.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 30, 2015

Test build #44710 has finished for PR 9383 at commit 752c8e7.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 31, 2015

Test build #44714 has finished for PR 9383 at commit 0512f1e.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

After calling getCompactArray, the content of longArray is modified. Can this BytesToBytesMap still be used normally afterwards? The position in longArray for a key should be determined by (keyBase, keyOffset, keyLength); if the positions are modified, can methods such as safeLookup still work?

Contributor Author

No, after this the map is broken and should be freed later.
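
To see why, here is a toy open-addressed table standing in for longArray (illustrative Python, not the actual BytesToBytesMap code):

table = [None] * 8

def insert(key):
    i = hash(key) % len(table)
    while table[i] is not None and table[i] != key:
        i = (i + 1) % len(table)  # linear probing
    table[i] = key

def lookup(key):  # plays the role of safeLookup
    i = hash(key) % len(table)
    while table[i] is not None:
        if table[i] == key:
            return i
        i = (i + 1) % len(table)
    return None

insert("a")
insert("b")
# A destructive "compact" moves the live entries to the front of the array:
live = sorted(k for k in table if k is not None)
table[:] = live + [None] * (len(table) - len(live))
# lookup("a") may now return None: entries no longer sit at the slots their
# hashes point to, so the map is broken and can only be freed.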

Member

Should we add a comment for that?

@viirya
Member

viirya commented Oct 31, 2015

Besides, as we discussed in #9067, should we add a configuration for turning this feature on and off? It may not always be a performance win.

Member

Don't we need to catch OutOfMemoryError anymore?

@davies
Contributor Author

davies commented Oct 31, 2015

Currently the old one is broken, so I'd like to remove it. The new one should be as fast as the old one in the worst case, so I don't think we need a configuration for this.

@SparkQA

SparkQA commented Oct 31, 2015

Test build #44728 has finished for PR 9383 at commit 28f84e1.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 31, 2015

Test build #44730 has finished for PR 9383 at commit 6fde4d5.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@davies
Contributor Author

davies commented Nov 2, 2015

After some benchmarking, I realized that using the hash code as the sort prefix in TimSort causes a regression in both TimSort and Snappy compression (especially for aggregation after a join, where the order of records becomes random). I will revert that part.
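
A quick way to see the compression effect, using the standard-library zlib as a stand-in for Snappy/LZ4 (illustrative, not the Spark spill path):

import random, zlib

rows = [b"%016d" % (i % 1000) for i in range(100000)]  # ~100 rows per key
ordered = b"".join(sorted(rows))   # key order: equal rows sit next to each other
random.shuffle(rows)
shuffled = b"".join(rows)          # hash/random order
print(len(zlib.compress(ordered)), len(zlib.compress(shuffled)))
# the ordered layout typically compresses to a small fraction of the shuffled one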

benchmark code (run in the pyspark shell, where sqlContext is predefined):

sqlContext.setConf("spark.sql.shuffle.partitions", "1")
N = 1 << 25  # number of input rows
M = 1 << 20
df = sqlContext.range(N).selectExpr("id", "repeat(id, 2) as s")
df.show()
df2 = df.select(df.id.alias('id2'), df.s.alias('s2'))
j = df.join(df2, df.id == df2.id2).groupBy(df.s).max("id", "id2")
n = j.count()

Another interesting finding: Snappy slows down spilling by 50% of end-to-end time, and LZ4 is faster than Snappy but still 10% slower than no compression. Should we use false as the default value for spark.shuffle.spill.compress? (PS: tested on a Mac with an SSD; this may not hold on a spinning disk.)
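
For anyone who wants to reproduce the comparison: spill compression is a core Spark conf, so it has to be set before the SparkContext is created. A sketch:

from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .set("spark.shuffle.spill.compress", "false")  # default is "true"
        .set("spark.io.compression.codec", "lz4"))     # or "snappy" (the default)
sc = SparkContext(conf=conf)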

@SparkQA

SparkQA commented Nov 2, 2015

Test build #44820 has finished for PR 9383 at commit 53dbdf2.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Nov 2, 2015

Test build #44823 has finished for PR 9383 at commit 2e341f5.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Nov 2, 2015

Test build #44830 has finished for PR 9383 at commit 3864095.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Nov 3, 2015

Test build #1970 has finished for PR 9383 at commit df44fc6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@davies
Contributor Author

davies commented Nov 3, 2015

ping @yhuai @JoshRosen

@SparkQA

SparkQA commented Nov 3, 2015

Test build #44834 has finished for PR 9383 at commit df44fc6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor

Currently the old one is broken, so I'd like to remove it.

@davies, are you referring to the old Aggregate1 interface or the old implementation of sort fallback here?

@JoshRosen
Contributor

@davies, the block comment at the top of TungstenAggregationIterator is now out-of-date; do you mind updating it to reflect the new behavior?

Contributor

Ordinarily, this would end up deleting the spill files, but it doesn't because of the spillWriters.clear() call above. If you end up updating this patch, mind adding a one-line comment to explain this (since it's a subtle point)?

@SparkQA

SparkQA commented Nov 4, 2015

Test build #44972 has finished for PR 9383 at commit 6f3bb15.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor Author

@JoshRosen Do you remember why we need to clear this? Once it's cleared, how do we delete the spilled files?

Contributor

ping @JoshRosen

Contributor Author

Chatted with @JoshRosen offline; we should not clear spillWriters here.

Contributor

Just a note: we had a quick discussion, and it seems we should not call spillWriters.clear(); otherwise those spilled files will never be deleted.
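
The gist, as a toy sketch (plain Python, not the actual Java code; SpillWriter and the method names here are illustrative):

import os, tempfile

class SpillWriter:
    def __init__(self):
        fd, self.path = tempfile.mkstemp(prefix="spill-")
        os.close(fd)
    def delete(self):
        if os.path.exists(self.path):
            os.remove(self.path)

spill_writers = []

def spill():
    writer = SpillWriter()
    spill_writers.append(writer)  # must stay registered until cleanup
    return writer

def cleanup():
    for writer in spill_writers:  # if the list were cleared earlier,
        writer.delete()           # these files would be orphaned on disk
    spill_writers.clear()         # clearing after deletion is fine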

@davies
Contributor Author

davies commented Nov 4, 2015

@JoshRosen @yhuai I pushed a refactoring of this (reducing the chance of a full GC by reusing the array and map); please take another look.

@SparkQA

SparkQA commented Nov 4, 2015

Test build #44997 has finished for PR 9383 at commit fc5e052.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Nov 4, 2015

Test build #45001 has finished for PR 9383 at commit 1c0c6c3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Nov 4, 2015

Test build #1978 has finished for PR 9383 at commit 1c0c6c3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

When we have numElements <= growthThreshold && !canGrowArray, is it guaranteed that our page still has space to put this key?

Contributor Author

No, we check the space in the page later.
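
A toy model of the insert path being discussed (illustrative names, not the BytesToBytesMap code), showing that the array-growth check and the page-space check are independent:

class ToyMap:
    def __init__(self, array_capacity, page_capacity):
        self.entries = {}                    # stands in for longArray + pages
        self.growth_threshold = array_capacity // 2
        self.can_grow_array = True           # set False when memory can't be acquired
        self.page_room = page_capacity       # bytes left in the current data page

    def try_insert(self, key, value, size):
        # Check 1: can the hash array hold another entry? Says nothing about pages.
        if len(self.entries) > self.growth_threshold and not self.can_grow_array:
            return False                     # caller falls back to sort-based merge
        # Check 2 (done later, independently): does the data page have room?
        if size > self.page_room:
            return False
        self.page_room -= size
        self.entries[key] = value
        return True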

@yhuai
Contributor

yhuai commented Nov 4, 2015

test this please

@SparkQA

SparkQA commented Nov 4, 2015

Test build #45062 has finished for PR 9383 at commit 10d7169.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yhuai
Contributor

yhuai commented Nov 4, 2015

LGTM pending Jenkins.

@SparkQA

SparkQA commented Nov 5, 2015

Test build #45078 has finished for PR 9383 at commit b1f8a99.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@davies
Contributor Author

davies commented Nov 5, 2015

Merging into master, thanks!

@asfgit closed this in 81498dd on Nov 5, 2015
