[WIP] Measurement for SPARK-16929. #17112

jinxing64 · 2017-03-01T03:35:30Z

What changes were proposed in this pull request?

This PR is not intended for merging. It is a measurement for #16867, which stores successful task IDs in a TreeSet (successfulTaskIdsSet), so that getting the median duration in checkSpeculatableTasks takes O(n/2) time.
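
For readers comparing the two approaches, here is a minimal Scala sketch of both ways to obtain a median duration. It is not the code from #16867: the names `MedianSketch`, `medianBySort` and `SuccessfulTasks` are hypothetical, the median is assumed to be the element at index n/2, and each duration is paired with a task ID only so that equal durations are not collapsed by the set.

```scala
import java.util.Arrays
import scala.collection.mutable

object MedianSketch {
  // Sort-based approach: copy all durations, sort them, index the middle.
  // Costs O(n log n) on every call. Requires a non-empty input.
  def medianBySort(durations: Seq[Long]): Long = {
    val sorted = durations.toArray
    Arrays.sort(sorted)
    sorted(sorted.length / 2)
  }

  // Ordered-set approach (hypothetical helper): keep (duration, taskId)
  // pairs sorted as tasks finish, then walk about n/2 entries to the middle.
  final class SuccessfulTasks {
    private val byDuration = mutable.TreeSet.empty[(Long, Long)]

    def add(taskId: Long, duration: Long): Unit =
      byDuration += ((duration, taskId))

    // Requires at least one recorded task.
    def median: Long =
      byDuration.iterator.drop(byDuration.size / 2).next()._1
  }
}
```

The sort-based version pays the full sorting cost on each check, while the ordered set pays O(log n) per finished task and only a half traversal per check.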

SparkQA · 2017-03-01T03:39:05Z

Test build #73654 has finished for PR 17112 at commit 6825bd7.

This patch fails Scala style tests.
This patch merges cleanly.
This patch adds no public classes.

jinxing64 · 2017-03-01T04:02:48Z

The added unit test "Measurement for SPARK-16929." is the measurement.
In TaskSetManagerSuite.scala line 1049, if newAlgorithm=true, successfulTaskIdsSet is used to get the median duration. If newAlgorithm=false, the old algorithm (Arrays.sort) is used.

I measured the time spent getting the median duration at TaskSetManager.scala line 957.
With tasksNum=1000 (TaskSetManagerSuite.scala line 1043), I ran the test multiple times. The results are below:

| newAlgorithm | time cost |
| --- | --- |
| false | 5ms, 3ms, 4ms, 3ms, 3ms |
| true | 2ms, 4ms, 2ms, 2ms, 3ms |

With tasksNum=100000:

| newAlgorithm | time cost |
| --- | --- |
| false | 107ms, 109ms, 103ms, 100ms, 107ms |
| true | 17ms, 14ms, 14ms, 13ms, 14ms |

With tasksNum=150000:

| newAlgorithm | time cost |
| --- | --- |
| false | 133ms, 146ms, 127ms, 163ms, 114ms |
| true | 14ms, 13ms, 15ms, 16ms, 14ms |

As we can see, the new algorithm (TreeSet) performs better than the old algorithm (Arrays.sort). When tasksNum=100000, Arrays.sort takes 100ms or more every time, while the new algorithm always stays below 20ms.
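
For anyone who wants to take a similar timing outside the test suite, a rough Scala sketch follows. It is an assumption-laden stand-in, not the measurement code in this PR: `TimingSketch` and `timeMs` are hypothetical names, the durations are random values from a fixed seed, and only the Arrays.sort path is timed. Single wall-clock runs like this are sensitive to JIT warm-up and GC, so the numbers are indicative only.

```scala
import java.util.Arrays
import scala.util.Random

object TimingSketch {
  // Runs `body` once and returns its result together with the elapsed
  // wall-clock time in milliseconds.
  def timeMs[A](body: => A): (A, Double) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1e6)
  }

  def main(args: Array[String]): Unit = {
    val tasksNum = 100000            // same scale as one of the runs above
    val rng = new Random(42)         // fixed seed, synthetic durations
    val durations = Array.fill(tasksNum)(rng.nextInt(1000000).toLong)

    for (run <- 1 to 5) {
      val (median, ms) = timeMs {
        val sorted = durations.clone() // sort a copy so each run starts unsorted
        Arrays.sort(sorted)
        sorted(sorted.length / 2)
      }
      println(f"run $run: median=$median, $ms%.2f ms")
    }
  }
}
```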

srowen · 2017-03-01T13:54:38Z

Put [WIP] in the title for clarity

SparkQA · 2017-03-04T03:04:09Z

Test build #73889 has finished for PR 17112 at commit 61b96ff.

This patch fails Scala style tests.
This patch merges cleanly.
This patch adds no public classes.

* SPARK-16929: (178 commits)
  * mod
  * Refine test.
  * scheduleAtFixedRate -> scheduleWithFixedDelay
  * Change back to scheduleAtFixedRate
  * Change some comment and unit tests.
  * scheduleAtFixedRate -> scheduleWithFixedDelay
  * Get rid of 'remove' and fix doc in MedianHeap
  * [SPARK-16929] Improve performance when check speculatable tasks.
  * [SPARK-19891][SS] Await Batch Lock notified on stream execution exit
  * [SPARK-19008][SQL] Improve performance of Dataset.map by eliminating boxing/unboxing
  * [SPARK-19886] Fix reportDataLoss if statement in SS KafkaSource
  * [SPARK-19611][SQL] Introduce configurable table schema inference
  * [SPARK-12334][SQL][PYSPARK] Support read from multiple input paths for orc file in DataFrameReader.orc
  * [SPARK-19861][SS] watermark should not be a negative time.
  * [SPARK-19715][STRUCTURED STREAMING] Option to Strip Paths in FileSource
  * [SPARK-19793] Use clock.getTimeMillis when mark task as finished in TaskSetManager.
  * [SPARK-19757][CORE] DriverEndpoint#makeOffers race against CoarseGrainedSchedulerBackend#killExecutors
  * [SPARK-19561][SQL] add int case handling for TimestampType
  * [SPARK-19763][SQL] qualified external datasource table location stored in catalog
  * [SPARK-19859][SS][FOLLOW-UP] The new watermark should override the old one.
  * ...

SparkQA · 2017-03-18T04:16:39Z

Test build #74765 has finished for PR 17112 at commit cfc7e33.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

jinxing64 mentioned this pull request Mar 1, 2017

[SPARK-16929] Improve performance when check speculatable tasks. #16867

Closed

jinxing64 changed the title ~~Measurement for SPARK-16929.~~ [WIP] Measurement for SPARK-16929. Mar 2, 2017

jinxing64 mentioned this pull request Mar 2, 2017

[SPARK-19793] Use clock.getTimeMillis when mark task as finished in TaskSetManager. #17133

Closed

jinxing added 11 commits March 11, 2017 09:10

[SPARK-16929] Improve performance when check speculatable tasks.

09719a2

Get rid of 'remove' and fix doc in MedianHeap

318a172

scheduleAtFixedRate -> scheduleWithFixedDelay

5aa2fcf

Change some comment and unit tests.

1728895

Change back to scheduleAtFixedRate

2518a95

scheduleAtFixedRate -> scheduleWithFixedDelay

7740d77

Refine test.

c13a198

mod

617d5aa

Measurement for SPARK-16929.

9d627c4

update

cfc7e33

jinxing64 force-pushed the SPARK-16929-measurement branch from 61b96ff to cfc7e33 Compare March 18, 2017 02:08

jinxing64 closed this Apr 4, 2017
