Skip to content

Conversation

@squito
Copy link
Contributor

@squito squito commented Jun 15, 2016

What changes were proposed in this pull request?

SPARK-15927 exacerbated a race in BasicSchedulerIntegrationTest, so it went from very unlikely to fairly frequent. The issue is that stage numbering is not completely deterministic, but these tests treated it like it was. So turn off the tests.

How was this patch tested?

on my laptop the test failed abotu 10% of the time before this change, and didn't fail in 500 runs after the change.

case _ =>
// we can't check for the output for the two intermediate stages, unfortunately,
// b/c the stage numbering is non-deterministic, so stage number alone doesn't tell
// us what to check
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to do an OR here -- check that one of the intermediate outputs is available? (fine if you want to do this later and merge the fix for now)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this, but decided it was kind of silly since you could be checking for something irrelevant, and its not a trivial change (you'd need to catch an exception from doing the wrong check). Anyway just figured I'd leave it out for the hotfix.

@kayousterhout
Copy link
Contributor

LGTM

Would be great if we could (later) update these tests so they still check the same things, but in a way that doesn't rely on the stage ID being deterministic.

@markhamstra
Copy link
Contributor

As a HOTIFX, LGTM; but I agree that there is room for a follow-up.

@squito
Copy link
Contributor Author

squito commented Jun 15, 2016

core tests passed so I merged to master (not 2.0). Will keep an eye on it as well.

@SparkQA
Copy link

SparkQA commented Jun 15, 2016

Test build #60593 has finished for PR 13688 at commit 3ae2a5f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito
Copy link
Contributor Author

squito commented Jun 15, 2016

also opened https://issues.apache.org/jira/browse/SPARK-15976 for related discussion.

@squito squito deleted the hotfix_basic_scheduler branch June 27, 2016 21:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants