
@YanTangZhai
Contributor

Update global variables of HttpBroadcast so that multiple SparkContexts can coexist.
SparkContext1 creates a broadcastManager and initializes the HttpBroadcast object, which in turn creates the HTTP server, broadcastDir, and so on. However, when SparkContext2 in the same process creates its broadcastManager, it does not initialize HttpBroadcast again, because the object is already marked as initialized. SparkContext1 and SparkContext2 therefore share the same HttpBroadcast object. When SparkContext1 stops HttpBroadcast, it is also stopped for SparkContext2. Likewise, when HttpBroadcast cleans up files for SparkContext1, some files owned by SparkContext2 may be removed, because both contexts use the same object.
The latest Spark version still has this problem.
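
The sharing problem described above can be reproduced with a small toy model. This is only a sketch: `SharedBroadcast` and `ToyContext` are hypothetical stand-ins and not Spark's actual classes, and the "server" and "files" are plain in-memory fields. It assumes only what the description states, namely a process-wide object guarded by an `initialized` flag.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a process-wide broadcast singleton (not Spark code).
object SharedBroadcast {
  private var initialized = false
  private var serverRunning = false
  private val files = mutable.Set.empty[String]

  // Only the first caller performs setup; later callers return immediately.
  def initialize(): Unit = synchronized {
    if (!initialized) {
      serverRunning = true
      initialized = true
    }
  }

  def addFile(name: String): Unit = synchronized { files += name }

  def fileCount: Int = synchronized(files.size)

  def isRunning: Boolean = synchronized(serverRunning)

  // Stops the one shared server and removes every file, whoever created it.
  def stop(): Unit = synchronized {
    if (initialized) {
      files.clear()
      serverRunning = false
      initialized = false
    }
  }
}

// Hypothetical stand-in for a context that relies on the shared singleton.
class ToyContext(val name: String) {
  SharedBroadcast.initialize()

  def broadcast(id: String): Unit = SharedBroadcast.addFile(s"$name-$id")

  def stop(): Unit = SharedBroadcast.stop()
}
```

Creating two `ToyContext` instances, broadcasting from the second, and then stopping the first leaves the second without a running server or its file, which mirrors the behaviour reported here.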

@SparkQA

SparkQA commented Aug 20, 2014

QA tests have started for PR 2059 at commit 97b3407.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Aug 20, 2014

QA tests have finished for PR 2059 at commit 97b3407.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 20, 2014

QA tests have started for PR 2059 at commit a5987a5.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Aug 20, 2014

QA tests have finished for PR 2059 at commit a5987a5.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor

Could you provide some more background in order to help us review this PR? Did older Spark versions support multiple SparkContexts? Is this fixing a regression from an earlier release?

Please add this information to the pull request description so that it's incorporated in the final commit message if we merge this. Thanks!

@YanTangZhai
Contributor Author

Hi @JoshRosen, SparkContext1 creates a broadcastManager and initializes the HttpBroadcast object, which in turn creates the HTTP server, broadcastDir, and so on. However, when SparkContext2 in the same process creates its broadcastManager, it does not initialize HttpBroadcast again, because the object is already marked as initialized. SparkContext1 and SparkContext2 therefore share the same HttpBroadcast object. When SparkContext1 stops HttpBroadcast, it is also stopped for SparkContext2. Likewise, when HttpBroadcast cleans up files for SparkContext1, some files owned by SparkContext2 may be removed, because both contexts use the same object.
The latest Spark version still has this problem.
Please review again. Thanks.
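
For contrast, one possible direction is to keep the broadcast state per context instead of in a global object. The sketch below is an assumption for illustration only, not necessarily what this patch does; `PerContextBroadcast` is a hypothetical class, and its "server" and "files" are in-memory fields rather than a real HTTP server or directory.

```scala
import scala.collection.mutable

// Hypothetical per-context broadcast state (not Spark's actual class).
// Each instance owns its own running flag and file set.
class PerContextBroadcast(val owner: String) {
  private var running = true
  private val files = mutable.Set.empty[String]

  def addFile(id: String): Unit = synchronized { files += s"$owner-$id" }

  def fileCount: Int = synchronized(files.size)

  def isRunning: Boolean = synchronized(running)

  // Stopping affects only this instance's state.
  def stop(): Unit = synchronized {
    files.clear()
    running = false
  }
}
```

With this shape, stopping the instance that belongs to one context leaves the other context's instance running and its files in place.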

@SparkQA

SparkQA commented Sep 5, 2014

QA tests have started for PR 2059 at commit a5987a5.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 5, 2014

QA tests have finished for PR 2059 at commit a5987a5.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor

I think that we should close this issue for now, since there are other blockers to multiple SparkContexts in the same JVM (namely, the global SparkEnv). We'll consider this fix when addressing the larger "multiple SparkContexts issue", though.

@asfgit asfgit closed this in 534f24b Dec 27, 2014
