-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-15271] [MESOS] Allow force pulling executor docker images #13051
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Please note that I had to upgrade the Mesos library version to implement this feature! Unfortunately its only available in Mesos 0.22 and above. |
docs/running-on-mesos.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docker image support, generally, will still work in 0.20.1. It's just the forcePullImage feature that will require 0.22.2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, rewrote that part.
de28d19 to
5b072ba
Compare
5b072ba to
1b295e8
Compare
|
I rebased again, against master. Anything else I can do here to get this merged? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't really spell out the parameter name in other places in this call. I recommend just putting the value here.
|
Other than the style nit, I think it LGTM. Once you fixed it we need to ask a Spark committer to review it. |
|
ok to test |
|
Test build #59858 has finished for PR 13051 at commit
|
1b295e8 to
80d1e3e
Compare
|
@tnachen I addressed your comment and also fixed the scala style error from the build. |
|
Test build #59864 has finished for PR 13051 at commit
|
|
@andrewor14 PTAL, this PR LGTM |
80d1e3e to
7768589
Compare
|
rebase to master |
|
Test build #61719 has finished for PR 13051 at commit
|
|
Can you add this to the CoarseMesosSchedulerBackend as well? This solution adds the force pull feature to drivers, but not executors for spark jobs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the redundancy in having both methods. Can we simplify here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
7768589 to
c2d4b7f
Compare
|
Test build #62317 has finished for PR 13051 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a gratuitous conversion. I'd rather retain the semantics of SparkConf, and just pass that in, unless there's a really good reason not to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I provided a better approach. Note that this method is also called from MesosClusterScheduler which maintains the driver settings as a raw Map[String, String] so passing the SparkConf here is not an option imho.
Mesos agents by default will not pull docker images which are cached locally already. In order to run Spark executors from mutable tags like `:latest` this commit introduces a Spark setting (`spark.mesos.executor.docker.forcePullImage`). Setting this flag to true will tell the Mesos agent to force pull the docker image (default is `false` which is consistent with the previous implementation and Mesos' default behaviour).
c2d4b7f to
36b3258
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Working on this, I figured that the network parameter is used nowhere. Just wanted to point this out, not sure if there is WIP for adding this feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I'm not sure what happened here. Thanks for pointing it out.
|
Test build #62465 has finished for PR 13051 at commit
|
|
@andrewor14 This LGTM. Can you take a look and merge? |
|
@srowen Or if you could help :) |
|
Merged to master |
|
This patch just broke the build: Going to revert now. |
|
Shoot, yeah PR builder tests passed, but that was 7 days ago. Sorry @philipphoffmann can you try this one again after figuring out whatever caused this? |
What changes were proposed in this pull request?
Mesos agents by default will not pull docker images which are cached
locally already. In order to run Spark executors from mutable tags like
:latestthis commit introduces a Spark settingspark.mesos.executor.docker.forcePullImage. Setting this flag totrue will tell the Mesos agent to force pull the docker image (default is
falsewhich is consistent with the previousimplementation and Mesos' default
behaviour).
How was this patch tested?
I ran a sample application including this change on a Mesos cluster and verified the correct behaviour for both, with and without, force pulling the executor image. As expected the image is being force pulled if the flag is set.