[SPARK-15994] [MESOS] Allow enabling Mesos fetch cache in coarse executor backend #13713
Mesos 0.23.0 introduces a Fetch Cache feature (http://mesos.apache.org/documentation/latest/fetcher/) which allows caching of resources specified in command URIs.

This patch:
* Updates the Mesos shaded protobuf dependency to 0.23.0
* Allows setting `spark.mesos.fetchCache.enable` to enable the fetch cache for all specified URIs. (URIs must be specified for the setting to have any effect.)
* Updates documentation for Mesos configuration with the new setting.

This patch does NOT:
* Allow for per-URI caching configuration. The cache setting is global to ALL URIs for the command.
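As a rough illustration of the setting described above, the flag can be set on a `SparkConf` like any other boolean option. This is only a sketch: it uses the key name as written in this description (`spark.mesos.fetchCache.enable`; review comments in this thread request a rename to `fetcherCache`), the object name `FetchCacheConfSketch` is hypothetical, and the URIs are made-up placeholders.

```scala
import org.apache.spark.SparkConf

object FetchCacheConfSketch {
  // Key as written in the PR description (assumption: check the merged docs for the final name).
  val FetchCacheKey = "spark.mesos.fetchCache.enable"

  // loadDefaults = false keeps the sketch independent of JVM system properties.
  def buildConf(): SparkConf =
    new SparkConf(false)
      // Placeholder URIs; the flag has no effect unless some URIs are specified.
      .set("spark.mesos.uris", "http://example.com/a.tgz,http://example.com/b.conf")
      .set(FetchCacheKey, "true")
}
```

Because the flag is global, every URI attached to the command is treated the same way; there is no per-URI switch in this patch.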
|
ok to test |
|
Test build #60998 has finished for PR 13713 at commit
|
|
@tnachen any comments? |
|
Test build #62931 has finished for PR 13713 at commit
|
|
We also fetch URIs for running drivers in cluster mode (MesosClusterScheduler.scala). I think this configuration should apply there too.
|
Test build #63032 has finished for PR 13713 at commit
|
|
OoooOOO master updated to 1.0.0 Fixed merge conflicts |
|
@tnachen this test is a little larger than I originally anticipated. Let me see if I can add some unit tests |
|
Test build #63046 has finished for PR 13713 at commit
|
|
@drcrallen Are you still planning to update this? It's quite a useful feature, so I'm hoping this can get in. Also, since fine-grained mode is deprecated, I don't think we need to update it too.
|
I can update this today. |
|
I'm still adding a basic test for this. |
|
Test build #66688 has finished for PR 13713 at commit
|
|
Tested with |
|
@tnachen let me know if there are any outstanding issues here.
    assert(launchedTasks.head.getCommand.getUrisList.asScala(0).getValue == url)
  }

  test("mesos supports setting fetcher") {
s/supports setting fetcher/supports setting fetcher cache/g
fixed
docs/running-on-mesos.md
Outdated
<td><code>spark.mesos.fetchCache.enable</code></td>
<td><code>false</code></td>
<td>
If set to `true`, all URIs in `spark.mesos.uris` will be eligible for caching by the [Mesos fetch cache](http://mesos.apache.org/documentation/latest/fetcher/)
From the implementation you actually set all downloadable URIs (like spark.executor.uri, jarUrl, etc) to be fetcher cachable. I think we need to be more explicit here that it's more than just spark.mesos.uris
Added more info.
|
Test build #66690 has finished for PR 13713 at commit
|
|
Test build #66703 has finished for PR 13713 at commit
|
docs/running-on-mesos.md
Outdated
<td><code>spark.mesos.fetchCache.enable</code></td>
<td><code>false</code></td>
<td>
If set to `true`, all URIs (example: `spark.executor.uri`, `spark.mesos.uris`) will be eligible for caching by the [Mesos fetch cache](http://mesos.apache.org/documentation/latest/fetcher/)
s/fetch/fetcher
s/eligible for caching/cached (they're not just eligible, they are in fact cached)
fixed both
docs/running-on-mesos.md
Outdated
|
|
||
|
|
||
<tr>
<td><code>spark.mesos.fetchCache.enable</code></td>
s/fetchCache/fetcherCache (et al)
fixed
  private val queuedCapacity = conf.getInt("spark.mesos.maxDrivers", 200)
  private val retainedDrivers = conf.getInt("spark.mesos.retainedDrivers", 200)
  private val maxRetryWaitTime = conf.getInt("spark.mesos.cluster.retry.wait.max", 60) // 1 minute
  private val useFetchCache = conf.getBoolean("spark.mesos.fetchCache.enable", false)
We really need to factor out all these conf vars like YARN does at some point
Good idea but outside scope of this PR
    ), false)
    val offers = List(Resources(backend.executorMemory(sc), 1))
    offerResources(offers)
    // Don't crash!
??
It originally didn't do much except check that it didn't fail. I have since added proper checking of the task info to verify whether the fetcher cache was enabled. Will remove.
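A check of the kind described in this reply might look like the following sketch. It is not the test that was added in this PR: the helper name `allUrisCached` and the object name `FetcherCacheCheck` are hypothetical, and the only Mesos calls assumed are the generated protobuf accessors on `CommandInfo` (`getUrisList`, `getCache`).

```scala
import org.apache.mesos.Protos.CommandInfo
import scala.collection.JavaConverters._

object FetcherCacheCheck {
  // Hypothetical helper: true only if the command carries at least one URI
  // and every URI has its cache flag set.
  def allUrisCached(command: CommandInfo): Boolean = {
    val uris = command.getUrisList.asScala
    uris.nonEmpty && uris.forall(_.getCache)
  }
}
```

Requiring a non-empty URI list makes the check fail loudly when no URIs were attached at all, rather than passing vacuously.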
|
|
||
  private[mesos] val mesosExecutorCores = sc.conf.getDouble("spark.mesos.mesosExecutor.cores", 1)

  private val useFetchCache = sc.conf.getBoolean("spark.mesos.fetchCache.enable", false)
Please remove changes to the fine grained scheduler. It's deprecated, and we shouldn't be adding any new features to it.
Removed
|
Test build #66850 has finished for PR 13713 at commit
|
|
Test build #66852 has finished for PR 13713 at commit
|
      useFetchCache: Boolean = false): Unit = {
    uris.split(",").foreach { uri =>
-     builder.addUris(CommandInfo.URI.newBuilder().setValue(uri.trim()))
+     builder.addUris(CommandInfo.URI.newBuilder().setValue(uri.trim()).setCache(useFetchCache))
s/fetch/fetcher
Fixed
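For readers following the diff fragment in this thread, here is a self-contained sketch of a helper with the same shape, using the rename requested in review. The object name `FetcherCacheUris` is hypothetical and the surrounding Spark class is omitted; the body mirrors the changed line, applying one global flag to every URI.

```scala
import org.apache.mesos.Protos.CommandInfo
import scala.collection.JavaConverters._

object FetcherCacheUris {
  // Splits a comma-separated URI string and adds each entry to the command,
  // marking every URI with the same cache flag (no per-URI configuration).
  def setupUris(
      uris: String,
      builder: CommandInfo.Builder,
      useFetcherCache: Boolean = false): Unit = {
    uris.split(",").foreach { uri =>
      builder.addUris(
        CommandInfo.URI.newBuilder().setValue(uri.trim()).setCache(useFetcherCache))
    }
  }
}
```

The default of `false` keeps existing callers on the old behavior unless they opt in.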
|
LGTM @srowen |
|
Test build #66909 has finished for PR 13713 at commit
|
|
@mgummelt do you know what still needs to be done to get this in? |
|
We need to get @srowen or one of the other committers to merge it. |
|
Merged to master |
…tor backend

Mesos 0.23.0 introduces a Fetch Cache feature (http://mesos.apache.org/documentation/latest/fetcher/) which allows caching of resources specified in command URIs.

This patch:
- Updates the Mesos shaded protobuf dependency to 0.23.0
- Allows setting `spark.mesos.fetcherCache.enable` to enable the fetch cache for all specified URIs. (URIs must be specified for the setting to have any effect.)
- Updates documentation for Mesos configuration with the new setting.

This patch does NOT:
- Allow for per-URI caching configuration. The cache setting is global to ALL URIs for the command.

Author: Charles Allen <charles@allen-net.com>

Closes apache#13713 from drcrallen/SPARK15994.