[SPARK-27112][CORE] : Create a resource ordering between threads to resolve the deadlocks encountered … #24072
Conversation
…when trying to kill executors either due to dynamic allocation or blacklisting. Ordered the synchronization constraint by acquiring the lock on TaskSchedulerImpl before acquiring the lock on CoarseGrainedSchedulerBackend.
ok to test
    force: Boolean): Seq[String] = {
  logInfo(s"Requesting to kill executor(s) ${executorIds.mkString(", ")}")

  val idleExecutorIds = executorIds.filter { id => force || !scheduler.isExecutorBusy(id) }
Nit: I would not use the name idleExecutorIds for this variable, since when the force flag is true it contains more than just idle executors.
meh, I'm not sure what else you'd call it ... there is already executorsToKill lower down ... unless you have a better suggestion, idleExecutorIds is probably good enough
but this does leave a small race for SPARK-19757, doesn't it? After this executes, an executor gets a task scheduled on it so it's no longer idle, but you still kill it below? To really prevent that, you'd need to get both locks (in the same order of course), so
val response = scheduler.synchronized { this.synchronized {
it also isn't the worst thing in the world if we occasionally kill an executor which just got a task scheduled on it.
@squito If you wanted to prevent that race, then you need something like:
val response = scheduler.synchronized {
  val idleExecutorIds = executorIds.filter { id => force || !scheduler.isExecutorBusy(id) }
  this.synchronized {
    ...
  }
}
right (so the lookup inside the scheduler lock)?
it also isn't the worst thing in the world if we occasionally kill an executor which just got a task scheduled on it.
So we don't count this as a task failure right? Not sure where to look to verify that.
yes, I meant with the filter happening inside both locks -- more like it was before the current form of the PR, or as you suggested
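To make that shape concrete, here is a rough sketch of killExecutors() with the busy check done while both locks are held, scheduler first and backend second. This is illustrative only, not the exact PR diff; the surrounding bookkeeping is abbreviated and doKillExecutors stands in for the existing kill path.

```scala
// Sketch only: acquire TaskSchedulerImpl's lock first, then the backend's own lock,
// and run the busy filter while both are held so no task can land on an executor
// between the check and the kill (the SPARK-19757 race discussed above).
val response = scheduler.synchronized {
  CoarseGrainedSchedulerBackend.this.synchronized {
    val executorsToKill = executorIds
      .filter { id => !executorsPendingToRemove.contains(id) }
      .filter { id => force || !scheduler.isExecutorBusy(id) }
    // ... record the executors as pending to remove and ask the cluster manager to kill them ...
    doKillExecutors(executorsToKill)
  }
}
```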
I have a suggestion for naming but I do not insist on that:
- renaming idleExecutorIds to executorsToKill
- renaming the old executorsToKill to knownExecutorsToKill
I also have checked the synchronised blocks of CoarseGrainedSchedulerBackend and its derived classes and have not found any other place where the scheduler is used for locking (within the synchronised block).
squito left a comment
I think this makes sense, but there are more instances of the order inversion in the locks. I noticed at least CoarseGrainedSchedulerBackend.disableExecutor() also reverses the lock order.
      Some(executorData.executorAddress.hostPort))
  }.toIndexedSeq
  scheduler.resourceOffers(workOffers)
  val taskDescs = scheduler.synchronized {
there should be a comment here about why we need both of these locks.
@squito the …
Your PR title and description are basically copies of the bug. Could you instead describe the change?
Test build #103386 has finished for PR 24072 at commit
@abellina you're right about …
@squito I did think about this yesterday and tried it out as well; the deadlock issue gets fixed along with the race, but I was not sure whether doing this might cause a performance degradation, as a bunch of threads might end up busy-waiting a lot of the time. I can do some perf tests with the above change and, if it looks good, update the PR with the fix. Will let you know. Thank you.
  }

  // If an executor is already pending to be removed, do not kill it again (SPARK-9795)
  // If this executor is busy, do not kill it unless we are told to force kill it (SPARK-9552)
Remove this comment and add one where we are doing the force check.
Locking the code block in the killExecutors() method with TaskSchedulerImpl followed by CoarseGrainedSchedulerBackend to avoid the race condition issue, and adding comments.
Test build #103453 has started for PR 24072 at commit
Sorry to be a pain about this, but please remove the bug stuff from the PR description. If we want details about the bug, we can look at, ahem, the bug. Focus on describing what the change does and why it fixes the problem.
  // SPARK-27112: We need to ensure that there is ordering of lock acquisition
  // between TaskSchedulerImpl and CoarseGrainedSchedulerBackend objects in order to fix
  // the deadlock issue exposed in SPARK-27112
  val taskDescs = scheduler.synchronized {
I took a quick look at the code that calls this, and I'm wondering if holding the two locks here is really needed.
For context, all this code is inside the RPC endpoint handler. This is a ThreadSafeRpcEndpoint so there's only one message being processed at a time, meaning that you won't have multiple threads calling makeOffers concurrently.
So it seems to me that it would be possible to:
- With the CoarseGrainedSchedulerBackend.this lock held, calculate the work offers:
  val workOffers = CoarseGrainedSchedulerBackend.this.synchronized {
    ...
  }
- With the scheduler lock held, make the offers:
  val taskDesc = scheduler.synchronized {
    scheduler.resourceOffers(workOffers)
  }
And as far as I understand that should work and also be easier to understand, right?
I also noticed that later this code calls launchTasks, and that method accesses and modifies data in executorDataMap without the CoarseGrainedSchedulerBackend.this lock, which is very sketchy.
Ok, seems both locks are needed because of SPARK-19757. But the launchTasks issue is still there.
@vanzin So in the code, I came across the following comment; wonder if that answers the launchTasks issue. I do not exactly understand the intention of the comment though.
// Accessing `executorDataMap` in `DriverEndpoint.receive/receiveAndReply` doesn't need any
// protection. But accessing `executorDataMap` out of `DriverEndpoint.receive/receiveAndReply`
// must be protected by `CoarseGrainedSchedulerBackend.this`. Besides, `executorDataMap` should
// only be modified in `DriverEndpoint.receive/receiveAndReply` with protection by
// `CoarseGrainedSchedulerBackend.this`.
private val executorDataMap = new HashMap[String, ExecutorData]
Ok, I think that's fine. I checked and all modifications happen on the endpoint thread, so reading the map from that thread without a lock should be fine. The data being modified (freeCores) is also only used in the endpoint thread, so that looks safe too.
  // If this executor is busy, do not kill it unless we are told to force kill it (SPARK-9552)
  val executorsToKill = knownExecutors
    .filter { id => !executorsPendingToRemove.contains(id) }
    .filter { id => force || !scheduler.isExecutorBusy(id) }
In a similar vein to my previous comment, although I'm less sure about this one.
This seems to be the only interaction with the scheduler in this method, so could this filtering be done first thing in the method, with the scheduler lock held, and then the rest of the code just needs the CoarseGrainedSchedulerBackend lock?
It seems to me the behavior wouldn't change from the current state (where the internal scheduler state can change while this method is running). And as in the other case, easier to understand things when you're only holding one lock.
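A quick sketch of that restructuring (the variable name notBusy is made up here purely for illustration; the kill bookkeeping is elided); note the follow-up below, which explains why the busy check ultimately has to stay under both locks:

```scala
// Sketch of the suggestion: touch the scheduler only for the busy check...
val notBusy = scheduler.synchronized {
  executorIds.filter { id => force || !scheduler.isExecutorBusy(id) }
}
// ...then do everything else under the backend lock alone, so the two locks never nest.
val response = CoarseGrainedSchedulerBackend.this.synchronized {
  val executorsToKill = notBusy.filter { id => !executorsPendingToRemove.contains(id) }
  // ... existing kill bookkeeping ...
  doKillExecutors(executorsToKill)
}
```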
(Caught up with the previous discussion and it seems that here both locks are needed to avoid an edge case where you could kill active executors.)
BTW if the "two locks need to be held" thing is really needed in multiple places, might be good to have a helper function, e.g. …
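For reference, a minimal sketch of what such a helper could look like. The PR does end up adding a withLock helper (it appears in the later diff context below); the exact body shown here is an assumption based on the lock order described in this thread.

```scala
// Take the two locks in one fixed order: TaskSchedulerImpl first, then this backend.
// Routing every "needs both locks" code path through this helper makes an order inversion impossible.
private def withLock[T](fn: => T): T = scheduler.synchronized {
  CoarseGrainedSchedulerBackend.this.synchronized { fn }
}
```

makeOffers() and killExecutors() can then wrap their critical sections in withLock { ... } instead of nesting the synchronized blocks by hand.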
@squito @vanzin @attilapiros @abellina Have worked on all the comments and pushed the respective changes.
  private def makeOffers() {
    // Make sure no executor is killed while some task is launching on it
    val taskDescs = CoarseGrainedSchedulerBackend.this.synchronized {
      // SPARK-27112: We need to ensure that there is ordering of lock acquisition
This comment would be great in the withLock function, instead of being copy & pasted in a few places.
Done
  // SPARK-27112: We need to ensure that there is ordering of lock acquisition
  // between TaskSchedulerImpl and CoarseGrainedSchedulerBackend objects in order to fix
  // the deadlock issue exposed in SPARK-27112
  val taskDescs = withLock({
No need for the parentheses.
Done
Sorry for the extra commits, was fixing code indentation.
Test build #103518 has finished for PR 24072 at commit
Test build #103514 has finished for PR 24072 at commit
Test build #103519 has finished for PR 24072 at commit
LGTM
lgtm
abellina left a comment
👍
merged to master. @pgandhi999 there was a merge conflict against branch-2.4, would you mind opening another PR against that branch?
Sure @squito will do that. Thank you.
…esolve the deadlocks encountered … …when trying to kill executors either due to dynamic allocation or blacklisting

There are two deadlocks as a result of the interplay between three different threads:
**task-result-getter thread**
**spark-dynamic-executor-allocation thread**
**dispatcher-event-loop thread (makeOffers())**

The fix ensures ordering synchronization constraint by acquiring lock on `TaskSchedulerImpl` before acquiring lock on `CoarseGrainedSchedulerBackend` in `makeOffers()` as well as killExecutors() method. This ensures resource ordering between the threads and thus, fixes the deadlocks.

Manual Tests

Closes apache#24072 from pgandhi999/SPARK-27112-2.
Authored-by: pgandhi <pgandhi@verizonmedia.com>
Signed-off-by: Imran Rashid <irashid@cloudera.com>
…esolve the deadlocks encountered when trying to kill executors either due to dynamic allocation or blacklisting

Closes #24072 from pgandhi999/SPARK-27112-2.
Authored-by: pgandhi <pgandhi@verizonmedia.com>
Signed-off-by: Imran Rashid <irashid@cloudera.com>

## What changes were proposed in this pull request?
There are two deadlocks as a result of the interplay between three different threads:
**task-result-getter thread**
**spark-dynamic-executor-allocation thread**
**dispatcher-event-loop thread (makeOffers())**

The fix ensures ordering synchronization constraint by acquiring lock on `TaskSchedulerImpl` before acquiring lock on `CoarseGrainedSchedulerBackend` in `makeOffers()` as well as killExecutors() method. This ensures resource ordering between the threads and thus, fixes the deadlocks.

## How was this patch tested?
Manual Tests

Closes #24134 from pgandhi999/branch-2.4-SPARK-27112.
Authored-by: pgandhi <pgandhi@verizonmedia.com>
Signed-off-by: Imran Rashid <irashid@cloudera.com>
…when trying to kill executors either due to dynamic allocation or blacklisting
What changes were proposed in this pull request?
There are two deadlocks as a result of the interplay between three different threads:
task-result-getter thread
spark-dynamic-executor-allocation thread
dispatcher-event-loop thread(makeOffers())
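The deadlocks come from these threads taking the TaskSchedulerImpl and CoarseGrainedSchedulerBackend locks in opposite orders. For intuition, a self-contained, abstract sketch of that inversion (illustrative only, not the actual Spark code):

```scala
object LockOrderInversionDemo extends App {
  // Stand-ins for the TaskSchedulerImpl and CoarseGrainedSchedulerBackend monitors.
  val scheduler = new Object
  val backend = new Object

  // One code path takes the backend lock and then needs the scheduler lock.
  val t1 = new Thread(() => backend.synchronized {
    Thread.sleep(100) // widen the window so both threads grab their first lock
    scheduler.synchronized { println("path 1 done") }
  })

  // Another code path takes the scheduler lock and then needs the backend lock.
  val t2 = new Thread(() => scheduler.synchronized {
    Thread.sleep(100)
    backend.synchronized { println("path 2 done") }
  })

  t1.start(); t2.start()
  // With the sleeps, neither println is reached and the program hangs: each thread
  // blocks on the lock the other holds. The fix makes every such path acquire the
  // scheduler lock before the backend lock, so this circular wait cannot occur.
}
```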
The fix ensures an ordering synchronization constraint by acquiring the lock on TaskSchedulerImpl before acquiring the lock on CoarseGrainedSchedulerBackend in makeOffers() as well as in the killExecutors() method. This ensures resource ordering between the threads and thus fixes the deadlocks.

How was this patch tested?
Manual Tests