

@pgandhi999 pgandhi999 commented Mar 12, 2019

…when trying to kill executors either due to dynamic allocation or blacklisting

What changes were proposed in this pull request?

There are two deadlocks caused by the interplay between three different threads:

  • task-result-getter thread
  • spark-dynamic-executor-allocation thread
  • dispatcher-event-loop thread (makeOffers())

The fix enforces a lock-ordering constraint: the lock on TaskSchedulerImpl is acquired before the lock on CoarseGrainedSchedulerBackend in both makeOffers() and killExecutors(). This consistent resource ordering between the threads fixes the deadlocks.
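Concretely, the pattern looks roughly like the sketch below (a simplified illustration of the lock nesting, not the exact diff; the body of the locked region is elided):

private def makeOffers() {
  // SPARK-27112: take the TaskSchedulerImpl lock first, then the backend's own lock,
  // so every thread acquires the two locks in the same order and the circular wait
  // cannot form.
  val taskDescs = scheduler.synchronized {
    CoarseGrainedSchedulerBackend.this.synchronized {
      // build WorkerOffers from the active executors and hand them to
      // scheduler.resourceOffers(...)
    }
  }
  // launchTasks(taskDescs) happens outside the locked region
}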

How was this patch tested?

Manual Tests

…when trying to kill executors either due to dynamic allocation or blacklisting

Ordered synchronization constraint by acquiring lock on Task Scheduler before acquiring lock on CoarseGrainedSchedulerBackend
@pgandhi999
Author

ok to test

force: Boolean): Seq[String] = {
logInfo(s"Requesting to kill executor(s) ${executorIds.mkString(", ")}")

val idleExecutorIds = executorIds.filter { id => force || !scheduler.isExecutorBusy(id) }
Contributor

Nit: I would not use the name idleExecutorIds for this variable, since when the force flag is true it contains more than just idle executors.

Contributor

meh, I'm not sure what else you'd call it ... there is already executorsToKill lower down ... unless you have a better suggestion, idleExecutorIds is probably good enough

but this does leave a small race for SPARK-19757, doesn't it? After this executes, an executor could get a task scheduled on it so it's no longer idle, but you still kill it below? To really prevent that, you'd need to get both locks (in the same order of course), so

val response = scheduler.synchronized { this.synchronized {

it also isn't the worst thing in the world if we occasionally kill an executor which just got a task scheduled on it.

Contributor

@abellina abellina Mar 12, 2019

@squito If you wanted to prevent that race, then you need something like:

val response = scheduler.synchronized {
  val idleExecutorIds = executorIds.filter { id => force || !scheduler.isExecutorBusy(id) }
  this.synchronized {
    ...
  }
}

right (so the lookup inside the scheduler lock)?

it also isn't the worst thing in the world if we occasionally kill an executor which just got a task scheduled on it.

So we don't count this as a task failure right? Not sure where to look to verify that.

Contributor

yes, I meant with the filter happening inside both locks -- more like it was before the current form of the PR, or as you suggested

Contributor

I have a suggestion for naming but I do not insist on that:

  • renaming idleExecutorIds to executorsToKill
  • renaming the old executorsToKill to knownExecutorsToKill

I have also checked the synchronized blocks of CoarseGrainedSchedulerBackend and its derived classes and have not found any other place where the scheduler is used for locking (within a synchronized block).

Contributor

@squito squito left a comment

I think this makes sense, but there are more instances of the lock-order inversion. I noticed at least CoarseGrainedSchedulerBackend.disableExecutor() also reverses the lock order.

Some(executorData.executorAddress.hostPort))
}.toIndexedSeq
scheduler.resourceOffers(workOffers)
val taskDescs = scheduler.synchronized {
Contributor

there should be a comment here about why we need both of these locks.

@abellina
Contributor

@squito Regarding CoarseGrainedSchedulerBackend.disableExecutor(): I don't think it inverts the order. The scheduler.executorLost call happens outside of the CoarseGrainedSchedulerBackend lock, and it gets called from a driver event loop. Let me know if I've missed something.

@vanzin
Contributor

vanzin commented Mar 12, 2019

Your PR title and description are basically copies of the bug. Could you instead describe the change?

@SparkQA

SparkQA commented Mar 13, 2019

Test build #103386 has finished for PR 24072 at commit e649900.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito
Contributor

squito commented Mar 13, 2019

@abellina you're right about disableExecutors, thanks for taking a closer look, sorry that was just from a really quick scan. But we should be sure to take a close look at all places the lock is used.

@pgandhi999 pgandhi999 changed the title [SPARK-27112] : Spark Scheduler encounters two independent Deadlocks … [SPARK-27112] : Create a resource ordering between threads to resolve the deadlocks encountered … Mar 13, 2019
@pgandhi999
Author

but this does leave a small race for SPARK-19757, doesn't it? After this executes, then an executor gets a task scheduled on it so its no longer idle, but you still kill it below? To really prevent that, you'd need to get both locks (in the same order of course) so

val response = scheduler.synchronized { this.synchronized {

@squito I did think about this yesterday and tried it out as well; the deadlock issue gets fixed along with the race, but I was not sure whether doing this might cause a performance degradation, as a bunch of threads might end up busy waiting a lot of the time. I can do some perf tests with the above change and, if it looks good, update the PR with the fix. Will let you know. Thank you.

}

// If an executor is already pending to be removed, do not kill it again (SPARK-9795)
// If this executor is busy, do not kill it unless we are told to force kill it (SPARK-9552)
Contributor

Remove this comment and add one where we are doing the force check.

Locking the code block in killExecutors() method with TaskSchedulerImpl followed by CoarseGrainedSchedulerBackend to avoid race condition issue and adding comments.
@SparkQA

SparkQA commented Mar 13, 2019

Test build #103453 has started for PR 24072 at commit ed12daf.

@vanzin
Contributor

vanzin commented Mar 13, 2019

Sorry to be a pain about this, but please remove the bug stuff from the PR description. If we want details about the bug, we can look at, ahem, the bug. Focus on describing what the change does and why it fixes the problem.

// SPARK-27112: We need to ensure that there is ordering of lock acquisition
// between TaskSchedulerImpl and CoarseGrainedSchedulerBackend objects in order to fix
// the deadlock issue exposed in SPARK-27112
val taskDescs = scheduler.synchronized {
Contributor

I took a quick look at the code that calls this, and I'm wondering if holding the two locks here is really needed.

For context, all this code is inside the RPC endpoint handler. This is a ThreadSafeRpcEndpoint so there's only one message being processed at a time, meaning that you won't have multiple threads calling makeOffers concurrently.

So it seems to me that it would be possible to:

  • With the CoarseGrainedSchedulerBackend.this lock held, calculate the work offers:
val workOffers = CoarseGrainedSchedulerBackend.this.synchronized {
  ...
}

  • With the scheduler lock held, make the offers:

val taskDesc = scheduler.synchronized {
  scheduler.resourceOffers(workOffers)
}

And as far as I understand that should work and also be easier to understand, right?

I also noticed that later this code calls launchTasks, and that method accesses and modifies data in executorDataMap without the CoarseGrainedSchedulerBackend.this lock, which is very sketchy.

Contributor

Ok, seems both locks are needed because of SPARK-19757. But the launchTasks issue is still there.

Author

@vanzin So in the code, I came across the following comment; I wonder if that answers the launchTasks issue. I do not exactly understand the intention of the comment, though.

// Accessing `executorDataMap` in `DriverEndpoint.receive/receiveAndReply` doesn't need any
// protection. But accessing `executorDataMap` out of `DriverEndpoint.receive/receiveAndReply`
// must be protected by `CoarseGrainedSchedulerBackend.this`. Besides, `executorDataMap` should
// only be modified in `DriverEndpoint.receive/receiveAndReply` with protection by
// `CoarseGrainedSchedulerBackend.this`.
private val executorDataMap = new HashMap[String, ExecutorData]

Contributor

Ok, I think that's fine. I checked and all modifications happen on the endpoint thread, so reading the map from that thread without a lock should be fine. The data being modified (freeCores) is also only used in the endpoint thread, so that looks safe too.

// If this executor is busy, do not kill it unless we are told to force kill it (SPARK-9552)
val executorsToKill = knownExecutors
.filter { id => !executorsPendingToRemove.contains(id) }
.filter { id => force || !scheduler.isExecutorBusy(id) }
Contributor

In a similar vein to my previous comment, although I'm less sure about this one.

This seems to be the only interaction with the scheduler in this method, so could this filtering be done first thing in the method, with the scheduler lock held, and then the rest of the code just needs the CoarseGrainedSchedulerBackend lock?

It seems to me the behavior wouldn't change from the current state (where the internal scheduler state can change while this method is running). And as in the other case, easier to understand things when you're only holding one lock.
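For illustration, the restructuring suggested here might look roughly like this (a sketch only; nonBusyExecutorIds is a made-up name and the backend bookkeeping is abbreviated):

// Filter against scheduler state while holding only the TaskSchedulerImpl lock ...
val nonBusyExecutorIds = scheduler.synchronized {
  executorIds.filter { id => force || !scheduler.isExecutorBusy(id) }
}
// ... then do the backend-side bookkeeping while holding only the backend lock.
synchronized {
  val executorsToKill = nonBusyExecutorIds
    .filter { id => !executorsPendingToRemove.contains(id) }
  // mark these executors as pending removal and ask the cluster manager to kill them
}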

Contributor

(Caught up with the previous discussion and it seems that here both locks are needed to avoid an edge case where you could kill active executors.)

@vanzin
Contributor

vanzin commented Mar 14, 2019

BTW if the "two locks need to be held" thing is really needed in multiple places, might be good to have a helper function, e.g.

def withLock[T](fn: => T): T = lock1.synchronized { lock2.synchronized { fn } }
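Applied to this PR, such a helper could look roughly like the following (a sketch, assuming it lives in CoarseGrainedSchedulerBackend and that the TaskSchedulerImpl lock is the outer one, matching the ordering used elsewhere in this change):

// Acquire the TaskSchedulerImpl lock, then the backend lock, always in that order,
// so the SPARK-27112 ordering rule is encoded in a single place.
private def withLock[T](fn: => T): T = scheduler.synchronized {
  CoarseGrainedSchedulerBackend.this.synchronized { fn }
}

// Usage, e.g. in makeOffers():
val taskDescs = withLock {
  // build offers and call scheduler.resourceOffers(...)
}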

@pgandhi999
Author

@squito @vanzin @attilapiros @abellina Have worked on all the comments and pushed the respective changes.

private def makeOffers() {
// Make sure no executor is killed while some task is launching on it
val taskDescs = CoarseGrainedSchedulerBackend.this.synchronized {
// SPARK-27112: We need to ensure that there is ordering of lock acquisition
Contributor

This comment would be great in the withLock function, instead of being copy & pasted in a few places.

Author

Done

// SPARK-27112: We need to ensure that there is ordering of lock acquisition
// between TaskSchedulerImpl and CoarseGrainedSchedulerBackend objects in order to fix
// the deadlock issue exposed in SPARK-27112
val taskDescs = withLock({
Contributor

No need for the parentheses.

Author

Done

Author

Sorry for the extra commits, was fixing code indentation.

@SparkQA

SparkQA commented Mar 15, 2019

Test build #103518 has finished for PR 24072 at commit 47448b7.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 15, 2019

Test build #103514 has finished for PR 24072 at commit 2b4f226.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 15, 2019

Test build #103519 has finished for PR 24072 at commit 09f9b47.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@attilapiros
Contributor

LGTM

@squito
Contributor

squito commented Mar 15, 2019

lgtm

@vanzin vanzin changed the title [SPARK-27112] : Create a resource ordering between threads to resolve the deadlocks encountered … [SPARK-27112][core] Create a resource ordering between threads to resolve the deadlocks encountered … Mar 15, 2019
Contributor

@abellina abellina left a comment

👍

@pgandhi999 pgandhi999 changed the title [SPARK-27112][core] Create a resource ordering between threads to resolve the deadlocks encountered … [SPARK-27112][CORE] : Create a resource ordering between threads to resolve the deadlocks encountered … Mar 15, 2019
@dhruve
Contributor

dhruve commented Mar 18, 2019

+1
@squito @vanzin Can we merge this PR in? Thanks.

@squito
Contributor

squito commented Mar 18, 2019

merged to master.

@pgandhi999 there was a merge conflict against branch-2.4, would you mind opening another PR against that branch?

@asfgit asfgit closed this in 7043aee Mar 18, 2019
@pgandhi999
Author

Sure @squito will do that. Thank you.

pgandhi999 pushed a commit to pgandhi999/spark that referenced this pull request Mar 18, 2019
…esolve the deadlocks encountered …

…when trying to kill executors either due to dynamic allocation or blacklisting

There are two deadlocks as a result of the interplay between three different threads:

**task-result-getter thread**

**spark-dynamic-executor-allocation thread**

**dispatcher-event-loop thread(makeOffers())**

The fix ensures ordering synchronization constraint by acquiring lock on `TaskSchedulerImpl` before acquiring lock on `CoarseGrainedSchedulerBackend` in `makeOffers()` as well as killExecutors() method. This ensures resource ordering between the threads and thus, fixes the deadlocks.

Manual Tests

Closes apache#24072 from pgandhi999/SPARK-27112-2.

Authored-by: pgandhi <pgandhi@verizonmedia.com>
Signed-off-by: Imran Rashid <irashid@cloudera.com>
asfgit pushed a commit that referenced this pull request Mar 19, 2019
…esolve the deadlocks encountered when trying to kill executors either due to dynamic allocation or blacklisting

Closes #24072 from pgandhi999/SPARK-27112-2.

Authored-by: pgandhi <pgandhi@verizonmedia.com>
Signed-off-by: Imran Rashid <irashid@cloudera.com>

## What changes were proposed in this pull request?

There are two deadlocks as a result of the interplay between three different threads:

**task-result-getter thread**

**spark-dynamic-executor-allocation thread**

**dispatcher-event-loop thread(makeOffers())**

The fix ensures ordering synchronization constraint by acquiring lock on `TaskSchedulerImpl` before acquiring lock on `CoarseGrainedSchedulerBackend` in `makeOffers()` as well as killExecutors() method. This ensures resource ordering between the threads and thus, fixes the deadlocks.

## How was this patch tested?

Manual Tests

Closes #24134 from pgandhi999/branch-2.4-SPARK-27112.

Authored-by: pgandhi <pgandhi@verizonmedia.com>
Signed-off-by: Imran Rashid <irashid@cloudera.com>
asfgit pushed a commit that referenced this pull request Mar 19, 2019
…esolve the deadlocks encountered when trying to kill executors either due to dynamic allocation or blacklisting

Closes #24072 from pgandhi999/SPARK-27112-2.
Closes #24134 from pgandhi999/branch-2.4-SPARK-27112.

Authored-by: pgandhi <pgandhi@verizonmedia.com>
Signed-off-by: Imran Rashid <irashid@cloudera.com>
(cherry picked from commit 95e73b3)
Signed-off-by: Imran Rashid <irashid@cloudera.com>
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Jul 23, 2019
…esolve the deadlocks encountered when trying to kill executors either due to dynamic allocation or blacklisting

kai-chi pushed a commit to kai-chi/spark that referenced this pull request Jul 25, 2019
…esolve the deadlocks encountered when trying to kill executors either due to dynamic allocation or blacklisting

kai-chi pushed a commit to kai-chi/spark that referenced this pull request Aug 1, 2019
…esolve the deadlocks encountered when trying to kill executors either due to dynamic allocation or blacklisting