Commit 3130ac9
[SPARK-46861][CORE] Avoid Deadlock in DAGScheduler
* The DAGScheduler could currently run into a deadlock with another thread if both access the partitions of the same RDD at the same time.
* To make progress in getCacheLocs, we require exclusive access to both the RDD partitions and the location cache. We first lock on the location cache, and then on the RDD.
* When accessing partitions of an RDD, the RDD first acquires exclusive access on the partitions, and then might acquire exclusive access on the location cache.
* If thread 1 acquires the lock on the RDD while thread 2 holds the lock on the location cache, each thread waits for the lock the other one holds, and neither can proceed: a deadlock.
* To fix this, acquire locks in the same order. Change the DAGScheduler to first acquire the lock on the RDD, and then the lock on the location cache.
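* Why the change is needed: once this deadlock occurs, it can prevent any further progress on the cluster.
* User-facing change: No
* Testing: unit test that reproduces the issue.
* Authored or co-authored using generative AI tooling: No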
Closes #44882 from fred-db/fix-deadlock.
Authored-by: fred-db <fredrik.klauss@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 617014c)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
1 parent e56bd97 commit 3130ac9
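The lock-ordering fix described in the commit message can be illustrated with a minimal sketch. This is not the code from this commit: `SketchRdd`, `stateLock`, `setPartitions`, `runBoth`, and the `host-N` location strings are hypothetical stand-ins, and only the name `getCacheLocs` and the idea of a location cache come from the message above. The sketch assumes plain JVM monitors (`synchronized`) and shows both code paths taking the RDD lock first and the location cache lock second.

```scala
import scala.collection.mutable

object LockOrderSketch {
  // Hypothetical stand-in for an RDD; not Spark's actual class or fields.
  final class SketchRdd(val id: Int) {
    val stateLock = new Object                  // guards this RDD's partitions
    var partitions: Vector[Int] = Vector.empty  // read/written only under stateLock
  }

  // Hypothetical stand-in for the scheduler's location cache.
  val cacheLocs = mutable.HashMap.empty[Int, Vector[String]]

  // Scheduler-side path after the fix: RDD lock first, then the location cache.
  def getCacheLocs(rdd: SketchRdd): Vector[String] =
    rdd.stateLock.synchronized {
      cacheLocs.synchronized {
        cacheLocs.getOrElseUpdate(rdd.id, rdd.partitions.map(p => s"host-$p"))
      }
    }

  // RDD-side path: RDD lock first, then (possibly) the location cache.
  def setPartitions(rdd: SketchRdd, n: Int): Unit =
    rdd.stateLock.synchronized {
      rdd.partitions = Vector.range(0, n)
      cacheLocs.synchronized { cacheLocs.remove(rdd.id) } // drop stale locations
    }

  // Runs both paths concurrently; returns true if both threads finish in time.
  def runBoth(iterations: Int, timeoutMs: Long): Boolean = {
    val rdd = new SketchRdd(0)
    val t1 = new Thread(() => (1 to iterations).foreach(_ => getCacheLocs(rdd)))
    val t2 = new Thread(() => (1 to iterations).foreach(i => setPartitions(rdd, i % 4 + 1)))
    Seq(t1, t2).foreach { t => t.setDaemon(true); t.start() }
    Seq(t1, t2).foreach(_.join(timeoutMs))
    !t1.isAlive && !t2.isAlive
  }
}
```

Because neither path ever holds the cache lock while waiting for the RDD lock, a call such as `LockOrderSketch.runBoth(1000, 5000L)` is expected to complete. If `getCacheLocs` instead took `cacheLocs.synchronized` before `rdd.stateLock.synchronized`, the two threads could block each other in the way the commit message describes.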
File tree
3 files changed, +62 -18 lines changed
- core/src
  - main/scala/org/apache/spark
    - rdd
    - scheduler
  - test/scala/org/apache/spark/scheduler
Lines changed: 7 additions & 4 deletions
[Diff content not captured. Hunk: original lines 223-236 (new 223-239).]
Lines changed: 18 additions & 13 deletions
[Diff content not captured. Hunks: original lines 173-178 (new 173-181); original lines 408-429 (new 411-434).]
Lines changed: 37 additions & 1 deletion
[Diff content not captured. Hunks: original lines 48-54 (new 48-54); original lines 589-594 (new 589-630).]