-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-53394][CORE] UninterruptibleLock.isInterruptible should avoid duplicated interrupt #52139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
core/src/main/scala/org/apache/spark/util/UninterruptibleThread.scala
Outdated
Show resolved
Hide resolved
|
Can the existing tests verify this fix? |
@cloud-fan Yes. |
|
@Ngone51 What is the assumption? If thread A or thread B interrupts |
core/src/main/scala/org/apache/spark/util/UninterruptibleThread.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/util/UninterruptibleThread.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/util/UninterruptibleThread.scala
Outdated
Show resolved
Hide resolved
vrozov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
@Ngone51 since this needs manual test, have you verified the new fix locally? |
|
@cloud-fan Yes, it still works. |
|
thanks, merging to master/4.0! |
…duplicated interrupt ### What changes were proposed in this pull request? This PR fixes `UninterruptibleLock.isInterruptible` to avoid duplicated interruption when the thread is already interrupted. ### Why are the changes needed? The "uninterruptible" semantic of `UninterruptibleThread`is broken (i.e., `UninterruptibleThread` is interruptible even if it's under `runUninterruptibly`) after the fix #50594. The probelm is that the state of `shouldInterruptThread` becomes unsafe when there are multiple interrupts concurrently. For example, thread A could interrupt UninterruptibleThread ut first before UninterruptibleThread enters `runUninterruptibly`. Right after that, another thread B starts to invoke ut.interrupt() and pass through `uninterruptibleLock.isInterruptible` (becasue at this point, `shouldInterruptThread = uninterruptible = false`). Before thread B invokes `super.interrupt()`, UninterruptibleThread ut enters `runUninterruptibly` and pass through `uninterruptibleLock.getAndSetUninterruptible` and set `uninterruptible = true`. Then, thread ut continues the check `uninterruptibleLock.isInterruptPending`. However, `uninterruptibleLock.isInterruptPending` return false at this point (due to `shouldInterruptThread = Thread.interrupted = true`) even though thread B is actully interrupting. *As a result, the state of `shouldInterruptThread` becomes inconsistent between thread B and thread ut.* Then, as `uninterruptibleLock.isInterruptPending` returns false, ut to continute to execute `f`. At the same time, thread B invokes `super.interrupt()`, and `f` could be interrupted ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually tested. The issue can be easily reproduced if we run `UninterruptibleThreadSuite.stress test` for 100 times in a row: ``` [info] true did not equal false (UninterruptibleThreadSuite.scala:208) [info] org.scalatest.exceptions.TestFailedException: [info] at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472) [info] at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471) [info] at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1231) [info] at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:1295) [info] at org.apache.spark.util.UninterruptibleThreadSuite.$anonfun$new$7(UninterruptibleThreadSuite.scala:208) ... ``` And the issue is gone after the fix. ### Was this patch authored or co-authored using generative AI tooling? No Closes #52139 from Ngone51/fix-uninterruptiable. Authored-by: Yi Wu <yi.wu@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit 78871d7) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
|
Thanks! @cloud-fan @vrozov |
…duplicated interrupt ### What changes were proposed in this pull request? This PR fixes `UninterruptibleLock.isInterruptible` to avoid duplicated interruption when the thread is already interrupted. ### Why are the changes needed? The "uninterruptible" semantic of `UninterruptibleThread`is broken (i.e., `UninterruptibleThread` is interruptible even if it's under `runUninterruptibly`) after the fix apache#50594. The probelm is that the state of `shouldInterruptThread` becomes unsafe when there are multiple interrupts concurrently. For example, thread A could interrupt UninterruptibleThread ut first before UninterruptibleThread enters `runUninterruptibly`. Right after that, another thread B starts to invoke ut.interrupt() and pass through `uninterruptibleLock.isInterruptible` (becasue at this point, `shouldInterruptThread = uninterruptible = false`). Before thread B invokes `super.interrupt()`, UninterruptibleThread ut enters `runUninterruptibly` and pass through `uninterruptibleLock.getAndSetUninterruptible` and set `uninterruptible = true`. Then, thread ut continues the check `uninterruptibleLock.isInterruptPending`. However, `uninterruptibleLock.isInterruptPending` return false at this point (due to `shouldInterruptThread = Thread.interrupted = true`) even though thread B is actully interrupting. *As a result, the state of `shouldInterruptThread` becomes inconsistent between thread B and thread ut.* Then, as `uninterruptibleLock.isInterruptPending` returns false, ut to continute to execute `f`. At the same time, thread B invokes `super.interrupt()`, and `f` could be interrupted ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually tested. The issue can be easily reproduced if we run `UninterruptibleThreadSuite.stress test` for 100 times in a row: ``` [info] true did not equal false (UninterruptibleThreadSuite.scala:208) [info] org.scalatest.exceptions.TestFailedException: [info] at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472) [info] at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471) [info] at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1231) [info] at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:1295) [info] at org.apache.spark.util.UninterruptibleThreadSuite.$anonfun$new$7(UninterruptibleThreadSuite.scala:208) ... ``` And the issue is gone after the fix. ### Was this patch authored or co-authored using generative AI tooling? No Closes apache#52139 from Ngone51/fix-uninterruptiable. Authored-by: Yi Wu <yi.wu@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit 271392a) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
This PR fixes
UninterruptibleLock.isInterruptibleto avoid duplicated interruption when the thread is already interrupted.Why are the changes needed?
The "uninterruptible" semantic of
UninterruptibleThreadis broken (i.e.,UninterruptibleThreadis interruptible even if it's underrunUninterruptibly) after the fix #50594. The probelm is that the state ofshouldInterruptThreadbecomes unsafe when there are multiple interrupts concurrently.For example, thread A could interrupt UninterruptibleThread ut first before UninterruptibleThread enters
runUninterruptibly. Right after that, another thread B starts to invoke ut.interrupt() and pass throughuninterruptibleLock.isInterruptible(becasue at this point,shouldInterruptThread = uninterruptible = false). Before thread B invokessuper.interrupt(), UninterruptibleThread ut entersrunUninterruptiblyand pass throughuninterruptibleLock.getAndSetUninterruptibleand setuninterruptible = true. Then, thread ut continues the checkuninterruptibleLock.isInterruptPending. However,uninterruptibleLock.isInterruptPendingreturn false at this point (due toshouldInterruptThread = Thread.interrupted = true) even though thread B is actully interrupting. As a result, the state ofshouldInterruptThreadbecomes inconsistent between thread B and thread ut. Then, asuninterruptibleLock.isInterruptPendingreturns false, ut to continute to executef. At the same time, thread B invokessuper.interrupt(), andfcould be interruptedDoes this PR introduce any user-facing change?
No.
How was this patch tested?
Manually tested. The issue can be easily reproduced if we run
UninterruptibleThreadSuite.stress testfor 100 times in a row:And the issue is gone after the fix.
Was this patch authored or co-authored using generative AI tooling?
No