Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

onCancel callback of the withTaskCancellationHandler() is called more than once. #80161

Closed
ser-0xff opened this issue Mar 20, 2025 · 6 comments · Fixed by #80456
Closed

onCancel callback of the withTaskCancellationHandler() is called more than once. #80161

ser-0xff opened this issue Mar 20, 2025 · 6 comments · Fixed by #80456
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. concurrency Feature: umbrella label for concurrency language features

Comments

@ser-0xff
Copy link

ser-0xff commented Mar 20, 2025

Description

The documentation doesn't clearly state that the onCancel can be called more than once.
We observed sometimes it happens and not clear is it bug or not.
Minimized reproducing test is available on github

Reproduction

git clone https://github.com/ordo-one/external-reproducers
cd external-reproducers/swift/task-cancellation
swift Sources/Main.swift 

Sometimes test fails:

task-cancellation % swift Sources/Main.swift 
Main/Main.swift:39: Precondition failed: cancelled more than once, iteration: 234607
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  swift-frontend             0x000000010a1c6a9c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 56
1  swift-frontend             0x000000010a1c4cf0 llvm::sys::RunSignalHandlers() + 112
2  swift-frontend             0x000000010a1c7068 SignalHandler(int) + 292
3  libsystem_platform.dylib   0x000000018b436de4 _sigtramp + 56
4  libswiftCore.dylib         0x000000019c1d90a4 $ss17_assertionFailure__4file4line5flagss5NeverOs12StaticStringV_SSAHSus6UInt32VtF + 268
5  libswiftCore.dylib         0x00000001108709ac $ss17_assertionFailure__4file4line5flagss5NeverOs12StaticStringV_SSAHSus6UInt32VtF + 18446744071367653908
6  libswift_Concurrency.dylib 0x000000026bcd427c swift::runJobInEstablishedExecutorContext(swift::Job*) + 252
7  libswift_Concurrency.dylib 0x000000026bcd57a4 swift_job_runImpl(swift::Job*, swift::SerialExecutorRef) + 144
8  libdispatch.dylib          0x000000018b25f4cc _dispatch_root_queue_drain + 392
9  libdispatch.dylib          0x000000018b25fcd8 _dispatch_worker_thread2 + 156
10 libsystem_pthread.dylib    0x000000018b3fc39c _pthread_wqthread + 228
11 libsystem_pthread.dylib    0x000000018b3fb0f0 start_wqthread + 8
zsh: trace trap  swift Sources/Main.swift

Expected behavior

precondition check at line 39 always success

Environment

task-cancellation % swiftc -version 
swift-driver version: 1.115.1 Apple Swift version 6.0.3 (swiftlang-6.0.3.1.10 clang-1600.0.30.1)
Target: arm64-apple-macosx15.0

Additional information

No response

@ser-0xff ser-0xff added bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. triage needed This issue needs more specific labels labels Mar 20, 2025
@jamieQ
Copy link
Contributor

jamieQ commented Mar 20, 2025

cc @ktoso – is this possibility expected?

@ktoso ktoso added concurrency Feature: umbrella label for concurrency language features and removed triage needed This issue needs more specific labels labels Mar 20, 2025
@adozenlines
Copy link

Sorry to hijack this issue, but the following pattern also crashes without a precondition check.

public struct RaceConditionOnCancel {
    
  public func mainTaskGroupTaskCancel() async throws {
    for _ in 0 ..< 10 {
      try await withThrowingTaskGroup(of: Void.self) { group in
        group.addTask {
          try await withTaskCancellationHandler {
              print("Waiting...")
              try await Task.sleep(nanoseconds: 100_000_000)
              print("Running...")
          } onCancel: {
              print("Cancelled in \(#function)")
          }
        }
        try await Task.sleep(nanoseconds: 60_000_000)
        group.cancelAll()
      }
    }
  }
}

do {
    let race = RaceConditionOnCancel()
    try await race.mainTaskGroupTaskCancel()
} catch {
    print(error)
}

@ktoso
Copy link
Contributor

ktoso commented Mar 31, 2025

A cancellation handler should run only ever once, so there may be something off here. We'll have to investigate.

Please always include detailed environment you're testing on (macos? ios? linux? which version of swift) when reporting bugs.


I did reproduce the initial reproducer on a mac, we should look into this some more.

@ktoso
Copy link
Contributor

ktoso commented Mar 31, 2025

It's a race between the task_cancel and the "immediately run the handler". I think we can have a fix soon.

adozenlines added a commit to adozenlines/swift that referenced this issue Apr 1, 2025
ktoso added a commit to ktoso/swift that referenced this issue Apr 2, 2025
This avoids the potential to race with the triggering coming from
task_cancel, because we first set the cancelled flag, and only THEN
take the lock and iterate over the inserted records. Because of this we
could: T1 flip the cancelled bit; T2 observes that, and triggers
"immediately" during installing the handler record. T1 then proceeds to
lock records and trigger it again, causing a double trigger of the
cancellation handler.

resolves swiftlang#80161
resolves rdar://147493150
@ktoso
Copy link
Contributor

ktoso commented Apr 2, 2025

Fix incoming #80456

ktoso added a commit that referenced this issue Apr 2, 2025
…k. (#80456)

This avoids the potential to race with the triggering coming from
task_cancel, because we first set the cancelled flag, and only THEN
take the lock and iterate over the inserted records. Because of this we
could: T1 flip the cancelled bit; T2 observes that, and triggers
"immediately" during installing the handler record. T1 then proceeds to
lock records and trigger it again, causing a double trigger of the
cancellation handler.

resolves #80161
resolves rdar://147493150
@ser-0xff
Copy link
Author

ser-0xff commented Apr 2, 2025

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. concurrency Feature: umbrella label for concurrency language features
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants