Cancellation blocks task_status.started #2544

Open
smurfix opened this issue Jan 24, 2023 · 2 comments · May be fixed by #2896

@smurfix
Contributor

smurfix commented Jan 24, 2023

The problem: when a task starts a shielded subtask, but is cancelled before the subtask re-parents itself, the cancellation isn't propagated until the subtask ends.

Consider this code:

import trio

async def t(task_status):
    with trio.CancelScope(shield=True) as sc:  # with shield=False, the code works
        await trio.sleep(2)
        print("Send",sc)
        task_status.started(sc)
        await trio.sleep(3)
        print("Terminating")

async def r(tg):
    sc = await tg.start(t)
    print("Receive",sc)
    sc.cancel()

async def main():
    async with trio.open_nursery() as tg:
        tg.start_soon(r,tg)
        await trio.sleep(1)
        tg.cancel_scope.cancel()  # comment this out and the code works

trio.run(main)

What I expect to happen: tg.start returns the value passed to task_status.started, task t gets cancelled, and this code takes one second to run.

In my "real" use case the subtask opens a database connection which must be (a) cached and (b) closed properly, so turning off the shield isn't an option. The cancellation in the last line of main is a stand-in for any kind of exception that might happen in the rest of the program.

@oremanj
Member

oremanj commented Jan 24, 2023

Nursery.start() is implemented using an inner nursery. started() moves a task from this inner nursery to the final nursery for it, which causes the inner nursery to be closed. The inner nursery executes a checkpoint in its __aexit__. This raises Cancelled and stops you from seeing the result of started(). All the surprises here are in r(), not in t(). See also #1457 for other consequences of nursery __aexit__ being a checkpoint. We reached consensus there about a way forward but I don't think it was implemented. I believe fixing that would fix this issue too.

@gschaffner
Member

what i believe should happen:

  • at t = 1 s: r is doing await tg.start(t), and t is halfway through its shielded await sleep(2). await sleep(2) is shielded so it keeps sleeping.

  • at t = 2 s:

    • print("Send", sc).

    • task_status.started(sc), so t gets reparented into tg.

    • t starts a shielded await sleep(3).

    • t is now started (i.e. reparented), so r's await tg.start(t) returns sc.

    • print("Receive", sc).

    • sc.cancel().

    • await sleep(3) raises Cancelled, sc catches it, and tg exits.

N.B.: it seems to me that the expected runtime should be 2 s, not the 1 s that you wrote?

what happened prior to #1696:

  • at t = 1 s: r is doing await tg.start(t), and t is halfway through its shielded await sleep(2). await sleep(2) is shielded so it keeps sleeping.

  • at t = 2 s:

    • print("Send", sc).

    • task_status.started(sc) incorrectly returns without actually reparenting t. even though r is in a cancelled scope, it remains blocked doing await tg.start(t) until t (which is shielded) finishes.

    • t starts a shielded await sleep(3).

  • at t = 5 s:

    • print("Terminating").

    • t finishes (inside of the internal tg.start(t) nursery). this wakes up the internal nursery in start(). await tg.start(t) raises Cancelled because internal_nursery.__aexit__ raises Cancelled when run in a cancelled scope.

    • ("Receive" never gets printed)

what happens currently:

  • at t = 1 s: r is doing await tg.start(t), and t is halfway through its shielded await sleep(2). await sleep(2) is shielded so it keeps sleeping.

  • at t = 2 s:

    • print("Send", sc).

    • task_status.started(sc) incorrectly returns without actually reparenting t. even though r is in a cancelled scope, it remains blocked doing await tg.start(t) until t (which is shielded) finishes.

    • t starts a shielded await sleep(3).

  • at t = 5 s:

    • print("Terminating").

    • t finishes (inside of the internal tg.start(t) nursery). this wakes up the internal nursery in start(). await tg.start(t) returns sc.

    • print("Receive", sc).

    • sc.cancel(). but sc has already exited, so this does nothing.
