asyncio doesn't warn if a task is destroyed during its execution #65362

Closed
RichardKiss mannequin opened this issue Apr 5, 2014 · 37 comments
Labels
stdlib Python modules in the Lib dir topic-asyncio type-bug An unexpected behavior, bug, or error

Comments

@RichardKiss
Mannequin

RichardKiss mannequin commented Apr 5, 2014

BPO 21163
Nosy @gvanrossum, @pitrou, @vstinner, @giampaolo, @1st1, @gvanrossum, @mpaolini
Files
  • asyncio-gc-issue.py
  • log_destroyed_pending_task.patch
  • dont_log_pending.patch
  • test2.py: script showing gc collecting unreferenced asyncio tasks
  • issue_22163_patch_0.diff: hold references to all waited futures in task
  • test3.py: script showing real-life-like example gc collecting unreferenced asyncio tasks
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2014-07-16.16:57:08.632>
    created_at = <Date 2014-04-05.21:10:31.905>
    labels = ['type-bug', 'library', 'expert-asyncio']
    title = "asyncio doesn't warn if a task is destroyed during its execution"
    updated_at = <Date 2014-09-20.21:15:28.592>
    user = 'https://bugs.python.org/richardkiss'

    bugs.python.org fields:

    activity = <Date 2014-09-20.21:15:28.592>
    actor = 'gvanrossum'
    assignee = 'none'
    closed = True
    closed_date = <Date 2014-07-16.16:57:08.632>
    closer = 'vstinner'
    components = ['Library (Lib)', 'asyncio']
    creation = <Date 2014-04-05.21:10:31.905>
    creator = 'richard.kiss'
    dependencies = []
    files = ['34741', '35691', '35781', '36405', '36408', '36413']
    hgrepos = []
    issue_num = 21163
    keywords = ['patch']
    message_count = 37.0
    messages = ['215633', '215638', '215639', '215640', '215642', '215646', '215647', '220986', '221010', '221096', '221375', '221387', '221470', '221495', '221503', '221508', '221573', '221574', '221578', '221580', '221956', '221957', '222699', '223230', '223232', '223233', '225493', '225497', '225498', '225499', '225500', '225504', '225520', '225522', '227173', '227176', '227177']
    nosy_count = 10.0
    nosy_names = ['gvanrossum', 'pitrou', 'vstinner', 'giampaolo.rodola', 'python-dev', 'yselivanov', 'Guido.van.Rossum', 'richard.kiss', 'Richard.Kiss', 'mpaolini']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue21163'
    versions = ['Python 3.4', 'Python 3.5']

    @RichardKiss
    Mannequin Author

    RichardKiss mannequin commented Apr 5, 2014

    Some tasks created via asyncio are vanishing because there is no reference to their resultant futures.

    This behaviour does not occur in Python 3.3.3 with asyncio-0.4.1.

    Also, doing a gc.collect() immediately after creating the tasks seems to fix the problem.

    Attachment also available at https://gist.github.com/richardkiss/9988156

    @RichardKiss RichardKiss mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Apr 5, 2014
    @RichardKiss RichardKiss mannequin changed the title asyncio Task Possibly Incorrectly Garbage Collected asyncio task possibly incorrectly garbage collected Apr 5, 2014
    @gvanrossum
    Member

    Ouch. That example is very obfuscated -- I fail to understand what it is trying to accomplish. Running it I see that it always prints 100 for the count with 3.3 or DO_GC on; for me it prints 87 with 3.4 and DO_GC off. But I wouldn't be surprised if the reason calling gc.collect() "fixes" whatever issue you have is that it causes there not to be any further collections until the next cycle through that main loop, and everything simply runs before being collected. But that's just a theory based on minimal understanding of the example.

    I'm not sure that tasks are *supposed* to stay alive when there are no references to them. It seems that once they make it past a certain point they keep themselves alive.

    One more thing: using a try/except I see that the "lost" consumers all get a GeneratorExit on their first iteration. You might want to look into this. (Sorry, gotta run, wanted to dump this.)

    @RichardKiss
    Mannequin Author

    RichardKiss mannequin commented Apr 5, 2014

    I agree it's confusing and I apologize for that.

    Background:

    This multiplexing pattern is used in pycoinnet, a bitcoin client I'm developing at <https://github.com/richardkiss/pycoinnet>. The BitcoinPeerProtocol class multiplexes protocol messages into multiple asyncio.Queue objects so each interested listener can react. An example listener is in pycoinnet.helpers.standards.install_pong_manager, which looks for "ping" messages and sends "pong" responses. When the peer disconnects, the pong manager sees a None (to indicate EOF), and it exits. The return value is uninteresting, so no reference to the Task is kept.

    My client is in late alpha, and mostly works, but when I tried it on Python 3.4.0, it stopped working and I narrowed it down to this.

    I'm not certain this behaviour is incorrect, but it's definitely different from 3.3.3, and it seems odd that a GC cycle BEFORE additional references can be made would allow it to work.

    @gvanrossum
    Member

    Most likely your program is simply relying on undefined behavior and the right way to fix it is to keep strong references to all tasks until they self-destruct.

    @RichardKiss
    Mannequin Author

    RichardKiss mannequin commented Apr 5, 2014

    I'll investigate further.

    @RichardKiss
    Mannequin Author

    RichardKiss mannequin commented Apr 6, 2014

    You were right: adding a strong reference to each Task seems to have solved the original problem in pycoinnet. I see that the reference to the global lists of asyncio.tasks is a weakset, so it's necessary to keep a strong reference myself.

    This does seem a little surprising. It can make it trickier to create a task that is only important in its side effect. Compare to threaded programming: unreferenced threads are never collected.

    For example:

    f = asyncio.Task(some_coroutine())
    f.add_done_callback(some_completion_callback)
    f = None

    In theory, the "some_coroutine" task can be collected, preventing "some_completion_callback" from ever being called. While technically correct, it does seem surprising.

    (I couldn't get this to work in a simple example, although I did get it to work in a complicated example.)

    Some change between 3.3 and 3.4 means garbage collection is much more aggressive at collecting up unreferenced tasks, which means broken code, like mine, that worked in 3.3 fails in 3.4. This may trip up other early adopters of tulip.

    Maybe adding a "do_not_collect=True" flag to asyncio.async or asyncio.Task, which would keep a strong reference by throwing it into a singleton set (removing it as a future callback) would bring attention to this subtle issue. Or displaying a warning in debug mode when a Task is garbage-collected before finishing.

    Thanks for your help.

    @RichardKiss RichardKiss mannequin added the invalid label Apr 6, 2014
    @gvanrossum
    Member

    Thanks for understanding.

    It's definitely subtle: there is also some code in asyncio that logs an error when a Future holding an exception becomes unreachable before anyone has asked for it; this has been a valuable debugging tool, and it depends on *not* holding references to Futures.

    Regarding the difference between Python 3.3 and 3.4, I don't know the exact reason, but GC definitely gets more precise with each new revision, and there are also some differences in how exactly the debug feature I just mentioned is implemented (look for _tb_logger in asyncio/futures.py).

    OTOH it does seem a little odd to just GC a coroutine that hasn't exited yet, and I'm not 100% convinced there *isn't* a bug here. The more I think about it, the more I think that it's suspicious that it's always the *first* iteration that gets GC'ed. So I'd like to keep this open as a reminder.

    Finally, I'm not sure the analogy with threads holds -- a thread manages OS resources that really do have to be destroyed explicitly, but that's not so for tasks, and any cleanup you do need can be handled using try/finally. (In general drawing analogies between threads and asyncio tasks/coroutines is probably one of the less fruitful lines of thinking about the latter; there are more differences than similarities.)
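
A small sketch of the try/finally point, assuming a modern Python with `asyncio.run` and `asyncio.create_task`; `worker` and `main` are illustrative names. It uses explicit cancellation, which is the easiest way to show the `finally` block running when a task ends early:

```python
import asyncio

async def worker(events):
    try:
        await asyncio.sleep(3600)   # stand-in for long-running work
    finally:
        events.append("cleanup")    # runs on normal exit and on cancellation

async def main():
    events = []
    task = asyncio.create_task(worker(events))
    await asyncio.sleep(0)          # let the worker start
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        events.append("cancelled")
    return events

events = asyncio.run(main())
```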

    @vstinner
    Member

    Ok, I agree that this issue is very tricky :-)

    The first problem in asyncio-gc-issue.py is that the producer keeps only *weak* references to the Queue objects, so the Queue objects are quickly destroyed, especially if gc.collect() is called explicitly.

    When "yield from queue.get()" is used in a task, the task is paused. The queue creates a Future object and the task registers its _wakeup() method into the Future object.

    When the queue object is destroyed, the internal future object (used by the get() method) is destroyed too. The last reference to the task was in this future object. As a consequence, the task is also destroyed.
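
The chain described above (queue -> future -> task) can be reproduced in a few lines. This is a sketch only, assuming CPython and a modern asyncio (`asyncio.run` and `asyncio.create_task` postdate this thread); `consumer` and `main` are illustrative names, and a bare future stands in for the one that `Queue.get()` creates internally:

```python
import asyncio
import gc
import weakref

async def consumer(fut):
    await fut  # nothing will ever complete this future

async def main():
    loop = asyncio.get_running_loop()
    messages = []
    # Record only the message text; storing the whole context would
    # keep a reference to the task and resurrect it.
    loop.set_exception_handler(lambda _loop, ctx: messages.append(ctx["message"]))

    fut = loop.create_future()
    task = asyncio.create_task(consumer(fut))
    await asyncio.sleep(0)        # let the consumer run up to "await fut"
    task_ref = weakref.ref(task)

    del task, fut                 # drop the only outside references
    gc.collect()                  # the task/future cycle is now unreachable
    return task_ref() is None, messages

collected, messages = asyncio.run(main())
print(collected, messages)
```

On CPython this is expected to report that the task was collected and that the exception handler received a "Task was destroyed but it is pending!" message, but the exact behaviour depends on the interpreter and asyncio version.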

    While there is a bug in asyncio-gc-issue.py, it's very tricky to understand it and I think that asyncio should help developers to detect such bugs.

    I propose the attached patch, which emits a warning if a task is destroyed while it is not done (its status is still PENDING). I wrote a unit test which is much simpler than asyncio-gc-issue.py. Read the test to understand the issue. I added many comments to explain the state.

    --

    My patch was written for Python 3.4+: it adds a destructor to the Task class, and we cannot add a destructor in Future objects because these objects are likely to be part of reference cycles. See the following issue which proposes a fix:
    https://code.google.com/p/tulip/issues/detail?id=155

    With this fix for reference cycles, it may also be possible to emit the log in Tulip (Python 3.3).

    @vstinner vstinner changed the title asyncio task possibly incorrectly garbage collected asyncio doesn't warn if a task is destroyed during its execution Jun 19, 2014
    @RichardKiss
    Mannequin Author

    RichardKiss mannequin commented Jun 19, 2014

    The more I use asyncio, the more I am convinced that the correct fix is to keep a strong reference to a pending task (perhaps in a set in the eventloop) until it starts.

    Without realizing it, I implicitly made this assumption when I began working on my asyncio project (a bitcoin node) in Python 3.3. I think it may be a common assumption for users. Ask around. I can say that it made the transition to Python 3.4 very puzzling.

    In several cases, I've needed to create a task where the side effects are important but the result is not. Sometimes this task is created in another task which may complete before its child task begins, which means there is no natural place to store a reference to this task. (Goofy workaround: wait for child to finish.)

    @vstinner
    Member

    > The more I use asyncio, the more I am convinced that the correct fix is to keep a strong reference to a pending task (perhaps in a set in the eventloop) until it starts.

    The problem is not the task; please read my message again. The problem is that nobody holds a strong reference to the Queue, whereas the producer is supposed to fill this queue, and the task is waiting for it.

    I cannot suggest how to fix your example; it depends on what you want to do.

    > Without realizing it, I implicitly made this assumption when I began working on my asyncio project (a bitcoin node) in Python 3.3. I think it may be a common assumption for users. Ask around. I can say that it made the transition to Python 3.4 very puzzling.

    Sorry, I don't understand the relation between this issue and the Python version (3.3 vs 3.4). Do you mean that Python 3.4 behaves differently?

    The garbage collection of Python 3.4 has been *improved*. Python 3.4 is able to break more reference cycles.

    If your program no longer runs on Python 3.4, it may mean that you rely on a reference cycle, which sounds very silly.

    > In several cases, I've needed to create a task where the side effects are important but the result is not. Sometimes this task is created in another task which may complete before its child task begins, which means there is no natural place to store a reference to this task. (Goofy workaround: wait for child to finish.)

    I'm not sure that this is the same issue. If you think so, could you please write a short example showing the problem?

    @vstinner
    Member

    @guido, @yury: What do you think of log_destroyed_pending_task.patch? Does it sound correct?

    Or would you prefer to automatically keep a strong reference somewhere and then break the strong reference when the task is done? Such an approach sounds error-prone :)

    @1st1
    Member

    1st1 commented Jun 24, 2014

    > @guido, @yury: What do you think of log_destroyed_pending_task.patch? Does it sound correct?

    Premature task garbage collection is indeed hard to debug. But at least, with your patch, one gets an exception and has a chance to track the bug down. So I'm +1 for the patch.

    As for having strong references to tasks: it may have its own downsides, such as hard-to-debug memory leaks. I'd rather have my program crash, and/or have your patch report the problem to me, than search for some obscure code that eats all server memory once a week. I think we need to collect more evidence that the problem is common & annoying before making any decisions on this topic, as that's something that will be hard to revert. Hence I'm -1 for strong references.

    @gvanrossum
    Member

    Patch looks good. Go ahead.

    @RichardKiss
    Mannequin Author

    RichardKiss mannequin commented Jun 24, 2014

    I reread more carefully, and I am in agreement now that I better understand what's going on. Thanks for your patience.

    @vstinner
    Member

    I committed my change in Tulip (78dc74d4e8e6), Python 3.4 and 3.5:

    changeset: 91359:978525270264
    branch: 3.4
    parent: 91357:a941bb617c2a
    user: Victor Stinner <victor.stinner@gmail.com>
    date: Tue Jun 24 22:37:53 2014 +0200
    files: Lib/asyncio/futures.py Lib/asyncio/tasks.py Lib/test/test_asyncio/test_base_events.py Lib/test/test_asyncio/test_tasks.py
    description:
    asyncio: Log an error if a Task is destroyed while it is still pending

    changeset: 91360:e1d81c32f13d
    parent: 91358:3fa0d2b297c6
    parent: 91359:978525270264
    user: Victor Stinner <victor.stinner@gmail.com>
    date: Tue Jun 24 22:38:31 2014 +0200
    files: Lib/test/test_asyncio/test_tasks.py
    description:
    (Merge 3.4) asyncio: Log an error if a Task is destroyed while it is still pending

    @vstinner
    Member

    The new check emits a lot of "Task was destroyed but it is pending!" messages when running test_asyncio. I'm keeping the issue open to remind me that I have to fix them.

    @python-dev
    Mannequin

    python-dev mannequin commented Jun 25, 2014

    New changeset 1088023d971c by Victor Stinner in branch '3.4':
    Issue bpo-21163, asyncio: Fix some "Task was destroyed but it is pending!" logs in tests
    http://hg.python.org/cpython/rev/1088023d971c

    New changeset 7877aab90c61 by Victor Stinner in branch 'default':
    (Merge 3.4) Issue bpo-21163, asyncio: Fix some "Task was destroyed but it is
    http://hg.python.org/cpython/rev/7877aab90c61

    @python-dev
    Mannequin

    python-dev mannequin commented Jun 25, 2014

    New changeset e9150fdf068a by Victor Stinner in branch '3.4':
    asyncio: sync with Tulip
    http://hg.python.org/cpython/rev/e9150fdf068a

    New changeset d92dc4462d26 by Victor Stinner in branch 'default':
    (Merge 3.4) asyncio: sync with Tulip
    http://hg.python.org/cpython/rev/d92dc4462d26

    @python-dev
    Mannequin

    python-dev mannequin commented Jun 25, 2014

    New changeset 4e4c6e2ed0c5 by Victor Stinner in branch '3.4':
    Issue bpo-21163: Fix one more "Task was destroyed but it is pending!" log in tests
    http://hg.python.org/cpython/rev/4e4c6e2ed0c5

    New changeset 24282c6f6019 by Victor Stinner in branch 'default':
    (Merge 3.4) Issue bpo-21163: Fix one more "Task was destroyed but it is pending!"
    http://hg.python.org/cpython/rev/24282c6f6019

    @vstinner
    Member

    I fixed the first "Task was destroyed but it is pending!" messages when the fix was simple.

    The attached dont_log_pending.patch fixes the remaining messages when running test_asyncio. I'm not sure yet that this patch is the best approach to fix the issue.

    Modified functions with example of related tests:

    • BaseEventLoop.run_until_complete(): don't log, because the method already raises an exception if the future didn't complete ("Event loop stopped before Future completed.")

    => related test: test_run_until_complete_stopped() of test_events.py

    • wait(): don't log, because the caller doesn't have control over the internal sub-tasks, and the task executing wait() will already emit a message if it is destroyed before it has completed

    => related test: test_wait_errors() of test_tasks.py

    • gather(): same rationale as for wait()

    => related test: test_one_exception() of test_tasks.py

    • test_utils.run_briefly(): the caller doesn't have access to the task, and the function is a best-effort approach; it doesn't have to guarantee that running one step of the event loop is enough to execute all pending callbacks

    => related test: test_baseexception_during_cancel() of test_tasks.py

    @vstinner
    Member

    Hum, dont_log_pending.patch is not correct for wait(): wait() returns (done, pending), where pending is a set of pending tasks. So it's still possible that pending tasks are destroyed while they are still pending, after the end of wait(). The log should not be silenced here.
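
To illustrate why the caller still owns the pending set, here is a minimal sketch assuming a modern Python (`asyncio.run`, `asyncio.create_task`); `job` and `main` are illustrative names. After `wait()` returns with a timeout, the caller cancels the pending tasks and waits for the cancellation to finish, so none of them is dropped while still pending:

```python
import asyncio

async def job(delay, name):
    await asyncio.sleep(delay)
    return name

async def main():
    tasks = [asyncio.create_task(job(0.01, "fast")),
             asyncio.create_task(job(3600, "slow"))]
    done, pending = await asyncio.wait(tasks, timeout=0.2)
    # The caller owns the pending tasks: cancel them, then wait for
    # the cancellation to be processed.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return sorted(t.result() for t in done), len(pending)

finished, n_pending = asyncio.run(main())
```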

    @python-dev
    Mannequin

    python-dev mannequin commented Jun 30, 2014

    New changeset 13e78b9cf290 by Victor Stinner in branch '3.4':
    Issue bpo-21163: BaseEventLoop.run_until_complete() and test_utils.run_briefly()
    http://hg.python.org/cpython/rev/13e78b9cf290

    New changeset 2d0fa8f383c8 by Victor Stinner in branch 'default':
    (Merge 3.4) Issue bpo-21163: BaseEventLoop.run_until_complete() and
    http://hg.python.org/cpython/rev/2d0fa8f383c8

    @python-dev
    Mannequin

    python-dev mannequin commented Jul 10, 2014

    New changeset f13cde63ca73 by Victor Stinner in branch '3.4':
    asyncio: sync with Tulip
    http://hg.python.org/cpython/rev/f13cde63ca73

    New changeset a67adfaf670b by Victor Stinner in branch 'default':
    (Merge 3.4) asyncio: sync with Tulip
    http://hg.python.org/cpython/rev/a67adfaf670b

    @python-dev
    Mannequin

    python-dev mannequin commented Jul 16, 2014

    New changeset 6d5a76214166 by Victor Stinner in branch '3.4':
    Issue bpo-21163, asyncio: Ignore "destroy pending task" warnings for private tasks
    http://hg.python.org/cpython/rev/6d5a76214166

    New changeset fbd3e9f635b6 by Victor Stinner in branch 'default':
    (Merge 3.4) Issue bpo-21163, asyncio: Ignore "destroy pending task" warnings for
    http://hg.python.org/cpython/rev/fbd3e9f635b6

    @python-dev
    Mannequin

    python-dev mannequin commented Jul 16, 2014

    New changeset e4fe6706b7b4 by Victor Stinner in branch '3.4':
    Issue bpo-21163: Fix "destroy pending task" warning in test_wait_errors()
    http://hg.python.org/cpython/rev/e4fe6706b7b4

    New changeset a627b23f57d4 by Victor Stinner in branch 'default':
    (Merge 3.4) Issue bpo-21163: Fix "destroy pending task" warning in test_wait_errors()
    http://hg.python.org/cpython/rev/a627b23f57d4

    @vstinner
    Member

    Ok, I fixed the last warnings emitted by unit tests run in debug mode. I'm closing the issue.

    @mpaolini
    Mannequin

    mpaolini mannequin commented Aug 18, 2014

    I finally wrapped my head around this. I wrote a (simpler) script to get a better picture.

    What happens
    -------------

    When a consumer task is first instantiated, the loop holds a strong reference to it (_ready).

    Later on, as the loop starts, the consumer task yields and waits on an unreachable future. The last strong ref to it (loop._ready) is lost.

    It is not collected immediately because it just created a reference loop
    (task -> coroutine -> stack -> future -> task) that will be broken only at task completion.

    gc.collect() called *before* the tasks are ever run has the weird side effect of moving the automatic gc collection forward in time.
    Automatic gc triggers after a few (but not all) consumers have become unreachable, depending on how many instructions were executed before running the loop.

    gc.collect() called after all the consumers are waiting on the unreachable future reaps all consumer tasks as expected. No bug in garbage collection.

    Yielding from asyncio.sleep() prevents the consumers from being
    collected: it creates a strong ref to the future in the loop.
    I suspect also all network-related asyncio coroutines behave this way.
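
The asyncio.sleep() case can be sketched like this, assuming CPython and a modern asyncio (`asyncio.run`, `asyncio.create_task`); `sleeper` and `main` are illustrative names. No reference to the task is kept, yet it survives a forced collection, because the timer scheduled in the loop still reaches the future and, through its callback, the task. That reachability is an implementation detail, not a documented guarantee:

```python
import asyncio
import gc

async def sleeper(log):
    await asyncio.sleep(0.01)
    log.append("woke up")

async def main():
    log = []
    asyncio.create_task(sleeper(log))   # no reference kept, on purpose
    await asyncio.sleep(0)              # let the task reach its sleep
    gc.collect()                        # the loop's timer still reaches the task
    await asyncio.sleep(0.05)
    return log

log = asyncio.run(main())
```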

    Summing up: Tasks that have no strong refs may be garbage collected unexpectedly or not at all, depending on which future they yield to. It is very difficult to debug and understand why these tasks disappear.

    Side note: the patches submitted and merged in this issue do emit the relevant warnings when PYTHONASYNCIODEBUG is set. This is very useful.

    Proposed enhancements
    ---------------------

    1. Document that you should always keep strong refs to tasks or to the futures/coroutines the tasks yield from. This knowledge is currently passed around among the brave asyncio users like oral tradition.

    2. Alternatively, keep strong references to all futures that make it through Task._step. We are already keeping strong refs to some of the asyncio builtin coroutines (asyncio.sleep is one of those). Also, we do keep strong references to tasks that are ready to be run (the ones that simply yield or the ones that have not started yet).

    If you also think 1. or 2. are needed, let me know and I'll try to cook up a patch.

    Sorry for the noise

    @gvanrossum
    Member

    I'm all in favor of documenting that you must keep a strong reference to a
    task that you want to keep alive. I'm not keen on automatically keeping all
    tasks alive; that might exacerbate leaks (which are by definition hard to
    find) in existing programs.

    @mpaolini
    Mannequin

    mpaolini mannequin commented Aug 18, 2014

    Asking the user to manage strong refs is just passing the potential
    leak issue outside of the standard library. It doesn't really solve anything.

    If the user gets the strong refs wrong he can either lose tasks or
    leak memory.

    If the standard library gets it right, both issues are avoided.

    > I'm all in favor of documenting that you must keep a strong reference to a
    > task that you want to keep alive. I'm not keen on automatically keeping all
    > tasks alive; that might exacerbate leaks (which are by definition hard to
    > find) in existing programs.

    @gvanrossum
    Member

    So you are changing your mind and withdrawing your option #1.

    I don't have the time to really dig deeply into the example app and what's
    going on. If you want to help, you can try to come up with a patch (and it
    should have good unit tests).

    I'll be on vacation most of this week.

    @mpaolini
    Mannequin

    mpaolini mannequin commented Aug 18, 2014

    > So you are changing your mind and withdrawing your option #1.

    I think option #1 (tell users to keep strong refs to tasks) is
    OK but option #2 is better.

    Yes, I changed my mind ;)

    @mpaolini
    Mannequin

    mpaolini mannequin commented Aug 18, 2014

    Submitted a first stab at #2. Let me know what you think.

    If this works we'll have to remove the test_gc_pending test and then maybe even the code that now logs errors when a pending task is gc'ed

    @vstinner
    Member

    I don't understand how keeping a strong reference would fix anything. You
    only provided one example (async-gc-bug.py) which uses Queue objects but
    keeps weak references to them. Keeping strong references to tasks is not the
    right fix. You must keep strong references to queues. If a queue is
    destroyed, how can you put an item into it? Otherwise, the task will wait
    forever. Keeping a strong reference to the task just hides the bug. Or I
    missed something.

    I dislike the idea of keeping strong references to tasks, it may create
    even more reference cycles. We already have too many cycles with exceptions
    stored in futures (in tasks).

    The current unit test uses low level functions to remove a variable using a
    frame object. Can you provide an example which shows the bug without using
    low level functions?

    @mpaolini
    Mannequin

    mpaolini mannequin commented Aug 19, 2014

    > I don't understand how keeping a strong reference would fix anything. You
    > only provided one example (async-gc-bug.py) which uses Queue objects but
    > keeps weak references to them. Keeping strong references to tasks is not the
    > right fix. You must keep strong references to queues. If a queue is
    > destroyed, how can you put an item into it? Otherwise, the task will wait
    > forever. Keeping a strong reference to the task just hides the bug. Or I
    > missed something.

    The original asyncio-gc-issue.py wasn't written by me, and yes, as you say, it does have the reference bug you describe. I argue that this bug shouldn't cause tasks to die: it should rather limit (as gc proceeds) the number of queues available to the producer in the WeakSet(), while leaving alive all consumers waiting on an unreachable queue.

    Please look at my test2.py or even better test3.py for a simpler example.

    Note that in my issue_22163_patch_0.diff I only keep strong refs to futures a task is waiting on. Just as asyncio is already doing with asyncio.sleep() coroutine.

    > I dislike the idea of keeping strong references to tasks, it may create
    > even more reference cycles. We already have too many cycles with exceptions
    > stored in futures (in tasks).

    We are also already keeping strong refs to futures like asyncio.sleep.

    I dislike the idea of randomly losing tasks.

    I also dislike the idea of forcing users to manage strong refs to their tasks. All 3rd party libraries will have to invent their own way, and it will lead to even more leaks/cycles that are very hard to debug.

    Not just exceptions: every time a task yields on a future, asyncio creates a reference cycle.

    > The current unit test uses low level functions to remove a variable using a
    > frame object. Can you provide an example which shows the bug without using
    > low level functions?

    My issue_22163_patch_0.diff only clears references by setting variables to None. No low level stuff needed.

    My test2.py example script also doesn't use any low level stuff

    I just uploaded test3.py with a simpler (and possibly more realistic) example.

    @mpaolini
    Mannequin

    mpaolini mannequin commented Sep 20, 2014

    Sorry for keeping this alive.

    Take a look at the wait_for.py just submitted in the unrelated bpo-22448: no strong refs to the tasks are kept. Tasks remain alive only because they are timers and the event loop keeps a strong ref to them.

    Do you think my proposed patch is OK? Should I open a new issue?

    @gvanrossum
    Member

    I'm not sure how that wait_for.py example from bpo-2116 relates to this issue -- it seems to demonstrate the opposite problem (tasks are kept alive even though they are cancelled).

    Then again I admit I haven't looked deeply into the example (though I am sympathetic with the issue it purports to demonstrate).

    @gvanrossum
    Member

    (Whoops meant to link to bpo-22448.)

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022