task referencer #18914
Pull Request Test Coverage Report for Build 12483476002 (Coveralls) |
I like this idea as a |
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
Conflicts have been resolved. A maintainer will review the pull request shortly. |
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
Conflicts have been resolved. A maintainer will review the pull request shortly. |
I'm not excited about keeping this task list global, but I understand it's still an improvement over keeping hidden references in the asyncio scheduler as we do today.
One concern I have is that, for tasks that we currently handle correctly, this would unnecessarily extend their lifetime until the next culling. That is, tasks that we create, hold a reference to, and then await or gather() will still be kept alive longer than they are today.
Seeing that you wrapped create_task(), I at first expected that you'd wrap the Task object itself as well, so that you could hook into it being awaited and immediately remove it from the task referencer.
drafting so we can discuss the raised ideas before a merge happens.
i agree that it will keep a completed task object alive longer. if the task object keeps something else significant alive because of this, that could indeed cause trouble. i think the most likely case is an exception result that references other problematic objects.
it appears that the task object already provides a solution to this, so perhaps there would be no need to wrap it: https://docs.python.org/3/library/asyncio-task.html#asyncio.Task.add_done_callback
the cost would be more frequent single-item culling, which would be less efficient with the existing list. it looks like task objects are hashable, and i don't recall at the moment whether there was another reason i didn't use a set. switching the list to a set should alleviate the cost of more frequent, smaller culling. |
we would still need to look through the tasks and remove the ones that are done though, right? To simulate "detach", which seems to be how we commonly use new tasks |
i expected that somewhere around the time a task changes to the done state it would call the done callback, where we would remove it from the dict. are you concerned that the done callback relates to a task being awaited? i believe that state is independent and represents the completion of execution, not the awaiting of a result.

Python 3.12.7 (main, Nov 20 2024, 20:26:30) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import asyncio
>>> async def af():
...     await asyncio.sleep(0.1)
...
>>> async def main():
...     task = asyncio.create_task(af())
...     task.add_done_callback(print)
...     await asyncio.sleep(5)
...     print(task.done())
...     await task
...
>>> asyncio.run(main())
<Task finished name='Task-2' coro=<af() done, defined at <stdin>:1> result=None>
True

there's still room for a bug where a task somehow finishes without triggering the callback, or where the callback somehow fails to remove it properly, i suppose. in that case an occasional explicit culling would catch the tasks that fell through the cracks. i had decided not to leave such behavior in, but it could be added back. |
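A rough sketch of what such a fallback culling pass could look like, assuming the references live in a plain set. `cull_done` and `demo` are hypothetical names and are not taken from the PR.

```python
import asyncio
from typing import Any


def cull_done(tasks: set[asyncio.Task[Any]]) -> int:
    """Remove finished tasks from ``tasks`` in place; return how many were removed."""
    finished = {task for task in tasks if task.done()}
    tasks.difference_update(finished)
    return len(finished)


async def demo() -> tuple[int, int]:
    async def quick() -> None:
        return None

    async def slow() -> None:
        await asyncio.sleep(1)

    tasks = {asyncio.create_task(quick()), asyncio.create_task(slow())}
    # Let the quick task finish while the slow one is still pending.
    await asyncio.sleep(0.01)
    removed = cull_done(tasks)
    remaining = len(tasks)
    # Clean up the still-pending task so the event loop can exit promptly.
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return removed, remaining
```

A pass like this only needs to run occasionally, since under this assumption the done callback handles the normal case and the cull merely catches anything that was missed.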
oh, I see. I assumed the callback would only be called when the task was awaited. but if it isn't, is the callback called on the same task that's done? and I imagine the remaining lifespan of the task is so short that it doesn't matter that we drop the reference to it from within itself. |
the callback is sync and couldn't be injected into any existing task afaik. also note that the callback is executed when the task is already marked as done. that could still leave us dropping our reference before the task is awaited, but if it is going to be awaited by something else then that other thing must also have a reference that would keep it alive.

>>> import asyncio
>>> async def af():
...     print(f"in task: {asyncio.current_task()}")
...     await asyncio.sleep(0.1)
...
>>> async def main():
...     print(f"main task: {asyncio.current_task()}")
...     task = asyncio.create_task(af())
...     task.add_done_callback(lambda task: print(f"in done callback: {task.done()} {asyncio.current_task()}"))
...     await asyncio.sleep(5)
...     print(task.done())
...     await task
...
>>> asyncio.run(main())
main task: <Task pending name='Task-5' coro=<main() running at <stdin>:2> cb=[_run_until_complete_cb() at /home/altendky/.pyenv/versions/3.12.7/lib/python3.12/asyncio/base_events.py:182]>
in task: <Task pending name='Task-6' coro=<af() running at <stdin>:2> cb=[main.<locals>.<lambda>() at <stdin>:4]>
in done callback: True None
True |