Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

task referencer #18914

Open
wants to merge 17 commits into
base: main
Choose a base branch
from
Open

task referencer #18914

wants to merge 17 commits into from

Conversation

altendky
Copy link
Contributor

@altendky altendky commented Nov 20, 2024

Purpose:

@dataclasses.dataclass
class _TaskReferencer:
    """Holds strong references to tasks until they are done.  This compensates for
    asyncio holding only weak references.  This should be replaced by patterns using
    task groups such as from anyio.
    """

Current Behavior:

New Behavior:

Testing Notes:

benchmark runs:

Draft For:

@altendky altendky added Changed Required label for PR that categorizes merge commit message as "Changed" for changelog Exclude_Notes Use this label if the changes in the PR should be excluded from the release notes labels Nov 20, 2024
@altendky altendky changed the title task pit task referencer Nov 20, 2024
Copy link

coveralls-official bot commented Nov 22, 2024

Pull Request Test Coverage Report for Build 12483476002

Details

  • 162 of 180 (90.0%) changed or added relevant lines in 39 files are covered.
  • 55 unchanged lines in 9 files lost coverage.
  • Overall coverage decreased (-0.02%) to 91.529%

Changes Missing Coverage Covered Lines Changed/Added Lines %
chia/daemon/server.py 4 5 80.0%
chia/farmer/farmer.py 5 6 83.33%
chia/seeder/crawler.py 2 3 66.67%
chia/server/server.py 3 4 75.0%
chia/util/beta_metrics.py 1 2 50.0%
chia/full_node/full_node.py 14 16 87.5%
chia/timelord/timelord.py 3 5 60.0%
chia/server/ws_connection.py 10 13 76.92%
chia/util/task_referencer.py 25 28 89.29%
chia/wallet/wallet_node.py 6 9 66.67%
Files with Coverage Reduction New Missed Lines %
chia/data_layer/data_store.py 1 95.47%
chia/server/node_discovery.py 1 81.87%
chia/daemon/server.py 1 84.3%
chia/wallet/wallet_node.py 1 88.32%
chia/timelord/timelord_launcher.py 4 89.29%
chia/server/address_manager.py 5 90.48%
chia/full_node/full_node.py 7 86.09%
chia/timelord/timelord.py 11 78.82%
chia/_tests/core/util/test_lockfile.py 24 77.31%
Totals Coverage Status
Change from base Build 12475392303: -0.02%
Covered Lines: 105309
Relevant Lines: 114878

💛 - Coveralls

@altendky altendky marked this pull request as ready for review November 25, 2024 20:21
@altendky altendky requested a review from a team as a code owner November 25, 2024 20:21
pytest.ini Outdated Show resolved Hide resolved
ruff.toml Outdated Show resolved Hide resolved
@emlowe
Copy link
Contributor

emlowe commented Nov 26, 2024

I like this idea as a for now thing - I would like @arvidn to consider it though

@github-actions github-actions bot added the merge_conflict Branch has conflicts that prevent merge to main label Dec 11, 2024
Copy link
Contributor

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@github-actions github-actions bot removed merge_conflict Branch has conflicts that prevent merge to main labels Dec 11, 2024
Copy link
Contributor

Conflicts have been resolved. A maintainer will review the pull request shortly.

1 similar comment
Copy link
Contributor

Conflicts have been resolved. A maintainer will review the pull request shortly.

Copy link
Contributor

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@github-actions github-actions bot added the merge_conflict Branch has conflicts that prevent merge to main label Dec 11, 2024
@github-actions github-actions bot removed the merge_conflict Branch has conflicts that prevent merge to main label Dec 11, 2024
Copy link
Contributor

Conflicts have been resolved. A maintainer will review the pull request shortly.

@altendky altendky closed this Dec 18, 2024
@altendky altendky reopened this Dec 18, 2024
arvidn
arvidn previously approved these changes Dec 19, 2024
Copy link
Contributor

@arvidn arvidn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not excited about keeping this taks list global, but I understand it's still an improvement over keeping hidden references in the asyncio scheduler as we do today.

One concern I have is that, for tasks that we currently handle correctly, it seems this would unnecessarily extend their lifetime until the next culling. i.e. tasks that we create and hold a reference to, and then await or gather(), will still be kept alive longer than they would today.

Seeing that you wrapped create_task(), at first I also expected that you'd wrap the Task object itself as well, just to be able to hook on it being awaited, and immediately remove it from the taask referencer.

@altendky altendky marked this pull request as draft December 19, 2024 15:04
@altendky
Copy link
Contributor Author

drafting so we can discuss the raised ideas before a merge happens.

i agree that it will keep a completed task object alive longer. if we have some situation where the task object keeps something else significant alive because of this that could indeed cause trouble. i think the most likely case for that is an exception result that references other problematic objects.

it appears that the task object already provides a solution to this so perhaps there would be no need to wrap it. https://docs.python.org/3/library/asyncio-task.html#asyncio.Task.add_done_callback the cost would be more frequent single-item culling which would be less efficient with the existing list. it looks like task objects are hashable. i don't recall at the moment if there was another reason i didn't use a set. switching the list to a set should alleviate the cost of more frequent and smaller culling.

@altendky altendky marked this pull request as ready for review December 19, 2024 17:09
@altendky altendky requested review from arvidn and emlowe December 19, 2024 17:09
@arvidn
Copy link
Contributor

arvidn commented Dec 20, 2024

we would still need to look through the tasks and remove the ones that are done though, right? To simulate "detach", which seems to be how we commonly use new tasks

@altendky
Copy link
Contributor Author

i expected that somewhere around the time a task changes to the done state that it would call the done callback where we would remove it from the dict. are you concerned that the done callback relates to a task being awaited? I believe that state is independent and represents the completion of execution, not the awaiting of a result.

Python 3.12.7 (main, Nov 20 2024, 20:26:30) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import asyncio
>>> async def af():
...     await asyncio.sleep(0.1)
... 
>>> async def main():
...     task = asyncio.create_task(af())
...     task.add_done_callback(print)
...     await asyncio.sleep(5)
...     print(task.done())
...     await task
... 
>>> asyncio.run(main())
<Task finished name='Task-2' coro=<af() done, defined at <stdin>:1> result=None>
True

there's still room for a bug where a task somehow finishes without triggering the callback or the callback somehow failing to remove properly i suppose. then an occasionally explicit culling would catch those tasks that fell through the cracks. i had decided not to leave such behavior in, but it could be added back.

@arvidn
Copy link
Contributor

arvidn commented Dec 23, 2024

oh, I see. I assumed the callback would only be called when the task was awaited on. but if it isn't; is the callback called on the same task that's done? and I imagine the lifespan left of the task is so short it doesn't matter that we drop the reference to it from within itself.

@altendky
Copy link
Contributor Author

the callback is sync and couldn't be injected into any existing task afaik. also note that the callback is executed when the task is already marked as done. that could still leave us dropping our reference before the task is awaited, but if it is going to be awaited by something else then that other thing must also have a reference that would keep it alive.

>>> import asyncio
>>> async def af():
...     print(f"in task: {asyncio.current_task()}")
...     await asyncio.sleep(0.1)
... 
>>> async def main():
...     print(f"main task: {asyncio.current_task()}")
...     task = asyncio.create_task(af())
...     task.add_done_callback(lambda task: print(f"in done callback: {task.done()} {asyncio.current_task()}"))
...     await asyncio.sleep(5)
...     print(task.done())
...     await task
... 
>>> asyncio.run(main())
main task: <Task pending name='Task-5' coro=<main() running at <stdin>:2> cb=[_run_until_complete_cb() at /home/altendky/.pyenv/versions/3.12.7/lib/python3.12/asyncio/base_events.py:182]>
in task: <Task pending name='Task-6' coro=<af() running at <stdin>:2> cb=[main.<locals>.<lambda>() at <stdin>:4]>
in done callback: True None
True

@altendky altendky closed this Dec 24, 2024
@altendky altendky reopened this Dec 24, 2024
Copy link
Contributor

File Coverage Missing Lines
chia/daemon/server.py 80.0% lines 1060
chia/farmer/farmer.py 83.3% lines 202
chia/full_node/full_node.py 87.5% lines 288, 298
chia/seeder/crawler.py 66.7% lines 223
chia/server/server.py 75.0% lines 506
chia/server/ws_connection.py 76.9% lines 681, 728, 733
chia/timelord/timelord.py 60.0% lines 171, 1140
chia/util/beta_metrics.py 50.0% lines 88
chia/util/task_referencer.py 89.3% lines 23, 53-54
chia/wallet/wallet_node.py 66.7% lines 430, 433, 1242
Total Missing Coverage
180 lines 18 lines 90%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changed Required label for PR that categorizes merge commit message as "Changed" for changelog Exclude_Notes Use this label if the changes in the PR should be excluded from the release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants