Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RuntimeError: await wasn't used with future #7117

Closed
1 task done
DevilXD opened this issue Dec 4, 2022 · 46 comments · Fixed by #7785
Closed
1 task done

RuntimeError: await wasn't used with future #7117

DevilXD opened this issue Dec 4, 2022 · 46 comments · Fixed by #7785
Labels

Comments

@DevilXD
Copy link
Contributor

DevilXD commented Dec 4, 2022

Describe the bug

This is a strange one, as it's not something that has ever happened to me, but one of my users reported as an reoccurring issue. There's no real repro steps. It seems like it's something wrong with my code, but I can't pin-point exactly what it is, and the error makes zero sense. I don't know aiohttp or asyncio internals, so I'm making this bug report in hopes of finding out if it's really just my code, or is there something wrong with aiohttp itself.

To Reproduce

No steps as I said, but here are the two functions that are common to both tracebacks, right before it enters aiohttp internals:

# from twitch.py
@asynccontextmanager
async def request(
    self, method: str, url: URL | str, *, invalidate_after: datetime | None = None, **kwargs
) -> abc.AsyncIterator[aiohttp.ClientResponse]:
    session = await self.get_session()
    method = method.upper()
    if self.settings.proxy and "proxy" not in kwargs:
        kwargs["proxy"] = self.settings.proxy
    logger.debug(f"Request: ({method=}, {url=}, {kwargs=})")
    session_timeout = timedelta(seconds=session.timeout.total or 0)
    backoff = ExponentialBackoff(maximum=3*60)
    for delay in backoff:
        if self.gui.close_requested:
            raise ExitRequest()
        elif (
            invalidate_after is not None
            # account for the expiration landing during the request
            and datetime.now(timezone.utc) >= (invalidate_after - session_timeout)
        ):
            raise RequestInvalid()
        try:
            response: aiohttp.ClientResponse | None = None
            response = await self.gui.coro_unless_closed(
                session.request(method, url, **kwargs)
            )
            assert response is not None
            logger.debug(f"Response: {response.status}: {response}")
            if response.status < 500:
                # pre-read the response to avoid getting errors outside of the context manager
                raw_response = await response.read()  # noqa
                yield response
                return
            self.print(_("error", "site_down").format(seconds=round(delay)))
        except (aiohttp.ClientConnectionError, asyncio.TimeoutError):
            # just so that quick retries that often happen, aren't shown
            if backoff.steps > 1:
                self.print(_("error", "no_connection").format(seconds=round(delay)))
        finally:
            if response is not None:
                response.release()
        with suppress(asyncio.TimeoutError):
            await asyncio.wait_for(self.gui.wait_until_closed(), timeout=delay)

# from gui.py
async def coro_unless_closed(self, coro: abc.Awaitable[_T]) -> _T:
        # In Python 3.11, we need to explicitly wrap awaitables
        tasks = [asyncio.ensure_future(coro), asyncio.ensure_future(self._close_requested.wait())]
        done: set[asyncio.Task[Any]]
        pending: set[asyncio.Task[Any]]
        done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
        for task in pending:
            task.cancel()
        if self._close_requested.is_set():
            raise ExitRequest()
        return await next(iter(done))  # L1929

Expected behavior

Well, I expect no error.

Logs/tracebacks

Traceback (most recent call last):
  File "main.py", line 178, in <module>
  File "asyncio\base_events.py", line 646, in run_until_complete
  File "twitch.py", line 670, in run
  File "twitch.py", line 712, in _run
  File "twitch.py", line 1417, in fetch_inventory
  File "gui.py", line 1247, in add_campaign
  File "cache.py", line 97, in get
  File "contextlib.py", line 199, in __aenter__
  File "twitch.py", line 1258, in request
  File "gui.py", line 1861, in coro_unless_closed
  File "aiohttp\client.py", line 1125, in send
  File "aiohttp\client.py", line 536, in _request
  File "aiohttp\connector.py", line 540, in connect
  File "aiohttp\connector.py", line 901, in _create_connection
  File "aiohttp\connector.py", line 1175, in _create_direct_connection
  File "aiohttp\connector.py", line 980, in _wrap_create_connection
  File "asyncio\base_events.py", line 1049, in create_connection
  File "asyncio\base_events.py", line 960, in _connect_sock
  File "asyncio\proactor_events.py", line 705, in sock_connect
RuntimeError: await wasn't used with future


Traceback (most recent call last):
  File "utils.py", line 96, in wrapper
  File "twitch.py", line 928, in _watch_loop
  File "channel.py", line 343, in send_watch
  File "channel.py", line 230, in get_spade_url
  File "contextlib.py", line 199, in __aenter__
  File "twitch.py", line 1269, in request
  File "gui.py", line 1929, in coro_unless_closed
  File "aiohttp\client.py", line 1125, in send
  File "aiohttp\client.py", line 536, in _request
  File "aiohttp\connector.py", line 540, in connect
  File "aiohttp\connector.py", line 901, in _create_connection
  File "aiohttp\connector.py", line 1175, in _create_direct_connection
  File "aiohttp\connector.py", line 980, in _wrap_create_connection
  File "asyncio\base_events.py", line 1049, in create_connection
  File "asyncio\base_events.py", line 960, in _connect_sock
  File "asyncio\proactor_events.py", line 705, in sock_connect
RuntimeError: await wasn't used with future

Python Version

Theirs: 3.10.6 (where this happens)

Mine: 3.8.15 (where this doesn't happen)

In both cases, the code is frozen into an application via PyInstaller.

aiohttp Version

Version: 3.8.3 (both cases)

multidict Version

Version: 6.0.2 (both cases)

yarl Version

Version: 1.8.1 (both cases)

OS

Windows 10 x64 (theirs)

Windows 7 x64 (mine)

Related component

Client

Additional context

The entirety of the source code, for reference, is available here: https://github.com/DevilXD/TwitchDropsMiner

Ref issue: DevilXD/TwitchDropsMiner#80

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
@DevilXD DevilXD added the bug label Dec 4, 2022
@Dreamsorcerer
Copy link
Member

Dreamsorcerer commented Dec 4, 2022

No idea, but I can give a couple of suggestions for improvements, which may be unrelated.

asyncio.create_task() should be used, not asyncio.ensure_future(). I suspect you'll also be able to remove the done/pending annotations (and probably the response one too), mypy should infer them just fine (though I suspect your return type for coro_unless_closed should be _T | None).

After task.cancel(), you should await task. Otherwise, exceptions could be hidden, and you'll get an unused awaitable warning.

I guess trying it on the same Python version might also help reproduce the problem, and maybe using python -Wall -X dev to ensure all warnings etc. are displayed.

@Dreamsorcerer
Copy link
Member

Dreamsorcerer commented Dec 4, 2022

Could also just be some weird bug with proactor event loop. The actual line of code that appears to be triggering the error is:
https://github.com/python/cpython/blob/3.10/Lib/asyncio/proactor_events.py#L709

I'm not sure how that can possibly be wrong, unless someone is passing in a custom proactor to the event loop with a non-async connect() method.. Unless PyInstaller does something odd with the event loop? Maybe see if they can run it without PyInstaller and reproduce the same results.

@DevilXD
Copy link
Contributor Author

DevilXD commented Dec 4, 2022

asyncio.create_task() should be used, not asyncio.ensure_future().

Isn't the outcome of it the same though? Per documentation, ensure_future returns a task for coros. The reason I've used ensure_future is to keep the 3.11 behavior inline with previous Python versions. It can also handle a task being passed in, instead of just a coro - (despite what the function name suggests), IIRC create_task will raise an exception when trying to make a task out of a task.

your return type for coro_unless_closed should be _T | None

Why is that? If self._close_requested.wait() completes first, the later if statement is going to raise ExitRequest() and no return will ever happen. Is there a possibility of a task suddenly returning None instead of it's intended type at some point during execution?

someone is passing in a custom proactor to the event loop

I'm using whatever comes in stock in Windows. PyInstaller should just freeze the existing code. To me, none of this makes any sense either. My suspicion was that it has to do with internet connection instability, or something like the network interface becoming unavailable / broken for a short moment, but it's just pure guessing, really.

Also as a general FYI, the connect method of the proactor being passed in there, isn't an async function - it's a normal one, returning a future by the looks of it:

https://github.com/python/cpython/blob/995f6170c78570eca818f7e7dbd8a7661c171a81/Lib/asyncio/windows_events.py#L566-L595

I have no idea what's happening here, but the error sounds like there's a return statement somewhere down the chain of calls, that doesn't return a future, in some quirky edge case I happened to make my code run into.

@Dreamsorcerer
Copy link
Member

asyncio.create_task() should be used, not asyncio.ensure_future().

Isn't the outcome of it the same though? Per documentation, ensure_future returns a task for coros. The reason I've used ensure_future is to keep the 3.11 behavior inline with previous Python versions. It can also handle a task being passed in, instead of just a coro - (despite what the function name suggests), IIRC create_task will raise an exception when trying to make a task out of a task.

Well, if you're sure you need that behaviour, maybe it's fine, I've not seen the rest of the code. But, it's typically discouraged and not needed for most applications.

Why is that? If self._close_requested.wait() completes first, the later if statement is going to raise ExitRequest() and no return will ever happen.

Right, that's not too obvious from the code, and is not going to be verifiable with mypy. You could just add an assert not None to convince mypy that there is no None return value.

someone is passing in a custom proactor to the event loop

I'm using whatever comes in stock in Windows. PyInstaller should just freeze the existing code. To me, none of this makes any sense either. My suspicion was that it has to do with internet connection instability, or something like the network interface becoming unavailable / broken for a short moment, but it's just pure guessing, really.

Yep, still seems odd. I'd try and test without PyInstaller, just to be sure. But, as you say, it shouldn't matter.

@Dreamsorcerer
Copy link
Member

May be worth filing a bug with cpython as well, I don't see anything here that should result in that.

@stalkerg
Copy link
Contributor

I have same error message but for hosts = await asyncio.shield(host_resolved) in _create_direct_connection, but I am use it insiede Twisted reactor, but it still should work. Also I am under Python3.11

@stalkerg
Copy link
Contributor

stalkerg commented Apr 14, 2023

Can this issue #81353 relate to this?

@stalkerg
Copy link
Contributor

Okey, if I remove ensure_future and await directly self._resolve_host I will stuck in _wrap_create_connection on await self._loop.create_connection(*args, **kwargs) which is uvloop in my case with the same error.

@DevilXD
Copy link
Contributor Author

DevilXD commented Apr 30, 2023

One of my application's users was able to get a reliable repro on Linux, by intentionally breaking the SSL certificates filepath. Doing so results in this exact, non-descriptive error. Ref: DevilXD/TwitchDropsMiner#80 (comment) There's also a more extensive explanation of the find here: DevilXD/TwitchDropsMiner#113 (comment)

Please note that this original issue comes from Windows, but these clearly show that if something goes wrong on the system side of things, one might end up with this exact, confusing error. The question is, is this an issue with aiohttp, or is it the underlying asyncio and it's sock_connect being broken here?

@Dreamsorcerer
Copy link
Member

I think it needs a cpython bug.

If you want to try and dig further, and you can convince them to hack their Python install, it appears that error comes from the Future itself (which makes the error seem a little misleading):
https://github.com/python/cpython/blob/main/Lib/asyncio/futures.py#L288

So, I'm guessing that when the loop yields from the Future, for some reason it ends up resuming the future before it has actually finished, which should never happen. Unfortunately, I've not found any further information on why that check exists as it has been there since asyncio was introduced 10 years ago.
I suspect the only way that can happen is if the future somehow gets put into _ready or _scheduled even though it's not done. So, maybe it's possible to print some things out in this code to get an idea of how that is happening:
https://github.com/python/cpython/blob/3.10/Lib/asyncio/base_events.py#L1840

But, I'm pretty lost at this point on how exactly the future gets to that point, or what an SSL error should have to do with it.

@stalkerg
Copy link
Contributor

stalkerg commented May 1, 2023

@Dreamsorcerer I did the same investigation. ) In my case, Twisted somehow breaks it, but the behavior is the same. I need extra few days to figure out.
I have a small repro for my case twisted/twisted#11841 I think in my case reason is different but Twisted breaks the same thing as in your case.

@stalkerg
Copy link
Contributor

stalkerg commented Jul 1, 2023

okey, it's happened if you back to future without back to the event loop. For Twisted I solved it here twisted/twisted#11890

@DevilXD
Copy link
Contributor Author

DevilXD commented Jul 1, 2023

@stalkerg You seem to understand what's happening. Mind leaving a more in-depth explanation here? Where is the source of the error, in asyncio or aiohttp?

@stalkerg
Copy link
Contributor

stalkerg commented Jul 2, 2023

@DevilXD In my case, it was what twisted fromCoroutine doesn't support asyncio. In your case, I think you can't use await next(iter(done)) . Hmm, is it windows? Can you try it on Linux?

The main idea of this issue - we found what future is not done, we do yield and expect to return control to event loop, and after we return to this function future should be solved if it's not solved, you will see await wasn't used with future.

@Dreamsorcerer
Copy link
Member

I think you can't use await next(iter(done)) . Hmm, is it windows? Can you try it on Linux?

I think that's fine (and works in a trivial example). done is just a set of tasks, so this just awaits on a task in the set (and there should only be one).

Certainly seems to be something to do with Windows/Proactor though. Did we manage to test without PyInstaller? Looking back at that traceback, to me it doesn't look like it matches the cpython source code properly, so I'm not sure it's actually using the exact same code as a normal Python install...

@DevilXD
Copy link
Contributor Author

DevilXD commented Jul 2, 2023

Did we manage to test without PyInstaller?

There is a "supposedly 100% working" repro explanation for Linux available here: DevilXD/TwitchDropsMiner#80 (comment)

I've tried testing this on Windows, by setting the same env var via set SSL_CERT_FILE=nul before running the application, but it didn't do anything - the application proceeded just fine without errors. Not sure if it just didn't work, didn't matter because Windows isn't using that SSL lib, or I did something wrong along the way.

@Dreamsorcerer
Copy link
Member

Feel free to provide some more information on how to test. I ran that application with that command, getting the GUI open, but didn't get any errors. If it requires some user credentials or something, feel free to email me privately with testing details.

@DevilXD
Copy link
Contributor Author

DevilXD commented Jul 2, 2023

It's probably because it's been a while since it happened, and this issue got worked around. As far as I can recall, the finder of the repro ran into the error by accident, while trying to add Linux support to the TDM project. This commit was the result: DevilXD/TwitchDropsMiner@e31dcd8

I'm not a linux person, but the commit includes some lenghty explanation to it. I'm assuming that commenting out the import and SSL injection lines will make it happen again? If that won't do, we can ask the repro finder to retrace their steps from back then, I guess.

@Dreamsorcerer
Copy link
Member

I'm assuming that commenting out the import and SSL injection lines will make it happen again

Still no error on starting the app.

@stalkerg
Copy link
Contributor

stalkerg commented Jul 3, 2023

it's 100% not a ssl issue.

done is just a set of tasks, so this just awaits on a task in the set (and there should only be one).

this is the case, as I saw my issue next(iter(done)) will work for trivial coroutines but failed on Futures because have a chance back to Future too quickly when the event loop does have not enough time to do proper things.
Please try to rewrite it in a different style.

@Dreamsorcerer
Copy link
Member

it's 100% not a ssl issue.

I don't think anybody claimed it was. It's just that messing with SSL seems to be a way to reproduce the issue in the given app.

done is just a set of tasks, so this just awaits on a task in the set (and there should only be one).

this is the case, as I saw my issue next(iter(done)) will work for trivial coroutines but failed on Futures because have a chance back to Future too quickly when the event loop does have not enough time to do proper things. Please try to rewrite it in a different style.

I think what you were looking at in Twisted is code that was handling event loop logic, whereas this is high-level application logic. This code merely awaits on a task or future, it cannot directly cause the event loop to return to a future before it is completed (that logic would be internal to asyncio). Furthermore, the logic of wait() means that the future/task will already be completed before we await it.

@DevilXD
Copy link
Contributor Author

DevilXD commented Jul 3, 2023

@guihkx Would it be possible for you to recreate the repro from here?: DevilXD/TwitchDropsMiner#80 (comment)

I know things have changed since then, but it'd be helpful here. Removing the truststore from play doesn't seem to break it anymore. Would it be possible for you to do some quick tests on your side to confirm this?

If you would somehow still reproduce it, I guess it'd be good to know what system it happens on. The DevilXD/TwitchDropsMiner@e31dcd8 commit mentions several different Linux systems, so I'd imagine it works on some, while breaks on the others.

@guihkx
Copy link

guihkx commented Jul 4, 2023

@DevilXD Still reproducible for me on master of TwitchDropsMiner, but only if I undo the truststore commit you linked.

Also, in addition to pointing SSL_CERT_FILE to /dev/null, I have to also do the same for SSL_CERT_DIR, otherwise things don't break like they used to.

I'm not sure why that is, but due to the bleeding-edge nature of Arch Linux, system packages tend to be updated to their latest version as soon as possible, and both python and openssl have had rather big updates since I shared the reproduction steps a few months ago:

  • python 3.10.10 -> 3.11.3
  • openssl 3.0.8 -> 3.1.1

But like I mentioned in my other post, I'm not sure what intentionally breaking libssl has anything to do with this bug, especially when it also happens on Python for Windows, which doesn't rely on libssl, AFAIK.

If it helps, here's the latest stack trace of that error at DevilXD/TwitchDropsMiner@562627e:

Click me to expand
Fatal error encountered:

Traceback (most recent call last):
  File "/home/gui/dev/TwitchDropsMiner/main.py", line 174, in <module>
    loop.run_until_complete(client.run())
  File "/usr/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/gui/dev/TwitchDropsMiner/twitch.py", line 768, in run
    await self._run()
  File "/home/gui/dev/TwitchDropsMiner/twitch.py", line 787, in _run
    auth_state = await self.get_auth()
                 ^^^^^^^^^^^^^^^^^^^^^
  File "/home/gui/dev/TwitchDropsMiner/twitch.py", line 1458, in get_auth
    await self._auth_state.validate()
  File "/home/gui/dev/TwitchDropsMiner/twitch.py", line 516, in validate
    await self._validate()
  File "/home/gui/dev/TwitchDropsMiner/twitch.py", line 525, in _validate
    async with self._twitch.request(
  File "/usr/lib/python3.11/contextlib.py", line 204, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/gui/dev/TwitchDropsMiner/twitch.py", line 1483, in request
    response = await self.gui.coro_unless_closed(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gui/dev/TwitchDropsMiner/gui.py", line 2054, in coro_unless_closed
    return await next(iter(done))
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gui/dev/TwitchDropsMiner/env311/lib/python3.11/site-packages/aiohttp/client.py", line 1125, in send
    return self._coro.send(arg)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/gui/dev/TwitchDropsMiner/env311/lib/python3.11/site-packages/aiohttp/client.py", line 536, in _request
    conn = await self._connector.connect(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gui/dev/TwitchDropsMiner/env311/lib/python3.11/site-packages/aiohttp/connector.py", line 540, in connect
    proto = await self._create_connection(req, traces, timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gui/dev/TwitchDropsMiner/env311/lib/python3.11/site-packages/aiohttp/connector.py", line 901, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gui/dev/TwitchDropsMiner/env311/lib/python3.11/site-packages/aiohttp/connector.py", line 1175, in _create_direct_connection
    transp, proto = await self._wrap_create_connection(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gui/dev/TwitchDropsMiner/env311/lib/python3.11/site-packages/aiohttp/connector.py", line 980, in _wrap_create_connection
    return await self._loop.create_connection(*args, **kwargs)  # type: ignore[return-value]  # noqa
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/base_events.py", line 1069, in create_connection
    sock = await self._connect_sock(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/base_events.py", line 973, in _connect_sock
    await self.sock_connect(sock, address)
  File "/usr/lib/python3.11/asyncio/selector_events.py", line 634, in sock_connect
    return await fut
           ^^^^^^^^^
RuntimeError: await wasn't used with future

Exiting...

Application Terminated.
Close the window to exit the application.

pip freeze:

aiohttp==3.8.4
aiosignal==1.3.1
async-timeout==4.0.2
attrs==23.1.0
charset-normalizer==3.1.0
frozenlist==1.3.3
idna==3.4
multidict==6.0.4
Pillow==9.5.0
pycairo==1.24.0
PyGObject==3.44.1
pystray==0.19.4
python-xlib==0.33
six==1.16.0
truststore==0.7.0
yarl==1.9.2

You don't have to be logged in to Twitch and it happens instantly too, as soon as the GUI loads.

Let me know if I can provide any other useful information.

@Dreamsorcerer
Copy link
Member

Thanks, reproduced with SSL_CERT_FILE=/dev/null SSL_CERT_DIR=/dev/null python3 main.py. I'll see if I can find some more information from that.

@Dreamsorcerer
Copy link
Member

No idea: python/cpython#106429

@stalkerg
Copy link
Contributor

stalkerg commented Jul 5, 2023

If sleep also shows such an error, it means something (SSL?) is blocking/interrupt the event loop. And yes, we need a smaller repro.
I usually start cutting line by line until I will not find a small patter to reproduce.

@Dreamsorcerer
Copy link
Member

Yes, exactly what I was about to say. I suspect we'll struggle to figure out anything more unless someone can start cutting down the code to find the minimum amount of code that reproduces the error.

@Dreamsorcerer
Copy link
Member

Dreamsorcerer commented Jul 5, 2023

I've tried testing this on Windows, by setting the same env var via set SSL_CERT_FILE=nul before running the application, but it didn't do anything - the application proceeded just fine without errors.

If you can't test on a Linux machine for some reason, then I'd suggest setting the env vars to an empty file (/dev/null is a valid file path on Linux) to see if that reproduces. There is also a Linux subsystem for Windows or something, which should allow you to run the app in a Linux-like environment with the same arguments.

If you can reproduce and then break your code down until you get a minimal reproducer, then we can probably figure out what the issue is. It could be something you are doing in the app that is breaking it, but I don't think it's the snippet of code provided earlier.

@Dreamsorcerer
Copy link
Member

Dreamsorcerer commented Jul 5, 2023

Having a glance at the code more widely, I'm not sure I can pick out what is necessarily the issue, but I can see some things that have a small chance of causing problems.

  • I'd suggest using asyncio.run() instead of the low-level loop APIs.
  • When you cancel a task, you need to await that task after cancellation. e.g.
task.cancel()
with suppress(CancelledError):
    await task

Also, a more robust, cross-platform approach to only run 1 instance (https://github.com/DevilXD/TwitchDropsMiner/blob/master/main.py#L134) is to use a lockfile (e.g. https://bazaar.launchpad.net/~dreamsorcerer/btrfs-backup-suite/trunk/view/head:/btrfs_backup/commands.py#L37)

@DevilXD
Copy link
Contributor Author

DevilXD commented Jul 5, 2023

I'd suggest using asyncio.run() instead of the low-level loop APIs.

I've started with Python 3.6, back when it wasn't very popular to use asyncio.run, so I started with that structure during initial development, and just stuck with it. I guess now it'd be a good moment to go back and wrap the entire starting sequence into a coro, and just asyncio.run it. I'll try doing so in some spare time.

When you cancel a task, you need to await that task after cancellation

As far as I'm aware, tasks are "self-awaitable". As long as the loop is running, a cancelled task will eventually finish running and just get GCed, without having to await on it. Furthermore, the only thing awaiting on it does, is give me the result of it, either whatever it returned, or reraises an exception it raised (CancelledError in case it was cancelled).

In all places where tasks are getting cancelled, 99.9% of the time, it's the asyncio.sleep getting cancelled, which does not produce any side effects nor exceptions on it's own. There is one case where an aiohttp request may get cancelled in the middle of it, but that can happen only during a shutdown sequence (after closing the GUI), where one does not care much about the result of the request or side effects anyway.

Is there a particular line in my code I should really be awaiting on the cancelled tasks?

a more robust, cross-platform approach to only run 1 instance (...) is to use a lockfile

Thanks, I'm still kinda new to multi-platform support. Linux support was added not too long ago, hence why that part stayed as-is. I'll look into changing the implementation to use a lock file eventually. Thanks for the tip 🙂

EDIT: I've implemented the asyncio.run usage, turns out it was much easier than I thought: DevilXD/TwitchDropsMiner@c2a5db8

@DevilXD
Copy link
Contributor Author

DevilXD commented Nov 1, 2023

Is there a possibility of moving this forward by chance? Reliable repro for this has been posted in python/cpython#106429

If I'd need to provide some additional information, or can somehow help in any other way, please let me know.

@stalkerg
Copy link
Contributor

stalkerg commented Nov 1, 2023

Can you put your repro here? Seems like it's a aiohttp bug or around. Maybe even better to create a new fresh issue with small repro and etc.

@DevilXD
Copy link
Contributor Author

DevilXD commented Nov 1, 2023

The repro is quite complicated, but sure (from the linked issue):

import asyncio
import aiohttp

async def main():
    async with aiohttp.ClientSession() as session:
        try:
            response = await asyncio.ensure_future(session.get('https://www.google.com/'))
            print(await response.text())
        finally:
            response.release()

asyncio.run(main())

Env vars needed (intentionally break SSL certs dir path): SSL_CERT_FILE=/dev/null SSL_CERT_DIR=/dev/null

  • OS: Arch Linux
  • Python 3.11.3
  • OpenSSL 3.1.1
aiohttp==3.8.4
aiosignal==1.3.1
async-timeout==4.0.2
attrs==23.1.0
charset-normalizer==3.1.0
frozenlist==1.3.3
idna==3.4
multidict==6.0.4
yarl==1.9.2

More specifics as to where all of this comes from is in the linked issue, specifically here: python/cpython#106429 (comment)

The issue itself still appears to manifest itself "randomly" on different machines, running both Windows or Linux. In general, there's 3 cases of it:

  • It pretty much doesn't happen at all - I had it happen to me only once in a middle of tinkering around during dev, and wasn't able to reproduce it myself since.
  • It happens occasionally, when there are connection issues present - weak Wi-Fi signal, changing hotspots, etc. (most reports from others)
  • It happens 70-100% of the time, and the user generally cannot use the application. (few reports)

Again, this seems to be on a per-machine basis.

Extra info EDIT: The linked issue reduces this to session.get in aiohttp not being used with a context manager (normally done to avoid manually calling reponse.release()). There are cases where one would like to just have a task await on the response instead though. That's where it breaks - the request-response wrapper "doesn't like" to be directly awaited on.

@DevilXD
Copy link
Contributor Author

DevilXD commented Nov 2, 2023

Okay, so I have some additional information to the issue. I've had a user contact me today, and tell me they've ran into problem (for context, I'm talking about the project I maintain, that is linked in the initial message of this issue). After explaining them the bug is machine-dependent and they've probably did something to their PC recently, they've told me they've "fully reset their pc today". Knowing this possibly has to do with SSL certs store and them most likely being missing on a fresh machine, I've told them to run this as a completely separate script:

import asyncio
import aiohttp

async def main():
    async with aiohttp.ClientSession() as session:
        async with session.get("https://www.google.com") as response:
            await response.text()

asyncio.run(main())

..., and then check if the application works now - and it did. What I imagined may happen, is that this bit will take care of fetching and caching the certs on the machine locally, so that the application code won't have to do it, avoiding the error - and it seems to have worked? All of this happened on Win10.

Whatever this bug is related to, it seems to be affected by something aiohttp accesses and stores on the local machine (directly, or through asyncio usage), that is then used in any future requests, even after closing the Python session and starting a new one (running a different script, etc.).

Hope this helps in any way.

@stalkerg
Copy link
Contributor

stalkerg commented Nov 2, 2023

Thanks, I can reproduce it! I will try to look into it.
Just in case, my trace:

Traceback (most recent call last):
  File "/home/stalkerg/projects/silveregg/aiohttp-test/test2.py", line 14, in <module>
    asyncio.run(main())
  File "/usr/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/stalkerg/projects/silveregg/aiohttp-test/test2.py", line 8, in main
    response = await asyncio.ensure_future(session.get('https://www.google.com/'))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stalkerg/.cache/pypoetry/virtualenvs/aiohttp-test-8taSn_2J-py3.11/lib/python3.11/site-packages/aiohttp/client.py", line 1151, in send
    return self._coro.send(arg)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/stalkerg/.cache/pypoetry/virtualenvs/aiohttp-test-8taSn_2J-py3.11/lib/python3.11/site-packages/aiohttp/client.py", line 562, in _request
    conn = await self._connector.connect(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stalkerg/.cache/pypoetry/virtualenvs/aiohttp-test-8taSn_2J-py3.11/lib/python3.11/site-packages/aiohttp/connector.py", line 540, in connect
    proto = await self._create_connection(req, traces, timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stalkerg/.cache/pypoetry/virtualenvs/aiohttp-test-8taSn_2J-py3.11/lib/python3.11/site-packages/aiohttp/connector.py", line 901, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stalkerg/.cache/pypoetry/virtualenvs/aiohttp-test-8taSn_2J-py3.11/lib/python3.11/site-packages/aiohttp/connector.py", line 1178, in _create_direct_connection
    transp, proto = await self._wrap_create_connection(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stalkerg/.cache/pypoetry/virtualenvs/aiohttp-test-8taSn_2J-py3.11/lib/python3.11/site-packages/aiohttp/connector.py", line 980, in _wrap_create_connection
    return await self._loop.create_connection(*args, **kwargs)  # type: ignore[return-value]  # noqa
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/base_events.py", line 1069, in create_connection
    sock = await self._connect_sock(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/base_events.py", line 973, in _connect_sock
    await self.sock_connect(sock, address)
  File "/usr/lib/python3.11/asyncio/selector_events.py", line 634, in sock_connect
    return await fut
           ^^^^^^^^^
RuntimeError: await wasn't used with future

similar for uvloop:

Future exception was never retrieved
future: <Future finished exception=CancelledError()>
asyncio.exceptions.CancelledError
Traceback (most recent call last):
  File "/home/stalkerg/projects/silveregg/aiohttp-test/test2.py", line 16, in <module>
    asyncio.run(main())
  File "/usr/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
  File "/home/stalkerg/projects/silveregg/aiohttp-test/test2.py", line 9, in main
    response = await asyncio.ensure_future(session.get('https://www.google.com/'))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stalkerg/.cache/pypoetry/virtualenvs/aiohttp-test-8taSn_2J-py3.11/lib/python3.11/site-packages/aiohttp/client.py", line 1151, in send
    return self._coro.send(arg)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/stalkerg/.cache/pypoetry/virtualenvs/aiohttp-test-8taSn_2J-py3.11/lib/python3.11/site-packages/aiohttp/client.py", line 562, in _request
    conn = await self._connector.connect(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stalkerg/.cache/pypoetry/virtualenvs/aiohttp-test-8taSn_2J-py3.11/lib/python3.11/site-packages/aiohttp/connector.py", line 540, in connect
    proto = await self._create_connection(req, traces, timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stalkerg/.cache/pypoetry/virtualenvs/aiohttp-test-8taSn_2J-py3.11/lib/python3.11/site-packages/aiohttp/connector.py", line 901, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stalkerg/.cache/pypoetry/virtualenvs/aiohttp-test-8taSn_2J-py3.11/lib/python3.11/site-packages/aiohttp/connector.py", line 1178, in _create_direct_connection
    transp, proto = await self._wrap_create_connection(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stalkerg/.cache/pypoetry/virtualenvs/aiohttp-test-8taSn_2J-py3.11/lib/python3.11/site-packages/aiohttp/connector.py", line 980, in _wrap_create_connection
    return await self._loop.create_connection(*args, **kwargs)  # type: ignore[return-value]  # noqa
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "uvloop/loop.pyx", line 2029, in create_connection
  File "uvloop/loop.pyx", line 2016, in uvloop.loop.Loop.create_connection
RuntimeError: await wasn't used with future

After disabling C implementation for asyncio this issue became more clear. Seems like I can find a reason.

@stalkerg
Copy link
Contributor

stalkerg commented Nov 2, 2023

Okey, I solved it locally. :) But to explain what is it I need a time. It's combination of aiohttp trick + race condition in Python.
Also, I know why it's ok for some people. :)))

@DevilXD
Copy link
Contributor Author

DevilXD commented Nov 2, 2023

I know why it's ok for some people. :)))

The forbidden knowledge...

Take your time. I'm glad this is finally going somewhere. In any case, I'd like to know the real cause behind it, but I'm willing to wait if that's needed of course.

@stalkerg
Copy link
Contributor

stalkerg commented Nov 2, 2023

The forbidden knowledge...

response = await asyncio.ensure_future(session.get('https://www.google.com/')) should work same as response = await session.get('https://www.google.com/') if you have only ipv4 connection (or resolver return only ipv4).

But the connection with the real issue is very indirect.

@Dreamsorcerer
Copy link
Member

Good to hear, I never managed to reproduce the issue myself.
The only thing I had to go on so far, is a mysterious bit of code that appeared in the traceback:
python/cpython#106429 (comment)

But, still haven't had time to look at it yet (and may have been completely unrelated).

@stalkerg
Copy link
Contributor

stalkerg commented Nov 3, 2023

Okey, I am ready to make PR. It's a one line fix in aiohttp.

@stalkerg
Copy link
Contributor

stalkerg commented Nov 3, 2023

@Dreamsorcerer @DevilXD can you check my bugfix? #7785

@DevilXD
Copy link
Contributor Author

DevilXD commented Nov 3, 2023

Verifying it myself will be tricky, considering I've only ran into this issue once, and then never again. Unless you know how to reliably force it to trigger on Windows.

I've asked people who experience it to test it, but it'll take some time before someone does so. I'll report once I have some results.

@stalkerg
Copy link
Contributor

stalkerg commented Nov 3, 2023

@DevilXD you can try my repro from the PR

@DevilXD
Copy link
Contributor Author

DevilXD commented Nov 3, 2023

In that case, then besides adding the missing import asyncio and import socket to the repro, adding the missing return in the throw method changes the exception to this:

Traceback (most recent call last):
  File "C:\Users\DevilXD\Desktop\test.py", line 67, in <module>
    asyncio.run(main())
  File "C:\Python38\lib\asyncio\runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "C:\Python38\lib\asyncio\base_events.py", line 616, in run_until_complete
    return future.result()
  File "C:\Users\DevilXD\Desktop\test.py", line 65, in main
    await task
  File "C:\Users\DevilXD\Desktop\test.py", line 16, in send
    return self._coro.send(arg)
  File "C:\Users\DevilXD\Desktop\test.py", line 59, in my_task
    raise Exception("test")
Exception: test

Without the change, the original exception from this issue appears.

I don't think this qualifies as a valid library test though. To confirm the PR solves this issue, we need a confirmation from someone who experienced it, then installed your PR fix, and then tested again to see if it stops happening. I'm pretty sure that's really all it takes to fix it, but I'd still like to make sure. Reproducing this on Windows would most likely need removing some cached SSL certs and/or DNS cache, or similar.

@stalkerg
Copy link
Contributor

stalkerg commented Nov 3, 2023

@DevilXD you should just add ipv6 somehow or mock host resolve to return more than one IP.
Btw seems like you test it successful.
I can make repro, for you.

@stalkerg
Copy link
Contributor

stalkerg commented Nov 3, 2023

@DevilXD please try this

import sys
sys.modules['_asyncio'] = None
import asyncio
import aiohttp
import socket
import ssl

async def _resolve_host(self, host, port, traces = None):
    return [
        {
            "host": "142.251.222.4",
            "port": 443,
            "family": socket.AddressFamily.AF_INET,
            "flags": socket.AddressInfo.AI_NUMERICHOST | socket.AddressInfo.AI_NUMERICSERV,
            "proto": 6,
            "hostname": "www.google.com",
        },
        {
            "host": "142.251.222.4",
            "port": 443,
            "family": socket.AddressFamily.AF_INET,
            "flags": socket.AddressInfo.AI_NUMERICHOST | socket.AddressInfo.AI_NUMERICSERV,
            "proto": 6,
            "hostname": "www.google.com",
        }
    ]

async def main():
    aiohttp.TCPConnector._resolve_host = _resolve_host
    async with aiohttp.ClientSession() as session:
        response = None
        try:
            response = await asyncio.ensure_future(session.get('https://www.google.com/'))
            print(await response.text())
        finally:
            if response:
                response.release()

asyncio.run(main())

it should cause the error even for you.

Dreamsorcerer added a commit that referenced this issue Nov 3, 2023
This PR fixes #7117 and similar issues. Short explanation - our
coroutine wrapper does not properly handle the exception, which breaks
coroutine handling. As you can see, any task expects results from
`throw` -
https://github.com/python/cpython/blob/main/Lib/asyncio/tasks.py#L303
but it seems like in aiohttp it was acidently removed by this commit
stalkerg@f04ecc2#diff-f334e752b4894ef951105572ab8b195aeb8db90eb6e48b1dfbd9a01da4c854f5L842

This is repro a case without aiohttp:
```python
import ssl
import collections


class TestCoroWrapper(collections.abc.Coroutine):
    __slots__ = ("_coro", "_resp")
    def __init__(self, coro):
        self._coro = coro

    def __getattr__(self, attr):
        return getattr(self._coro, attr)

    def send(self, arg):
        return self._coro.send(arg)

    def throw(self, arg):
        self._coro.throw(arg)

    def close(self):
        return self._coro.close()

    def __await__(self):
        ret = self._coro.__await__()
        return ret

async def ssl_call(context):
    loop = asyncio.get_event_loop()
    return await loop.create_connection(
        lambda: asyncio.Protocol(),
        '2404:6800:4004:824::2004',
        443,
        ssl=context,
        family=socket.AddressFamily.AF_INET6,
        proto=6,
        flags=socket.AddressInfo.AI_NUMERICHOST | socket.AddressInfo.AI_NUMERICSERV,
        server_hostname='www.google.com',
        local_addr=None
    )

async def prepare_call():
    context = ssl.create_default_context()
    try:
        connection = await ssl_call(context)
    except ssl.SSLError as e:
        print(f"Got exception1: {e}")
        raise e

    return connection

async def my_task():
    try:
        await prepare_call()
    except Exception as e:
        print(f"Got exception2: {e}")

    await asyncio.sleep(1)
    raise Exception("test")

async def main():
    my_coro = TestCoroWrapper(my_task())
    print(f"is coro? {asyncio.iscoroutine(my_coro)}")
    task = asyncio.create_task(my_coro)
    await task

asyncio.run(main())
```

The `TestCoroWrapper` here is equivalent of
`_BaseRequestContextManager`. If you run such code like:
`SSL_CERT_FILE=/dev/null SSL_CERT_DIR=/dev/null python test.py` you will
get an error: `await wasn't used with future`.
The main idea here is that you are trying to await the sleep function
after getting and catching an exception from the native (SSL) module.

Now I should explain why repro with aiohttp for some users return the
same error:
```python
import asyncio
import aiohttp

async def main():
    async with aiohttp.ClientSession() as session:
        try:
            response = await asyncio.ensure_future(session.get('https://www.google.com/'))
            print(await response.text())
        finally:
            response.release()

asyncio.run(main())
```

here it's happened because in `TCPConnector._create_direct_connection`
we are getting all IPs for the given host and trying to connect one by
one. If the first connection gets an error we will catch this error and
try again for the next IP. If you have IPv6 you will have at least 2 IPs
here (ipv6 and ipv4), and after the first error, you will try to connect
to a second IP and get the same error.

Why it's problem only for `asyncio.ensure_future`? Because
`asyncio.ensure_future` creates a `task` such a task starts processing
coroutines and directly communicates with our coroutine wrapper witch
not return a result for `throw`.

I will write changelog and etc after people validate this PR. But
honestly, I don't think I can write a unit test for a such case.

Regards,

## Checklist

- [ ] I think the code is well written
- [ ] Unit tests for the changes exist
- [ ] Documentation reflects the changes
- [ ] If you provide code modification, please add yourself to
`CONTRIBUTORS.txt`
  * The format is &lt;Name&gt; &lt;Surname&gt;.
  * Please keep alphabetical order, the file is sorted by names.
- [ ] Add a new news fragment into the `CHANGES` folder
  * name it `<issue_id>.<type>` for example (588.bugfix)
* if you don't have an `issue_id` change it to the pr id after creating
the pr
  * ensure type is one of the following:
    * `.feature`: Signifying a new feature.
    * `.bugfix`: Signifying a bug fix.
    * `.doc`: Signifying a documentation improvement.
    * `.removal`: Signifying a deprecation or removal of public API.
* `.misc`: A ticket has been closed, but it is not of interest to users.
* Make sure to use full sentences with correct case and punctuation, for
example: "Fix issue with non-ascii contents in doctest text files."

---------

Co-authored-by: Sam Bull <aa6bs0@sambull.org>
Co-authored-by: Sam Bull <git@sambull.org>
Dreamsorcerer pushed a commit that referenced this issue Nov 3, 2023
This PR fixes #7117 and similar issues. Short explanation - our
coroutine wrapper does not properly handle the exception, which breaks
coroutine handling. As you can see, any task expects results from
`throw` -
https://github.com/python/cpython/blob/main/Lib/asyncio/tasks.py#L303
but it seems like in aiohttp it was acidently removed by this commit
stalkerg@f04ecc2#diff-f334e752b4894ef951105572ab8b195aeb8db90eb6e48b1dfbd9a01da4c854f5L842

This is repro a case without aiohttp:
```python
import ssl
import collections

class TestCoroWrapper(collections.abc.Coroutine):
    __slots__ = ("_coro", "_resp")
    def __init__(self, coro):
        self._coro = coro

    def __getattr__(self, attr):
        return getattr(self._coro, attr)

    def send(self, arg):
        return self._coro.send(arg)

    def throw(self, arg):
        self._coro.throw(arg)

    def close(self):
        return self._coro.close()

    def __await__(self):
        ret = self._coro.__await__()
        return ret

async def ssl_call(context):
    loop = asyncio.get_event_loop()
    return await loop.create_connection(
        lambda: asyncio.Protocol(),
        '2404:6800:4004:824::2004',
        443,
        ssl=context,
        family=socket.AddressFamily.AF_INET6,
        proto=6,
        flags=socket.AddressInfo.AI_NUMERICHOST | socket.AddressInfo.AI_NUMERICSERV,
        server_hostname='www.google.com',
        local_addr=None
    )

async def prepare_call():
    context = ssl.create_default_context()
    try:
        connection = await ssl_call(context)
    except ssl.SSLError as e:
        print(f"Got exception1: {e}")
        raise e

    return connection

async def my_task():
    try:
        await prepare_call()
    except Exception as e:
        print(f"Got exception2: {e}")

    await asyncio.sleep(1)
    raise Exception("test")

async def main():
    my_coro = TestCoroWrapper(my_task())
    print(f"is coro? {asyncio.iscoroutine(my_coro)}")
    task = asyncio.create_task(my_coro)
    await task

asyncio.run(main())
```

The `TestCoroWrapper` here is equivalent of
`_BaseRequestContextManager`. If you run such code like:
`SSL_CERT_FILE=/dev/null SSL_CERT_DIR=/dev/null python test.py` you will
get an error: `await wasn't used with future`.
The main idea here is that you are trying to await the sleep function
after getting and catching an exception from the native (SSL) module.

Now I should explain why repro with aiohttp for some users return the
same error:
```python
import asyncio
import aiohttp

async def main():
    async with aiohttp.ClientSession() as session:
        try:
            response = await asyncio.ensure_future(session.get('https://www.google.com/'))
            print(await response.text())
        finally:
            response.release()

asyncio.run(main())
```

here it's happened because in `TCPConnector._create_direct_connection`
we are getting all IPs for the given host and trying to connect one by
one. If the first connection gets an error we will catch this error and
try again for the next IP. If you have IPv6 you will have at least 2 IPs
here (ipv6 and ipv4), and after the first error, you will try to connect
to a second IP and get the same error.

Why it's problem only for `asyncio.ensure_future`? Because
`asyncio.ensure_future` creates a `task` such a task starts processing
coroutines and directly communicates with our coroutine wrapper witch
not return a result for `throw`.

I will write changelog and etc after people validate this PR. But
honestly, I don't think I can write a unit test for a such case.

Regards,

- [ ] I think the code is well written
- [ ] Unit tests for the changes exist
- [ ] Documentation reflects the changes
- [ ] If you provide code modification, please add yourself to
`CONTRIBUTORS.txt`
  * The format is &lt;Name&gt; &lt;Surname&gt;.
  * Please keep alphabetical order, the file is sorted by names.
- [ ] Add a new news fragment into the `CHANGES` folder
  * name it `<issue_id>.<type>` for example (588.bugfix)
* if you don't have an `issue_id` change it to the pr id after creating
the pr
  * ensure type is one of the following:
    * `.feature`: Signifying a new feature.
    * `.bugfix`: Signifying a bug fix.
    * `.doc`: Signifying a documentation improvement.
    * `.removal`: Signifying a deprecation or removal of public API.
* `.misc`: A ticket has been closed, but it is not of interest to users.
* Make sure to use full sentences with correct case and punctuation, for
example: "Fix issue with non-ascii contents in doctest text files."

---------

Co-authored-by: Sam Bull <aa6bs0@sambull.org>
Co-authored-by: Sam Bull <git@sambull.org>
(cherry picked from commit a57dc31)
Dreamsorcerer added a commit that referenced this issue Nov 3, 2023
This PR fixes #7117 and similar issues. Short explanation - our
coroutine wrapper does not properly handle the exception, which breaks
coroutine handling. As you can see, any task expects results from
`throw` -
https://github.com/python/cpython/blob/main/Lib/asyncio/tasks.py#L303
but it seems like in aiohttp it was acidently removed by this commit
stalkerg@f04ecc2#diff-f334e752b4894ef951105572ab8b195aeb8db90eb6e48b1dfbd9a01da4c854f5L842

This is repro a case without aiohttp:
```python
import ssl
import collections

class TestCoroWrapper(collections.abc.Coroutine):
    __slots__ = ("_coro", "_resp")
    def __init__(self, coro):
        self._coro = coro

    def __getattr__(self, attr):
        return getattr(self._coro, attr)

    def send(self, arg):
        return self._coro.send(arg)

    def throw(self, arg):
        self._coro.throw(arg)

    def close(self):
        return self._coro.close()

    def __await__(self):
        ret = self._coro.__await__()
        return ret

async def ssl_call(context):
    loop = asyncio.get_event_loop()
    return await loop.create_connection(
        lambda: asyncio.Protocol(),
        '2404:6800:4004:824::2004',
        443,
        ssl=context,
        family=socket.AddressFamily.AF_INET6,
        proto=6,
        flags=socket.AddressInfo.AI_NUMERICHOST | socket.AddressInfo.AI_NUMERICSERV,
        server_hostname='www.google.com',
        local_addr=None
    )

async def prepare_call():
    context = ssl.create_default_context()
    try:
        connection = await ssl_call(context)
    except ssl.SSLError as e:
        print(f"Got exception1: {e}")
        raise e

    return connection

async def my_task():
    try:
        await prepare_call()
    except Exception as e:
        print(f"Got exception2: {e}")

    await asyncio.sleep(1)
    raise Exception("test")

async def main():
    my_coro = TestCoroWrapper(my_task())
    print(f"is coro? {asyncio.iscoroutine(my_coro)}")
    task = asyncio.create_task(my_coro)
    await task

asyncio.run(main())
```

The `TestCoroWrapper` here is equivalent of
`_BaseRequestContextManager`. If you run such code like:
`SSL_CERT_FILE=/dev/null SSL_CERT_DIR=/dev/null python test.py` you will
get an error: `await wasn't used with future`.
The main idea here is that you are trying to await the sleep function
after getting and catching an exception from the native (SSL) module.

Now I should explain why repro with aiohttp for some users return the
same error:
```python
import asyncio
import aiohttp

async def main():
    async with aiohttp.ClientSession() as session:
        try:
            response = await asyncio.ensure_future(session.get('https://www.google.com/'))
            print(await response.text())
        finally:
            response.release()

asyncio.run(main())
```

here it's happened because in `TCPConnector._create_direct_connection`
we are getting all IPs for the given host and trying to connect one by
one. If the first connection gets an error we will catch this error and
try again for the next IP. If you have IPv6 you will have at least 2 IPs
here (ipv6 and ipv4), and after the first error, you will try to connect
to a second IP and get the same error.

Why it's problem only for `asyncio.ensure_future`? Because
`asyncio.ensure_future` creates a `task` such a task starts processing
coroutines and directly communicates with our coroutine wrapper witch
not return a result for `throw`.

---------

Co-authored-by: Sam Bull <aa6bs0@sambull.org>
Co-authored-by: Sam Bull <git@sambull.org>
(cherry picked from commit a57dc31)

Co-authored-by: Yury Zhuravlev <stalkerg@gmail.com>
xiangxli pushed a commit to xiangxli/aiohttp that referenced this issue Dec 4, 2023
…ibs#7786)

This PR fixes aio-libs#7117 and similar issues. Short explanation - our
coroutine wrapper does not properly handle the exception, which breaks
coroutine handling. As you can see, any task expects results from
`throw` -
https://github.com/python/cpython/blob/main/Lib/asyncio/tasks.py#L303
but it seems like in aiohttp it was acidently removed by this commit
stalkerg@f04ecc2#diff-f334e752b4894ef951105572ab8b195aeb8db90eb6e48b1dfbd9a01da4c854f5L842

This is repro a case without aiohttp:
```python
import ssl
import collections

class TestCoroWrapper(collections.abc.Coroutine):
    __slots__ = ("_coro", "_resp")
    def __init__(self, coro):
        self._coro = coro

    def __getattr__(self, attr):
        return getattr(self._coro, attr)

    def send(self, arg):
        return self._coro.send(arg)

    def throw(self, arg):
        self._coro.throw(arg)

    def close(self):
        return self._coro.close()

    def __await__(self):
        ret = self._coro.__await__()
        return ret

async def ssl_call(context):
    loop = asyncio.get_event_loop()
    return await loop.create_connection(
        lambda: asyncio.Protocol(),
        '2404:6800:4004:824::2004',
        443,
        ssl=context,
        family=socket.AddressFamily.AF_INET6,
        proto=6,
        flags=socket.AddressInfo.AI_NUMERICHOST | socket.AddressInfo.AI_NUMERICSERV,
        server_hostname='www.google.com',
        local_addr=None
    )

async def prepare_call():
    context = ssl.create_default_context()
    try:
        connection = await ssl_call(context)
    except ssl.SSLError as e:
        print(f"Got exception1: {e}")
        raise e

    return connection

async def my_task():
    try:
        await prepare_call()
    except Exception as e:
        print(f"Got exception2: {e}")

    await asyncio.sleep(1)
    raise Exception("test")

async def main():
    my_coro = TestCoroWrapper(my_task())
    print(f"is coro? {asyncio.iscoroutine(my_coro)}")
    task = asyncio.create_task(my_coro)
    await task

asyncio.run(main())
```

The `TestCoroWrapper` here is equivalent of
`_BaseRequestContextManager`. If you run such code like:
`SSL_CERT_FILE=/dev/null SSL_CERT_DIR=/dev/null python test.py` you will
get an error: `await wasn't used with future`.
The main idea here is that you are trying to await the sleep function
after getting and catching an exception from the native (SSL) module.

Now I should explain why repro with aiohttp for some users return the
same error:
```python
import asyncio
import aiohttp

async def main():
    async with aiohttp.ClientSession() as session:
        try:
            response = await asyncio.ensure_future(session.get('https://www.google.com/'))
            print(await response.text())
        finally:
            response.release()

asyncio.run(main())
```

here it's happened because in `TCPConnector._create_direct_connection`
we are getting all IPs for the given host and trying to connect one by
one. If the first connection gets an error we will catch this error and
try again for the next IP. If you have IPv6 you will have at least 2 IPs
here (ipv6 and ipv4), and after the first error, you will try to connect
to a second IP and get the same error.

Why it's problem only for `asyncio.ensure_future`? Because
`asyncio.ensure_future` creates a `task` such a task starts processing
coroutines and directly communicates with our coroutine wrapper witch
not return a result for `throw`.

---------

Co-authored-by: Sam Bull <aa6bs0@sambull.org>
Co-authored-by: Sam Bull <git@sambull.org>
(cherry picked from commit a57dc31)

Co-authored-by: Yury Zhuravlev <stalkerg@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants