Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explore moving the connect ceil_timeout in the client inside the Connector connect #9598

Closed
bdraco opened this issue Oct 30, 2024 · 3 comments · Fixed by #9600
Closed

Explore moving the connect ceil_timeout in the client inside the Connector connect #9598

bdraco opened this issue Oct 30, 2024 · 3 comments · Fixed by #9600
Milestone

Comments

@bdraco
Copy link
Member

bdraco commented Oct 30, 2024

async with ceil_timeout(

If the connection is reused, we create a TimerHandler and than cancel is right away.... in testing this seems to be the case 80%+ of the time so this seem ripe to optimize away. Since we pass on the timeout object to connect anyways we could move the timeout into connect so it only does the timeout if its actually going to wait. Since we expect connection reuse to be common with HTTP/1.1+ this takes up a significant amount of the request time.

There is a small risk someone has subclassed and replaced the connect implementation with a custom connector, however it connect is quite complex so it seems unlikely someone would reimplement it. Either way it would need a breaking change note in the changelog.

@Dreamsorcerer Dreamsorcerer added this to the 4.0 milestone Oct 30, 2024
@bdraco
Copy link
Member Author

bdraco commented Oct 31, 2024

Avoiding the timeout saves ~13.5% of the request time.

Simple test was to remove async with ceil_timeout and re-run the client benchmark

@bdraco
Copy link
Member Author

bdraco commented Oct 31, 2024

Here is the benchmark script. Its cobbled together and iterated on from many different places so its a bit messy, but good enough for a test.

import asyncio
import sys
import time
from typing import Any, Coroutine, Iterator

import matplotlib.pyplot as plt
import uvloop

import aiohttp
from aiohttp import web

asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())


PORT = 8082
URL = f"http://localhost:{PORT}/req"
RESP = "a" * 2000
REQUESTS = 10000
CONCURRENCY = 20


def run_web_server():
    async def handle(_request):
        return web.Response(text=RESP)

    app = web.Application()
    app.add_routes([web.get("/req", handle)])
    web.run_app(app, host="localhost", port=PORT)


def duration(start: float) -> int:
    return int((time.monotonic() - start) * 1000)


async def run_requests(axis: plt.Axes):
    async def gather_limited_concurrency(coros: Iterator[Coroutine[Any, Any, Any]]):
        sem = asyncio.Semaphore(CONCURRENCY)

        async def coro_with_sem(coro):
            async with sem:
                return await coro

        return await asyncio.gather(*(coro_with_sem(c) for c in coros))

    async def aiohttp_get(session: aiohttp.ClientSession, timings: list[int]):
        start = time.monotonic()
        async with session.request("GET", URL) as res:
            assert len(await res.read()) == len(RESP)
            assert res.status == 200, f"status={res.status}"
        timings.append(duration(start))

    async with aiohttp.ClientSession() as session:
        # warmup
        await asyncio.gather(*(aiohttp_get(session, []) for _ in range(REQUESTS)))

        timings = []
        start = time.monotonic()
        await gather_limited_concurrency(
            aiohttp_get(session, timings) for _ in range(REQUESTS)
        )

        axis.plot(
            [*range(REQUESTS)], timings, label=f"aiohttp (tot={duration(start)}ms)"
        )


def main(mode: str):
    assert mode in {"server", "client"}, f"invalid mode: {mode}"

    if mode == "server":
        run_web_server()
    else:
        fig, ax = plt.subplots()
        asyncio.run(run_requests(ax))
        plt.legend(loc="upper left")
        ax.set_xlabel("# request")
        ax.set_ylabel("[ms]")
        plt.show()

    print("DONE", flush=True)


if __name__ == "__main__":
    assert len(sys.argv) == 2, f"Usage: {sys.argv[0]} server|client"
    main(sys.argv[1])

@bdraco
Copy link
Member Author

bdraco commented Oct 31, 2024

I think its possible someone might subclass connect and call super() but it seems very unlikely they would reimplement connect given how many internals we call back into it

aiohttp/client.py:        if connector._loop is not loop:
aiohttp/client.py:                        timeout_ceil_threshold=self._connector._timeout_ceil_threshold,
aiohttp/client_ws.py:            conn._connector._timeout_ceil_threshold if conn is not None else 5
aiohttp/client_ws.py:            conn._connector._timeout_ceil_threshold if conn is not None else 5
aiohttp/connector.py:            self._connector._release(self._key, self._protocol, should_close=True)
aiohttp/connector.py:            self._connector._release(self._key, self._protocol, should_close=True)
aiohttp/connector.py:            self._connector._release(

Probably ok to backport to 3.11 but not 3.10

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants