-
-
Notifications
You must be signed in to change notification settings - Fork 932
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: check endpoint handler is async only once #2536
Conversation
We improve the dispatch in the routing module to only check once whether the handler is async. This gives an improvement of 2.5% (sync), 1.82% (async) in the number of requests/s. The average latency decreased 1.6% (sync) and 1.5% (async). Note that we had to use a cast in the helper function, as the typeguard does not work for the negative case. In the main branch the code is working without a cast, because the typeguard return type is in practice `AwaitableCAllable[Any]`, which end up swallowing the other types in the union. Benchmarking We use a simple json app, with both a sync and async endpoint, and the wrk tool to get the measurements. The measuerements were done on a Macbook Pro with M1 chip, 16GB of memory and macOS 12.3. The Python version used for the tests is Python 3.12.2, and the uvicorn version is 0.27.1 Before ``` $ wrk http://localhost:8000/sync Running 10s test @ http://localhost:8000/sync 2 threads and 10 connections Thread Stats Avg Stdev Max +/- Stdev Latency 733.77us 55.57us 3.21ms 78.35% Req/Sec 6.84k 147.06 7.15k 87.13% 137474 requests in 10.10s, 18.35MB read Requests/sec: 13610.69 Transfer/sec: 1.82MB $ wrk http://localhost:8000/async Running 10s test @ http://localhost:8000/async 2 threads and 10 connections Thread Stats Avg Stdev Max +/- Stdev Latency 717.14us 49.05us 1.83ms 71.11% Req/Sec 7.00k 112.14 7.36k 76.24% 140613 requests in 10.10s, 18.77MB read Requests/sec: 13922.97 Transfer/sec: 1.86MB ```` After ``` $ wrk http://localhost:8000/sync Running 10s test @ http://localhost:8000/sync 2 threads and 10 connections Thread Stats Avg Stdev Max +/- Stdev Latency 721.34us 202.40us 11.13ms 99.32% Req/Sec 7.01k 230.04 7.62k 94.00% 139558 requests in 10.00s, 18.63MB read Requests/sec: 13956.14 Transfer/sec: 1.86MB $ wrk http://localhost:8000/async Running 10s test @ http://localhost:8000/async 2 threads and 10 connections Thread Stats Avg Stdev Max +/- Stdev Latency 706.04us 109.90us 7.46ms 98.30% Req/Sec 7.12k 136.09 7.39k 90.59% 143188 requests in 10.10s, 19.12MB read Requests/sec: 14176.95 Transfer/sec: 1.89MB ``` The app used for the test is as follows ```python from starlette.applications import Starlette from starlette.responses import JSONResponse from starlette.routing import Route import uvicorn async def async_page(request): return JSONResponse({'status': 'ok'}) async def sync_page(request): return JSONResponse({'status': 'ok'}) app = Starlette(routes=[ Route('/async', async_page), Route('/sync', sync_page), ]) if __name__ == "__main__": uvicorn.run("app:app", port=8000, log_level="critical") ```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if those changes are worth it, but...
Could you share why you think the changes are not worth it? Any improvements on removing blocking code would help all applications, as by removing unneeded operations we reduce the amount of blocking code, which affects (tail) latencies. This is a no brainer if the code in question is in fact unneeded because it's doing the same thing over an over again. |
* refactor: check endpoint handler is async only once We improve the dispatch in the routing module to only check once whether the handler is async. This gives an improvement of 2.5% (sync), 1.82% (async) in the number of requests/s. The average latency decreased 1.6% (sync) and 1.5% (async). Note that we had to use a cast in the helper function, as the typeguard does not work for the negative case. In the main branch the code is working without a cast, because the typeguard return type is in practice `AwaitableCAllable[Any]`, which end up swallowing the other types in the union. Benchmarking We use a simple json app, with both a sync and async endpoint, and the wrk tool to get the measurements. The measuerements were done on a Macbook Pro with M1 chip, 16GB of memory and macOS 12.3. The Python version used for the tests is Python 3.12.2, and the uvicorn version is 0.27.1 Before ``` $ wrk http://localhost:8000/sync Running 10s test @ http://localhost:8000/sync 2 threads and 10 connections Thread Stats Avg Stdev Max +/- Stdev Latency 733.77us 55.57us 3.21ms 78.35% Req/Sec 6.84k 147.06 7.15k 87.13% 137474 requests in 10.10s, 18.35MB read Requests/sec: 13610.69 Transfer/sec: 1.82MB $ wrk http://localhost:8000/async Running 10s test @ http://localhost:8000/async 2 threads and 10 connections Thread Stats Avg Stdev Max +/- Stdev Latency 717.14us 49.05us 1.83ms 71.11% Req/Sec 7.00k 112.14 7.36k 76.24% 140613 requests in 10.10s, 18.77MB read Requests/sec: 13922.97 Transfer/sec: 1.86MB ```` After ``` $ wrk http://localhost:8000/sync Running 10s test @ http://localhost:8000/sync 2 threads and 10 connections Thread Stats Avg Stdev Max +/- Stdev Latency 721.34us 202.40us 11.13ms 99.32% Req/Sec 7.01k 230.04 7.62k 94.00% 139558 requests in 10.00s, 18.63MB read Requests/sec: 13956.14 Transfer/sec: 1.86MB $ wrk http://localhost:8000/async Running 10s test @ http://localhost:8000/async 2 threads and 10 connections Thread Stats Avg Stdev Max +/- Stdev Latency 706.04us 109.90us 7.46ms 98.30% Req/Sec 7.12k 136.09 7.39k 90.59% 143188 requests in 10.10s, 19.12MB read Requests/sec: 14176.95 Transfer/sec: 1.89MB ``` The app used for the test is as follows ```python from starlette.applications import Starlette from starlette.responses import JSONResponse from starlette.routing import Route import uvicorn async def async_page(request): return JSONResponse({'status': 'ok'}) async def sync_page(request): return JSONResponse({'status': 'ok'}) app = Starlette(routes=[ Route('/async', async_page), Route('/sync', sync_page), ]) if __name__ == "__main__": uvicorn.run("app:app", port=8000, log_level="critical") ``` * Apply PR suggestion --------- Co-authored-by: Marcelo Trylesinski <marcelotryle@gmail.com>
Same optimization can be applied here: starlette/starlette/background.py Lines 27 to 30 in 8b1d2c2
But is this snippet executed more than once? |
Summary
We improve the dispatch in the routing module to only check once whether the handler is async. This gives an improvement of 2.5% (sync), 1.82% (async) in the number of requests/s. The average latency decreased 1.6% (sync) and 1.5% (async).
Note that we had to use a cast in the helper function, as the typeguard does not work for the negative case. In the main branch the code is working without a cast, because the typeguard return type is in practice
AwaitableCallable[Any]
, which end up swallowing the other types in the union.Benchmarking
We use a simple json app, with both a sync and async endpoint, and the
wrk
tool to get the measurements. WE also show the profile shown by thesnakeviz
tool.The measurements were done on a Macbook Pro with M1 chip, 16GB of memory and macOS 12.3. The Python version used for the tests is Python 3.12.2, and the uvicorn version is 0.27.1
Before
After
Test App
The app used for the test is as follows
Checklist
The two points above are not checked as this is just refactoring code, and it does not require any new tests or docs. If required I could add tests for the added helper function
_to_async
.