Improve inter-process queue fairness #839

Closed
deliahu opened this issue Mar 3, 2020 · 0 comments · Fixed by #1526
Labels: enhancement (New feature or request)

deliahu commented Mar 3, 2020

Description

When using the Uvicorn process manager, requests are assigned to Uvicorn workers randomly. This leads to unbalanced queue wait times, and the max concurrency limit does not work as expected because each process applies it independently.
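
To illustrate the second problem, here is a minimal sketch of a burst of long-lived requests landing on workers that each enforce their own limit. The function name `simulate_burst` and the assumption that no request finishes during the burst are illustrative only; they are not taken from Uvicorn.

```python
from collections import Counter


def simulate_burst(assignments, per_worker_limit):
    """Return (accepted, rejected) for a burst of long-lived requests.

    assignments[i] is the index of the worker that request i lands on.
    Assumption: no request completes while the burst is arriving.
    """
    in_flight = Counter()
    accepted = rejected = 0
    for worker in assignments:
        if in_flight[worker] < per_worker_limit:
            in_flight[worker] += 1
            accepted += 1
        else:
            # This worker is full, even if other workers still have free slots.
            rejected += 1
    return accepted, rejected


# Two workers with a limit of 2 each (total capacity 4).
# Three of the four requests happen to land on worker 0.
print(simulate_burst([0, 0, 0, 1], per_worker_limit=2))  # (3, 1)
```

One request is rejected although only three of the four total slots are in use, which is the kind of behavior a single shared limit would avoid.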

Possible solutions

NGINX

See #1298

Gunicorn + Uvicorn workers

Consider switching to Gunicorn with Uvicorn workers. Gunicorn is a more full-featured process manager than Uvicorn's built-in one and may balance requests across processes more evenly.
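
As a rough sketch of what this setup could look like, the following `gunicorn.conf.py` selects the Uvicorn worker class. The worker count and bind address are placeholder values, and the import path of the worker class is the one Uvicorn shipped at the time of this issue; newer releases may differ, so check the Uvicorn docs.

```python
# gunicorn.conf.py (Gunicorn config files are plain Python modules)

# Run each worker process as a Uvicorn (ASGI) worker.
worker_class = "uvicorn.workers.UvicornWorker"

# Placeholder values; tune for the actual deployment.
workers = 4
bind = "0.0.0.0:8000"
```

Assuming the ASGI app is importable as `main:app` (a hypothetical module path), this would be started with `gunicorn -c gunicorn.conf.py main:app`.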

Blockers for switching to Gunicorn

Currently there are two features of the Uvicorn process manager that are not supported by Gunicorn:

  • --limit-concurrency is used to respond with 503s when the user-specified concurrency limit is reached. Uvicorn's documentation currently says: "Gunicorn provides a different set of configuration options to Uvicorn, so some options such as --limit-concurrency are not yet supported when running with Gunicorn."
  • It is not currently possible to configure how many threads the Uvicorn worker uses.
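
One possible workaround for the first blocker is to enforce the limit inside the application with a small ASGI middleware. This is only a sketch: `LimitConcurrencyMiddleware` is a hypothetical name, not part of Uvicorn or Gunicorn, and it approximates rather than reproduces what `--limit-concurrency` does.

```python
class LimitConcurrencyMiddleware:
    """Pure ASGI middleware: reply 503 once `limit` HTTP requests are in flight."""

    def __init__(self, app, limit):
        self.app = app
        self.limit = limit
        self.in_flight = 0

    async def __call__(self, scope, receive, send):
        # Pass non-HTTP traffic (lifespan, websockets) straight through.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        if self.in_flight >= self.limit:
            await send({
                "type": "http.response.start",
                "status": 503,
                "headers": [(b"content-type", b"text/plain")],
            })
            await send({"type": "http.response.body", "body": b"Service Unavailable"})
            return

        self.in_flight += 1
        try:
            await self.app(scope, receive, send)
        finally:
            # Release the slot even if the wrapped app raises.
            self.in_flight -= 1
```

Each worker process would hold its own instance of this middleware, so the limit is still applied per process and the fairness problem described above remains.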

For reference, see the Uvicorn changelog.

Add request forwarder sidecar container

A sidecar container could receive all requests, count in-flight requests, enforce max_replica_concurrency, and forward the requests to the application container. Within the application container, we'd use FastAPI on Gunicorn with Uvicorn workers (for worker queue fairness), and configure unlimited backlog and limit-concurrency. Knative does something similar to this.
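
The core bookkeeping of such a forwarder might look roughly like the sketch below. `RequestForwarder` and its `forward` callable are hypothetical stand-ins for the real HTTP proxying, and holding excess requests in a queue (rather than rejecting them) is an assumption, not something this issue specifies.

```python
import asyncio


class RequestForwarder:
    """Single entry point that counts requests and caps forwarded concurrency."""

    def __init__(self, forward, max_replica_concurrency):
        # `forward` is a coroutine function that sends one request to the
        # application container and returns its response (hypothetical).
        self._forward = forward
        self._slots = asyncio.Semaphore(max_replica_concurrency)
        # Requests received but not yet answered (waiting plus forwarded).
        self.in_flight = 0

    async def handle(self, request):
        self.in_flight += 1
        try:
            # Wait for a free slot so that at most max_replica_concurrency
            # requests reach the application container at once.
            async with self._slots:
                return await self._forward(request)
        finally:
            self.in_flight -= 1
```

Because every request passes through one counter, the limit applies to the replica as a whole instead of to each worker process separately. The forwarder should be constructed inside the running event loop.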
