Clarify Gunicorn's advice to use Nginx proxy #3265
Replies: 6 comments 1 reply
-
[5] how does the (accidental or adversarial) DoS through slow clients compare to other DoS mechanisms against nginx, or against the async workers? (i.e. is this an especially easy, or the only realistic, attack?)
-
Plus, now that we have briefly entertained some fantasies about HTTP/2 speeding up the web and may soon see practical, feature-complete implementations of HTTP/3 in widespread browsers, there is a new argument in favour of proxying that did not exist back when this advice was originally written.
-
e. Run your web server on a privileged port (80, 443) as root, while dropping to a less privileged user to run the actual application code
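For context, a minimal sketch of what that could look like using Gunicorn's own `user`/`group` settings; all paths and values below are illustrative assumptions, not taken from this thread:

```python
# gunicorn.conf.py -- illustrative sketch, not a hardened production config.
# Started as root (e.g. `sudo gunicorn -c gunicorn.conf.py myapp:app`) so the
# master can bind the privileged port; worker processes drop to www-data.
bind = "0.0.0.0:443"                 # privileged port, root needed to bind
user = "www-data"                    # workers switch to this unprivileged user
group = "www-data"
certfile = "/etc/ssl/example.crt"    # hypothetical certificate paths
keyfile = "/etc/ssl/example.key"
workers = 4
```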
-
The reason we advise a buffered proxy in the current documentation is that Gunicorn was designed to act as a WSGI server: a gateway that passes HTTP messages to a Python application (along with the CGI variables passed from that server). In other words, Gunicorn was originally designed to run behind a web server, much like a FastCGI server (fpm, for example), and to rely on that HTTP server to handle part of the work: buffering connections when needed, deny/rewrite/alias rules, and so on. Also some minimal DDoS protection (though most DDoS attacks are carried out at the L2/L3 level rather than L7, for obvious reasons). We target Nginx (and today we should update the doc to link to its forks) because even now it is one of the only HTTP servers providing good proxy features, far above the others.
Over time, more features have been added to run Gunicorn as the main web server (originally to help development), and it appears more and more people use it as a standalone server, thinking the services provided on top by k8s or their cloud provider offer the same guarantees (which is not true, but let's set that aside for now). Recent work from @pajod and others has helped make Gunicorn more robust for that purpose and has also made us more HTTP/1.1 compliant (thanks!). When acting as a web server without a proxy in front, it is advised to use an async worker.
We should indeed update the documentation to reflect the above. Would you mind creating a PR for this?
That said, the updated goal of Gunicorn is to provide more features as a gateway for Python applications, letting them receive and send HTTP messages in the modern world. This includes the following developments:
A branch will be created this month to handle this work.
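Along those lines, a minimal standalone (no-proxy) configuration with an async worker might look like the sketch below; the worker class and counts are illustrative assumptions, and gevent has to be installed separately:

```python
# gunicorn.conf.py -- sketch of a standalone (no proxy) deployment with an
# async worker, as advised above; the numbers are illustrative, not tuned.
import multiprocessing

bind = "0.0.0.0:8000"
worker_class = "gevent"   # async worker: a slow client only occupies a greenlet
workers = multiprocessing.cpu_count() * 2 + 1
```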
-
I might just do that, but I can't commit to any particular timeline, TBH.
-
Not mentioned by @benoitc, but IMO within that goal are also applications that have different expectations about the kind of (possibly hostile) traffic they can receive, e.g. because they are not exposed to the internet (they sit behind a firewall).
-
Gunicorn currently opens its deployment advice with a blanket recommendation for a proxy server, particularly Nginx.
The main reason provided is:
However:
It would be nice if the blanket recommendation for Nginx could be broken up into parts, so that the particular trade-offs / risks become clearer.
In particular:
The only reason I could find in the docs to use Nginx is the denial-of-service attack through slow clients. Are there other reasons that Nginx is desirable? If so, it would be nice to list them, such that individuals can decide whether those reasons apply to them.*
Turning off request buffering is an (obvious) requirement when handling requests in a streaming manner (or long polling etc.; see the first sketch below). Given that doing so puts us right back at the DoS that the first line of the docs warns about, can we be sure that "run[ning] with one of the async worker classes" is enough of a mitigation? Why so?† Also: if it is sufficient to run with an async worker, wouldn't it be better to clarify this in the opening line of the document ("run with an async worker, or run with the sync worker behind a proxy")?
More generally, isn't the need for a proxy almost entirely determined by the worker class? It's the worker class that does all of the reading/writing to sockets, which is the weak point w.r.t. DoS, right? E.g. the uvicorn docs state that "Using Nginx as a proxy in front of your Uvicorn processes may not be necessary".
The docs mention Hey as a mechanism to test susceptibility to slow clients, but I cannot find a command-line argument to Hey that makes it specifically act as a slow client. Wouldn't it be better to point to e.g. slowloris.py (or a small purpose-built slow client; see the second sketch below)?
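To make the buffering trade-off above concrete, here is a minimal, hypothetical WSGI app that consumes an upload as it arrives. With proxy request buffering disabled, a sync worker is tied up for the entire duration of a slow upload, whereas an async worker only parks one greenlet/coroutine:

```python
# streaming_upload.py -- minimal WSGI app reading the request body as it
# arrives (requires request buffering to be off at the proxy, if any).
def app(environ, start_response):
    received = 0
    body = environ["wsgi.input"]
    while True:
        chunk = body.read(8192)   # blocks until the client sends more data
        if not chunk:
            break
        received += len(chunk)
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"received %d bytes\n" % received]
```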
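And for the slow-client point, such a client is easy to emulate directly; this is a rough sketch (host, port and timings are made up), not a replacement for slowloris.py:

```python
# slow_client.py -- crude slow-client probe: sends one header byte per second,
# holding the connection (and, with a sync worker, a whole worker) open.
import socket
import time

request = (
    b"GET / HTTP/1.1\r\n"
    b"Host: localhost\r\n"
    b"User-Agent: slow-client-test\r\n"
    b"\r\n"
)

sock = socket.create_connection(("127.0.0.1", 8000))  # adjust host/port
for byte in request:
    sock.sendall(bytes([byte]))
    time.sleep(1)          # trickle the request out one byte at a time
print(sock.recv(4096))     # eventually read whatever the server answers
sock.close()
```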
===
*A partial answer to this question that I can come up with myself is:
a. virtual hosts
b. serving static files
c. SSL configuration (Gunicorn can terminate SSL itself, but perhaps it is not as configurable?)
d. "general hardening" (parsing of HTTP?)
†My guess, based on my reading of the code, is: because they do not block on read, and do not spawn threads/processes per request. But this raises the question: is there anything that could/should be configured, such as the number of connections to accept per worker, or the request timeout (see the settings excerpt below)? All of this should be explained in the docs.
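For what it's worth, the knobs speculated about in the footnote do exist as Gunicorn settings; an illustrative (not recommended-as-is) combination:

```python
# gunicorn.conf.py excerpt -- settings relevant to the footnote's questions;
# the values are placeholders, not recommendations.
worker_class = "gevent"
worker_connections = 1000     # simultaneous connections each async worker accepts
timeout = 30                  # workers silent for longer than this are restarted
keepalive = 5                 # seconds idle keep-alive connections are held open
limit_request_line = 4094     # basic hardening: cap the request-line size
limit_request_fields = 100    # ... and the number of header fields
```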