Clarify Gunicorn's advice to use Nginx proxy #3265
Replies: 6 comments 1 reply
-
[5] how does the (accidental or adversarial) DoS through slow clients compare to other DoS mechanisms against nginx, or against the async workers? (i.e. is this an especially easy, or the only realistic, attack?)
-
Plus, now that we have briefly entertained some fantasies about HTTP/2 speeding up the web and may soon see practical, feature-complete implementations of HTTP/3 in widespread browsers, there is a new argument in favour of proxying that did not exist back when this advice was originally written.
-
e. Run your web server on a privileged port (80, 443) as root, while dropping to a less privileged user to run the actual application code
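For context, a minimal sketch of what that could look like using Gunicorn's own `user`/`group` settings; all paths and values below are illustrative assumptions, not taken from this thread:

```python
# gunicorn.conf.py -- illustrative sketch, not a hardened production config.
# Started as root (e.g. `sudo gunicorn -c gunicorn.conf.py myapp:app`) so the
# master can bind the privileged port; worker processes drop to www-data.
bind = "0.0.0.0:443"                 # privileged port, root needed to bind
user = "www-data"                    # workers switch to this unprivileged user
group = "www-data"
certfile = "/etc/ssl/example.crt"    # hypothetical certificate paths
keyfile = "/etc/ssl/example.key"
workers = 4
```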
-
The reason we advise a buffered proxy in the current documentation is that Gunicorn was designed to act as a WSGI server: a gateway that passes HTTP messages to a Python application (along with the CGI variables passed from that server). In other words, Gunicorn was originally designed to run behind a web server, much like a FastCGI server (fpm, for example), and to rely on that HTTP server to handle part of the work: buffering connections when needed, deny/rewrite/alias rules, and so on. Also some minimal DDoS protection (though most DDoS attacks are carried out at the L2/L3 level rather than L7, for obvious reasons). We target Nginx (and today we should update the doc to link to its forks) because even now it is one of the only HTTP servers providing good proxy features, far above the others.
Over time, more features have been added to run Gunicorn as the main web server (originally to help development), and it appears more and more people use it as a standalone server, thinking the services provided on top by k8s or their cloud provider offer the same guarantees (which is not true, but let's set that aside for now). Recent work from @pajod and others has helped make Gunicorn more robust for that purpose and has also made us more HTTP/1.1 compliant (thanks!). When acting as a web server without a proxy in front, it is advised to use an async worker.
We should indeed update the documentation to reflect the above. Would you mind creating a PR for this?
That said, the updated goal of Gunicorn is to provide more features as a gateway for Python applications, letting them receive and send HTTP messages in the modern world. This includes the following developments:
A branch will be created this month to handle this work.
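Along those lines, a minimal standalone (no-proxy) configuration with an async worker might look like the sketch below; the worker class and counts are illustrative assumptions, and gevent has to be installed separately:

```python
# gunicorn.conf.py -- sketch of a standalone (no proxy) deployment with an
# async worker, as advised above; the numbers are illustrative, not tuned.
import multiprocessing

bind = "0.0.0.0:8000"
worker_class = "gevent"   # async worker: a slow client only occupies a greenlet
workers = multiprocessing.cpu_count() * 2 + 1
```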
-
I might just do that, but I can't commit to any particular timeline, TBH.
-
Not mentioned by @benoitc, but IMO within that goal are also applications that have different expectations about the kind of (possibly hostile) traffic they can receive, e.g. because they are not exposed to the internet (they sit behind a firewall).
-
Gunicorn currently opens its deployment advice with a blanket recommendation for a proxy server, particularly Nginx.
The main reason provided is:
However:
It would be nice if the blanket recommendation for Nginx could be broken up into parts, so that the particular trade-offs / risks become clearer.
In particular:
The only reason I could find in the docs to use Nginx is the denial-of-service attack through slow clients. Are there other reasons that Nginx is desirable? If so, it would be nice to list them, such that individuals can decide whether those reasons apply to them.*
Turning off request buffering is an (obvious) requirement when handling requests in a streaming manner (or long polling etc.; see the first sketch below). Given that doing so puts us right back at the DoS that the first line of the docs warns about, can we be sure that "run[ning] with one of the async worker classes" is enough of a mitigation? Why so?† Also: if it is sufficient to run with an async worker, wouldn't it be better to clarify this in the opening line of the document ("run with an async worker, or run with the sync worker behind a proxy")?
More generally, isn't the need for a proxy almost entirely determined by the worker class? It's the worker class that does all of the reading/writing to sockets, which is the weak point w.r.t. DoS, right? E.g. the uvicorn docs state that "Using Nginx as a proxy in front of your Uvicorn processes may not be necessary".
The docs mention Hey as a mechanism to test susceptibility to slow clients, but I cannot find a command-line argument to Hey that makes it specifically act as a slow client. Wouldn't it be better to point to e.g. slowloris.py (or a small purpose-built slow client; see the second sketch below)?
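To make the buffering trade-off above concrete, here is a minimal, hypothetical WSGI app that consumes an upload as it arrives. With proxy request buffering disabled, a sync worker is tied up for the entire duration of a slow upload, whereas an async worker only parks one greenlet/coroutine:

```python
# streaming_upload.py -- minimal WSGI app reading the request body as it
# arrives (requires request buffering to be off at the proxy, if any).
def app(environ, start_response):
    received = 0
    body = environ["wsgi.input"]
    while True:
        chunk = body.read(8192)   # blocks until the client sends more data
        if not chunk:
            break
        received += len(chunk)
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"received %d bytes\n" % received]
```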
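And for the slow-client point, such a client is easy to emulate directly; this is a rough sketch (host, port and timings are made up), not a replacement for slowloris.py:

```python
# slow_client.py -- crude slow-client probe: sends one header byte per second,
# holding the connection (and, with a sync worker, a whole worker) open.
import socket
import time

request = (
    b"GET / HTTP/1.1\r\n"
    b"Host: localhost\r\n"
    b"User-Agent: slow-client-test\r\n"
    b"\r\n"
)

sock = socket.create_connection(("127.0.0.1", 8000))  # adjust host/port
for byte in request:
    sock.sendall(bytes([byte]))
    time.sleep(1)          # trickle the request out one byte at a time
print(sock.recv(4096))     # eventually read whatever the server answers
sock.close()
```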
===
*A partial answer to this question that I can come up with myself is:
a. virtual hosts
b. serving static files
c. SSL configuration (Gunicorn can terminate SSL itself, but perhaps it is not as configurable?)
d. "general hardening" (parsing of HTTP?)
†My guess, based on my reading of the code, is: because they do not block on read, and do not spawn threads/processes per request. But this raises the question: is there anything that could/should be configured, such as the number of connections to accept per worker, or the request timeout (see the settings excerpt below)? All of this should be explained in the docs.
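For what it's worth, the knobs speculated about in the footnote do exist as Gunicorn settings; an illustrative (not recommended-as-is) combination:

```python
# gunicorn.conf.py excerpt -- settings relevant to the footnote's questions;
# the values are placeholders, not recommendations.
worker_class = "gevent"
worker_connections = 1000     # simultaneous connections each async worker accepts
timeout = 30                  # workers silent for longer than this are restarted
keepalive = 5                 # seconds idle keep-alive connections are held open
limit_request_line = 4094     # basic hardening: cap the request-line size
limit_request_fields = 100    # ... and the number of header fields
```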