fix connection reset when over max content size #2051
Conversation
I think this could be a Firefox bug, as I'm fairly sure it is valid to send a response at any point after the headers have been received, and Chrome seems happy with the early response. I worry this approach commits Flask servers to waiting for and reading request data it has no use for - which seems to reduce a denial of service mitigation.
Fairly sure Werkzeug assumes this is the responsibility of the WSGI server layer, similarly to ensuring that input is terminated. It's not clear in the spec. Is this something Gunicorn can address more safely/efficiently? What do other WSGI servers do? What do ASGI servers do? That said, there have been other requests related to this, and it seems like a special case where the application is deliberately reading less than the content length. If we were to fix this, I'd rather get the work done on #1513 instead of adding specific behavior to form parsing.
+1 to @pgjones' concern that this increases vulnerability to denial-of-service attacks. How easy would this make it for a malicious client to get Werkzeug to spend arbitrary amounts of its time exhausting huge bodies of bogus requests, instead of servicing real requests from good clients? Is it possible this is (at least in part) Firefox's fault? What does Chrome do? What does curl do? For example, when sending PUT requests (or POST requests larger than 1024 bytes), curl by default does the `Expect: 100-continue` handshake, waiting for the server's go-ahead before sending the body. Also, doesn't this duplicate some functionality we already have?
@asottile I just realized your example uses Gunicorn with its default sync worker, but doesn't put Gunicorn behind a reverse proxy, which is not how the sync worker is meant to be deployed. Can you please test this again, passing a different worker class such as `--worker-class gthread`?
werkzeug doesn't -- or at least there are other cases where werkzeug indicates intent of exhausting the input stream: werkzeug/src/werkzeug/formparser.py, line 268 (at 865b19c)
this is already the case in the two methods decorated with it. note also that putting this behind a reverse proxy will cause the reverse proxy to consume the entire input and properly, cleanly close the connection -- so it has to happen somewhere
it does, as well as the related api
this only """fixes""" the problem because gunicorn acts as a reverse proxy in the gthread version -- it consumes the entire input stream itself before sending the response
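To illustrate the "it has to happen somewhere" point above, here is a minimal sketch (the `exhaust` name is hypothetical, not Werkzeug's actual code) of draining a leftover request body in fixed-size chunks; `io.BytesIO` stands in for `environ["wsgi.input"]`:

```python
import io


def exhaust(stream, chunk_size=64 * 1024):
    """Read and discard everything remaining on a file-like stream,
    returning the number of bytes thrown away."""
    discarded = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        discarded += len(chunk)
    return discarded


# io.BytesIO stands in for environ["wsgi.input"] here.
body = io.BytesIO(b"x" * 150_000)
body.read(8192)       # the app only read the first 8 KiB
print(exhaust(body))  # 141808 -- the rest is drained and discarded
```

Whether this loop runs in the server, a proxy, or the framework is exactly what the thread is debating; the drain itself is this simple.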
I'm not completely opposed to handling this in Werkzeug, but the existing support is inconsistent, see #1513. As you point out, the WSGI server or proxy can be responsible for consuming any remaining input, and I think that's probably the safer place to handle it. I don't think this is enabling a DOS any more than existing behavior, because we already only act on the stream if we know it's safe.
So currently, you can avoid this issue by using a WSGI server worker or HTTP proxy that exhausts the stream. If we want to address it more in Werkzeug, #1513 is where to go.
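As a sketch of handling this in a wrapper rather than in form parsing, a hypothetical WSGI middleware (not part of Werkzeug; all names here are made up) could drain whatever is left of `wsgi.input` after the wrapped app has produced its response:

```python
import io


def drain_input(app, chunk_size=64 * 1024):
    """Hypothetical WSGI middleware: after the wrapped app yields its
    response, read and discard the remaining request body so the
    connection can be closed (or reused) cleanly."""
    def wrapped(environ, start_response):
        yield from app(environ, start_response)
        stream = environ.get("wsgi.input")
        if stream is not None:
            while stream.read(chunk_size):
                pass
    return wrapped


# Tiny demo app that rejects the upload without reading the body.
def demo_app(environ, start_response):
    start_response("413 Request Entity Too Large",
                   [("Content-Type", "text/plain")])
    yield b"too big"


environ = {"wsgi.input": io.BytesIO(b"x" * 100_000)}
response = b"".join(drain_input(demo_app)(environ, lambda s, h: None))
print(response)                      # b'too big'
print(environ["wsgi.input"].read())  # b'' -- body fully consumed
```

Note this sketch drains unconditionally, which is the DoS trade-off discussed above: a server-level implementation would want a cap on how much it is willing to discard.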
I think I'll merge this, but it will probably become obsolete when we come up with a good solution for #1513. I don't want to block 2.0 on that, so this at least makes the behavior consistent for now.
the current version of werkzeug fails to exhaust the input stream when it detects that the content exceeds the configured max size. when run with gunicorn, the stream is never exhausted which leads to firefox (and others) reporting a connection reset as they receive the response without the body being uploaded.

here's a sample application:

```python
from flask import Flask, request, redirect, url_for


class Config:
    SECRET_KEY = 'foo'
    MAX_CONTENT_LENGTH = 1024 * 1024 * 1


app = Flask(__name__)
app.config.from_object(Config)


@app.route('/')
def index():
    return '''\
<!doctype html>
<html>
  <head>
    <title>File Upload</title>
  </head>
  <body>
    <h1>File Upload</h1>
    <form method="POST" action="" enctype="multipart/form-data">
      <p><input type="file" name="file"></p>
      <p><input type="submit" value="Submit"></p>
    </form>
  </body>
</html>
'''


@app.route('/', methods=['POST'])
def upload_file():
    request.files['file']
    print('uploaded!')
    return redirect(url_for('index'))
```

when run with gunicorn:

```bash
strace -f gunicorn --bind 127.0.0.1:8080 app:app --access-logfile - 2> log
```

the strace indicates the following happens:

```
[pid 24372] recvfrom(9, "POST / HTTP/1.1\r\nHost: localhost"..., 8192, 0, NULL, NULL) = 8192
...
[pid 24372] sendto(9, "HTTP/1.1 413 REQUEST ENTITY TOO "..., 183, 0, NULL, 0) = 183
[pid 24372] sendto(9, "<!DOCTYPE HTML PUBLIC \"-//W3C//D"..., 196, 0, NULL, 0) = 196
```

in this, werkzeug reads the first 8KB of the request, but then never any more and starts sending the response. after the fix, werkzeug reads the entire input (and discards it) before sending a response
I'm working on #1513 about handling this more generally. I don't think the "connection reset" behavior is incorrect in this case, and it's not clear how either Werkzeug or Gunicorn could otherwise handle it. If the app sets a max content length of 1MB, then the intention is "don't read more than 1MB". If a 10MB file is uploaded, it shouldn't be read past 1MB; that would defeat the purpose of the limit. You'd also get this if you crafted a request that sent more data than its content length indicated, since we don't read past that. Nginx's docs describe similar behavior for its body size limit.
See #2620. The application never exhausts the stream anymore, and will only read up to `max_content_length`.
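The "never read past the limit" intent described above can be sketched with a toy wrapper (hypothetical; Werkzeug's real mechanism is its `LimitedStream`, whose details differ):

```python
import io


class CappedStream:
    """Toy sketch of a length-limited stream: reads stop at `limit`
    bytes no matter how much data the underlying stream holds."""

    def __init__(self, stream, limit):
        self._stream = stream
        self._remaining = limit

    def read(self, size=-1):
        if self._remaining <= 0:
            return b""
        if size < 0 or size > self._remaining:
            size = self._remaining
        data = self._stream.read(size)
        self._remaining -= len(data)
        return data


# A 10 MB body behind a 1 MB cap: the extra 9 MB is never read.
capped = CappedStream(io.BytesIO(b"x" * (10 * 1024 * 1024)), 1024 * 1024)
print(len(capped.read()))  # 1048576
print(capped.read())       # b'' -- the limit is exhausted
```

This is exactly why the excess bytes stay on the socket: the cap means they are never pulled off the wire, so something outside the app must either drain them or close the connection.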
Checklist:

- Add an entry in CHANGES.rst summarizing the change and linking to the issue.
- Add .. versionchanged:: entries in any relevant code docs. N/A
- Run pre-commit hooks and fix any issues.
- Run pytest and tox, no tests failed.