-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix/http desync #2181
Fix/http desync #2181
Conversation
@benoitc all yours! |
gunicorn/http/message.py
Outdated
content_length = value | ||
elif name == "TRANSFER-ENCODING": | ||
chunked = value.lower() == "chunked" | ||
normalized_value = value.lower() | ||
if normalized_value == "identity": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gunicorn is a gateway so it shouldn't have a strong opinion about the transfer encoding. I think it's better to only test if the transfer encoding header is chuned or not.
if value.lower() == "chunked":
chunked = True
and let it pass for other cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This removes the need of UnsupportedTransferEncoding
also
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your proposal can still be attacked by the following desync:
Content-Length: 5
Transfer-Encoding: chunked
Transfer-Encoding: Identity
Most proxy will use the chunked, but we will use the CL.
My understanding of Transfer-Encoding is that its a hop-by-hop header, meaning that we should parse it, use it and strip it before passing the data down the line.
For example, AWS ALB will not accept anything that is not identity or chunked. If you read my improvements section above, I suggest that we do something similar to other WSGI server: parse the request to the best of our ablity and only transfer data to the application using a CL. We do buffer the data in this application anyway and we don't support gzip, compress and deflate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reading a bit more about PEP333. It says: WSGI servers must handle any supported inbound "hop-by-hop" headers on their own, such as by decoding any inbound Transfer-Encoding, including chunked encoding if applicable.
There is a great thread here to read: pallets/flask#367
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gunicorn has a long history of supporting chunked encoding and last 20 is having the support of wsgi.input_terminated
, i'm not inclined to change this actually.
if I understand your concern, in the case of your example
Content-Length: 5
Transfer-Encoding: chunked
Transfer-Encoding: Identity
TE should be Identity and not chunked? Or do we want it to be chunked (which is really weird).
My main concern is to make sure we support the last TE given whatever it is (there are other cases than "Identity" in some apis), so we must not return an error, if it's different.
Thanks for the patch! Seems to handle the case. Can you do th erequested changes? It would be good to make a release today. |
This is a first try at fixing #2176
I have looked at multiple server implementation and frankly I have seen a lot of different things especially concerning the TE parsing.
In summary the fixes are:
No whitespace is allowed between the header field-name and colon
duplicate Content-Length header fields have been generated or combined by an upstream message processor, then the recipient MUST either reject the message as invalid or replace the duplicated field-values with a single valid Content-Length
identity
andchunked
Transport-EncodingFuture improvements:
Dict[str, List[str]]
instead of aList[Tuple[str, str]]
for headers