Fix/http desync #2181

Sytten · 2019-11-19T03:41:35Z

This is a first try at fixing #2176
I have looked at multiple server implementation and frankly I have seen a lot of different things especially concerning the TE parsing.
In summary the fixes are:

Don't rstrip the names of headers
- Go server does that
- RFC 7230 says No whitespace is allowed between the header field-name and colon
Only accept one Content-Length
- Node does that
- RFC 7230 says duplicate Content-Length header fields have been generated or combined by an upstream message processor, then the recipient MUST either reject the message as invalid or replace the duplicated field-values with a single valid Content-Length
- It's easier to just deny that case than try to handle it and most likely induce a vulnerability
Only accept identity and chunked Transport-Encoding
- In this implementation, the order does not matter (it probably should). The Go implementation only uses the first value of the header.
- Seems to be in sync with the behaviour of AWS ALB
- All other valid (gzip, compress, etc.) and invalid TE will return a 501, since we don't have readers for them I figured this was the right move, but feel free to correct me

Future improvements:

Handle the duplicates of headers earlier in the parsing
Use a Dict[str, List[str]] instead of a List[Tuple[str, str]] for headers
Drop the chunked TE for WSGI to app, this is what Waitress decided to do here and I think it's a good idea since we already buffer the whole thing in the server anyway

Sytten · 2019-11-19T23:05:11Z

@benoitc all yours!

benoitc · 2019-11-20T08:30:20Z

gunicorn/http/message.py

                content_length = value
            elif name == "TRANSFER-ENCODING":
-                chunked = value.lower() == "chunked"
+                normalized_value = value.lower()
+                if normalized_value == "identity":


Gunicorn is a gateway so it shouldn't have a strong opinion about the transfer encoding. I think it's better to only test if the transfer encoding header is chuned or not.

if value.lower() == "chunked": chunked = True

and let it pass for other cases

This removes the need of UnsupportedTransferEncoding also

Your proposal can still be attacked by the following desync:

Content-Length: 5 Transfer-Encoding: chunked Transfer-Encoding: Identity

Most proxy will use the chunked, but we will use the CL.
My understanding of Transfer-Encoding is that its a hop-by-hop header, meaning that we should parse it, use it and strip it before passing the data down the line.
For example, AWS ALB will not accept anything that is not identity or chunked. If you read my improvements section above, I suggest that we do something similar to other WSGI server: parse the request to the best of our ablity and only transfer data to the application using a CL. We do buffer the data in this application anyway and we don't support gzip, compress and deflate.

reading a bit more about PEP333. It says: WSGI servers must handle any supported inbound "hop-by-hop" headers on their own, such as by decoding any inbound Transfer-Encoding, including chunked encoding if applicable. There is a great thread here to read: pallets/flask#367

Gunicorn has a long history of supporting chunked encoding and last 20 is having the support of wsgi.input_terminated, i'm not inclined to change this actually.

if I understand your concern, in the case of your example

Content-Length: 5 Transfer-Encoding: chunked Transfer-Encoding: Identity

TE should be Identity and not chunked? Or do we want it to be chunked (which is really weird).

My main concern is to make sure we support the last TE given whatever it is (there are other cases than "Identity" in some apis), so we must not return an error, if it's different.

gunicorn/http/message.py

benoitc · 2019-11-20T08:35:48Z

Thanks for the patch! Seems to handle the case. Can you do th erequested changes? It would be good to make a release today.

Sytten added 3 commits November 18, 2019 19:44

Remove default strip of header name

b798412

Handle duplicate content-length

bd8670b

Handle multiple transfer-encoding

f74324b

benoitc self-assigned this Nov 19, 2019

benoitc added bug :( Feature/Http top priority working on it :) labels Nov 19, 2019

benoitc requested changes Nov 20, 2019

View reviewed changes

benoitc assigned Sytten Nov 20, 2019

Remove strict check of Transfer-Encoding

ddf5e66

Sytten mentioned this pull request Nov 20, 2019

HTTP desync attack follow up pallets/werkzeug#1667

Closed

Add syttent to THANKS

be51323

benoitc merged commit c5be1ba into benoitc:master Nov 20, 2019

ZeddYu mentioned this pull request Dec 9, 2019

Non-proper handling of Content-Length and Transfer-Encoding: chunked headers netty/netty#9861

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix/http desync #2181

Fix/http desync #2181

Sytten commented Nov 19, 2019

Sytten commented Nov 19, 2019

benoitc Nov 20, 2019

benoitc Nov 20, 2019

Sytten Nov 20, 2019

Sytten Nov 20, 2019 •

edited

Loading

benoitc Nov 20, 2019

benoitc commented Nov 20, 2019

Fix/http desync #2181

Fix/http desync #2181

Conversation

Sytten commented Nov 19, 2019

Sytten commented Nov 19, 2019

benoitc Nov 20, 2019

Choose a reason for hiding this comment

benoitc Nov 20, 2019

Choose a reason for hiding this comment

Sytten Nov 20, 2019

Choose a reason for hiding this comment

Sytten Nov 20, 2019 • edited Loading

Choose a reason for hiding this comment

benoitc Nov 20, 2019

Choose a reason for hiding this comment

benoitc commented Nov 20, 2019

Sytten Nov 20, 2019 •

edited

Loading