Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-ASCII headers are handled incorrectly by CurlAsyncHTTPClient #1608

Closed
Sjord opened this issue Dec 17, 2015 · 0 comments
Closed

Non-ASCII headers are handled incorrectly by CurlAsyncHTTPClient #1608

Sjord opened this issue Dec 17, 2015 · 0 comments

Comments

@Sjord
Copy link

Sjord commented Dec 17, 2015

If a server returns a non-ASCII character in a header, the curl client should still handle the response, but it raises an UnicodeDecodeError instead.

Code:

yield CurlAsyncHTTPClient().fetch('https://httpbin.org/response-headers?X-Foo=bl%C3%A9')

Got:

Traceback (most recent call last):
  File "/home/sjoerd/.virtualenvs/protomon/lib/python3.4/site-packages/tornado/curl_httpclient.py", line 453, in _curl_header_callback
    header_line = native_str(header_line)
  File "/home/sjoerd/.virtualenvs/protomon/lib/python3.4/site-packages/tornado/escape.py", line 222, in to_unicode
    return value.decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 9: invalid continuation byte

RFC 7230 says:

Historically, HTTP has allowed field content with text in the ISO-8859-1 charset, supporting other charsets only through use of RFC2047 encoding. In practice, most HTTP header field values use only a subset of the US-ASCII charset. Newly defined header fields SHOULD limit their field values to US-ASCII octets. A recipient SHOULD treat other octets in field content (obs-text) as opaque data.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant