invalid character in header #5269
At a glance, this looks like it may be a duplicate of #5220.
@zhaoxuan this needs a reproducer. Without it, we'll have no idea how to trigger this behavior and won't even be able to confirm whether it's a bug.
@webknjaz it was weird, I added some log in |
@zhaoxuan we need to have a simple aiohttp app that reproduces the issue posted here. Even if it's a "hello world". The simpler the better, with no deps other than aiohttp. P.S. You cannot log what |
@webknjaz maybe we could add debug-level logging to |
I'm curious to see the concrete proposal for debug logging.
@webknjaz My app contains IO operations (MySQL, Redis, Memcache), so a simple aiohttp app cannot reproduce this issue. I am not sure it would work.
There are still unresolved issues for this error.
@zhaoxuan we need a reproducer without aiocache.
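For what it's worth, a reproducer does not need the real backend services. A stdlib-only sketch (port and payload are made up for illustration) can stand in for a broken server that emits a NUL byte inside a header line; pointing an aiohttp `ClientSession.get()` at the same address is what should then raise the "invalid character in header" error:

```python
import socket
import threading

# A minimal fake HTTP server whose Date header starts with a NUL byte --
# the kind of invalid character the strict parser rejects.
RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"\x00Date: Mon, 01 Jan 2024 00:00:00 GMT\r\n"
    b"Content-Length: 2\r\n"
    b"\r\n"
    b"ok"
)

def serve_once(sock):
    conn, _ = sock.accept()
    conn.recv(1024)          # read (and ignore) the request
    conn.sendall(RESPONSE)
    conn.close()

srv = socket.socket()
srv.bind(("127.0.0.1", 0))   # any free port
srv.listen(1)
port = srv.getsockname()[1]
threading.Thread(target=serve_once, args=(srv,), daemon=True).start()

# Fetch the raw response with a plain socket just to show the bad byte;
# replacing this part with an aiohttp client call should trigger the bug.
cli = socket.create_connection(("127.0.0.1", port))
cli.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
chunks = []
while (chunk := cli.recv(4096)):
    chunks.append(chunk)
cli.close()
raw = b"".join(chunks)
print(b"\x00" in raw)  # the invalid header byte is present
```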
I do not know if it is related, but I got the same error with a public server. I want to download all images as part of a MyHeritage family tree migration. After extracting all the URLs, I want to use the aiohttp client to download them. Here is a simplified example with only one (real) URL:

```python
import asyncio
import aiohttp

async def adown(session, url):
    async with session.get(url) as resp:
        if resp.status == 200:
            await resp.read()  # was: response.read() -- a NameError, the variable is resp
        return url  # the original returned (url, destf); destf was left over from the unsimplified version

async def main(urls):
    tasks = []
    async with aiohttp.ClientSession() as http:
        for url in urls:
            tasks.append(asyncio.create_task(adown(http, url)))
        for result in await asyncio.gather(*tasks, return_exceptions=True):
            print(result, type(result))

URL = "https://www.myheritageimages.com/P/storage/site181607832/files/50/10/16/501016_030952g968apad0ff5615z.jpg"
asyncio.run(main([URL]))
```

With result (original in one line):
I can download all these URLs with wget; it reports these headers (for this one):
Debian testing, Python 3.9.2, aiohttp 3.7.4
I had such an error when the header was \x00Date: ...
Can confirm that the solution stated here works: #5355 (comment). Basically: it seems that the underlying C HTTP parsing library might be broken.
I've seen this solution too and tried looking into it.
As far as I can tell, no functionality is lost. Nonetheless, performance is degraded as mentioned here: https://github.com/aio-libs/aiohttp/blob/09ac1cbb4c07b4b3e1872374808512c9bc52f2ab/CHANGES/3828.feature Using super dumb timeit tests I've seen a drop of about 30% in performance.

```python
import asyncio
import os

# Must be set before aiohttp is imported; note that any non-empty value
# disables the C extensions, even '0'.
os.environ.setdefault('AIOHTTP_NO_EXTENSIONS', '0')

from aiohttp import ClientSession

url = 'https://www.java.com/'

async def dorequest(url):
    async with ClientSession() as session:
        async with session.get(url) as response:
            return await response.text()

def main():
    result = asyncio.run(dorequest(url))

if __name__ == '__main__':
    import timeit
    print(timeit.timeit("main()", setup="from __main__ import main", number=150))
```
This works for me if I set it as an environment variable. However, it does not matter whether I set |
The HTTP parser is the bottleneck in web apps, so it's implemented as an optional C extension, most of which comes from | This is the reason for the performance drop.
Has anyone found a workaround for this? I am making a request to an API that I myself cannot fix, but it works fine with the requests lib. How could the "invalid character in header" error be fixed?
Set |
I read somewhere that it can lead to a 30% performance drop, and that is a significant drop, so it is not a solution at all, only a workaround for simple cases.
Man, you have a broken HTTP server. Slow processing is better than nothing in such a case, isn't it?
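On the "does not matter whether I set it to 0 or 1" observation: as far as I can tell, aiohttp only checks whether the environment variable is a non-empty string, not what its value is. A small sketch mirroring that check (the helper name here is hypothetical, not aiohttp API):

```python
import os

def no_extensions_enabled(env):
    # aiohttp effectively does bool(os.environ.get("AIOHTTP_NO_EXTENSIONS")):
    # any non-empty string disables the C extensions, so "0" and "1" behave the same.
    return bool(env.get("AIOHTTP_NO_EXTENSIONS"))

print(no_extensions_enabled({"AIOHTTP_NO_EXTENSIONS": "1"}))  # True
print(no_extensions_enabled({"AIOHTTP_NO_EXTENSIONS": "0"}))  # True -- still disabled!
print(no_extensions_enabled({}))                              # False
```

Also note the variable has to be set before aiohttp is first imported, since the check happens at import time.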
Encountered this at work. There are a few things which make this particularly frustrating. First, the |
Second, I had to hack aiohttp to even be able to print out the raw response which was causing the issue.
Third, even after getting the raw response, it was very difficult to find what was actually wrong in it. After quite some time, I found a Set-Cookie header with a |
Figuring out this information should be much quicker, so we can report the problems to the server owner. For servers where this is not possible, there should probably also be a way to discard any headers with invalid characters and continue parsing the rest of the response. I think changing the 400 output of the error message could be trivially fixed by us, but everything else may need some changes in llhttp.
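Until the error output improves, a small helper (hypothetical, not part of aiohttp) can scan a raw header block for bytes outside the printable ASCII range and point at the offender by line and column:

```python
def find_invalid_header_bytes(raw_headers: bytes):
    """Report (line, column, byte) for header bytes a strict parser would reject.

    Header lines may contain horizontal tab and printable ASCII; anything
    else (NUL, other control characters, raw 8-bit bytes) is what typically
    produces "invalid character in header".
    """
    problems = []
    for lineno, line in enumerate(raw_headers.split(b"\r\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte != 0x09 and not (0x20 <= byte <= 0x7E):
                problems.append((lineno, col, bytes([byte])))
    return problems

# Example: a Set-Cookie header with an embedded NUL, like the one described above
hdrs = b"Content-Type: text/html\r\nSet-Cookie: a=b\x00c"
print(find_invalid_header_bytes(hdrs))  # [(2, 16, b'\x00')]
```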
Are you setting the lenient headers flag?
I'm rather short on time, so if anybody wants to pick this up and try setting the lenient option, that would be great.
I simply want to share that adding |
Because it then doesn't use llhttp, which means it is a lot slower. So, not recommended for production.
I've added some information to the exceptions to highlight which character is causing a parse error (in 3.8.5).
After reviewing this again, it should be safe to enable lenient headers in the response parser (i.e. client side only). This should be enabled in the next release (#7490). The parser will revert to being strict when using dev mode ( |
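For anyone relying on the strict behavior during development: Python's dev mode is enabled with the `-X dev` interpreter flag (or `PYTHONDEVMODE=1`), and its state can be verified from within the process:

```shell
# Dev mode turns strict parsing back on; verify it is active:
python -X dev -c "import sys; print(sys.flags.dev_mode)"
```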
🐞 Describe the bug
💡 To Reproduce
I do not have any idea about it; the CPU pressure was not high.
📋 Your version of the Python
Python == 3.6.9
📋 Your version of the aiohttp/yarl/multidict distributions