Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

do not use readline when reading the content of a part in the multipart reader #1535

Merged
merged 3 commits into from
Jan 26, 2017

Conversation

arthurdarcet
Copy link
Contributor

multipart uploads to an aiohttp server with a "large" file containing no new line fails with a ValueError('Line is too long'):

import aiohttp
import aiohttp.web
import asyncio


async def view(request):
	reader = await request.multipart()
	async for part in reader:
		v = await part.read(decode=True)
		print(part, len(v))
	return aiohttp.web.Response(text="OK")

async def server(loop):
	app = aiohttp.web.Application(loop=loop)
	app.router.add_post('/', view)
	handler = app.make_handler()
	await loop.create_server(handler, '0.0.0.0', 1111)

async def main(loop):
	srv = await server(loop)
	async with aiohttp.post('http://0.0.0.0:1111', data={'val':b'0' * (2 ** 16 - 100)}) as resp:
		print('--->', resp.status)
	print('--')
	async with aiohttp.post('http://0.0.0.0:1111', data={'val':b'0' * (2 ** 16 + 1)}) as resp:
		print('--->', resp.status)

loop = asyncio.get_event_loop()
loop.run_until_complete(main(loop))
<aiohttp.multipart.BodyPartReader object at 0x109c10f28> 65436
---> 200
--
Error handling request
Traceback (most recent call last):
  File "/Users/arthur/Documents/reaaad/lib/aiohttp/aiohttp/web_server.py", line 61, in handle_request
    resp = yield from self._handler(request)
  File "/Users/arthur/Documents/reaaad/lib/aiohttp/aiohttp/web.py", line 268, in _handle
    resp = yield from handler(request)
  File "poc.py", line 9, in view
    v = await part.read(decode=True)
  File "/Users/arthur/Documents/reaaad/lib/aiohttp/aiohttp/multipart.py", line 257, in read
    data.extend((yield from self.readline()))
  File "/Users/arthur/Documents/reaaad/lib/aiohttp/aiohttp/multipart.py", line 351, in readline
    line = yield from self._content.readline()
  File "/Users/arthur/Documents/reaaad/lib/aiohttp/aiohttp/streams.py", line 517, in wrapper
    result = yield from func(self, *args, **kw)
  File "/Users/arthur/Documents/reaaad/lib/aiohttp/aiohttp/streams.py", line 587, in readline
    return (yield from super().readline())
  File "/Users/arthur/Documents/reaaad/lib/aiohttp/aiohttp/streams.py", line 251, in readline
    raise ValueError('Line is too long')
ValueError: Line is too long
---> 500

This is because the StreamReader class only accepts lines up to 64KB.
I've updated the MultipartReader class to avoid readline when it doesn't really make sense, and to use read_chunk instead.

As far as I can see, it seems to work, but that might not be the right approach to solve this at all…

@arthurdarcet arthurdarcet force-pushed the multipart-long-line branch 2 times, most recently from fb441fa to f8bddd9 Compare January 9, 2017 16:04
@fafhrd91
Copy link
Member

fafhrd91 commented Jan 9, 2017

@kxepal
you are very popular :)

@codecov-io
Copy link

codecov-io commented Jan 9, 2017

Current coverage is 98.72% (diff: 100%)

Merging #1535 into master will decrease coverage by 0.20%

@@             master      #1535   diff @@
==========================================
  Files            30         30          
  Lines          6995       6989     -6   
  Methods           0          0          
  Messages          0          0          
  Branches       1169       1165     -4   
==========================================
- Hits           6920       6900    -20   
- Misses           37         53    +16   
+ Partials         38         36     -2   

Powered by Codecov. Last update 1d0750e...03101a0

@kxepal
Copy link
Member

kxepal commented Jan 9, 2017

@arthurdarcet
Good catch!

@fafhrd91
Ahaha! Indeed. As ours multipart feature (:

@@ -252,12 +252,8 @@ def read(self, *, decode=False):
if self._at_eof:
return b''
data = bytearray()
if self._length is None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@@ -501,6 +499,17 @@ def test_filename(self):
None)
self.assertEqual('foo.html', part.filename)

def test_reading_long_part(self):
# should stay larger than aiohttp.streams.DEFAULT_LIMIT
size = 2 ** 18
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this value be picked from aiohttp.streams.DEFAULT_LIMIT to be in sync?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, i've updated the commit

@@ -255,14 +256,13 @@ def test_read_chunk_properly_counts_read_bytes(self):
self.assertEqual(b'.' * size, result)
self.assertTrue(obj.at_eof())

def test_read_does_reads_boundary(self):
def test_read_does_not_read_boundary(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, actually test reads the boundary (: In previous case it just was placed into unread buffer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure, but i think it reads the boundary then put in back in the stream (with unread_data) ; so it's as if it wasn't read

@arthurdarcet arthurdarcet force-pushed the multipart-long-line branch 3 times, most recently from 14f6fed to 279c009 Compare January 10, 2017 09:04
@fafhrd91
Copy link
Member

what is status of this ticket?

@fafhrd91
Copy link
Member

@kxepal ?

@kxepal
Copy link
Member

kxepal commented Jan 26, 2017

@fafhrd91 Sorry, a bit busy times. I have all the tickets you ping me stared in by inbox, will check them in few hours.

@fafhrd91
Copy link
Member

thanks!

@kxepal
Copy link
Member

kxepal commented Jan 26, 2017

+1

@fafhrd91 fafhrd91 merged commit 50f9808 into aio-libs:master Jan 26, 2017
@arthurdarcet arthurdarcet deleted the multipart-long-line branch March 6, 2017 11:21
@lock
Copy link

lock bot commented Oct 29, 2019

This thread has been automatically locked since there has not been
any recent activity after it was closed. Please open a new issue for
related bugs.

If you feel like there's important points made in this discussion,
please include those exceprts into that new issue.

@lock lock bot added the outdated label Oct 29, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 29, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants