Multipart feature #273

Merged (5 commits, Mar 11, 2015)

Conversation

@kxepal (Member) commented Feb 12, 2015

I would like to contribute the multipart feature from my aiocouchdb project.

This PR introduces the following things:

  • Multipart reader. It was designed with stream processing in mind and with the ability to skip body parts that don't need to be processed, at almost no memory cost. It works like this:
import aiohttp
from aiohttp.multipart import MultipartReader

resp = yield from aiohttp.request(...)

# First, we wrap the response via a special classmethod of MultipartReader.
# This keeps the MultipartReader implementation separate from the response
# and connection routines, which keeps it portable.
reader = MultipartReader.from_response(resp)

while True:
    # Fetch the next part.
    part = yield from reader.next()
    # The returned BodyPartReader instance holds only the headers and the
    # response stream positioned at its content.

    if part is None:
        # No more parts left to process; break out.
        break

    # filename is a special property that automagically extracts the filename
    # parameter from the Content-Disposition header. You could also do it yourself
    # through the part.headers CIMultiDict, but believe me, the logic around this
    # header's value is full of pain; see http://greenbytes.de/tech/tc2231
    if part.filename != 'foo.txt':
        # Continue the loop without reading the part body.
        # On the next `reader.next()` call its body will be read into the void
        # and the next part will be returned.
        continue

    # The BodyPartReader API is very similar to the ClientResponse one, but with
    # its own specifics. The decode flag passes the content through decoding
    # routines that honour the Content-Encoding and Content-Transfer-Encoding
    # headers. Other methods like `.json`, `.form` and `.text` decode the content
    # automagically. Additionally, there is a `.read_chunk` coroutine method to
    # read the body in chunks, obviously without any decoding magic.
    data = yield from part.read(decode=False)

    # You can also decode the data yourself.
    data = part.decode(data)
    break

# To release the connection and read the whole remaining response payload into
# the void, you may either call
# yield from resp.release()
# or
yield from reader.release()
  • Multipart writer. It was also designed with streaming in mind, with helpful automagic in the box:
from aiohttp.hdrs import CONTENT_ENCODING, ETAG
from aiohttp.multipart import MultipartWriter

with MultipartWriter('mixed') as writer:
    # Add a file object as a part. The name, content length and type
    # will be recognised automagically for you.
    part = writer.append(open(__file__), {ETAG: '0xABCDEFG'})
    # You may also set additional headers yourself.
    part.headers[CONTENT_ENCODING] = 'gzip'

    # If you're a man of JSON, here is a special helper
    # which sets the content type and applies the encoding for you.
    writer.append_json({'foo': 'bar'})

    # Multipart is a recursive format, so nesting works too.
    with MultipartWriter('form-data') as subwriter:
        part = subwriter.append_form({'bar': 'baz'})
        # Content-Disposition is a hard header, so there is a helper to work with it.
        part.set_content_disposition('form-data', name='abc')
    writer.append(subwriter)

# That's all you need. All the magic happens internally during serialization.
# All JSON will be encoded properly via the json module, all files will be read
# and encoded with respect to the specified encoding. Parts with Content-Encoding
# or Content-Transfer-Encoding headers will additionally be encoded as instructed.
yield from aiohttp.request(..., data=writer)
  • FormData remains in place, but is now based on multipart.MultipartWriter. Removing it would break a lot of code and a lot of the ways aiohttp has been used; I'm not fond of those, but I do respect compatibility (see the sketch after this list).
  • One nasty bug was found in aiohttp.web, but it is kept untouched here:

cgi.FieldStorage doesn't like body parts that carry a Content-Length header but
were sent via chunked transfer encoding. This fix doesn't save aiohttp.web from
failing for third-party clients, but at least it keeps working correctly for
the aiohttp one.
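
A minimal sketch of the compatibility point above: existing FormData-style code keeps working unchanged while the body is now assembled by MultipartWriter underneath. The field names, file name, URL and the exact import location of FormData are illustrative assumptions.

import aiohttp

# FormData keeps its old interface; the multipart/form-data body is now
# built via MultipartWriter internally.
form = aiohttp.FormData()                      # import location is an assumption
form.add_field('name', 'value')
form.add_field('file', open('report.txt', 'rb'),
               filename='report.txt',
               content_type='text/plain')

# Passed as `data=` exactly as existing code already does.
resp = yield from aiohttp.request('post', 'http://example.com/upload', data=form)
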
@asvetlov (Member)

I like the idea in general.

About the .next() method: I don't like returning None at the end of the post data.
Maybe an exception would be a more appropriate solution?
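
A rough, illustrative sketch of the two iteration styles under discussion; `reader`, `process()` and the `EndOfMultipart` exception are placeholders and assumptions, not part of this PR.

class EndOfMultipart(Exception):
    """Hypothetical exception for the suggested alternative."""

# Style 1: sentinel value, as the PR currently does it.
while True:
    part = yield from reader.next()
    if part is None:
        break
    process(part)  # placeholder handler

# Style 2: raise on exhaustion, the suggested alternative.
try:
    while True:
        part = yield from reader.next()  # would raise EndOfMultipart when done
        process(part)
except EndOfMultipart:
    pass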

@asvetlov (Member)

Should aiohttp.web use the new multipart parser?

@kxepal (Member, Author) commented Feb 13, 2015

About the .next() method: I don't like returning None at the end of the post data.
Maybe an exception would be a more appropriate solution?

I didn't find any good standard exception that could be used in that place. StopIteration might be the only good candidate, but, imho, it's not what is expected here. TBH, I borrowed that pattern from some python-tulip thread where Guido replied to the question "how to iterate over a coroutine". In aiocouchdb I use it a lot for iterating over streaming data and never found any need to replace it with something else. What would you suggest?

Should aiohttp.web use the new multipart parser?

It's worth doing, but I haven't worked with aiohttp.web enough (shame on me!) to understand where "to harness the horses". I'd need to implement something on top of it that exchanges multipart data to figure that out.
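
A hypothetical sketch of what such a handler might look like; aiohttp.web does not expose multipart parsing at this point, so constructing MultipartReader directly from the request's headers and content stream is an assumption, and the handler/response boilerplate follows later aiohttp.web conventions purely for illustration.

import asyncio
from aiohttp import web
from aiohttp.multipart import MultipartReader

@asyncio.coroutine
def handle_upload(request):
    # Assumption: the request exposes its raw headers and body stream
    # in a form MultipartReader can consume directly.
    reader = MultipartReader(request.headers, request.content)
    while True:
        part = yield from reader.next()
        if part is None:
            break
        data = yield from part.read(decode=True)
        # ... do something with `data` ...
    return web.Response(body=b'OK')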

@fafhrd91 (Member)

@kxepal I merged your changes. Would you mind adding a new section to the docs?

@fafhrd91 (Member)

@kxepal is it possible to do something like this, without skipping data:

reader = MultipartReader.from_response(resp)
data = {}

part = yield from reader.next()
while part is not None:
    data[part.name] = part
    part = yield from reader.next()

itemData = yield from data["name"].read()

@kxepal (Member, Author) commented Mar 11, 2015

Oh, great! Will send PR shortly.

lock bot commented Oct 30, 2019

This thread has been automatically locked since there has not been
any recent activity after it was closed. Please open a new issue for
related bugs.

If you feel there are important points made in this discussion,
please include those excerpts in that new issue.
