-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
multipart.BodyPartReader appends CRLF #454
Comments
mmm...true. And I know how I avoid that problem: in all my cases binary data falls into read_chunk branch as they always have content length defined while lenghtless parts always fit one line. Eventually. The problem is in readline method which is used when body part length is unknown since that's the only safe way to parse it. However, it also means that to fix this issue it have to look forward on one line in order to strip accidental CRLF from the last data line. I always though that The test therefore is valid for the wrong behaviour (: Will prepare the fix soon. If you have own ideas about feel free to share (: |
Thank, I'll give it a try tomorrow! I also didn't find find a way to read in chunks except by using readline; Thats not really ideal either as there is no guarantee that there are any newlines. The boundary marker has a fixed maximal length (70 bytes) so it should be easy enough to read + scan in chunks. Am I missing something? |
@vbraun That's easy:
The only requirement is that body part must define Content-Length header. Ideally, of course it's possible to read by chunks lengthless parts, but it's hard: boundary may be split across two or, even worse, 3+ chunks and I don't know what the sane algorithm there should be to correctly merge such partial reads, avoid false positive ahead reads and so on. Too many issues, pitfalls and edge cases for one missed header. |
#454: Don't append CRLF for the last read line
Fixed by #455 |
Thanks for the quick fix! And yes, I should have said that Chrome doesn't send a Content-Length header so The boundary is guaranteed to be <= 70 bytes so all you have to do is read ahead by 70 bytes. Just wrapping a stream to stop at given boundaries is straightforward. What makes chunked reads difficult with the current implementation is that this is not abstracted into a stream wrapper but handled across MultipartReader and BodyPartReader, handing _unread pieces around. |
I'm trying to handle binary flies uploaded with a polymer file-upload component (https://github.com/winhowes/file-upload). On the browser side (chrome) that just sticks the file in a
FormData
object, so I believe the sent form is correct. I'm usingto parse it on the server side and always get a trailing CRLF added to
data
.As an example, the browser sends
and data (incorrectly) ends with
[...]\0\0\0\0\0\0\r\n
. Looking at the aiohttp tests, this is by design:But that seems wrong to me, the CRLF is supposed to be part of the boundary and should NOT be included in the body part.
The relevant RFC http://www.w3.org/Protocols/rfc1341/7_2_Multipart.html says: "NOTE: The CRLF preceding the encapsulation line is considered part of the boundary so that it is possible to have a part that does not end with a CRLF (line break). Body parts that must be considered to end with line breaks, therefore, should have two CRLFs preceding the encapsulation line, the first of which is part of the preceding body part, and the second of which is part of the encapsulation boundary."
@kxepal feature #273
The text was updated successfully, but these errors were encountered: