originalname is decoded incorrectly when using Postman to upload a file with a Chinese filename #247

Closed
luoxzhg opened this issue Jul 13, 2021 · 9 comments

Comments

@luoxzhg

luoxzhg commented Jul 13, 2021

Here are the request and the response:

POST /api/upload HTTP/1.1
User-Agent: PostmanRuntime/7.28.1
Accept: */*
Cache-Control: no-cache
Postman-Token: 31902fbb-c1bd-4da6-9c87-c51b5c4ddeca
Host: localhost:9100
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Content-Type: multipart/form-data; boundary=--------------------------078517390448579870852008
Cookie: connect.sid=s%3ATbR1eNmnMHmb8i_3eVHWAhEShpRb5UZ8.h9rs1J1DouLtj0xFhQYtxK4VD2gMQjI5loPgM%2F%2FoxSQ; token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpZCI6IjYwZDVhYmYyY2Y4MGFiMWM1MWJhYzFmYSIsInRlbGVwaG9uZSI6IjE4ODg4ODg4OCIsInJvbGVzIjpbInVzZXIiXSwiaWF0IjoxNjI2MTQ2NDc3LCJleHAiOjE2Mjg3Mzg0Nzd9.9pkxVHUNbIGyPaji0js10-WHMhcKWPUNCoB5pk_sjSA
Content-Length: 15412
 
----------------------------078517390448579870852008
Content-Disposition: form-data; name="file"; filename="劳动合同书.docx"; filename*="UTF-8''%E5%8A%B3%E5%8A%A8%E5%90%88%E5%90%8C%E4%B9%A6.docx"
<劳动合同书.docx>
----------------------------078517390448579870852008--
 
HTTP/1.1 201 Created
X-Powered-By: Express
Vary: Origin
Content-Type: application/json; charset=utf-8
Content-Length: 51
ETag: W/"33-u6i3N3dun6NErX7YzoW7IdStN5E"
Date: Tue, 13 Jul 2021 06:48:00 GMT
Connection: keep-alive
Keep-Alive: timeout=5
 
{"code":0,"data":"188888888/docs/��\b\ff.docx"}
@luoxzhg
Author

luoxzhg commented Jul 13, 2021

When uploading the file with curl or a browser, the request does not include this parameter:

filename*="UTF-8''%E5%8A%B3%E5%8A%A8%E5%90%88%E5%90%8C%E4%B9%A6.docx"

and the filename is decoded correctly.

@denis-manokhin

Same issue with Russian filenames.

Content-Disposition: form-data; name="files"; filename="Файл с русскими буквами в названии.jpeg"; filename*="UTF-8''%D0%A4%D0%B0%D0%B8%CC%86%D0%BB%20%D1%81%20%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%B8%D0%BC%D0%B8%20%D0%B1%D1%83%D0%BA%D0%B2%D0%B0%D0%BC%D0%B8%20%D0%B2%20%D0%BD%D0%B0%D0%B7%D0%B2%D0%B0%D0%BD%D0%B8%D0%B8.jpeg"

Decoding result on backend (fastify-multipart):

$08�; A @CAA:8<8 1C:20<8 2 =0720=88.jpeg

@mscdex
Owner

mscdex commented Dec 19, 2021

This will be resolved in busboy v1.0.0, which will prefer the encoded filename parameter if it's available.
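
As a rough illustration of what "preferring the encoded filename parameter" involves, here is a minimal sketch of decoding a `filename*` value in the RFC 8187 style. The helper name `decodeExtendedValue` is hypothetical and is not busboy's API or its actual implementation; it handles only UTF-8, and it strips the surrounding double quotes that Postman adds in the request above.

```javascript
// Hypothetical helper (not part of busboy): decodes an extended parameter
// value of the form  charset'language'percent-encoded-text.
function decodeExtendedValue(raw) {
  // Postman wraps the value in double quotes; strip them if present.
  const value = raw.replace(/^"(.*)"$/, '$1');

  // Split into charset, (optional) language tag, and the encoded text.
  const match = /^([^']*)'([^']*)'(.*)$/.exec(value);
  if (!match) return undefined;

  const [, charset, , encoded] = match;
  // This sketch only handles UTF-8; other charsets are left to the caller.
  if (charset.toLowerCase() !== 'utf-8') return undefined;

  try {
    return decodeURIComponent(encoded);
  } catch {
    return undefined; // malformed percent-encoding
  }
}

console.log(
  decodeExtendedValue(`"UTF-8''%E5%8A%B3%E5%8A%A8%E5%90%88%E5%90%8C%E4%B9%A6.docx"`)
); // 劳动合同书.docx
```

A parser could then fall back to the plain `filename` parameter whenever this returns `undefined`.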

@cczw2010

cczw2010 commented Mar 11, 2022

The newest version still has this problem, both with Postman and with a normal form post. I use this workaround:

import iconv from "iconv-lite"
const filename = iconv.decode(Buffer.from(info.filename, 'latin1'), 'utf8');

This works because, in the source code, the chunk is decoded as latin1.
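
To make the latin1 point concrete, the following sketch reproduces the garbling and its reversal using only Node's built-in `Buffer`. The filename is the one from the payload later in this thread; the variable names are illustrative.

```javascript
// The name as the client sends it: UTF-8 bytes on the wire.
const wireBytes = Buffer.from('中文.jpg', 'utf8');

// Decoding those bytes as latin1 maps every byte to one character,
// so each 3-byte CJK character turns into three unrelated characters.
const garbled = wireBytes.toString('latin1');

// latin1 is a lossless byte-to-character mapping, so the original bytes
// can be recovered and then decoded as UTF-8.
const restored = Buffer.from(garbled, 'latin1').toString('utf8');

console.log(garbled.length);          // 10 (3 + 3 bytes, plus ".jpg")
console.log(restored === '中文.jpg'); // true
```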

@mscdex
Owner

mscdex commented Mar 11, 2022

@cczw2010 Can you provide the raw multipart form data that reproduces the issue?

@cczw2010

cczw2010 commented Mar 12, 2022

@cczw2010 Can you provide the raw multipart form data that reproduces the issue?
Of course, thanks for your attention. I would rather not add workaround code for this problem; it happens on both Windows and Mac.

Nuxt + Vuetify

The form is submitted directly, without AJAX.

However, I cannot see the Content-Disposition header or the form data for this request:

POST /upload/office HTTP/1.1 
Host: localhost:8000 
Connection: keep-alive
Content-Length: 147037
Pragma: no-cache
Cache-Control: no-cache
sec-ch-ua: " Not A;Brand";v="99", "Chromium";v="98", "Google Chrome";v="98"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
Origin: http://localhost:8000
Upgrade-Insecure-Requests: 1
DNT: 1
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryKHBSfdBNTDD3OxO2
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Sec-Fetch-Site: same-origin
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Referer: http://localhost:8000/uploadtest
Accept-Encoding: gzip, deflate, br
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,en-US;q=0.7

Talend API (similar to Postman)

# request headers
Accept: */*
Accept-Encoding: gzip, deflate, br
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,en-US;q=0.7
Cache-Control: no-cache
Connection: keep-alive
Content-Length: 103960
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryc8Eaxx4wqC4nxgyJ
DNT: 1
Host: localhost:8000
Pragma: no-cache
sec-ch-ua: " Not A;Brand";v="99", "Chromium";v="98", "Google Chrome";v="98"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: none
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36


# payload
------WebKitFormBoundaryc8Eaxx4wqC4nxgyJ
Content-Disposition: form-data; name="file"; filename="3.jpg"
Content-Type: image/jpeg
------WebKitFormBoundaryc8Eaxx4wqC4nxgyJ
Content-Disposition: form-data; name="file1"; filename="中文.jpg"
Content-Type: image/jpeg

@mscdex
Owner

mscdex commented Mar 12, 2022

For that form, it makes sense, as no encoded version of the filename parameter (in the style of RFC 8187) is being sent. So you'd have to use the workaround of re-decoding it as UTF-8. You don't need iconv for that, as Node has built-in support for UTF-8: Buffer.from(filename, 'latin1').toString('utf8'). That's a bit tricky, though, if the string was decoded from an encoded version of the field and was not latin1, so to be absolutely safe you'd need to check whether the string is latin1 first.

What I could do though is add a default parameter charset config option specifically for multipart that would allow configuring which charset to use when parsing non-RFC8187-encoded parameters in part headers.

@cczw2010

@mscdex

Thanks for the answer. I'll check the string first before re-decoding it.

@luoxzhg
Author

luoxzhg commented Aug 15, 2022

If all characters are within the latin1 range, re-decode the name as UTF-8:

        if (!/[^\u0000-\u00ff]/.test(file.originalname)) {
            file.originalname = Buffer.from(file.originalname, 'latin1').toString('utf8')
        }
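
One possible refinement of this check, offered as a sketch rather than a recommended fix: a name that really is latin1 (for example `café.txt`) also passes the range test and would be corrupted by re-decoding. The hypothetical helper below, `redecodeLatin1AsUtf8`, additionally verifies that the recovered bytes form valid UTF-8, using Node's global `TextDecoder` in `fatal` mode, and otherwise returns the name unchanged.

```javascript
// Hypothetical helper: re-decode a name as UTF-8 only when that is safe.
function redecodeLatin1AsUtf8(name) {
  // Characters above U+00FF cannot come from a latin1 decode,
  // so the name was already decoded correctly; leave it as-is.
  if (/[^\u0000-\u00ff]/.test(name)) return name;

  // Recover the original bytes (latin1 maps one character to one byte).
  const bytes = Buffer.from(name, 'latin1');
  try {
    // fatal: true makes decode() throw on invalid UTF-8 instead of
    // silently inserting U+FFFD replacement characters.
    return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
  } catch {
    return name; // not valid UTF-8, so treat it as a genuine latin1 name
  }
}

const garbledName = Buffer.from('中文.jpg', 'utf8').toString('latin1');
console.log(redecodeLatin1AsUtf8(garbledName) === '中文.jpg'); // true
console.log(redecodeLatin1AsUtf8('café.txt'));                 // café.txt
```

This is still a heuristic: a genuine latin1 name whose bytes happen to form valid UTF-8 would be re-decoded anyway.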
