Embracing web native FormData / File where possible #3029
As per HTTP API docs
I suppose that is the problem, that can't be immediately addressed via browser APIs. However, I wonder if it is worth considering browser limitations in the HTTP API so that such metadata could be encoded differently. Given that
|
Did some investigation which leads me to believe that directory structures are not a problem. The following code:

```js
let data = new FormData()
let dir = new File([], 'folderName', {
  lastModified: new Date('1995-12-17T03:24:00'),
  type: 'x-directory'
})
let file = new File(['content of uploaded file'], 'folderName/demo.txt', {
  lastModified: new Date('1995-12-18T03:24:00'),
  type: 'text/plain'
})
data.append('file', dir)
data.append('file', file)
fetch('/post', {
  method: 'POST',
  body: data
})
```

produces the following request body:
However, it also turns out that |
Now as far as I can tell there are three other headers (js-ipfs/packages/ipfs-http-client/src/lib/multipart-request.js, lines 29 to 43 in cb5b9ec)
I wonder if we could allow providing metadata as a prefix of the content, e.g. the following code:

```js
let data = new FormData()
let dir = new File([], 'folderName', {
  lastModified: new Date('1995-12-17T03:24:00'),
  type: 'x-directory'
})
let rawFile = new File(['content of uploaded file'], 'demo.txt')
let stat = [
  `mode: 0666`,
  `mtime: ${new Date('1995-12-18T03:24:00').getTime()}`,
  `\r\n`
]
let fileWithMeta = new Blob([stat.join('\r\n'), rawFile], {
  type: 'x-file'
})
data.append('file', dir)
data.append('file', fileWithMeta, 'folderName/demo.txt')
fetch('/post', {
  method: 'POST',
  body: data
})
```

produces the following request body:
This also would not require a breaking change to the HTTP API, but rather a backwards-compatible extension that recognizes
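On the server side, such an extension would need to split the stat prefix from the part body. A minimal sketch of that parsing, mirroring the encoding in the example above (the `mode:`/`mtime:` header names and the blank-line separator are the conventions from that example, not an existing js-ipfs API):

```javascript
// Split the stat prefix (header lines terminated by a blank line)
// from the rest of a multipart part body. Sketch only, not js-ipfs code.
function splitStatPrefix (body) {
  const separator = '\r\n\r\n'
  const index = body.indexOf(separator)
  if (index === -1) return { stat: {}, content: body }
  const stat = {}
  for (const line of body.slice(0, index).split('\r\n')) {
    const [key, value] = line.split(': ')
    stat[key] = value
  }
  return { stat, content: body.slice(index + separator.length) }
}

const { stat, content } = splitStatPrefix(
  'mode: 0666\r\nmtime: 819257040000\r\n\r\nhello world'
)
console.log(stat.mode) // '0666'
console.log(content)   // 'hello world'
```

A part without the prefix (no blank line) falls through unchanged, which is what makes the extension backwards compatible.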
Alternatively we could implement our own:

```js
import { nanoid } from 'nanoid'

class FormDataWriter {
  constructor () {
    /** @type {Blob[]} */
    this.parts = []
  }

  append (name, value, filename, headers = {}) {
    const partHeaders = []
    // File must be checked before Blob, since File is a subclass of Blob.
    const partFilename =
      filename != null
        ? encodeURIComponent(filename)
        : value instanceof File
          ? encodeURIComponent(value.name)
          : value instanceof Blob
            ? 'blob'
            : null
    const contentDisposition =
      partFilename == null
        ? `form-data; name="${name}"`
        : `form-data; name="${name}"; filename="${partFilename}"`
    partHeaders.push(`Content-Disposition: ${contentDisposition}`)
    // File is a subclass of Blob so we don't need to check for that.
    // Also, if a `Content-Type` header is passed there is no need to add one.
    if (value instanceof Blob && headers['Content-Type'] == null) {
      partHeaders.push(`Content-Type: ${value.type}`)
    }
    for (const [name, value] of Object.entries(headers)) {
      partHeaders.push(`${name}: ${value}`)
    }
    // A blank line separates the part headers from the part body.
    this.parts.push(new Blob([partHeaders.join('\r\n'), '\r\n\r\n', value]))
  }

  toBlob (boundary = `-----------------------------${nanoid()}`) {
    const chunks = []
    for (const part of this.parts) {
      chunks.push(`--${boundary}\r\n`, part, '\r\n')
    }
    chunks.push(`--${boundary}--\r\n`)
    return new Blob(chunks)
  }
}
```

With that, the former example could be updated to include custom headers with parts:

```js
var data = new FormDataWriter()
let dir = new File([], 'folderName', {
  lastModified: new Date('1995-12-17T03:24:00'),
  type: 'x-directory'
})
let file = new File(['content of uploaded file'], 'demo.txt')
data.append('file', dir)
data.append('file', file, 'folderName/demo.txt', {
  mode: '0666',
  mtime: file.lastModified
})
let boundary = `-----------------------------${nanoid()}`
fetch('/post', {
  method: 'POST',
  body: data.toBlob(boundary),
  headers: {
    'Content-Type': `multipart/form-data; boundary=${boundary}`
  }
})
```

which produces the following request body:
However there is a major downside to this approach:
|
We use … The problem of buffering the entire thing by … My thoughts on the matter:
Does this sound sensible? |
It is still possible to get progress reporting by using the old |
I do not believe we need to choose between metadata and no buffering. I believe my last two comments illustrate how we could get both. That being said I believe it would require either:
It is worth considering the trade-offs between those two. I personally would prefer to keep the client as lean & simple as possible, so that using the HTTP API requires no client library; it's just there for convenience. |
Breaking changes to HTTP API … If support for opt-in … But the need for a fix is there.
I believe that, as a bare minimum, we need to improve … Importing files with ipfs-webui (over the HTTP API) should be fast and work with files >4GB. |
I should point out that the changes I mention are not breaking changes; they would just allow passing those options in an alternative way. If users choose not to pass those, or pass them the old way, it should still work. |
I will try to drive this effort as my time allows (right now I'm blocked on reviews). With that, I intend to do the following:
As with #3022 I remain very skeptical of keeping |
Unless I'm missing something, I think the simplest thing to do here is to write a browser-specific version of /packages/ipfs-http-client/src/lib/multipart-request.js that probes the input - if it's a Blob or
There's always the age-old hack of using it-buffer-stream or something similar to generate buffers filled with 0s - 4 gigs of data will be transferred, but IPFS block de-duping will only write one block to the blockstore, and no data files need to be added to the tests. |
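The trick can be shown without the it-buffer-stream dependency: an async generator that yields zero-filled buffers on demand, so gigabytes of test data can be "produced" while only one chunk is ever resident (a stand-in sketch, not the package's actual API):

```javascript
// Yield `total` bytes as zero-filled chunks without ever holding more
// than one chunk in memory. Because every chunk is identical, IPFS
// block de-duplication would store only a single block.
async function * zeroes (total, chunkSize = 1024 * 1024) {
  let remaining = total
  while (remaining > 0) {
    const size = Math.min(chunkSize, remaining)
    yield new Uint8Array(size) // Uint8Array is zero-filled by default
    remaining -= size
  }
}

// Drain a small stream and count the bytes produced.
(async () => {
  let produced = 0
  for await (const chunk of zeroes(2_500_000, 1_000_000)) {
    produced += chunk.length
  }
  console.log(produced) // 2500000
})()
```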
I could have this working in Safari with a near-300 MB video, but it crashes in Chrome, Brave, and Firefox... I think FormData is a must for the browser; all the buffer-stream methods won't work in the browser once it reaches ~300 MB.
|
For anyone following along, there are two PRs that aim to (among other things) improve performance of
|
Was fixed by #3184 |
As discussed at #2838 (comment), mainstream browsers today do not yet support passing a ReadableStream as a request body. To work around this limitation, JS-IPFS currently encodes a stream of content (on ipfs.add / ipfs.write) into multipart/form-data by buffering it:
js-ipfs/packages/ipfs-http-client/src/add.js, lines 13 to 23 in cb5b9ec
js-ipfs/packages/ipfs-http-client/src/lib/multipart-request.js, lines 58 to 64 in cb5b9ec
js-ipfs/packages/ipfs-http-client/src/lib/to-stream.browser.js, lines 11 to 21 in cb5b9ec
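The buffering the cited lines perform boils down to draining every chunk into memory before anything is sent, something like this simplified sketch (hypothetical, not the actual library code):

```javascript
// Simplified sketch of the buffering behavior: the whole stream is
// collected into memory before a single Blob body is handed to fetch.
// With a multi-gigabyte input, this is where memory blows up.
async function bufferToBlob (stream, type = 'application/octet-stream') {
  const chunks = []
  for await (const chunk of stream) {
    chunks.push(chunk) // every chunk stays resident until the end
  }
  return new Blob(chunks, { type })
}

// Tiny stand-in for a content stream.
async function * source () {
  yield new Uint8Array([1, 2, 3])
  yield new Uint8Array([4, 5])
}

bufferToBlob(source()).then(blob => console.log(blob.size)) // 5
```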
This is problematic for webui, for instance, as some files may not fit into memory or may exceed the browser quota.
Modern browsers do have native FormData and File / Blob primitives that, if used carefully, can facilitate streaming. It appears that in the past browsers loaded all of the FormData / File / Blob content during fetch despite not reading it. However, that has evidently been fixed, and modern browsers (tested with Firefox, Chrome, Safari) no longer exhibit that problem. The following example was used to verify this behavior:
https://ipfs.io/ipfs/QmWVrTAeA1FqRSK3owpCfvB69wsz2Dbr2THK6ehn5uEKbp
On my Mac I do not observe a visible increase in memory when uploading a 5 GB file (across the mentioned browsers). I also observe streaming behavior, as the file size shown in WebUI changes during the upload:
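The experiment above can be approximated in miniature: a Blob can be assembled from repeated references to the same chunk, so a large logical payload does not have to be materialized as one contiguous allocation up front (whether parts are copied or lazily referenced is engine-dependent; sizes are scaled down here from the ~5 GB original test):

```javascript
// Assemble a Blob from many references to the same 1 MiB chunk.
// Depending on the engine, Blob parts may be stored lazily rather
// than copied into one contiguous buffer.
const chunk = new Uint8Array(1024 * 1024)       // 1 MiB of zeros
const parts = new Array(16).fill(chunk)         // 16 MiB logical size
const payload = new Blob(parts, { type: 'application/octet-stream' })
console.log(payload.size) // 16777216
```

Such a payload can then be handed to fetch as the request body, which is where the streaming behavior observed above kicks in.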
This investigation suggests that there is an opportunity to avoid buffering overhead in browsers. It also appears that doing it for ipfs.files.write should be fairly straightforward, since it always writes a single file. For ipfs.add things appear more complicated, as headers need to be added to each multipart part (e.g. unix mode & mtime). Some investigation is required to see how viable that is with native FormData. There are also some concerns about supporting directory structures.