-
Notifications
You must be signed in to change notification settings - Fork 1.2k
fix: send blobs when running ipfs-http-client in the browser #3184
Conversation
To support streaming of native types with no buffering, normalise add input to async iterators in node and Blobs in the browser. That is, if the user passes a blob in the browser leave it alone as enumerating blob contents cause the files to be read. Browser FormData objects do not allow you to specify headers for each multipart part which means we can't pass unixfs metadata via the headers so we turn the metadata into a querystring and append it to the field name for each multipart part as a workaround. BREAKING CHANGES: - Removes the `mode`, `mtime` and `mtime-nsec` headers from multipart requests - Passes `mode`, `mtime` and `mtime-nsec` as querystring parameters appended to the field name of multipart requests
cc @Gozala I think this is a simpler approach than #3151, the PR is less than a quarter of the size, uses more native browser features and means we don't have to maintain custom implementations of Blob, File and FormData. We do change how we pass metadata to the server but so far the only implementation that supports metadata is js-IPFS so I think the impact will be low. |
@achingbrain I think there are some tradeoffs between two implementations that I would like to call out so they are considered.
Ultimately it is your decision to make and I hope above notes will inform it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Provided me feedback in the comments.
// Blob|File | ||
if (isBytes(input) || isBloby(input)) { | ||
return (async function * () { // eslint-disable-line require-await | ||
yield toFileObject(input, normaliseContent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above comment
browserStreamToIt | ||
} = require('./utils') | ||
|
||
module.exports = function normaliseInput (input, normaliseContent) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to have comment stating input and outputs of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not async function * normaliseInput(input, normaliseContent)
and yield from inside instead of returning all those self invoking generators. If there is a good reason it would be good to have comment otherwise I find this very counter intuitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No good reason, other than changing the method signature didn't seem relevant to solving the problem, which is that we currently buffer blob contents in memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does make the code much smaller and we can remove a bunch of artificially induced async. I'm sold, have made the change.
// String | ||
if (typeof input === 'string' || input instanceof String) { | ||
return (async function * () { // eslint-disable-line require-await | ||
yield toFileObject(input, normaliseContent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be a lot easier to follow this code it was yield { content: await normaliseContent(input) }
or yield toFileObject({ content: input }, normaliseContent)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can split the single/multiple codepaths in a future PR, that should make the logic easier to follow.
packages/ipfs-core-utils/src/files/normalise-input/normalise-content.js
Outdated
Show resolved
Hide resolved
blobToIt | ||
} = require('./utils') | ||
|
||
function toAsyncIterable (input) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know node does not have built-in ReadbleStream
s (although some libraries have them), but by splitting code paths passing one in node would error. Maybe that is ok, but then not sure why blobs are accounted for and not the streams.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably an oversight. We can split the single/multiple codepaths in a future PR too, that should make the logic easier to follow.
* ``` | ||
* | ||
* @param input Object | ||
* @return AsyncInterable<{ path, content: AsyncIterable<Buffer> }> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return type annotation is misleading. In browser context return type is AsyncInterable<{ path:string, content:Blob }>
it would be nice to reflect that in a comment or change annotation to:
@typedef {AsyncIterable<Buffer> & Blob} Content
@param {any} input
@returns {AsyncInterable<{ path:string, content: Content }>}
That would mean that return type implements both Blob
& AsyncIterable<Buffer>
interface which is still not true but it is better and makes tools like vscode able to assist you. If you do want to be true this would be
@type {AsyncInterable<{ path: string, content: AsyncIterable<Buffer> }>|AsyncInterable<{ path: string, content:Blob }>}
But that degrades inference because content
ends up being AsyncIterable<Buffer>|Blob
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This module doesn't convert content to Blob
, only to AsyncIterable<Buffer>
. index.browser.js
converts to Blob
.
@@ -92,6 +79,21 @@ async function * parseEntry (stream, options) { | |||
} | |||
|
|||
const disposition = parseDisposition(part.headers['content-disposition']) | |||
const query = qs.parse(disposition.name.split('?').pop()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is go implementation ok with added query params ? As in is it going to ignore them ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a test for this, it ignores them.
To support streaming of native types with no buffering, normalise add input to blobs and upload using native FormData when the http client is run in the browser. That is, if the user passes a blob to the http client in the browser leave it alone as enumerating blob contents cause the file data to be read. Browser FormData objects do not allow you to specify headers for each multipart part which means we can't pass UnixFS metadata via the headers so we turn the metadata into a querystring and append it to the field name for each multipart part as a workaround. Fixes #3138 BREAKING CHANGES: - Removes the `mode`, `mtime` and `mtime-nsec` headers from multipart requests - Passes `mode`, `mtime` and `mtime-nsec` as querystring parameters appended to the field name of multipart requests
To support streaming of native types with no buffering, normalise add input to blobs and upload using native FormData when the http client is run in the browser.
That is, if the user passes a blob to the http client in the browser leave it alone as enumerating blob contents cause the file data to be read.
Browser FormData objects do not allow you to specify headers for each multipart part which means we can't pass UnixFS metadata via the headers so we turn the metadata into a querystring and append it to the field name for each multipart part as a workaround.
Fixes #3138
BREAKING CHANGES:
mode
,mtime
andmtime-nsec
headers from multipart requestsmode
,mtime
andmtime-nsec
as querystring parameters appended to the field name of multipart requests