fix: send blobs when running ipfs-http-client in the browser #3184

achingbrain · 2020-07-20T15:34:58Z

Context:

This PR aims to improve the performance of ipfs.add in ipfs-http-client in a browser context by addressing the findings from Embracing web native FormData / File where possible #3029.

An alternative approach can be found in feat: non-bufferring multipart body encoder #3151.

To support streaming of native types with no buffering, normalise add input to blobs and upload using native FormData when the http client is run in the browser.

That is, if the user passes a blob to the http client in the browser, leave it alone, as enumerating blob contents causes the file data to be read.

Browser FormData objects do not allow you to specify headers for each multipart part, which means we can't pass UnixFS metadata via the headers. As a workaround, we turn the metadata into a querystring and append it to the field name for each multipart part.

Fixes #3138

BREAKING CHANGES:

- Removes the `mode`, `mtime` and `mtime-nsec` headers from multipart requests
- Passes `mode`, `mtime` and `mtime-nsec` as querystring parameters appended to the field name of multipart requests
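
The workaround described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `fieldNameWithMetadata` and `appendFile` are hypothetical helpers (not functions exported by ipfs-http-client), the `file-N` base field name is made up, and formatting `mode` as a zero-padded octal string is an assumption. Only the parameter names `mode`, `mtime` and `mtime-nsec` come from the description.

```javascript
// Sketch only: these helpers are hypothetical, not part of ipfs-http-client.

/**
 * Append UnixFS metadata to a multipart field name as a querystring.
 * The octal formatting of `mode` is an assumption made for this example.
 */
function fieldNameWithMetadata (fieldName, { mode, mtime, mtimeNsec } = {}) {
  const params = new URLSearchParams()

  if (mode != null) params.set('mode', mode.toString(8).padStart(4, '0'))
  if (mtime != null) params.set('mtime', String(mtime))
  if (mtimeNsec != null) params.set('mtime-nsec', String(mtimeNsec))

  const query = params.toString()

  // Leave the field name untouched when there is no metadata to send.
  return query ? `${fieldName}?${query}` : fieldName
}

/**
 * Add one file to a FormData-like object. Per-part headers cannot be set
 * on a browser FormData, so the metadata travels in the field name.
 */
function appendFile (formData, index, file) {
  const name = fieldNameWithMetadata(`file-${index}`, file)

  formData.append(name, file.content, encodeURIComponent(file.path))
}
```

In a browser this might be called as `appendFile(new FormData(), 0, { path: 'a.txt', content: blob, mode: 0o644 })`, where `blob` is passed through without being read.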

To support streaming of native types with no buffering, normalise add input to async iterators in node and Blobs in the browser. That is, if the user passes a blob in the browser, leave it alone, as enumerating blob contents causes the files to be read.

Browser FormData objects do not allow you to specify headers for each multipart part, which means we can't pass UnixFS metadata via the headers, so we turn the metadata into a querystring and append it to the field name for each multipart part as a workaround.

BREAKING CHANGES:

- Removes the `mode`, `mtime` and `mtime-nsec` headers from multipart requests
- Passes `mode`, `mtime` and `mtime-nsec` as querystring parameters appended to the field name of multipart requests

achingbrain · 2020-07-21T10:48:31Z

cc @Gozala I think this is a simpler approach than #3151, the PR is less than a quarter of the size, uses more native browser features and means we don't have to maintain custom implementations of Blob, File and FormData.

We do change how we pass metadata to the server but so far the only implementation that supports metadata is js-IPFS so I think the impact will be low.

Gozala · 2020-07-21T17:33:30Z

> cc @Gozala I think this is a simpler approach than #3151, the PR is less than a quarter of the size, uses more native browser features and means we don't have to maintain custom implementations of Blob, File and FormData.

@achingbrain I think there are some tradeoffs between the two implementations that I would like to call out so they are considered.

- In #3151, File / Blob polyfills were introduced in node because that prevents having two different code paths based on runtime.
  - In my experience that tends to be better for long term maintenance of the code base.
  - The two are less likely to diverge or exhibit behavioral differences.
- #3151 is indeed larger in size, because:
  1. It adds a ton of comments, jsdoc style and inline, explaining what the code does or why (as someone new to the codebase I really wish the code was more commented, because it's not always obvious what it does or why, e.g. I'm still not sure what the deal is with the self-executing async generators).
  2. It introduces a ton of new tests to ensure that regressions aren't introduced.
  3. It uses a custom FormData encoder (which contributes to the size as well), however that is what was decided in our design review meeting.
- #3151 will not buffer AsyncIterables into Blobs in the normalization (unlike this implementation), so that js-ipfs (not the http-client) will not do the buffering before e.g. adding content to the node.
- #3151 correctly handles iterators (as per Implementation bug in normaliseInput #3138).
- #3151 opts most changed files into TS type-checking. (I know it's not exactly the goal of that patch, but it did uncover Implementation bug in normaliseInput #3138; gradually added types really help in working with a complex code base.)

Ultimately it is your decision to make and I hope the above notes will inform it.

Gozala

Provided me feedback in the comments.

Gozala · 2020-07-21T20:17:05Z

packages/ipfs-core-utils/src/files/normalise-input/normalise-input.js

+  // Blob|File
+  if (isBytes(input) || isBloby(input)) {
+    return (async function * () { // eslint-disable-line require-await
+      yield toFileObject(input, normaliseContent)


Same as above comment

Gozala · 2020-07-21T20:19:51Z

packages/ipfs-core-utils/src/files/normalise-input/normalise-input.js

+  browserStreamToIt
+} = require('./utils')
+
+module.exports = function normaliseInput (input, normaliseContent) {


It would be nice to have a comment stating the inputs and outputs of this.

Why not `async function * normaliseInput(input, normaliseContent)` and yield from inside, instead of returning all those self-invoking generators? If there is a good reason, it would be good to have a comment; otherwise I find this very counterintuitive.

No good reason, other than changing the method signature didn't seem relevant to solving the problem, which is that we currently buffer blob contents in memory.

It does make the code much smaller and we can remove a bunch of artificially induced async. I'm sold, have made the change.
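
For illustration, the change agreed on here might look roughly like the sketch below, reduced to the string branch only. `toFileObject` is a simplified stand-in written for this example, and `normaliseInputBefore` / `normaliseInputAfter` are hypothetical names; none of this is the code that was merged.

```javascript
// Simplified stand-in for the PR's toFileObject helper (assumption).
async function toFileObject (input, normaliseContent) {
  return { path: '', content: await normaliseContent(input) }
}

// Before: a plain function that returns a self-invoking async generator
// from each branch.
function normaliseInputBefore (input, normaliseContent) {
  if (typeof input === 'string' || input instanceof String) {
    return (async function * () {
      yield toFileObject(input, normaliseContent)
    })()
  }

  throw new Error(`Unexpected input: ${typeof input}`)
}

// After: the function is itself an async generator and yields directly.
// `yield` in an async generator awaits the promise it is given.
async function * normaliseInputAfter (input, normaliseContent) {
  if (typeof input === 'string' || input instanceof String) {
    yield toFileObject(input, normaliseContent)
    return
  }

  throw new Error(`Unexpected input: ${typeof input}`)
}
```

One behavioural difference to keep in mind with this kind of refactor: the "before" form throws synchronously on unsupported input, while the async generator form only throws once the consumer starts iterating.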

Gozala · 2020-07-21T20:23:19Z

packages/ipfs-core-utils/src/files/normalise-input/normalise-input.js

+  // String
+  if (typeof input === 'string' || input instanceof String) {
+    return (async function * () { // eslint-disable-line require-await
+      yield toFileObject(input, normaliseContent)


It would be a lot easier to follow this code if it was `yield { content: await normaliseContent(input) }` or `yield toFileObject({ content: input }, normaliseContent)`.

We can split the single/multiple codepaths in a future PR, that should make the logic easier to follow.

Gozala · 2020-07-21T21:21:03Z

packages/ipfs-core-utils/src/files/normalise-input/normalise-content.js

+  blobToIt
+} = require('./utils')
+
+function toAsyncIterable (input) {


I know node does not have built-in ReadableStreams (although some libraries provide them), but with the code paths split, passing one in node would error. Maybe that is ok, but then I'm not sure why blobs are accounted for and streams are not.

It's probably an oversight. We can split the single/multiple codepaths in a future PR too, that should make the logic easier to follow.
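
As a point of reference for the discussion above, a ReadableStream can be adapted to an async iterable with only its standard `getReader()` / `read()` / `releaseLock()` methods, which would work wherever such a stream exists. The sketch below is an assumption-laden illustration: `readableStreamToAsyncIterable` is a hypothetical name and this is not the PR's `browserStreamToIt` implementation.

```javascript
// Sketch only: hypothetical helper, not the PR's browserStreamToIt.
async function * readableStreamToAsyncIterable (stream) {
  const reader = stream.getReader()

  try {
    while (true) {
      const { done, value } = await reader.read()

      if (done) return

      yield value
    }
  } finally {
    // Release the lock even if the consumer stops iterating early.
    reader.releaseLock()
  }
}
```

Because it only relies on duck-typed methods, the same function accepts any object that exposes a compatible `getReader()`, whether it comes from the browser or from a library.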

Gozala · 2020-07-21T21:36:05Z

packages/ipfs-core-utils/src/files/normalise-input/index.js

+ * ```
+ *
+ * @param input Object
+ * @return AsyncInterable<{ path, content: AsyncIterable<Buffer> }>


The return type annotation is misleading. In a browser context the return type is `AsyncIterable<{ path: string, content: Blob }>`. It would be nice to reflect that in a comment, or change the annotation to:

@typedef {AsyncIterable<Buffer> & Blob} Content
@param {any} input
@returns {AsyncIterable<{ path: string, content: Content }>}

That would mean the return type implements both the Blob and AsyncIterable<Buffer> interfaces, which is still not true, but it is better and lets tools like vscode assist you. If you do want it to be accurate, this would be:

@type {AsyncIterable<{ path: string, content: AsyncIterable<Buffer> }>|AsyncIterable<{ path: string, content: Blob }>}

But that degrades inference because content ends up being AsyncIterable<Buffer>|Blob.

This module doesn't convert content to Blob, only to AsyncIterable<Buffer>. index.browser.js converts to Blob.

Gozala · 2020-07-21T22:05:05Z

packages/ipfs/src/http/utils/multipart-request-parser.js

@@ -92,6 +79,21 @@ async function * parseEntry (stream, options) {
    }

    const disposition = parseDisposition(part.headers['content-disposition'])
+    const query = qs.parse(disposition.name.split('?').pop())


Is the go implementation ok with the added query params? As in, is it going to ignore them?

I've added a test for this, it ignores them.
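
For readers following the parsing side, here is a minimal sketch of recovering metadata from a field name. It is not the parser from the diff: `parseFieldName` is a hypothetical helper, `URLSearchParams` is swapped in for the `qs` module to keep the example dependency-free, and reading `mode` as an octal string is an assumption.

```javascript
// Sketch only: hypothetical helper, not the code in multipart-request-parser.js.
function parseFieldName (name) {
  const index = name.indexOf('?')

  // No querystring: the whole name is the field name, with no metadata.
  if (index === -1) {
    return { fieldName: name, metadata: {} }
  }

  const params = new URLSearchParams(name.slice(index + 1))
  const metadata = {}

  // Parsing mode as octal is an assumption made for this example.
  if (params.has('mode')) metadata.mode = parseInt(params.get('mode'), 8)
  if (params.has('mtime')) metadata.mtime = parseInt(params.get('mtime'), 10)
  if (params.has('mtime-nsec')) metadata.mtimeNsec = parseInt(params.get('mtime-nsec'), 10)

  return { fieldName: name.slice(0, index), metadata }
}
```

The explicit `indexOf('?')` check is a deliberate choice in this sketch: `name.split('?').pop()` returns the whole name when there is no `?`, so a parser built on it has to cope with being handed the bare field name.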

achingbrain added 8 commits July 20, 2020 16:27

chore: add missing dep

efc443b

chore: split content normalisation out

541b6c2

chore: fix up tests

0d6fdb5

fix: turn blob to iterator in test

95b6b8b

chore: fix http tests

5f91a2e

chore: only normalise to blob in the http client

634400b

chore: fix up tests

9ae9e77

achingbrain requested review from Gozala, jacobheun, lidel and vasco-santos July 21, 2020 10:48

Gozala reviewed Jul 21, 2020

Gozala reviewed Jul 21, 2020

Gozala mentioned this pull request Jul 22, 2020

feat: non-bufferring multipart body encoder #3151

Closed

lidel mentioned this pull request Jul 22, 2020

Embracing web native FormData / File where possible #3029

Closed

achingbrain added 5 commits July 22, 2020 10:53

chore: add test for passing metadata in field name to go-ipfs

8352346

chore: remove redundant buffer check

4407e45

chore: rename isbloby to isblob

044e275

chore: rename isbloby to isblob

9a6d98b

chore: convert normalise input to just yield from the function

d0e04db

achingbrain changed the title ~~fix: normalise add input to blob in the browser~~ fix: normalise add input to blob when running ipfs-http-client in the browser Jul 22, 2020

achingbrain added 2 commits July 22, 2020 14:37

chore: add tests around browser readable streams of things

078bd4a

chore: adds tests for more input types

b8a869c

achingbrain changed the title ~~fix: normalise add input to blob when running ipfs-http-client in the browser~~ fix: send blobs when running ipfs-http-client in the browser Jul 23, 2020

achingbrain merged commit 6b24463 into master Jul 23, 2020

achingbrain deleted the fix/normalise-add-input-to-blob branch July 23, 2020 09:01

lidel mentioned this pull request Aug 4, 2020

Streaming versions of ipfs.files.add buffer entire data in memory before doing HTTP POST #2863

Closed

Gozala mentioned this pull request Aug 10, 2020

fix: file upload without buffering ipfs/ipfs-webui#1534

Merged

2 tasks

github-actions bot mentioned this pull request Feb 4, 2022

chore: release master #4041

Merged
