Quirks of ipfs.add
API
#3137
Comments
I think the current API is great for case 3, but at the expense of cases 1 and 2. Sharing one code path also makes it really difficult to special-case and optimize cases 1 and 2: to decide which path to take, the input needs to be probed, sometimes asynchronously, so everything gets normalized into import-style input.
ipfs.add(asyncIterable([Buffer.from('hi\n'), 'bye']))
The user should pass homogenous input. I don't think it's reasonable to try to cater for this sort of use. If the input is an (async)iterable we could watch the type of its contents and throw if it changes, or take some other action, but so far no-one has asked for this and if they did, we'd certainly recommend that they pass homogenous input instead.
I'm not massively against this, it's just trivial to accomplish the same thing with:
const { cid } = await last(ipfs.add(blob))
...which is why it never made it into the API in the first place. If you dislike using the In fact all of the examples above are (to my eye anyway) quite straightforward:
const { cid } = await last(ipfs.add(blob))
const added = await all(ipfs.add(blobs))
Or if you really just want an array of CIDs:
const cids = await all(map(ipfs.add(blobs), added => added.cid))
// some sort of filter criteria
const onlyFiles = (file) => Boolean(file.content)
const added = await all(ipfs.add(filter(globSource('/dir/**/*'), onlyFiles)))
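For readers who have not used these helpers, here is a minimal sketch of what they are doing. It assumes `last`, `all` and `map` behave like small async-iterable utilities (the versions below are hand-written stand-ins, not the actual modules), and `fakeAdd` is a hypothetical replacement for `ipfs.add` that only mimics its async-iterable shape.

```javascript
// Hypothetical stand-in for ipfs.add: yields one entry per input item.
// Only the "returns an async iterable of entries" shape is modelled here.
async function * fakeAdd (inputs) {
  let n = 0
  for (const content of inputs) {
    yield { path: `file-${n}`, cid: `cid-${n}`, content }
    n++
  }
}

// Hand-written equivalents of the helpers (assumed behaviour, not the real modules).
async function last (source) {
  let result
  for await (const item of source) result = item
  return result
}

async function all (source) {
  const items = []
  for await (const item of source) items.push(item)
  return items
}

async function * map (source, fn) {
  for await (const item of source) yield fn(item)
}

async function demo () {
  // Single input: take the last (and only) entry.
  const { cid } = await last(fakeAdd(['hello']))
  // Multiple inputs: collect every CID into an array.
  const cids = await all(map(fakeAdd(['a', 'b', 'c']), added => added.cid))
  return { cid, cids }
}
```

Calling `demo()` resolves to the single CID plus the array of CIDs, which is the same shape the one-liners above produce against a real node.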
Having |
It is trivial if you import a library, and will be even more so once JS engines provide built-in equivalents... However, having to use a library just to get the result of an API call seems like incidental complexity. To me this feels like if async function were returning Another sign of this complexity is that the collection can be empty, which should never happen here, but good APIs make impossible states impossible, and not considering the empty case is both discomforting and something that type-checkers (if you use one) will point out every time. Even if implementation under the hood would just do
The problem I'm hinting at is not that it is hard to get the result you want, but rather that the API is tailored to one specific use case and all other (more common) use cases suffer from that extra bit of added complexity (be it importing a library, doing extra mapping, etc.). It also shows in the implementation. In #3022 I ended up using different encoding strategies for each of these use cases, now I'm facing the very same issue with #3029, and the logic to differentiate between them is really complex. Implementation complexity alone would not be a good argument, but if you consider that users also need to do a little bit extra to go from async collections back to a single result (in use case 1) or a sync collection (in use case 2), I really don't see what the value is.
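To make the empty-collection concern concrete, here is a small sketch. It assumes a `last` helper that resolves to `undefined` when the source yields nothing (hand-written below, not the real module), and `emptyAdd` is a hypothetical stand-in for an add call that produces no entries.

```javascript
// Assumed helper behaviour: resolves to the final item, or undefined if there is none.
async function last (source) {
  let result
  for await (const item of source) result = item
  return result
}

// Hypothetical add() stand-in that yields no entries at all.
async function * emptyAdd () {}

async function tryDestructure () {
  try {
    // Destructuring undefined throws, so the "impossible" empty case
    // still has to be handled (or ignored) by every caller.
    const { cid } = await last(emptyAdd())
    return cid
  } catch (err) {
    return err instanceof TypeError ? 'TypeError' : 'other'
  }
}
```

A method that returned a single result directly would not expose this state to the caller in the first place.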
My problem is that dealing only with homogenous input would have streamlined the logic; however, the current implementation would work even if inputs are not homogenous, and I'm guessing that is by chance because all the If we're ok with changes that assume homogenous inputs and start breaking when they are not, please let me know; in that case I would like to reflect that in both code and docs.
AFAIK we do assume homogenous inputs and it'd be a case of garbage-in, garbage-out if they are not. That is, we don't have tests for non-homogenous inputs and the docs do not say they are supported so I think that's ok. Please submit PRs with tests & fixes where the docs can be improved or there are bugs in the implementation, but changing this API is not a priority.
This is imo not straightforward for the average developer.
Sure, this does the job. But it's imo rather awkward to have a loop here. Sticking to this example, I then found an alternative way in the tests: https://github.com/ipfs/js-ipfs/blob/master/packages/interface-ipfs-core/src/add.js#L75:
That results in more elegant code. But adding an unknown dependency just to "fix" that lack of elegance of the API doesn't look like a great choice either. It adds complexity and cognitive overhead (e.g. devs reading over this may have a hard time telling what exactly is happening here). This API reminds me of v0.x of web3.js, which had a lot of magic (individual endpoints trying to cover a plethora of possible uses based on the parameters given) like this.
I seriously miss the old IPFS release where you could simply add a single file easily without importing async iterators. In my 10 years of development I don't think I've struggled so much and I've previously been using IPFS since 2017. In the current example from the API:
You need to serve a file from urlSource, or you can iterate a It almost seems as if the current version of
Like I said, I'm not massively against this. There's even precedent in our own codebase in that interface-datastore and ipld have separate methods for single/multiple inputs. Anyway thanks, this feedback is useful.
Where does it say that? There's no mention of JSON on the page you linked to.
@achingbrain it actually does seem to work with a string input. No need to focus on the word
For example, my current upload script contains:
which is working fine to produce content like: https://cloudflare-ipfs.com/ipfs/QmVFqnDeGZVMS6aK8QuRcWH6xyWmyR2rnTR8VPHw4o32nU
I'm working on #3029 and having a really hard time making changes to ipfs.add without changing its current behavior. In the process I'm discovering some quirks that I wanted to point out and hopefully find a way to address:
So it takes arbitrary input and normalizes it. However, the rules for single vs. multiple files are of particular interest here.
js-ipfs/packages/ipfs-core-utils/src/files/normalise-input.js
Lines 30 to 38 in 8cb8c73
So all AsyncIterators represent input with multiple files, except if it is an iterator of ArrayBufferView|ArrayBuffer. This leads to some interesting questions:
ipfs.add(asyncIterable([Buffer.from('hi\n'), 'bye']))
Is this supposed to produce a single file with hi\nbye content or two files?
According to the docs AsyncIterable<Bytes> is interpreted as a single file, whereas AsyncIterable<string> is interpreted as multiple files. However, the implementation only checks the first item; the rest are just yielded from content, so I'm guessing it would produce a single file.
And how about if we switch those around?
According to the documentation AsyncIterable<string> is interpreted as multiple files; however, since only the first chunk is checked I'm guessing this one would produce two files.
Even more interesting would be if we did this:
Which would produce an error, although one might expect two files.
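The order sensitivity described above can be illustrated with a simplified model. This is only a sketch based on the description in this issue, not the actual normalise-input.js code: `classify`, `isBytes` and `from` are hypothetical names, and the model only captures "inspect the first item, never look at the rest".

```javascript
// Simplified model of "probe the first item" classification.
// NOT the real normalise-input.js logic, just the behaviour described above.
function isBytes (value) {
  return value instanceof ArrayBuffer || ArrayBuffer.isView(value)
}

async function classify (source) {
  const iterator = source[Symbol.asyncIterator]()
  const first = await iterator.next()
  if (first.done) return 'empty'
  // Only the first item decides the outcome; later items are never checked.
  if (isBytes(first.value)) return 'single file'
  if (typeof first.value === 'string') return 'multiple files'
  return 'unknown'
}

// Helper that turns an array into an async iterable.
async function * from (items) {
  yield * items
}

async function demo () {
  const encoder = new TextEncoder()
  return [
    // Bytes first, string second.
    await classify(from([encoder.encode('hi\n'), 'bye'])),
    // Same items, switched around.
    await classify(from(['hi\n', encoder.encode('bye')]))
  ]
}
```

Under this model the same two items are classified differently depending only on which one comes first, which is the quirk being pointed out.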
Maybe this is not as confusing as I find it to be, but I really wish there was a clearer way to differentiate a multiple-file add from a single-file add, so the implementation would not need to probe async input to decide.
I think things would be so much simpler for both user and implementation if ipfs.add just always worked with multiples. That would mean the user would have to wrap Bytes|Blob|string|FileObject|AsyncIterable<Bytes> into an array [content], but it would make the API much cleaner. Also, AsyncIterable<*> as a result would make more sense, as it's many to many.
Alternatively there could be ipfs.add and ipfs.addAll APIs to deal with single files and multiple files respectively. That would also reduce the complexity of adding a single file, which currently requires:
const cid = (await ipfs.add(buffer).next()).value.cid
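A rough sketch of how the split could look, under the assumption that the multi-file method is the primitive and the single-file method is a thin wrapper around it. Both functions below are hypothetical stand-ins written for illustration (the CIDs are fake strings); they are not the js-ipfs implementation.

```javascript
// Hypothetical multi-file primitive: always takes an iterable of inputs
// and yields exactly one entry per input (many to many).
async function * addAll (inputs) {
  let n = 0
  for await (const content of inputs) {
    yield { cid: `cid-${n++}`, content }
  }
}

// Hypothetical single-file method: one input in, one result out.
// No probing is needed because the caller has already said "this is one file".
async function add (input) {
  for await (const entry of addAll([input])) {
    return entry
  }
  throw new Error('addAll yielded no entries')
}

async function demo () {
  const { cid } = await add('hello')
  const cids = []
  for await (const entry of addAll(['a', 'b'])) cids.push(entry.cid)
  return { cid, cids }
}
```

With this shape the single-file call site becomes `const { cid } = await add(buffer)`, and the empty case is handled once inside the wrapper rather than by every caller.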