-
Notifications
You must be signed in to change notification settings - Fork 124
RFC: unite the files APIs #552
base: master
Are you sure you want to change the base?
Conversation
License: MIT Signed-off-by: Alan Shaw <alan.shaw@protocol.ai>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like these changes!
|
||
#### 1.2 Changes to returned values | ||
|
||
1. Importing a single file will now yield two entries, one for the imported file and one for the containing directory. Note this change can be considered almost backwards compatible; in the current API you'd receive an array of one value which you would access like `files[0]`. If you collect the entries in the new API you'd still access it like that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: in terms of backwards compat, I'd like to just switch to a new API version and create a shim to maintain support for v0.
#### 1.2 Changes to returned values | ||
|
||
1. Importing a single file will now yield two entries, one for the imported file and one for the containing directory. Note this change can be considered almost backwards compatible; in the current API you'd receive an array of one value which you would access like `files[0]`. If you collect the entries in the new API you'd still access it like that. | ||
2. Instead of a `hash` property, entries will instead have a `cid` property. In entries yielded from core it will be a CID instance, not a string (as agreed in [ipfs/interface-js-ipfs-core#394](https://github.com/ipfs/interface-js-ipfs-core/issues/394)). In the HTTP API/CLI it will necessarily be a string, encoded in base32 by default or whatever `?cid-base`/`--cid-base` option value was requested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to consider returning namespaced paths instead of CIDs:
- In theory, it allows us to import into other filesystem formats (e.g., swarm, git?, etc.).
- If we support embedding small files in directory entries, not all files will have CIDs (unixfs 2.0 should, ideally, support that).
| `ipfs stat` | ✅ | ✅ | | ||
| `ipfs write` | ❌ | ✅ | | ||
|
||
The `/ipfs` directory in MFS problem can simply be avoided by either assuming IPFS path (the current solution) or by denying writes to a directory of this name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to join mfs
with everything else, I'd like to namespace MFS under something like /local
or `/files. That way, we can continue adding new namespaces without clobbering user-defined directories.
- /ipfs
- /ipns
- /ipld?
- /p2p?? (might be nice to expose this as sockets when mounted with fuse).
- /git/Cid/... ???
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also like to have clean namespacing to not worry about edge cases and vulnerabilities related to that.
I'd go with /local
because
- no cognitive overhead
- it emphasizes data being "local to the node"
- things named
localhost
and.local
already exist in other systems, and familiarity reduces cognitive overhead
|
||
The file system APIs will be streaming by default. Due to the way we store and retrieve data it makes sense for our API methods to stream content when retrieving it locally or over the network. Buffering APIs can cause OOM issues, give no feedback to the user on progress and they can be trivially wrapped to collect all items in order to achieve the same effect as a buffering API. | ||
|
||
Streaming APIs will use a language native / standard library feature that is supported in all runtimes that IPFS is actively targeting. This prevents bloat and by only supporting one streaming mechanism it reduces API surface area. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: Let's also stop streaming bytes. If we switch to only streaming objects, we can stream metadata (progress, etc.).
| `ipfs files read` | `ipfs read` | | ||
| `ipfs files rm` | `ipfs rm` | | ||
| `ipfs files stat` | `ipfs stat` | | ||
| `ipfs files write` | `ipfs write` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️ this proposal, and have only one concern:
While I like the symmetry of import
/ export
, I have a problem with write
if it uses different DAG builder than import
, because people will continue to be confused why file added with import
has different CID than write
.
Current ipfs add
creates balanced dag, while ipfs files write
creates trickle-dag that supports adding data at the end, so is more like append
operation.
Loose idea: does it make sense to have write
which creates the same DAG as import
and additional append
command that is a special version of write
that only adds data at the end and builds a trickle dag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The js files.write
implementation takes a strategy
parameter that can be balanced
or trickle
(because ipfs.add
and ipfs.files.write
both use the same unixfs-importer
which receives the parameter).
If you files.write
a file, it'll get converted into a trickle
DAG, unless you pass strategy=balanced
in which case it'll convert it into a balanced
DAG (rebalancing as necessary).
I think we could standardise on one and allow the user to override as they see fit.
If we could then extend the UnixFS v1.5 metadata proposal to include the originally selected DAG strategy we'd be able to optimise for appending to trickle DAGs.
YES! ❤️ Will review more carefully soon. Prev work for context to others at ipfs/specs#98 && https://github.com/ipfs/interface-js-ipfs-core/issues/284 |
|
||
Adding files to MFS removes any performance overhead of creating/maintaining pinset DAG nodes, unburdens the user from understanding pinning (for the most part), improves visibility of added files and makes it significantly easier to find or remove files that were previously imported. | ||
|
||
The `ipfs import` command _optionally_ takes a MFS path option `--dest`, a directory into which imported files are placed. Note the destination directory is automatically created (but not any parents). If the destination directory exists already then an error is thrown, unless the `--overwrite` flag is set. This causes any existing files with the same name as the imported files to be overwritten. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small point, but maybe --force
instead of --overwrite
? Feels a bit more unixy.
Not finished yet!