WIP: IPLD spec #37
Another TODO:
It may need to be stored in the serialized format, since decoding it will yield a multicodec. Maybe just in the in-memory logical representation, so serialization happens correctly. (With the exception for old style
TODO:
- [ ] list path resolving restrictions
- [ ] show examples
@willglynn could you help me fill this out?
the more concise here the better, but i suspect this section may be a bit large.
see rendered doc here: https://github.com/ipfs/specs/blob/ipld-spec/merkledag/ipld.md
More TODOs:
}
}

> ipld cat --json QmBBB...BBB/author
shouldn't this resolve to `"mlink": "QmAAA...AAA" // links to the node above.`? Same for the YAML example below.
I think you're right and this was a copy/paste typo
I left a few nitpicks. On the whole I found this doc hugely helpful in understanding IPLD and its intent.

- `ipfs` is a protocol namespace (to allow the computer to discern what to do)
- `QmUmg7BZC1YP1ca66rRtWKxpXp77WgVHrnv263JtDuvs2k` is a cryptographic hash.
- `a/b/c/d` is a path _traversal_, as in unix.
- this link traverses five objects.
Not necessarily. We could have the object `QmUmg7BZC1YP1ca66rRtWKxpXp77WgVHrnv263JtDuvs2k` having a link named `a/b/c/d` directly pointing to the final object (or any combination in between).
@mildred ah that's right we did say we would allow sparse. one question remains re ordering of links there-- do we want to take them by lexicographic order? or in order in the serialized fmt?
ordering based on the serialized fmt will be needed if links have same name/no name (someone WILL do it so ipld implementations should be written to handle the case even if we say people should not do it)
but ordering lexicographically when links do have names is useful for getting users to expect the same behavior.
how do we handle this?
> ipld cat --fmt yml $h1
---
foo: {mlink: $h2}
foo/bar: {mlink: $h3}
> ipld cat --fmt yml $h2
---
bar:
  hello: h2bar1
> ipld cat --fmt yml $h3
---
hello: h3bar2
> ipld cat --fmt yml $h1/foo/bar
# ??? should it be
---
hello: h2bar1
# or should it be
---
hello: h3bar2
I would not expect this to be an issue because what you describe is the compact form of the following object:
> ipld cat --fmt yml $h1
---
foo:
  mlink: $h2
  bar:
    mlink: $h3
And in this object, only the "foo" link will be considered valid, not "foo/bar", because:
- keys should not be allowed to have the `/` character (so we will never have "foo/bar" verbatim in a key), for the same reasons `/` is not allowed in a unix filename.
- in the previous object, "foo/bar" will not be considered as a link, as per the "Duplicate property keys" section:

Note that having two properties with the same name IS NOT ALLOWED, but actually impossible to prevent (someone will do it and feed it to parsers), so to be safe, we define the value of the path traversal to be the first entry in the serialized representation. For example, suppose we have the object:
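
The "first entry in the serialized representation wins" rule quoted above can be sketched for the JSON case as follows. This is only an illustration, not the spec's decoder: `first_wins` and `decode_first_wins` are hypothetical names, and it relies on the `object_pairs_hook` parameter of Python's standard `json` module (which by default keeps the last duplicate rather than the first).

```python
import json


def first_wins(pairs):
    """Build a dict from (key, value) pairs, keeping the first value per key."""
    obj = {}
    for key, value in pairs:
        # setdefault only stores the value if the key is not present yet,
        # so later duplicates are ignored.
        obj.setdefault(key, value)
    return obj


def decode_first_wins(text):
    """Decode JSON so that duplicate keys resolve to their first occurrence."""
    return json.loads(text, object_pairs_hook=first_wins)


if __name__ == "__main__":
    print(decode_first_wins('{"a": 1, "a": 2, "b": 3}'))
```

The hook is applied to every nested object as well, so the same rule holds at each level of the document.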
`ipld cat --fmt yml $h1/foo/bar` should be `h2bar1`
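
A minimal sketch of a resolver that produces this answer, assuming the path is split on `/` first and each segment is looked up in turn, following an `mlink` whenever one is reached. The `STORE` dict, `is_link`, and `resolve` are hypothetical stand-ins for illustration, not part of any IPLD implementation; the objects mirror `$h1`, `$h2`, and `$h3` from the example above.

```python
# Hypothetical in-memory store: hash -> decoded object (mirrors $h1, $h2, $h3).
STORE = {
    "h1": {"foo": {"mlink": "h2"}, "foo/bar": {"mlink": "h3"}},
    "h2": {"bar": {"hello": "h2bar1"}},
    "h3": {"hello": "h3bar2"},
}


def is_link(value):
    """Treat any mapping with a string 'mlink' entry as a merkle-link (assumption)."""
    return isinstance(value, dict) and isinstance(value.get("mlink"), str)


def resolve(store, path):
    """Resolve 'root/seg1/seg2/...' one segment at a time."""
    root, *segments = path.split("/")
    node = store[root]
    for segment in segments:
        # Follow a link before descending into it.
        if is_link(node):
            node = store[node["mlink"]]
        node = node[segment]
    # A link in final position is returned as-is (a choice made for this sketch).
    return node


if __name__ == "__main__":
    print(resolve(STORE, "h1/foo/bar"))
```

Because the path is split before any lookup, the literal key `foo/bar` in `h1` is never matched, which is why the result comes from `h2`.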
> - keys should not be allowed to have `/` character

i agree, but unlike unix pathnames, there already are datastructs out there that we should be able to store, even if the resolution through them is not perfect. I.e. if we define how the resolution would work even in this case, we avoid the problem of forcing users to change their data.
I'd say, don't resolve links for keys that have `/` in them. We can still store data structures which have those keys, we just can't resolve them through paths. I don't see a problem in that. We would have a separate API to parse the local data structure without following links.
Why not escape slashes, so:
> ipld cat --fmt yml $h1/foo/bar
---
hello: h2bar1
> ipld cat --fmt yml $h1/foo\/bar
---
hello: h3bar2
I suppose that you meant to put the path in single quotes, else your Bourne-compatible shell will replace `\/` by `/` and both commands are identical.
> ipld cat --fmt yml '$h1/foo\/bar'
---
hello: h3bar2
The kernel doesn't know escaping, so what the linux kernel will understand when presented with the path `foo\/bar` is the entry `bar` enclosed in a directory called `foo\`. Backslashes in file names are valid and this notation would prevent using them.
Sure, you wouldn't be able to access keys like that with the fuse mount, but I don't see why that means the ipld command couldn't support escaping. If you wanted a literal backslash, you'd write `\\`.
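
As a rough sketch of the escaping idea, assuming `\/` stands for a literal slash inside a key and `\\` for a literal backslash, a path splitter could look like this. `split_path` is a hypothetical helper; nothing in this thread fixes the actual escaping rules.

```python
def split_path(path):
    """Split a path on unescaped '/' characters.

    A backslash escapes the next character, so '\\/' yields a literal '/'
    inside a segment and '\\\\' yields a literal backslash.
    """
    segments = []
    current = []
    i = 0
    while i < len(path):
        ch = path[i]
        if ch == "\\" and i + 1 < len(path):
            # Take the escaped character literally and skip past it.
            current.append(path[i + 1])
            i += 2
        elif ch == "/":
            # Unescaped separator: close the current segment.
            segments.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    segments.append("".join(current))
    return segments


if __name__ == "__main__":
    print(split_path(r"h1/foo\/bar"))
```

With this splitter, `h1/foo/bar` has three segments while `h1/foo\/bar` has two, the second being the key `foo/bar`.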
In that case, we should either provision key escaping to avoid clashing with someone wanting to have
👍 on this whole thing. Still a major TODO for me is the description of the By the way, why choose the
- **IPLD Serialized Formats**: a set of formats in which IPLD objects can be represented, for example JSON, CBOR, CSON, YAML, Protobuf, XML, RDF, etc.
- **IPLD Canonical Format**: a deterministic description on a serialized format that ensures the same _logical_ object is always serialized to _the exact same sequence of bits_. This is critical for merkle-linking, and all cryptographic applications.

In short: JSON documents with named merkle-links that can be traversed.
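
To illustrate why a canonical format matters for merkle-linking, here is a small sketch that uses JSON with sorted keys and fixed separators as a stand-in canonical form, and a plain SHA-256 hex digest as a stand-in for the real hash. Both choices are assumptions for the example; the spec's actual canonical format and hash encoding are not defined by this snippet, and `canonical_bytes` and `merkle_hash` are hypothetical names.

```python
import hashlib
import json


def canonical_bytes(obj):
    """Serialize deterministically: sorted keys, no optional whitespace, UTF-8."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def merkle_hash(obj):
    """Hash the canonical bytes (plain SHA-256 hex, used here only as a stand-in)."""
    return hashlib.sha256(canonical_bytes(obj)).hexdigest()


if __name__ == "__main__":
    a = {"x": 1, "y": {"b": 2, "a": 3}}
    b = {"y": {"a": 3, "b": 2}, "x": 1}
    # Same logical object, different key order: identical bytes and hash.
    print(canonical_bytes(a) == canonical_bytes(b), merkle_hash(a) == merkle_hash(b))
```

Without the deterministic step, two encoders could emit different bytes for the same logical object and the links pointing at it would no longer agree.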
traversed or resolved?
both -- i think one (traversed) implies the other (resolved)
got it, my initial thought was that merkle links can be resolved and something like a smart bitswap can traverse
I don't see any specific mention of the data types the links point to, have we decided that isn't IPLD turf (and just json-ld)?
Also, instead of adding more TODOs here, can we first go issue by issue here https://github.com/ipfs/go-ipld/issues and ipld/js-ipld-dag-cbor#2 and check what is already included in the spec, add what is missing, explicitly add what was accepted and then see what's left, bringing all of that conversation to this PR? So that we are all on the same page :)
It is possible to describe the merkle link in a JSON-LD context easily, but the context must also describe other parts of the JSON document. So it can't be a single context for all IPLD documents. So, you'll have different contexts for small files, chunked files, directory, git blob, tree, commit, ..., and each of these contexts will be able to describe the merkle links the same way. Also, we could imagine allowing any arbitrary key name instead of
From IRC: < kandinski> jbenet: the IPLD spec doesn't define where the "IPLD" initials come from. In any case, origin of initials should be referenced at the top. Since it's going to be pronounced "Eye Pee Ell Dee" anyway, we could say something like: "The IPLD (short for InterPlanetary Linked DirectedAcyclicGraph) is...". We do call it a "thin-waist merkle dag" anyway, which is a killer description, by the way.
Same as above with the first mention of CRDTs. "Conflict-free replicated data type" is easier to understand for people coming in anew.
cc @mekarpeles
I like these options as well. If the filesystem layer contracts
IPLD CBOR tagging
IPLD merkle-path improvements
Relationship with Protocol Buffers legacy IPFS node format
OK! #59 #61 #62 #64 are all merged! 👍 MASSIVE thanks to @mildred for pushing it through. We all are very thankful :) What issues remain here? I think I will merge this (FINALLY!) and let's continue to iron it out with future PRs against master. I think we have a solid spec, and now https://github.com/ipfs/go-ipld/ and https://github.com/diasdavid/js-ipld/ can match it.
Note: let's keep the branch, as there are many links to that specifically.
🎆
This PR adds a new IPLD spec.
Some things TODO:
@mildred @diasdavid could you review?