annotate hashes to be tagged pointers? does this lead to schemas? #21

dominictarr · 2014-09-06T11:16:14Z

I've been thinking about this a lot since discussing it on IRC with @pfraze the other day.
ssb has 3 types of links:

hash of pubkey (a Feed id)
hash of a message (a Message id - i.e. previous message hash on every message)
hash of an arbitrary blob (an attachment)

If hashes were tagged with the type of thing they point to, you get a really nice ability:
the system can index relationships automatically. It can tell that feed X is linked to feed Y (for example, because they are friends), that message B was created strictly after message A (to which it was a reply), or that some message refers to an attachment J. Without tags, the specific meaning of each hash must be interpreted from its context, or the hashed object must be retrieved and parsed.

Another idea that we have discussed recently is identifying messages via the hash of their schema. There are certainly nice things about this idea, but also unknowns.
The idea would be to have a canonical representation of each schema, and then identify that schema by its hash. This would allow objects to be tagged like in git, but would also allow user applications to create new types, and would let documents be reflected on and parsed without running those applications.

To combine these ideas, each link would need to be a hash plus a type hash.
That would make each link 64 bytes long, which wouldn't fit on an 80-char terminal line.

Maybe we could just use a 1-byte tag for messages, feeds and attachments.
An attachment could be tagged by having the hash of its schema at the beginning.
Of course, this would be incompatible with most standard mimetypes, so we'd need a raw
blob as well... so that would be 4 id types? (feed, message, and attachments, both tagged and raw?)
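
A minimal sketch of the 1-byte tag idea, assuming sha256 hashes and Node's built-in `crypto` module. The tag values and the names `tagHash` and `readTag` are made up for illustration and are not taken from ssb.

```javascript
const crypto = require('crypto');

// Hypothetical 1-byte tags; the values are illustrative only.
const TAGS = new Map([['feed', 0x01], ['message', 0x02], ['blob', 0x03]]);
const NAMES = new Map([...TAGS].map(([name, tag]) => [tag, name]));

// Hash some bytes and prefix the digest with a tag for the kind of object.
function tagHash(type, data) {
  if (!TAGS.has(type)) throw new Error('unknown link type: ' + type);
  const digest = crypto.createHash('sha256').update(data).digest();
  return Buffer.concat([Buffer.from([TAGS.get(type)]), digest]);
}

// Read the tag back without fetching or parsing the object it points to.
function readTag(link) {
  const type = NAMES.get(link[0]);
  if (!type) throw new Error('unknown tag byte: ' + link[0]);
  return { type, hash: link.subarray(1) };
}

const link = tagHash('message', 'hello world');
console.log(readTag(link).type, link.length); // message 33
```

The tag costs one extra byte per link (33 instead of 32), and an indexer can sort links by kind from that byte alone.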

What should links be like?

We could get away with {T}{hash}, but I think there is a strong case for including other metadata in the link, such as the size of an attachment, the feed id and sequence of a message, or the IP address of a relay as part of a feed id. Sometimes this extra metadata might be unnecessary or unwarranted, or would just create a token that is too long.

@jbenet has a similar idea over here: jbenet/random-ideas#1

pfrazee · 2014-09-06T17:43:39Z

My first proposal: a base-256 numbering system that goes 0-9, A-Z, and then uses emojis for the remaining 220 numbers. Now 64 bytes fits in the 80-char terminal. ✌️ victory declared

The goal is context-free processing... what I'm wondering is, do we have the structure to do that? Links would be embedded in message bodies, right? If there's no standard schema, then we can only do contextual processing (by the apps that manage the message type).

jbenet · 2014-09-06T21:45:12Z

I don't recommend putting the types of objects into the hash value itself. IPFS handles it this way:

https://github.com/jbenet/go-ipfs/blob/master/merkledag/merkledag.go#L19-L40

We're still discussing whether the Link struct will carry a type, or the type will be in the Data portion.

dominictarr · 2014-09-07T08:58:20Z

@pfraze yeah, for this idea to work we'd need a standard encoding for messages (for example, msgpack). If an app really needs something special then it can use a buffer inside msgpack.

@jbenet I would love to hear your thoughts on why tagged links are bad?
It would be more complicated to support this in ipfs, because you can create arbitrary document types. In our case, we only have 3 main types of object, so tagging them seems straightforward.

@pfraze <3 your idea for unicode'd encoding. it would actually be longer, but would look awesome. I think @substack would approve of this idea also.
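
A rough size comparison for a 64-byte link under different encodings, as a sketch only. It assumes every base-256 symbol is a 4-byte emoji such as U+1F600, which is the worst case for that alphabet rather than the actual proposal.

```javascript
const bytes = Buffer.alloc(64); // 64 bytes: a hash plus a type hash

const hexChars = bytes.toString('hex').length;       // 2 characters per byte
const base64Chars = bytes.toString('base64').length; // 4 characters per 3 bytes, padded

// In base-256 every byte is one symbol, but an emoji such as U+1F600
// takes 4 bytes once encoded as UTF-8.
const symbols = bytes.length;
const emojiBytes = Buffer.byteLength('\u{1F600}', 'utf8');
const bytesIfAllEmoji = symbols * emojiBytes;

console.log({ hexChars, base64Chars, symbols, bytesIfAllEmoji });
```

So the emoji alphabet is shorter when counted in symbols, but longer on the wire than hex or base64.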

dominictarr · 2014-09-10T18:42:42Z

Instead of using tag+hash it might be better to have {link: hash, meta: metadata, type: linktype}

That way the metadata can be kept in the index. This means we can link to a message and also optionally include the author id of that message (which may be useful).

Maybe we could put a type in the message, so if a message contains a link to an author with a type: 'follow' then they followed that key -- we'd have permissions about what sort of links that app was allowed to create. This scheme would be highly flexible because nodes could create multiple links of different types if they needed.

Since messages are already size limited, it's not really a problem if metadata is large.
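
A sketch of how an indexer could pick up {link, meta, type} objects without knowing anything about the app that wrote the message. The message shape (`author`, `content`) and the helper names are assumptions for illustration, and a value is treated as a link if it is an object with a string `link` field.

```javascript
// Walk a message body and collect every {link, type, meta} object in it.
function extractLinks(value, found = []) {
  if (Array.isArray(value)) {
    for (const item of value) extractLinks(item, found);
  } else if (value !== null && typeof value === 'object') {
    if (typeof value.link === 'string') found.push(value);
    else for (const key of Object.keys(value)) extractLinks(value[key], found);
  }
  return found;
}

// Index links by type so "who follows whom" needs no app-specific parsing.
function indexLinks(messages) {
  const index = new Map();
  for (const msg of messages) {
    for (const { link, type, meta } of extractLinks(msg.content)) {
      if (!index.has(type)) index.set(type, []);
      index.get(type).push({ source: msg.author, target: link, meta: meta ?? null });
    }
  }
  return index;
}

// Placeholder ids stand in for real hashes.
const messages = [
  { author: 'feedX', content: { follow: { link: 'feedY', type: 'follow' } } },
  { author: 'feedX', content: { text: 'see attached',
      file: { link: 'blobJ', type: 'attachment', meta: { size: 1024 } } } },
];
const index = indexLinks(messages);
console.log(index.get('follow'));
console.log(index.get('attachment'));
```

Because the metadata travels with the link object, the index can store it (for example an attachment size) without fetching the target.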

pfrazee · 2014-09-10T19:00:55Z

I'm in favor of that. It brings us back to our issue of type semantics and assigning unique names to types (same issue as with schemas).

jbenet · 2014-09-10T19:39:42Z

> Instead of using tag+hash it might be better to have {link: hash, meta: metadata, type: linktype}

Yes, at least do this :)

> maybe we could put a type in the message. so if a message contains a link to an author with a type: 'follow' then they followed that key -- we'd have permissions about what sort of links that app was allowed to create. This scheme would be highly flexible because nodes could create multiple links of different types if they needed.

Yep, this is roughly the model we're following.

A slightly modified version is to think of two classes of links (these roughly map to raw pointers and smart pointers):

1. link -- just the hash
2. link with metadata -- an object with metadata about the link.

You implement 2 on top of 1:

  // given
  p1 = {name: "foo"}
  p2 = {name: "bar"}

  // metalink outside of the file (in the links changing / Ted Nelson / TBL 2.0 friendly way)
  // p1 and p2 don't change with creation of links.
  follows = {person: Hash(p1), follows: Hash(p2)}
  // or even straight up triple
  follows = {source: Hash(p1), target: Hash(p2), type: Hash(followRelationship)} 

  // metalink inside of the file (TBL 1.0 style)
  m1 = {text: "o hai @dominictarr!", sender: Hash(p1), recipient: Hash(p2)}

TBL 1.0 = http web
TBL 2.0 = semantic web
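
A runnable take on the pseudocode above, as a sketch. `Hash` is a stand-in built on sha256 over `JSON.stringify` output, which is not a canonical encoding, and `followRelationship` is a hypothetical type object that the original snippet leaves undefined.

```javascript
const crypto = require('crypto');

// Stand-in for Hash(): sha256 over JSON text. JSON.stringify is not canonical
// (key order matters), so this is only good enough for a demo.
function Hash(obj) {
  return crypto.createHash('sha256').update(JSON.stringify(obj)).digest('hex');
}

const p1 = { name: 'foo' };
const p2 = { name: 'bar' };
const followRelationship = { name: 'follows' }; // hypothetical type object

// Metalink outside the objects: p1 and p2 are untouched, so their hashes stay stable.
const follows = { source: Hash(p1), target: Hash(p2), type: Hash(followRelationship) };

// Metalink inside the object: the message itself carries the links.
const m1 = { text: 'o hai @dominictarr!', sender: Hash(p1), recipient: Hash(p2) };

// The link object is itself hashable, so other objects can point at it.
console.log(Hash(follows).length);         // 64 (hex characters)
console.log(m1.sender === follows.source); // true
```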

jbenet · 2014-09-10T20:25:44Z

((a thought alongside is that IPFS proposes links do belong IN the file, but that meaningful links are also files, so Link => Objects is a thing.))

dominictarr · 2014-09-11T08:46:54Z

@jbenet so you are saying that the link itself needs to be an object that can be linked to?
Can you describe a use case for this?

@pfraze you are correct about the names... maybe the solution is to make any link revocable?
Then we can handle cases where problems arise?

pfrazee · 2014-09-11T15:05:30Z

@dominictarr We just need a global namespace, and I think that means we either use DNS or something GUIDlike-- maybe the idea where we publish a type definition on the feed as a message and do author_hash + typedef_message_hash. That's nice because it's immutable, but it's also 64 bytes. Maybe we could get away with just typedef_message_hash but that does have a non-zero collision risk.
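
A sketch of the author_hash + typedef_message_hash identifier, assuming 32-byte sha256 hashes. Both inputs are placeholders: a real feed would hash the author's public key and the published type definition message.

```javascript
const crypto = require('crypto');

const sha256 = (data) => crypto.createHash('sha256').update(data).digest();

// Placeholder inputs, not real keys or messages.
const authorHash = sha256('placeholder author public key');
const typedefMessageHash = sha256('placeholder typedef message');

// Global type id = author_hash + typedef_message_hash.
function typeId(author, typedef) {
  return Buffer.concat([author, typedef]);
}

const id = typeId(authorHash, typedefMessageHash);
console.log(id.length);                 // 64
console.log(id.toString('hex').length); // 128
```

Dropping the author half would cut the id to 32 bytes, at the cost of relying on the message hash alone for uniqueness.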

dominictarr · 2014-09-12T11:51:46Z

Maybe we can just use names for now, and then change to hashed schemas when we figure that out. If the type can be up to 32 bytes long, then that will be possible.

dominictarr · 2014-09-12T12:48:16Z

Maybe we could just whitelist link types for now, and then switch to hashed schema types.

pfrazee · 2014-09-18T21:44:23Z

Maybe we should take the same stance on types for messages that we do with links -- don't ever try to enforce a global namespace, and trust developers to coordinate with each other and come up with good identifiers.

You're going to have to validate messages no matter what, and if you want something stronger to disambiguate the semantics, you can use your own identifier: { type: 'foomsg', message: { paulFooType: 'v2' }}. In practice, you can avoid most collisions with a dash: orgname-type or projectname-type. This is what the HTML custom elements do -- custom elements have to use a dash in their names.
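
A sketch of that stance in code: types are plain strings, a dash convention keeps them apart, and each app validates the types it understands. The regex, the `validators` registry, and the `paul-foomsg` name are all hypothetical.

```javascript
// Hypothetical convention: lowercase "namespace-name" identifiers, e.g. "paul-foomsg".
const NAMESPACED_TYPE = /^[a-z][a-z0-9]*(-[a-z0-9]+)+$/;

function isNamespacedType(type) {
  return typeof type === 'string' && NAMESPACED_TYPE.test(type);
}

// Per-type validators, registered by the apps that understand each type.
const validators = new Map([
  ['paul-foomsg', (message) =>
    message !== null && typeof message === 'object' && message.paulFooType === 'v2'],
]);

// Unknown or un-namespaced types are rejected rather than guessed at.
function validate(msg) {
  if (!isNamespacedType(msg.type)) return false;
  const check = validators.get(msg.type);
  return check ? check(msg.message) : false;
}

console.log(validate({ type: 'paul-foomsg', message: { paulFooType: 'v2' } })); // true
console.log(validate({ type: 'foomsg', message: { paulFooType: 'v2' } }));      // false
```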

dominictarr · 2014-09-18T22:09:43Z

yeah. well this will have to do for now anyway.

jbenet · 2014-09-18T23:04:52Z

Thoughts on JSON-LD?

Part of me wants to force it, since it's a trivial addition of a context. It could really, really help.

Maybe the right thing for me to do is define a way to do it that doesn't force JSON (you can see JSON-LD as a Tree-LD, and protobufs are trees).

Make sure you watch the JSON-LD video before dismissing it.

pfrazee · 2014-09-18T23:14:31Z

Which video?

jbenet · 2014-09-19T07:20:51Z

JSON-LD https://www.youtube.com/watch?v=vioCbTo3C-4
Linked Data https://www.youtube.com/watch?v=4x_xzT5eF5Q

msporny's explanations are really good.

dominictarr · 2014-09-19T07:30:58Z

@jbenet what do you think are the key points? it would help if your links contained more context (!)

jbenet · 2014-09-19T07:36:49Z

@dominictarr watch the JSON-LD one, he explains how JSON-LD works (pro tip: bump it up to 2x speed). I really can't do his explanation justice. He makes the semantic web actually tractable.

dominictarr · 2014-09-19T08:29:36Z

if it's so simple, why can't you explain it in a sentence or two?

dominictarr · 2014-09-19T08:44:17Z

Okay I watched the videos, but to be honest, all I got from it was that you have a link with properties and a context... it sounds like the context is a schema of some sort.
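
To make "properties and a context" concrete, here is a toy sketch. The `@context` key is JSON-LD syntax, but the example.org IRIs are placeholders and `expandKeys` is a hand-rolled illustration, not the real JSON-LD expansion algorithm or any library's API.

```javascript
// A JSON-LD-style document: plain JSON plus an "@context" that maps short keys to IRIs.
const doc = {
  '@context': {
    name: 'http://example.org/vocab#name',
    follows: 'http://example.org/vocab#follows',
  },
  name: 'foo',
  follows: 'hash-of-p2', // placeholder for a real link
};

// Toy expansion: rewrite each key to the IRI its context gives it.
// Real JSON-LD processors do far more (nested contexts, @id, @type, and so on).
function expandKeys(document) {
  const context = document['@context'] || {};
  const expanded = {};
  for (const [key, value] of Object.entries(document)) {
    if (key === '@context') continue;
    expanded[typeof context[key] === 'string' ? context[key] : key] = value;
  }
  return expanded;
}

console.log(expandKeys(doc));
```

In this reading the context plays the role of a lightweight schema: it gives each short property name a globally unique identifier.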

dominictarr · 2014-10-07T23:00:58Z

okay so we went with this, closing.

dominictarr mentioned this issue Sep 14, 2014

encode message content with msgpack #25

Merged

dominictarr closed this as completed Oct 7, 2014
