Add version info to Hashes #466

pchiusano · 2019-04-19T14:17:20Z

pchiusano · 2019-04-29T18:07:32Z

I'm moving this off of M1 (but open to PRs from new contributors!!). The codebase format is already going to be versioned separately, so it's less important that the base58-encoded hashes of that format include Unison version information (and the hash algorithm) now. We could choose to add this info in a later version of the codebase format, or not, since it will be redundant - within a version of the codebase format, all the hashes will be of the same type.

I see this as being more useful when displaying hashes to users and when sharing copy-pastable hashes that can have an unambiguous meaning even as Unison evolves. It might also prove useful in the implementation of the Unison inter-node protocol so we can keep that in mind, too.

Here's a proposed self-describing hash, using multiformats:

<multibase><unison-multicodec-id><unison-version-id><multihash>

Notes:

- <multibase> will just be z for bitcoin base 58 if we're rendering the hash as text.
- <multihash> format is just <hash-algo><hash-len-in-bytes><hash-value>
- The <unison-multicodec-id> is added to the community table. This is the only thing that will be added to that table. We'd like to avoid spamming the community table every time there's a new version of Unison, and we don't want it to be a bottleneck for doing releases of Unison.
- The <unison-version-id> references a Unison application-specific multicodec table. (Initially, the "table" will just have one entry in it, Hash.unisonVersion1 = 1 :: Word8, stored in the Unison source itself.)
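
To make the layout above concrete, here is a minimal Haskell sketch of how the bytes could be assembled and rendered as text. It is only an illustration under stated assumptions: `unisonMulticodecId` is a hypothetical placeholder (no code has been registered), the function names are made up for this example and are not the Unison.Hash API, and the single-byte fields only hold while every value stays below 128. The base58 encoder is hand-rolled to keep the example dependency-free.

```haskell
import qualified Data.ByteString as BS
import Data.Word (Word8)

-- Hypothetical placeholder, NOT a registered multicodec code.
-- The real value would come from the multicodec community table.
unisonMulticodecId :: Word8
unisonMulticodecId = 0x7d

-- The single entry of the application-specific version table.
unisonVersion1 :: Word8
unisonVersion1 = 1

-- <hash-algo><hash-len-in-bytes><hash-value>
-- Assumes the algorithm code and the digest length both fit in one byte (< 128).
multihash :: Word8 -> BS.ByteString -> BS.ByteString
multihash algo digest =
  BS.cons algo (BS.cons (fromIntegral (BS.length digest)) digest)

-- Binary form: <unison-multicodec-id><unison-version-id><multihash>
selfDescribing :: Word8 -> BS.ByteString -> BS.ByteString
selfDescribing algo digest =
  BS.pack [unisonMulticodecId, unisonVersion1] <> multihash algo digest

-- Bitcoin base 58 alphabet (no 0, O, I, or l).
base58Alphabet :: String
base58Alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

-- Treat the bytes as a big-endian integer and emit base 58 digits;
-- each leading zero byte becomes a leading '1'.
encodeBase58 :: BS.ByteString -> String
encodeBase58 bs = replicate zeros '1' ++ go n ""
  where
    zeros = BS.length (BS.takeWhile (== 0) bs)
    n = BS.foldl' (\acc b -> acc * 256 + toInteger b) 0 bs :: Integer
    go 0 acc = acc
    go k acc =
      let (q, r) = k `quotRem` 58
       in go q (base58Alphabet !! fromInteger r : acc)

-- Text form: the multibase prefix 'z' followed by the base 58 bytes.
renderText :: BS.ByteString -> String
renderText bytes = 'z' : encodeBase58 bytes
```

For example, `renderText (selfDescribing 0x12 digest)` would render a SHA2-256 digest (multihash code 0x12); which hash algorithm Unison actually uses is not assumed here.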

I'd be open to a PR for this. I would just edit the Unison.Hash module. Some implementation notes (assuming the above sounds good):

- Open a PR against https://github.com/multiformats/multicodec#multicodec-table to add an entry for Unison. You can reference this issue.
- I dunno if I'd bother with the multihash dependency; these formats are so simple, it's like 3 LOC...
- The Hash type could still be a Hash ByteString, but those bytes will be:
  - <unison-multicodec-id><unison-version-id><multihash>
  - <multihash> is just <hash-algo><hash-len-in-bytes><hash-value>
  - So, basically, just don't include the multibase prefix; it's assumed to be binary.
- Then modify the base58 and fromBase58 functions accordingly.
- And also modify the Accumulate instance here to prepend:
  - <unison-multicodec-id><unison-version-id><hash-algo><hash-len-in-bytes> to the raw bytes produced by the hash.
- If we need varint serialization in Haskell, that's here. But I don't think that will be needed until we have more than 128 versions of Unison. 😀 The community table will just have a constant in it that we'll reference in the Haskell code.
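
For reference, the varint format used by multiformats stores 7 bits per byte, least significant group first, with the high bit set on every byte except the last. The sketch below is a hand-written illustration of that scheme, not the library linked above; `encodeVarint` and `decodeVarint` are names invented for this example. It also shows why a plain Word8 is enough for now: any value below 128 encodes to the same single byte.

```haskell
import Data.Bits (shiftL, shiftR, testBit, (.&.), (.|.))
import qualified Data.ByteString as BS
import Data.Word (Word64)

-- Emit 7 bits per byte, lowest group first; set the high bit on
-- every byte except the last one.
encodeVarint :: Word64 -> BS.ByteString
encodeVarint = BS.pack . go
  where
    go n
      | n < 0x80 = [fromIntegral n]
      | otherwise = (fromIntegral (n .&. 0x7f) .|. 0x80) : go (n `shiftR` 7)

-- Decode one varint from the front of the input, returning the value
-- and the remaining bytes. Fails on truncated input and on encodings
-- longer than 9 bytes.
decodeVarint :: BS.ByteString -> Maybe (Word64, BS.ByteString)
decodeVarint = go 0 0
  where
    go :: Int -> Word64 -> BS.ByteString -> Maybe (Word64, BS.ByteString)
    go shift acc bs
      | shift > 56 = Nothing
      | otherwise = case BS.uncons bs of
          Nothing -> Nothing
          Just (b, rest) ->
            let acc' = acc .|. (fromIntegral (b .&. 0x7f) `shiftL` shift)
             in if testBit b 7
                  then go (shift + 7) acc' rest
                  else Just (acc', rest)
```

With this encoding, `encodeVarint 1` is the single byte 0x01, while 128 is the first value that needs two bytes (0x80 0x01).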

tysonzero · 2020-01-28T22:49:34Z

Will this be a path towards allowing people to seamlessly store all their public Unison code on IPFS?

I'm a huge fan of all these CAS-focused projects, as I think it is absolutely the future. However, I think a huge chunk of the benefit is being able to store all public CAS content on a single decentralized network.

zipper97412 · 2020-03-30T11:52:47Z

Hi, newcomer here! I am also a huge fan of CAS (ipfs and ipld mostly). As I see things, the AST could be represented as an ipld object that links to other code by CID, and we could use an ipld store (ipfs) as the backend for ucm to handle the burden of storing artefacts. We would also get code sync and test results sync for free, just by resolving CIDs on ipfs first.
Todo:

- Use ipld (cbor or pb) as the AST storage format
- Use ipfs as the main ipld store: codebase, types, eval, namespace etc. will be stored and published by ipfs
- Also provide other store implementations that do not depend on ipfs but still use ipld as the format, e.g. a local store in a folder (like the current implementation)

Ipld also defines an archive format that could be used in the future as a binary format for standalone executables and/or libraries, just by providing an ipld store implementation for ucm.

I will probably open an issue for this later for comments :)

jphastings · 2021-11-04T18:21:54Z

Did you get anywhere with this @zipper97412?

solomon-b · 2021-12-10T07:06:27Z

I can try to take a stab at this if @zipper97412 is busy.

tysonzero · 2021-12-10T18:43:18Z

@pchiusano What's the reason for using a custom unison-multicodec over dag-cbor? Filecoin for example uses dag-cbor. This will give you a lot more interop with existing and future tooling, as dag-cbor is more or less the preferred multicodec outside of some dag-pb for files and folders.

pchiusano added this to the M1 milestone Apr 19, 2019

pchiusano mentioned this issue Apr 29, 2019

[Question] Using CID/multihash #459

Closed

pchiusano removed this from the M1 milestone Apr 29, 2019

pchiusano added the good first issue and help wanted labels Apr 29, 2019

mitchellwrosen removed the good first issue label Nov 14, 2023
