Migration of Blockstore to use multihash instead of CID as key #2415

This is part of the #1440 endeavor.

### Motivation

Currently, the blockstore (a key-value store) uses the CID as the key for a block's data. Because a CIDv1 can be encoded with different bases, the same data can end up stored several times under CIDs that differ only in their base encoding. The main motivation for tackling this now is the shift from CIDv0 (i.e. base58) to CIDv1 (base32 by default, though as mentioned any other encoding is also possible).

### Solution

Use the CID's multihash as the key in the blockstore.
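
To make the idea concrete, here is a minimal sketch of deriving a key from the multihash inside a binary CID. It relies only on the CID layout (a CIDv0 is a bare sha2-256 multihash; a CIDv1 is `<version varint><codec varint><multihash>`). The helper names `readVarint`, `multihashFromCidBytes` and `blockstoreKey`, and the hex key format, are assumptions made for illustration and are not the existing js-ipfs API.

```javascript
// Read an unsigned varint (LEB128) starting at `offset`.
// Returns the decoded value and the number of bytes consumed.
function readVarint (bytes, offset) {
  let value = 0
  let shift = 0
  let i = offset
  while (i < bytes.length) {
    const byte = bytes[i++]
    value += (byte & 0x7f) * Math.pow(2, shift)
    if ((byte & 0x80) === 0) return { value, length: i - offset }
    shift += 7
  }
  throw new Error('truncated varint')
}

// Hypothetical helper: return the multihash portion of a binary CID.
function multihashFromCidBytes (cidBytes) {
  // CIDv0 is a bare sha2-256 multihash: 0x12 (sha2-256), 0x20 (32 bytes), digest.
  if (cidBytes.length === 34 && cidBytes[0] === 0x12 && cidBytes[1] === 0x20) {
    return cidBytes
  }
  const version = readVarint(cidBytes, 0)
  if (version.value !== 1) throw new Error('unsupported CID version')
  const codec = readVarint(cidBytes, version.length)
  // Everything after the version and codec varints is the multihash.
  return cidBytes.slice(version.length + codec.length)
}

// Hypothetical key format (an assumption): lower-case hex of the multihash bytes.
function blockstoreKey (cidBytes) {
  return Array.from(multihashFromCidBytes(cidBytes))
    .map(b => b.toString(16).padStart(2, '0'))
    .join('')
}

// A CIDv0 and a CIDv1 (dag-pb, 0x70) built around the same digest.
const digest = new Uint8Array(32).fill(0xab)
const v0 = Uint8Array.from([0x12, 0x20, ...digest])
const v1 = Uint8Array.from([0x01, 0x70, ...v0])
console.log(blockstoreKey(v0) === blockstoreKey(v1)) // true
```

With a key like this, a block written under a CIDv0 and later requested under the matching CIDv1 resolves to the same entry, whatever base the CID string was printed in.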

### Parts affected

This change will ripple through several commands and packages. Below is what I have found in my analysis so far. The code most affected is whatever uses `query` on the repo's blockstore.

 1. Garbage collection - lists the CIDs of all stored blocks, compares them with the pinned CIDs, and removes those that are not pinned.
	* This will have to compare multihashes instead of CIDs, e.g. take only the pinset, extract the multihashes from it, and run GC based on those.
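
A rough sketch of what a multihash-based GC comparison could look like. The function name `findGarbage`, the `toKey` callback, and the shape of the CID objects are all hypothetical; the sketch also assumes the pinned list already contains every block reachable from recursive pins.

```javascript
// Hypothetical helper: decide which stored blocks are garbage.
// storedKeys: iterable of multihash keys found in the blockstore
// pinnedCids: iterable of pinned CIDs (assumed to include all descendants)
// toKey:      function mapping a CID to its multihash key
function findGarbage (storedKeys, pinnedCids, toKey) {
  // Collect the multihash key of every pinned CID.
  const live = new Set()
  for (const cid of pinnedCids) live.add(toKey(cid))

  // Anything stored but not referenced by a pinned multihash can be removed.
  const garbage = []
  for (const key of storedKeys) {
    if (!live.has(key)) garbage.push(key)
  }
  return garbage
}

// Illustrative data only: two pins share a multihash but differ in CID version.
const stored = ['1220aa', '1220bb', '1220cc']
const pinned = [
  { version: 0, multihash: '1220aa' },
  { version: 1, multihash: '1220aa' },
  { version: 1, multihash: '1220cc' }
]
console.log(findGarbage(stored, pinned, cid => cid.multihash)) // [ '1220bb' ]
```

Because the comparison happens on multihashes, pinning the same content under two different CIDs protects the single stored block once, rather than leaving one copy unprotected.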

### Problems and possible solutions

 1. `ipfs refs local` - returns the list of CIDs of locally stored objects.
    * **Constructing new CIDs** - We could return new CIDs built by wrapping the stored multihashes in base32. But if somebody stored a block under a different encoding, they won't find it in this listing.
    * **Retain original CID** - We could wrap the data in an object that keeps the original CID as metadata, something like `{ cid: key, data: buff }`, and store that in the datastore. Or we could keep a separate Map alongside to track this, though "conflicts" then become possible (e.g. several CIDs having the same multihash). How should those be handled?

 1. The `Block` class (in `js-ipfs-block`) has a `cid` property. Should it be changed to `multihash`? It is used heavily in many packages, so I guess not, but then it won't always be possible to create a Block with a CID (e.g. see the previous problem).
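
For the "Retain original CID" option, one possible way to deal with the "conflicts" is to let each multihash map to a set of CIDs rather than a single one. The sketch below is only an illustration of that idea: the `CidIndex` class is hypothetical, it is kept in memory rather than in the datastore, and the CID strings are placeholders, not real CIDs.

```javascript
// Hypothetical side index: multihash key -> set of CID strings it was stored under.
class CidIndex {
  constructor () {
    this.byMultihash = new Map()
  }

  // Record that a block with this multihash was stored under this CID.
  add (multihashKey, cidString) {
    let cids = this.byMultihash.get(multihashKey)
    if (!cids) {
      cids = new Set()
      this.byMultihash.set(multihashKey, cids)
    }
    cids.add(cidString)
  }

  // Every CID ever recorded, e.g. for an `ipfs refs local`-style listing.
  list () {
    const out = []
    for (const cids of this.byMultihash.values()) out.push(...cids)
    return out
  }
}

const index = new CidIndex()
index.add('1220ab', 'QmPlaceholderV0')   // placeholder strings, not real CIDs
index.add('1220ab', 'bafyPlaceholderV1') // same multihash, different CID
console.log(index.list().length) // 2
```

The trade-off is extra bookkeeping: the index has to be updated on every put and delete, and it grows with the number of distinct CIDs used rather than the number of blocks.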

### Questions

As discussed in the [weekly call](https://cryptpad.fr/code/#/2/code/edit/qbcGi7JVeZnzMUJkT0ZlBh-+/), @Stebalien mentioned that "provider records need to use raw multihashes". Is this related to Bitswap and `ipfs dht findprovs`, @Stebalien? If so, does it mean that Bitswap should be changed to use, negotiate and exchange multihashes instead of CIDs? How far should this ripple? Into content routing? I am not very familiar with this part of the codebase, so I will need some guidance on this. I am also not sure whether this needs to happen right away. I feel it is related to, but not required for, what we are doing here right now.

@alanshaw please also provide your input.
 
