Context
A standard IPFS data request causes the Exchange (i.e. Bitswap) to search the Content Routing system for a set of PeerInfo objects (which are just PeerIDs + their multiaddrs). The Exchange then takes these PeerInfo objects and requests data from the peers.
As a result, libp2p peers must proxy all data advertised to the Content Routing system. But if the data is already available elsewhere, shouldn't we be able to access it directly?
Proposal
I would like to be able to request data from multiaddrs that do not correspond to libp2p peers. For example, if we want to store data with some cloud storage provider like AWS S3 we could put a provide record in the DHT that Hash(Data) lives at /http/mybucket.s3.amazonaws.com/Data.
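As a sketch of what a retriever might do with such a record: convert the illustrative /http/host/path address form used above into a plain HTTP URL it can fetch. Note the /http/... shorthand here follows the example in this proposal; the real multiaddr spec spells HTTP addresses differently, and the function name is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// httpURLFromMultiaddr converts the illustrative "/http/<host>/<path>"
// address form used in this proposal into a plain HTTP URL.
// (Hypothetical helper: real multiaddrs encode HTTP endpoints
// differently, e.g. /dns4/<host>/tcp/80/http.)
func httpURLFromMultiaddr(ma string) (string, error) {
	parts := strings.SplitN(strings.TrimPrefix(ma, "/"), "/", 3)
	if len(parts) < 2 || parts[0] != "http" {
		return "", fmt.Errorf("not an /http multiaddr: %s", ma)
	}
	url := "http://" + parts[1]
	if len(parts) == 3 {
		url += "/" + parts[2]
	}
	return url, nil
}

func main() {
	u, _ := httpURLFromMultiaddr("/http/mybucket.s3.amazonaws.com/Data")
	fmt.Println(u) // http://mybucket.s3.amazonaws.com/Data
}
```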
Motivation
While we could instead run a compute node, like EC2, with a set of IPFS Cluster daemons backed by an S3 datastore, that is certainly more costly. This becomes even more interesting if we can "draft" data that's already publicly available over HTTP into IPFS.
Implications for future work
While the first iteration of this idea is conceptually fairly simple, it has implications for some of our ongoing endeavors. For instance, if we have 1000 small blocks hosted on /http/mysite.com/Data1-1000 that are all part of a single IPLD object, we wouldn't be able to just provide the root IPLD node, since there's no peer that can tell us where the other 999 blocks are. There are various ways we could extend the protocol to tell retrievers where the other 999 blocks are, but it's not as simple as with the existing peer-based retrieval.
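One possible shape for such an extension, purely as a sketch (every type and field name here is hypothetical, not an existing API): the provide record for the root could carry a URL pattern and block count, so a single record is enough for a retriever to locate all 1000 blocks without asking any peer.

```go
package main

import "fmt"

// ExtendedProvideRecord is a hypothetical extension of a DHT provide
// record: alongside the root CID, it advertises where the sibling
// blocks live, so one record covers the whole 1000-block object.
type ExtendedProvideRecord struct {
	RootCID     string // CID of the root IPLD node
	URLTemplate string // e.g. "http://mysite.com/Data%d"
	NumBlocks   int    // blocks are numbered 1..NumBlocks
}

// BlockURL expands the template for block i, telling the retriever
// where to fetch that block over HTTP.
func (r ExtendedProvideRecord) BlockURL(i int) string {
	return fmt.Sprintf(r.URLTemplate, i)
}

func main() {
	rec := ExtendedProvideRecord{
		RootCID:     "QmRootExample", // placeholder CID
		URLTemplate: "http://mysite.com/Data%d",
		NumBlocks:   1000,
	}
	fmt.Println(rec.BlockURL(1), rec.BlockURL(rec.NumBlocks))
}
```

Each fetched block would still be verified against its CID on arrival, so the record itself doesn't need to be trusted.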
Additionally, we would likely face increasing demand to support large files that are available over HTTP. Since we don't want users to download a lot of data before it's verified, we'd probably want to extend the protocol with some ability for the peers advertising the content in the DHT to add (references to) hashes of chunks of the large file that could be verified. Similarly, we'd want to add the ability to download ranges of bytes when presented with a multiaddr that supports that functionality.
I think implementing this functionality could make running "pinning" services much easier and less expensive, as well as greatly increasing the amount of content accessible via IPFS. What do you think, @Stebalien @raulk @bigs?