From 9d2e6e64fa73205c352586ec8793c34ec265ba5c Mon Sep 17 00:00:00 2001 From: David Dias Date: Fri, 26 Jun 2015 14:58:48 +0100 Subject: [PATCH 1/6] add initial DHT spec with some questions --- protocol/routing/README.md | 80 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 80 insertions(+) create mode 100644 protocol/routing/README.md diff --git a/protocol/routing/README.md b/protocol/routing/README.md new file mode 100644 index 000000000..b3f0b4f1c --- /dev/null +++ b/protocol/routing/README.md @@ -0,0 +1,80 @@ +IPFS Routing Protocol Spec +========================== + +Authors: David Dias + +Reviewers: + +TODOS: + +----------------------- + +> This spec defines the routing protocol spec, covering `Peer discovery`, `Routing` and the `DHT`. The spec is a **Work In Progress**. + +## Supports + +- Peer discovery through + - mdns + - custom peers list + - random walking on the network +- Routing primitives + - Publish and fetch content (also providing) +- Maintaining partial state of the network + - DHT + - kbucket + +### Overview + +The Routing Protocol is divided in three major components, these are: +- Peer Discovery: Responsible for filling our kbucket with best candidates. +- Interface: Our routing primitives that are offered for the user, such as finding and publishing content, including the storage and lookup of metadata (Providers). +- Peer-to-peer Structured Overlay Network: Algorithm for the implicit network organization, based on [Coral](http://iptps03.cs.berkeley.edu/final-papers/coral.pdf) and [mainlineDHT](http://www.bittorrent.org/beps/bep_0005.html) + +Bootstrapping the routing happens by connecting to a predefined "railing" peers list, shipped with the go-ipfs release and/or by discovery through mDNS. Once at least one peer is found and added to the kbucket, the routing changes to an active state and our peer becomes able to route and receive messages. 
+ +### Peer Discovery + +#### bootstrap peer list + +List with known and trusted peers shipped with IPFS. + +- _How is this list updated?_ +- _Is this list updated periodically_? + +#### random walk + +IPFS issues random Peer lookups periodically to refresh our kbucket if needed. For impl reference, see: https://github.com/ipfs/go-ipfs/blob/master/routing/dht/dht_bootstrap.go#L88-L109. + +#### mDNS + +In addition to known peers and random lookups, IPFS also performs Peer Discovery through mDNS ([MultiCast DNS](https://tools.ietf.org/html/rfc6762)) + +-_How offen do we issue this searches?_ + +### Routing + +For impl reference, check: https://github.com/ipfs/go-ipfs/blob/master/routing/routing.go#L19-L49 + +#### Find a peer + +_When searching for a peer, do we fetch the kbucket from a peer and see which peer we want to ping next or do we ask for a given Id to a peer and that peer replies to us with the best candidate (or itself if it is the case)?_ + +#### Ping + +#### Provide + +#### Get value + +#### Put value + +1. find peer +2. transfer +3. provide + +### DHT + +explain: +- dht/coral, how the algo works +- kbucket +- each time a contact is made with a new peer, we check to see if it is a better candidate for our kbucket +- xor metric From 1feb2cf7ccab75e6042d3c57a9ba741f46c82029 Mon Sep 17 00:00:00 2001 From: David Dias Date: Fri, 26 Jun 2015 14:59:00 +0100 Subject: [PATCH 2/6] add initial DHT spec with some questions --- protocol/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/protocol/README.md b/protocol/README.md index 9559a2f14..17eea286d 100644 --- a/protocol/README.md +++ b/protocol/README.md @@ -122,7 +122,7 @@ of implementations. For example: to one of a set of supernodes. This is roughly like federated routing. - **dns:** ipfs routing could even happen over dns. -See more in the routing spec (TODO). +See more in the [routing spec](https://github.com/ipfs/specs/tree/master/protocol/routing). 
### Block Exchange -- transfering content-addressed data From 213573677832a7083d56ad1b5137464bb8c6ba08 Mon Sep 17 00:00:00 2001 From: David Dias Date: Fri, 26 Jun 2015 15:07:55 +0100 Subject: [PATCH 3/6] add ping --- protocol/routing/README.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/protocol/routing/README.md b/protocol/routing/README.md index b3f0b4f1c..076859902 100644 --- a/protocol/routing/README.md +++ b/protocol/routing/README.md @@ -61,6 +61,10 @@ _When searching for a peer, do we fetch the kbucket from a peer and see which pe #### Ping +Ping mechanism (for heartbeats). Ping a peer and log the time it took to answer. + +_what if the Id doesn't exist? Is there any rule for non existing peers? Should we log time for best matches as well?_ + #### Provide #### Get value From 751c906bf8e1feafabf673f7f218e2307be3ea61 Mon Sep 17 00:00:00 2001 From: David Dias Date: Fri, 26 Jun 2015 15:15:28 +0100 Subject: [PATCH 4/6] provinding segment --- protocol/routing/README.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/protocol/routing/README.md b/protocol/routing/README.md index 076859902..80cb3192a 100644 --- a/protocol/routing/README.md +++ b/protocol/routing/README.md @@ -67,6 +67,15 @@ _what if the Id doesn't exist? Is there any rule for non existing peers? Should #### Provide +Providing is the process of storing/updating the metadata (pointers) of where the blocks of a given file are stored/available in the IPFS network. What this means is that the DHT is not used for block discovery, but for the metadata which identifies where they are, instead. +When a node advertises a block available for download, IPFS stores a record in the DHT with its own Peer.ID. This is termed "providing". the node becomes a "provider". Requesters who wish to retrieve the content, query the DHT (or DSHT) and need only to retrieve a subset of providers, not all of them. (this works better with huge DHTs, and latency-aware DHTs like coral). 
+ +We provide once per block, because every block (even sub-blocks) are independently addressable by their hash. (yes, this is expensive, but we can mitigate the cost with better DHT + record designs, bloom filters, and more) + +There is an optimistic optimization -- which is that if a node is storing a node that is the parent (root/ancestor) of other nodes, then it is much more likely to also be storing the children. So when a requester attempts to pull down a large dag, it first queries the DHT for providers of the root. Once the requester finds some and connects directly to retrieve the blocks, bitswap will optimistically send them the "wantlist", which will usually obviate any more dht queries for that dag. we haven't measured this to be true yet -- we need to -- but in practice it seems to work quite well, else we wouldnt see as quick download speeds. (one way to look at it, is "per-dag swarms that overlap", but it's not a fully correct statement as having a root doesn't necessarily mean a node has any or all children.) + +Providing a block happens as it gets added. Reproviding happens periodically, currently 0.5 * dht record timeout ~= 12 hours. + #### Get value #### Put value From 0e260f7b43ab583aadd500a540404af0b1d9749b Mon Sep 17 00:00:00 2001 From: David Dias Date: Fri, 26 Jun 2015 19:01:51 +0100 Subject: [PATCH 5/6] put question --- protocol/routing/README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/protocol/routing/README.md b/protocol/routing/README.md index 80cb3192a..4fa8f1373 100644 --- a/protocol/routing/README.md +++ b/protocol/routing/README.md @@ -78,11 +78,11 @@ Providing a block happens as it gets added. Reproviding happens periodically, cu #### Get value + + #### Put value -1. find peer -2. transfer -3. provide +_not 100% about this happens exactly. From what I understand, the IPFS node that is adding the file, breaks the file into blocks, creates the hashes and provides each single one of them. 
When do we execute a Put? Replicas are done through "Get", right?_ ### DHT From dad0bb74735f4b9b473b088abb668d9d583bd31c Mon Sep 17 00:00:00 2001 From: David Dias Date: Tue, 30 Jun 2015 12:30:42 -0700 Subject: [PATCH 6/6] separate DHT, discovery and routing specs --- protocol/routing/DHT.md | 28 ++++++++ protocol/routing/README.md | 61 ++++++------------ protocol/routing/discovery.md | 41 ++++++++++++ .../routing/figs/routing-overview.monopic | Bin 0 -> 920 bytes protocol/routing/figs/routing-overview.txt | 6 ++ 5 files changed, 95 insertions(+), 41 deletions(-) create mode 100644 protocol/routing/DHT.md create mode 100644 protocol/routing/discovery.md create mode 100644 protocol/routing/figs/routing-overview.monopic create mode 100644 protocol/routing/figs/routing-overview.txt diff --git a/protocol/routing/DHT.md b/protocol/routing/DHT.md new file mode 100644 index 000000000..0f9ef8935 --- /dev/null +++ b/protocol/routing/DHT.md @@ -0,0 +1,28 @@ +IPFS DHT Protocol Spec +====================== + +Authors: David Dias + +Reviewers: + +TODOS: + +----------------------- + +> + +## Supports + +- Maintaining partial state of the network + - DHT + - kbucket + +## Overview + + + +explain: +- dht/coral, how the algo works +- kbucket +- each time a contact is made with a new peer, we check to see if it is a better candidate for our kbucket +- xor metric diff --git a/protocol/routing/README.md b/protocol/routing/README.md index 4fa8f1373..8f51666c7 100644 --- a/protocol/routing/README.md +++ b/protocol/routing/README.md @@ -9,63 +9,49 @@ TODOS: ----------------------- -> This spec defines the routing protocol spec, covering `Peer discovery`, `Routing` and the `DHT`. The spec is a **Work In Progress**. +> This spec defines the routing protocol spec. Routing offers an interface for the features exposed by `Peer discovery` and `DHT`. The spec is a **Work In Progress**. 
 ## Supports -- Peer discovery through - - mdns - - custom peers list - - random walking on the network - Routing primitives - Publish and fetch content (also providing) - Maintaining partial state of the network - DHT - kbucket -### Overview +## Overview -The Routing Protocol is divided in three major components, these are: -- Peer Discovery: Responsible for filling our kbucket with best candidates. +The Routing Protocol is composed of three components: - Interface: Our routing primitives that are offered for the user, such as finding and publishing content, including the storage and lookup of metadata (Providers). -- Peer-to-peer Structured Overlay Network: Algorithm for the implicit network organization, based on [Coral](http://iptps03.cs.berkeley.edu/final-papers/coral.pdf) and [mainlineDHT](http://www.bittorrent.org/beps/bep_0005.html) - -Bootstrapping the routing happens by connecting to a predefined "railing" peers list, shipped with the go-ipfs release and/or by discovery through mDNS. Once at least one peer is found and added to the kbucket, the routing changes to an active state and our peer becomes able to route and receive messages. - -### Peer Discovery - -#### bootstrap peer list - -List with known and trusted peers shipped with IPFS. - -- _How is this list updated?_ -- _Is this list updated periodically_? - -#### random walk - -IPFS issues random Peer lookups periodically to refresh our kbucket if needed. For impl reference, see: https://github.com/ipfs/go-ipfs/blob/master/routing/dht/dht_bootstrap.go#L88-L109. - -#### mDNS +- Peer Discovery: Responsible for filling our kbucket with best candidates. 
+- Peer-to-peer Structured Overlay Network (DHT): Algorithm for the implicit network organization, based on [Coral](http://iptps03.cs.berkeley.edu/final-papers/coral.pdf) and [mainlineDHT](http://www.bittorrent.org/beps/bep_0005.html) -In addition to known peers and random lookups, IPFS also performs Peer Discovery through mDNS ([MultiCast DNS](https://tools.ietf.org/html/rfc6762)) +``` +┌──────────────┐ +│ routing │ +└──────────────┘ +┌─────────┐┌───┐ +│discovery││DHT│ +└─────────┘└───┘ +``` --_How offen do we issue this searches?_ +In order for routing to work, we first have to pass the bootstrap state. Bootstrapping happens by connecting to a predefined "railing" peers list, shipped with the go-ipfs release and/or by discovery through mDNS. Once at least one peer is found and added to the kbucket, the routing changes to an active state and our peer becomes able to route and receive messages. -### Routing +## Routing For impl reference, check: https://github.com/ipfs/go-ipfs/blob/master/routing/routing.go#L19-L49 -#### Find a peer +### Find a peer _When searching for a peer, do we fetch the kbucket from a peer and see which peer we want to ping next or do we ask for a given Id to a peer and that peer replies to us with the best candidate (or itself if it is the case)?_ -#### Ping +### Ping Ping mechanism (for heartbeats). Ping a peer and log the time it took to answer. _what if the Id doesn't exist? Is there any rule for non existing peers? Should we log time for best matches as well?_ -#### Provide +### Provide Providing is the process of storing/updating the metadata (pointers) of where the blocks of a given file are stored/available in the IPFS network. What this means is that the DHT is not used for block discovery, but for the metadata which identifies where they are, instead. When a node advertises a block available for download, IPFS stores a record in the DHT with its own Peer.ID. This is termed "providing". the node becomes a "provider". 
Requesters who wish to retrieve the content, query the DHT (or DSHT) and need only to retrieve a subset of providers, not all of them. (this works better with huge DHTs, and latency-aware DHTs like coral). @@ -76,18 +62,11 @@ There is an optimistic optimization -- which is that if a node is storing a node Providing a block happens as it gets added. Reproviding happens periodically, currently 0.5 * dht record timeout ~= 12 hours. -#### Get value +### Get value -#### Put value +### Put value _not 100% about this happens exactly. From what I understand, the IPFS node that is adding the file, breaks the file into blocks, creates the hashes and provides each single one of them. When do we execute a Put? Replicas are done through "Get", right?_ -### DHT - -explain: -- dht/coral, how the algo works -- kbucket -- each time a contact is made with a new peer, we check to see if it is a better candidate for our kbucket -- xor metric diff --git a/protocol/routing/discovery.md b/protocol/routing/discovery.md new file mode 100644 index 000000000..88fac3897 --- /dev/null +++ b/protocol/routing/discovery.md @@ -0,0 +1,41 @@ +IPFS Peer Discovery Protocol Spec +================================= + +Authors: David Dias + +Reviewers: + +TODOS: + +----------------------- + +> + +## Supports + +- Peer discovery through + - mdns + - custom peers list + - random walking on the network + +## Overview + + +### bootstrap peer list + +List with known and trusted peers shipped with IPFS. + +- _How is this list updated?_ +- _Is this list updated periodically_? + +### random walk + +IPFS issues random Peer lookups periodically to refresh our kbucket if needed. For impl reference, see: https://github.com/ipfs/go-ipfs/blob/master/routing/dht/dht_bootstrap.go#L88-L109. 
+ +### mDNS + +In addition to known peers and random lookups, IPFS also performs Peer Discovery through mDNS ([MultiCast DNS](https://tools.ietf.org/html/rfc6762)) + +-_How offen do we issue this searches?_ + + diff --git a/protocol/routing/figs/routing-overview.monopic b/protocol/routing/figs/routing-overview.monopic new file mode 100644 index 0000000000000000000000000000000000000000..631136ca66cc20e3d0fede6347a0cdb531ad8507 GIT binary patch literal 920 zcmV;J184mIO;1iwP)S1pABzY8000000t4k(%W|7A6#bQq)?@~WM{IA~Mc19~IvGZQ z3~DgA5_Xb$^6$M8k_^~MoCo7cvOrc8=|0YV9KjAX=^vK05-oOWdKsum+K8Nqq~&+4 zN?L&y2zjVwcWpu|UeqgAw`k#eC~N8#=1E$yI%hdrv|_`|PF_o%GrJ^9nm4zd#7-P- zTbh*=j@mr~t*Ox}z&KhLn~I90+kh6Khn9k_SVHTpY(yg15;iOAF+n3amF(nHO?IO! zT+&K{uDr?m$6!Or^PJUggYuy}nbPd05Y48}llFcM))H{MEn%f7-S*xlRx79WV9-Ki zrOqZCmWl7+XqHRKlp5djk#$P)f1m_m>;?B6lPJ2#ex!1QtYaODE9vc`R#t>Bl+|=>D znGl{XHLwbpSkjCoVCXm%hOSqK>9wy84aln5GFNLVItj~p+e4@?(1l`VkIP%eI>+>% z7CF6gSXM)09}H-|LdT^po*kywc-Lk8?d@AAYVMjBzui^3a@|3!QJq%bwpj?v)`Xy<}Qev;ZrxkxcN5MMo!! zhDv!0qPam)!jl(5DGL+VHQ0$$-%5SbL1>qqwBCaL_ zlYDHlntvLyiq6he!O2`j!t-*~+z=A+k?S^QVrh&7UxTYi^p0Fbf*1Fh`Ke8N{<~B(URw)yjOJe;m*c?3c1gp+S>n)XoEcJl_JR74o z5Y_=3Ibw-l3