
Add minimal Kademlia DHT spec #108

Merged · 40 commits · Jun 29, 2021
Conversation

@raulk (Member) commented Nov 7, 2018

This was a complicated birth.

TODO:

  • Specify that we haven't implemented PINGs.
  • Document deviations in bucket behaviour from baseline Kademlia. Since we don't use PINGs, we don't test the least recently seen peer; we evict it blindly, causing a high degree of bucket thrashing and losing the heuristic "the longer a peer has been around, the more likely it is to stay around" (see the sketch after this list).
  • Revisit the RPC messages section. Copy the protobufs and explain each field meticulously.
  • Resolve all feedback below.
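
For illustration, a minimal sketch of the eviction deviation described above (not the actual go-libp2p or rust-libp2p code; all names are hypothetical):

```go
// Illustrative sketch only; this is not the go-libp2p or rust-libp2p code.
package kbucket

// PeerID stands in for a libp2p peer identifier.
type PeerID string

// Bucket holds up to k peers, ordered from least recently seen (front)
// to most recently seen (back).
type Bucket struct {
	peers []PeerID
	k     int
}

// Insert adds or refreshes a peer. Baseline Kademlia would PING the least
// recently seen peer of a full bucket and evict it only if the PING fails.
// The deviation described above skips the PING and evicts blindly, which is
// what causes the bucket thrashing mentioned in the TODO.
func (b *Bucket) Insert(p PeerID) {
	for i, existing := range b.peers {
		if existing == p {
			// Already known: move to the most recently seen position.
			b.peers = append(b.peers[:i], b.peers[i+1:]...)
			b.peers = append(b.peers, p)
			return
		}
	}
	if len(b.peers) >= b.k {
		// Deviation: evict the least recently seen peer without testing it.
		b.peers = b.peers[1:]
	}
	b.peers = append(b.peers, p)
}
```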

Update by @mxinden on 2021-05-14.

This pull request adds a minimal specification for the libp2p Kademlia DHT protocol. It is a minimal version and thus while striving for correctness it does not strive for completeness. The version proposed in this pull request sets a base layer specification to be expanded in the future.

Areas worth expanding in future pull requests:

@ghost assigned raulk Nov 7, 2018
@ghost added the in progress label Nov 7, 2018
@raulk requested a review from jhiesey November 7, 2018 18:49
@raulk changed the title from "WIP Initial Kademlia DHT spec" to "WIP Kademlia DHT spec" Nov 7, 2018
@jhiesey (Contributor) left a comment

Looks pretty good, just a few comments. Do you want me to add the WIP (jhiesey) sections before or after merging this?

@vasco-santos (Member) left a comment

Seems like a good initial version of this spec! Thanks @raulk

Left just one comment.

@tomaka (Member) commented Nov 14, 2018

We should also mention how the substreams that use the protocol behave.
Is it expected to open one substream per request? Or should implementations make an effort to only use one at a time and send requests one by one? Should endpoints close their substream after a successful response, or can they continue sending more requests?
(note: I actually have these questions right now for Rust, as I have no clue)

@tomaka previously requested changes Nov 14, 2018
@tomaka (Member) commented Nov 14, 2018

Kademlia is by far the protocol I've suffered the most with when implementing it in Rust, because of all the differences between the well-documented Kademlia algorithm and the way it is implemented in libp2p. I think we should focus more on the latter and not copy-paste what people can already find on Google.

@tomaka (Member) commented Nov 14, 2018

Also, writing the Kademlia specs is a tremendous task. I don't expect a complete spec to be less than ~2k lines, and expecting a single person to write it is very optimistic.

@raulk (Member, Author) commented Nov 14, 2018

@tomaka

Kademlia is by far the protocol I've suffered the most with when implementing it in Rust, because of all the differences between the well-documented Kademlia algorithm and the way it is implemented in libp2p. I think we should focus more on the latter and not copy-paste what people can already find on Google.

That was the spirit of this spec: to focus on differential areas vs. regurgitating how Kademlia works (as conveyed in the spec intro). Hence it covers provider records, public key records, conflict resolution, peer correction, etc., all of which are specific to the libp2p Kad DHT.

I don't expect a complete spec to be less than ~2k lines

Could you enumerate which other aspects are worth covering, i.e. aspects that are unique to libp2p? We don't want to clone the Kademlia (and friends) literature.


Regarding all the data model comments, there's a request in the doc for @jhiesey to replace these descriptions with the protobuf.

@tomaka (Member) commented Nov 15, 2018

Could you enumerate which other aspects are worth covering, i.e. aspects that are unique to libp2p? We don't want to clone the Kademlia (and friends) literature.

Well, the 2k would include actually covering Kademlia, which I think we should do anyway, just not urgently.

I think there should be more explanation of what actually happens on the network, rather than just dumping a protobuf definition file.
For example, which fields need to be filled for which kind of RPC query? What is the format of the fields (bytes doesn't mean much)? Should nodes relay multiaddresses in their raw format, or produce an error/ignore the ones they can't decode? Is there a limit to the number of multiaddresses that a peer info should contain?
That's just off the top of my head.
Also, the reason why I didn't implement the record store in rust-libp2p at the time is that the Record definition was way too vague to be helpful. I'd expect more help from a spec.

@raulk (Member, Author) commented Nov 17, 2018

@tomaka I agree that the RPC section needs redoing. The idea is to copy the protobufs, as these are normative for interoperability, and explain how each field is used and serialised (especially for the bytes types, which can be anything). @jhiesey, are you planning to tackle this?

I do recognise you've implemented this from scratch, and therefore your feedback is valuable. However, in all honesty, I don't see the value in reinventing the wheel and re-specifying the Kademlia baseline in this spec. I'd rather make it prerequisite reading (like I've done) and build on top of it.

In a nutshell, I think of this spec as a diff between baseline Kademlia and our implementation, which:

  • deviates from Kademlia in some aspects, e.g. the way we manage buckets, the lack of PINGs, etc. (this needs specifying!)
  • cherry-picks ideas from Coral, mainlineDHT, etc.

Maybe you can compile a list of all the areas you tripped over, and we can make sure to cover them?

Also, Kademlia is abstract, in the sense that it doesn't specify wire messages, RPCs, timeouts, etc. So our spec should define those aspects very clearly.

@ghost mentioned this pull request Nov 19, 2018
@tomaka (Member) commented Nov 28, 2018

cc the second part of this comment: #111 (comment)

@raulk (Member, Author) commented Dec 17, 2018

@jhiesey tagged you in the comments that need your attention; let's try to push this through! ;-)

@jhiesey (Contributor) commented Dec 21, 2018

@raulk sorry for not getting to this before now! Will work on this today.

@jhiesey (Contributor) commented Dec 21, 2018

I've addressed the issues I see. Left some more comments too.

@jhiesey (Contributor) commented Jan 10, 2019

What's the status on this @raulk? Anything I can do to help?

@jhiesey (Contributor) commented Jan 11, 2019

After reading @anacrolix 's feedback on this and my refactor proposal, I think we should simplify this DHT node spec substantially and move a bunch of the hairier stuff into separate discovery modules with their own specs.

@Mikerah (Contributor) commented Feb 6, 2019

I noticed that some of the links in the bibliography are behind Springer's paywall. It would be awesome to provide links to these papers from the authors' websites, for example. I think this would increase the accessibility of the spec.


### Kademlia routing table

The data structure backing this system is a k-bucket routing table, closely

It is not clear to me what this document is a spec of. This paragraph describes our implementation, not the spec of the protocol (which is implementation-agnostic). E.g. it is an implementation detail that we use the k-bucket data structure. Other places in our stack use XOR-tries instead (which is generally recommended). So, what is this a spec of? Or is this document a description of an implementation?


A protocol spec would be worded differently. It would say: An implementation of a Kademlia node must maintain K peers with shared prefix of length L, for every L.
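
For illustration, a minimal sketch of that invariant (all names are hypothetical; keys are assumed to be fixed-length digests, e.g. SHA-256 of the peer ID): a routing table keeps at most K peers per shared prefix length L, regardless of whether k-buckets or an XOR-trie back it.

```go
// Illustrative sketch only; type and helper names are hypothetical.
package routing

import "math/bits"

const K = 20 // replication parameter k

// sharedPrefixLen returns the number of leading bits that local and remote
// share, i.e. the number of leading zero bits of their XOR distance.
// Both keys are assumed to have the same fixed length.
func sharedPrefixLen(local, remote []byte) int {
	n := 0
	for i := range local {
		d := local[i] ^ remote[i]
		if d == 0 {
			n += 8
			continue
		}
		return n + bits.LeadingZeros8(d)
	}
	return n
}

// Table maintains at most K peers for every shared prefix length L. Whether
// it is backed by k-buckets or an XOR-trie is an implementation detail.
type Table struct {
	local   []byte
	byDepth map[int][][]byte // L -> keys of peers with that shared prefix length
}

// TryAdd inserts a peer key unless the "at most K peers per prefix length"
// invariant is already met for its depth.
func (t *Table) TryAdd(peerKey []byte) bool {
	if t.byDepth == nil {
		t.byDepth = make(map[int][][]byte)
	}
	l := sharedPrefixLen(t.local, peerKey)
	if len(t.byDepth[l]) >= K {
		return false
	}
	t.byDepth[l] = append(t.byDepth[l], peerKey)
	return true
}
```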

@mxinden (Member) commented Jun 3, 2021

So, what is this a spec of?

Good point. This should be a specification of the protocol. Not a description of a specific implementation.

E.g. it is an implementation detail that we use kbucket data structure.

Thanks. 9355a8f refines this section, treating the data structure as an implementation detail. The section now contains your generic description using key prefix lengths.

Other places in our stack use XOR-tries instead (which is generally recommended).

I was not aware of XOR-tries. I will take a deeper look. With 9355a8f the specification suggests XOR-tries as one possible data structure.

The libp2p Kademlia DHT offers the following types of routing operations:

- **Peer routing** - _Finding_ the closest nodes to a given key (`FIND_NODE`).
- **Value routing** - _Putting_ a value to the nodes closest to the value's key

This is not a routing operation (as the section title suggests). Also the description is not accurate.
PUT_VALUE puts a value on the node it is called on. (Not on the "closest node to the key").

@mxinden (Member) commented Jun 3, 2021

Changed via 072360f. Let me know what you think @petar.

- **Value routing** - _Putting_ a value to the nodes closest to the value's key
(`PUT_VALUE`) and _getting_ a value by its key from the nodes closest to that
key (`GET_VALUE`).
- **Content routing** - _Adding_ oneself to the list of providers for a given

Same comment as above.

@mxinden (Member) commented Jun 3, 2021

Changed via 072360f. Let me know what you think @petar.

### Peer routing

The below is one possible algorithm to find nodes closest to a given key on
the DHT. Implementations may diverge from this base algorithm as long as they

"Implementations may diverge from this base algorithm": so what's the point of describing this algorithm if our implementation is different and other implementations don't have to follow this?

@mxinden (Member)

The motivation for including this basic algorithm is to enable readers to write a minimal implementation compatible with the existing networks (e.g. IPFS and Polkadot) without having to read e.g. the go-libp2p or rust-libp2p source code.

That said, I don't feel strongly about this. If you prefer I am happy to remove it. In my opinion we should add descriptions of the more advanced algorithms from go-libp2p in future pull requests.


Sounds good. I would just elaborate slightly, something like: "may diverge as long as they adhere to the wire format and make progress towards the target key"

@mxinden (Member)

Added in b074091.


We keep track of the set of peers we've already queried (`Pq`) and the set of
next query candidates sorted by distance from `Key` in ascending order (`Pn`).
At initialization `Pn` is seeded with the `α` peers from our routing table we

Our implementation is seeded with K peers.

@mxinden (Member) commented Jun 3, 2021

Thanks. Adjusted via c4d4b53.

Then we loop:

1. > The lookup terminates when the initiator has queried and gotten responses
from the k (see [#replication-parameter-k]) closest nodes it has seen.

In our implementation, a lookup actually terminates when a given percent (e.g. 80%) of the beta (another parameter) peers closest to the target have been contacted.

@mxinden (Member)

Good to know. That might be something worth exploring for rust-libp2p as well.

As mentioned above, I would suggest keeping this sample algorithm minimal for now, extending it with various optimization descriptions in future pull requests. Again, I don't feel strongly about this.


That sounds fine. What you've written will work. It'll just wait quite a bit before it is ready to terminate, but it is correct.


The lookup might terminate early in case the local node queried all known
nodes, with the number of nodes being smaller than `k`.
2. Pick as many peers from the candidate peers (`Pn`) as the `α` concurrency

When do you do the picking? Our implementation does this after a prior request completes and the state is updated.

@mxinden (Member)

I am not sure I follow. Steps 1-4 are executed in a loop, thus picking happens at the very start as well as each time a response is received (3) and the termination criterion (1) is still false. As far as I can tell, this is in line with what you describe above. Does that make sense @petar?


You are right... I overlooked. This looks fine.
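
For readers following this thread, here is a minimal sketch of the iterative lookup being discussed (not normative, sequential for brevity, and simplified relative to both go-libp2p and rust-libp2p; `queryPeer` and `closestFromRoutingTable` are hypothetical helpers): `Pn` is seeded from the routing table, up to `α` peers are picked per iteration, and the lookup terminates once the `k` closest peers seen have all been queried.

```go
// Illustrative sketch of the iterative FIND_NODE lookup discussed above.
package lookup

import "sort"

const (
	alpha = 3  // concurrency parameter α
	k     = 20 // replication parameter k
)

type Peer struct {
	ID       string
	Distance uint64 // XOR distance to the target key (simplified to uint64)
}

// queryPeer is a hypothetical helper that sends FIND_NODE(target) to p and
// returns the peers contained in the response.
func queryPeer(p Peer, target []byte) []Peer { return nil }

// closestFromRoutingTable is a hypothetical helper returning the n peers in
// the local routing table closest to target.
func closestFromRoutingTable(target []byte, n int) []Peer { return nil }

func Lookup(target []byte) []Peer {
	queried := map[string]bool{}                     // Pq
	candidates := closestFromRoutingTable(target, k) // Pn, seeded with k peers

	for {
		// Keep candidates sorted by distance to the target, ascending.
		sort.Slice(candidates, func(i, j int) bool {
			return candidates[i].Distance < candidates[j].Distance
		})

		// Pick up to α unqueried peers among the k closest seen so far.
		next := []Peer{}
		for i, c := range candidates {
			if i >= k {
				break
			}
			if !queried[c.ID] {
				next = append(next, c)
			}
			if len(next) == alpha {
				break
			}
		}
		// Termination: the k closest peers seen have all been queried,
		// or we have run out of candidates.
		if len(next) == 0 {
			break
		}

		// Sequential here for brevity; implementations issue these α
		// requests concurrently and update state as responses arrive.
		for _, p := range next {
			queried[p.ID] = true
			candidates = append(candidates, queryPeer(p, target)...)
		}
	}

	sort.Slice(candidates, func(i, j int) bool {
		return candidates[i].Distance < candidates[j].Distance
	})
	if len(candidates) > k {
		candidates = candidates[:k]
	}
	return candidates
}
```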

@mxinden (Member) left a comment

Thanks @petar for the review! Very much appreciated.

I have addressed all your comments. Would you mind taking another look?

@mxinden (Member) commented Jun 8, 2021

@petar friendly ping. Would you mind taking another look?

@aschmahmann could you give this pull request a review as well?

@petar commented Jun 8, 2021

@mxinden Looks much better. Thank you!

@aschmahmann (Contributor) left a comment

Thanks, this is much improved. Left a few comments/questions.

Comment on lines +94 to +95
- Getting providers for a given key from the nodes closest to that key via
`GET_PROVIDERS`.
@aschmahmann (Contributor)

Not sure if worth pointing out here that GET_PROVIDERS and GET_VALUE both have FIND_NODE semantics too (i.e. they return values and/or closest peers to the target) and so we can optimize by not needing to do FIND_NODE's first.

@mxinden (Member)

Not sure if worth pointing out here that GET_PROVIDERS and GET_VALUE both have FIND_NODE semantics too (i.e. they return values and/or closest peers to the target)

This is detailed in the corresponding GET_PROVIDERS and GET_VALUE message descriptions in ## RPC messages:

* `GET_VALUE`: In the request `key` is an unstructured array of bytes. If `key`
  is a public key (begins with `/pk/`) and the key is known, the response has
  `record` set to that key. Otherwise, `record` is set to the value for the
  given key (if found in the datastore) and `closerPeers` is set to the `k`
  closest peers.

* `GET_PROVIDERS`: In the request `key` is set to a CID. The target node
  returns the closest known `providerPeers` (if any) and the `k` closest known
  `closerPeers`.

What do you think @aschmahmann? Is this good enough?

and so we can optimize by not needing to do FIND_NODE's first.

I am not quite sure why one should do a FIND_NODE before a GET_PROVIDERS or GET_VALUE in the first place? As far as I know this is neither suggested in the Kademlia paper nor in this specification (e.g. see the #### Content provider discovery section). Am I missing something @aschmahmann?

@aschmahmann (Contributor)

Is this good enough?

Probably?

Am I missing something @aschmahmann?

Nope, I guess it's just an obvious sort of thing. It could have been designed that GET_VALUE only returned the value/error instead of also sending back closest peers but this way is more efficient.
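
A minimal sketch of the server-side behaviour discussed in this thread (message shape and helper names are assumptions, loosely modelled on the protobuf fields quoted above): a GET_VALUE or GET_PROVIDERS response always carries the k closest peers alongside any record or provider list, which is why a separate FIND_NODE round is unnecessary.

```go
// Illustrative sketch only; message and helper names are assumptions.
package handlers

type Peer struct {
	ID    string
	Addrs [][]byte
}

type Record struct {
	Key   []byte
	Value []byte
}

type Message struct {
	Key           []byte
	Record        *Record
	CloserPeers   []Peer // k closest peers the server knows to Key
	ProviderPeers []Peer // only meaningful for GET_PROVIDERS
}

// handleGetValue returns the record if we have it, and always includes the
// k closest peers so the caller can keep walking towards the key.
func handleGetValue(req Message, k int) Message {
	resp := Message{Key: req.Key}
	if rec, ok := datastoreGet(req.Key); ok {
		resp.Record = &rec
	}
	resp.CloserPeers = closestPeers(req.Key, k)
	return resp
}

// handleGetProviders has the same shape, with provider records instead of a value.
func handleGetProviders(req Message, k int) Message {
	return Message{
		Key:           req.Key,
		ProviderPeers: knownProviders(req.Key),
		CloserPeers:   closestPeers(req.Key, k),
	}
}

// Hypothetical storage/routing helpers.
func datastoreGet(key []byte) (Record, bool) { return Record{}, false }
func closestPeers(key []byte, k int) []Peer  { return nil }
func knownProviders(key []byte) []Peer       { return nil }
```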

Comment on lines +75 to +77
- **Peer routing**

- Finding the closest nodes to a given key via `FIND_NODE`.
@aschmahmann (Contributor)

I'd be a bit careful with the wording here. FIND_NODE is unfortunately an overloaded function that does two things:

  1. Tells you the closest DHT server peers to the target
  2. Tells you about a peer you explicitly asked for even if they are not a DHT server

The second thing is frequently referred to as peer routing since it's helping route you to a peer. I'm not sure what to call the first, but it's routing you towards the target key in the peer-space (not quite the kademlia/xor space since the peerIDs need to be SHA256'd first).

Maybe just call this section FIND_NODE and define both behaviors.

If you're curious, this issue (libp2p/go-libp2p-kad-dht#584) describes my frustration with the special casing around FIND_NODE and ADD/GET_PROVIDERS.

@mxinden (Member)

I'd be a bit careful with the wording here. FIND_NODE is unfortunately an overloaded function that does two things:

1. Tells you the closest DHT server peers to the target

2. Tells you about a peer you explicitly asked for even if they are not a DHT server

Let me paraphrase the above to make sure I correctly understand your argument:

  • On FIND_NODE a DHT server returns the closest other DHT servers to the given target.
  • On FIND_NODE a DHT server returns a single DHT client if and only if the DHT client's peer ID matches the target.

In my eyes documenting this distinction without documenting the DHT client / server mode extension (see your comment on rust-libp2p) would be inconsistent.

I would prefer to tackle both in follow-up pull requests. What do you think @aschmahmann?

I added your comment to the list at the top of the pull request to keep track of it.

@aschmahmann (Contributor) commented Jun 25, 2021

On FIND_NODE a DHT server returns a single DHT client if and only if the DHT client's peer ID matches the target.

Not quite, if we happen to know about a peer whose peerID is an exact match for the FIND_NODE query key then we add them to our list of servers to return. https://github.com/libp2p/go-libp2p-kad-dht/blob/6fff2a3b391f73da7a1c7558e27725ebe730e7ba/handlers.go#L256

In my eyes documenting this distinction without documenting the DHT client / server mode extension (see your comment on rust-libp2p) would be inconsistent.

I guess that's fair. If all nodes were servers this distinction wouldn't really matter. A follow up PR is fine.

Note: client mode isn't just for unreachable peers, it's generally recommended for "low quality" peers. This could mean low power, low bandwidth, low network uptime, etc. Basically, peers that would do more harm than good as servers probably shouldn't be servers, but they may still want to make queries.
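
A minimal sketch of the overloaded FIND_NODE behaviour described in this thread (helper names are hypothetical, deduplication is elided, and this simplifies what go-libp2p-kad-dht actually does): return the k closest known DHT servers to the target, and additionally include a peer whose ID exactly matches the target even if it is not a server.

```go
// Illustrative sketch of the overloaded FIND_NODE behaviour discussed above.
package findnode

type Peer struct{ ID string }

// handleFindNode returns the k closest known DHT servers to the target key.
// If a peer whose ID is an exact match for the target is known, it is added
// to the returned list as well, even if that peer is a client rather than a
// DHT server.
func handleFindNode(target []byte, k int) []Peer {
	closer := closestServers(target, k)

	if exact, ok := peerstoreLookup(target); ok {
		// The caller asked for this peer explicitly, so include it
		// regardless of whether it acts as a DHT server.
		closer = append(closer, exact)
	}
	return closer
}

// Hypothetical stand-ins for the routing table and peerstore.
func closestServers(target []byte, k int) []Peer { return nil }
func peerstoreLookup(target []byte) (Peer, bool) { return Peer{}, false }
```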

Comment on lines 137 to 139
to _put_ and _get_ the public keys of nodes. Node public keys are stored in
records under the `/pk` namespace. That is, the entry `/pk/<peerID>` will store
the public key of peer `peerID`.
@aschmahmann (Contributor)

Why are we referencing the /pk namespace as part of the libp2p spec? Is this just an example?

Two things:

  1. This namespace shouldn't be required
  2. It's effectively deprecated in IPFS

For context the main reason why it was deprecated was that it straddled this weird space between:

  • Your key is too big to include in whatever other system actually needs your key
  • Your key is small enough a DHT server will store it for you

Eventually IPFS decided that the distance between those two was effectively zero and started including the keys that were previously deemed too large (i.e. RSA) in IPNS records (the place they needed them). They also switched to Ed25519 keys by default since they're much smaller so they don't even need to embed them anymore.

@mxinden (Member)

Thanks both for the hint and the background. Removed with a065aac.

Comment on lines 162 to 164
When _getting_ a value in the DHT, the implementor should collect at least `Q`
(quorum) responses from distinct nodes to check for consistency before returning
an answer.
@aschmahmann (Contributor)

The point here is basically to determine if/when it's safe to shortcut your query and there are multiple ways to determine reliability.

A plain quorum is just one way. For example, you could also choose to adjust your view based on how close to the target the peer who gave you the content is.

@mxinden (Member)

I am in favor of extending this specification with more advanced procedures, though I would like to keep it simple in this iteration and only mention the simple quorum style. Is this ok for you @aschmahmann? If so, would you still like the wording to be adjusted?

@aschmahmann (Contributor)

I liked your wording "Implementations may diverge from this base algorithm as long as they adhere
to the wire format and make progress towards the target key." in that it indicated what we care about vs what we don't.

If you think it's obvious that all of this quorum stuff is optional client behavior and people can do whatever they want then feel free to leave it as is. If you think it's not really obvious, then it might be worth adding some wording around how when getting values you can abort whenever you feel confident you have the latest value, for example using a quorum.

@mxinden (Member)

324f915 extends the section, stressing the fact that quorum is one mechanism among many to determine when a query is finished. Let me know what you think.
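
As a concrete illustration of the quorum discussion above (one termination heuristic among many; helper names are hypothetical): abort the GET walk once `Q` distinct peers have answered with a record, keeping the best record seen so far.

```go
// Illustrative sketch of quorum-based early termination for a GET_VALUE walk.
package getquorum

type Peer struct{ ID string }

// getWithQuorum queries candidate peers (assumed sorted by distance to key)
// and stops as soon as quorum distinct peers have returned a record, keeping
// the "best" record according to a validator-defined ordering.
func getWithQuorum(key []byte, candidates []Peer, quorum int) (best []byte, holders []Peer) {
	for _, p := range candidates {
		rec, ok := sendGetValue(p, key)
		if !ok {
			continue
		}
		holders = append(holders, p)
		if best == nil || better(rec, best) {
			best = rec
		}
		if len(holders) >= quorum {
			// Quorum reached: confident enough to stop the walk early.
			// Other heuristics (e.g. how close the responders are to the
			// key) are equally valid termination signals.
			break
		}
	}
	return best, holders
}

// Hypothetical helpers: the RPC call and the record-selection rule.
func sendGetValue(p Peer, key []byte) ([]byte, bool) { return nil, false }
func better(a, b []byte) bool                        { return false }
```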

Comment on lines 200 to 202
terminate early and return `best`. In either case we notify the peers holding
an outdated value (`Po`) of the best value we discovered, by sending
`PUT_VALUE(Key, best)` messages.
@aschmahmann (Contributor)

You probably also want to send the PUT_VALUE to the closest peers who should have had it, i.e. you're sending to (the closest peers) minus (the peers who already have the latest value).

@mxinden (Member)

Great catch, thanks! Adjusted in 20b3b73.
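
A small sketch of the correction step agreed on in this thread (set names follow the spec excerpt; helper names are hypothetical): after the lookup, send `PUT_VALUE(key, best)` both to the peers that returned an outdated value (`Po`) and to the closest peers that should hold the value but are not known to have the latest one.

```go
// Illustrative sketch of the value-correction step after a GET_VALUE lookup.
package correction

type Peer struct{ ID string }

// correctOutdatedPeers sends PUT_VALUE(key, best) to every peer that should
// hold the latest record: the peers that returned an outdated value (po) plus
// the k closest peers to the key, minus those already known to hold the
// latest value.
func correctOutdatedPeers(key, best []byte, po, closest, upToDate []Peer) {
	hasLatest := map[string]bool{}
	for _, p := range upToDate {
		hasLatest[p.ID] = true
	}

	targets := map[string]Peer{}
	for _, p := range po {
		targets[p.ID] = p
	}
	for _, p := range closest {
		if !hasLatest[p.ID] {
			targets[p.ID] = p
		}
	}

	for _, p := range targets {
		sendPutValue(p, key, best) // hypothetical RPC helper
	}
}

func sendPutValue(p Peer, key, value []byte) {}
```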

Comment on lines 284 to 285
For performance reasons, a node may prune expired advertisements only
periodically, e.g. every hour.
@aschmahmann (Contributor)

This is generic to all Puts/Gets. Your implementation can do whatever as long as you only return valid data.

Your network may have implied and/or explicit expiration times per record type.

@mxinden (Member)

This is an artifact of past versions of this pull request. Thinking about it some more, I am in favor of removing it. To me it is implementation specific. 1dcb218 removes it.

@aschmahmann let me know if you disagree.

Comment on lines 293 to 299
On every run, we generate a random peer ID and we look it up via the process
defined in [peer routing](#peer-routing). Peers encountered throughout the
search are inserted in the routing table, as per usual business.

This process is repeated as many times per run as configuration parameter
`QueryCount` (default: 1). Every repetition is subject to a `QueryTimeout`
(default: 10 seconds), which upon firing, aborts the run.
@aschmahmann (Contributor)

This isn't what go-libp2p does. We can slap on the usual "here's one way to do it disclaimer".

Unfortunately, this way is kind of bad because the bigger the network is the harder it is to find peers closer to you like this. At the very least you want to search for yourself in the network.

I can give the brief approach taken in go-libp2p-kad-dht if you're interested.

@mxinden (Member)

👍 c755a41 includes self lookup.

This isn't what go-libp2p does.

Neither does rust-libp2p. It first does a lookup for the local key and then tries to probabilistically generate a key for each bucket.

I can give the brief approach taken in go-libp2p-kad-dht if you're interested.

Yes.
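
For context, a minimal sketch of the bootstrap routine described in the quoted excerpt, including the self lookup added in c755a41 (names such as `randomPeerID` and `lookup` are hypothetical stand-ins):

```go
// Illustrative sketch of the periodic bootstrap / routing-table refresh.
package bootstrap

import (
	"context"
	"time"
)

// refresh performs one bootstrap run: look up our own peer ID to learn about
// peers close to us, then look up queryCount random peer IDs to populate the
// rest of the routing table. Each lookup is bounded by queryTimeout.
func refresh(ctx context.Context, selfID []byte, queryCount int, queryTimeout time.Duration) {
	runLookup(ctx, selfID, queryTimeout) // self lookup

	for i := 0; i < queryCount; i++ {
		runLookup(ctx, randomPeerID(), queryTimeout)
	}
}

func runLookup(ctx context.Context, target []byte, timeout time.Duration) {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()
	// Peers discovered during the walk are inserted into the routing table
	// by the lookup itself (see the peer routing section).
	lookup(ctx, target)
}

// Hypothetical helpers.
func randomPeerID() []byte                   { return nil }
func lookup(ctx context.Context, key []byte) {}
```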

Comment on lines +332 to +337
// Note: These fields were removed from the Record message
//
// Hash of the authors public key
// optional string author = 3;
// A PKI signature for the key+value+author
// optional bytes signature = 4;
@aschmahmann (Contributor)

Any reason to keep these in, maybe we should just denote the numbers as reserved/dead?

@mxinden (Member)

I kept them in to give some background on what these numbers were originally used for. In my eyes, with the note at the top // Note: These fields were removed from the Record message this shouldn't confuse anyone.

Not a strong opinion. Would you prefer this to be changed?

@aschmahmann (Contributor)

Your call. Was thinking it might make the long spec a little shorter was all.

Comment on lines 408 to 409
node to be found. `closerPeers` is set in the response; for an exact match
exactly one `Peer` is returned; otherwise `k` closest `Peer`s are returned.
@aschmahmann (Contributor)

What do you mean by exact match here?

@mxinden (Member)

As in the ID of the peer matches the target key in the original FIND_NODE request. Does that make sense @aschmahmann?

@aschmahmann (Contributor)

I'm not really sure why we don't always return the k closest peers, but per https://github.com/libp2p/go-libp2p-kad-dht/blob/6fff2a3b391f73da7a1c7558e27725ebe730e7ba/handlers.go#L256 the only time we return a single peer is when peer A asks peer B "do you know peer B" and they say "yes, it's me". If peer A asks peer B "do you know peer C", they will say "yes, here's their address along with the k peers closer to them".

@mxinden (Member)

Thanks for looking this up. dab4549 simplifies the specification, always returning the k closest peers. I don't think we should document / require the peculiarity of go-libp2p-kad-dht returning a single peer when oneself was looked up. Sounds good?

@aschmahmann (Contributor)

I think that's right, or if it's important we can catch it in the follow up PR. We should investigate historical context here and see if a change is merited in go-libp2p-kad-dht.

@mxinden (Member) left a comment

Thanks @aschmahmann for the thorough review, especially with the additional background.

I replied to all of your comments. Would you mind taking another look?

Comment on lines 269 to 271
Each peer that receives the `ADD_PROVIDER` RPC should validate that the
received `PeerInfo` matches the sender's `peerID`, and if it does, that peer
must store the `PeerInfo` in its datastore.
@mxinden (Member)

Again, appreciate the additional background. 6ec65b5 loosens the requirement.

@aschmahmann (Contributor) left a comment

Added one more comment. I'm sure there are nits to be had in places, but this is way better than nothing, and given we are planning a follow-up PR to iron out more details of the spec (e.g. support for clients), I'm happy here. Thanks for all the hard work 😄

My 2c are that as the specs evolve, including making an IPFS DHT spec, we can start to separate out the segments on basic wire formats/rules ("the spec"), how to build compatible networks ("how to use the spec"), and how client implementations might work ("how to work with spec-compliant networks").

Comment on lines 274 to 287
The bootstrap process is responsible for keeping the routing table filled and
healthy throughout time. It runs once on startup, then periodically with a
configurable frequency (default: 5 minutes).

On every run, we generate a random peer ID and we look it up via the process
defined in [peer routing](#peer-routing). Peers encountered throughout the
search are inserted in the routing table, as per usual business.

This is repeated as many times per run as configuration parameter `QueryCount`
(default: 1). In addition, to improve awareness of nodes close to oneself,
implementations should include a lookup for their own peer ID.

Every repetition is subject to a `QueryTimeout` (default: 10 seconds), which
upon firing, aborts the run.
@aschmahmann (Contributor)

Should include our usual "here's one simple way of doing this" disclaimer, especially since neither go-libp2p's nor rust-libp2p's implementation actually does this.

@mxinden (Member)

👍 added via 3e6f8f5.

@mxinden (Member) commented Jun 28, 2021

@tomaka @alanshaw you are both still tracked as "requesting changes". I have addressed your comments and suggestions above. Could you either remove your request for changes or give this pull request another review?

@BigLep (Contributor) commented Jun 28, 2021

@mxinden : given this has had lots of eyeballs and iterations, I would suggest we give a cutoff date for comments (e.g., EOD 2021-06-29). Otherwise we merge. Further improvements can always be made after if necessary.

@tomaka dismissed their stale review June 29, 2021 07:29: "No attention span"

@mxinden (Member) commented Jun 29, 2021

🎉 After more than two years, this is ready to be merged. Thanks to everyone involved! 🎉
