DHT provider records: adopt sloppiness, rejection and backtracking from Coral #163
Comments
@raulk It's my belief that a lot of the benefits of this approach are provided by improved traversal code, which I was alluding to in our talk about plans for Q2. In particular, with better control over termination and concurrency in traversals, it will be easier to adopt some of these approaches. I'm not sure if an issue exists for it (better traversal logic).
@anacrolix there's an issue for termination/cutoff: libp2p/go-libp2p-kad-dht#290. Could you file an issue in kad-dht for your proposals on concurrency? As much as we need to fix the query traversal logic in go-libp2p, I'm afraid it won't alleviate the provider overload problem described herein.
could you elaborate a bit on
Is this because the spill-over always backs off further from the query target when a provider is over capacity? And a more optimal pattern would be for an overburdened node to ask nodes closer to the target if they can re-provide?
Yes, eventually the radius where a key is advertised expands the more popular it gets (and the more providers are willing to serve it). If older providers fail to reprovide, and the logic that selects the placement for an advertisement backs off as soon as we encounter the first rejection, this will end up creating a hollow centre with an ever-growing radius around it. (Note that my reasoning comes from intuition, not formal modelling.)
Note that the most robust and longest-lived nodes should have preference in routing tables, so that generally your search for nodes to store data on will favour those stronger nodes; i.e. you're most likely to store data on the best nodes right from the get-go, with a correct implementation. That may be obscured at the moment by libp2p/go-libp2p-kad-dht#283 and libp2p/go-libp2p-kad-dht#254, which allow the property to be broken by transient failures in nodes or networks.
@anacrolix I believe you're mixing things up. This is about solving provider record overload affecting the neighbours of popular hashes. It has no relation to the routing table whatsoever. Your routing table is a means to get you to places, not a prescription of where data is stored.
Peers generally take different paths to find content so this should be fine. However, we may need to add some randomness into how we put peers into our routing table (or maybe add some randomness into how we tell peers about the next N closer peers).
We have to use disjoint paths here. Traversing further will only yield Sybils.
Won't they just tell us about closer nodes? That is, when putting a record, we'll first find the 20 closest nodes, then see that these nodes are overloaded and start backtracking.
Context
Provider records in our Kad DHT implementation are equivalent to the concept of pointers as proposed in the Coral DHT [0].
Unfortunately, they are implemented half-way in go-libp2p, and probably in other implementations too, which makes them scale poorly.
Problem statement
Nodes close to a popular hash H get swamped with provider records. Every time a new provider appears for the same hash H, it advertises on exactly the same nodes. We currently advertise at the KValue=20 nodes closest to the hash H. Nodes that download hash H become providers of that hash in turn, compounding the problem when hashes become very popular.
There is no concept of backpressure or spill-over for provider records. No limits, throttling policies or validations are enforced. Peers swallow any provider record they receive. This ends up causing all kinds of negative local effects.
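To make the current behaviour concrete, here is a minimal Go sketch of the provide flow described above. The helper names (closestPeers, sendProviderRecord) and the package are hypothetical stand-ins, not the actual go-libp2p-kad-dht API; only KValue=20 comes from the text.

```go
package providesketch

import "context"

const KValue = 20 // replication factor: number of closest peers we advertise on

// PeerID stands in for a libp2p peer identifier.
type PeerID string

// closestPeers stands in for the Kademlia lookup that returns the k peers
// whose IDs are nearest (by XOR distance) to the target hash.
func closestPeers(ctx context.Context, hash []byte, k int) []PeerID {
	// ... network traversal elided ...
	return nil
}

// sendProviderRecord stands in for the ADD_PROVIDER RPC. Today the remote
// peer has no way to say "no": records are always swallowed.
func sendProviderRecord(ctx context.Context, p PeerID, hash []byte) error {
	return nil
}

// provide shows why popular hashes overload their neighbourhood: every
// provider of the same hash converges on exactly the same KValue peers.
func provide(ctx context.Context, hash []byte) {
	for _, p := range closestPeers(ctx, hash, KValue) {
		_ = sendProviderRecord(ctx, p, hash) // no backpressure, no rejection
	}
}
```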
Proposal
Adopt these features of Coral:
Rejection/throttling: nodes restrict the number of provider records stored per hash, and the rate of registration. The limit can be locally enforced and need not be synchronised across the network. Attacks that inject uncollaborative nodes (with low or zero local limits) are nullified by spill-over. (A sketch of such a local policy is given after this list.)
Sloppy hashing/spill over: on a rejection, the requester backtracks along the traversal path and tries to store the record on earlier nodes.
This has the effect of expanding the radius of advertisement of popular hashes, such that subsequent lookups for that content will result in faster resolutions.
However, depending on how lookups are implemented, this could end up overloading the newer providers while weaning traffic off the older ones. I feel this is an oversight in Coral.
Backtracking: the process by which we walk back along our traversal trail, hitting earlier nodes along the way and attempting to register our record further from the desired target, thus expanding the radius where the hash is advertised.
Coral's approach is insecure, insofar as a single adversary along the path rejecting your record will derail you and cause you to backtrack, even if peers closer to the key had spare capacity.
To resist such attacks, we can continue traversing forward a few more steps to confirm the rejection, or we can traverse disjoint paths.
Additionally, it does not play well with TTLs and disappearing content. If a hash has spilled over several layers and early registrants fail to reprovide (because they're offline or they pruned the hash), farther nodes holding recent registrations that happen to be over capacity could preclude us from finding closer nodes that now have vacancies due to expired, unrenewed earlier records. This could result in progressive expansion of the radius, with a hollow centre.
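To illustrate the rejection/throttling item above, here is a hedged sketch of a node-side provider store that enforces a per-hash cap and a registration rate limit. The limits (64 records per hash, one new registration per second) and all names are illustrative assumptions, not a proposed spec.

```go
package providerlimits

import (
	"sync"
	"time"
)

// Locally chosen limits: no network-wide agreement is required, since
// spill-over routes around uncooperative nodes that set their limits to zero.
const (
	maxRecordsPerHash  = 64          // cap on provider records stored per hash
	minRegistrationGap = time.Second // crude per-hash rate limit on new registrations
)

type providerSet struct {
	peers        map[string]time.Time // provider peer ID -> last registration time
	lastAccepted time.Time            // time of the last accepted new registration
}

// ProviderStore holds provider records and applies the local policy.
type ProviderStore struct {
	mu     sync.Mutex
	byHash map[string]*providerSet
}

func NewProviderStore() *ProviderStore {
	return &ProviderStore{byHash: make(map[string]*providerSet)}
}

// Accept reports whether a provider record for hash from peer p is stored.
// Returning false corresponds to an explicit rejection in the ADD_PROVIDER
// response, which is what allows the requester to back off and spill over.
func (s *ProviderStore) Accept(hash, p string, now time.Time) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	set := s.byHash[hash]
	if set == nil {
		set = &providerSet{peers: make(map[string]time.Time)}
		s.byHash[hash] = set
	}
	if _, known := set.peers[p]; !known {
		if len(set.peers) >= maxRecordsPerHash {
			return false // over capacity for this hash: reject
		}
		if now.Sub(set.lastAccepted) < minRegistrationGap {
			return false // registrations arriving too fast: throttle
		}
		set.lastAccepted = now
	}
	set.peers[p] = now // reprovides from known peers are always refreshed
	return true
}
```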
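And a companion sketch of the requester side: walking the traversal trail closest-first and spilling over to peers farther from the target whenever a closer peer rejects the record. Again, all names and the replication target are assumptions; a hardened variant would confirm rejections with extra forward steps or disjoint paths, as discussed in the comments above.

```go
package spillover

import "context"

const kReplicas = 20 // how many successful placements we aim for

// addProvider stands in for the ADD_PROVIDER RPC; accepted is false when
// the remote peer rejects the record under its local limits.
func addProvider(ctx context.Context, peer string, hash []byte) (accepted bool, err error) {
	return true, nil
}

// provideWithSpillOver walks the traversal trail closest-first. Each
// rejection pushes the record one step farther from the target, which is
// the radius expansion described above.
func provideWithSpillOver(ctx context.Context, trail []string, hash []byte) (stored []string) {
	for _, p := range trail {
		if len(stored) >= kReplicas {
			break // enough replicas placed
		}
		ok, err := addProvider(ctx, p, hash)
		if err != nil || !ok {
			continue // unreachable or rejected: spill over to the next peer out
		}
		stored = append(stored, p)
	}
	return stored
}
```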
See these instructional videos for reference:
[0] Freedman, Michael J. & Mazières, David. (2003). Sloppy Hashing and Self-Organizing Clusters. In Peer-to-Peer Systems II (IPTPS 2003), Lecture Notes in Computer Science, vol. 2735. doi:10.1007/978-3-540-45172-3_4.