
Proposal: Every node should be a rendezvous node #145

Open
raulk opened this issue Feb 15, 2019 · 4 comments
Comments

raulk (Member) commented Feb 15, 2019

Problem statement

Our definition of the rendezvous subsystem currently suffers from centralisation, as it relies on "known rendezvous points" to query and provide the discovery and announcement services.

The rendezvous spec briefly touches on the concept of federation, but not as a means of decentralisation; rather, it frames federation as a mechanism to achieve:

  • eventual data replication (rendezvous nodes share their records with one another), and
  • real-time updates (which, on the other hand, an attacker could maliciously exploit to track new providers of service XYZ and attack them promptly if a known attack vector exists).

Combined with DHT bootstrapping, it feels like we are doubling down on centralisation rather than striving to decentralise, as both services (DHT and rendezvous) require trusted endpoints for an initial bind.

A natural response would be "let's merge both endpoints into a single node", but that:

a. creates deployment coupling between services,
b. hinges on assumptions of how the operators will operate these nodes,
c. increases the prominence and responsibility of these nodes, making them even more attractive to attackers, i.e. effectively turning them into "supernodes".

Discussion

I'd like to introduce the notion of making every node a rendezvous node by default. As libp2p nodes walk the network (either via DHT periodic refresh, or other means) they will discover these rendezvous points and bind with them if they need to.

This is aligned with the ideas @fjl brings to the table in his concept of Ethereum discovery v5 topics. In libp2p land, we have two concepts that are similar in nature: (a) libp2p rendezvous records, (b) libp2p DHT provider records.

However, there are two main features I'd like to point out: advertisement placement and throttling.

Advertisement placement

In the discv5 model, every node is a venue for advertising services, and the placement of ads is random and ad libitum throughout the network. This makes it substantially different from our provider records, which rely on the Kademlia XOR distance metric to find the nodes suitable for advertisements.

[Note: I reason about provider records as "symlinks": instead of placing the content itself (via PutValue), we place a symlink in the nodes we know are responsible for that content.]
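To make the contrast concrete, here is a minimal sketch of the provider-record placement described above: the content key is hashed, and the record ("symlink") is placed on the k nodes whose IDs are XOR-closest to that hash. The function names and the choice of SHA-256 and k=20 are illustrative, not taken from any particular libp2p implementation.

```python
import hashlib

def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia XOR distance between two 256-bit IDs."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")

def closest_nodes(content_key: bytes, node_ids: list, k: int = 20) -> list:
    """Pick the k nodes XOR-closest to hash(content_key); these are the
    nodes that would hold the provider record ("symlink")."""
    target = hashlib.sha256(content_key).digest()
    return sorted(node_ids, key=lambda nid: xor_distance(target, nid))[:k]
```

The key point is that placement is deterministic given the key and the node ID set, whereas discv5 ad placement is randomised within a region of ID space.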

discv5 is predicated on the idea that, if a service is popular enough (the proposal talks about 1% of the network size), walking the network at random will turn up advertisements for the service you're interested in (this resembles the birthday paradox). This assertion needs to be tested.

Nevertheless, it also acknowledges that, if the network is too large, agreeing on a convention to select the advertisement venues (by proximity and radius) is valuable. This brings that model closer to our model for provider records.
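The "popular enough to find by random walking" claim is easy to quantify: if a fraction p of nodes carry an ad, the chance of hitting at least one in k independent uniform probes is 1 - (1 - p)^k. A small helper (illustrative, not from the discv5 proposal) shows how many probes that implies:

```python
import math

def probes_needed(p: float, confidence: float = 0.99) -> int:
    """Number of uniform random probes needed so that the chance of
    hitting at least one advertiser (fraction p of nodes) reaches
    `confidence`: solve 1 - (1 - p)**k >= confidence for k."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))
```

With the proposal's 1% figure, `probes_needed(0.01)` is 459, i.e. hundreds of probes for 99% confidence, which is why the claim deserves empirical testing.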

Throttling service registrations

discv5 introduces a mechanism to throttle service registrations based on popularity, which is nice.

My understanding: every time a node accepts a registration for a service, it throttles further registrations for that service by increasing the "ticket wait time".

Aside from DoS attack mitigation, this has another nice property: it encourages ad dissemination and decentralisation, as it incentivises the ad placer to look for another node in the network that will accept its registration sooner.
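The mechanism as I understand it can be sketched as follows. Every accepted registration bumps the wait time handed out with the next ticket, so popular topics become slower to register with locally and registrants shop around. The class name, the doubling growth policy, and the ticket shape are all my own illustrative choices, not the discv5 spec:

```python
class TopicQueue:
    """Per-topic admission throttle (sketch): each accepted registration
    increases the wait a new registrant must sit out before its ticket
    becomes valid. Growth policy is illustrative, not from discv5."""

    def __init__(self, base_wait: float = 1.0, growth: float = 2.0):
        self.wait = base_wait
        self.growth = growth
        self.registrations = []

    def request_ticket(self, now: float) -> float:
        """Issue a ticket: the earliest time the registrant may return."""
        return now + self.wait

    def register(self, node_id: str, now: float, ticket_time: float) -> bool:
        """Accept the registration only once the ticket has matured."""
        if now < ticket_time:
            return False  # came back too early; ticket not yet valid
        self.registrations.append(node_id)
        self.wait *= self.growth  # popular topic => longer waits => ads pushed elsewhere
        return True
```

A registrant seeing the wait double after each acceptance has a clear incentive to try a different venue, which is exactly the dissemination effect described above.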

Further thoughts

  • Some aspects of discv5 rely on nodes knowing what the "network size" is; it is not clear to me how they sense that.
  • I'd like to introduce the notion of "challenges"; rendezvous nodes would be able to test if a node really provides the service they claim, and blacklist them locally or globally by providing proof of non-conformance to the network.
    • In practice, this could require small snippets of "challenge code" to be attached to protocols/service namespaces via IPLD+WebAssembly, et al. This idea is super speculative; just a train of thought.

@fjl – I'd love you to join this discussion and provide more colour/reasoning/ideas. Also to correct any of my misunderstandings ;-)

cc @djrtwo on convergence

raulk (Member, Author) commented Feb 15, 2019

cc @vyzo @jacobheun

fjl commented Feb 15, 2019

About 'throttling': the purpose of tickets and their waiting time is to ensure that registration waiting time converges to a time set by a protocol constant. Estimation of network size is not needed, but registrants can (and must) estimate the size (radius) of the topic they are registering for by looking at those waiting times.
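One way to read this registrant-side estimation: compare the waiting times you observe against the protocol's target constant and adapt the radius you advertise in. The constant, thresholds, and doubling/halving policy below are purely illustrative assumptions, not values from the discv5 proposal:

```python
TARGET_WAIT = 10.0  # stand-in for the protocol constant wait times converge to

def adjust_radius(radius: int, observed_wait: float, max_radius: int) -> int:
    """Registrant-side radius estimation (sketch): widen the slice of ID
    space we register in when venues near hash(topic) make us wait longer
    than the target, shrink it when they accept us much faster."""
    if observed_wait > TARGET_WAIT:
        return min(radius * 2, max_radius)
    if observed_wait < TARGET_WAIT / 2:
        return max(radius // 2, 1)
    return radius
```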

fjl commented Feb 15, 2019

The placement of ads isn't random across the whole network, it's random within a portion of node ID space (hash(topic) +- radius). In practice, the radius of a sufficiently popular topic will be the whole network though.
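A minimal sketch of that venue-selection rule, treating the 256-bit ID space as a ring and asking whether a node falls within `radius` of hash(topic). The circular-distance formulation and SHA-256 are my assumptions for illustration; the actual discv5 encoding may differ:

```python
import hashlib

ID_BITS = 256

def in_topic_region(node_id: bytes, topic: str, radius: int) -> bool:
    """True if node_id lies within `radius` of hash(topic) in the circular
    256-bit ID space, i.e. the node is a valid ad venue for the topic."""
    t = int.from_bytes(hashlib.sha256(topic.encode()).digest(), "big")
    n = int.from_bytes(node_id, "big")
    ring = 1 << ID_BITS
    dist = min((n - t) % ring, (t - n) % ring)  # wrap-around distance
    return dist <= radius
```

With `radius` equal to half the ring, every node qualifies, which matches fjl's remark that a sufficiently popular topic's radius covers the whole network.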

fjl commented Feb 15, 2019

libp2p rendezvous and discv5 topic registration are very similar; the important differences are:

  • a TTL is not required in discv5 because registrations drop out of topic queues naturally when new registrations are added
  • since discv5 has topics baked into the DHT at fundamental level, everyone participates
  • discv5 is UDP only, placing registrations on many nodes is easy to do and fast (no TCP or overlay network overhead)
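The first bullet, "registrations drop out naturally", is just a bounded FIFO: once the queue is full, each new registration evicts the oldest, so no expiry timestamps are needed. A minimal sketch (class name and capacity are illustrative):

```python
from collections import deque

class TopicAdQueue:
    """Bounded FIFO of topic registrations: no TTL needed, because when
    the queue is full each new registration evicts the oldest (sketch of
    the discv5 behaviour described above; capacity is illustrative)."""

    def __init__(self, capacity: int = 3):
        self.ads = deque(maxlen=capacity)

    def register(self, node_id: str) -> None:
        self.ads.append(node_id)  # oldest ad drops out automatically when full

    def lookup(self) -> list:
        return list(self.ads)
```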
