-
Notifications
You must be signed in to change notification settings - Fork 277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rendezvous protocol #44
Changes from all commits
696ca34
dcbd0b4
2ba1ad5
3330b69
00fa97b
b19d493
6dbbd4b
219dc40
ed7c068
af5ba41
3423e15
7036720
ec7a579
9d061e1
81e5078
9bdee4f
4059338
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,201 @@ | ||
# Rendezvous Protocol | ||
|
||
The protocol described in this specification is intended to provide a | ||
lightweight mechanism for generalized peer discovery. It can be used | ||
for bootstrap purposes, real time peer discovery, application specific | ||
routing, and so on. Any node implementing the rendezvous protocol can | ||
act as a rendezvous point, allowing the discovery of relevant peers in | ||
a decentralized fashion. | ||
|
||
## Use Cases | ||
|
||
Depending on the application, the protocol could be used in the | ||
following context: | ||
- During bootstrap, a node can use known rendezvous points to discover | ||
peers that provide critical services. For instance, rendezvous can | ||
be used to discover circuit relays for connectivity restricted | ||
nodes. | ||
- During initialization, a node can use rendezvous to discover | ||
peers to connect with the rest of the application. For instance, | ||
rendezvous can be used to discover pubsub peers within a topic. | ||
- In a real time setting, applications can poll rendezvous points in | ||
order to discover new peers in a timely fashion. | ||
- In an application specific routing setting, rendezvous points can be | ||
used to progressively discover peers that can answer specific queries | ||
or host shards of content. | ||
|
||
### Replacing ws-star-rendezvous | ||
|
||
We intend to replace ws-star-rendezvous with a few rendezvous daemons | ||
and a fleet of p2p-circuit relays. Real-time applications will | ||
utilize rendezvous both for bootstrap and in a real-time setting. | ||
During bootstrap, rendezvous will be used to discover circuit relays | ||
that provide connectivity for browser nodes. Subsequently, rendezvous | ||
will be utilized throughout the lifetime of the application for real | ||
time peer discovery by registering and polling rendezvous points. | ||
This allows us to replace a fragile centralized component with a | ||
horizontally scalable ensemble of daemons. | ||
|
||
### Rendezvous and pubsub | ||
|
||
Rendezvous can be naturally combined with pubsub for effective | ||
real-time discovery. At a basic level, rendezvous can be used to | ||
bootstrap pubsub: nodes can utilize rendezvous in order to discover | ||
their peers within a topic. Contrariwise, pubsub can also be used as | ||
a mechanism for building rendezvous services. In this scenerio, a | ||
number of rendezvous points can federate using pubsub for internal | ||
real-time distribution, while still providing a simple interface to | ||
clients. | ||
|
||
|
||
## The Protocol | ||
|
||
The rendezvous protocol provides facilities for real-time peer | ||
discovery within application specific namespaces. Peers connect to the | ||
rendezvous point and register their presence in one or more | ||
namespaces. | ||
|
||
Peers registered with the rendezvous point can be discovered by other | ||
nodes by querying the rendezvous point. The query specifies the | ||
namespace for limiting application scope and optionally a maximum | ||
number of peers to return. The namespace can be omitted in the query, | ||
which asks for all peers registered to the rendezvous point. | ||
|
||
The query can also include a cookie, obtained from the response to a | ||
previous query, such that only registrations that weren't included in | ||
the previous response will be returned. This allows peers to | ||
progressively refresh their network view without overhead, which | ||
greatly simplifies real time discovery. It also allows for pagination | ||
of query responses, so that large numbers of peer registrations can be | ||
managed. | ||
|
||
### Registration Lifetime | ||
|
||
Registration lifetime is controlled by an optional TTL parameter in | ||
the `REGISTER` message. If a TTL is specified, then the registration | ||
persists until the TTL expires. If no TTL was specified, then a default | ||
of 2hrs is implied. There may be a rendezvous point-specific upper bound | ||
on TTL, with a minimum such value of 72hrs. If the TTL of a registration | ||
is inadmissible, the rendezvous point may reject the registration with | ||
an `E_INVALID_TTL` status. | ||
|
||
Peers can refresh their registrations at any time with a new | ||
`REGISTER` message; the TTL of the new message supersedes previous | ||
registrations. Peers can also cancel existing registrations at any | ||
time with an explicit `UNREGISTER` message. | ||
|
||
### Interaction | ||
|
||
Clients `A` and `B` connect to the rendezvous point `R` and register for namespace | ||
`my-app` with a `REGISTER` message: | ||
|
||
``` | ||
A -> R: REGISTER{my-app, {QmA, AddrA}} | ||
R -> A: {OK} | ||
B -> R: REGISTER{my-app, {QmB, AddrB}} | ||
R -> B: {OK} | ||
``` | ||
|
||
Client `C` connects and registers for namespace `another-app`: | ||
``` | ||
C -> R: REGISTER{another-app, {QmC, AddrC}} | ||
R -> C: {OK} | ||
``` | ||
|
||
Another client `D` can discover peers in `my-app` by sending a `DISCOVER` message; the | ||
rendezvous point responds with the list of current peer reigstrations and a cookie. | ||
``` | ||
D -> R: DISCOVER{ns: my-app} | ||
R -> D: {[REGISTER{my-app, {QmA, Addr}} | ||
REGISTER{my-app, {QmB, Addr}}], | ||
c1} | ||
``` | ||
|
||
If `D` wants to discover all peers registered with `R`, then it can omit the namespace | ||
in the query: | ||
``` | ||
D -> R: DISCOVER{} | ||
R -> D: {[REGISTER{my-app, {QmA, Addr}} | ||
REGISTER{my-app, {QmB, Addr}} | ||
REGISTER{another-app, {QmC, AddrC}}], | ||
c2} | ||
``` | ||
|
||
If `D` wants to progressively poll for real time discovery, it can use | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Unclear to me the intended use so perhaps this comment is moot, but: might the progressive poller also want information about unregisters? If they don't get that information then it seems like they'll incrementally build a registration set of non-decreasing size and decreasing usefulness because the list the caller keeps will start to include many stale peers who have unregistered over time. Depending on churn rate the number of stale registrations the caller has could potentially dwarf the number of actually registered peers. There are obvious solutions on the caller side but unclear to a reader like me what's intended here. Might be worth a comment at least. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Reporting the unregistrations in a discovery response is problematic for a couple of reasons:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also note that explicit unregisters are expected to be rare events. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Another points is that the progressive discovery mechanism is not intended to build a consistent replica of the rendezvou's point internal state, but rather discover new peers since the last discovery. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think that explicit unregisters might be useful when a node does a clean shutdown. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, of course! The question is whether it is worth the trouble propagating this in discover queries, or simply let the TTLs expire stale entries in clients. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We could leave it up to the clients to decide when they want to disconnect. If they want to decrease the amount of registrations, they can purge it locally (checking if ttl expired and remove that registration would be one way, bad ping could be another). Seems to simplify by not having the unregister. |
||
the cookie obtained from a previous response in order to only ask for | ||
new registrations. | ||
|
||
So here we consider a new client `E` registering after the first query, | ||
and a subsequent query that discovers just that peer by including the cookie: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Seems like might not be a good idea to require registries to keep cookies forever (and probably not even across restarts) so the language here should probably indicate that what is returned when you pass the cookie is only likely an increment set; it could be the full set if the cookie is too old, unknown, the peer reset, whatever. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The registries shouldn't keep state for cookies at all, that's why they are client-side cookies. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Note that a stale cookie could elicit an arbitrary set in response. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added an |
||
``` | ||
E -> R: REGISTER{my-app, {QmE, AddrE}} | ||
R -> E: {OK} | ||
D -> R: DISCOVER{ns: my-app, cookie: c1} | ||
R -> D: {[REGISTER{my-app, {QmE, AddrE}}], | ||
c3} | ||
``` | ||
|
||
### Protobuf | ||
|
||
```protobuf | ||
message Message { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think there need to be error types. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So you want a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes. Also There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, although it's utility would be rather limited. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Let's not do this then.
The errors that it would throw would be:
What do you think of that? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
that could be generalized to
Maybe a generic Another one we want is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes
Good idea, although I don't know if we should validate those as the clients might have newer incompatible protocol tables There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's have |
||
enum MessageType { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For messages with enums it is easy to to create messages that accidentally say something they don't intend to. The way I've seen this happen is that if an enum field has a default value that carries some semantics (name of a message, or a status) then the reader can't distinguish between a programmer explicitly setting that value versus not setting it at all. Or at least the API I was using a few years ago didn't distinguish between the two by default. Anyway, in order to avoid accidentally having semantics for enum fields the pattern was to have the default enum value always be something like UNSET or NOT_SET so that it was clear when someone explicitly set the value versus not. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I can certainly see some value in defensive programming, but I am not sure I like polluting the protocol with something that protects against PEBKAC-style programming errors. It would also be rather inconsistent as I don't think we do this anywhere else. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also note that the casual programmer, who is more likely to suffer from such PEBKAC, should never have to touch these protobufs directly. |
||
REGISTER = 0; | ||
REGISTER_RESPONSE = 1; | ||
UNREGISTER = 2; | ||
DISCOVER = 3; | ||
DISCOVER_RESPONSE = 4; | ||
} | ||
|
||
enum ResponseStatus { | ||
OK = 0; | ||
E_INVALID_NAMESPACE = 100; | ||
E_INVALID_PEER_INFO = 101; | ||
E_INVALID_TTL = 102; | ||
E_INVALID_COOKIE = 103; | ||
E_NOT_AUTHORIZED = 200; | ||
E_INTERNAL_ERROR = 300; | ||
} | ||
|
||
message PeerInfo { | ||
optional bytes id = 1; | ||
repeated bytes addrs = 2; | ||
} | ||
|
||
message Register { | ||
optional string ns = 1; | ||
optional PeerInfo peer = 2; | ||
optional int64 ttl = 3; // in seconds | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. if this were named ttlSec it would be unambiguous and the comment would be unnecessary There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't like it (it's ugly -- what is ttlSec? It's ttl in seconds. Is there a ttl not in seconds?), but I also don't feel strongly about it -- so if people would prefer ttlSec as the field, we can change it. |
||
} | ||
|
||
message RegisterResponse { | ||
optional ResponseStatus status = 1; | ||
optional string statusText = 2; | ||
} | ||
|
||
message Unregister { | ||
optional string ns = 1; | ||
optional bytes id = 2; | ||
} | ||
|
||
message Discover { | ||
optional string ns = 1; | ||
optional int64 limit = 2; | ||
optional bytes cookie = 3; | ||
} | ||
|
||
message DiscoverResponse { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Might you want a way to signal errors? For example namespace too long or internal server error? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added a ResponseStatus for reporting errors. |
||
repeated Register registrations = 1; | ||
optional bytes cookie = 2; | ||
optional ResponseStatus status = 3; | ||
optional string statusText = 4; | ||
} | ||
|
||
optional MessageType type = 1; | ||
optional Register register = 2; | ||
optional RegisterResponse registerResponse = 3; | ||
optional Unregister unregister = 4; | ||
optional Discover discover = 5; | ||
optional DiscoverResponse discoverResponse = 6; | ||
} | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: there are several comments referencing "namespace ownership" but that concept does not appear to be present in this document. If it's a part of the protocol or caller contract it's probably worth explicitly mentioning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Namespace ownership is beyond the scope of the protocol I think -- it quickly gets into ACLs and that's a rathole I don't want to go down :)