Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rendezvous protocol #44

Closed
wants to merge 17 commits into from
Closed
Changes from 7 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
136 changes: 136 additions & 0 deletions rendezvous/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,136 @@
# Rendezvous Protocol

The protocol described in this specification is intended to provide a
lightweight mechanism for generalized peer discovery. It can be used
for bootstrap purposes, real time peer discovery, application specific
routing, and so on. Any node implementing the rendezvous protocol can
act as a rendezvous point, allowing the discovery of relevant peers in
a decentralized fashion.

## Use Cases

Depending on the application, the protocol could be used in the
following context:
- During bootstrap, a node can use known rendezvous points to discover
peers that provide critical services. For instance, rendezvous can
be used to discover circuit relays for connectivity restricted
nodes.
- During initialization, a node can use rendezvous to discover
peers to connect with the rest of the application. For instance,
rendezvous can be used to discover pubsub peers within a topic.
- In a real time setting, applications can poll rendezvous points in
order to discover new peers in a timely fashion.
- In an application specific routing setting, rendezvous points can be
used to progressively discover peers that can answer specific queries
or host shards of content.

### Replacing ws-star-rendezvous

We intend to replace ws-star-rendezvous with a few rendezvous daemons
and a fleet of p2p-circuit relays. Real-time applications will
utilize rendezvous both for bootstrap and in a real-time setting.
During bootstrap, rendezvous will be used to discover circuit relays
that provide connectivity for browser nodes. Subsequently, rendezvous
will be utilized throughout the lifetime of the application for real
time peer discovery by registering and polling rendezvous points.
This allows us to replace a fragile centralized component with a
horizontally scalable ensemble of daemons.

### Rendezvous and pubsub

Rendezvous can be naturally combined with pubsub for effective
real-time discovery. At a basic level, rendezvous can be used to
bootstrap pubsub: nodes can utilize rendezvous in order to discover
their peers within a topic. Contrariwise, pubsub can also be used as
a mechanism for building rendezvous services. In this scenerio, a
number of rendezvous points can federate using pubsub for internal
real-time distribution, while still providing a simple interface to
clients.


## The Protocol

The rendezvous protocol provides facilities for real-time peer
discovery within application specific namespaces. Peers connect to the
rendezvous point and register their presence in one or more
namespaces. Registrations persist until the peer disconnects or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requiring that nodes be connected to the rendezvous in order for their data to be served by it is strange. Maybe don't specify that bit? Maybe values could have a timeout or something similar?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe value lifetime deserves its own section?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it has upsides for real time discovery -- just keep the node visible while it's still connected.
But you are right, there are use cases where we just want to register with some TTL and not keep a connection open.

I think we should just add an optional TTL in the REGISTER message -- if it's omitted, keep the registration until the node disconnects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add an extra paragraph about the lifetime of the registration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A section is better actually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done; we have an optional TTL in the REGISTER message now.

explicitly unregisters.

Peers registered with the rendezvous point can be discovered by other
nodes by querying the rendezvous point. The query specifies the
namespace for limiting application scope and optionally a maximum
number of peers to return. The namespace can be omitted in the query,
which asks for all peers registered to the rendezvous point.

### Interaction

Clients `A` and `B` connect to the rendezvous point `R` and register for namespace
`my-app` with a `REGISTER` message:

```
A -> R: REGISTER{my-app, {QmA, AddrA}}
B -> R: REGISTER{my-app, {QmB, AddrB}}
```

Client `C` connects and registers for namespace `another-app`:
```
C -> R: REGISTER{another-app, {QmC, AddrC}}
```

Another client `D` can discover peers in `my-app` by sending a `DISCOVER` message; the
rendezvous point responds with the list of current peer reigstrations.
```
D -> R: DISCOVER{my-app}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also add a way of providing a limit of peers and a timestamp of when they connected.

DISCOVER:{[my-app], [LIMIT, SINCE]}, where:

  • LIMIT (integer) - optional, the max peer we're willing to receive (should prevent various attacks as well as allow peers/apps adjusting this depending on env/conditions). If none provided return all peers.
  • SINCE (timestamp) - optional, return peers connected from this time onward, allows returning fresh peers. If none provided, return all peers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of SINCE, as it is useful indeed. But we will have to correlate this with a rendezvous point timestamp, so we should probably include one in the DiscoverResponse.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed we should! Great idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

R -> D: [REGISTER{my-app, {QmA, Addr}}
REGISTER{my-app, {QmB, Addr}}]
```

If `D` wants to discover all peers registered with `R`, then it can omit the namespace
in the query:
```
D -> R: DISCOVER{}
R -> D: [REGISTER{my-app, {QmA, Addr}}
REGISTER{my-app, {QmB, Addr}}
REGISTER{another-app, {QmC, AddrC}}]
```

### Protobuf

```protobuf
message Message {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there need to be error types.
For ex. when a peer tries to register for an ID that the peer does not own.
Or when the peer tries to register a namespace that is way too long (say a big .jpg in base64).

Copy link
Contributor Author

@vyzo vyzo Apr 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you want a RegisterResponse?
Sure, we can do that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Also DiscoverResponse could include an optional error, too

Copy link
Contributor Author

@vyzo vyzo Apr 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, although it's utility would be rather limited.
What is considered an error in this case? A non-existent namespace is equivalent to empty, and namespace name that is too long will always be empty.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, although it's utility would be rather limited.

Let's not do this then.

So you want a RegisterResponse?

The errors that it would throw would be:

  • E_NAMESPACE_TOO_LONG
  • E_OWNERSHIP_NOT_TRUSTED (or similar)

What do you think of that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E_NAMESPACE_TOO_LONG

that could be generalized to E_INVALID_NAMESPACE to convey rejected characters for some implementation. Or it could be a different error altogether.

E_OWNERSHIP_NOT_TRUSTED

Maybe a generic E_NOT_AUTHORIZED is more apt?

Another one we want is E_INVALID_MULTIADDR perhaps.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a generic E_NOT_AUTHORIZED is more apt?

Yes

Another one we want is E_INVALID_MULTIADDR perhaps.

Good idea, although I don't know if we should validate those as the clients might have newer incompatible protocol tables

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's have E_INVALID_PEER_INFO to signal generic rejection of the peer info, and we can see what kind of validation makes sense.

enum MessageType {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make more sense to merge register and unregister requests? Or dropping the message type completly?
The client could simply check the length of the array for all types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want unregister so that we can depart from a namespace without having to disconnect.
The message type is needed to disambiguate the protocol.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For messages with enums it is easy to to create messages that accidentally say something they don't intend to. The way I've seen this happen is that if an enum field has a default value that carries some semantics (name of a message, or a status) then the reader can't distinguish between a programmer explicitly setting that value versus not setting it at all. Or at least the API I was using a few years ago didn't distinguish between the two by default. Anyway, in order to avoid accidentally having semantics for enum fields the pattern was to have the default enum value always be something like UNSET or NOT_SET so that it was clear when someone explicitly set the value versus not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can certainly see some value in defensive programming, but I am not sure I like polluting the protocol with something that protects against PEBKAC-style programming errors.

It would also be rather inconsistent as I don't think we do this anywhere else.

Copy link
Contributor Author

@vyzo vyzo Apr 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also note that the casual programmer, who is more likely to suffer from such PEBKAC, should never have to touch these protobufs directly.
These should be well abstracted by implementation libraries, and bugs of this type should be quickly weeded out there.

REGISTER = 0;
UNREGISTER = 1;
DISCOVER = 2;
DISCOVER_RESPONSE = 3;
}

message PeerInfo {
optional string id = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to use the smaller non-encoded version of the peer-id instead of the b58 string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, that's just the data type -- we use string for unencoded addresses in go.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is no crypto-challenge for the id the only way to prove ownership is the SECIO challenge, which is only done for the current peer-id. Therefore this field seems useless to me...
Why not just drop the PeerInfo msg and use repeated addrs on register?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The registration is forwarded in the DiscoverResponse (modulo ttl) and it would thus be ambiguous without the peer id.

You also are missing the potential use case of rendezvous points sharing registrations (say in federation of daemons, using pubsub, etc).
More generally, having a peer id allows registrations to be passed around and reconstruct full PeerInfo objects from them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You also are missing the potential use case of rendezvous points sharing registrations (say in federation of daemons, using pubsub, etc).

Then the data must be somehow authenticated, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can authenticate the sharing peer instead of the data.
Authenticating the registration itself would require a signature in them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can authenticate the sharing peer instead of the data.

If the data isn't authenticated, wouldn't that allow for replay attacks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but you'd get that too if you were authenticating with a nonce.

I don't think that replay attacks are a big concern in this context, as the sharing peers can establish a trust model with each other.

Note that signatures along are not sufficient to prevent replay attacks in a shared setting either, and trying to add a signature timestamp gets complicated quick.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

repeated bytes addrs = 2;
}

message Register {
optional string ns = 1;
optional PeerInfo peer = 2;
}

message Unregister {
optional string ns = 1;
}

message Discover {
optional string ns = 1;
optional int limit = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be theLIMIT I'm proposing in the comment above...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that's already there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error: Could not resolve int: JS still complains

}

message DiscoverResponse {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might you want a way to signal errors? For example namespace too long or internal server error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.
We briefly discussed this with @mkg20001 and it seemed of limited utility for namespace errors, but internal server errors is probably something we want reported.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a ResponseStatus for reporting errors.

repeated Register registrations = 1;
}

optional MessageType type = 1;
optional Register register = 2;
optional Unregister unregister = 3;
optional Discover discover = 4;
optional DiscoverResponse discoverResponse = 5;
}