-
Notifications
You must be signed in to change notification settings - Fork 275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(webrtc): add WebRTC (prev. browser-to-browser) spec #497
Conversation
Isn't this "any-to-browser"? "Servers" will also be able to use the same mechanism to dial browsers, right? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exciting!
Even two servers could use if for some reason they wanted to use WebRTC to communicate to each other, right?. Isn't it more like |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some high level thougths. This PR is mostly a description of what other protocols do. We need to specify how the libp2p-specific parts work here! This means especially that we need to specify how the signaling channel works.
In particular, (1) - (3) is just a description of Circuit v2, (4) - (6) is just a rehearsal of how a WebRTC handshake looks like between two browsers.
To make this an actual spec of how libp2p does browser-to-browser WebRTC, the following needs to be taken into account.
STUN
If I understand browser WebRTC correctly, it is required that browsers use a STUN server. The spec is completely silent about this. At the very least, we should acknowledge that implementations will need to pick a STUN server, and leave it up to the implementation to pick one (or to expose a configuration option).
Most people would then probably Google's public (and free) STUN service, as this seems to be the most popular service around. This might be considered problematic, as we might not want to have a large fraction of STUN traffic sent to Google. We could also mention that setting up a dedicated STUN server yourself (as a project maintainer) is within the realm of possibilities, and we might even consider doing that for the IPFS network.
The trade-offs need to be mentioned in the specification.
TURN
Not every (WebRTC-initiated) hole punching attempt will be successful. Many video-conferencing tools therefore set up TURN servers to help those 5-15% of unlucky users connect throught that TURN server. This is a problem, since we (as PL) can't and won't provide a (centralized) TURN server.
If you need a TURN server or not depends very much on the use case. It's useful for video conferencing, but not so much for WebTorrent, for example. I found the discussion of the tradeoffs invovled in this podcast episode very interesting.
Anyway, this should be mentioned and discussed in this specification.
Signaling Channel
DCUtR, again ☹️
I’m a bit sad to see that this PR still contains the original proposal to use DCUtR. I really don’t understand where the desire to shoehorn DCUtR into something that it’s never been designed to do is coming from. I had already flagged this in my review of the original PR in September.
DCUtR is an exceptionally bad fit here:
- DCUtR is a two-step protocol, consisting of a SYNC and a CONNECT step. We don’t need that here, we only need to exchange the peers’ SDPs, unless... we trickle.
- In libp2p hole punching, there's no need to trickle address candidates, as nodes learn about their addresses (often times long) before the connection attempt. In WebRTC, as we’re restricted by the browser API, address discovery happen during the connection attempt, and trickling addresses can greatly speed up the handshake (at least this is claimed by various sources around the internet).
- DCUtR sends addresses as multiaddrs. WebRTC on the other hand needs to exchange entire SDPs.
I also don't understand how identify push would solve the problem here.
libp2p protocol identifiers are not a limited resource by any means. Small, single-purpose protocols are a much better design pattern than big, bloated protocols.
What we should do instead
To understand how signaling works, I found it instructive to go through this WebRTC tutorial (with code) to see how signaling actually works.
We should:
- Define an identifier for the libp2p WebRTC protocol. Proposal:
/webrtc-direct
, but feel free to bike-shed this. - This protocol is used on a (bidirectional) stream, peers send each other WebRTC messages, and close the stream once the WebRTC connection has been established (or the connection attempt has failed).
- Define how messages are serialized on the stream.
While it would be possible to define a Protobuf serialization, this would be a quite cumbersome task. Some of the data structures exchanged are pretty large, for example the RTCIceCandidate
. It probably makes more sense to either serialize the entire message to JSON (like in the linked example), or to define a protobuf for the message type (offer, answer, ICE candidate, etc.) that then carries the actual message as a JSON string.
Authentication
The SDPs are exchanged over an (authenticated) libp2p connection, and they contain the certificate hash of their respective endpoint (not sure if that's what Chinmay was pointing out). So technically, there’s no need to run a Noise handshake, given that the browser checks the certificate hashes. Nodes can just “transfer” pubkeys / peer IDs from that connection to the newly established WebRTC connection and expose them on the connection API.
Agreed with both of you @Menduist and @thomaseizinger. That said, I don't think "any-to-any" or "direct-webrtc-connection-upgrade-through" is very intuitive. Thus I suggest we stick with browser-to-browser, as it is the most intuitive name I can think of. |
@marten-seemann thanks for the review. 29d166b addresses some of your comments. Still lots more to come. |
Experimenting with whether this makes reviewing and git diffs simpler.
Maybe just "webrtc-holepunching"? I'd rather have something correct than something catchy (and B2B isn't that catchy anyway, since server to browser is as important as B2B, imo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what we should use as the protocol id. webrtc-direct
is already in use, but I have been using webrtc-peer
in my experiments.
Signaling
Signaling here can be very simple. For eg. I have been using the following protobuf:
syntax = "proto2";
message Message {
enum Type {
OFFER = 0;
ANSWER = 1;
CANDIDATE = 2;
}
required Type type = 1;
required string data = 2;
}
The types are self explanatory. In the data section:
- If encoding an offer/answer, the data field contains the sdp string from the
RTCSessionDescription
. - If encoding a candidate, I have been using the JSON created by using the
toJSON
method on theRTCIceCandidate
. This allows the receiver to simply deserialize the JSON and pass it to theRTCIceCandidate
constructor. If the candidate is null or undefined, we can just send an empty string.
STUN/TURN
For browsers, it seems out of scope of this spec to define STUN or TURN server usage. It should be left to implementations, and depends a lot on the individual network situation of a node. If a node is publicly accessible, it doesn't need either. STUN works in a general case, and users can choose between running their own STUN server or using a public one. Some special cases may need a TURN server.
Is it? The Is it too confusing to use |
Unless I am missing something, the discussion about protocol names referred to the name of the stream protocol, not the name of the Multiaddr protocol component. That said, you are raising a good point, namely whether we need a Multiaddr protocol for this specification as well. The rational is documented in 78fae26:
|
Done in consistency with DCUtR protocol.
Allows future versions to deprecate and remove fields in a two step process. For more details see #465 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall ACK with a few nits
Thanks for the reviews @thomaseizinger and @marten-seemann. @marten-seemann mind taking another look? |
Friendly ping @marten-seemann. |
Co-authored-by: Chad Nehemiah <chad.nehemiah94@gmail.com>
@mxinden : what's remaining given libp2p/js-libp2p-webrtc#90 was merged and there are approvals here? |
- Bump webrtc-direct revision to r1 given the multiaddr change - Move webrtc to lifecycle 2A Candidate Recommendation - Updates dates
With |
With libp2p#412 and libp2p#497 merged, this roadmap item can be moved to "done".
The libp2p WebRTC (prev. browser-to-browser) specification.
Open items: