-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
autorelay #454
autorelay #454
Conversation
cc @whyrusleeping @mgoelzer |
rebased on master |
cc @dignifiedquire |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mostly good, but I'm not exactly into this code
options.go
Outdated
func Routing(rt config.RoutingC) Option { | ||
return func(cfg *Config) error { | ||
if cfg.Routing != nil { | ||
return fmt.Errorf("cannot specified multiple routing options") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/specified/specify
p2p/host/relay/autorelay.go
Outdated
h.mx.Unlock() | ||
|
||
limit := 20 | ||
for ; need > limit; limit *= 2 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like there should be a better way to do this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we could just take some factor of need (say 2x) and go from there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
simplified in f495b8a
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite familiar with this part of libp2p, so this should probaby get at least some more review from someone who is.
@vyzo, this is coming along really nicely. Before final merge, can you add some docs about how it works with usage examples for downstreams? Also, I’m thinking we should get a reviewer from Filecoin to verify a build off this branch fixes the problem they were having that raised the priority of this? (or if it doesn’t, we should open new issues to cover what else they need for their use case) Maybe @phritz? |
We still need to bubble up some recent changes in multiaddr-net before we can merge, but that's just procedural at this point. Re: documentation Note that an autonat service instance is needed to detect the presence of NAT -- see libp2p/go-libp2p-autonat-svc#1 (which is also waiting for the bubble up before merge). So, in a nutshell:
|
So just a status update: the code is pretty much ready, but needs some further review (poking @Stebalien @raulk @bigs). |
I think we want to rename the |
@vyzo I think @mgoelzer was proxying the request from go-filecoin around docs. Basically when we went to try to implement relaying ourselves we spent a lot of time trying to figure out how to do it. We had confusion around:
We ended up spending many hours trying to get this working, even with the benefit of early input from why and stebalien, and relied mostly on trying to find places in the code that used this stuff and trial and error. It would be great to have a little more explanation of how this works (#454 (comment) is a great start -- promote it to documentation!) along with some sample code. The sample code would be most helpful! This is the case we were after:
The case of talking to each other via a set of discoverable relay nodes is a nice to have, as is auto-detection of whether my peer is reachable or needs to use a relay (I think you have this WIP). Docs we used:
|
The way this will work with autorelay:
There is no need to have the fixed relays known to the end hosts ahead of time, they will be discovered through the DHT. |
rebased on master |
I will add a edit: done in 326dc68 |
@vyzo Thanks for adding the docs PR, and @phritz thanks for engaging in this discussion to clarify what the questions are. Long term, I'd like us to find a way that each major PR has documentation independent of its source code files or Github PR comments page. We could either house this in a single location like https://docs.libp2p.io (which we could drive off GH pages or something) or we need to put it in each module's repo (e.g., https://github.com/libp2p/go-libp2p-circuit/blob/master/README.md). We can hire a professional docs writer to clean up the writing and index it somewhere, as long as we capture the facts somewhere. Since |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is really great. some small feedback, some of which i suspect can be ignored. generally thrilled with how readable/grokable this all is.
p2p/host/relay/autorelay.go
Outdated
} | ||
|
||
dctx, cancel := context.WithTimeout(ctx, 60*time.Second) | ||
pis, err := discovery.FindPeers(dctx, h.discover, "/libp2p/relay", limit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's factor this magic string into a constant (should it be versioned?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, will do. I don't think we need a version for this, relay is pretty fixed functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
called it RelayRendezvous
and it's a constant now.
p2p/host/relay/autorelay.go
Outdated
|
||
// TODO better relay selection strategy; this just selects random relays | ||
// but we should probably use ping latency as the selection metric | ||
shuffleRelays(pis) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if we factor this out into a RelaySelection
interface or something similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe it could be a function passed in at configuration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more like ordering of relays. I don't think I want to make this configurable, just make it smarter and use ping latency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved the relay selection out of line, into a selectRelays
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ultimately, I think we also need to take into account relay load in the selection strategy.
The selection metric could be latency * load
, and select to minimize.
But this will need some changes in the relay protocol (ie ability to measure load and report that), so I am going to leave it as a big fat TODO and stay with randomized selection for initial deployment.
@mgoelzer Sure, but let's merge first. Then we can add pretty comprehensive documentation there. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bunch of comments I'd like addressed.
p2p/host/relay/autorelay.go
Outdated
) | ||
|
||
const ( | ||
RelayRenezvous = "/libp2p/relay" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
|
||
func (h *AutoRelayHost) background(ctx context.Context) { | ||
select { | ||
case <-time.After(autonat.AutoNATBootDelay + BootDelay): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We really need an event bus. These cascading time bombs make me nervous. AutoRelay depends on AutoNAT, which in turn depends on Identify.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
..shrug :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we do eventually need a better solution. Having to know that if you try doing X before Y finishes means it will fail is kinda rough.
p2p/host/relay/autorelay.go
Outdated
|
||
// add relay specific addrs to the list | ||
for _, pi := range h.relays { | ||
circuit, err := ma.NewMultiaddr(fmt.Sprintf("/ipfs/%s/p2p-circuit", pi.ID.Pretty())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use /p2p
which is equivalent for /ipfs
in multiaddr?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
} | ||
|
||
for _, addr := range pi.Addrs { | ||
if !manet.IsPrivateAddr(addr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow this logic. The resulting self addresses via relays would end up taking the form: /ipfs/QmRelay/p2p-circuit/{maddr of *relay* if not private}
Are we sure we want to encapsulate the relay's addrs instead of our own?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is nothing after the p2p-circuit
part, it's just a transport signifier.
The relayed multiaddr will be of the form /public-maddr-of-relay/ipfs/QmRelay/p2p-circuit
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So just to be clear, it's the non-private address of the relay that is encapsulating the /p2p/QmRelay/p2p-circuit
part, not the other way around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, thanks.
} | ||
|
||
// and the actual test! | ||
func TestAutoRelay(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to be a stickler for testing, but I think a component like this deserves more exhaustive testing at the time of its submission of the PR, or at least a commitment from the author to add more tests shortly thereafter.
Particularly I'd like to see coverage regarding:
- our actual own advertised addresses as a result of autorelay (we are only testing that there's a P_CIRCUIT component, but we should be testing it's entirety, especially because we're manipulating maddrs)
- behaviour with unspecific relays.
- multiple rounds of autorelay.
- timeouts.
- different AutoNAT statuses.
- connection terminations.
- etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is overtesting stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a good use case for a testbed/daemon environment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I would very much see these integration tested in a testbed than having to write reams of borderline useless testing code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't tell you how many times what appeared to be overtesting at the time caught regressions later on ;-)
// verify that we advertise relay addrs | ||
haveRelay := false | ||
for _, addr := range h3.Addrs() { | ||
_, err := addr.ValueForProtocol(circuit.P_CIRCUIT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's assert the entire multiaddr, not only that a component exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a lot of work that replicates the logic in the code, not sure I see the point (it is tested for dialability later on).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I didn't realise it was tested for dialability 👍
macosx seems to fail intermittently, and that race is the likely culprit.
also check that we don't initially advertise any.
rebased on master and updated deps. |
I’m assuming autorelay is critically urgent for downstream users. As long as we explicitly state in godoc that these interfaces are UNSTABLE, I can live with merging now and revisiting once the service abstraction is ready. |
Implements autorelay; Closes #444.
Depends on:
TBD:
New
constructor to automatically create relay/autorelay/routed hosts