A&V networking: sentry node proxies #85
cc @mxinden
These numbers are high enough that we consider it unnecessary to specify that requests-for-pieces can be proxied, which itself is also slightly more complex than proxying a push as it involves one extra half-round of messages. This allows us to avoid some (hopefully) unnecessary complexity in what is already a fairly complex protocol.
Since the private validator node may not be able to access the address book, the sentry node is the one to perform the address book lookup. As described in 1(a) above, in the general case it will get a set of addresses as the result. For better load-balancing, the sentry node should sort this set and select the jth address to connect to, where j = i mod n, n is the size of the set, and (c, i) is the co-ordinate of its validator.
jth address
What does jth stand for?
It defines j just a few words afterwards, is it not clear?
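The selection rule under discussion (sort the lookup result, then take index j = i mod n) can be sketched as follows. This is only an illustration: the function name `select_address` is hypothetical, and the sort order is assumed to be plain lexicographic ordering of address strings, since the text only says "sort".

```python
def select_address(addresses, i):
    """Pick the address a sentry node connects to.

    addresses: the set returned by the address book lookup.
    i: the second component of the validator's co-ordinate (c, i).
    Assumption: addresses are strings and are sorted lexicographically.
    """
    if not addresses:
        raise ValueError("address book lookup returned no addresses")
    ordered = sorted(addresses)   # every node must agree on the same order
    j = i % len(ordered)          # j = i mod n, where n is the size of the set
    return ordered[j]


# Example: three addresses, validator co-ordinate (c, 4), so j = 4 mod 3 = 1.
chosen = select_address(
    ["10.0.0.3:30333", "10.0.0.1:30333", "10.0.0.2:30333"], 4
)
```

Because j depends only on i and n, sentry nodes whose validators have different i values spread their connections across the sorted set rather than all choosing the first entry.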
Suggested change, from:
Since the private validator node may not be able to access the address book, the sentry node is the one to perform the address book lookup. As described in 1(a) above, in the general case it will get a set of addresses as the result. For better load-balancing, the sentry node should sort this set and select the jth address to connect to, where j = i mod n, n is the size of the set, and (c, i) is the co-ordinate of its validator.
to:
Say the address book is realized via a Kademlia DHT. Given that the private validator won't be able to make outbound connections beyond its sentry nodes, the validator node may not be able to access the address book. The sentry node is the one to perform the address book lookup. As described in 1(a) above, in the general case it will get a set of addresses as the result. For better load-balancing, the sentry node should sort this set and select the jth address to connect to, where j = i mod n, n is the size of the set, and (c, i) is the co-ordinate of its validator.
I wanted this paragraph to be general, since in the future we may want to have part of the address book on-chain, in which case the private validator node would be able to access it. But part of it would still be on the Kademlia DHT, so it is probably better to have the sentry node resolve it all the time.
Co-authored-by: Max Inden <mail@max-inden.de>
This metadata should be gossiped every few seconds. The data of the actual pieces are distributed via a separate topology as described below:
These should be gossiped every few seconds, and allows the participants to know when the stages of the protocol begin and end, details below.
The data of the actual pieces are distributed via a topology separate from the broadcast medium:
Recall that we have a disjoint partition of N validators into C sets of parachain validators, each set having size N/C. For our purposes for this subprotocol, we will randomly assign a co-ordinate (c, i) to every validator, with c in [0, C) and i in [0, N/C). Fixing c and varying i defines a particular parachain validator set; varying c and fixing i defines what we'll call a particular validator "ring". This name is only meant to be very slightly suggestive, the precise structure and its justification will be described below.
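As a rough illustration of the co-ordinate scheme just described, the sketch below assigns (c, i) co-ordinates and reads off a parachain validator set and a ring. The names `assign_coordinates`, `parachain_set` and `ring` are hypothetical, the seeded shuffle is an assumed stand-in for whatever shared randomness the protocol actually uses, and the sketch assumes C divides N exactly.

```python
import random


def assign_coordinates(validators, num_chains, seed=None):
    """Randomly map each validator to a co-ordinate (c, i).

    c is in [0, C) and i is in [0, N/C). Assumes C divides N.
    """
    n = len(validators)
    if n % num_chains != 0:
        raise ValueError("this sketch assumes C is a divisor of N")
    per_chain = n // num_chains
    shuffled = list(validators)
    random.Random(seed).shuffle(shuffled)  # assumed source of randomness
    # Position k in the shuffled list gets (k // per_chain, k % per_chain).
    return {v: divmod(k, per_chain) for k, v in enumerate(shuffled)}


def parachain_set(coords, c):
    """Fix c, vary i: the validators of one parachain."""
    return sorted(v for v, (vc, _) in coords.items() if vc == c)


def ring(coords, i):
    """Vary c, fix i: one validator "ring", with one member per parachain."""
    return sorted(v for v, (_, vi) in coords.items() if vi == i)


# Example: N = 12 validators, C = 3 parachains.
coords = assign_coordinates(list(range(12)), 3, seed=1)
chain_0 = parachain_set(coords, 0)  # 4 validators
ring_0 = ring(coords, 0)            # 3 validators, one from each parachain
```

With N = 12 and C = 3 there are three parachain validator sets of size 4 and four rings of size 3, and every validator belongs to exactly one of each.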
This is beyond the scope of the current PR, but I wonder about the coordinate scheme in situations where we may have groups of slightly different sizes if C is not a divisor of N. In practice, we have two options. The first is to spread the remainder over groups, so you have N%C groups with 1 extra validator. The other option is to leave some random N%C validators idle for a session. Ensuring that C is a divisor poses challenges on implementation and usability so I'd prefer to leave that off the table. Our thinking has been leaning towards option 1 so far, what do you think?
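For concreteness, option 1 (spreading the remainder so that N%C groups have one extra validator) gives the group sizes sketched below. The function name `group_sizes` is hypothetical, and giving the extra validators to the first N%C groups is an arbitrary assumption made only for this illustration. The sketch covers sizes only, not how the grid topology should treat the extra validators.

```python
def group_sizes(n, c):
    """Sizes of C groups when N validators are spread as evenly as possible.

    N % C groups get one extra validator; which groups those are is an
    assumption here (the first N % C).
    """
    if c <= 0:
        raise ValueError("need at least one group")
    base, extra = divmod(n, c)
    return [base + 1 if k < extra else base for k in range(c)]


# Example: N = 10, C = 3 gives one group of 4 and two groups of 3.
sizes = group_sizes(10, 3)
```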
Yes, one extra sounds fine. I still kinda dislike the grid topology here since we can attach fairly regular bipartite graphs to every parachain group pair.
The grid topology was mainly motivated back when originally I thought I might have to handle nodes with arbitrary NAT situations. So it is not essential, it can be changed. However it does make load-balancing easy to reason about, and I couldn't immediately think of anything that was significantly better.
I will add a paragraph mentioning what happens with unevenly-sized parachains, with the "one extra" approach. BTW, this problem occurs with the bipartite-graphs topology as well, since obviously you can't have a perfect matching between sets of size (C) and (C+1).
OK, this is actually way less trivial than I thought, at least to do it in a way that preserves load-balancing properties and avoids some nodes having to do 2x the amount of work. Will have to spend a bit more time to think about this.
Note: simply spreading the extra validators across the existing groups is not well-defined, because you need to define what pieces they are supposed to distribute to everyone else in the group. But there is already someone from their chain who is distributing a well-defined set of pieces to everyone else in the group.
I will merge this PR and file an issue about this.
Tracking in w3f/research-internal#390
As discussed previously, support direct sends for sentry nodes. The extension is pretty straightforward, and replaces the previous more-complex extension on arbitrary network reachabilities.