validator discovery improvements #2460
For p. 2, the problem is that the authority discovery worker only retains keys of the as suggested by @rphmeier we can tweak the semantics of polkadot/runtime/polkadot/src/lib.rs (line 1284 in f778e52)
|
@ordian I see that |
@parity2305 thanks for looking into this. We'll have to enable it only on Rococo for now: polkadot/runtime/rococo/src/lib.rs (line 848 in f778e52).
We probably need a helper function in runtime/parachains/src/runtime_api_impl/v1.rs
|
@ordian So for point 1: there are two usages of
|
Thanks for taking a look. |
Correct. Plus, inserting a key value in |
I just looked at the implementation, noticing that all that
Ideally in my opinion, the API would either take All in all not a big deal of course, in my particular case I had a |
@ordian For point 3, we can store |
@parity2305 yeah, I was thinking of something similar. We either need a background task polling the authority discovery service every few minutes, or a way to subscribe to authority discovery notifications when a new id is discovered (that would be harder to implement, but more efficient). For the background task case, I guess we'd need to poll it for every unresolved authority id, so the priority here doesn't really matter; it can simply be a round-robin queue. |
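For illustration only, the background-task variant could look roughly like the sketch below. All names here (UnresolvedQueue, poll_once, the u32 and String stand-ins for AuthorityDiscoveryId and its addresses) are hypothetical and not taken from the Polkadot or Substrate code; the lookup closure stands in for whatever query the authority discovery service exposes.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for the real `AuthorityDiscoveryId`.
type AuthorityId = u32;
/// Hypothetical stand-in for a resolved network address.
type Address = String;

/// Round-robin queue of authority ids that could not be resolved yet.
#[derive(Default)]
pub struct UnresolvedQueue {
    pending: VecDeque<AuthorityId>,
}

impl UnresolvedQueue {
    /// Enqueue an id unless it is already waiting.
    pub fn push(&mut self, id: AuthorityId) {
        if !self.pending.contains(&id) {
            self.pending.push_back(id);
        }
    }

    /// Number of ids still waiting for resolution.
    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }

    /// One polling pass: every pending id is looked up exactly once.
    /// Resolved ids are returned; unresolved ones go back to the end
    /// of the queue, so no id gets priority over another.
    pub fn poll_once<F>(&mut self, mut lookup: F) -> Vec<(AuthorityId, Address)>
    where
        F: FnMut(AuthorityId) -> Option<Address>,
    {
        let mut resolved = Vec::new();
        for _ in 0..self.pending.len() {
            if let Some(id) = self.pending.pop_front() {
                match lookup(id) {
                    Some(addr) => resolved.push((id, addr)),
                    None => self.pending.push_back(id),
                }
            }
        }
        resolved
    }
}
```

A timer firing every few minutes would call poll_once with a closure that queries the discovery cache, then hand the resolved entries to the peerset.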
@ordian Cool. I have started working on adding a background job to |
Actually, as the discovery happens for the current and the next session - shouldn't this be fine? Like if discovery was not able to discover the peer for a complete session, it probably won't a few minutes later either. |
this is just a filter that is used to retain ids in the cache
Discovery is happening continuously by listening to the DHT events. If we happen to query it at some point and it returns |
Also the default config discovery parameters are not set in stone and we might want to tweak them for polkadot needs. |
It retains still-active authorities in the cache and also issues queries for them, which happens continuously for validators of the current and the next session. So the cache will be filled with authorities from the next session throughout the current session. I would expect the cache to be nicely populated when we start requesting authorities on session change, as it has already been filled during the previous session (when we did not care about them yet). This is true except maybe on startup; I'm not sure whether that is an issue though, as a node will have some ramp-up time anyway. To me it looks like the current implementation is totally fine. The cache represents our view of discoverable validators. Sure, that view can change over time, but I believe we can treat any particular query as the current status quo and be good with it. |
It could, but I would argue that it is pretty unlikely and we should not need to care. The cache should make sure we don't need to care, and I think it does. There can be network issues of course, but we could also have them when connecting to the validator. That's just something we need to be able to deal with, and I think we do, as we have redundancy in all our protocols. From an attack surface perspective: if an adversary was able to prevent discovery for a whole session, it could very likely prevent it for another session as well. So redoing failed lookups would not help here either. |
Alright, so our assumptions are
If these assumptions are to hold in the future, then I agree we don't need an additional polling mechanism. |
I guess 1 can be weakened a bit - if we only start requesting connections to validators of the next session, towards the end of the previous one (to make more smooth transitions), we should still be golden. |
Hmm. Agreed. Also, with the merge of #2494 we now keep track of the past N sessions in authority discovery, so as per #1461 (comment) every networking subsystem should be able to connect to relevant validators using the authority-discovery worker's cache. I will close my draft PR. |
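To make the retention idea concrete, here is a minimal sketch of a cache that keeps addresses only for authorities of the last few tracked sessions. It is not the code from #2494; AddrCache, note_session, the window parameter, and the u32/String stand-ins are all hypothetical names chosen for the example.

```rust
use std::collections::{BTreeMap, HashMap, HashSet};

/// Hypothetical stand-ins for the real session and authority id types.
type SessionIndex = u32;
type AuthorityId = u32;

/// Address cache that only retains authorities of the tracked sessions.
pub struct AddrCache {
    /// How many sessions to keep track of (assumed configuration value).
    window: usize,
    sessions: BTreeMap<SessionIndex, Vec<AuthorityId>>,
    addresses: HashMap<AuthorityId, String>,
}

impl AddrCache {
    pub fn new(window: usize) -> Self {
        Self { window, sessions: BTreeMap::new(), addresses: HashMap::new() }
    }

    /// Record the authority set of a session, drop the oldest sessions
    /// beyond the window, and prune addresses that no longer belong to
    /// any tracked session. The `retain` call is the "filter" part.
    pub fn note_session(&mut self, index: SessionIndex, authorities: Vec<AuthorityId>) {
        self.sessions.insert(index, authorities);
        while self.sessions.len() > self.window {
            self.sessions.pop_first();
        }
        let keep: HashSet<AuthorityId> =
            self.sessions.values().flatten().copied().collect();
        self.addresses.retain(|id, _| keep.contains(id));
    }

    /// Store a discovered address; it is pruned on the next session note
    /// if the authority is not part of any tracked session.
    pub fn insert_address(&mut self, id: AuthorityId, addr: String) {
        self.addresses.insert(id, addr);
    }

    pub fn get(&self, id: AuthorityId) -> Option<&String> {
        self.addresses.get(&id)
    }
}
```

With a window covering the past N sessions plus the current and next one, a lookup at session change would normally hit an entry that was filled in earlier, which is the assumption the discussion above relies on.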
Thanks, p.2 was implemented in #2494 and p.1 and p.3 are agreed to be not worth pursuing, so I'm closing this issue. |
As discussed in Element, we have several things that can be improved with the current validator discovery implementation:
Vec<ValidatorId>, we can accept a Vec<ValidatorIndex> instead; this would avoid a couple of runtime calls and simplify the implementation a bit.
AuthorityDiscoveryId to a peerset when it's requested again in a ConnectToValidators request. Not sure what we can do about it. Note that the Substrate side will only update its cache in 10 minutes.
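As a rough sketch of the Vec<ValidatorIndex> idea: if the caller already holds a per-session list of discovery keys ordered by validator index (an assumption made for this example, not something stated above), indices can be resolved with plain slice lookups and no extra runtime calls. The types below are simplified stand-ins and authorities_for_indices is a hypothetical helper.

```rust
/// Simplified stand-in for the real `ValidatorIndex`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ValidatorIndex(pub u32);

/// Simplified stand-in for the real `AuthorityDiscoveryId`.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct AuthorityDiscoveryId(pub String);

/// Resolve validator indices against a session's discovery keys.
/// `session_authorities` is assumed to be ordered by validator index.
/// Out-of-range indices are skipped rather than treated as errors.
pub fn authorities_for_indices(
    session_authorities: &[AuthorityDiscoveryId],
    indices: &[ValidatorIndex],
) -> Vec<AuthorityDiscoveryId> {
    indices
        .iter()
        .filter_map(|i| session_authorities.get(i.0 as usize).cloned())
        .collect()
}
```

Whether out-of-range indices should be skipped silently or reported is a design choice left open here.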