Rework priority groups #7374

tomaka · 2020-10-22T12:08:33Z

Close #6087
Relates to #7074

This PR isn't finished. Before finishing it, I'd like to get feedback about the general approach.

This PR implements what I commented about here.

When initializing the network, one will now be able to pass an additional list of "peer sets".
Rather than modifying a priority group, the authority-discovery module and Polkadot would now modify the appropriate "peer set". Priority groups no longer exist after this PR.

So what is the difference between a priority group and a peer set?

1. Each peer set has its own inbound and outbound slots, independent from the "main" inbound/outbound slots used for syncing. In practice, this guarantees that validators would always have slots reserved for collators, which is the main motivation behind this change.
2. Each peer set is tied to a specific list of notifications protocols. When we allocate a slot for a certain peer in a certain set, we open the corresponding notifications substream. When the notifications substreams are closed, we deallocate the slot. See also #7074 (Change sc_network::protocol::generic_proto::behaviour to occupy inbound peerset slots for substreams and not connections). This means that we can be "connected to" a peer (i.e. have a substream) through a certain set, while remaining "disconnected" (i.e. no substream) through a different set.

That second point isn't implemented yet, as it requires deeper changes in the behaviour/handler layer, as explained here.
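To make the two points above concrete, here is a minimal sketch of per-set slot accounting tied to substreams. It is not the code in this PR: `Sets`, `PeerSet`, `substream_opened`, `substream_closed`, and the `u32` peer id are all hypothetical names chosen for illustration.

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in for libp2p's `PeerId`, to keep the sketch dependency-free.
type PeerId = u32;

/// One peer set: its own slot limits and its own occupants (hypothetical layout).
struct PeerSet {
    max_in: usize,
    max_out: usize,
    inbound: HashSet<PeerId>,
    outbound: HashSet<PeerId>,
}

impl PeerSet {
    fn new(max_in: usize, max_out: usize) -> Self {
        PeerSet { max_in, max_out, inbound: HashSet::new(), outbound: HashSet::new() }
    }

    fn contains(&self, peer: PeerId) -> bool {
        self.inbound.contains(&peer) || self.outbound.contains(&peer)
    }
}

/// All sets, keyed by name. No slot is shared between two sets.
struct Sets {
    sets: HashMap<&'static str, PeerSet>,
}

impl Sets {
    /// A notifications substream of `set` was opened by (`inbound == true`) or
    /// towards (`inbound == false`) `peer`. Returns whether the peer holds a slot.
    fn substream_opened(&mut self, set: &str, peer: PeerId, inbound: bool) -> bool {
        let s = match self.sets.get_mut(set) {
            Some(s) => s,
            None => return false, // unknown set: nothing to allocate
        };
        if s.contains(peer) {
            return true; // already holds a slot in this set
        }
        let (slots, max) = if inbound {
            (&mut s.inbound, s.max_in)
        } else {
            (&mut s.outbound, s.max_out)
        };
        if slots.len() < max {
            slots.insert(peer);
            true
        } else {
            false
        }
    }

    /// The substream of `set` with `peer` was closed: free the slot in that set only.
    fn substream_closed(&mut self, set: &str, peer: PeerId) {
        if let Some(s) = self.sets.get_mut(set) {
            s.inbound.remove(&peer);
            s.outbound.remove(&peer);
        }
    }

    /// Whether `peer` currently holds a slot in `set`.
    fn is_connected(&self, set: &str, peer: PeerId) -> bool {
        self.sets.get(set).map_or(false, |s| s.contains(peer))
    }
}
```

With one set per group of notifications protocols, a peer can hold a slot in one set while having none in another, which is the "connected through one set, disconnected through another" situation described above.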

A consequence of this change is that the authority discovery module should no longer pass a random sample of the list of validators, but should pass the full list instead. The peerset would now be responsible for choosing that random sample.
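As an illustration of the peerset picking the sample itself, here is a rough sketch. Everything in it is an assumption made for the example: `choose_sample` and `XorShift64` are hypothetical names, and the tiny generator only exists to avoid pulling in an external crate; real code would use a proper RNG.

```rust
/// Minimal xorshift64 generator, used only to keep the sketch free of external
/// crates. The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Pick up to `slots` distinct entries from the full candidate list,
/// using a partial Fisher-Yates shuffle.
fn choose_sample<T: Clone>(candidates: &[T], slots: usize, rng: &mut XorShift64) -> Vec<T> {
    let mut pool: Vec<T> = candidates.to_vec();
    let n = slots.min(pool.len());
    for i in 0..n {
        // Swap a roughly uniformly chosen remaining element into position `i`.
        let j = i + (rng.next_u64() as usize) % (pool.len() - i);
        pool.swap(i, j);
    }
    pool.truncate(n);
    pool
}
```

The caller (here, authority discovery) would hand over the whole validator list, and the number of slots configured for the set would bound how many of them are actually picked.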

This PR is also a step towards #3310, as we would now be able to maintain a peer set for each chain.

ordian · 2020-10-22T14:37:17Z

Each peer set has its own inbound and outbound slots, independent from the "main" inbound/outbound slots used for syncing. In practice, this guarantees that validators would always have slots reserved for collators, which is the main motivation behind this change.

So how are priority groups different? Do they occupy only inbound or outbound slots?

IIUC, from Polkadot's (validator discovery) perspective, the change would be as simple as renaming *_priority_group to *_peer_set.

tomaka · 2020-10-22T16:16:43Z

So how are priority groups different? Do they occupy only inbound or outbound slots?

At the moment, the node as a whole has 25 inbound slots and 25 outbound slots (by default).
When you add a node to a priority group, the local node tries to connect to it "more" than it tries to connect to the other nodes, but it still occupies one of these 25+25 slots.

The problem with priority groups is that, when it comes to collation, a validator doesn't know who the collators are. It can't set the priority group to the list of collators. Therefore, there's a possibility that a collator tries to connect to a validator, but gets denied because all the slots of the validator are occupied by regular relay chain nodes.

After this PR, and after #7074, 25+25 slots would be attributed to relay chain nodes, and, for example, 10+10 slots would be attributed specifically to collators.
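The difference can be sketched with plain counters. This is only an illustration of the arithmetic: `Slots`, `Protocol`, `accept_shared`, and `accept_per_set` are hypothetical names, and the 10-slot collator budget is the example figure from the comment above, not a decided default.

```rust
/// A counter of occupied inbound slots against a fixed maximum.
struct Slots {
    max: usize,
    used: usize,
}

impl Slots {
    fn new(max: usize) -> Self {
        Slots { max, used: 0 }
    }

    /// Try to take one slot; returns `false` when the budget is exhausted.
    fn try_take(&mut self) -> bool {
        if self.used < self.max {
            self.used += 1;
            true
        } else {
            false
        }
    }
}

/// Hypothetical tag for the notifications protocol an incoming peer wants to use.
#[derive(Clone, Copy)]
enum Protocol {
    Sync,
    Collation,
}

/// Current model: every incoming peer competes for the same inbound slots.
fn accept_shared(shared: &mut Slots, _protocol: Protocol) -> bool {
    shared.try_take()
}

/// Proposed model: each protocol is charged to the slots of its own set.
fn accept_per_set(main: &mut Slots, collators: &mut Slots, protocol: Protocol) -> bool {
    match protocol {
        Protocol::Sync => main.try_take(),
        Protocol::Collation => collators.try_take(),
    }
}
```

Once 25 relay chain nodes have connected, `accept_shared` refuses a collator, whereas `accept_per_set` still accepts it because the collator budget is untouched.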

IIUC, from Polkadot's (validator discovery) perspective, the change would be as simple as renaming *_priority_group to *_peer_set.

That's what I understand as well.

mxinden

I'd like to get feedback about the general approach.

I am in favor of this more granular approach.

IIUC, from Polkadot's (validator discovery) perspective, the change would be as simple as renaming *_priority_group to *_peer_set.

Do I understand correctly that each peer set would have (a) a set of variable nodes where a subset is chosen based on the min and max boundaries of the set and (b) a set of reserved nodes that would always be connected? If that is true and in case you were only passing a subset of nodes from Polkadot down, you could now pass all of them down, having the PeerSetManager decide who to connect to.

mxinden · 2020-10-23T08:17:48Z

client/network/src/service.rs

-						reserved_nodes.insert(validator.peer_id.clone());
-						known_addresses.push((validator.peer_id.clone(), validator.multiaddr.clone()));
+		let peerset_config = {
+			let mut sets = Vec::with_capacity(1 + params.network_config.extra_sets.len());


Feel free to ignore, as the PR is still in progress: I doubt pre-allocating will bring us a performance gain, given that this code runs only once in the lifetime of a node and the vector is relatively small. Am I missing something?
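For reference, the two forms under discussion build the same vector; only the allocation pattern differs. The sketch below uses a hypothetical, cut-down `SetConfig` and a made-up "default" set name, not the types from this diff.

```rust
/// Hypothetical stand-in for the per-set configuration built in service.rs.
#[derive(Debug, Clone, PartialEq)]
struct SetConfig {
    name: &'static str,
}

/// Variant from the diff: reserve room for the main set plus the extra sets.
fn build_preallocated(extra: &[SetConfig]) -> Vec<SetConfig> {
    let mut sets = Vec::with_capacity(1 + extra.len());
    sets.push(SetConfig { name: "default" });
    sets.extend(extra.iter().cloned());
    sets
}

/// Variant suggested by the review: let the vector grow on demand.
fn build_plain(extra: &[SetConfig]) -> Vec<SetConfig> {
    let mut sets = Vec::new();
    sets.push(SetConfig { name: "default" });
    sets.extend(extra.iter().cloned());
    sets
}
```

Both functions return equal vectors, so the choice is purely about whether the single up-front allocation is worth the extra expression.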

mxinden · 2020-10-23T08:20:41Z

client/network/src/service.rs

@@ -967,7 +984,7 @@ impl<B: BlockT + 'static, H: ExHashT> NetworkService<B, H> {
 			.unbounded_send(ServiceToWorkerMsg::SyncFork(peers, hash, number));
 	}

-	/// Modify a peerset priority group.
+	/// Modify a set of peers from the peerset.


Would one still be allowed to create new peer sets via set_peer_set?

mxinden · 2020-10-23T08:28:57Z

client/peerset/src/lib.rs

+#[derive(Debug)]
+pub struct SetConfig {
+	/// Name of the set. Used to later refer to that set.
+	pub name: &'static str,


Do I understand correctly that set names have to be known at compile time?

Does Polkadot know all set names at compile time (//CC @ordian)? E.g. when a validator is assigned to a specific parachain, would it use a prefixed priority group or reuse the same from the previous assignment?

tomaka · 2020-10-23

This is indeed one of the aspects I'm hesitating about. To me it feels more correct to indicate ahead of time the list of overlay networks, and Polkadot would indeed use a set called something like "current-validators-rotation".

tomaka · 2020-10-23

To me it feels more correct to indicate ahead of time the list of overlay networks

Otherwise, there isn't really a clean way to configure the number of slots of each set.

ordian · 2020-11-23

Do I understand correctly that set names have to be known at compile time?

Does Polkadot know all set names at compile time (//CC @ordian)? E.g. when a validator is assigned to a specific parachain, would it use a prefixed priority group or reuse the same from the previous assignment?

That's a good question, I don't think we have a design for it now. It depends on how peer sets work, but for now we have one large group that changes over time.

tomaka · 2020-11-23

The intention behind this refactor is indeed to use one "peers set" and change its content.

ordian · 2020-11-23T11:23:22Z

Do I understand correctly that each peer set would have (a) a set of variable nodes where a subset is chosen based on the min and max boundaries of the set and (b) a set of reserved nodes that would always be connected? If that is true and in case you were only passing a subset of nodes from Polkadot down, you could now pass all of them down, having the PeerSetManager decide who to connect to.

Sorry I forgot to reply. Currently, we have only one priority group https://github.com/paritytech/polkadot/blob/7ac0353728611643d1e807eb7563e6e00cd11260/node/network/bridge/src/validator_discovery.rs#L31, this is used by collators to connect to validators. This priority group changes over time.

It will also be used by validators to connect to other validators: paritytech/polkadot#1990. My understanding is that priority groups, unlike topics, are local, meaning each node can have its own priority groups independent of others. Is that correct? Or maybe I'm misunderstanding how it works.

In practice, this guarantees that validators would always have slots reserved for collators, which is the main motivation behind this change.

tomaka · 2020-11-23T13:37:39Z

It will also be used by validators to connect to other validators: paritytech/polkadot#1990. My understanding is that priority groups, unlike topics, are local, meaning each node can have its own priority groups independent of others. Is that correct? Or maybe I'm misunderstanding how it works.

Indeed. Contrary to the notification protocol names, the names and content of the priority groups are never communicated over the network.

I'm however considering changing the API to use indices rather than names.
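A rough sketch of what an index-based API could look like follows. Nothing here is taken from the PR: `SetId`, `Peerset::from_config`, and the `in_peers`/`out_peers` fields are hypothetical names used to illustrate declaring the sets ahead of time and referring to them by position.

```rust
/// Hypothetical index-based handle replacing the `&'static str` set name.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SetId(usize);

/// Per-set limits. The field names are assumptions made for this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
struct SetConfig {
    in_peers: u32,
    out_peers: u32,
}

/// Sets are declared once, up front; a set's position in the list is its handle.
struct Peerset {
    sets: Vec<SetConfig>,
}

impl Peerset {
    /// Build the peerset and return one handle per declared set, in order.
    fn from_config(sets: Vec<SetConfig>) -> (Self, Vec<SetId>) {
        let ids = (0..sets.len()).map(SetId).collect();
        (Peerset { sets }, ids)
    }

    /// Look up a set; `None` if the index was never declared.
    fn config(&self, id: SetId) -> Option<&SetConfig> {
        self.sets.get(id.0)
    }
}
```

Because the handles never leave the node, using indices instead of names would not affect anything communicated over the network; it mainly removes string comparisons and makes the fixed list of sets explicit.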

tomaka · 2020-12-08T16:45:11Z

Marked as "ready for review" in order for CI to run. Sorry for the noise, please ignore. This isn't ready for review yet.

tomaka · 2020-12-09T11:51:53Z

Closing in favour of #7700

tomaka added 9 commits October 15, 2020 11:38

Turn peerset priority groups into sets

0016a00

Wip

749873f

Err, tabs

bc8871c

Merge remote-tracking branch 'upstream/master' into rework-priority-groups

d86b51f

Merge remote-tracking branch 'upstream/master' into rework-priority-groups

375189d

WIP

d6212db

Merge remote-tracking branch 'upstream/master' into rework-priority-groups

763be4a

Adjust network config

e1128b9

Wip

86d6666

tomaka added labels Oct 22, 2020: A3-in_progress (Pull request is in progress. No review needed at this stage.), B3-apinoteworthy, C1-low (PR touches the given topic and has a low impact on builders.)

tomaka requested review from ordian and mxinden October 22, 2020 12:08

mxinden reviewed Oct 23, 2020

tomaka mentioned this pull request Nov 12, 2020

Use inbound peerslot slots when a substream is received, rather than a connection #7464

Merged

Merge remote-tracking branch 'upstream/master' into rework-priority-groups

6dedc77

tomaka mentioned this pull request Nov 16, 2020

Give control over the peers of custom notification protocols #7072

Closed

tomaka added 5 commits November 16, 2020 17:44

Merge remote-tracking branch 'upstream/master' into rework-priority-groups

16c84a7

Merge remote-tracking branch 'upstream/master' into rework-priority-groups

8d9cbdf

Small doc

65d893c

Merge remote-tracking branch 'upstream/master' into rework-priority-groups

19438e6

WIP

e6ec01a

wheresaddie and others added 2 commits November 23, 2020 16:23

Update client/peerset/src/lib.rs

9df04ff

Co-authored-by: Max Inden <mail@max-inden.de>

Update client/peerset/src/lib.rs

7cbce24

Co-authored-by: Max Inden <mail@max-inden.de>

tomaka added 7 commits December 8, 2020 13:06

GrandPa tests compiling

8788fe7

Fix protocols in events

b8d214d

WIP

222fe3e

WIP

1a380ed

Grandpa tests now passing

500b70b

Fix some warnings

97e3101

Comment out code to fix all warnings and let CI run

ec9b55e

tomaka marked this pull request as ready for review December 8, 2020 16:44

tomaka requested a review from andresilva as a code owner December 8, 2020 16:44

tomaka added 14 commits December 8, 2020 17:57

Merge remote-tracking branch 'upstream/master' into rework-priority-groups

4fff66a

Fix warning

653a4bb

Cut down authority-discovery priority group

a50b4b1

Allow reserved-only per set

0bf2726

WIP

221d1d8

WIP

6503a97

Line widths

6b14539

Update set 1 thing

ff920d4

Proper reserved-only handling

aaf4515

Fix Grandpa path

79042b2

Restore peerset debug info

b5f89dd

I think I'm done 🎉

38eae16

Fix TODO

06f1e21

More done than done

67b2c53

tomaka mentioned this pull request Dec 9, 2020

Rework priority groups, take 2 #7700

Merged

tomaka closed this Dec 9, 2020

tomaka deleted the rework-priority-groups branch December 9, 2020 11:51

rphmeier mentioned this pull request Jan 3, 2021

Implement dormant connections/low overhead connections #7797

Closed
