prospective-parachains: respond with multiple backable candidates #3160
Conversation
Paves the way for elastic scaling.
// Try finding a candidate chain starting from `base_node` of length `expected_count`.
// If not possible, return the longest we could find.
// Does a depth-first search, since we're optimistic that there won't be more than one such
// chain. Assumes that the chain must be acyclic.
What happens if they are cyclic? What is the worst-case complexity here, assuming parachains misbehave as much as they possibly can?
That's a good question. We stop the search on the respective branch when we see that node A has a child that we've already visited. Maybe we should just return an error; I don't think it's a valid state transition to accept. Is it?
Even if we were to accept cycles, we'd still be bounded by a function of the requested number of candidates.
I modified the code in the latest revision to accept cycles in the fragment trees. This is actually a documented behaviour of prospective-parachains, and there are even some tests for it. See:
//! # Cycles
I also documented the performance considerations. Cycles are not an issue because we are limited by the core count anyway, so we won't loop indefinitely.
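For illustration, here's a minimal sketch of that bound (hypothetical names and arena layout, not the actual subsystem code): a depth-first search whose recursion budget is `expected_count`, so even a cyclic fragment graph can't make it loop indefinitely.

```rust
// Hypothetical model of the traversal. Each node stores indices of its
// children in a flat arena; a cycle just means a child index pointing back
// at an ancestor.
struct Node {
    children: Vec<usize>,
}

/// DFS from `base` for a chain of `expected_count` candidates; returns the
/// longest chain found if no chain of the full length exists.
fn find_backable_chain(arena: &[Node], base: usize, expected_count: usize) -> Vec<usize> {
    let mut best = Vec::new();
    let mut path = vec![base];
    dfs(arena, base, expected_count, &mut path, &mut best);
    best
}

fn dfs(arena: &[Node], node: usize, target: usize, path: &mut Vec<usize>, best: &mut Vec<usize>) {
    if path.len() > best.len() {
        *best = path.clone(); // keep the longest chain seen so far
    }
    if path.len() >= target {
        return; // budget exhausted: this is why cycles can't loop forever
    }
    for &child in &arena[node].children {
        path.push(child);
        dfs(arena, child, target, path, best);
        path.pop();
        if best.len() >= target {
            return; // early exit: a full-length chain was already found
        }
    }
}
```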
Looks sensible! The algorithmic complexity still makes me uneasy, though; pinged Rob on what his stance on forks is by now.
// 2. `expected_count` is equal to the number of cores a para is scheduled on (in an elastic
// scaling scenario). For non-elastic-scaling, this is just 1. In practice, this should be a
// small number (1-3), capped by the total number of available cores (a constant alterable
// only via governance/sudo).
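As a toy illustration of that cap (hypothetical helper, not the actual subsystem code), the requested count is just the para's scheduled core count clamped by the total:

```rust
// Hypothetical sketch: with elastic scaling a para may be scheduled on several
// cores at once; without it, exactly one. The total core count only changes
// via governance/sudo, so it acts as a hard upper bound.
fn expected_count(cores_scheduled_for_para: usize, total_cores: usize) -> usize {
    cores_scheduled_for_para.min(total_cores)
}
```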
So, with realistic numbers (current and expected/exaggerated future numbers): how bad does it get? What would be the limits on those parameters?
Elastic scaling will eventually enable chains to be bounded by the rate of a single node's block production.
In Cumulus currently, block authorship is typically slower than block verification (in validators) due to the need to read from disk. Most of the time is spent in accessing the trie for storage, not in code execution. Access patterns for the trie are also very inefficient. So it's not feasible to use more than 3 cores until that bottleneck is cleared.
If this bottleneck is alleviated, then I could see parachains using many more cores.
Thinking about this a bit more, I don't think this traversal is any different in terms of worst-case complexity from the routine used for populating the fragment tree (which populates the tree breadth-first). Is it?
In an acyclic tree, it'll visit each node at most once, stopping once it reaches the requested count. In a cyclic one, it'd be bounded by the requested count, so it's the same complexity.
The complexity is linear in the number of nodes, but the number of nodes is an exponential function (because it's an n-ary tree, at least in theory, of arity `num_forks` and height `max_candidate_depth`).
If we hit a bottleneck, this wouldn't be the first place where it'd surface.
I think we could limit the total number of forks allowed for a parachain (regardless of their level in the tree). Capping the total number of branches in the tree like that would bring down the worst-case complexity massively. WDYT?
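To put rough numbers on that exponential bound (purely illustrative arithmetic, not measured figures): a full n-ary tree of arity `num_forks` = f and height `max_candidate_depth` = d has (f^(d+1) - 1)/(f - 1) nodes.

```rust
/// Worst-case node count of a full n-ary fragment tree:
/// 1 + f + f^2 + ... + f^d = (f^(d+1) - 1) / (f - 1) for arity f > 1.
fn worst_case_nodes(num_forks: u64, max_candidate_depth: u32) -> u64 {
    if num_forks <= 1 {
        return u64::from(max_candidate_depth) + 1; // a plain chain
    }
    (num_forks.pow(max_candidate_depth + 1) - 1) / (num_forks - 1)
}

fn main() {
    // Illustrative parameter choices, just to get a feel for the growth:
    assert_eq!(worst_case_nodes(3, 4), 121); // modest forking, shallow tree
    assert_eq!(worst_case_nodes(5, 10), 12_207_031); // heavy forking, deep tree
}
```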
Anyway, I think this is somewhat orthogonal to this PR.
To quote the docs from the `fragment_tree` module:
//! The code in this module is not designed for speed or efficiency, but conceptual simplicity.
//! Our assumption is that the amount of candidates and parachains we consider will be reasonably
//! bounded and in practice will not exceed a few thousand at any time. This naive implementation
//! will still perform fairly well under these conditions, despite being somewhat wasteful of
//! memory.
We should maybe revisit this assumption, but it doesn't seem like it'll be a problem for the foreseeable future.
`Prospective-parachains` seems to be a good subsystem benchmark target. That should give some clear numbers on CPU usage in worst-case scenarios.
Opened a separate issue to track this security item and added it to the elastic scaling task: #3219
🚀
// TODO: taking the first best selection might introduce bias
Is this to be solved in this PR?
No, this was here beforehand.
It could be solved by randomising the order in which we visit a node's children, but that would make tests harder to write.
Since validators don't favour particular collators when requesting collations, the order of potential forks in the fragment tree should already be somewhat "random", based on network latency (unless collators find a way to censor/DoS other collators).
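A sketch of that randomisation idea (assuming the `rand` crate; hypothetical, not what the PR does):

```rust
use rand::seq::SliceRandom;

// Hypothetical: shuffle a node's children before descending, so that no fork
// is systematically preferred just because it arrived first. Tests would
// inject a seeded RNG instead of `thread_rng` to stay deterministic.
fn randomise_visit_order(children: &mut [usize]) {
    children.shuffle(&mut rand::thread_rng());
}
```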
// Parachain 1 looks like this:
🥇
#3130 builds on top of #3160.

Processes the availability cores and builds a record of how many candidates it should request from prospective-parachains and their predecessors. Tries to supply as many candidates as the runtime can back. Note that the runtime changes to back multiple candidates per para are not yet done, but this paves the way for it.

The following backing/inclusion policy is assumed:
1. The runtime will never back candidates of the same para which don't form a chain with the already backed candidates, even if the others are still pending availability. We're optimistic that they won't time out, and we don't want to back parachain forks (as the complexity would be huge).
2. If a candidate is timed out of the core before being included, all of its successors occupying a core will be evicted.
3. Only the candidates which are made available and form a chain starting from the on-chain para head may be included/enacted and cleared from the cores. In other words, if the para head is at A and the cores are occupied by B->C->D, and B and D are made available, only B will be included and its core cleared. C and D will remain on the cores, waiting for C to be made available or timed out. As point (2) above already says, if C is timed out, D will also be dropped.
4. The runtime will deduplicate candidates which form a cycle. For example, if the provisioner supplies candidates A->B->A, the runtime will only back A (as the state output will be the same).

Note that if a candidate is timed out, we don't guarantee that in the next relay chain block the block author will be able to fill all of the timed-out cores of the para. That increases complexity by a lot. Instead, the provisioner will supply N candidates, where N is the number of timed-out candidates, but it won't include their successors, which will also be deleted by the runtime. This'll be backfilled in the next relay chain block.

Adjacent changes:
- Also fixes: #3141
- For the non-prospective-parachains case, don't supply multiple candidates per para (we can't have elastic scaling without prospective parachains enabled). paras_inherent should already sanitise this input, but it's more efficient this way.

Note: all of these changes are backwards-compatible with the non-elastic-scaling scenario (one core per para).
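A minimal model of rule 3 above (hypothetical types; the real logic lives in the runtime): only the longest all-available prefix of the chain occupying the cores can be included.

```rust
// Hypothetical model: candidates occupy cores as a chain starting at the
// on-chain para head, and only a prefix of fully available candidates can be
// included. With head A and cores B -> C -> D where only B and D are
// available, just B gets included.
#[derive(Clone, Copy)]
struct CoreEntry {
    available: bool,
}

/// Returns how many candidates from the front of the chain can be included.
fn includable_prefix(chain: &[CoreEntry]) -> usize {
    chain.iter().take_while(|c| c.available).count()
}

fn main() {
    let cores = [
        CoreEntry { available: true },  // B
        CoreEntry { available: false }, // C (still awaiting availability)
        CoreEntry { available: true },  // D
    ];
    assert_eq!(includable_prefix(&cores), 1); // only B is included
}
```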
Fixes #3129