Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling of disabled validators in backing subsystem #1259

Merged
merged 44 commits into from
Oct 20, 2023

Conversation

tdimitrov
Copy link
Contributor

@tdimitrov tdimitrov commented Aug 29, 2023

Handles validator disabling in the backing subsystem. The PR introduces two changes:

  1. Don't import statements from disabled validators.
  2. Don't validate and second if the local validator is disabled.

The purpose of this change is first to ignore statements from disabled validators on the node side. This is an optimisation as these statements are supposed to be cleared in the runtime too (still not implemented at the moment). Additionally if the local node is a validator which is disabled - don't process any new candidates.

@burdges
Copy link

burdges commented Sep 6, 2023

Also discussed in #635 (comment) where I'm not in favor of disabling approval checkers, and definitely not grandpa voters, but yes it's harmless to disable backers, which makes it good to disable backers.

@ordian
Copy link
Member

ordian commented Sep 11, 2023

Looks sane, we would ideally also ignore dispute statements, or at least dispute initiations.

@ordian
Copy link
Member

ordian commented Sep 11, 2023

What happens if we revert the block where a validator was disabled to a block where he isn't?

polkadot/node/core/backing/src/lib.rs Outdated Show resolved Hide resolved
.any(|disabled_val_idx| *disabled_val_idx == *validator_idx)
}

pub fn local_validator_is_disabled(&self) -> Option<bool> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to return Option ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.validator is an Option. Afair the code can be executed in cases where the local node is not a validator?

I decided to return an Option here too and handle it at the caller.

@sandreim
Copy link
Contributor

We should also look into adding a new zombienet test.

@tdimitrov
Copy link
Contributor Author

Looks sane, we would ideally also ignore dispute statements, or at least dispute initiations.

Yes, but this is part of dispute-coordinator/dispute-distribution right? Or there is something in backing which I am missing?

I'm splitting the work in smaller PRs so that they are easier for review. All the work goes in https://github.com/paritytech/polkadot-sdk/tree/tsv-disabling-node-side

What happens if we revert the block where a validator was disabled to a block where he isn't?

Good question. The first thing that comes to my mind is we should count its vote, but I haven't thought about this in detail yet.

@rphmeier
Copy link
Contributor

rphmeier commented Sep 11, 2023

It would be better to handle this on the statement-distribution layer instead, i.e. since there are no changes there in this PR we will currently both accept and distribute statements from disabled validators, even though the backing system does not import them.

What I would suggest is this:

  1. Backing: do not sign, import, or distribute local statements if we are disabled (or kick off validation work at all)
  2. statement-dist: do not accept, request, or distribute statements from disabled validators

A modification to the statement-dist StatementStore would do most of the heavy-lifting here - three parts:

  1. StatementStore: generate request bitmasks which exclude disabled validators
  2. StatementStore reject statements related to disabled validators
  3. Manifest handling: do not count disabled validators towards the seconded & backing thresholds in the advertised bitmask. Likewise in request handling: skip disabled validators' statements in the response

@tdimitrov tdimitrov changed the base branch from master to tsv-disabling October 16, 2023 11:33
Co-authored-by: ordian <write@reusable.software>
@tdimitrov tdimitrov marked this pull request as ready for review October 17, 2023 09:09
Comment on lines 1029 to 1053
// TODO: https://github.com/paritytech/polkadot-sdk/issues/1940
// Once runtime ver `DISABLED_VALIDATORS_RUNTIME_REQUIREMENT` is released remove this `if`
// statement, add `request_from_runtime` call to the `try_join!` call above and use
// `try_runtime_api!` to get `disabled_validators`
let disabled_validators = if has_required_runtime(
ctx.sender(),
parent,
RuntimeApiRequest::DISABLED_VALIDATORS_RUNTIME_REQUIREMENT,
)
.await
{
let disabled_validators = request_from_runtime(parent, ctx.sender(), |tx| {
RuntimeApiRequest::DisabledValidators(tx)
})
.await
.await
.map_err(Error::JoinMultiple)?;

let disabled_validators = try_runtime_api!(disabled_validators);
disabled_validators
} else {
gum::debug!(target: LOG_TARGET, "Runtime doesn't support `DisabledValidators` - continuing with an empty disabled validators set");
vec![]
};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code needs to stay until we release v8: #1940

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we extract that into subsystem-util or somewhere where statement-distribution and dispute-coordinator can reuse it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was thinking about the same and I started working on it but against master.
My idea is to have a wrapper around request_from_runtime which does a version check and optionally returns a fallback version. I'll open a PR against master because I think this will be useful in general. We can backport it here easily after that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went into a macro rabbit hole. I'll just extract this implementation here and finish the other version in a more appropriate time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, sounds good. However, I didn't mean to use a macro for that, just a regular function should be OK.

@tdimitrov tdimitrov requested a review from ordian October 19, 2023 10:11
@tdimitrov tdimitrov merged commit 913232a into tsv-disabling Oct 20, 2023
109 checks passed
@tdimitrov tdimitrov deleted the tsv-disabling-backing branch October 20, 2023 13:13
@tdimitrov tdimitrov mentioned this pull request Oct 20, 2023
4 tasks
ordian added a commit that referenced this pull request Oct 20, 2023
* tsv-disabling: (39 commits)
  Handling of disabled validators in backing subsystem (#1259)
  Switch trie cache random seed (#1935)
  Expose prometheus metrics for minimal-relay-chain node in collators (#1942)
  Do not force collators to update after enabling async backing (#1920)
  Pin PRDoc image to v0.0.5 until we are ready for v0.0.6 (#1947)
  [prdoc] Start BEEFY gadget by default for Polkadot nodes (#1945)
  Update bridges subtree  (#1944)
  bump zombienet version (#1931)
  [FRAME] Message Queue use proper overweight limit (#1873)
  Cumulus: Allow aura to use initialized collation request receiver (#1911)
  Use prebuilt try-runtime binary in CI (#1898)
  Update kusama/polkadot bootnodes (#1895)
  Introduce XcmFeesToAccount fee manager (#1234)
  upgraded review bot to v2.1.0 (#1908)
  Trading trait and deal with metadata in Mutate trait for nonfungibles_v2 (#1561)
  Add Runtime Missing Crate Descriptions (#1909)
  Switch to the release env (#1910)
  Bump paritytech/review-bot from 2.0.1 to 2.1.0 (#1924)
  Bump actions/checkout from 4.1.0 to 4.1.1 (#1925)
  Start BEEFY client by default for Polkadot nodes (#1913)
  ...
tdimitrov added a commit that referenced this pull request Oct 23, 2023
Handles validator disabling in the backing subsystem. The PR introduces
two changes:
1. Don't import statements from disabled validators.
2. Don't validate and second if the local validator is disabled.

The purpose of this change is first to ignore statements from disabled
validators on the node side. This is an optimisation as these statements
are supposed to be cleared in the runtime too (still not implemented at
the moment). Additionally if the local node is a validator which is
disabled - don't process any new candidates.

---------

Co-authored-by: ordian <noreply@reusable.software>
Co-authored-by: ordian <write@reusable.software>
ordian added a commit that referenced this pull request Dec 20, 2023
Handles validator disabling in the backing subsystem. The PR introduces
two changes:
1. Don't import statements from disabled validators.
2. Don't validate and second if the local validator is disabled.

The purpose of this change is first to ignore statements from disabled
validators on the node side. This is an optimisation as these statements
are supposed to be cleared in the runtime too (still not implemented at
the moment). Additionally if the local node is a validator which is
disabled - don't process any new candidates.

---------

Co-authored-by: ordian <noreply@reusable.software>
Co-authored-by: ordian <write@reusable.software>
tdimitrov added a commit that referenced this pull request Jan 8, 2024
…tem (#1259) (#2764)

#1259 was merged into a feature branch, but we've decided to merge
node-side changes for disabling straight into master.
This is a dependency of #1841 and #2637.

---------

Co-authored-by: Tsvetomir Dimitrov <tsvetomir@parity.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T0-node This PR/Issue is related to the topic “node”.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants