Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Manual] Marking offline collators not producing blocks during X rounds #2259

Merged
merged 34 commits into from
Oct 6, 2023

Conversation

Agusrodri
Copy link
Contributor

@Agusrodri Agusrodri commented Apr 25, 2023

What does it do?

This PR introduces the manual design (via extrinsics) of marking a collator offline if it doesn’t produce blocks within a specific number of rounds. This number is declared as a constant (MaxOfflineRounds) and it’s different depending on which runtime we are situated on. In Moonbeam this constant is equal only to 1 round while in Moonbase and Moonriver gets the value of 2 rounds. It is important to note that this constant (in the case of the manual design) must be less than or equal to the constant RewardPaymentDelay due to this design takes into account the rewards, and wouldn't work if we use them after they are reset.

Proposed solution

Create a new extrinsic called notify_inactive_collator that anyone can call to inform a collator is inactive (not producing blocks during X consecutive rounds).

This extrinsic performs a few checks. If we are in round 10 for instance, these checks will be applied to rounds 9 and 8 (in case MaxOfflineRounds is 2). For each round it checks:

  • The collator has an AtStake<> storage key associated for the round. This means the candidate was selected as a collator in that round.
  • The collator has zero points in its AwardedPts<> storage. This means the collator hasn't produced any block during that round.

Also, at the beginning of the extrinsic, there is an additional check to ensure the collators set length is not lower than 66% of the value stored in TotalSelected. This is done to prevent collators set going empty and make the chain stop producing blocks.

Once all these conditions are met, the extrinsic uses a custom handler called OnInactiveCollator to inform the runtime that a collator is inactive, and let the runtime decides what to do with it. The default behavior is to mark the collator as offline, but if we are dealing with moonbeam-orbiters for instance, we will probably use a different implementation.

⚠️ Breaking Changes ⚠️

  • All changes are applied in pallet-parachain-staking.
  • Added notify_inactive_collator extrinsic.
  • Changed AtStake<> to be of type OptionQuery instead of previous ValueQuery.
  • Added a new custom handler called OnInactiveCollator to notify the runtime when a collator is inactive.

@Agusrodri Agusrodri added B7-runtimenoteworthy Changes should be noted in any runtime-upgrade release notes D9-needsaudit👮 PR contains changes to fund-managing logic that should be properly reviewed and externally audited D10-breaksdocs Changes to code that require changes in documentation breaking Needs to be mentioned in breaking changes labels Apr 25, 2023
@librelois
Copy link
Collaborator

librelois commented Apr 27, 2023

Another good solution is to replace the extrinsic by a non mandatory inherent, It would still be more interesting than the automatic approach for 3 reasons:

  1. Part of the computation still offchain (detection of which collators should be marked offline)
  2. If the verification code panic, it will not stall the chain nor the collator because it's a non-mandatory inherent
  3. Still zero cost for project that want to use the parachain staking pallet without this functionally

And obviously the non mandatory inherent approach has a big advantage over the extrinsic: collators do not need to run a 2nd specific program to submit the extrinsic, submission is natively managed by the inherent system

EDIT: we can also keep both the inherent ant the extrinsic

@Agusrodri
Copy link
Contributor Author

Agusrodri commented Apr 27, 2023

Another good solution is to replace the extrinsic by a non mandatory inherent, It would still be more interesting than the automatic approach for 3 reasons:

  1. Part of the computation still offchain (detection of which collators should be marked offline)
  2. If the verification code panic, it will not stall the chain nor the collator because it's a non-mandatory inherent
  3. Still zero cost for project that want to use the parachain staking pallet without this functionally

And obviously the non mandatory inherent approach has a big advantage over the extrinsic: collators do not need to run a 2nd specific program to submit the extrinsic, submission is natively managed by the inherent system

EDIT: we can also keep both the inherent ant the extrinsic

Sounds good. If you agree, we could open another PR for that case, and merge this one with the manual solution before diving into the inherents one. What do you think?

@librelois
Copy link
Collaborator

Sounds good. If you agree, we could open another PR for that case, and merge this one with the manual solution before diving into the inherents one. What do you think?

I discussed my inherent idea with @crystalin, and in fact the collators have incentive to not mark other collators offline, so the inherent would not work (it would work for a while, until someone maintains a different binary to optimize the interest of the collators, like the MeV on ethereum mainnet).

@Agusrodri Agusrodri added D9-needsaudit👮 PR contains changes to fund-managing logic that should be properly reviewed and externally audited and removed D9-needsaudit👮 PR contains changes to fund-managing logic that should be properly reviewed and externally audited labels May 3, 2023
@Agusrodri Agusrodri marked this pull request as draft June 21, 2023 13:28
@github-actions
Copy link
Contributor

github-actions bot commented Aug 7, 2023

Coverage generated "Fri Oct 6 16:08:31 UTC 2023":
https://d3ifz9vhxc2wtb.cloudfront.net/pulls/2259/html/index.html

Master coverage: 87.39%
Pull coverage:

@librelois librelois mentioned this pull request Aug 25, 2023
28 tasks
@Agusrodri
Copy link
Contributor Author

@notlesh do the weights of the notify_inactive_collator extrinsic look good for you? They are provisional until we run the benchmark and update the weights. Also, in my opinion the enable_marking_offline extrinsic doesn't need a benchmark, so let me know what do you think about its weights too :)

@notlesh
Copy link
Contributor

notlesh commented Oct 5, 2023

@notlesh do the weights of the notify_inactive_collator extrinsic look good for you? They are provisional until we run the benchmark and update the weights. Also, in my opinion the enable_marking_offline extrinsic doesn't need a benchmark, so let me know what do you think about its weights too :)

I added some locally generated ones in 85301b1 and b5ea5cd. They should be regenerated on proper hardware.

Copy link
Contributor

@notlesh notlesh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@Agusrodri Agusrodri merged commit 2a4c274 into master Oct 6, 2023
26 checks passed
@Agusrodri Agusrodri deleted the agustin-mark-offline-manual branch October 6, 2023 16:40
grw-ms pushed a commit that referenced this pull request Oct 9, 2023
…ds (#2259)

* begin manual mark-offline

* remove CandidateLastActive

* fix precompile staking

* comments and fmt

* fix do_go_offline() weight

* make it compile

* apply some suggestions

* test case MaxOfflineRounds < RewardPaymentDelay

* fix test

* add killswitch for marking offline feature

* fix mock config in parachain staking precompile

* add benchmark for notify_inactive_collator extrinsic

* refactor benchmark

* fix call to select_top_candidates in benchmark

* minor changes in benchmark

* add killswitch as a storage item

* fmt

* remove ConstBool

* enable killswitch in benchmark

* add enable_marking_offline extrinsic

* optmize collators_len read and benchmark

* Add locally-generated benchmark for notify_inactive_collator

* Hook up newly added benchmark to pallet call

* add extra check for current round

* minor changes in tests

* add test

* modify rounds_to_check syntax

---------

Co-authored-by: Stephen Shelton <steve@brewcraft.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B7-runtimenoteworthy Changes should be noted in any runtime-upgrade release notes breaking Needs to be mentioned in breaking changes D9-needsaudit👮 PR contains changes to fund-managing logic that should be properly reviewed and externally audited D10-breaksdocs Changes to code that require changes in documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants