Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

impl guide: Collator Generation #6938

Closed

Conversation

mrcnski
Copy link
Contributor

@mrcnski mrcnski commented Mar 22, 2023

Some notes I started taking on this subsystem.

rphmeier and others added 30 commits February 21, 2022 17:46
* initial stab at candidate_context

* fmt

* docs & more TODOs

* some cleanups

* reframe as inclusion_emulator

* documentations yes

* update types

* add constraint modifications

* watermark

* produce modifications

* v2 primitives: re-export all v1 for consistency

* vstaging primitives

* emulator constraints: handle code upgrades

* produce outbound HRMP modifications

* stack.

* method for applying modifications

* method just for sanity-checking modifications

* fragments produce modifications, not prospectives

* make linear

* add some TODOs

* remove stacking; handle code upgrades

* take `fragment` private

* reintroduce stacking.

* fragment constructor

* add TODO

* allow validating fragments against future constraints

* docs

* relay-parent number and min code size checks

* check code upgrade restriction

* check max hrmp per candidate

* fmt

* remove GoAhead logic because it wasn't helpful

* docs on code upgrade failure

* test stacking

* test modifications against constraints

* fmt

* test fragments

* descending or duplicate test

* fmt

* remove unused imports in vstaging

* wrong primitives

* spellcheck
* inclusion: utility for allowed relay-parents

* inclusion: use prev number instead of prev hash

* track most recent context of paras

* inclusion: accept previous relay-parents

* update dmp  advancement rule for async backing

* fmt

* add a comment about validation outputs

* clean up a couple of TODOs

* weights

* fix weights

* fmt

* Resolve dmp todo

* Restore inclusion tests

* Restore paras_inherent tests

* MostRecentContext test

* Benchmark for new paras dispatchable

* Prepare check_validation_outputs for upgrade

* cargo run --quiet --profile=production  --features=runtime-benchmarks -- benchmark --chain=kusama-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/kusama/src/weights/runtime_parachains_paras.rs

* cargo run --quiet --profile=production  --features=runtime-benchmarks -- benchmark --chain=westend-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/westend/src/weights/runtime_parachains_paras.rs

* cargo run --quiet --profile=production  --features=runtime-benchmarks -- benchmark --chain=polkadot-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/polkadot/src/weights/runtime_parachains_paras.rs

* cargo run --quiet --profile=production  --features=runtime-benchmarks -- benchmark --chain=rococo-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/rococo/src/weights/runtime_parachains_paras.rs

* Implementers guide changes

* More tests for allowed relay parents

* Add a github issue link

* Compute group index based on relay parent

* Storage migration

* Move allowed parents tracker to shared

* Compile error

* Get group assigned to core at the next block

* Test group assignment

* fmt

* Error instead of panic

* Update guide

* Extend doc-comment

* Update runtime/parachains/src/shared.rs

Co-authored-by: Robert Habermeier <rphmeier@gmail.com>

Co-authored-by: Chris Sosnin <chris125_@live.com>
Co-authored-by: Parity Bot <admin@parity.io>
Co-authored-by: Chris Sosnin <48099298+slumber@users.noreply.github.com>
* docs and skeleton

* subsystem skeleton

* main loop

* fragment tree basics & fmt

* begin fragment trees & view

* flesh out more of view update logic

* further flesh out update logic

* some refcount functions for fragment trees

* add fatal/non-fatal errors

* use non-fatal results

* clear up some TODOs

* ideal format for scheduling info

* add a bunch of TODOs

* some more fluff

* extract fragment graph to submodule

* begin fragment graph API

* trees, not graphs

* improve docs

* scope and constructor for trees

* add some test TODOs

* limit max ancestors and store constraints

* constructor

* constraints: fix bug in HRMP watermarks

* fragment tree population logic

* set::retain

* extract population logic

* implement add_and_populate

* fmt

* add some TODOs in tests

* implement child-selection

* strip out old stuff based on wrong assumptions

* use fatality

* implement pruning

* remove unused ancestor constraints

* fragment tree instantiation

* remove outdated comment

* add message/request types and skeleton for handling

* fmt

* implement handle_candidate_seconded

* candidate storage: handle backed

* implement handle_candidate_backed

* implement answer_get_backable_candidate

* remove async where not needed

* implement fetch_ancestry

* add logic for run_iteration

* add some docs

* remove global allow(unused), fix warnings

* make spellcheck happy (despite English)

* fmt

* bump Cargo.lock

* replace tracing with gum

* introduce PopulateFrom trait

* implement GetHypotheticalDepths

* revise docs slightly

* first fragment tree scope test

* more scope tests

* test add_candidate

* fmt

* test retain

* refactor test code

* test populate is recursive

* test contiguity of depth 0 is maintained

* add_and_populate tests

* cycle tests

* remove PopulateFrom trait

* fmt

* test hypothetical depths (non-recursive)

* have CandidateSeconded return membership

* tree membership requests

* Add a ProspectiveParachainsSubsystem struct

* add a staging API for base constraints

* add a `From` impl

* add runtime API for staging_validity_constraints

* implement fetch_base_constraints

* implement `fetch_upcoming_paras`

* remove reconstruction of candidate receipt; no obvious usecase

* fmt

* export message to broader module

* remove last TODO

* correctly export

* fix compilation and add GetMinimumRelayParent request

* make provisioner into a real subsystem with proper mesage bounds

* fmt

* fix ChannelsOut in overseer test

* fix overseer tests

* fix again

* fmt
* BEGIN ASYNC candidate-backing CHANGES

* rename & document modes

* answer prospective validation data requests

* GetMinimumRelayParents request is now plural

* implement an implicit view utility for backing subsystems

* implicit-view: get allowed relay parents

* refactorings and improvements to implicit view

* add some TODOs for tests

* split implicit view updates into 2 functions

* backing: define State to prepare for functional refactor

* add some docs

* backing: implement bones of new leaf activation logic

* backing: create per-relay-parent-states

* use new handle_active_leaves_update

* begin extracting logic from CandidateBackingJob

* mostly extract statement import from job logic

* handle statement imports outside of job logic

* do some TODO planning for prospective parachains integration

* finish rewriting backing subsystem in functional style

* add prospective parachains mode to relay parent entries

* fmt

* add a RejectedByProspectiveParachains error

* notify prospective parachains of seconded and backed candidates

* always validate candidates exhaustively in backing.

* return persisted_validation_data from validation

* handle rejections by prospective parachains

* implement seconding sanity check

* invoke validate_and_second

* Alter statement table to allow multiple seconded messages per validator

* refactor backing to have statements carry PVD

* clean up all warnings

* Add tests for implicit view

* Improve doc comments

* Prospective parachains mode based on Runtime API version

* Add a TODO

* Rework seconding_sanity_check

* Iterate over responses

* Update backing tests

* collator-protocol: load PVD from runtime

* Fix validator side tests

* Update statement-distribution to fetch PVD

* Fix statement-distribution tests

* Backing tests with prospective paras #1

* fix per_relay_parent pruning in backing

* Test multiple leaves

* Test seconding sanity check

* Import statement order

Before creating an entry in `PerCandidateState` map
wait for the approval from the prospective parachains

* Add a test for correct state updates

* Second multiple candidates per relay parent test

* Add backing tests with prospective paras

* Second more than one test without prospective paras

* Add a test for prospective para blocks

* Update malus

* typos

Co-authored-by: Chris Sosnin <chris125_@live.com>
* Provisioner changes for async backing

* Select candidates based on prospective paras mode

* Revert naming

* Update tests

* Update TODO comment

* review
* Provisioner changes for async backing

* Select candidates based on prospective paras mode

* Revert naming

* Update tests

* Update TODO comment

* review
…o handle versioned packets (#5991)

* BEGIN STATEMENT DISTRIBUTION WORK

create a vstaging network protocol which is the same as v1

* mostly make network bridge amenable to vstaging

* network-bridge: fully adapt to vstaging

* add some TODOs for tests

* fix fallout in bitfield-distribution

* bitfield distribution tests + TODOs

* fix fallout in gossip-support

* collator-protocol: fix message fallout

* collator-protocol: load PVD from runtime

* add TODO for vstaging tests

* make things compile

* set used network protocol version using a feature

* fmt

* get approval-distribution building

* fix approval-distribution tests

* spellcheck

* nits

* approval distribution net protocol test

* bitfield distribution net protocol test

* Revert "collator-protocol: fix message fallout"

This reverts commit 07cc887.

* Network bridge tests

Co-authored-by: Chris Sosnin <chris125_@live.com>
* remove max_pov_size requirement from prospective pvd request

* fmt
* add compatibility type to v2 statement distribution message

* warning cleanup

* handle compatibility layer for v2

* clean up an unimplemented!() block

* circulate statements based on version

* extract legacy v1 code into separate module

* remove unimplemented

* clean up naming of from_requester/responder

* remove TODOs

* have backing share seconded statements with PVD

* fmt

* fix warning

* Quick fix unused warning for not yet implemented/used staging messages.

* Fix network bridge test

* Fix wrong merge.

We now have 23 subsystems (network bridge split + prospective
parachains)

Co-authored-by: Robert Klotzner <robert.klotzner@gmx.at>
* Fix backing tests

* Fix warnings.
* Draft collator side changes

* Start working on collations management

* Handle peer's view change

* Versioning on advertising

* Versioned collation fetching request

* Handle versioned messages

* Improve docs for collation requests

* Add spans

* Add request receiver to overseer

* Fix collator side tests

* Extract relay parent mode to lib

* Validator side draft

* Add more checks for advertisement

* Request pvd based on async backing mode

* review

* Validator side improvements

* Make old tests green

* More fixes

* Collator side tests draft

* Send collation test

* fmt

* Collator side network protocol versioning

* cleanup

* merge artifacts

* Validator side net protocol versioning

* Remove fragment tree membership request

* Resolve todo

* Collator side core state test

* Improve net protocol compatibility

* Validator side tests

* more improvements

* style fixes

* downgrade log

* Track implicit assignments

* Limit the number of seconded candidates per para

* Add a sanity check

* Handle fetched candidate

* fix tests

* Retry fetch

* Guard against dequeueing while already fetching

* Reintegrate connection management

* Timeout on advertisements

* fmt

* spellcheck

* update tests after merge
bredamatt and others added 9 commits March 8, 2023 10:40
* start on handling incoming

* split off session info into separate map

* start in on a knowledge tracker

* address some grumbles

* format

* missed comment

* some docs for direct

* add note on slashing

* amend

* simplify 'direct' code

* finish up the 'direct' logic

* add a bunch of tests for the direct-in-group logic

* rename 'direct' to 'cluster', begin a candidate_entry module

* distill candidate_entry

* start in on a statement-store module

* some utilities for the statement store

* rewrite 'send_statement_direct' using new tools

* filter sending logic on peers which have the relay-parent in their view.

* some more logic for handling incoming statements

* req/res: BackedCandidatePacket -> AttestedCandidate + tweaks

* add a `validated_in_group` bitfield to BackedCandidateInventory

* BackedCandidateInventory -> Manifest

* start in on requester module

* add outgoing request for attested candidate

* add a priority mechanism for requester

* some request dispatch logic

* add seconded mask to tagged-request

* amend manifest to hold group index

* handle errors and set up scaffold for response validation

* validate attested candidate responses

* requester -> requests

* add some utilities for manipulating requests

* begin integrating requester

* start grid module

* tiny

* refactor grid topology to expose more info to subsystems

* fix grid_topology test

* fix overseer test

* implement topology group-based view construction logic

* fmt

* flesh out grid slightly more

* add indexed groups utility

* integrate Groups into per-session info

* refactor statement store to borrow Groups

* implement manifest knowledge utility

* add a test for topology setup

* don't send to group members

* test for conflicting manifests

* manifest knowledge tests

* fmt

* rename field

* garbage collection for grid tracker

* routines for finding correct/incorrect advertisers

* add manifest import logic

* tweak naming

* more tests for manifest import

* add comment

* rework candidates into a view-wide tracker

* fmt

* start writing boilerplate for grid sending

* fmt

* some more group boilerplate

* refactor handling of topology and authority IDs

* fmt

* send statements directly to grid peers where possible

* send to cluster only if statement belongs to cluster

* improve handling of cluster statements

* handle incoming statements along the grid

* API for introduction of candidates into the tree

* backing: use new prospective parachains API

* fmt prospective parachains changes

* fmt statement-dist

* fix condition

* get ready for tracking importable candidates

* prospective parachains: add Cow logic

* incomplete and complete hypothetical candidates

* remove keep_if_unneeded

* fmt

* implement more general HypotheticalFrontier

* fmt, cleanup

* add a by_parent_hash index to candidate tracker

* more framework for future code

* utilities for getting all hypothetical candidates for frontier

* track origin in statement store

* fmt

* requests should return peer

* apply post-confirmation reckoning

* flesh out import/announce/circulate logic on new statements

* adjust

* adjust TODO comment

* fix  backing tests

* update statement-distribution to use new indexedvec

* fmt

* query hypothetical candidates

* implement `note_importable_under`

* extract common utility of fragment tree updates

* add a helper function for getting statements unknown by backing

* import fresh statements to backing

* send announcements and acknowledgements over grid

* provide freshly importable statements

also avoid tracking backed candidates in statement distribution

* do not issue requests on newly importable candidates

* add TODO for later when confirming candidate

* write a routine for handling backed candidate notifications

* simplify grid substantially

* add some test TODOs

* handle confirmed candidates & grid announcements

* finish implementing manifest handling, including follow up statements

* send follow-up statements when acknowledging freshly backed

* fmt

* handle incoming acknowledgements

* a little DRYing

* wire up network messages to handlers

* fmt

* some skeleton code for peer view update handling

* more peer view skeleton stuff

* Fix async backing statement distribution tests (#6621)

* Fix compile errors in tests

* Cargo fmt

* Resolve some todos in async backing statement-distribution branch (#6482)

* Implement `remove_by_relay_parent`

* Extract `minimum_votes` to shared primitives.

* Add `can_send_statements_received_with_prejudice` test

* Fix test

* Update docstrings

* Cargo fmt

* Fix compile error

* Fix compile errors in tests

* Cargo fmt

* Add module docs; write `test_priority_ordering` (first draft)

* Fix `test_priority_ordering`

* Move `insert_or_update_priority`: `Drop` -> `set_cluster_priority`

* Address review comments

* Remove `Entry::get_mut`

* fix test compilation

* add a TODO for a test

* clean up a couple of TODOs

* implement sending pending cluster statements

* refactor utility function for sending acknowledgement and statements

* mostly implement catching peers up via grid

* Fix clippy error

* alter grid to track all pending statements

* fix more TODOs and format

* tweak a TODO in requests

* some logic for dispatching requests

* fmt

* skeleton for response receiving

* Async backing statement distribution: cluster tests (#6678)

* Add `pending_statements_set_when_receiving_fresh_statements`

* Add `pending_statements_updated_when_sending_statements` test

* fix up

* fmt

* update TODO

* rework seconded mask in requests

* change doc

* change unhandledresponse not to borrow request manager

* only accept responses sufficient to back

* finish implementing response handling

* extract statement filter to protocol crate

* rework requests: use statement filter in network protocol

* dispatch cluster requests correctly

* rework cluster statement sending

* implement request answering

* fmt

* only send confirmed candidate statement messages on unified relay-parent

* Fix Tests In Statement Distribution Branch

* Async Backing: Integrate `vstaging` of statement distribution into `lib.rs` (#6715)

* Integrate `handle_active_leaves_update`

* Integrate `share_local_statement`/`handle_backed_candidate_message`

* Start hooking up request/response flow

* Finish hooking up request/response flow

* Limit number of parallel requests in responder

* Fix test compilation errors

* Fix missing check for prospective parachains mode

* Fix some more compile errors

* clean up some review comments

* clean up warnings

* Async backing statement distribution: grid tests (#6673)

* Add `manifest_import_returns_ok_true` test

* cargo fmt

* Add pending_communication_receiving_manifest_on_confirmed_candidate

* Add `senders_can_provide_manifests_in_acknowledgement` test

* Add a couple of tests for pending statements

* Add `pending_statements_cleared_when_sending` test

* Add `pending_statements_respect_remote_knowledge` test

* Refactor group creation in tests

* Clarify docs

* Address some review comments

* Make some clarifications

* Fix post-merge errors

* Clarify test `senders_can_provide_manifests_in_acknowledgement`

* Try writing `pending_statements_are_updated_after_manifest_exchange`

* Document "seconding limit" and `reject_overflowing_manifests` test

* Test that seconding counts are not updated for validators on error

* Fix tests

* Fix manifest exchange test

* Add more tests in `requests.rs` (#6707)

This resolves remaining TODOs in this file.

* remove outdated inventory terminology

* Async backing statement distribution: `Candidates` tests (#6658)

* Async Backing: Fix clippy errors in statement distribution branch (#6720)

* Integrate `handle_active_leaves_update`

* Integrate `share_local_statement`/`handle_backed_candidate_message`

* Start hooking up request/response flow

* Finish hooking up request/response flow

* Limit number of parallel requests in responder

* Fix test compilation errors

* Fix missing check for prospective parachains mode

* Fix some more compile errors

* Async Backing: Fix clippy errors in statement distribution branch

* Fix some more clippy lints

* add tests module

* fix warnings in existing tests

* create basic test harness

* create a test state struct

* fmt

* create empty cluster & grid modules for tests

* some TODOs for cluster test suite

* describe test-suite for grid logic

* describe request test suite

* fix seconding-limit bug

* Remove extraneous `pub`

This somehow made it into my clippy PR.

* Fix some test compile warnings

* Remove some unneeded `allow`s

* adapt some new test helpers from Marcin

* add helper for activating a gossip topology

* add utility for signing statements

* helpers for connecting/disconnecting peers

* round out network utilities

* fmt

* fix bug in initializing validator-meta

* fix compilation

* implement first cluster test

* TODOs for incoming request tests

* Remove unneeded `make_committed_candidate` helper

* fmt

* Hook up request sender

* Add `valid_statement_without_prior_seconded_is_ignored` test

* Fix `valid_statement_without_prior_seconded_is_ignored` test

* some more tests for cluster

* add a TODO about grid senders

* integrate inbound req/res into test harness

* polish off initial cluster test suite

* keep introduce candidate request

* fix tests after introduce candidate request

* fmt

* Add grid protocol to module docs

* Remove obsolete test

* Fix comments

* Test `backed_in_path_only: true`

* Update node/network/protocol/src/lib.rs

Co-authored-by: Chris Sosnin <48099298+slumber@users.noreply.github.com>

* Update node/network/protocol/src/request_response/mod.rs

Co-authored-by: Chris Sosnin <48099298+slumber@users.noreply.github.com>

* Mark receiver with `vstaging`

* First draft of `ensure_seconding_limit_is_respected` test

* validate grid senders based on manifest kind

* fix mask_seconded/valid

* fix unwanted-mask check

* fix build

* resolve todo on leaf mode

* Unify protocol naming to vstaging

* Fix `ensure_seconding_limit_is_respected` test

* Start `backed_candidate_leads_to_advertisement` test

* fmt, fix grid test after topology change

* Send Backed notification

* Finish `backed_candidate_leads_to_advertisement` test

* Finish `peer_reported_for_duplicate_statements` test

* Finish `received_advertisement_before_confirmation_leads_to_request`

* Add `advertisements_rejected_from_incorrect_peers` test

* Add `manifest_rejected_*` tests

* Add `manifest_rejected_when_group_does_not_match_para` test

* Add `local_node_sanity_checks_incoming_requests` test

* Add `local_node_respects_statement_mask` test

* Add tests where peer is reported for providing invalid signatures

* Add `cluster_peer_allowed_to_send_incomplete_statements` test

* Add `received_advertisement_after_backing_leads_to_acknowledgement`

* Add `received_advertisement_after_confirmation_before_backing` test

* peer_reported_for_advertisement_conflicting_with_confirmed_candidate

* Add `peer_reported_for_not_enough_statements` test

* Add `peer_reported_for_providing_statements_meant_to_be_masked_out`

* Add `additional_statements_are_shared_after_manifest_exchange`

* Add `grid_statements_imported_to_backing` test

* Add `relay_parent_entering_peer_view_leads_to_advertisement` test

* Add `advertisement_not_re_sent_when_peer_re_enters_view` test

* Update node/network/statement-distribution/src/vstaging/tests/grid.rs

Co-authored-by: asynchronous rob <rphmeier@gmail.com>

* Resolve TODOs, update test

* Address unused code

* Add check after every test for unhandled requests

* Refactor (`make_dummy_leaf` and `handle_sent_request`)

* Refactor (`make_dummy_topology`)

* Minor refactor

---------

Co-authored-by: Robert Habermeier <rphmeier@gmail.com>
Co-authored-by: Chris Sosnin <48099298+slumber@users.noreply.github.com>
Co-authored-by: Chris Sosnin <chris125_@live.com>
Copy link
Contributor

@BradleyOlson64 BradleyOlson64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An interesting read! Found a couple nits that would make things more clear to me.

mrcnski and others added 2 commits March 31, 2023 03:42
…tion.md

Co-authored-by: Bradley Olson <34992650+BradleyOlson64@users.noreply.github.com>
…tion.md

Co-authored-by: Bradley Olson <34992650+BradleyOlson64@users.noreply.github.com>
@paritytech-cicd-pr
Copy link

The CI pipeline was cancelled due to failure one of the required jobs.
Job name: test-linux-stable
Logs: https://gitlab.parity.io/parity/mirrors/polkadot/-/jobs/2624048

@mrcnski mrcnski changed the title Async backing: impl guide: Collator Generation impl guide: Collator Generation May 18, 2023
@mrcnski mrcnski marked this pull request as ready for review May 18, 2023 13:00
@mrcnski mrcnski changed the base branch from rh-async-backing-feature to master May 18, 2023 13:01
@mrcnski mrcnski requested review from a team and chevdor as code owners May 18, 2023 13:01
@mrcnski mrcnski added A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit. labels May 18, 2023
@mrcnski
Copy link
Contributor Author

mrcnski commented May 18, 2023

Closing, will raise a new PR without the "With Async Backing" section.

@mrcnski mrcnski closed this May 18, 2023
@mrcnski mrcnski deleted the mrcnski/async-backing-impl-guide-collator-generation branch May 28, 2023 11:48
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants