Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

collator-protocol: Handle unknown validator heads #5538

Merged
merged 7 commits into from
Sep 2, 2024

Conversation

bkchr
Copy link
Member

@bkchr bkchr commented Aug 30, 2024

There is a race condition when a validator sends its heads to the collator, but the collator doesn't yet know these heads. Before it is aware of these heads by importing the block(s), any collation registered on the collator is not announced to the validators. The collations aren't advertised, because the collator doesn't know yet that these heads of the validator are descendants of the collations relay parent.

The solution is to store these unknown heads of the validators and to handle them when the collator updates its own view.

There is a race condition when a validator sends its heads to the collator,
but the collator doesn't yet know these heads. Before it is aware of these heads
by importing the block(s), any collation registered on the collator is not announced
to the validators. The collations aren't advertised, because the collator doesn't know
yet that these heads of the validator are descendants of the collations relay parent.

The solution is to store these unknown heads of the validators and to handle them when
the collator updates its own view.
@bkchr bkchr added the T8-polkadot This PR/Issue is related to/affects the Polkadot network. label Aug 30, 2024
@bkchr bkchr requested a review from alexggh August 30, 2024 20:44
Copy link
Contributor

@alexggh alexggh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me and good catch.

Did you catch any scenarios when this make a difference?

Right now we have two paths when we advertise a collation:

  1. When we create the collation.
  2. When peer sends its view update.

We can't really have a collation if we don't know about the leaf, so I would have expected we try at least once to advertise the collation.

But, then the scenario you describe can happen at the next relay chain leaf, but we should've already advertised the collation, wouldn't we ? And advertise collation seems to have logic to not advertise a collation more than once to the same peer.

@bkchr
Copy link
Member Author

bkchr commented Sep 2, 2024

Did you catch any scenarios when this make a difference?

Yes. When I was building this, I was running constantly into this issue.

For this to work I was building on an older relay chain block and the collation was most of the time always announced internally in this small window before the collator had imported these new blocks.

@bkchr bkchr requested a review from sandreim September 2, 2024 08:25
Copy link
Contributor

@lexnv lexnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM based on my limited knowledge of this area 👍


// Announce relevant collations to these peers.
for (peer_id, peer_version) in &peers {
advertise_collation(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the race condition, but I still don't get why the collations are not advertised when calling distribute_collation adter we built on a known relay parent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite strange for local testing, but in practice it could happen that the collation is not advertised for other reasons. The collator has to be really unlucky.

@bkchr bkchr enabled auto-merge September 2, 2024 10:00
@paritytech-cicd-pr
Copy link

The CI pipeline was cancelled due to failure one of the required jobs.
Job name: test-linux-stable 3/3
Logs: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/7213465

@bkchr bkchr added this pull request to the merge queue Sep 2, 2024
Merged via the queue into master with commit f58e2b8 Sep 2, 2024
184 of 185 checks passed
@bkchr bkchr deleted the bkchr-collator-protocol-view-mismatch branch September 2, 2024 14:44
x3c41a pushed a commit that referenced this pull request Sep 4, 2024
There is a race condition when a validator sends its heads to the
collator, but the collator doesn't yet know these heads. Before it is
aware of these heads by importing the block(s), any collation registered
on the collator is not announced to the validators. The collations
aren't advertised, because the collator doesn't know yet that these
heads of the validator are descendants of the collations relay parent.

The solution is to store these unknown heads of the validators and to
handle them when the collator updates its own view.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T8-polkadot This PR/Issue is related to/affects the Polkadot network.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants