
Spec without peer sampling #3870

Merged: 9 commits into ethereum:dev on Aug 12, 2024

Conversation

@fradamt (Contributor) commented on Aug 7, 2024

A first spec for this proposed simplification of the first version of PeerDAS. This includes:

  1. Splitting the peer sampling part of the spec into a separate file, peer-sampling.md
  2. Specifying "subnet sampling" as participating in max(custody_subnet_count, SAMPLES_PER_SLOT) subnets, only custody_subnet_count of which are advertised. custody_subnet_count is required to be at least CUSTODY_REQUIREMENT, which is less than SAMPLES_PER_SLOT, so only a portion of the subnet sampling is enforced by peers.
  3. A simple fork-choice spec: is_data_available is entirely based on subnet sampling, meaning that something is available as long as all subnet sampling columns (again, a superset of custody) are available. This is used as a fork-choice filter in get_head rather than as a block import filter in on_block. We always import blocks regardless of whether they are available, but we never follow unavailable branches in the fork-choice. Possibly moving this to its own separate PR and keeping the Deneb-style fork-choice (don't import unavailable blocks) for now, see Spec without peer sampling #3870 (comment)
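
For illustration, here is a minimal Python sketch of points 2 and 3 (the constant values are examples only, and `get_sampling_subnet_count` / `ColumnStore` are hypothetical names for this sketch, not the spec's actual helpers):

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Illustrative constants mirroring the description above (example values only).
CUSTODY_REQUIREMENT = 4
SAMPLES_PER_SLOT = 8


def get_sampling_subnet_count(custody_subnet_count: int) -> int:
    """Number of subnets a node participates in for subnet sampling.

    Only `custody_subnet_count` of these are advertised, and it must be
    at least CUSTODY_REQUIREMENT.
    """
    assert custody_subnet_count >= CUSTODY_REQUIREMENT
    return max(custody_subnet_count, SAMPLES_PER_SLOT)


@dataclass
class ColumnStore:
    """Hypothetical store of data column sidecars received per block root."""
    columns_by_root: Dict[bytes, Set[int]] = field(default_factory=dict)

    def is_data_available(self, block_root: bytes, sampling_columns: Set[int]) -> bool:
        # A block counts as available iff every subnet-sampled column
        # (a superset of the custodied columns) has been received.
        received = self.columns_by_root.get(block_root, set())
        return sampling_columns <= received
```

In this reading, a node always samples at least `SAMPLES_PER_SLOT` subnets, while peers can only enforce the `custody_subnet_count` it advertises.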

@jtraglia added the EIP-7594 PeerDAS label on Aug 7, 2024
fradamt and others added 3 commits August 7, 2024 16:34
Co-authored-by: Justin Traglia <95511699+jtraglia@users.noreply.github.com>
Co-authored-by: Justin Traglia <95511699+jtraglia@users.noreply.github.com>
@jtraglia (Member) left a comment

LGTM 👍 I would like another review before merging this.

@nisdas (Contributor) commented on Aug 8, 2024

> A simple fork-choice spec: is_data_available is entirely based on subnet sampling, meaning that something is available as long as all subnet sampling columns (again, a superset of custody) are available. This is used as a fork-choice filter in get_head rather than as a block import filter in on_block. We always import blocks regardless of whether they are available, but we never follow unavailable branches in the fork-choice.

What is the reason for changing this? Is there any reason we can't hold off on importing blocks until we verify that all custodied data is there? Importing unavailable blocks into the fork choice and the db will require some special-case handling.

@fradamt (Contributor, Author) commented on Aug 8, 2024

> A simple fork-choice spec: is_data_available is entirely based on subnet sampling, meaning that something is available as long as all subnet sampling columns (again, a superset of custody) are available. This is used as a fork-choice filter in get_head rather than as a block import filter in on_block. We always import blocks regardless of whether they are available, but we never follow unavailable branches in the fork-choice.

> What is the reason for changing this? Is there any reason we can't hold off on importing blocks until we verify that all custodied data is there? Importing unavailable blocks into the fork choice and the db will require some special-case handling.

Here's what I don't like about the current behavior, applied to a system with sharded distribution:

  1. At slot n there's an uncontroversial block A, with a lot of voting weight
  2. At slot n+1, someone builds B on top of A, and only makes some of the columns available
  3. Some validators in the slot n+1 committee vote for B, because they see it as available
  4. Anyone that does not see B as available (and hasn't imported it) will ignore those votes as well, even though they should at the very least count for A

Basically, A ends up missing out on voting weight through no fault of its own. Of course, if B is actually not available (< 50% of columns available), the amount of "missed weight" would be low thanks to sampling. And if it is available, reconstruction should quickly ensure that everyone imports it and sees all the weight for A. Therefore, we could also go with the simpler option of just not importing, at a small price.

In principle I struggle to see why not importing should be simpler than this other solution, so I would prefer what I see as the more principled approach. On the other hand, given that we already have the not-importing approach implemented, there's little harm in keeping it and leaving it to a separate decision whether to switch to the fork-choice filtering approach at some point. I might revert to the original approach in this PR and open another PR for the fork-choice change only.
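
To make the trade-off concrete, here is a rough, self-contained sketch of the fork-choice-filter behavior being argued for (all names and data structures are illustrative, not the actual spec helpers): unavailable blocks are still imported, so votes for them propagate weight to their available ancestors, but `get_head` never selects an unavailable block.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class BlockInfo:
    parent: Optional[str]  # parent block root (None for the anchor)
    weight: int            # attestation weight voting directly for this block
    available: bool        # outcome of subnet sampling for this block


def get_weight(blocks: Dict[str, BlockInfo], root: str) -> int:
    # A block's weight includes votes for all of its descendants,
    # even descendants that are themselves unavailable.
    children = [r for r, b in blocks.items() if b.parent == root]
    return blocks[root].weight + sum(get_weight(blocks, c) for c in children)


def get_head(blocks: Dict[str, BlockInfo], anchor: str) -> str:
    # Fork-choice filter: never walk into an unavailable block, but do not
    # discard it either -- its votes still count toward available ancestors.
    head = anchor
    while True:
        viable = [r for r, b in blocks.items() if b.parent == head and b.available]
        if not viable:
            return head
        head = max(viable, key=lambda r: (get_weight(blocks, r), r))


# The scenario above: A (slot n) is available, B (slot n+1, child of A) is not.
blocks = {
    "A": BlockInfo(parent=None, weight=100, available=True),
    "B": BlockInfo(parent="A", weight=30, available=False),
}
assert get_weight(blocks, "A") == 130  # votes for B still add weight to A
assert get_head(blocks, "A") == "A"    # but B can never be chosen as head
```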

@hwwhww mentioned this pull request on Aug 9, 2024
@ralexstokes (Member) left a comment

great work! I like the simplification; if anything, it just makes the implementation process a bit more manageable, with smaller chunks

@ralexstokes (Member) commented
going to merge for now to facilitate development targets for peerdas implementers

let's address outstanding concerns in future PRs :)

@ralexstokes merged commit 13ac373 into ethereum:dev on Aug 12, 2024
26 checks passed
@nisdas (Contributor) commented on Aug 14, 2024

I realize this reply is late:

> In principle I struggle to see why not importing should be simpler than this other solution, so I would prefer what I see as the more principled approach. On the other hand, given that we already have the not-importing approach implemented, there's little harm in keeping it and leaving it to a separate decision whether to switch to the fork-choice filtering approach at some point. I might revert to the original approach in this PR and open another PR for the fork-choice change only.

This is more of an issue with how clients handle invalid blocks; I view unavailable blocks as one class of them. For us, if a block is invalid (it fails either consensus or execution validation), it is never inserted into the db or the fork choice. In the event of optimistic sync, such a block would subsequently be removed from the db. If we start inserting blocks into the db that we have determined are unavailable, and therefore invalid, it would require a lot of downstream changes (e.g. the API) to handle this class of blocks.
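
For contrast, here is a minimal sketch of the "don't import unavailable blocks" behavior described above, under simplified assumptions (`Store`, `SignedBlock`, `is_data_available`, and `on_block` are stand-ins for this sketch, not the actual spec code): the availability check acts as an import gate, so unavailable blocks never reach the db or the fork choice in the first place.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class SignedBlock:
    root: bytes


@dataclass
class Store:
    blocks: Dict[bytes, SignedBlock] = field(default_factory=dict)
    available_roots: Set[bytes] = field(default_factory=set)


def is_data_available(store: Store, block_root: bytes) -> bool:
    # Stand-in for the real availability check (all custodied/sampled
    # columns for this block have been received).
    return block_root in store.available_roots


def on_block(store: Store, signed_block: SignedBlock) -> None:
    # ... signature and state-transition checks elided ...

    # Import gate: refuse to import until the data is available. The block
    # may be queued and retried once data arrives, but it never reaches the
    # db or fork choice in an unavailable state, so no downstream
    # special-casing (API, pruning, ...) is needed.
    assert is_data_available(store, signed_block.root)

    store.blocks[signed_block.root] = signed_block
```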
