Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adopt heaviest subtree fork choice rule #10441

Merged
merged 14 commits into from
Jun 11, 2020

Conversation

carllin
Copy link
Contributor

@carllin carllin commented Jun 6, 2020

Problem

Old fork selection mechanism was hard to reason about and could cause liveness problems when
interacting with switching proofs: #10343 (comment)

Summary of Changes

  1. Add fork_weight_tracker.rs to keep track of the best subtree
  2. Replace old fork choice rule in replay_stage.rs and consensus.rs

Fixes #

@carllin
Copy link
Contributor Author

carllin commented Jun 6, 2020

@aeyakovenko thinking through the best way to migrate onto this new fork choice rule.

I think the problem with immediate unlocking this feature is if both rules choose different blocks:

                                   A 
                             /           \
             (old best, 50%)  B            C (new best, 50%)

It seems the safest thing to do is a rolling upgrade, which unlocks the new for choice rule when the root is greater than some slot UNLOCK_FORK_CHOICE. I think the switching threshold shouldn't cause problems during the transition because the new fork choice rule will choose a branch that has more stake/enough stake (38%) for a switch.

                 A (UNLOCK_FORK_CHOICE)
              /      \
            B        C (new best, 62%, even if not locked out past block D, eventually voters will)
           /
         D (old best, 38%)

Sucky part is that means I have to reintroduce the old fork choice logic :P

@carllin carllin force-pushed the FixForkSelection branch 4 times, most recently from 21aa1f1 to 0a36b17 Compare June 7, 2020 03:06
@codecov
Copy link

codecov bot commented Jun 7, 2020

Codecov Report

Merging #10441 into master will increase coverage by 0.1%.
The diff coverage is 96.3%.

@@            Coverage Diff            @@
##           master   #10441     +/-   ##
=========================================
+ Coverage    81.6%    81.7%   +0.1%     
=========================================
  Files         296      299      +3     
  Lines       69320    69794    +474     
=========================================
+ Hits        56608    57078    +470     
- Misses      12712    12716      +4     

@carllin carllin force-pushed the FixForkSelection branch 14 times, most recently from da8f8e2 to ed9249b Compare June 9, 2020 01:03
@aeyakovenko
Copy link
Member

i think i need you to walk me through this :)

@carllin carllin force-pushed the FixForkSelection branch 2 times, most recently from b38475c to ed96850 Compare June 9, 2020 20:01
@@ -1812,6 +1751,14 @@ impl ReplayStage {
}
}

pub fn get_unlock_heaviest_subtree_fork_choice(operating_mode: OperatingMode) -> Slot {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mvines, fyi, one more of our favorite rolling upgrades 😄 . This one should be done before any of the switching proof features are enabled.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sg, just tag me again when this lands so I can manage the rollout as clusters upgrade to 1.3

@mvines
Copy link
Member

mvines commented Jun 11, 2020

@carllin - I removed the v1.2 tag, this appears way to risky for v1.2. This can ride the v1.3 train

@aeyakovenko
Copy link
Member

@carllin - I removed the v1.2 tag, this appears way to risky for v1.2. This can ride the v1.3 train

we would really like to test this in 1.2, otherwise the whole optimistic conf train gets pushed back a month.

@mvines
Copy link
Member

mvines commented Jun 11, 2020

The code churn in here is huge.

@mvines
Copy link
Member

mvines commented Jun 11, 2020

The problem is that optimistic conf basically missed the 1.2 train. It's a feature that didn't come in by the branch date, or even a day or two later. Ramming in a ton of new code weeks after the branch date isn't a good way to stabilize a release.

@mvines mvines added the v1.2 label Jun 11, 2020
@mvines
Copy link
Member

mvines commented Jun 11, 2020

v1.2 label restored. I'm still quite nervous about enabling this in the v1.2 timeframe, but given that this can be landed disabled I'll 🙈 and 🙏 for the best. We can discuss a timeline for actually enabling on v1.2 once testnet upgrades to v1.2 and is running well

@carllin carllin merged commit 2e1d59f into solana-labs:master Jun 11, 2020
mergify bot pushed a commit that referenced this pull request Jun 11, 2020
* Add HeaviestSubtreeForkChoice

* Make replay stage switch between two fork choice rules

Co-authored-by: Carl <carl@solana.com>
(cherry picked from commit 2e1d59f)

# Conflicts:
#	core/src/consensus.rs
#	core/src/replay_stage.rs
carllin added a commit that referenced this pull request Jun 12, 2020
* Add HeaviestSubtreeForkChoice

* Make replay stage switch between two fork choice rules

Co-authored-by: Carl <carl@solana.com>
carllin added a commit that referenced this pull request Jun 12, 2020
* Add HeaviestSubtreeForkChoice

* Make replay stage switch between two fork choice rules

Co-authored-by: Carl <carl@solana.com>
solana-grimes pushed a commit that referenced this pull request Jun 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants