CIP-0047? | Hardfork safety mechanism #318

JaredCorduan
Contributor

This CIP replaces a manual safety check regarding the readiness of the network for a hardfork with an automatic check.

Ever since the Shelley ledger era, block headers have included a protocol version indicating the maximum protocol version that the block producer is capable of supporting (see section 13, Software Updates, of the Shelley ledger specification).

This (semantically meaningless) field in the header provides a helpful metric for determining how many blocks will be produced after a hardfork, since nodes that have not upgraded will no longer produce blocks. (Nodes that have not upgraded will fail the chainChecks check from Figure 74 of the Shelley ledger specification, since the major protocol version in the ledger state will exceed the node's max major protocol version value, and hence can no longer make blocks.)
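
To make the parenthetical concrete, here is a minimal Python sketch of just the protocol-version condition described above. The real ledger is written in Haskell and chainChecks covers more than this one comparison; the function and argument names are illustrative assumptions, not identifiers from the specification.

```python
def can_produce_blocks(ledger_major: int, node_max_major: int) -> bool:
    """Return True if a node can still make valid blocks.

    Illustrative only: models the condition described in the text, where
    a node is shut out once the major protocol version in the ledger
    state exceeds the node's own maximum supported major version.
    """
    return ledger_major <= node_max_major
```

For example, a node whose maximum major version is 8 keeps producing blocks while the ledger is at major version 7 or 8, and stops once the ledger moves to 9.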

If most of the blocks in the recent past (e.g. the last epoch) are broadcasting their readiness for a hardfork, we know that it is safe to propose an update to the major protocol version which triggers a hardfork.

This CIP proposes automating this process, and making the protocol version in the header semantically meaningful. The ledger state will determine the stake (represented as the proportion of the active stake) of all the block producers whose last block contained the next major protocol version. Moreover, a new protocol parameter hfThreshold will be used to reject any protocol parameter update that proposes to change the major protocol version but does not have enough backing stake.
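
As a rough illustration of the stake computation described above, here is a minimal Python sketch. It is not the ledger implementation (which is in Haskell); the names `endorsing_stake_fraction`, `active_stake`, and `last_header_major` are hypothetical, and the stake figures in the usage note are made up.

```python
from fractions import Fraction

def endorsing_stake_fraction(active_stake, last_header_major, next_major):
    """Proportion of active stake held by pools endorsing next_major.

    active_stake:      dict of pool id -> active stake (hypothetical units)
    last_header_major: dict of pool id -> major protocol version found in
                       the header of that pool's most recent block
    next_major:        the major version a hardfork would move to

    Pools that have not produced a block are treated as not endorsing
    (an assumption of this sketch).
    """
    total = sum(active_stake.values())
    if total == 0:
        return Fraction(0)
    endorsing = sum(
        stake
        for pool, stake in active_stake.items()
        if last_header_major.get(pool) == next_major
    )
    return Fraction(endorsing, total)
```

With stake `{"poolA": 50, "poolB": 30, "poolC": 20}` and last header versions `{"poolA": 8, "poolB": 8, "poolC": 7}`, the fraction endorsing major version 8 is 4/5. Under this proposal that value would then be compared against the threshold parameter.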

@JaredCorduan JaredCorduan changed the title from "First draft of hardfork safety mechanism" to "Hardfork safety mechanism" Aug 19, 2022
Collaborator

@rphair rphair left a comment

I am 100% behind this, pending the other editors being satisfied with the technical details. It represents a huge improvement in the decentralisation of network control by the governance key holders: a challenging problem to say the least. From my own point of view I can say that stake pool operators will feel positively about being included in the governance process as described. 😎

@KtorZ KtorZ changed the title from "Hardfork safety mechanism" to "CIP-0047? | Hardfork safety mechanism" Aug 20, 2022
@kieransimkin
Contributor

kieransimkin commented Aug 20, 2022

This gets my support. Maybe it should be combined with a bit of a rethink of the release tagging process on GitHub too? I'd like to see a Dev + SPO community vote before a release candidate is tagged.


### New Protocol Parameter

There will be a new protocol parameter named `hfThreshhold`,
Contributor

Note that making hfThreshhold a protocol parameter means that the quorum required for a change to protocol parameters has to be equal to or greater than the requirement to trigger a hardfork.
Otherwise, you could first make a protocol parameter vote to lower hfThreshhold and then do the hardfork afterwards.

Contributor Author

I completely get and empathize with this concern, but it does come at the cost of extra complexity (to an otherwise pretty bare-bones change). I'm not sure I understand what "has to be equal to or greater than the requirement" means (the quorum and the threshold have different units, they are not comparable), but I can guess that you would like the SPOs to be able to signal support for changing the hfThreshhold (the way the protocol version is done today). This would require a change to the block header, and we'd need to think about how it would be configured. The broadcast protocol version is baked into the node, and requires a software update to change (which is natural, given that it relates to a fork). Would we require a software upgrade to endorse a change to the hfThreshhold, or make it configurable?


Without addressing this concern, it makes hfThreshhold more like all the other protocol parameters that also greatly affect the SPOs, like k, the fees, etc. The governance structure has to openly, on chain for everyone to see, declare that a change is being made. And it is the SPOs who ultimately run the network. Decentralized governance is a huge topic that we'll be getting into next, but in the short term we can think about where to draw the line on small changes that we could implement sooner rather than later. My bias is toward being as minimal as possible, but I'm happy for folks to disagree!

Intuitively, hfThreshold has a different feel than any other parameter. To me, it resembles an article of a constitution. Because of that, it would require a larger majority to get modified.

Without a protective mechanism (hfThreshold2?), its value could go down over time, and there is surely a point where it starts getting dangerously low.


### Tracking Hardfork endorsements

The ledger state will maintain a set of stake pool IDs corresponding to the
Contributor

@SebastienGllmt SebastienGllmt Aug 20, 2022

One problem this doesn't address is what if you have competing proposals? The scheme described in this CIP only allows for sequential votes. The way this CIP is structured, I don't think this even allows you to propose two competing upgrades one after the other, because the current structure doesn't have an expiry on upgrade proposals (other than maybe skipping version numbers if a version is deemed to have failed).

Contributor Author

There are strict rules on how the protocol version can be increased:

https://github.com/input-output-hk/cardano-ledger/blob/7e2f674d2a2d14752d4c2d5abf60b26ae015b9e2/eras/shelley/impl/src/Cardano/Ledger/Shelley/PParams.hs#L519-L520

(m + 1, 0) == (m', n') || (m, n + 1) == (m', n')

Either the major is increased by exactly one (and the minor reset to zero), or the minor is increased by exactly one (and the major remains unchanged).
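
For readers who do not want to open the Haskell source, the rule quoted above can be restated as a small Python sketch. The function name `is_valid_successor` is made up for illustration; only the condition itself comes from the linked code.

```python
def is_valid_successor(current, proposed):
    """Check the protocol version increment rule quoted above.

    current and proposed are (major, minor) pairs. The proposed version
    is legal only if it bumps the major by exactly one and resets the
    minor to zero, or bumps the minor by exactly one and keeps the major.
    """
    m, n = current
    return proposed == (m + 1, 0) or proposed == (m, n + 1)
```

So from (7, 2) the only legal next versions are (8, 0) and (7, 3); something like (9, 0) or (8, 1) is rejected, which is why a header broadcasting major version m+1 can only ever refer to one hardfork.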

Moreover, this proposal is only putting in a safeguard for hardforks (major number increase). So it is always clear what the broadcasted protocol version in the block header is referring to.

Maybe this is only clear if I also explain how the existing protocol parameter update system works? During the voting window, each governance key can propose a change (they can submit multiple proposals, but the latest one overrides the previous) for the end of the current epoch. If quorum is met, the change happens, otherwise nothing happens and the voting state resets. After the voting window, each governance key can stage a vote for the next epoch, which behaves exactly as though they waited until the next epoch and placed a vote during the next voting window.


> current structure doesn't have an expiry on upgrade proposals

I think I've explained this above as well. The current structure has harsh expirations. Or am I misunderstanding what you meant?

Contributor Author

I've added a sentence to the end of this paragraph, let me know if it's clear.

I don't think you actually addressed the real question - and this is related to my comment above.
Who gets to make such proposals? Let's say entity A wants to make a change to parameter X and therefore submits a proposal to change the protocol version to (m+1, 0) - and entity B wants to leave parameter X unchanged, but wants to change parameter Y instead. They also submit a proposal. Which protocol version would be assigned to that? How would a determination be made as to which of the changes to proceed with? How would the SPOs indicate which of the proposals they endorse?

Contributor Author

> Let's say entity A wants to make a change to parameter X and therefore submits a proposal to change the protocol version to (m+1, 0) - and entity B wants to leave parameter X unchanged, but wants to change parameter Y instead. They also submit a proposal. Which protocol version would be assigned to that?

I guess there are two issues here:

  • how the protocol parameter update system works
  • how stake pool operators endorse a hard fork

The answer to the first question is:

Whether or not the governance system will move to change the protocol version to (m+1, 0) depends not just on A and B, but also on the other five governance entities (the quorum is 5 of 7 on mainnet). The current system is very basic: at least five of the keys must agree on the entire set of changes. So if entities A - G all want to change Y to 42, but only four of the entities want to change the protocol version to (m+1, 0), nothing is changed at all, not even Y.
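
The all-or-nothing behaviour in that example can be sketched in a few lines of Python. This is only an illustration of the rule as described in this comment, not the ledger's actual code; `adopted_update` and the parameter keys `"Y"` and `"protocolVersion"` are hypothetical names.

```python
from collections import Counter

def adopted_update(proposals, quorum=5):
    """Return the change set backed by at least `quorum` keys, else None.

    proposals: dict of governance key id -> dict of proposed changes
               (only the latest proposal per key, as described above).
    Keys must agree on the ENTIRE change set for their votes to count
    together; partial overlap does not help.
    """
    tally = Counter(frozenset(changes.items()) for changes in proposals.values())
    for change_set, votes in tally.items():
        if votes >= quorum:
            return dict(change_set)
    return None

# Entities A-G all want Y = 42, but only four also want the version bump.
proposals = {key: {"Y": 42, "protocolVersion": (8, 0)} for key in "ABCD"}
proposals.update({key: {"Y": 42} for key in "EFG"})
result = adopted_update(proposals)  # None: neither change set has 5 backers
```

Because the four keys and the three keys submitted different change sets, neither reaches the quorum of five, so nothing changes, not even Y.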

The answer to the second question is:

Regardless of what the governance body is doing, if you are a stake pool operator and you are aware of a software update that prepares a hard fork, let's say introducing protocol version (m+1, 0), you can:

  • signal your willingness to enact the hardfork by placing m+1 in your block headers (the new software will actually do this for you)
  • signal your unwillingness to enact the hardfork by placing m in your block headers (the old software will actually do this for you)

If not enough stake backs the block producers posting m+1, no update proposal that changes the major version to m+1 can succeed, even if quorum is met and even if other protocol parameters were also slated to change.
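
Putting the two answers together, the proposed gate could look roughly like the following Python sketch. All names here are hypothetical, and whether the comparison against the threshold is strict or inclusive is an assumption of the sketch rather than something this thread settles.

```python
from fractions import Fraction

def update_accepted(current_major, proposed_major, quorum_met,
                    endorsing_fraction, hardfork_threshold):
    """Decide whether an update proposal takes effect (illustrative).

    quorum_met:         True if enough governance keys agree on the proposal
    endorsing_fraction: share of active stake whose pools last posted
                        current_major + 1 in their block headers
    hardfork_threshold: minimum share required for a major version bump
                        (inclusive comparison is an assumption)
    """
    if not quorum_met:
        return False
    is_hardfork = proposed_major == current_major + 1
    if is_hardfork and endorsing_fraction < hardfork_threshold:
        # The whole proposal is rejected, including any other parameter changes.
        return False
    return True
```

With a threshold of 3/4, a quorum-backed proposal to go from major version 7 to 8 is rejected at 3/5 endorsing stake and accepted at 4/5, while a proposal that leaves the major version alone is unaffected by the endorsement level.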


I can try to summarize this in the CIP, since clearly it's still not clear.

Contributor Author

I've tried again to make this point more clear. Let me know if it is still murky.

What happens if 20% of SPOs upgrade, and then a bug is found? 50% of the originally upgraded SPOs then install the new fixed major release, and enough other SPOs do as well, so the hard fork is successful. But 10% of the SPOs still have the buggy pre-fix release running, and it has the same version number?

Contributor Author

That's an excellent point @WarriorField, and I'm embarrassed I did not think to address it in this CIP. It's far from just a theoretical concern: this came up quite recently, and this is what we did:

https://github.com/input-output-hk/cardano-node/blame/8832f86728ef6a425452b44f2f269acde149448c/cardano-node/src/Cardano/Node/Protocol/Cardano.hs#L201-L206

We fiddled with the minor version. And of course this only worked since the process is still manual. This CIP should address this, thank you!

@michael-liesenfelt
Contributor

If the threshold is a parameter, community voting could lower a super-majority to a simple majority. To maintain the security of this parameter it should not be subject to simple majority governance. It may be wise to require a hard fork, not a vote, to change the hard fork threshold. Maybe it becomes a global constant instead of a parameter?

On programming style I oppose the abbreviation legacy of punch cards and tapes. We are not memory space or character limited. It is ideal to make all software less 'coded' and more human readable. Type out the words and use: hardfork_threshold.

@beriardas

Great proposal

@JaredCorduan
Contributor Author

> If the threshold is a parameter, community voting could lower a super-majority to a simple majority. To maintain the security of this parameter it should not be subject to simple majority governance. It may be wise to require a hard fork, not a vote, to change the hard fork threshold. Maybe it becomes a global constant instead of a parameter?

If we decide that we prefer to make this value hard to change, hard-coding it is a nice and simple solution. I'll just reiterate that nearly all of the protocol parameters are capable of having a dramatic change to the network, and that what we really need is a holistic solution for governing all of them.

> On programming style I oppose the abbreviation legacy of punch cards and tapes. We are not memory space or character limited. It is ideal to make all software less 'coded' and more human readable. Type out the words and use: hardfork_threshold.

I agree that spelling it out is a much better idea. I picked up this habit after a few years of writing LaTeX specs for the Cardano ledger, where horizontal space is often at a premium. If we are discussing variable names in code, however, you should know that there is a very strong convention in Haskell to use camel case.

@michael-liesenfelt
Contributor

> If we decide that we prefer to make this value hard to change, hard-coding it is a nice and simple solution. I'll just reiterate that nearly all of the protocol parameters are capable of having a dramatic change to the network, and that what we really need is a holistic solution for governing all of them.

Agreed. For this parameter, hardforkThreshold = 0.75 with a hard-coded range bound of ( 0.51 , 0.95 ) would be wise. It would be very wise to hard-code values of a safe reasonable minimum and a safe reasonable maximum for all of the parameters. Governance could be used to adjust the parameter within the safe range and only a hardfork could change the bounds per parameter.

@JaredCorduan
Contributor Author

> Agreed. For this parameter, hardforkThreshold = 0.75 with a hard-coded range bound of ( 0.51 , 0.95 ) would be wise.

I think this is an excellent suggestion! Make it a protocol parameter that can be changed, but restrict the range to safe values.
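
A minimal Python sketch of that idea, using the numbers suggested above, might look like this. The constant and function names are hypothetical, and treating the bounds as inclusive is an assumption, since the suggestion does not say whether the endpoints are allowed.

```python
from fractions import Fraction

# Hard-coded bounds taken from the suggestion above; changing them would
# require a hardfork rather than a parameter update.
HARDFORK_THRESHOLD_MIN = Fraction(51, 100)
HARDFORK_THRESHOLD_MAX = Fraction(95, 100)

def threshold_update_allowed(new_value):
    """Accept a proposed hardforkThreshold only if it stays in the safe range.

    Inclusive bounds are an assumption of this sketch.
    """
    return HARDFORK_THRESHOLD_MIN <= new_value <= HARDFORK_THRESHOLD_MAX
```

Governance could then move the value to, say, 3/4 or 4/5, but a proposal to lower it to 1/2 would be rejected outright.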

* tweak intro
* change parameter name to hardforkThreshold
* bounds on hardforkThreshold values
* make clear that protocol version endorsement is not ambiguous

@dirkhh-cf dirkhh-cf left a comment

I really like the direction this is going, but I have some questions... 😁


## Copyright

This CIP is licensed under Apache-2.0

I know this is a silly nit-pick, but you have a license under the Copyright header - and that license is inconsistent with the one you list in line 8 above.

Contributor Author

Thank you for catching this! I will ask about what I am supposed to use, I admit I just copied those from an existing CIP created by an IOG employee.

Speaking with my "I am not a lawyer but I have spent way too much time working on licenses" hat on: I'd suggest going with the license in line 8 (CC-BY-4.0), as that is a great license for documents. Apache v2 is a very good choice for code (as it makes it easy to reuse and clarifies patent concerns), but doesn't really make a lot of sense if you apply it to documents.

Contributor Author

done!

whose last block contained the next major protocol version.
Moreover, a new protocol parameter `hardforkThreshold` will be used to reject any
protocol parameter update that proposes to change the major protocol version
but does not have enough backing stake.

Can you clarify what "automating the process" means in this context?
I can see in your description how this could be used to prevent a hardfork if there is insufficient backing - but what isn't explained is how such a hardfork would be initiated. Who could make such a change proposal? Is that also automated? Is it anyone who can produce blocks (so any SPO for example)? Or just some specific entity / entities?

Contributor Author

Let me see if I can explain it well enough here, and if it makes sense I'll add it to the CIP.

The current manual process is: around the time of a hard fork, humans use a tool like db-sync to see which stake pools are posting blocks with the current major protocol version in the block header, and which ones have the current major protocol version plus one. If the holders of the governance keys decide "yea, looks good", and they are otherwise ok with the hard fork (I do not myself know all that goes into the operational side of this, and am not proposing automating anything besides this single check), then they submit update proposals. If 5 of 7 agree on the exact proposals, the change occurs on the epoch boundary.
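
To give a feel for what that manual inspection amounts to, here is a small Python sketch that groups pools by the major version in their most recent block header. It does not query db-sync; the input format, the function name `pools_by_latest_version`, and the pool ids are all made up for illustration.

```python
def pools_by_latest_version(blocks):
    """Group pool ids by the header major version of their latest block.

    blocks: iterable of (pool_id, header_major) pairs in chain order,
            e.g. extracted from recent blocks by some external tool.
    """
    latest = {}
    for pool_id, header_major in blocks:
        latest[pool_id] = header_major  # later blocks overwrite earlier ones
    grouped = {}
    for pool_id, header_major in latest.items():
        grouped.setdefault(header_major, set()).add(pool_id)
    return grouped
```

Given blocks from poolA (7), poolB (8), poolA again (8), and poolC (7), the result shows poolA and poolB on major version 8 and poolC still on 7, which is the kind of picture the key holders look at today.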

Yup, that makes total sense... I guess what I asked for was some clarification on what exactly is meant by "automating the process" - so in the brave new world, how would this work? Is it still one of the seven proposing, at least five of seven agreeing, but THEN (after that) there is ALSO the technical requirement proposed here?
Or does this change the first part of the process as well, i.e. can others propose that change? Who needs to approve before this technical requirement comes into play?

I am thinking (but could be totally wrong) that right now you are only focused on the negative case, i.e. this CIP only deals with creating a hurdle that prevents a change and doesn't try to change any other part of the process. But I think this could be somewhat clearer in this paragraph of the explanation.

Contributor Author

> I am thinking (but could be totally wrong) that right now you are only focused on the negative case, i.e. this CIP only deals with creating a hurdle that prevents a change and doesn't try to change any other part of the process.

That is exactly right! This CIP does not intend to change the off-chain process in any way. It's just a final blocker to stop a hard fork that is not properly endorsed by the SPOs.

Contributor Author

I just realized I lied 😆

> This CIP does not intend to change the off-chain process in any way.

This CIP intends to lift one very specific off-chain process to the protocol itself, namely looking through block headers to see who is signaling their readiness.

Contributor Author

I've tried to make this clear now.


The safeguard presented in this CIP aligns very closely with the manual check
currently performed
today before any hardfork.
Moreover, we have strived to make the minimal changes needed to automate

This is actually only part of the criteria currently being used: SPO block creation ratio, DeFi TVL criterion, exchange adoption criterion. I'm not saying that the other two should be encoded here - but it would be reasonable to mention them in the Rationale.

Contributor Author

Maybe I can make it more clear that this CIP is only aiming to automate one very specific check? I myself do not know the whole process.

Thanks for that clarification. Yes, that would be very appreciated. I clearly read way too much into what you were intending to change.

* consistently use CC-BY-4.0 for the license
* make clear that only one specific aspect of the off-chain hardfork
  process is being automated
* make clear that the endorsements are unambiguous
@JaredCorduan
Contributor Author

I am going to close this PR, since it is replaced by CIP-1694. yea?

@rphair
Collaborator

rphair commented Nov 18, 2022

that makes sense @JaredCorduan but can you please add this URL https://github.com/cardano-foundation/CIPs/pull/318 to the Discussions: header of your draft in #380? (unless you'd like to declare it irrelevant)

@JaredCorduan
Contributor Author

> that makes sense @JaredCorduan but can you please add this URL https://github.com/cardano-foundation/CIPs/pull/318 to the Discussions: header of your draft in #380? (unless you'd like to declare it irrelevant)

I think I'd prefer to declare it irrelevant, since all the discussion here was about the nitty-gritty details of the mechanism that we are abandoning. But I'll happily defer to y'all's guidance!

@michael-liesenfelt
Contributor

I agree.
I don't believe there are any good ideas in CIP-0047 which haven't been incorporated into the scope of CIP-1694.

@KtorZ KtorZ closed this Nov 25, 2022
@KtorZ KtorZ added the "State: Likely Deprecated" label (close if confirmed deprecated, or long waiting) and removed the "Candidate CIP" label Nov 25, 2022