Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Cosmos upgrade handler calls swingset #7994

Merged
merged 4 commits into from
Jul 12, 2023
Merged

Conversation

JimLarson
Copy link
Contributor

@JimLarson JimLarson commented Jun 29, 2023

refs: #7992
closes: #7151

Description

Allow Swingset to perform upgrade-time actions by making a new bridge action that contains the upgrade plan that is called by the Cosmos upgrade handler. The upgrade plan's Info field is a string that by convention contains a JSON-serialized object. We can add additional fields to that object for swingset to use.

Security Considerations

The Swingset upgrade handler runs with significant authority, so it must be written carefully.

Scaling Considerations

The upgrade plan cannot be too large as it must be assayable.

Documentation Considerations

N/A

Testing Considerations

Swingset upgrade handlers must be tested in a full upgrade test.

@JimLarson JimLarson added enhancement New feature or request cosmic-swingset package: cosmic-swingset agoric-cosmos labels Jun 29, 2023
@JimLarson JimLarson requested a review from michaelfig June 29, 2023 21:53
@JimLarson JimLarson self-assigned this Jun 29, 2023
Copy link
Member

@mhofman mhofman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe in our discussions we wanted to avoid a bridge message sent during the upgrade handler. This may indeed work, but I'd like to think more about the design (hence the request for changes)

The upgrade plan cannot be too large as it must be assayable.

Could you clarify what you mean?

The Swingset upgrade handler runs with significant authority, so it must be written carefully.

Could you details how this authority is different from other swingset invocations from golang?

func swingsetUpgrade(ctx sdk.Context, plan upgradetypes.Plan, callToController func(ctx sdk.Context, str string) (string, error)) error {
action := swingsetUpgradeAction{
Type: enactUpgradePlanType,
Plan: plan,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the whole plan be sent over? The info may contain information unrelated to swingset, maybe it'd be better to reserve a specific field for the swingset module?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conventional Cosmos use of the Info field is pretty small, just containing a reference to the exact commit. See the "Plan" section of https://bigdipper.live/agoric/proposals/34

If upgrade proposal plans become larger in the future I'd be happy to restrict the information sent to swingset, but for now I don't think there's a need.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exposing the entire plan is the best choice, IMO.

@JimLarson
Copy link
Contributor Author

By "assayable", I mean that a human voter must be able to look at the raw plan and decide whether it implements the actions described in the prose of the upgrade proposal. For example: https://bigdipper.live/agoric/proposals/34

@JimLarson
Copy link
Contributor Author

The security caution only means to say that we cannot really tell ahead of time what sort of authority an upgrade handler might not need, so they need to be written just as carefully as blockingSend() is.

@michaelfig
Copy link
Member

I believe in our discussions we wanted to avoid a bridge message sent during the upgrade handler. This may indeed work, but I'd like to think more about the design (hence the request for changes)

I think that this is okay, as long as it can be handled before initializing the kernel. If that's the case, I think it is the best and most flexible design.

@michaelfig
Copy link
Member

Here is the stack trace from the failed CI:

goroutine 69 [running]:
runtime/debug.Stack()
        /usr/local/go/src/runtime/debug/stack.go:24 +0x65
github.com/tendermint/tendermint/consensus.(*State).receiveRoutine.func2()
        /go/pkg/mod/github.com/agoric-labs/tendermint@v0.34.23-alpha.agoric.3/consensus/state.go:724 +0x4c
panic({0x7f7492c36cc0, 0x7f7493c03530})
        /usr/local/go/src/runtime/panic.go:884 +0x213
github.com/cosmos/cosmos-sdk/x/upgrade.BeginBlocker({{0xc000d7c8f0, 0xd}, 0xc001110d20, {0x7f7492e806d8, 0xc001140810}, {0x7f7492ea1180, 0xc000d84030}, 0xc0011b18c0, {0x7f7492e7ab58, 0xc001064540}, ...}, ...)
        /go/pkg/mod/github.com/agoric-labs/cosmos-sdk@v0.45.11-alpha.agoric.3/x/upgrade/abci.go:38 +0x1014
github.com/cosmos/cosmos-sdk/x/upgrade.AppModule.BeginBlock(...)
        /go/pkg/mod/github.com/agoric-labs/cosmos-sdk@v0.45.11-alpha.agoric.3/x/upgrade/module.go:130
github.com/cosmos/cosmos-sdk/types/module.(*Manager).BeginBlock(_, {{0x7f7492e93ae8, 0xc000132030}, {0x7f7492ea2f60, 0xc001e0f780}, {{0xb, 0x0}, {0xc000c94e00, 0xb}, 0x29b, ...}, ...}, ...)
        /go/pkg/mod/github.com/agoric-labs/cosmos-sdk@v0.45.11-alpha.agoric.3/types/module/module.go:491 +0x1fb
github.com/Agoric/agoric-sdk/golang/cosmos/app.(*GaiaApp).BeginBlocker(...)
        /usr/src/agoric-sdk/golang/cosmos/app/app.go:888
github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).BeginBlock(_, {{0xc0010bce40, 0x20, 0x20}, {{0xb, 0x0}, {0xc000c94e00, 0xb}, 0x29b, {0xe24203e, ...}, ...}, ...})
        /go/pkg/mod/github.com/agoric-labs/cosmos-sdk@v0.45.11-alpha.agoric.3/baseapp/abci.go:177 +0x97b
github.com/tendermint/tendermint/abci/client.(*committingClient).BeginBlockSync(_, {{0xc0010bce40, 0x20, 0x20}, {{0xb, 0x0}, {0xc000c94e00, 0xb}, 0x29b, {0xe24203e, ...}, ...}, ...})
        /go/pkg/mod/github.com/agoric-labs/tendermint@v0.34.23-alpha.agoric.3/abci/client/committing_client.go:332 +0x1a4
github.com/tendermint/tendermint/proxy.(*appConnConsensus).BeginBlockSync(_, {{0xc0010bce40, 0x20, 0x20}, {{0xb, 0x0}, {0xc000c94e00, 0xb}, 0x29b, {0xe24203e, ...}, ...}, ...})
        /go/pkg/mod/github.com/agoric-labs/tendermint@v0.34.23-alpha.agoric.3/proxy/app_conn.go:81 +0x55
github.com/tendermint/tendermint/state.execBlockOnProxyApp({0x7f7492e94990?, 0xc000d7ff80}, {0x7f7492e9b340, 0xc001208930}, 0xc0000e7c20, {0x7f7492ea2558, 0xc002366018}, 0x29a?)
        /go/pkg/mod/github.com/agoric-labs/tendermint@v0.34.23-alpha.agoric.3/state/execution.go:307 +0x51d
github.com/tendermint/tendermint/state.(*BlockExecutor).ApplyBlock(_, {{{0xb, 0x0}, {0xc002c18c88, 0x8}}, {0xc002c18ca0, 0xb}, 0x1, 0x29a, {{0xc002c6af00, ...}, ...}, ...}, ...)
        /go/pkg/mod/github.com/agoric-labs/tendermint@v0.34.23-alpha.agoric.3/state/execution.go:140 +0x171
github.com/tendermint/tendermint/consensus.(*State).finalizeCommit(0xc001055c00, 0x29b)
        /go/pkg/mod/github.com/agoric-labs/tendermint@v0.34.23-alpha.agoric.3/consensus/state.go:1654 +0xafd
github.com/tendermint/tendermint/consensus.(*State).tryFinalizeCommit(0xc001055c00, 0x29b)
        /go/pkg/mod/github.com/agoric-labs/tendermint@v0.34.23-alpha.agoric.3/consensus/state.go:1563 +0x2ff
github.com/tendermint/tendermint/consensus.(*State).enterCommit.func1()
        /go/pkg/mod/github.com/agoric-labs/tendermint@v0.34.23-alpha.agoric.3/consensus/state.go:1498 +0x94
github.com/tendermint/tendermint/consensus.(*State).enterCommit(0xc001055c00, 0x29b, 0x0)
        /go/pkg/mod/github.com/agoric-labs/tendermint@v0.34.23-alpha.agoric.3/consensus/state.go:1536 +0xccf
github.com/tendermint/tendermint/consensus.(*State).addVote(0xc001055c00, 0xc000f35e00, {0x0, 0x0})
        /go/pkg/mod/github.com/agoric-labs/tendermint@v0.34.23-alpha.agoric.3/consensus/state.go:2150 +0x18dc
github.com/tendermint/tendermint/consensus.(*State).tryAddVote(0xc001055c00, 0xc000f35e00, {0x0?, 0x7f7490dbdfd4?})
        /go/pkg/mod/github.com/agoric-labs/tendermint@v0.34.23-alpha.agoric.3/consensus/state.go:1948 +0x2c
github.com/tendermint/tendermint/consensus.(*State).handleMsg(0xc001055c00, {{0x7f7492e7ba98, 0xc00012d620}, {0x0, 0x0}})
        /go/pkg/mod/github.com/agoric-labs/tendermint@v0.34.23-alpha.agoric.3/consensus/state.go:853 +0x3ff
github.com/tendermint/tendermint/consensus.(*State).receiveRoutine(0xc001055c00, 0x0)
        /go/pkg/mod/github.com/agoric-labs/tendermint@v0.34.23-alpha.agoric.3/consensus/state.go:780 +0x4dc
created by github.com/tendermint/tendermint/consensus.(*State).OnStart
        /go/pkg/mod/github.com/agoric-labs/tendermint@v0.34.23-alpha.agoric.3/consensus/state.go:379 +0x12d

Copy link
Member

@michaelfig michaelfig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the design, but something is wrong with the implementation causing a crash.

func swingsetUpgrade(ctx sdk.Context, plan upgradetypes.Plan, callToController func(ctx sdk.Context, str string) (string, error)) error {
action := swingsetUpgradeAction{
Type: enactUpgradePlanType,
Plan: plan,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exposing the entire plan is the best choice, IMO.

// business.
stdlog.Println("Rebooting SwingSet")
return mvm, swingset.BootSwingset(ctx, app.SwingSetKeeper)
// Lastly, let Swingset reaction to the upgrade
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Lastly, let Swingset reaction to the upgrade
// Lastly, let SwingSet react to the upgrade

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@JimLarson
Copy link
Contributor Author

The previous failure doesn't recur. I think Raph's fix did it.

Oh, I forgot to give a shout-out to @raphdev for #7996.

@@ -853,6 +836,7 @@ func (app *GaiaApp) MustInitController(ctx sdk.Context) {
VibcPort: app.vibcPort,
VbankPort: app.vbankPort,
LienPort: app.lienPort,
UpgradePlan: app.upgradePlan,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, placing the upgradePlan in the init message actually makes sense. I have plans to refactor the toSwingSet in chain-main.js which would make this a lot easier and straightforward to handle too!

@mhofman
Copy link
Member

mhofman commented Jul 11, 2023

I believe in our discussions we wanted to avoid a bridge message sent during the upgrade handler. This may indeed work, but I'd like to think more about the design (hence the request for changes)

I think that this is okay, as long as it can be handled before initializing the kernel. If that's the case, I think it is the best and most flexible design.

As we discussed offline, the ability to handle before initializing the kernel was the sticking point if we were going to take the blocking send approach. The PR now allows us to do this, in a clever way by adding the upgrade plan to the init message.

Note for future self: the JS side of cosmic-swingset will need to be careful about hangover inconsistency. It's possible for JS to perform its upgrade steps, commit the swing-store, then for the cosmos side to fail and not commit the cosmos DB. On restart it would result in the init message containing the upgrade plan again. cosmic-swingset needs to detect this case and skip the upgrade work (similarly to how chain sends are simply replayed for normal block processing), at least until we have a better story about swing-store rollbacks (#7963).

@JimLarson, to clarify, the upgrade plan includes the upgrade name, so the cosmic-swingset JS side can use that info if needed?

@JimLarson
Copy link
Contributor Author

Correct - the plan contains the upgrade name. Note that the upgrade Info field will contain a JSON-serialized object which can contain interesting stuff - but it must be explicitly unpacked.

@JimLarson JimLarson added this pull request to the merge queue Jul 12, 2023
Merged via the queue into master with commit bcca849 Jul 12, 2023
@JimLarson JimLarson deleted the 7992-enact-upgrade-plan branch July 12, 2023 02:13
mhofman pushed a commit that referenced this pull request Aug 7, 2023
feat: Cosmos upgrade handler calls swingset
mhofman pushed a commit that referenced this pull request Aug 7, 2023
feat: Cosmos upgrade handler calls swingset
mhofman pushed a commit that referenced this pull request Aug 16, 2023
feat: Cosmos upgrade handler calls swingset
mhofman pushed a commit that referenced this pull request Aug 16, 2023
feat: Cosmos upgrade handler calls swingset
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
agoric-cosmos cosmic-swingset package: cosmic-swingset enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement upgrade-11 handler
3 participants