Support multi-store upgrades that need to happen outside of BeginBlock #7
Comments
I think the problem is that the store upgrades should be written when halting the old binary, but are not known until the new binary is produced. The upgrade handler is only present in the new binary, which makes it a suboptimal place to store them. I would propose adding a field to the SoftwareUpgradeProposal to store the storetypes.StoreUpgrades; then, when the old app hits the halt height (and prints the log message and info to stdout), it will also write the upgrade-needed.json file to disk. |
Given that we have multi-week voting periods, we do not want to enforce the requirement that the binary and store upgrades are all known in advance of voting. The new binary should have full knowledge of what is needed to perform the upgrade, and the upgrade plan should require zero knowledge of the upgrade, although that info can be specified when available. Otherwise, social consensus can be used to coordinate the new binary. I have edited the proposed design above with a clearer specification. Please see the text in bold above. |
The new app cannot even mount the store until the store upgrades have been executed. You cannot even read the version of the store safely without digging into root multistore internals. The simple load returns an error (it panicked until my PR). You have to do this before the new binary touches the store, meaning it would do it blind. |
I think we're going to need to find a way. We would do this in the method of the root multistore initialization, not once it's mounted. Upgrading before the new binary is incorrect behavior because it would make a rollback impossible. The only correct behavior is in the transaction of the upgrade block in the new binary.
|
@ethanfrey I am not clear whether or not the So I would say there are two options:
Either way, I think it would be possible to use this signature: SetUpgradeHandler(name string, upgradeHandler types.UpgradeHandler, storeUpgrades []storetypes.StoreUpgrades), which allows the new binary to be the sole source of truth about what happens in an upgrade. In terms of implementation, I think we could simply test whether or not applying upgrades is transactional, and if it isn't, then we just need to create or document a |
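To make the proposed signature a bit more concrete, here is a minimal, self-contained sketch of a keeper that records store upgrades next to the handler. All types here (storeRename, storeUpgrades, upgradeHandler, keeper, storeUpgradesFor) are local stand-ins invented for illustration, not the SDK's actual types; only the three-argument shape of SetUpgradeHandler is taken from the comment above.

```go
package main

import "fmt"

// storeRename and storeUpgrades are hypothetical stand-ins for the
// storetypes.StoreRename / storetypes.StoreUpgrades mentioned in the thread.
type storeRename struct {
	OldKey string
	NewKey string
}

type storeUpgrades struct {
	Renamed []storeRename
	Deleted []string
}

// upgradeHandler is a simplified handler type (assumption: the real one
// receives a context and a plan).
type upgradeHandler func(planName string)

type registration struct {
	handler  upgradeHandler
	upgrades []storeUpgrades
}

// keeper keeps one registration per upgrade name.
type keeper struct {
	registered map[string]registration
}

func newKeeper() *keeper {
	return &keeper{registered: make(map[string]registration)}
}

// SetUpgradeHandler follows the signature proposed above: the store
// upgrades travel with the handler, so the new binary is the only place
// that defines them.
func (k *keeper) SetUpgradeHandler(name string, h upgradeHandler, ups []storeUpgrades) {
	k.registered[name] = registration{handler: h, upgrades: ups}
}

// storeUpgradesFor is a hypothetical lookup a store loader could call
// before the multi-store is mounted.
func (k *keeper) storeUpgradesFor(name string) ([]storeUpgrades, bool) {
	reg, ok := k.registered[name]
	return reg.upgrades, ok
}

func main() {
	k := newKeeper()
	k.SetUpgradeHandler("v2", func(planName string) {
		fmt.Println("running in-block migration for", planName)
	}, []storeUpgrades{{
		Renamed: []storeRename{{OldKey: "foo", NewKey: "bar"}},
		Deleted: []string{"not_needed"},
	}})

	ups, ok := k.storeUpgradesFor("v2")
	fmt.Println(ok, len(ups), ups[0].Deleted)
}
```

The point of the sketch is only that the lookup is keyed by upgrade name, so whatever runs before the store is mounted needs the plan name from somewhere outside the store.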
You are correct that It does have significant gas cost - we should double-check the infinite gas meter is set for these operations. The proposed change to The first to call And a second call to set up the migrations (anywhere in the same init function, after baseapp is created, but before the store is loaded):
storeMigrations := StoreLoaderWithUpgrade(&storetypes.StoreUpgrades{
    Renamed: []storetypes.StoreRename{{
        OldKey: "foo",
        NewKey: "bar",
    }},
    Deleted: []string{"not_needed"},
})
app.SetStoreLoader(storeMigrations)
I understand your concern about relying on a json file on disk that is not strongly tied to the binary itself. This approach would tie it to a binary. And the upgrade BeginBlocker will panic (avoiding committing the migration) if this binary is launched before the plan is active due to the |
One issue with this approach is that it may run multiple times. Say I make a binary v2 that contains a new handler and a new migration. I then launch it after the planned upgrade time and it does the proper thing. Awesome. A week later, I restart the binary and it will then attempt to re-run the The original design of yours was to dump the json file when the old binary panicked and use that when loading, deleting it after use. This is implemented and would work unless the user messes with those files manually. I would note that the following statements leave your intention less than clear:
My understanding is that the So we could try to read the file and extract a height, something like:
if height, err := readUpgradeInfo(homeDir); err == nil {
    storeMigrations := StoreLoaderWithHeightLimitedUpgrade(height, &storetypes.StoreUpgrades{ /* ... */ })
    app.SetStoreLoader(storeMigrations)
}
and then define a Is that a correct understanding? It does still have a file on disk, but the actual migrations are now defined in the binary rather than in the upgrade plan, which is more secure and testable. |
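For illustration, here is a rough sketch of the two pieces that snippet relies on: decoding the on-disk info and gating the migration on height. The file layout (a JSON object with name and height), the upgradeInfo type, and the helpers parseUpgradeInfo and shouldMigrate are all assumptions made for this example. The comparison also takes "the store version matches the upgrade height" literally; whether the committed version equals the upgrade height or sits one block below it depends on how the halt is implemented.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// upgradeInfo mirrors what the old binary is assumed to dump on halt.
// The field names are hypothetical; the thread only says the file holds
// the upgrade name and height.
type upgradeInfo struct {
	Name   string `json:"name"`
	Height int64  `json:"height"`
}

// parseUpgradeInfo decodes the file contents and rejects a missing or
// non-positive height, so an empty file never triggers a migration.
func parseUpgradeInfo(data []byte) (upgradeInfo, error) {
	var info upgradeInfo
	if err := json.Unmarshal(data, &info); err != nil {
		return upgradeInfo{}, err
	}
	if info.Height <= 0 {
		return upgradeInfo{}, fmt.Errorf("upgrade info has no valid height")
	}
	return info, nil
}

// shouldMigrate reports whether a height-limited loader would apply the
// store upgrades: only while the on-disk store is still at the upgrade
// height, so later restarts of the same binary skip the migration.
func shouldMigrate(info upgradeInfo, storeHeight int64) bool {
	return storeHeight == info.Height
}

func main() {
	info, err := parseUpgradeInfo([]byte(`{"name":"v2","height":100}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(shouldMigrate(info, 100)) // first start after the halt
	fmt.Println(shouldMigrate(info, 107)) // restart a week later
}
```

With a gate like this, the multiple-run concern raised earlier goes away without needing to delete the file, since a restarted v2 binary sees a store height that no longer matches.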
Yes, that's what I was thinking |
@aaronc I think we need a conclusion here in order to implement the solution. My understanding based on the discussion is,
Please confirm if this is what you are thinking and update the proposed design accordingly. |
Thanks for this write-up @anilcse. I agree with 1, 3, and 4. I'm not sure I really understand 2. Would the solution in 2 mean that we don't rely on the upgrade.json file on disk? I'd like to see this proposed solution fleshed out a little bit more. Maybe that's something for @ethanfrey to comment on. |
This is correct. Let me explain 2 a bit more. On panic, the old binary will dump some file to disk, with the upgrade name and height. On start of the new binary, in the app constructor, we (1) This StoreLoader will be called every time the new binary is launched, and we want to ensure it only happens once. We do have access to query the CommitInfo of the root multistore before executing the StoreLoader, so we can read the height of the on-disk store. My proposal is then to code this UpgradeMigrationStoreLoader (or whatever you want to name it), such that it has the logic in v2 to do the v1->v2 migration (as Aaron suggested). But to ensure it is only run once, it checks for the existence of Are we all clear here? Minor point: Other point: |
I think we shouldn't delete |
I couldn't think of a case where the new binary requires information about multiple upgrades. So maybe we can eliminate this case for now. @aaronc what do you think? |
Sure, good idea.
If present and the height matches current block height: Apply the store migration as defined in the (new) binary.
If present and the height is in the past: Delete the file, and do default load.
If present and height is in the future, or not present: Do default load. |
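The three cases above can be written down as a small pure function, which makes them easy to unit test. This is only a sketch: loadAction, its constants, and decideLoad are names made up for the example, and "current block height" is passed in as a plain integer rather than read from a store.

```go
package main

import "fmt"

// loadAction names what the store loader should do on startup.
type loadAction int

const (
	defaultLoad loadAction = iota
	applyMigration
	deleteFileThenDefaultLoad
)

var actionNames = []string{"default load", "apply migration", "delete file, then default load"}

func (a loadAction) String() string { return actionNames[a] }

// decideLoad encodes the three cases from the comment above.
// present says whether the upgrade file exists; upgradeHeight is the
// height stored in it and is ignored when present is false.
func decideLoad(present bool, upgradeHeight, currentHeight int64) loadAction {
	switch {
	case !present:
		return defaultLoad
	case upgradeHeight == currentHeight:
		return applyMigration
	case upgradeHeight < currentHeight:
		return deleteFileThenDefaultLoad
	default: // upgrade height is still in the future
		return defaultLoad
	}
}

func main() {
	fmt.Println(decideLoad(true, 100, 100))
	fmt.Println(decideLoad(true, 100, 150))
	fmt.Println(decideLoad(true, 100, 50))
	fmt.Println(decideLoad(false, 0, 50))
}
```

Keeping the decision separate from the file and store access means the run-once behavior can be tested without a real database.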
@aaronc @ethanfrey how about we initialise upgrade-info.json when the binary runs for the first time, and only update it when a proposal is passed (height and store keys are added) and when the plan is executed (height and store keys are removed)? This would be more in line with how the existing app.toml is handled. I think this is better than creating and deleting the entire file based on upgrade plans. |
So, if I understand correctly, you would ensure the upgrade.json is an existing but more or less empty file at init. After that, you update the file in the upgrade handler when a store migration is planned. Otherwise, on startup:
If present and the height matches current block height: Apply the store migration as defined in the (new) binary.
Otherwise (height != current block height): Do default load. |
Yes, on point. The values of height and storeUpgrades will be empty when initialised. |
Sounds good to me. Simpler is better |
@aaronc do you see any issues with this approach |
@ethanfrey is there a way to get |
Alternate solution I could think of is, we should write the info to same But there will be a problem if user deletes the @aaronc @ethanfrey Any issues with this approach? |
@anilcse quick answer from my phone... I'm pretty sure when running the loader that you have access to root multi store. There is a method like LatestCommit(), which returns the height and hash of the last committed block. I use that for checking height. |
Oh okay. Thanks @ethanfrey . I hope |
@ethanfrey, popped up with following idea while working on the implementation. In that case, we should just get to change
to
What do you think? |
Oops, this doesn't work for time-based upgrades. Missed it totally |
Yes it does. |
@ethanfrey is there a way possible to get $DAEMON_HOME ? Can we do the following? Have
And, setting the rootDir from
|
I don't think you need I would simply read it from the config as needed. This can be done via |
We can use |
My only idea then is to do the same as in the skip-upgrade pr, where the list of heights is parsed in the cli and passed into the keeper on construction. We can pass in |
I think this should not be handled/set at keeper. $HOME should be accessible throughout the app. In later stages as well, if other modules require this path, they shouldn't change keeper implementation. I still feel, server/utils.go is the right place to store/get $HOME. |
Looks like we thought too much about what happens for Case - 1 Case - 2 In either of the cases, height doesn't matter. @aaronc @ethanfrey works? |
@aaronc @ethanfrey I have a question regarding storeUpgrades: in a scenario where a chain breaks a few blocks after the upgrade, don't we need a way to revert the changed stores? This is very unlikely to happen on Cosmos, but let's say there's some bug in transaction x which may not be known until it's executed on the new chain (after 10-20 blocks, maybe).
The most probable decision will be to revert the chain running old binary, and I don't see our upgrade module handling this scenario. Would like to know your take on this. |
As discussed, it might be tough to handle this With this, if P.S: We cannot rollback |
This is exactly the situation we want to avoid. If we manually restart the binary after an upgrade and this causes an error/panic, then we have a problem. The whole design around keeping this height is to allow this to be only applied once, and if it errors, then it really is an error and we should abort the binary and force operator intervention |
All the reads and writes to the database are done on the Thus a panic or other abort that doesn't allow us to reach the commit handler will effectively rollback the state, cuz it was an uncommitted change. |
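As a toy illustration of why an abort before commit leaves the database untouched, here is a sketch of a write buffer over a parent map. It is not the SDK's cache-wrapping code: cacheStore and runBlock are invented for this example, and the recover is only there so the demo can keep running. In a real node the panic would stop the process, with the same effect on the uncommitted writes.

```go
package main

import "fmt"

// cacheStore buffers writes on top of a parent map until Write is called.
type cacheStore struct {
	parent map[string]string
	dirty  map[string]string
}

func newCacheStore(parent map[string]string) *cacheStore {
	return &cacheStore{parent: parent, dirty: make(map[string]string)}
}

// Set records the write in the buffer only.
func (c *cacheStore) Set(key, value string) { c.dirty[key] = value }

// Get reads buffered writes first, then falls back to the parent.
func (c *cacheStore) Get(key string) (string, bool) {
	if v, ok := c.dirty[key]; ok {
		return v, true
	}
	v, ok := c.parent[key]
	return v, ok
}

// Write flushes the buffered writes into the parent (the "commit").
func (c *cacheStore) Write() {
	for k, v := range c.dirty {
		c.parent[k] = v
	}
}

// runBlock runs fn against a fresh cache layer and flushes only if fn
// returns normally. If fn panics, the buffer is simply dropped.
func runBlock(parent map[string]string, fn func(*cacheStore)) (committed bool) {
	cache := newCacheStore(parent)
	defer func() {
		if r := recover(); r != nil {
			committed = false // nothing was flushed to the parent
		}
	}()
	fn(cache)
	cache.Write()
	return true
}

func main() {
	db := map[string]string{"foo": "1"}

	ok := runBlock(db, func(c *cacheStore) {
		c.Set("bar", "2")
		panic("upgrade needed")
	})
	fmt.Println(ok, len(db)) // the failed block left db unchanged

	ok = runBlock(db, func(c *cacheStore) { c.Set("bar", "2") })
	fmt.Println(ok, len(db))
}
```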
Yes. In that case as well, we cannot rollback if some tx got invoked and failed at 10th block after the upgrade. We might be able to rollback the 10th block but eventually it fails on every other tx (of same msg type). It cannot work properly. |
Lemme describe the situation a bit more.
A work around is as mentioned before, need to generate |
You are right, there is no path there. We assumed that a failed migration would error in the first block. Some minor inconsistency would cause issues if we need to deal with it a long time later. For this case, reverting to the old binary is not possible with any upgrade mechanism. The only solution I know of (and which was proposed in Cosmos Validator slacks for their mainnet upgrade) is that if an issue is detected within e.g. 100 blocks, you roll back to the pre-update state and continue with the old binary there. If we want to add this ability for in-place upgrades, we should export a genesis file before performing any upgrade - with or without store migrations. |
That would cause even bigger issues I think. What happens to all the transactions that took place after the upgrade? I doubt if |
Context
The upgrade handler for the upgrade module is able to handle migrations that can happen within BeginBlock. Certain root multi-store migrations need to happen outside of BeginBlock - specifically renaming or deleting store keys. The basic support for doing this was added in cosmos#4724. This functionality needs to be integrated into the upgrade module.
Acceptance Criteria
Given that multi-store StoreUpgrades are needed and an upgrade is happening
When the new binary starts
Then the StoreUpgrades will be performed at the correct upgrade height before the ABCI app starts
Proposed Design
upgrade.Keeper BeginBlock method, write a $DAEMON_HOME/data/upgrade-needed.json file to disk with the upgrade plan serialized at panic time, with the actual upgrade height written in the file in the case of time-based upgrades.
upgrade.Keeper SetUpgradeHandler method to: so that whenever store key renames/deletions are needed they can be registered with the upgrade handler
BaseApp and the multi-store before starting the ABCI app. The multi-store upgrades should only be performed when there is an upgrade-info.json present and the version of the store matches the upgrade height in the json file. This will prevent store upgrades from happening too early.
Notes
BaseApp.UpgradeableStoreLoader likely doesn't do what is required as this would require that the store upgrades are written to disk outside of the binary, presumably by the upgrade module. The actual behavior would likely be that the new binary contains a handler for the desired StoreUpgrades as mentioned above. Likely some hook between BaseApp and the upgrade.Keeper is needed.
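The first step of the proposed design (the old binary writing the plan to disk when it halts) could look roughly like the sketch below. The file contents are an assumption: the design says the serialized plan plus the actual upgrade height, and the sketch reduces that to a hypothetical upgradeInfo with a name and a height. encodeUpgradeInfo and writeUpgradeInfo are made-up helper names, and the demo writes into a temporary directory standing in for $DAEMON_HOME.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// upgradeInfo is a reduced, hypothetical version of the serialized plan.
type upgradeInfo struct {
	Name   string `json:"name"`
	Height int64  `json:"height"`
}

// encodeUpgradeInfo serializes the info. For time-based upgrades the
// caller is expected to pass the height at which the halt actually happened.
func encodeUpgradeInfo(name string, height int64) ([]byte, error) {
	return json.Marshal(upgradeInfo{Name: name, Height: height})
}

// writeUpgradeInfo writes <home>/data/upgrade-needed.json and returns
// the path of the file it wrote.
func writeUpgradeInfo(home, name string, height int64) (string, error) {
	data, err := encodeUpgradeInfo(name, height)
	if err != nil {
		return "", err
	}
	dir := filepath.Join(home, "data")
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return "", err
	}
	path := filepath.Join(dir, "upgrade-needed.json")
	if err := os.WriteFile(path, data, 0o600); err != nil {
		return "", err
	}
	return path, nil
}

func main() {
	// A temporary directory stands in for $DAEMON_HOME in this demo.
	home, err := os.MkdirTemp("", "daemon-home")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(home)

	path, err := writeUpgradeInfo(home, "v2", 100)
	if err != nil {
		panic(err)
	}
	written, err := os.ReadFile(path)
	if err != nil {
		panic(err)
	}
	fmt.Println(filepath.Base(path), string(written))
}
```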