Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simulation test: out of gas error for some simulation #369

Closed
lumtis opened this issue Oct 27, 2021 · 11 comments · Fixed by #743
Closed

Simulation test: out of gas error for some simulation #369

lumtis opened this issue Oct 27, 2021 · 11 comments · Fixed by #743
Assignees
Labels
test Improve tests

Comments

@lumtis
Copy link
Contributor

lumtis commented Oct 27, 2021

Describe the bug
When setting values for FlagNumBlocksValue and FlagBlockSizeValue flags, the simulation test can fail with the error out of gas in location: WritePerByte; gasWanted: 1000000, gasUsed: 1000373: out of gas
While it particularly happen when setting high values, the error can sometimes occurs for random low values. Therefore, there might be a flaw in our configuration for simulation tests

To Reproduce
Set following values in app/simulation_test.go

simapp.FlagNumBlocksValue = 500
simapp.FlagBlockSizeValue = 100
Simulating... block 50/500, operation 0/273.    simulate.go:295: error on block  50/500, operation (14/80) from x/staking:
        out of gas in location: WritePerByte; gasWanted: 1000000, gasUsed: 1000373: out of gas
@giunatale
Copy link
Contributor

I think this error is due to this value https://github.com/cosmos/cosmos-sdk/blob/master/simapp/helpers/test_helpers.go#L17
being used here https://github.com/cosmos/cosmos-sdk/blob/master/x/simulation/util.go#L107 and other places.

We can't override it being a constant, so the best option is to use a custom GenAndDeliverTxWithRandFees that internally uses a GenAndDeliverTx with a different value for gas.

However, I've seen this error arise also for SDK internal modules (e.g. staking) where we can't really do much.

Thoughts?

@giunatale
Copy link
Contributor

giunatale commented Apr 14, 2022

For instance, doing ignite chain simulate --seed 7 from the test/campaign-sim-improvements branch yields:

Starting the simulation from time Tue Dec 31 14:21:40 UTC 9935 (unixtime 251382579700)
Simulating... block 149/200, operation 0/73.--- FAIL: BenchmarkSimulation
    simulate.go:300: error on block  149/200, operation (27/58) from x/staking:
        out of gas in location: WritePerByte; gasWanted: 1000000, gasUsed: 1000080: out of gas [cosmos/cosmos-sdk@v0.45.2/baseapp/recovery.go:55]
        Comment: unable to deliver tx
FAIL

Which is coming from x/staking.

EDIT: I've noticed you also have seen this error coming from x/staking and not one of our modules, as per your first post @lubtd

@giunatale
Copy link
Contributor

cosmos/cosmos-sdk#9463
JayB may know more about this

@jaybxyz
Copy link

jaybxyz commented Apr 15, 2022

@giunatale @lubtd That issue was created a while ago, so I had to trace it down for a bit. If I remember it correctly, it occurs due to the usage of denomination other than sdk.DefaultBondDenom when spending fees for a transaction (I don't know if this is the case for the spn module). A quick fix was to increase gas, but since it is dependent on Cosmos SDK we had to resolve the issue by having a custom function that returns random amount of fee with only default bond denom. See this function and where it is being used in operations.

From our team's experience of implementing simulation tests for other modules, we decide to have custom GenAndDeliverTx and GenAndDeliverTxWithFees functions so that we can pass whatever gas and fees we desire to be spent for a transaction.

Reference the following code lines:

I hope this helps you in resolving the out of gas in location issue.

@lumtis
Copy link
Contributor Author

lumtis commented Apr 19, 2022

Thanks for the information @kogisin !

@giunatale do you think you can tackle this task?

@giunatale
Copy link
Contributor

Sounds good, it was in my plan to take it on

@giunatale giunatale self-assigned this Apr 19, 2022
@giunatale
Copy link
Contributor

giunatale commented Apr 21, 2022

Unfortunately using a custom GenAndDeliverTx and GenAndDeliverTxWithFees functions for our modules did not solve the issue.

This is because we are having this error come up from x/staking, systematically (and it seems to be always this the source of the error). I need to investigate more, but right now it seems we either patch the SDK or we cannot get past this.
From the simulation log

{"entry_kind":"msg","height":269,"order":111,"operation":{"route":"staking","name":"begin_unbonding","comment":"unable to deliver tx","ok":false,"msg":null}}

It seems begin_unbonding is the culprit. It appears to be always this what causes the out of gas issue, from all my simulations.

Will investigate more but this might require more time than expected.

@lumtis
Copy link
Contributor Author

lumtis commented Apr 21, 2022

Let's continue investing on tomorrow, otherwise, we can put it aside for now. Most of the time we can perform the full non-determism simulation test without encountering the issue. I considered this task as a priority because it seemed originally that the error occurred everytime

@giunatale
Copy link
Contributor

So after some further investigation, I've come to similar conclusions as per cosmos/cosmos-sdk#9463. And, unfortunately, although we can mitigate this issue we will not be able to fully solve it unless we patch the cosmos-sdk.

The out of gas error appears to happen in both undelegate and redelegate simulations in x/staking. In particular, gas consumption increases dramatically in the k.BeforeDelegationSharesModified hook in Unbond, that is called by both messages.
The culprit for this extreme gas consumption is the k.withdrawDelegationRewards function from the x/distribution staking hooks and in particular the k.bankKeeper.SendCoinsFromModuleToAccount call when claimable rewards include a large amount of sdk.Coins (since it accesses and writes the KVStore one time for each sdk.Coin). Such a scenario happens due the fact that RandomFees, used in GenAndDeliverTxWithRandFees as well as in pretty much all cosmos-sdk modules (example) selects a random permutation from the account's balance therefore fee-based rewards for accounts end up being a large number of different denoms. This problem happens for us because we mint a large amount of sdk.Coins as vouchers.

An example auto-claimed rewards for the account that created the out of gas error with a begin_redelegate message.

315366877798318auction,1098869727896940stake,2v/10/bar,1v/10/foo,4v/10/toto,3v/12/bar,5v/12/foo,3v/12/toto,1v/13/bar,3v/15/bar,10v/15/foo,1v/15/toto,6v/16/bar,4v/16/foo,6v/16/toto,12v/2/bar,9v/2/foo,9v/2/toto,4v/20/foo,4v/20/toto,2v/21/bar,2v/21/foo,2v/21/toto,1v/23/bar,2v/23/foo,1v/25/bar,2v/25/toto,3v/29/bar,2v/29/toto,3v/3/bar,2v/3/toto,1v/35/bar,1v/4/bar,1v/4/foo,4v/41/toto,1v/42/toto,1v/43/foo,3v/43/toto,3v/48/bar,2v/48/toto,1v/5/bar,5v/6/bar,4v/6/foo,4v/6/toto,7v/7/bar,8v/7/foo,5v/7/toto,5v/8/foo,5v/8/toto,1v/9/foo,1v/9/toto

For this instance gas consumption was about 870000+ already after k.BeforeDelegationSharesModified and about 60% of this high gas consumption happens in k.withdrawDelegationRewards > k.bankKeeper.SendCoinsFromModuleToAccount. As comparison, for other begin_redelegate that happens earlier in the simulation, where the number of different denoms accumulated as rewards is less, gas consumption is around 100000-400000.

Now, this problem can be mitigated if at least for our modules' simulations we just use the sdk.DefaultBondDenom for random fees, so we don't increase fee-based rewards with other denoms. However, since modules from the cosmos-sdk will still use random coins for fees, this problem will eventually happen for certain simulations. This is inevitable, especially for longer, beefier simulations. I think though we can safely discard these simulations when they fail for out of gas in x/staking, knowing what is the root cause (and knowing it does not really affect negatively the functionality of the chain).

I will do a PR changing how we use fees for our modules, but I can't do anything for the cosmos-sdk internals. This problem is specific for chains where simulation ends up creating (and thus accumulating as fee-based rewards) a lot of different denoms, and addressing this directly in the comsos-sdk would mean either changing the logic of RandomFees or some more invasive changes to the simulation config, both of which IMHO would be easily rejected. This PR will tentatively close this issue, as I think we will need to live with it. I've also found some other improvements I can make that I will include in this PR.

@lumtis
Copy link
Contributor Author

lumtis commented Apr 22, 2022

I will do a PR changing how we use fees for our modules, but I can't do anything for the cosmos-sdk internals.

Sounds good👍
We can see how simulation will be affected.

(and knowing it does not really affect negatively the functionality of the chain)

The issue is that it prevents us from doing large simulation

@giunatale
Copy link
Contributor

giunatale commented Apr 22, 2022

The issue is that it prevents us from doing large simulation

I know, but our hands are tied. Only realistic option would be to used a forked cosmos-sdk. Unless we make all vouchers non-sendable in simulations, so they don't end up being used as fees. But I'm not even sure this would work

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test Improve tests
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants