Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: Transaction submission panics due to unavailability of BTC config #117

Closed
vitsalis opened this issue Sep 6, 2022 · 3 comments · Fixed by #121
Closed

bug: Transaction submission panics due to unavailability of BTC config #117

vitsalis opened this issue Sep 6, 2022 · 3 comments · Fixed by #121
Labels
bug Something isn't working

Comments

@vitsalis
Copy link
Member

vitsalis commented Sep 6, 2022

Summary of Bug

Submitting a transaction to the btclightclient module leads to a panic in ValidateBasic due to btcConfig being nil.

The BTC config is initialized whenever a new Babylon app instance is created . However, when the insert-header command is executed, there is a call to GenerateOrBroadcastTxCli. This in turn leads to a call to ValidateBasic for the provided message, which then tries to access the powLimit. However, the btcConfig is nil in this case which leads to a panic.

I think that this happens because ValidateBasic is executed in a different thread in which the btcConfig is never initialized. Below you can find reproduction steps. @KonradStaniec thoughts?

Version

main branch

Steps to Reproduce

  1. Build
$ make build
  1. Create the testnet files
$ ./build/babylond testnet --v 1 --output-dir ./.testnet \
    --starting-ip-address 192.168.10.2 --keyring-backend test --chain-id chain-test
  1. Start the chain
$ ./build/babylond start --home ./.testnet/node0/babylond
  1. Submit a header that builds on top of the base BTC header
$ ./build/babylond tx btclightclient insert-header "0000402047ab81c8fdc66b2cc9e69144f7410f92bc5388211cbf02000000000000000000a9ab776b7736b0f5c84f696baf2618b09b2408d171f4e5fc2049e083a71bfcf34afd7c62ba0109177fdb6aec" --keyring-backend test --from node0 --home .testnet/node0/babylond --chain-id chain-test --fees 1000stake --broadcast-mode block

The above steps lead to the following stack trace

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x0 pc=0x1034f221c]

goroutine 1 [running]:
github.com/babylonchain/babylon/types.(*BtcConfig).PowLimit(...)
        github.com/babylonchain/babylon/types/btc_config.go:97
github.com/babylonchain/babylon/types.GetGlobalPowLimit(...)
        github.com/babylonchain/babylon/types/btc_config.go:109
github.com/babylonchain/babylon/x/btclightclient/types.(*MsgInsertHeader).ValidateBasic(0x1400000ecf0)
        github.com/babylonchain/babylon/x/btclightclient/types/msgs.go:27 +0x4c
github.com/cosmos/cosmos-sdk/client/tx.GenerateOrBroadcastTxWithFactory({{0x14000041ea8, 0x14, 0x18}, {0x104435e38, 0x1400124ecc0}, {0x16d6bb798, 0xa}, {0x10442a8a0, 0x14000f47b60}, {0x104433140, ...}, ...}, ...)
        github.com/cosmos/cosmos-sdk@v0.45.4/client/tx/tx.go:41 +0x104
github.com/cosmos/cosmos-sdk/client/tx.GenerateOrBroadcastTxCLI({{0x14000041ea8, 0x14, 0x18}, {0x104435e38, 0x1400124ecc0}, {0x16d6bb798, 0xa}, {0x10442a8a0, 0x14000f47b60}, {0x104433140, ...}, ...}, ...)
        github.com/cosmos/cosmos-sdk@v0.45.4/client/tx/tx.go:30 +0xb4
github.com/babylonchain/babylon/x/btclightclient/client/cli.CmdTxInsertHeader.func1(0x14001211900?, {0x14001242f70, 0x1, 0xd?})
        github.com/babylonchain/babylon/x/btclightclient/client/cli/tx.go:43 +0x150
github.com/spf13/cobra.(*Command).execute(0x14001211900, {0x14001242ea0, 0xd, 0xd})
        github.com/spf13/cobra@v1.4.0/command.go:856 +0x4c4
github.com/spf13/cobra.(*Command).ExecuteC(0x14000f27680)
        github.com/spf13/cobra@v1.4.0/command.go:974 +0x354
github.com/spf13/cobra.(*Command).Execute(...)
        github.com/spf13/cobra@v1.4.0/command.go:902
github.com/spf13/cobra.(*Command).ExecuteContext(...)
        github.com/spf13/cobra@v1.4.0/command.go:895
github.com/cosmos/cosmos-sdk/server/cmd.Execute(0x14000070768?, {0x14000ce9420, 0x19})
        github.com/cosmos/cosmos-sdk@v0.45.4/server/cmd/execute.go:36 +0x1b0
main.main()
        github.com/babylonchain/babylon/cmd/babylond/main.go:18 +0x44
@vitsalis vitsalis added the bug Something isn't working label Sep 6, 2022
@KonradStaniec
Copy link
Contributor

I think I know what is going on.

It looks like each command is basically different app (process) i.e when you have run ./build/babylond start --home ./.testnet/node0/babylond then the global is initialised correctly in the node application BabylonApp

When you run tx command this is different process , which broadcast tx to node app. This tx app knows nothing about any global initialisation so when it tries to use it in ValidateBasic it is still nil.

There are two solutions I see to this problem:

  • move initialisation of global config up the chain, somewhere near root command initialisation. Maybe to place like sdk.GetConfig(), but ours is a bit different as we take our values from config files. So it maybe a bit ugly.
  • remove any checks that require state (powLimit and babylontag) from ValidateBasic. The way we using it now is a bit hacky i.e those supposed to be stateless checks but we are reading the state from config and using it there. There maybe a way to add those check in anteHandler if necessary. This is a bit more principled solution i.e working in bounds which cosmos-sdk provided for us.

I think at this point I would go for second one and get rid of this hacky global, as for demo/mvp/testnet we will not worry too much about headers with invalid pow in mempool (or invalid tags), and either way those with bad values would fail during tx execution and senders would pay for it. wdyt ?

@vitsalis
Copy link
Member Author

vitsalis commented Sep 7, 2022

I don't think that the work check can be considered a stateful check since we do not expect the state to change to a different PoW limit. Overall, I do not entirely like the idea of removing functionality that we will need in the end for the demo purposes. However, if we use an AnteHandler that's another thing. Will we have access to the configuration from that? Also, it may be worth investigating what exact changes would the first option require.

@KonradStaniec
Copy link
Contributor

so:

  1. Agree that definition of state is a bit blurry here, but ultimately from pov of ValidateBasic everything which is not in message it self is considered state, so powLimit and babylon tag falls under this definition.
  2. If we will need it in the end is also not super certain i.e we could as well allow headers which do not have valid pow in mempool, senders of such transaction ultimately would need to pay gas for failed transactions. There are few trade off here.
  3. Antehandler will have access to config, as it is instantiated in babylon app. My main uncertainty here is if we can do module specific validation there.

Ultimately it is probably worhtwhile looking a bit into antehandler now, before making final decsion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants