Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] crash with PoisonError and Block.Header.AppHash incorrect #599

Closed
ivanfong01 opened this issue Oct 28, 2021 · 10 comments · Fixed by #601 or #603
Closed

[BUG] crash with PoisonError and Block.Header.AppHash incorrect #599

ivanfong01 opened this issue Oct 28, 2021 · 10 comments · Fixed by #601 or #603
Labels
bug Something isn't working

Comments

@ivanfong01
Copy link

Describe the bug
The node stopped syncing after 'PoisonError' and Block.Header.AppHash will be incorrect with main chain.
Block Data must be removed and resync from snapshot.

To Reproduce
Its happen randomly and I encountered 3 times since using v0.5.9.

Context & versions
v0.5.9

thread '' panicked at 'assertion failed: prev.start > max', /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/wasmer-engine-2.0.0/src/trap/frame_info.rs:232:9

note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/wasmer-engine-2.0.0/src/trap/frame_info.rs:228:39
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs:256:43
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs:256:43
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs:256:43
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs:256:43
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs:256:43
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs:256:43
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs:256:43
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs:256:43
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs:256:43
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs:256:43
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs:256:43
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs:256:43
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs:256:43
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs:256:43
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rsthread ':256' panicked at ':called Result::unwrap() on an Err value: PoisonError { .. }', 43/usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs
:256:43

The node halt and stopped receiving new block data.
After restart, the node AppHash is not correct and need to remove block data and resync from snapshot

anic: Failed to process committed block (5081539:224DFCA3A35E2795A3B222F42C83776F71198FEE4D257E07FBF000006F043864): wrong Block.Header.AppHash. Expected 9A28C1116CE0E1B351784E24D9AF5881B5963896AA9A5EDEB33982AF8C2A2E0F, got 8EC27D4B4DBAAA5A901F2777262E656539BF563444E04704A45CA8CA30C94015

goroutine 104 [running]:
github.com/tendermint/tendermint/blockchain/v0.(*BlockchainReactor).poolRoutine(0xc000029c00, 0x0)
github.com/tendermint/tendermint@v0.34.13/blockchain/v0/reactor.go:401 +0x15bf
created by github.com/tendermint/tendermint/blockchain/v0.(*BlockchainReactor).OnStart
github.com/tendermint/tendermint@v0.34.13/blockchain/v0/reactor.go:110 +0x8c

@yun-yeo
Copy link
Contributor

yun-yeo commented Oct 30, 2021

@ivanfong01 thanks for reporting.

Can you share your wasm config setting?? Like write-vm-cache-size and read-vm-cache-size and num-read-vms

@yun-yeo
Copy link
Contributor

yun-yeo commented Oct 30, 2021

Other logs

Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]: thread '<unnamed>' panicked at 'assertion failed: prev.start > max', /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/wasmer-engine-2.0.0/src/trap/frame_info.rs:232:9
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]: stack backtrace:
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:    0:     0x7f68904d1b20 - std::backtrace_rs::backtrace::libunwind::trace::ha5edb8ba5c6b7a6c
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/../../backtrace/src/backtrace/libunwind.rs:90:5
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:    1:     0x7f68904d1b20 - std::backtrace_rs::backtrace::trace_unsynchronized::h0de86d320a827db2
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:    2:     0x7f68904d1b20 - std::sys_common::backtrace::_print_fmt::h97b9ad6f0a1380ff
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/sys_common/backtrace.rs:67:5
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:    3:     0x7f68904d1b20 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h14be7eb08f97fe80
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/sys_common/backtrace.rs:46:22
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:    4:     0x7f68904f1fdf - core::fmt::write::h2ca8877d3e0e52de
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/core/src/fmt/mod.rs:1094:17
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:    5:     0x7f68904cfc15 - std::io::Write::write_fmt::h64f5987220b618f4
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/io/mod.rs:1584:15
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:    6:     0x7f68904d3a8b - std::sys_common::backtrace::_print::h7f1a4097308f2e0a
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/sys_common/backtrace.rs:49:5
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:    7:     0x7f68904d3a8b - std::sys_common::backtrace::print::h1f799fc2ca7f5035
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/sys_common/backtrace.rs:36:9
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:    8:     0x7f68904d3a8b - std::panicking::default_hook::{{closure}}::hf38436e8a3ce1071
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/panicking.rs:208:50
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:    9:     0x7f68904d355d - std::panicking::default_hook::he2f8f3fae11ed1dd
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/panicking.rs:225:9
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   10:     0x7f68904d409d - std::panicking::rust_panic_with_hook::h79a18548bd90c7d4
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/panicking.rs:591:17
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   11:     0x7f68904d3c07 - std::panicking::begin_panic_handler::{{closure}}::h212a72cc08e25126
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/panicking.rs:495:13
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   12:     0x7f68904d1fbc - std::sys_common::backtrace::__rust_end_short_backtrace::hbd6897dd42bc0fcd
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/sys_common/backtrace.rs:141:18
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   13:     0x7f68904d3b99 - rust_begin_unwind
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/std/src/panicking.rs:493:5
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   14:     0x7f6890294b71 - core::panicking::panic_fmt::h77ecd04e9b1dd84d
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/core/src/panicking.rs:92:14
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   15:     0x7f6890294abd - core::panicking::panic::h60569d8a39169222
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /rustc/53cb7b09b00cbea8754ffb78e7e3cb521cb8af4b/library/core/src/panicking.rs:50:5
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   16:     0x7f6890428c2e - wasmer_engine::trap::frame_info::register::hfe9f67fb552d5b1a
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   17:     0x7f68902f96a5 - <wasmer_engine_universal::artifact::UniversalArtifact as wasmer_engine::artifact::Artifact>::register_frame_info::ha53475710931ffc3
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   18:     0x7f68902f2b56 - wasmer_engine::artifact::Artifact::instantiate::hd05b86e5ba918a5c
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   19:     0x7f68902e9706 - wasmer::module::Module::instantiate::h45d4338a9791cf84
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   20:     0x7f68902f0106 - wasmer::instance::Instance::new::h74d3fb4055786549
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   21:     0x7f68902b233c - cosmwasm_vm::instance::Instance<A,S,Q>::from_module::h12749beb64cbbe73
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   22:     0x7f689029e0c1 - cosmwasm_vm::cache::Cache<A,S,Q>::get_instance::h7c1a4e8002e60bcf
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   23:     0x7f68902b9dab - wasmvm::calls::do_call_2_args::h56c64befc44f67ef
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   24:     0x7f68902b9911 - wasmvm::calls::call_2_args::hffc55df42063d2d0
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   25:     0x7f68902b904d - query
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   26:          0x15f1b9e - _cgo_0e96ffb483ec_C2func_query
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /tmp/go-build/cgo-gcc-prolog:479
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:   27:           0x46bfd0 - runtime.asmcgocall
Oct 29 17:01:33 ip-10-0-30-58 mantle[4609]:                                at /snap/go/8489/src/runtime/asm_amd64.s:765
....

@ivanfong01
Copy link
Author

ivanfong01 commented Oct 30, 2021

@ivanfong01 thanks for reporting.

Can you share your wasm config setting?? Like write-vm-cache-size and read-vm-cache-size and num-read-vms

The config:

The write WASM VM memory cache size in MiB not bytes

write-vm-memory-cache-size = "500"

The read WASM VM memory cache size in MiB not bytes

read-vm-memory-cache-size = "300"

The number of read WASM VMs

num-read-vms = "5"

I switched num-read-vms to 1 right now and did not crash (yet) since my last reboot (30hours+)

@yun-yeo
Copy link
Contributor

yun-yeo commented Oct 30, 2021

yea I also think multiple concurrent wasm VMs are causing this error.

Before we find the problem, recommend to use 1 read vm.

@yun-yeo
Copy link
Contributor

yun-yeo commented Oct 30, 2021

@ivanfong01 btw the feature you requested is on going
cosmos/cosmos-sdk#10281

@ivanfong01
Copy link
Author

Thank you. Quite sure its multiple concurrent wasm VMs'problem. The node is running smoothly for 3 days since I updated the config. Its better to put a remark to remind others to use 1 read vm only until the problem solved.

@ivanfong01
Copy link
Author

@YunSuk-Yeo Hi! YunSuk.

I encountered the same bug using v0.5.11. Maybe we should re-open the case? I saw someone reported the same issue too for #604

thread '' panicked at 'assertion failed: prev.start > max', /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/wasmer-engine-2.0.0/src/trap/frame_info.rs:232:9

note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs:256:43
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs:256:43
thread '' panicked at 'called Result::unwrap() on an Err value: PoisonError { .. }', /usr/local/cargo/git/checkouts/cosmwasm-bbcc9de72c368f6b/93ea53e/packages/vm/src/cache.rs:256:43

@hanjukim
Copy link
Contributor

Follow up: wasmerio/wasmer#2434 (comment)

@ivanfong01
Copy link
Author

i got the same issue again @v0.5.16, two times....

@yun-yeo
Copy link
Contributor

yun-yeo commented Feb 21, 2022

@yun-yeo yun-yeo closed this as completed Feb 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
3 participants