Problem
Investigate and fix local_cluster_flakey tests
An example failure log
[2022-12-12T21:19:22.949402625Z INFO local_cluster_flakey] Create validator C's ledger
[2022-12-12T21:19:22.992800408Z INFO local_cluster_flakey] Create validator A's ledger
[2022-12-12T21:19:23.079169152Z INFO local_cluster_flakey] Checking A's tower for a vote on slot descended from slot `next_slot_on_a`
[2022-12-12T21:19:23.098472257Z INFO local_cluster_flakey] Removing tower!
[2022-12-12T21:19:23.099375667Z INFO local_cluster_flakey] Restart validator C again!!!
[2022-12-12T21:19:26.123293290Z INFO local_cluster_flakey] collected validator C's votes: {29, 30, 31, 32}
[2022-12-12T21:19:26.123306395Z INFO local_cluster_flakey] Restart validator A again!!!
[2022-12-12T21:19:26.633276484Z ERROR solana_core::validator] Rebuilding a new tower from the latest vote account due to failed tower restore: IO Error: No such file or directory (os error 2)
[2022-12-12T21:19:38.035273429Z INFO
local_cluster_flakey] Observed A's votes on: [26, 26, 26, 26, 27, 27, 27, 27,
27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27,
27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27,
27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27,
27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27,
27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27] thread 'main'
panicked at 'Violation expected because of removed persisted tower!', local-cluster/tests/local_cluster_flakey.rs:351:13
stack backtrace:
   0: rust_begin_unwind
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:142:14
   2: local_cluster_flakey::do_test_optimistic_confirmation_violation_with_or_without_tower
   3: serial_test::serial_code_lock::local_serial_core
   4: core::ops::function::FnOnce::call_once
   5: core::ops::function::FnOnce::call_once
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/ops/function.rs:248:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
FAILED
failures:
failures:
    test_optimistic_confirmation_violation_without_tower
A rerun with success log
[2022-12-12T21:52:58.443640037Z INFO local_cluster_flakey] Waiting on both validators A and B to vote on fork at slot 27
[2022-12-12T21:53:13.269944257Z INFO local_cluster_flakey] Create validator C's ledger
[2022-12-12T21:53:13.390689919Z INFO local_cluster_flakey] Create validator A's ledger
[2022-12-12T21:53:13.647698343Z INFO local_cluster_flakey] Checking A's tower for a vote on slot descended from slot `next_slot_on_a`
[2022-12-12T21:53:13.697183806Z INFO local_cluster_flakey] Removing tower!
[2022-12-12T21:53:13.701354097Z INFO local_cluster_flakey] Restart validator C again!!!
[2022-12-12T21:53:18.901249687Z INFO local_cluster_flakey] collected validator C's votes: {29, 30, 31, 32}
[2022-12-12T21:53:18.901300033Z INFO local_cluster_flakey] Restart validator A again!!!
[2022-12-12T21:53:21.553643995Z ERROR solana_core::validator] Rebuilding a new tower from the latest vote account due to failed tower restore: IO Error: No such file or directory (os error 2)
[2022-12-12T21:53:22.179317214Z INFO local_cluster_flakey] Observed A's votes on: [26, 26, 26, 26, 30]
[2022-12-12T21:53:22.179366507Z INFO local_cluster_flakey] THIS TEST expected violations. And indeed, there was some, because of removed persisted tower.
test test_optimistic_confirmation_violation_without_tower ... ok
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 111.31s
Proposed Solution
Debug and fix the failure to make the test robust
Based on the above logs, it seems the issue is that when A restarts, it is able to repair slot 27 through D. I will confirm once I'm able to reproduce. The easiest solution seems to be to kill D during A's restart, although there might be a gossip discovery issue.
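To make the difference between the two runs concrete, here is a minimal sketch of one way to read the "Observed A's votes" lines. It assumes that the expected violation shows up as A voting on a slot that C also voted on; that is an interpretation of the logs above, not the test's actual check. The helper `votes_on_other_fork` is hypothetical and does not exist in `local_cluster_flakey.rs`, and the failing run's vote list is abbreviated.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper, not taken from the test: returns the observed votes
/// that land on slots the other validator (here C) also voted on.
fn votes_on_other_fork(observed: &[u64], other_fork: &BTreeSet<u64>) -> Vec<u64> {
    observed
        .iter()
        .copied()
        .filter(|slot| other_fork.contains(slot))
        .collect()
}

fn main() {
    // C's votes as printed in both logs.
    let c_votes: BTreeSet<u64> = [29u64, 30, 31, 32].iter().copied().collect();

    // Failing run: A keeps voting on 26 and 27 (list abbreviated here).
    let failing_run = [26u64, 26, 26, 26, 27, 27, 27];
    // Passing run: A's votes as printed in the success log.
    let passing_run = [26u64, 26, 26, 26, 30];

    // Under the assumption above, the failing run never reaches C's fork,
    // while the passing run does via slot 30.
    assert!(votes_on_other_fork(&failing_run, &c_votes).is_empty());
    assert_eq!(votes_on_other_fork(&passing_run, &c_votes), vec![30]);
}
```

Read this way, the failing run is consistent with A having recovered slot 27 after the restart and continuing to vote on it, so no vote on C's fork is ever observed.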