Spurious failures in validator_manager tests #4616

Open
michaelsproul opened this issue Aug 14, 2023 · 3 comments
Labels
infra-ci val-client Relates to the validator client binary

Comments

@michaelsproul
Member

Description

I just had a spurious failure when running the validator_manager tests locally. Weirdly, it's an invalid password failure:

---- move_validators::test::three_validators_move_two stdout ----
Starting derivation of 3 keystores. Each keystore may take several seconds.
Completed 1/3: 0x88b6b3a9b391fa5593e8bce8d06102df1a56248368086929709fbb4a8570dc6a560febeef8159b19789e9c1fd13572f0
Completed 2/3: 0xa33ab9d93fb53c4f027944aaa11a13be0c150b7cc2e379d85d1ed4db38d178b4e4ebeae05832158b8c746c1961da00ce
Completed 3/3: 0x807d7219776c5460dd30851ed869ad0636d20fc27046d153870c2696dea7e114c60cd02fac0d71387ef7f00042c5f2a8
Keystore generation complete
Writing "/tmp/.tmpKCG18e/validators.json"
Writing "/tmp/.tmpKCG18e/deposits.json"
Validator client is reachable at http://127.0.0.1:34623/ and reports 0 validators
Starting to submit 3 validators to VC, each validator may take several seconds
Uploaded keystore 1 of 3 to the VC
Upload of keystore 2 of 3 failed with message: Some("failed to initialize validator: \"Unable to add definition: UnableToDecryptKeystore(InvalidPassword)\""). A potential solution is run this command again using the --ignore-duplicates flag, however care should be taken to ensure that there are no duplicate deposits submitted.
thread 'move_validators::test::three_validators_move_two' panicked at 'assertion failed: import_test_result.result.is_ok()', validator_manager/src/move_validators.rs:915:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Steps to resolve

At first blush I'm not sure how this is possible without memory corruption or some other nastiness. I'm using mold as my linker, so I'll see whether I can also reproduce this failure without it. The platform tested is Linux.

@michaelsproul michaelsproul added val-client Relates to the validator client binary infra-ci labels Aug 14, 2023
@michaelsproul
Member Author

I guess it could be a tmp file collision between multiple tests running in parallel. One would hope that tempfile guards against that though.
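To make the collision hypothesis concrete, here is a minimal sketch using only the standard library. It is not the validator_manager test code, and every name in it (`shared_dir`, `unique_dir`, `write_password`, `read_password`, the `vm_sketch_*` directory names) is hypothetical. It only illustrates the failure mode: if two tests resolved to the same directory, the second writer would silently replace the first one's password file, and the first test would then read back the wrong password. As far as I know, tempfile already randomises its directory names, so this shows what a collision would look like rather than claiming tempfile allows one.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process;
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical helper: every caller in this process gets the SAME directory,
// so tests running in parallel would overwrite each other's files.
fn shared_dir() -> PathBuf {
    std::env::temp_dir().join(format!("vm_sketch_shared_{}", process::id()))
}

// Counter used to hand out a distinct suffix to each caller.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

// Hypothetical helper: process id plus a counter gives each caller its own
// directory, which is the property one hopes a temp-dir library provides.
fn unique_dir() -> PathBuf {
    let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
    std::env::temp_dir().join(format!("vm_sketch_{}_{}", process::id(), id))
}

// Stand-in for a test that writes a keystore password to disk.
fn write_password(dir: &Path, password: &str) -> io::Result<()> {
    fs::create_dir_all(dir)?;
    fs::write(dir.join("password.txt"), password)
}

// Stand-in for the later step that reads the password back.
fn read_password(dir: &Path) -> io::Result<String> {
    fs::read_to_string(dir.join("password.txt"))
}
```

Writing "first" and then "second" through two `shared_dir()` paths leaves only "second" on disk, whereas two `unique_dir()` paths keep both values intact.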

@michaelsproul
Member Author

I couldn't repro the failure running just that test in a loop, so I think the tempfile explanation is most likely

@jmcph4
Member

jmcph4 commented Aug 28, 2023

> I couldn't repro the failure running just that test in a loop, so I think the tempfile explanation is most likely

This seems quite plausible (from the tempfile documentation):

> tempfile will (almost) never fail to cleanup temporary resources. However TempDir and NamedTempFile will fail if their destructors don’t run. This is because tempfile relies on the OS to cleanup the underlying file, while TempDir and NamedTempFile rely on rust destructors to do so. Destructors may fail to run if the process exits through an unhandled signal interrupt (like SIGINT), or if the instance is declared statically (like with lazy_static), among other possible reasons.
>
> -- https://docs.rs/tempfile/latest/tempfile/#resource-leaking
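The destructor dependence described in that passage can be sketched with a hand-rolled guard. `ScratchDir` below is a hypothetical stand-in written against the standard library only. It is not tempfile's `TempDir` and not the Lighthouse `TestBuilder`. It only shows the general pattern: cleanup happens in `Drop`, so anything that prevents the destructor from running leaves the directory behind.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Hypothetical stand-in for a temp-dir guard: owns a directory and removes
// it when the value is dropped.
struct ScratchDir {
    path: PathBuf,
}

impl ScratchDir {
    // Create the directory (and any missing parents) and take ownership of it.
    fn create(path: PathBuf) -> io::Result<Self> {
        fs::create_dir_all(&path)?;
        Ok(Self { path })
    }

    fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for ScratchDir {
    fn drop(&mut self) {
        // Best-effort cleanup: a failure here is ignored, since a destructor
        // has no way to return an error.
        let _ = fs::remove_dir_all(&self.path);
    }
}
```

Letting a `ScratchDir` go out of scope removes its directory, while passing it to `std::mem::forget` skips `drop` and leaves the directory on disk, which mimics the leak the documentation warns about.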

TestBuilder relies on TempDir here:
