Fix Windows conflict test failures on main #1777

Merged
teor2345 merged 3 commits into ZcashFoundation:main on Feb 19, 2021

Conversation

teor2345

@teor2345 teor2345 commented Feb 19, 2021

Motivation

In #1770, we started unconditionally killing the conflicted node in the conflict acceptance tests. On Windows, killing a process that has already exited makes `was_killed` return true, which fails the tests. We didn't catch this error in that PR because of testnet unreliability, which will be fixed by #1222.

Solution

  • Only kill the conflicted node if it is still running
  • Make the code where both nodes are running as short as possible
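
As a rough illustration of the first bullet, here is a minimal sketch of "only kill if still running", using only the standard library. The `Process` trait, `kill_if_running`, and `FakeProcess` are hypothetical names introduced for this example; they are not zebrad's `TestChild` API, and the actual change in this PR may be structured differently.

```rust
use std::io;
use std::process::Child;

/// Hypothetical abstraction over a child process, so the logic can be
/// exercised without spawning anything. Not part of zebrad's test API.
pub trait Process {
    /// Returns true if the process has not exited yet.
    fn is_running(&mut self) -> bool;
    /// Asks the operating system to terminate the process.
    fn kill(&mut self) -> io::Result<()>;
}

impl Process for Child {
    fn is_running(&mut self) -> bool {
        // `try_wait` returns `Ok(None)` while the child is still running.
        matches!(self.try_wait(), Ok(None))
    }

    fn kill(&mut self) -> io::Result<()> {
        // Calls the inherent `std::process::Child::kill`.
        Child::kill(self)
    }
}

/// Kills `process` only if it is still running.
/// Returns `Ok(true)` if a kill was sent, `Ok(false)` if it had already exited.
pub fn kill_if_running<P: Process>(process: &mut P) -> io::Result<bool> {
    if process.is_running() {
        process.kill()?;
        Ok(true)
    } else {
        Ok(false)
    }
}

/// A fake process for demonstration: counts how many times it was killed.
pub struct FakeProcess {
    pub running: bool,
    pub kills: u32,
}

impl Process for FakeProcess {
    fn is_running(&mut self) -> bool {
        self.running
    }

    fn kill(&mut self) -> io::Result<()> {
        self.kills += 1;
        self.running = false;
        Ok(())
    }
}
```

With this shape, a process that has already exited is never sent a kill, so a later "was it killed?" check is not affected by the cleanup step.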

Review

CI is failing all the time on main, so this PR is critical priority.

Since I don't have a Windows box locally, we should make sure Windows, macOS, and Linux CI pass on this PR.

Follow Up Work

Maybe we should disable the testnet large sync tests, or disable fail_fast in CI.

Fix the Windows-specific bugs in TestChild::kill and TestChild::is_running #1781

@teor2345 teor2345 added labels on Feb 19, 2021: C-bug (Category: This is a bug), A-rust (Area: Updates to Rust code), P-Critical, I-integration-fail (Continuous integration fails, including build and test failures)
@teor2345 teor2345 added this to the 2021 Sprint 3 milestone Feb 19, 2021
@teor2345 teor2345 requested a review from a team February 19, 2021 00:06
@teor2345 teor2345 self-assigned this Feb 19, 2021
@teor2345 teor2345 changed the title from "Fix Windows conflict tests kill code" to "Fix Windows conflict tests failures on main" Feb 19, 2021
@teor2345 teor2345 changed the title from "Fix Windows conflict tests failures on main" to "Fix Windows conflict test failures on main" Feb 19, 2021
yaahc
yaahc previously approved these changes Feb 19, 2021

@yaahc yaahc left a comment

Looks good. One UX suggestion for the error messages, then I think we can merge this.

zebrad/tests/acceptance.rs (2 outdated review comments, resolved)
yaahc
yaahc previously approved these changes Feb 19, 2021
yaahc
yaahc previously approved these changes Feb 19, 2021
@teor2345

teor2345 commented Feb 19, 2021

The macOS testnet sync failure will be fixed by #1222

The Windows testnet sync will be disabled by #1782

On Windows, if a process is killed after it is dead, it returns `true`
for `was_killed`. Instead, check if the process is running before killing
it.

Also make the section where processes are running as short as possible,
and include context for both processes in every error.
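
One way to read "include context for both processes in every error" is sketched below. It is only an illustration under assumptions: `NodeContext` and `conflict_error` are hypothetical names, and the real tests may attach context through their own error-reporting helpers instead of building strings.

```rust
/// Hypothetical summary of one node's state, for illustration only.
pub struct NodeContext<'a> {
    pub name: &'a str,
    pub running: bool,
    pub last_log_line: &'a str,
}

/// Builds an error message that always describes both nodes,
/// regardless of which node triggered the failure.
pub fn conflict_error(message: &str, node1: &NodeContext, node2: &NodeContext) -> String {
    let describe = |node: &NodeContext| {
        format!(
            "{} (running: {}): {}",
            node.name, node.running, node.last_log_line
        )
    };
    format!("{}\n  {}\n  {}", message, describe(node1), describe(node2))
}
```

Routing every failure through a single helper like this means a report about one node still shows what the other node was doing at the time.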
@teor2345

I cherry-picked #1776 so I could actually see the Windows tests succeed, even if macOS failed.

teor2345 and others added 2 commits February 19, 2021 17:58
`node2.is_running()` can return `true` on Windows, even though `node2`
has logged a panic. This cleanup code only runs if `node2` fails to panic
and exit as expected. So it's ok for us to skip it.

See ZcashFoundation#1781 for details.
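
Skipping a cleanup step on one platform could look roughly like the sketch below. `cleanup_enabled` and `run_cleanup` are hypothetical names for this example, and it assumes the cleanup is skipped on Windows only; the commit itself may use a different mechanism, such as a `#[cfg(...)]` attribute.

```rust
/// Returns true if the post-test cleanup should run on this platform.
/// Assumption for this sketch: cleanup is skipped only on Windows.
pub fn cleanup_enabled() -> bool {
    !cfg!(target_os = "windows")
}

/// Runs `cleanup` unless the platform is excluded, and reports whether it ran.
pub fn run_cleanup<F: FnOnce()>(cleanup: F) -> bool {
    if cleanup_enabled() {
        cleanup();
        true
    } else {
        false
    }
}
```

Because `cfg!` evaluates to a compile-time constant, the skipped branch costs nothing at runtime, while the cleanup closure is still type-checked on every platform.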
@teor2345

I did an admin-merge because the code had already been reviewed; I just needed to disable some cleanup on Windows.

@teor2345 teor2345 merged commit a9e4768 into ZcashFoundation:main Feb 19, 2021
@teor2345 teor2345 mentioned this pull request Feb 23, 2021
18 tasks
3 participants