Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable zebrad sync tests by default #1179

Merged
merged 5 commits into from
Oct 21, 2020

Conversation

teor2345
Copy link
Contributor

@teor2345 teor2345 commented Oct 19, 2020

Motivation

In the past, we have accidentally merged PRs that break zebrad's sync functionality. So we want to add a basic genesis sync test to the zebrad acceptance tests.

Solution

  • enable the sync tests by default
  • set the ZEBRA_SKIP_NETWORK_TESTS env var on any CI jobs that fail
    • Disable CI / Test on ubuntu-latest, because it fails
    • Check that other CI runners succeed
    • Check if Google Cloud Build fails
    • Check if tests for cd.yml fail
      • needs merge to main?

Related Issues

Closes #1141.

Follow-up work

CI Reliability

Make sure CI is reliable, even if the seeder sends us slow peers

  • Tune the sync test timeouts, if needed
  • Add sync retries, if needed

This work doesn't need a ticket - we only need to do it if we see CI failures.

Extended Sync Tests

Once we are sure that the genesis sync is reliable, we can add longer sync tests.
See #1004 (RFC) and #745 (tracking issue) for details.

If your test environment does not have DNS or network access, set the
ZEBRA_SKIP_NETWORK_TESTS environmental variable to disable these tests.
@teor2345 teor2345 added A-rust Area: Updates to Rust code A-devops Area: Pipelines, CI/CD and Dockerfiles C-enhancement Category: This is an improvement labels Oct 19, 2020
@teor2345 teor2345 added this to the First Alpha Release milestone Oct 19, 2020
@teor2345 teor2345 requested review from dconnolly and a team October 19, 2020 04:15
@teor2345 teor2345 self-assigned this Oct 19, 2020
They don't seem to have DNS or network configured during the tests.

Also make capitalisation of step names consistent.
@teor2345 teor2345 removed the request for review from a team October 19, 2020 05:08
@teor2345
Copy link
Contributor Author

teor2345 commented Oct 19, 2020

@dconnolly how can I test if this PR makes Google Cloud CD tests fail?
(The tests in this PR depend on the network configuration on the CI runner.)

This PR is from my GitHub repository, and #1180 is from a branch in the ZcashFoundation repository.

@teor2345
Copy link
Contributor Author

The existing sync, state, and network code is unreliable, see #1181, #1182, and #1183.

So we might see some errors downloading genesis, but it's just one block, so the errors should not be too frequent.

When the loop exits, either the process has stopped running,
or the deadline has passed.

If the process is still running, we want to kill it.
@teor2345
Copy link
Contributor Author

teor2345 commented Oct 20, 2020

The Windows sync test fails intermittently, so I'm going to disable it.
Either Windows intermittently has a slow network, or some CI runners don't have network at all, or we're seeing the sync hang in #1181.

@dconnolly
Copy link
Contributor

@dconnolly how can I test if this PR makes Google Cloud CD tests fail?
(The tests in this PR depend on the network configuration on the CI runner.)

This PR is from my GitHub repository, and #1180 is from a branch in the ZcashFoundation repository.

All tests in the Google Cloud Build are passing:
image

Can you load this url? This is the raw console out .

@teor2345
Copy link
Contributor Author

teor2345 commented Oct 20, 2020

The Windows sync test fails intermittently, so I'm going to disable it.
Either Windows intermittently has a slow network, or some CI runners don't have network at all, or we're seeing the sync hang in #1181.

I added #1189 to follow up, because we really should be testing sync on every platform. (Particularly Linux, if we want it to be our distribution platform.)

@teor2345
Copy link
Contributor Author

Can you load this url? This is the raw console out .

Yeah, that works for me.

I guess I don't know exactly how Zebra CI works:

  • is the some CI/CD that only runs on branches in ZcashFoundation/zebra?
  • is there some CI/CD that only runs on merges to main?

@dconnolly
Copy link
Contributor

Can you load this url? This is the raw console out .

Yeah, that works for me.

I guess I don't know exactly how Zebra CI works:

  • is the some CI/CD that only runs on branches in ZcashFoundation/zebra?
  • is there some CI/CD that only runs on merges to main?

Oh, yes:

  • ci.yml is on PRs
  • cd.yml is on main
  • coverage.yml is both

@dconnolly
Copy link
Contributor

Can you load this url? This is the raw console out .

Yeah, that works for me.
I guess I don't know exactly how Zebra CI works:

  • is the some CI/CD that only runs on branches in ZcashFoundation/zebra?
  • is there some CI/CD that only runs on merges to main?

Oh, yes:

  • ci.yml is on PRs
  • cd.yml is on main
  • coverage.yml is both

But cd and ci both build the same Dockerfile in gcloud, which has the same commands, just one is via a gcloud action and one is via the Google Cloud Build App integration to allow it to work on PRs from external forks.

@teor2345
Copy link
Contributor Author

* ci.yml is on PRs
* cd.yml is on main
* coverage.yml is both

Ok, so here's our current test matrix for the sync tests:

  • macOS CI on pull requests
  • Debian Linux CD on main branch merges

(The other CI runners have unreliable networks.)

I think that's enough for now.

@dconnolly dconnolly merged commit b4f92ad into ZcashFoundation:main Oct 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-devops Area: Pipelines, CI/CD and Dockerfiles A-rust Area: Updates to Rust code C-enhancement Category: This is an improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Run sync tests on CI runners with working DNS and network
2 participants