
phd-runner is very sensitive to network issues #944

@iximeow

in #941 i saw two failures of phd-run-migrate-from-base. they looked something like this:

	phd-runner: [TEST - EVENT] Error obtaining artifact from source
    e = reqwest::Error { kind: Request, url: "https://oxide-omicron-build.s3.amazonaws.com/alpine.iso", source: hyper_util::client::legacy::Error(Connect, Ssl(Error { code: ErrorCode(5), cause: Some(Io(Os { code: 131, kind: ConnectionReset, message: "Connection reset by peer" })) }, X509VerifyResult { code: 0, error: "ok" })) }
    file = phd-tests/framework/src/artifacts/store.rs
    line = 627
    path = phd_tests::migrate::from_base::migration_from_base_and_back
    target = phd_framework::artifacts::store
    uri = https://oxide-omicron-build.s3.amazonaws.com/alpine.iso

our approach to downloading, then, is per test: try to fetch the artifact once if it is not already accounted for. so in the migration-from-base-and-back case above, the first attempt to get alpine.iso failed, which failed that test. the next test retried the download, succeeded, and the rest passed.

it's probably worth trying a second time on a reset, and/or just getting artifacts up front before running tests, or ... something? it's also just not great that we see resets from S3 in the first place.
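
To make the "try a second time on a reset" idea concrete, here is a minimal sketch of a retry wrapper using only the standard library. It is not the framework's code: `is_transient` and `retry_transient` are hypothetical names, and `std::io::Error` stands in for the real `reqwest::Error`, where the `ConnectionReset` is buried in the error's source chain and would have to be dug out before a check like this could apply.

```rust
use std::io;
use std::thread;
use std::time::Duration;

/// Hypothetical classifier: which errors are worth another attempt.
/// Assumption: resets, aborts, and timeouts are treated as transient.
fn is_transient(err: &io::Error) -> bool {
    matches!(
        err.kind(),
        io::ErrorKind::ConnectionReset
            | io::ErrorKind::ConnectionAborted
            | io::ErrorKind::TimedOut
    )
}

/// Hypothetical helper: runs `op` until it succeeds, hits a non-transient
/// error, or has been tried `max_attempts` times. `op` is always called at
/// least once and receives the 1-based attempt number. The delay between
/// attempts doubles each time.
fn retry_transient<T, F>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: F,
) -> io::Result<T>
where
    F: FnMut(u32) -> io::Result<T>,
{
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Only retry transient failures, and only while attempts remain.
            Err(e) if attempt < max_attempts && is_transient(&e) => {
                thread::sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
            // Non-transient error, or out of attempts: surface it.
            Err(e) => return Err(e),
        }
    }
}
```

With something like this around the download, a single reset would cost one short sleep instead of a failed test. The same wrapper could also be called from a loop that fetches every artifact before any test runs, which would cover the "get artifacts up front" option as well.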

Labels: testing (Related to testing and/or the PHD test framework.)
