Skip to content
This repository has been archived by the owner on Mar 19, 2024. It is now read-only.

ci: update Consul versions in test matrices, bump consul-k8s to v0.47.1 #303

Merged
merged 4 commits into from
Aug 15, 2022

Conversation

mikemorris
Copy link
Contributor

@mikemorris mikemorris commented Aug 10, 2022

Changes proposed in this PR:

How I've tested this PR:

Running CI unit, e2e and conformance tests for all supported versions.

How I expect reviewers to test this PR:

Confirm that all new or updated CI tests are passing.

Checklist:

  • Tests added
  • CHANGELOG entry added

    Run make changelog-entry for guidance in authoring a changelog entry, and
    commit the resulting file, which should have a name matching your PR number.
    Entries should use imperative present tense (e.g. Add support for...)

@mikemorris mikemorris added pr/no-changelog Skip the CI check that requires a changelog entry pr/run-conformance do not merge labels Aug 10, 2022
@mikemorris mikemorris force-pushed the ci/consul-k8s-v0.47 branch from f83c35d to 3ccb1d0 Compare August 10, 2022 16:35
@mikemorris mikemorris changed the title ci: bump consul-k8s to v0.47.0 for conformance tests ci: update Consul versions in test matrices, bump consul-k8s to v0.47.0 Aug 10, 2022
@mikemorris
Copy link
Contributor Author

mikemorris commented Aug 10, 2022

e2e tests appear to be failing consistently with a SIGQUIT for Consul v1.13.0, maybe due to different handling of rejected TCP connections somehow? apparently due to timing out after the background sync into Consul test fails and it retries.

 2022-08-10T19:20:17.842Z [TRACE] Synced memory store in background
    k8s_e2e_test.go:367: 
        	Error Trace:	/home/runner/work/consul-api-gateway/consul-api-gateway/internal/commands/server/k8s_e2e_test.go:367
        	            				/home/runner/work/consul-api-gateway/consul-api-gateway/internal/commands/server/env.go:405
        	            				/home/runner/work/consul-api-gateway/consul-api-gateway/internal/commands/server/env.go:436
        	Error:      	Condition never satisfied
        	Test:       	TestGatewayBasic/gateway_admission/background_sync_into_consul
        	Messages:   	service still returned after de-registering
        --- FAIL: TestGatewayBasic/gateway_admission/background_sync_into_consul (308.05s)
2022-08-10T19:24:47.673Z [TRACE] reconciliation finished: time="2022-08-10 19:24:47.667113734 +0000 UTC m=+648.687264301" spent="127.504µs"
    k8s_e2e_test.go:2404: Get "https://localhost:29503/": read tcp 127.0.0.1:56638->127.0.0.1:29503: read: connection reset by peer
    k8s_e2e_test.go:2404: Get "https://localhost:29503/": EOF
    k8s_e2e_test.go:2404: Get "https://localhost:29503/": EOF
    k8s_e2e_test.go:2404: Get "https://localhost:29503/": EOF
    k8s_e2e_test.go:2404: Get "https://localhost:29503/": read tcp 127.0.0.1:56654->127.0.0.1:29503: read: connection reset by peer
    k8s_e2e_test.go:2404: Get "https://localhost:29503/": read tcp 127.0.0.1:56658->127.0.0.1:29503: read: connection reset by peer
    k8s_e2e_test.go:2404: Get "https://localhost:29503/": EOF
    k8s_e2e_test.go:2404: Get "https://localhost:29503/": read tcp 127.0.0.1:56666->127.0.0.1:29503: read: connection reset by peer
    k8s_e2e_test.go:2404: Get "https://localhost:29503/": read tcp 127.0.0.1:56670->127.0.0.1:29503: read: connection reset by peer
    k8s_e2e_test.go:2404: Get "https://localhost:29503/": read tcp 127.0.0.1:56674->127.0.0.1:29503: read: connection reset by peer
SIGQUIT: quit
 github.com/hashicorp/consul-api-gateway/internal/commands/server.checkRoute(0xc0007f5a00, 0x733f, {0x20c9d04, 0x1}, {0x1?, {0xc001500ba0?, 0xc00108acf0?}}, 0xc000f31cb0, {0x1e44828, 0x29})
	/home/runner/work/consul-api-gateway/consul-api-gateway/internal/commands/server/k8s_e2e_test.go:2385 +0x18b
github.com/hashicorp/consul-api-gateway/internal/commands/server.TestReferenceGrantLifecycle.func1({0x20f1a30, 0xc0008a9680}, 0x514?, 0x0?)
	/home/runner/work/consul-api-gateway/consul-api-gateway/internal/commands/server/k8s_e2e_test.go:1571 +0x1f39
 *** Test killed with quit: ran too long (11m0s).
FAIL internal/commands/server.TestReferenceGrantLifecycle/reference_grant/route_controllers_watch_reference_grant_changes (-1.00s)

@mikemorris
Copy link
Contributor Author

mikemorris commented Aug 10, 2022

Prior error looks like a flaky timeout, but new error I'm seeing consistently looks like maybe a behavior change in the error returned by Envoy in v1.23.0 from an HTTP 503 status at

// TODO: Routing from multiple gateways is not yet supported.
// When implemented, this check should be updated to wait for
// http.StatusOK and serviceOne.Name as the body content prefix.
checkRoute(t, secondGatewayCheckPort, "/", httpResponse{
StatusCode: http.StatusServiceUnavailable,
Body: "no healthy upstream",
}, nil, "service one not returning expected error from second gateway in allotted time")

to

    k8s_e2e_test.go:2414: RBAC: access denied
    k8s_e2e_test.go:2417: status code 403

It appears this behavior changed due to envoyproxy/envoy#20877, where an unmatched incoming request will now hit the RBAC filter.

It looks like we should be able work around this temporarily by pinning the Consul v1.13.0 e2e test matrices to Envoy v1.22 (or tweaking the test to look for the alternate error response), but longer-term is yet another motivation for controlling Envoy configuration more directly rather than relying on default behavior.

.github/workflows/ci.yml Outdated Show resolved Hide resolved
.github/workflows/ci.yml Outdated Show resolved Hide resolved
.github/workflows/conformance.yml Outdated Show resolved Hide resolved
.github/workflows/conformance.yml Outdated Show resolved Hide resolved
.github/workflows/conformance.yml Outdated Show resolved Hide resolved
@mikemorris mikemorris force-pushed the ci/consul-k8s-v0.47 branch 6 times, most recently from b357dd2 to c154b40 Compare August 12, 2022 17:16
@mikemorris mikemorris force-pushed the ci/consul-k8s-v0.47 branch 3 times, most recently from 98b4167 to 9721e2f Compare August 12, 2022 19:51
…rt installing from Consul Helm chart main branch (#196)"

This reverts commit 648b2b6.
@mikemorris mikemorris force-pushed the ci/consul-k8s-v0.47 branch from 9721e2f to f05732a Compare August 12, 2022 20:32
@mikemorris
Copy link
Contributor Author

mikemorris commented Aug 12, 2022

Consul 1.12 conformance tests were failing due to the hashicorp/consul:1.12 Docker Hub tag only pointing to an arm64 image rather than including all four arches (referencing hashicorp/consul:1.12.4 directly would have actually worked fine), which weirdly manifested as network corruption preventing the Consul servers and agents from becoming ready, rather than something more obvious like a missing image CrashLoopBackoff. With that issue resolved out-of-band, I'd like to merge this to get the nightly conformance tests fixed and updated for Consul 1.11 and 1.12 while trying to figure out the Consul 1.13 issues in a fresh PR.

@mikemorris mikemorris marked this pull request as ready for review August 12, 2022 20:40
@mikemorris mikemorris requested review from a team, shore and claire-labry August 12, 2022 20:40
@nathancoleman nathancoleman changed the title ci: update Consul versions in test matrices, bump consul-k8s to v0.47.0 ci: update Consul versions in test matrices, bump consul-k8s to v0.47.1 Aug 12, 2022
@nathancoleman nathancoleman merged commit 5d6462e into main Aug 15, 2022
@nathancoleman nathancoleman deleted the ci/consul-k8s-v0.47 branch August 15, 2022 14:37
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pr/no-changelog Skip the CI check that requires a changelog entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants