[Flaky test] Tests in vendor/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go are flaking #114145

@Rajalakshmi-Girish

Which jobs are flaking?

https://prow.ppc64le-cloud.org/job-history/s3/ppc64le-prow-logs/logs/postsubmit-master-golang-kubernetes-unit-test-ppc64le

Which tests are flaking?

Tests in vendor/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go are flaking on ppc64le when run with Go built from master.

TestGracefulTerminationWithKeepListeningDuringGracefulTerminationDisabled
TestGracefulTerminationWithKeepListeningDuringGracefulTerminationEnabled
TestMuxAndDiscoveryComplete
TestPreShutdownHooks/ShutdownSendRetryAfter_is_disabled
TestPreShutdownHooks/ShutdownSendRetryAfter_is_enabled

Since when has it been flaking?

Since Go commit golang/go@8a81fdf

Testgrid link

No response

Reason for failure (if possible)

The request to the API server times out at https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go#L861

The tests pass when the per-request timeout is increased from 100ms to 200ms.

[root@raji-workspace server]# go version
go version devel go1.20-8a81fdf165 Sat Nov 19 16:48:07 2022 +0000 linux/ppc64le
[root@raji-workspace server]# go test -race -run TestMuxAndDiscoveryComplete
W1125 12:05:42.845733 3169366 authorization.go:47] Authorization is disabled
W1125 12:05:42.845988 3169366 authentication.go:47] Authentication is disabled
I1125 12:05:42.859571 3169366 secure_serving.go:210] Serving securely on [::]:46773
I1125 12:05:42.859730 3169366 tlsconfig.go:240] "Starting DynamicServingCertificateController"
--- FAIL: TestMuxAndDiscoveryComplete (5.21s)
    genericapiserver_graceful_termination_test.go:890: Sending request - timeout: 100ms, url: https://127.0.0.1:46773/echo?message=attempt-1
    genericapiserver_graceful_termination_test.go:995: [server] seen new connection: &net.TCPConn{conn:net.conn{fd:(*net.netFD)(0xc0007a6080)}}
    genericapiserver_graceful_termination_test.go:865: Still waiting for the server to start - err: <nil>
    genericapiserver_graceful_termination_test.go:890: Sending request - timeout: 100ms, url: https://127.0.0.1:46773/echo?message=attempt-2
    genericapiserver_graceful_termination_test.go:995: [server] seen new connection: &net.TCPConn{conn:net.conn{fd:(*net.netFD)(0xc0007a6100)}}
    ........
    ........
    genericapiserver_graceful_termination_test.go:995: [server] seen new connection: &net.TCPConn{conn:net.conn{fd:(*net.netFD)(0xc000ace180)}}
    genericapiserver_graceful_termination_test.go:865: Still waiting for the server to start - err: <nil>
    genericapiserver_graceful_termination_test.go:878: The server has failed to start - err: timed out waiting for the condition
W1125 12:05:48.059617 3169366 authorization.go:47] Authorization is disabled
W1125 12:05:48.059691 3169366 authentication.go:47] Authentication is disabled
W1125 12:05:48.061665 3169366 authorization.go:47] Authorization is disabled
W1125 12:05:48.061725 3169366 authentication.go:47] Authentication is disabled
W1125 12:05:48.063942 3169366 authorization.go:47] Authorization is disabled
W1125 12:05:48.063993 3169366 authentication.go:47] Authentication is disabled
FAIL
exit status 1
FAIL    k8s.io/apiserver/pkg/server     5.428s
[root@raji-workspace server]#

The test passes after increasing the timeout:

[root@raji-workspace server]# vi genericapiserver_graceful_termination_test.go +861
[root@raji-workspace server]# git diff
diff --git a/staging/src/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go b/staging/src/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go
index c18ce70c4ea..419e8fb3308 100644
--- a/staging/src/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go
+++ b/staging/src/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go
@@ -858,7 +858,7 @@ func waitForAPIServerStarted(t *testing.T, doer doer) {
        client := newClient(true)
        i := 1
        err := wait.PollImmediate(100*time.Millisecond, 5*time.Second, func() (done bool, err error) {
-               result := doer.Do(client, func(httptrace.GotConnInfo) {}, fmt.Sprintf("/echo?message=attempt-%d", i), 100*time.Millisecond)
+               result := doer.Do(client, func(httptrace.GotConnInfo) {}, fmt.Sprintf("/echo?message=attempt-%d", i), 200*time.Millisecond)
                i++

                if result.err != nil {
[root@raji-workspace server]# go test -race -run TestMuxAndDiscoveryComplete
W1125 12:13:40.991970 3178806 authorization.go:47] Authorization is disabled
W1125 12:13:40.992158 3178806 authentication.go:47] Authentication is disabled
I1125 12:13:41.004942 3178806 secure_serving.go:210] Serving securely on [::]:44603
I1125 12:13:41.004991 3178806 tlsconfig.go:240] "Starting DynamicServingCertificateController"
W1125 12:13:44.200081 3178806 authorization.go:47] Authorization is disabled
W1125 12:13:44.200164 3178806 authentication.go:47] Authentication is disabled
W1125 12:13:44.201983 3178806 authorization.go:47] Authorization is disabled
W1125 12:13:44.202041 3178806 authentication.go:47] Authentication is disabled
W1125 12:13:44.203595 3178806 authorization.go:47] Authorization is disabled
W1125 12:13:44.203649 3178806 authentication.go:47] Authentication is disabled
PASS
ok      k8s.io/apiserver/pkg/server     3.445s
[root@raji-workspace server]#

Anything else we need to know?

This flakiness is seen only on the ppc64le architecture, and only with Go versions built after commit 8a81fdf165facdcefa06531de5af98a4db343035.

Relevant SIG(s)

/sig testing

Metadata

Assignees

No one assigned

    Labels

    kind/flake: Categorizes issue or PR as related to a flaky test.
    needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
    sig/testing: Categorizes an issue or PR as relevant to SIG Testing.
