Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wip: Reduce flake for integration test TestStartStop #6999

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion test/integration/fn_mount_cmd.go
Original file line number Diff line number Diff line change
Expand Up @@ -106,7 +106,7 @@ func validateMountCmd(ctx context.Context, t *testing.T, profile string) {
}

start := time.Now()
if err := retry.Expo(checkMount, time.Second, 15*time.Second); err != nil {
if err := retry.Expo(checkMount, time.Second, Seconds(15)); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seconds is not a good name for this function.

How about "ApproximateSeconds" or "SecondsWithMultiplier"?

// For local testing, allow macOS users to click prompt. If they don't, skip the test.
if runtime.GOOS == "darwin" {
t.Skip("skipping: mount did not appear, likely because macOS requires prompt to allow non-codesigned binaries to listen on non-localhost port")
Expand Down
2 changes: 1 addition & 1 deletion test/integration/fn_pvc.go
Original file line number Diff line number Diff line change
Expand Up @@ -57,7 +57,7 @@ func validatePersistentVolumeClaim(ctx context.Context, t *testing.T, profile st
}

// Ensure the addon-manager has created the StorageClass before creating a claim, otherwise it won't be bound
if err := retry.Expo(checkStorageClass, time.Second, 90*time.Second); err != nil {
if err := retry.Expo(checkStorageClass, time.Second, Seconds(90)); err != nil {
t.Errorf("no default storage class after retry: %v", err)
}

Expand Down
30 changes: 29 additions & 1 deletion test/integration/functional_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -595,6 +595,20 @@ func validateProfileCmd(ctx context.Context, t *testing.T, profile string) {

// validateServiceCmd asserts basic "service" command functionality
func validateServiceCmd(ctx context.Context, t *testing.T, profile string) {
// to avoid https://github.com/kubernetes/minikube/issues/6997
// adding this logic to minikube start is not necessary but useful for rare test flakes
saReady := func() error { // check if default service account is created.
if _, err := Run(t, exec.CommandContext(ctx, "kubectl", "--context", profile, "get", "serviceaccount", "default")); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be covering up an actual flaw that users can face. Shouldn't the start command wait for this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this, this rarely happens but it eats up 13 seconds of waiting on my machine. I have a feeling that would not be helpful

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand it, this integration test is surfacing a bug: minikube start + kubectl is currently racy. As it is successfully capturing a rare race condition (yay!), we should either fix it, or leave it present, not paper over it.

I believe the right way to go about this is to add a --wait-for-serviceaccount flag to minikube start, and set it to false initially, but override it in these tests that are affected. That way other users who prefer reliability over latency can benefit: integration tests, IDE's, etc. This way we can better measure how to optimize for serviceaccount start latency.

I would argue that defaulting this flag to 'true' would be more in the spirit of minikube (reliability > latency) , but I do understand the trade-off here, and am comfortable leaving it as false for the time being.

Thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tstromberg
I agree with you, lets not merge this PR so it bugs us in integeration test,
and we fix minikube, here are my thoughts, let keep the discussion in this issue so others can participate, #7011 (comment)

t.Logf("temporary error waiting for default service account: %v", err)
return fmt.Errorf("Error waiting for default service account %v", err)
}
return nil
}

if err := retry.Expo(saReady, 500*time.Microsecond, time.Second*30); err != nil {
t.Errorf("default service account was not created in 30 seconds.: %v", err)
}

rr, err := Run(t, exec.CommandContext(ctx, "kubectl", "--context", profile, "create", "deployment", "hello-node", "--image=gcr.io/hello-minikube-zero-install/hello-node"))
if err != nil {
t.Logf("%s failed: %v (may not be an error)", rr.Args, err)
Expand Down Expand Up @@ -710,6 +724,20 @@ func validateSSHCmd(ctx context.Context, t *testing.T, profile string) {

// validateMySQL validates a minimalist MySQL deployment
func validateMySQL(ctx context.Context, t *testing.T, profile string) {
// to avoid https://github.com/kubernetes/minikube/issues/6997
// adding this logic to minikube start is not necessary but useful for rare test flakes
saReady := func() error { // check if default service account is created.
if _, err := Run(t, exec.CommandContext(ctx, "kubectl", "--context", profile, "get", "serviceaccount", "default")); err != nil {
t.Logf("temporary error waiting for default service account: %v", err)
return fmt.Errorf("Error waiting for default service account %v", err)
}
return nil
}

if err := retry.Expo(saReady, 500*time.Microsecond, time.Second*30); err != nil {
t.Errorf("default service account was not created in 30 seconds.: %v", err)
}

rr, err := Run(t, exec.CommandContext(ctx, "kubectl", "--context", profile, "replace", "--force", "-f", filepath.Join(*testdataDir, "mysql.yaml")))
if err != nil {
t.Fatalf("%s failed: %v", rr.Args, err)
Expand All @@ -725,7 +753,7 @@ func validateMySQL(ctx context.Context, t *testing.T, profile string) {
rr, err = Run(t, exec.CommandContext(ctx, "kubectl", "--context", profile, "exec", names[0], "--", "mysql", "-ppassword", "-e", "show databases;"))
return err
}
if err = retry.Expo(mysql, 5*time.Second, 180*time.Second); err != nil {
if err = retry.Expo(mysql, 5*time.Second, Seconds(180)); err != nil {
t.Errorf("mysql failing: %v", err)
}
}
Expand Down
15 changes: 15 additions & 0 deletions test/integration/start_stop_delete_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -28,11 +28,13 @@ import (
"strconv"
"strings"
"testing"
"time"

"github.com/docker/machine/libmachine/state"
"github.com/google/go-cmp/cmp"
"k8s.io/minikube/pkg/minikube/bootstrapper/images"
"k8s.io/minikube/pkg/minikube/constants"
"k8s.io/minikube/pkg/util/retry"
)

func TestStartStop(t *testing.T) {
Expand Down Expand Up @@ -169,6 +171,19 @@ func TestStartStop(t *testing.T) {
// testPodScheduling asserts that this configuration can schedule new pods
func testPodScheduling(ctx context.Context, t *testing.T, profile string) {
t.Helper()
// to avoid https://github.com/kubernetes/minikube/issues/6997
// adding this logic to minikube start is not necessary but useful for rare test flakes
saReady := func() error { // check if default service account is created.
if _, err := Run(t, exec.CommandContext(ctx, "kubectl", "--context", profile, "get", "serviceaccount", "default")); err != nil {
t.Logf("temporary error waiting for default service account: %v", err)
return fmt.Errorf("Error waiting for default service account %v", err)
}
return nil
}

if err := retry.Expo(saReady, 500*time.Microsecond, time.Second*30); err != nil {
t.Errorf("default service account was not created in 30 seconds.: %v", err)
}

// schedule a pod to assert persistence
rr, err := Run(t, exec.CommandContext(ctx, "kubectl", "--context", profile, "create", "-f", filepath.Join(*testdataDir, "busybox.yaml")))
Expand Down