fix: improve resiliency for e2e tests by tweaking retries and timeouts (partially fixes #449) #867
Description
This pull request includes several changes to improve the reliability and efficiency of the Kubernetes end-to-end testing framework. The most important changes involve adjusting timeouts and retry mechanisms to enhance robustness and reduce wait times.
Improvements to retry mechanisms:
- `test/e2e/framework/kubernetes/exec-pod.go`: Added retry logic using `retry.OnError` for executing commands in a pod to handle transient errors more gracefully.
- `test/e2e/framework/kubernetes/port-forward.go`: Enabled exponential backoff in the default retrier to improve the efficiency of retry attempts.

Adjustments to timeouts and delays:
- `test/e2e/framework/azure/create-cluster-with-npm.go`: Increased the `clusterTimeout` from 10 to 15 minutes to allow more time for cluster creation.
- `test/e2e/framework/kubernetes/port-forward.go`: Reduced the `defaultRetryDelay` from 5 seconds to 500 milliseconds to decrease the wait time between retry attempts.

Dependency updates:
- `test/e2e/framework/kubernetes/exec-pod.go`: Added import for `k8s.io/client-go/util/retry` to support the new retry logic.

Please provide a brief description of the changes made in this pull request.
Related Issue
Partially fixes issue #449, which describes the intermittent failures in our e2e tests.
Checklist
Commits are signed and signed off (`git commit -S -s ...`). See this documentation on signing commits.
Screenshots (if applicable) or Testing Completed
Please add any relevant screenshots or GIFs to showcase the changes made.
Additional Notes
Add any additional notes or context about the pull request here.
Please refer to the CONTRIBUTING.md file for more information on how to contribute to this project.