Improve the failure mode of timeout_test. #3396

Merged 2 commits into tektoncd:master on Oct 21, 2020

mattmoor
Member

Changes

I noticed that when `timeout_test.go` tests flake, the failure mode is often hitting the Go `-timeout=X` limit, which means that without `-v` you get no logs for the test.

This change makes the tests use a context with a timeout, and makes the various `Wait` functions check `ctx.Done()` and return `ctx.Err()`, so that the context timeout ends the test before the Go limit is reached and the test produces logs (rather than just an ugly panic!).

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you
review them:

  • Includes tests (if functionality changed/added)
  • Includes docs (if user facing)
  • Commit messages follow commit message best practices
  • Release notes block has been filled in or deleted (only if no user facing changes)

See the contribution guide for more details.

Double check this list of stuff that's easy to miss:

Reviewer Notes

If API changes are included, additive changes must be approved by at least two OWNERS and backwards incompatible changes must be approved by more than 50% of the OWNERS, and they must first be added in a backwards compatible way.

Release Notes

NONE

@tekton-robot tekton-robot added the release-note-none Denotes a PR that doesnt merit a release note. label Oct 15, 2020
@tekton-robot tekton-robot requested review from afrittoli and a user October 15, 2020 23:05
@tekton-robot tekton-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Oct 15, 2020
@mattmoor
Member Author

cc @imjasonh @afrittoli @vdemeester

@mattmoor
Member Author

/kind cleanup

@tekton-robot tekton-robot added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label Oct 15, 2020
@mattmoor
Member Author

/test check-pr-has-kind-label

@mattmoor
Member Author

Looks like there may be some bad assumption here that I need to dig into 😞

I'm out of time before vacay though, so if anyone wants to pick this up, feel free; otherwise I'll come back to it when I'm back.

@tekton-robot tekton-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 20, 2020
@mattmoor
Member Author

Alright, so I also changed the tearDown calls in timeout_test.go to not pass ctx but a fresh context.Background() since we don't want the cleanup and dumping logic to fail.

@mattmoor
Member Author

Alright, this should be RFAL...

/assign @afrittoli @vdemeester @imjasonh

test/wait.go Outdated
@@ -147,6 +167,11 @@ func WaitForServiceExternalIPState(ctx context.Context, c *clients, namespace, n
defer span.End()

return wait.PollImmediate(interval, timeout, func() (bool, error) {
select {
Member

I'm worried future callers will forget this stanza at the top of their funcs. Can we somehow make this the responsibility of PollImmediate so it's not the caller's responsibility?

Member Author

I think I have another shape in mind that you might be happier with, gimme a few minutes 😉

Member Author

It ends up being gross too, lemme write a shared helper...

Member Author

alright, I think I pushed a change that better centralizes the context cancellation handling logic

test/wait.go Outdated
Member

@vdemeester vdemeester left a comment

Nice 🎉

@tekton-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vdemeester

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 21, 2020
@imjasonh
Member

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 21, 2020
@mattmoor
Member Author

couldn't resolve host github.com...

/retest

@mattmoor
Member Author

Weird, when I click the details links, all the builds say still running.

Let's try this again...

/retest

@tekton-robot tekton-robot merged commit 329f0f0 into tektoncd:master Oct 21, 2020
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm Indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesnt merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
5 participants