Flake: TestPipelineRunTasksTimeout #6624

Closed
lbernick opened this issue May 5, 2023 · 5 comments
Labels
kind/flake Categorizes issue or PR as related to a flakey test lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@lbernick
Member

lbernick commented May 5, 2023

Logs: https://storage.googleapis.com/tekton-prow/pr-logs/pull/tektoncd_pipeline/6619/pull-tekton-pipeline-alpha-integration-tests/1654262716136689664/build-log.txt

Link to test case

Relevant snippet from the logs:

=== CONT  TestPipelineRunTasksTimeout
    build_logs.go:37: build logs 
        >>> Container step-unnamed-0:
        unable to retrieve container logs for containerd://e80f4f62aa449b50bf18beec8a2bcf85ccfb19531d340c48207d63eb3f877d8a
    build_logs.go:35: Could not get logs for pod pipeline-run-tasks-timeout-wshksphb-dagtask-pod: pods "pipeline-run-tasks-timeout-wshksphb-dagtask-pod" not found
    build_logs.go:37: build logs 
    build_logs.go:37: build logs 
        >>> Container step-unnamed-0:
        unable to retrieve container logs for containerd://e80f4f62aa449b50bf18beec8a2bcf85ccfb19531d340c48207d63eb3f877d8a
    timeout_test.go:562: Not deleting namespace arendelle-xpkb2
--- FAIL: TestPipelineRunTasksTimeout (40.38s)

(The reason the pod cannot be found is that we delete pods when taskruns time out.)

The test case uses

  timeouts:
    pipeline: 60s
    tasks: 20s
    finally: 20s

and verifies that the DAG tasks were canceled and the finally task completed successfully. However, in this case it looks like the finally task timed out after 20s:

        apiVersion: tekton.dev/v1
        kind: TaskRun
        metadata:
          annotations:
            pipeline.tekton.dev/release: 8e6926b-dirty
          creationTimestamp: "2023-05-04T23:31:50Z"
          generation: 2
          labels:
            app.kubernetes.io/managed-by: tekton-pipelines
            tekton.dev/memberOf: finally
            tekton.dev/pipeline: pipeline-run-tasks-timeout-pndvqlch
            tekton.dev/pipelineRun: pipeline-run-tasks-timeout-wshksphb
            tekton.dev/pipelineTask: finallytask
            tekton.dev/task: pipeline-run-tasks-timeout-nijhaoxl
          name: pipeline-run-tasks-timeout-wshksphb-finallytask
          namespace: arendelle-xpkb2
          ownerReferences:
          - apiVersion: tekton.dev/v1beta1
            blockOwnerDeletion: true
            controller: true
            kind: PipelineRun
            name: pipeline-run-tasks-timeout-wshksphb
            uid: 9bd9028f-fb99-452f-b368-1b8ccf995900
          resourceVersion: "7697"
          uid: 422fb59e-867f-47bf-a619-e0b33eb06ef9
        spec:
          serviceAccountName: default
          status: TaskRunCancelled
          statusMessage: TaskRun cancelled as the PipelineRun it belongs to has timed out.
          taskRef:
            kind: Task
            name: pipeline-run-tasks-timeout-nijhaoxl
          timeout: 1h0m0s
        status:
          completionTime: "2023-05-04T23:32:10Z"
          conditions:
          - lastTransitionTime: "2023-05-04T23:32:10Z"
            message: TaskRun "pipeline-run-tasks-timeout-wshksphb-finallytask" was cancelled.
              TaskRun cancelled as the PipelineRun it belongs to has timed out.
            reason: TaskRunCancelled
            status: "False"
            type: Succeeded
          podName: pipeline-run-tasks-timeout-wshksphb-finallytask-pod
          startTime: "2023-05-04T23:31:50Z"
          steps:
          - container: step-unnamed-0
            imageID: docker.io/library/busybox@sha256:b5d6fe0712636ceb7430189de28819e195e8966372edfc2d9409d79402a0dc16
            name: unnamed-0
            terminated:
              exitCode: 1
              finishedAt: "2023-05-04T23:32:10Z"
              reason: TaskRunCancelled
              startedAt: "2023-05-04T23:31:52Z"
          taskSpec:
            steps:
            - args:
              - -c
              - sleep 1
              command:
              - /bin/sh
              computeResources: {}
              image: busybox
              name: ""
lbernick added the kind/flake label (Categorizes issue or PR as related to a flakey test) on May 5, 2023
@lbernick
Member Author

lbernick commented May 8, 2023

Observed a similar flake in #6621 with TestPipelineTaskTimeout (logs)

@tekton-robot
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

tekton-robot added the lifecycle/stale label (Denotes an issue or PR has remained open with no activity and has become stale.) on Aug 6, 2023
@tekton-robot
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

tekton-robot added the lifecycle/rotten label (Denotes an issue or PR that has aged beyond stale and will be auto-closed.) and removed the lifecycle/stale label (Denotes an issue or PR has remained open with no activity and has become stale.) on Sep 5, 2023
@tekton-robot
Collaborator

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

@tekton-robot
Collaborator

@tekton-robot: Closing this issue.

