Flake: TestPipelineRunTasksTimeout #6624

lbernick · 2023-05-05T13:27:36Z

Logs: https://storage.googleapis.com/tekton-prow/pr-logs/pull/tektoncd_pipeline/6619/pull-tekton-pipeline-alpha-integration-tests/1654262716136689664/build-log.txt

Link to test case

Relevant snippet from the logs:

=== CONT  TestPipelineRunTasksTimeout
    build_logs.go:37: build logs 
        >>> Container step-unnamed-0:
        unable to retrieve container logs for containerd://e80f4f62aa449b50bf18beec8a2bcf85ccfb19531d340c48207d63eb3f877d8a
    build_logs.go:35: Could not get logs for pod pipeline-run-tasks-timeout-wshksphb-dagtask-pod: pods "pipeline-run-tasks-timeout-wshksphb-dagtask-pod" not found
    build_logs.go:37: build logs 
    build_logs.go:37: build logs 
        >>> Container step-unnamed-0:
        unable to retrieve container logs for containerd://e80f4f62aa449b50bf18beec8a2bcf85ccfb19531d340c48207d63eb3f877d8a
    timeout_test.go:562: Not deleting namespace arendelle-xpkb2
--- FAIL: TestPipelineRunTasksTimeout (40.38s)

(The reason the pod cannot be found is that we delete pods when taskruns time out.)

The test case uses

  timeouts:
    pipeline: 60s
    tasks: 20s
    finally: 20s

and verifies that the DAG tasks were canceled and the finally task completed successfully. However, it looks like in this case the finally task timed out after 20s:

        apiVersion: tekton.dev/v1
        kind: TaskRun
        metadata:
          annotations:
            pipeline.tekton.dev/release: 8e6926b-dirty
          creationTimestamp: "2023-05-04T23:31:50Z"
          generation: 2
          labels:
            app.kubernetes.io/managed-by: tekton-pipelines
            tekton.dev/memberOf: finally
            tekton.dev/pipeline: pipeline-run-tasks-timeout-pndvqlch
            tekton.dev/pipelineRun: pipeline-run-tasks-timeout-wshksphb
            tekton.dev/pipelineTask: finallytask
            tekton.dev/task: pipeline-run-tasks-timeout-nijhaoxl
          name: pipeline-run-tasks-timeout-wshksphb-finallytask
          namespace: arendelle-xpkb2
          ownerReferences:
          - apiVersion: tekton.dev/v1beta1
            blockOwnerDeletion: true
            controller: true
            kind: PipelineRun
            name: pipeline-run-tasks-timeout-wshksphb
            uid: 9bd9028f-fb99-452f-b368-1b8ccf995900
          resourceVersion: "7697"
          uid: 422fb59e-867f-47bf-a619-e0b33eb06ef9
        spec:
          serviceAccountName: default
          status: TaskRunCancelled
          statusMessage: TaskRun cancelled as the PipelineRun it belongs to has timed out.
          taskRef:
            kind: Task
            name: pipeline-run-tasks-timeout-nijhaoxl
          timeout: 1h0m0s
        status:
          completionTime: "2023-05-04T23:32:10Z"
          conditions:
          - lastTransitionTime: "2023-05-04T23:32:10Z"
            message: TaskRun "pipeline-run-tasks-timeout-wshksphb-finallytask" was cancelled.
              TaskRun cancelled as the PipelineRun it belongs to has timed out.
            reason: TaskRunCancelled
            status: "False"
            type: Succeeded
          podName: pipeline-run-tasks-timeout-wshksphb-finallytask-pod
          startTime: "2023-05-04T23:31:50Z"
          steps:
          - container: step-unnamed-0
            imageID: docker.io/library/busybox@sha256:b5d6fe0712636ceb7430189de28819e195e8966372edfc2d9409d79402a0dc16
            name: unnamed-0
            terminated:
              exitCode: 1
              finishedAt: "2023-05-04T23:32:10Z"
              reason: TaskRunCancelled
              startedAt: "2023-05-04T23:31:52Z"
          taskSpec:
            steps:
            - args:
              - -c
              - sleep 1
              command:
              - /bin/sh
              computeResources: {}
              image: busybox
              name: ""


lbernick · 2023-05-08T20:46:41Z

Observed a similar flake in #6621 with TestPipelineTaskTimeout (logs)

tekton-robot · 2023-08-06T21:13:05Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

tekton-robot · 2023-09-05T21:18:09Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

tekton-robot · 2023-10-05T21:21:58Z

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

tekton-robot · 2023-10-05T21:22:00Z

@tekton-robot: Closing this issue.


lbernick added the kind/flake label (Categorizes issue or PR as related to a flakey test) on May 5, 2023

lbernick mentioned this issue May 5, 2023

Refactor Sidecar Containers Construction If Script Exists #6619

Merged

7 tasks

lbernick mentioned this issue May 8, 2023

move trusted resources verification after we resolve the remote resources #6621

Merged

7 tasks

jerop mentioned this issue May 9, 2023

Clean up metrics code slightly. #6609

Merged

7 tasks

lbernick mentioned this issue May 22, 2023

Docs update: CSI + projected workspaces are beta #6700

Merged

4 tasks

Yongxuanzhang mentioned this issue May 31, 2023

Bump go.opentelemetry.io/otel/exporters/jaeger from 1.14.0 to 1.16.0 #6706

Merged

JeromeJu mentioned this issue Jun 16, 2023

Change the Storage Version to V1 Types #6444

Merged

7 tasks

tekton-robot added the lifecycle/stale label (Denotes an issue or PR has remained open with no activity and has become stale) on Aug 6, 2023

tekton-robot added the lifecycle/rotten label (Denotes an issue or PR that has aged beyond stale and will be auto-closed) and removed the lifecycle/stale label on Sep 5, 2023

tekton-robot closed this as completed Oct 5, 2023
