Custom controller behaviour with no retries support #5182

Closed
JeromeJu opened this issue Jul 20, 2022 · 6 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.


JeromeJu commented Jul 20, 2022

Feature request

TEP-69 (Support retries for custom task in a pipeline) does not specify what happens when the custom task controller does not support retries.
This issue tracks those cases and looks for ways to improve the user experience.

Follow up of #4686

Use case

To reproduce:

  • Create a Custom Task with the retries field set
  • Run it with a controller that does not support retries
  • Make the Custom Task fail and record the output and behaviour

Supporting context

The concern is how a user would know that the custom task controller doesn't support retries -- it may be surprising or confusing to users, maybe documentation or tooling can address this

Originally posted by @jerop in tektoncd/community#491 (review)
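
One way tooling could surface this is a heuristic check on a finished run: if retries were requested, the run failed, and no retry attempts were recorded, the controller probably ignored the field. The sketch below is a minimal illustration only; the `run` type and its fields are hypothetical stand-ins and not the Tekton API types, and the heuristic itself is an assumption rather than agreed behaviour.

```go
package main

import "fmt"

// run is a hypothetical, trimmed-down view of a custom task run.
// The field names are illustrative and are not taken from Tekton's API.
type run struct {
	Name             string
	RequestedRetries int  // value of the retries field on the PipelineTask
	Failed           bool // whether the run ended in failure
	RetriesRecorded  int  // number of retry attempts the controller recorded
}

// retriesLikelyIgnored reports whether a failed run asked for retries
// but shows no recorded retry attempts. This is a heuristic, not proof.
func retriesLikelyIgnored(r run) bool {
	return r.RequestedRetries > 0 && r.Failed && r.RetriesRecorded == 0
}

func main() {
	r := run{Name: "custom-task-pipeline-run-4njl8-wait-tasks", RequestedRetries: 2, Failed: true}
	if retriesLikelyIgnored(r) {
		fmt.Printf("warning: %s requested %d retries but none were recorded; the controller may not support retries\n",
			r.Name, r.RequestedRetries)
	}
}
```

A check like this could run in a CLI or dashboard after the run completes; it cannot tell in advance whether a controller supports retries.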

cc @lbernick @jerop

References:

@JeromeJu JeromeJu added the kind/feature Categorizes issue or PR as related to a new feature. label Jul 20, 2022

JeromeJu commented Jul 21, 2022

Cases:

Tested with the experimental/wait-task controller, using apiVersion v1beta1.

  • Succeeded
    • Succeeded after 10s wait time
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
 name: custom-task-pipeline
 namespace: foo
spec:
 tasks:
 - name: wait-tasks
   retries: 2
   taskRef:
     apiVersion: example.dev/v0
     kind: Wait
   params:
     - name: duration
       value: 10s
  • Failed on the 1st attempt, then edited to be a valid pipeline
    • kubectl describe shows the same run status as the 1st failure, with no retries
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
 name: custom-task-pipeline
 namespace: foo
spec:
 tasks:
 - name: wait-tasks
   retries: 2
   taskRef:
     apiVersion: example.dev/v0
     kind: Wait
   params:
     - name: duration
       value: 40                     # Invalid: no unit. Later updated to '40s',
                                     # which should be valid for the 1st retry.
  • Error
  Type     Reason             Age                 From         Message
  ----     ------             ----                ----         -------
  Normal   Started        3m                   PipelineRun  
  Normal   Running        3m                   PipelineRun  Tasks Completed: 0 (Failed: 0, Cancelled 0), Incomplete: 1, Skipped: 0
  Warning  InternalError  2m54s (x11 over 3m)  PipelineRun  1 error occurred:
           * error creating Run called custom-task-pipeline-run-4njl8-wait-tasks for PipelineTask wait-tasks from PipelineRun custom-task-pipeline-run-4njl8: runs.tekton.dev "custom-task-pipeline-run-4njl8-wait-tasks" already exists
  Warning  RunCreationFailed  2m49s (x12 over 3m)  PipelineRun  Failed to create Run "custom-task-pipeline-run-4njl8-wait-tasks": runs.tekton.dev "custom-task-pipeline-run-4njl8-wait-tasks" already exists
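
The "already exists" events above are consistent with the same Run name being derived on every attempt, so each re-creation collides with the Run left over from the first failure. A minimal Go sketch of that failure mode follows. It uses a hypothetical in-memory `store` in place of the Kubernetes API server, and `runName` is an assumed naming scheme inferred from the event messages; none of these names are Tekton APIs.

```go
package main

import "fmt"

// store is a hypothetical stand-in for the API server,
// where object names must be unique within a namespace.
type store map[string]bool

// create fails when an object with the same name already exists,
// mirroring the error text in the events above.
func (s store) create(name string) error {
	if s[name] {
		return fmt.Errorf("runs.tekton.dev %q already exists", name)
	}
	s[name] = true
	return nil
}

// runName derives a Run name from the PipelineRun and PipelineTask names only.
// This naming scheme is an assumption based on the event messages.
func runName(pipelineRun, pipelineTask string) string {
	return pipelineRun + "-" + pipelineTask
}

func main() {
	s := store{}
	name := runName("custom-task-pipeline-run-4njl8", "wait-tasks")
	fmt.Println(s.create(name)) // first attempt creates the Run
	fmt.Println(s.create(name)) // a retry reuses the same name and is rejected
}
```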


JeromeJu commented Jul 21, 2022

TaskRun avoids duplicate names by adding the retry count to the pod name suffix, in pipeline/pkg/pod/pod.go:

     podNameSuffix := "-pod"
     if taskRunRetries := len(taskRun.Status.RetriesStatus); taskRunRetries > 0 {
         podNameSuffix = fmt.Sprintf("%s-retry%d", podNameSuffix, taskRunRetries)
     }
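
The same idea could in principle be applied to Run names so that each retry gets a distinct object name. The sketch below only illustrates the naming technique from the snippet above; `retryName` is a hypothetical helper, and applying it to Runs is an assumption, not something the Tekton controller is confirmed to do.

```go
package main

import "fmt"

// retryName appends a "-retry<N>" suffix once at least one retry has been recorded.
// It mirrors the pod-name logic quoted above; using it for Run names is an assumption.
func retryName(base string, retriesSoFar int) string {
	if retriesSoFar > 0 {
		return fmt.Sprintf("%s-retry%d", base, retriesSoFar)
	}
	return base
}

func main() {
	base := "custom-task-pipeline-run-4njl8-wait-tasks"
	// Attempt 0 is the initial run; attempts 1 and 2 are retries.
	for attempt := 0; attempt <= 2; attempt++ {
		fmt.Println(retryName(base, attempt))
	}
}
```

With distinct names per attempt, creating the object for a retry would no longer collide with the object left by the previous attempt.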

@tekton-robot
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 31, 2022
@tekton-robot
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 30, 2022
@tekton-robot
Collaborator

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

@tekton-robot
Collaborator

@tekton-robot: Closing this issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
