Publish results when task and pipeline runs fail #3749

afrittoli · 2021-02-04T12:33:57Z

Feature request

Results should be available for failed task and pipeline runs.

Today, if a TaskRun or PipelineRun is set to failed, we skip the code that grabs the results and adds them to status.

Use case

Results can be used by a failing task or pipeline to provide more context about why the failed, or how to react to the failure.

One specific use case that I'm trying to implement, is to enrich the GitHub update pipelines in Tekton based CI with the ability to publish a comment that gives the developer more context about the failure. Today that can achieved by adding that comment into the logs, however a comment on GitHub provides a more immediate feedback to developers. Specifically I'd like to improve the job that checks for the "kind label".

That would work as follow:

the CI TaskRun is executed.
if no valid label is found, a message with the list of valid labels is published to a specific result
the TaskRun fails
a cloud event is sent to the tekton-events trigger. In includes the TaskRun status in the payload
the trigger filters the event
a binding extracts the result from the payload and passes it the trigger template
the trigger template runs the github update pipeline, which sets the status and adds the comment

/cc @bobcatfish @vdemeester

The text was updated successfully, but these errors were encountered:

ghost · 2021-02-04T14:00:37Z

I've discussed this before with users in another issue.

Results are required to be populated. Pipelines fail if they're not. If you introduce an "errorMessage" result to a Task then the Task has to populate it whether there was a failure or not. That means:

# Success!
echo "" > $(results.errorMessage.path)

...in any Tasks that want to provide this.

Further, the error message results will be named differently for every single Task. There could be Tasks that have an errorMessage result, an errorContext result, an err result, etc etc. That means tooling (like the dashboard or CLI) won't have a consistent way to look up these messages / extra context and surface them differently from other results.

I'd like to propose an alternative for your consideration: Provide an explicit status.errorResult field (or named anything else) that is explicitly for the purpose of returning context on Task error. Provide a Tekton variable to its path - It can function almost identically to a Result but will be optional for a Task to return. This leaves the semantics of Results alone but provides a well-defined outlet for this kind of information.

There are other alternatives which could be considered too. But I want to make sure that the knock-on effects of supporting Results for failure scenarios are well documented before we decide to support this.

Edit: Here's the previous discussion: #3439
Edit 2: This was also requested here: #3649

ghost · 2021-02-04T14:02:12Z

Another alternative could be to explicitly mark certain results as error-results? Or have a separate section like results just for them?

The check labels job uses the PR pipeline resource to add a comment to the PR with instructions for developers in case of failure. That doesn't work though because in case of failure the sync-up side of the pipeline resource is not processed, so the commit is not written. Fix that by removing the pipeline resource and writing the instructions to the step log instead. The step log is linked in the github status details through the tekton dashboard. In future we might extend the github update service to include support for comments, but that would require support for results on failure: tektoncd/pipeline#3749 Signed-off-by: Andrea Frittoli <andrea.frittoli@uk.ibm.com>

afrittoli · 2021-02-08T16:11:58Z

I've discussed this before with users in another issue.

Results are required to be populated. Pipelines fail if they're not. If you introduce an "errorMessage" result to a Task then the Task has to populate it whether there was a failure or not. That means:
# Success!
echo "" > $(results.errorMessage.path)

This is not what I'm saying, I'm not proposing to have error specific results.
My proposal is to stop ignoring valid results in case of failure, for instance:

# Write a result which is valid for the failure case
echo "This failed. To try again re-run the pipeline" > $(results.githubcomment.path)

# Perform some task
really-useful-task
echo "This was successful" > $(results.githubcomment.path)

Today, if really-useful-task fails, no result is published.

ghost · 2021-02-08T16:28:59Z

My proposal is to stop ignoring valid results in case of failure, for instance:

I understand the distinction but I'm not sure I totally see the difference in outcome. Once we support emitting results on Task failure they will immediately become available for use as "error results". I'm looking at the proposal both in terms of the described use-case but also the other patterns that become available when implementing that proposal.

I guess my question is: why not also consider the alternative of a more official "error result" or "optional result" too as part of the design work?

afrittoli · 2021-02-08T16:46:43Z

I didn't think in terms of error results before, I kind of expected results to be there in case of failure too, and I'm not sure why they are not. In case of a task failure we should provide as much context as available - why skip results then?
I was tempted to file this as bug, but I created it as a feature because the code explicitly skips results in case of failure.

Optional results are certainly an interesting topic, this TEP is related: tektoncd/community#240.
Default values for results may address some case but not when the result value needs to be produced by a step.

ghost · 2021-02-08T16:58:09Z

There's definitely more discussion that could be had around this but one of the original reasons we didn't emit results on failure was because Tekton can't infer whether the result data is valid or not. Given the original purpose of Task Results was for passing along short bits of data to other Tasks (and there was no Finally on the horizon at that point) the design ultimately omitted those bits of data on failure.

bobcatfish · 2021-02-08T21:29:40Z

I didn't think in terms of error results before, I kind of expected results to be there in case of failure too, and I'm not sure why they are not. In case of a task failure we should provide as much context as available - why skip results then?

Note also that if you have a Task that depends on the result of another Task, and the Task producing the result fails, the Task consuming the result will not run.

e.g.:

TaskA
TaskB uses TaskA.results.foo

If TaskA fails or is skipped, TaskB will not run (which is where the optional results come in)

But @afrittoli maybe you're not talking about making the result available to consuming Tasks, you're talking specifically about putting it into the status of the TaskRun?

afrittoli · 2021-02-08T21:33:30Z

I didn't think in terms of error results before, I kind of expected results to be there in case of failure too, and I'm not sure why they are not. In case of a task failure we should provide as much context as available - why skip results then?

Note also that if you have a Task that depends on the result of another Task, and the Task producing the result fails, the Task consuming the result will not run.

e.g.:
TaskA
TaskB uses TaskA.results.foo
If TaskA fails or is skipped, TaskB will not run (which is where the optional results come in)

But @afrittoli maybe you're not talking about making the result available to consuming Tasks, you're talking specifically about putting it into the status of the TaskRun?

Yes, indeed.

The status will stay in etcd, might be gobbled up by chains, stored in results, sent over cloud events, trigger other parts of the workflow, so there is value in having that result even if it won't be read by other tasks.

It may also be used by tasks in finally, once the corresponding TEP is implemented.

ghost · 2021-02-09T14:17:03Z

Thought about this some more the past few days and my resistance has waned. I do think there are downsides to introducing this, and I think we'll see them quickly picked up just for error message delivery, but I can see that the practical utility might outweigh those issues? Paging @pritidesai here too for any thoughts or feedback.

If we get to a point where we see lots of Tasks added to the Catalog which utilize results to specifically emit error messages on failure I would consider that a point to introduce something like an "error result". UIs and tools could then probe and surface those messages differently to regular results. At that point though the Tekton cat might be out of the bag and users would never use them? I'm not really sure.

bobcatfish · 2021-02-09T17:00:24Z

If we're talking about ONLY making these results available in the TaskRun status then I think the behavior sounds reasonable. (i.e. the TaskRun is failling, but previous steps - or even the failing step maybe? - have successfully written results to their termination messages, so we might as well expose them in the status). If we're talking about making the results available to other Tasks in a Pipeline then it feels like a bigger feature.

I don't 100% see how that serves your use case - I agree with Scott that if the motivation is to expose info about why the TaskRun failed, results dont seem like the right way to do that, and the best answer there might even be to make it easier to surface related logs.

pritidesai · 2021-02-10T19:30:06Z

If we're talking about ONLY making these results available in the TaskRun status then I think the behavior sounds reasonable.

It does sound reasonable but would not justify.

If the results are part of the status, the users would like to consume those results in finally or make them part of pipeline results.

expose info about why the TaskRun failed, results dont seem like the right way to do that

Yup, I agree. Results are mainly produced to be consumed by other tasks in the pipeline. We do not have any other means of communicating from a step to the users. We could introduce a step level output steps.<step-name>.output or similar and make it part of the status to make it available to other workflows/APIs.

bobcatfish · 2021-02-16T21:50:36Z

If the results are part of the status, the users would like to consume those results in finally or make them part of pipeline results.

I think it depends on who is consuming this information - if you are consuming it within the pipeline, you'll need some mechanism to provide the info via a variable (and allow for the failure of hte task :O ); but if it's a system built on top of Tekton (e.g. a dashboard) then making the results available in the status would be enough.

tekton-robot · 2021-05-17T22:01:08Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

ppitonak · 2021-05-21T15:25:30Z

/remove-lifecycle stale

It would make sense to at least allow publishing pipeline results that are not depending on results of a failing task, the minimal reproducer is below. In my real-world use-case, I have a pipeline with task A publishing a result, task B using task A's result, pipeline publishes a result based on task A result. When task A succeeds and task B fails, pipeline doesn't publish result.

apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: hello-pipeline
spec:
  tasks:
  - name: hello
    taskSpec:
      results:
      - name: greeting
      steps:
      - name: greet
        image: registry.access.redhat.com/ubi8/ubi-minimal
        script: |
          #!/usr/bin/env bash
          set -e
          echo -n "Hello World!" | tee $(results.greeting.path)
          exit 1
  results:
  - name: myresult
    value: absolutely unrelated to hello task

afrittoli · 2021-09-16T08:53:40Z

Yup, I agree. Results are mainly produced to be consumed by other tasks in the pipeline.

Uhm, I kind of disagree on this statement.
Results are the outputs of taskruns and pipelines.

Results of taskruns, when running within a pipeline are likely to be used by other tasks.
They are also going to be in attestations, they are written in the status and this accessible to other controllers watching the taskrun and they are sent as part of cloudevents for remote systems to use.

Results of standalone taskruns and pipelineruns are mainly meant for other pipelines or systems to use, whether is displaying them to the user or using them as input of the next part of the workflow.

afrittoli · 2021-09-16T09:12:07Z

I was re-reading the whole thread, and I think there are three different things describe in the thread:

taskruns emitting results in case of failure. We do not today because we don't know if the results are valid. We could change this to emitting them (relatively small code change), we would be telling our users - this is what we go from the task, but since the task fail, it may be or may be not valid.
pipelinerun emitting results in case of failure. Pipelineruns are different because they basically only relay results. They may do some string concatenation but that's about it. We could at least change the current behaviour to emit results that depend on tasks that did not fail. If we implemented (1), then we could emit all results regardless.
error specific result. One option here would be to pass the stderr of the failed step in a result, but that might be large and/or not helpful. We could allow task authors to specify the error, but that would be a lot more complex to implement since we would need to provide authors with a facility to create those results in case of failure, kind of a "finally step" or so. And it would require adding this facility to all existing catalog tasks.

Implementing (2) alone would solve @ppitonak issue and would be easy to implement.
We could include a fixed string (i.e. NOT_AVAILABLE) in results that are not available, but if we did that we would need then to user no breaking changes on it.

dalbar · 2021-09-17T09:52:25Z

Yup, I agree. Results are mainly produced to be consumed by other tasks in the pipeline.

Uhm, I kind of disagree on this statement.
Results are the outputs of taskruns and pipelines.

Results of taskruns, when running within a pipeline are likely to be used by other tasks.
They are also going to be in attestations, they are written in the status and this accessible to other controllers watching the taskrun and they are sent as part of cloudevents for remote systems to use.

Results of standalone taskruns and pipelineruns are mainly meant for other pipelines or systems to use, whether is displaying them to the user or using them as input of the next part of the workflow.

I am in a similar situation where the system is build on top of Tekton and its controllers are collecting TaskRun results. So far we are only collecting results for successful outcomes. However, a use-case for error messages has been raised where we would like to give our users a hint for the failure. For example our git-wrapper can report missing credentials within a TaskRun. We would like to collect the errors which is currently not possible in a straightforward way.

I was re-reading the whole thread, and I think there are three different things describe in the thread:

taskruns emitting results in case of failure. We do not today because we don't know if the results are valid. We could change this to emitting them (relatively small code change), we would be telling our users - this is what we go from the task, but since the task fail, it may be or may be not valid.

pipelinerun emitting results in case of failure. Pipelineruns are different because they basically only relay results. They may do some string concatenation but that's about it. We could at least change the current behaviour to emit results that depend on tasks that did not fail. If we implemented (1), then we could emit all results regardless.

error specific result. One option here would be to pass the stderr of the failed step in a result, but that might be large and/or not helpful. We could allow task authors to specify the error, but that would be a lot more complex to implement since we would need to provide authors with a facility to create those results in case of failure, kind of a "finally step" or so. And it would require adding this facility to all existing catalog tasks.

Implementing (2) alone would solve @ppitonak issue and would be easy to implement.
We could include a fixed string (i.e. NOT_AVAILABLE) in results that are not available, but if we did that we would need then to user no breaking changes on it.

The first option would be sufficient for the described use-case and I wouldn't mind contributing to Tekton.

pritidesai · 2021-10-05T22:05:55Z

I have opened a discussion addressing pipeline results with a failed pipelineRun.

tekton-robot · 2022-01-03T22:35:27Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

tekton-robot · 2022-02-02T22:48:28Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

pritidesai · 2022-04-28T19:02:26Z

@Yablargo onError might be beneficial for this use case. Let me know your thoughts?

S4lt5 · 2022-04-28T19:37:42Z

Gotcha, Producing task result with onError might actually do the trick

wzhanw · 2022-05-09T05:42:14Z

@pritidesai applying onError: continue will change the pipelinerun and taskrun from failed to Succeeded.
In our use case, we don't want that, we only need the task results available.

Publish Pipeline Results from successful TaskRuns in Failed PipelineRuns: tektoncd#3749 Prior to this change a failed PipelineRun does not emit any PipelineResult even when some PipelineTasks referenced by the PipelineResult succeeded. This commit changes to publish Pipeline Results only from successful PipelineTaskRuns in the failed PipelineRun Such Results are valid because they are generated by successful TaskRuns. They can provide more context and successful artifacts to users of the partially successful PipelinRun.

danielfbm · 2022-08-30T03:01:37Z

I was re-reading the whole thread, and I think there are three different things describe in the thread:

taskruns emitting results in case of failure. We do not today because we don't know if the results are valid. We could change this to emitting them (relatively small code change), we would be telling our users - this is what we go from the task, but since the task fail, it may be or may be not valid.

pipelinerun emitting results in case of failure. Pipelineruns are different because they basically only relay results. They may do some string concatenation but that's about it. We could at least change the current behaviour to emit results that depend on tasks that did not fail. If we implemented (1), then we could emit all results regardless.

error specific result. One option here would be to pass the stderr of the failed step in a result, but that might be large and/or not helpful. We could allow task authors to specify the error, but that would be a lot more complex to implement since we would need to provide authors with a facility to create those results in case of failure, kind of a "finally step" or so. And it would require adding this facility to all existing catalog tasks.

Implementing (2) alone would solve @ppitonak issue and would be easy to implement. We could include a fixed string (i.e. NOT_AVAILABLE) in results that are not available, but if we did that we would need then to user no breaking changes on it.

+1 for emitting taskrun results despite of the status. In our case we want to do execute a task, collect data, and implement validation over the collected data. Although the task failed we really would like to emit the results for clarity of context and display. Currently it is possible to parse the container exit message and achieve the same result but if 1 is implemented this should be easier to integrate

This commit aims to fix tektoncd#6383, tektoncd#6139 and supports tektoncd#3749. This commits skip the validation if the taskrun is not successful (e.g. failed or skipped) and omit the pipelinerun coresponding result. This way the skipped taskrun won't get validation error for pipelinerun. And the pipelinerun error won't be overwritten by the validation error. Signed-off-by: Yongxuan Zhang yongxuanzhang@google.com

This commit aims to fix #6383, #6139 and supports #3749. This commits skip the validation if the taskrun is not successful (e.g. failed or skipped) and omit the pipelinerun coresponding result. This way the skipped taskrun won't get validation error for pipelinerun. And the pipelinerun error won't be overwritten by the validation error. Signed-off-by: Yongxuan Zhang yongxuanzhang@google.com

This commit aims to fix tektoncd#6383, tektoncd#6139 and supports tektoncd#3749. This commits skip the validation if the taskrun is not successful (e.g. failed or skipped) and omit the pipelinerun coresponding result. This way the skipped taskrun won't get validation error for pipelinerun. And the pipelinerun error won't be overwritten by the validation error. Signed-off-by: Yongxuan Zhang yongxuanzhang@google.com

This commit aims to fix #6383, #6139 and supports #3749. This commits skip the validation if the taskrun is not successful (e.g. failed or skipped) and omit the pipelinerun coresponding result. This way the skipped taskrun won't get validation error for pipelinerun. And the pipelinerun error won't be overwritten by the validation error. Signed-off-by: Yongxuan Zhang yongxuanzhang@google.com

lcarva · 2023-09-11T18:52:14Z

#5749 which has been implemented via #6510 seems to achieve the desired state. Should this issue be closed, or is this potentially about something else?

pritidesai · 2023-09-12T17:03:01Z

Hey @lcarva We do not have a test/example to support the use of task results from a failed task in pipeline results. At least I can not recollect seeing it and remember that as something pending.

afrittoli added the kind/feature Categorizes issue or PR as related to a new feature. label Feb 4, 2021

afrittoli mentioned this issue Feb 5, 2021

Fix the check labels job tektoncd/plumbing#712

Merged

2 tasks

afrittoli changed the title ~~Results for failed task and pipeline runs~~ Publish results when task and pipeline runs fail Feb 8, 2021

tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 17, 2021

tekton-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 21, 2021

pritidesai mentioned this issue Jul 27, 2021

Finally task is skipped when dependent on results from a failed task #4118

Closed

afrittoli mentioned this issue Oct 28, 2021

Automate Pull Request Checklist - Check Tasks Completed tektoncd/plumbing#911

Merged

2 tasks

tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 3, 2022

This was referenced Jun 30, 2022

Publish Pipeline Results from successful TaskRuns in Failed PipelineRuns #5060

Merged

Publish Results produced by failed TaskRun #5064

Closed

wlynch mentioned this issue Jul 7, 2022

Chains creates attestation for non-successful TaskRuns tektoncd/chains#484

Open

QuanZhang-William mentioned this issue Aug 29, 2022

[TEP-0050] Add proposed design details tektoncd/community#785

Merged

gee4vee mentioned this issue Sep 1, 2022

feat: refactor alert task IBM/devex-skit-monitor-toolchain#53

Merged

xchapter7x added this to Tekton Community Roadmap Sep 20, 2022

xchapter7x moved this to Todo in Tekton Community Roadmap Sep 20, 2022

tlawrie mentioned this issue Sep 20, 2022

Output parameters not shown in case of task failure boomerang-io/community#363

Open

jerop mentioned this issue Feb 2, 2023

able to get task results from failed tasks #5749

Closed

Yongxuanzhang mentioned this issue Mar 17, 2023

divide ApplyTaskResultsToPipelineResults tests into success and error #6388

Merged

7 tasks

Yongxuanzhang mentioned this issue Mar 20, 2023

don't return validation error when taskrun failed/skipped #6395

Merged

7 tasks

QuanZhang-William mentioned this issue Dec 7, 2023

Ignore Task Failure - promotion to beta tracking issues #7469

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Publish results when task and pipeline runs fail #3749

Publish results when task and pipeline runs fail #3749

afrittoli commented Feb 4, 2021

ghost commented Feb 4, 2021 •

edited by ghost

Loading

ghost commented Feb 4, 2021

afrittoli commented Feb 8, 2021 •

edited

Loading

ghost commented Feb 8, 2021

afrittoli commented Feb 8, 2021

ghost commented Feb 8, 2021

bobcatfish commented Feb 8, 2021

afrittoli commented Feb 8, 2021

ghost commented Feb 9, 2021 •

edited by ghost

Loading

bobcatfish commented Feb 9, 2021

pritidesai commented Feb 10, 2021

bobcatfish commented Feb 16, 2021

tekton-robot commented May 17, 2021

ppitonak commented May 21, 2021

afrittoli commented Sep 16, 2021 •

edited

Loading

afrittoli commented Sep 16, 2021

dalbar commented Sep 17, 2021

pritidesai commented Oct 5, 2021

tekton-robot commented Jan 3, 2022

tekton-robot commented Feb 2, 2022

pritidesai commented Apr 28, 2022

S4lt5 commented Apr 28, 2022

wzhanw commented May 9, 2022

danielfbm commented Aug 30, 2022

lcarva commented Sep 11, 2023

pritidesai commented Sep 12, 2023

Publish results when task and pipeline runs fail #3749

Publish results when task and pipeline runs fail #3749

Comments

afrittoli commented Feb 4, 2021

Feature request

Use case

ghost commented Feb 4, 2021 • edited by ghost Loading

ghost commented Feb 4, 2021

afrittoli commented Feb 8, 2021 • edited Loading

ghost commented Feb 8, 2021

afrittoli commented Feb 8, 2021

ghost commented Feb 8, 2021

bobcatfish commented Feb 8, 2021

afrittoli commented Feb 8, 2021

ghost commented Feb 9, 2021 • edited by ghost Loading

bobcatfish commented Feb 9, 2021

pritidesai commented Feb 10, 2021

bobcatfish commented Feb 16, 2021

tekton-robot commented May 17, 2021

ppitonak commented May 21, 2021

afrittoli commented Sep 16, 2021 • edited Loading

afrittoli commented Sep 16, 2021

dalbar commented Sep 17, 2021

pritidesai commented Oct 5, 2021

tekton-robot commented Jan 3, 2022

tekton-robot commented Feb 2, 2022

pritidesai commented Apr 28, 2022

S4lt5 commented Apr 28, 2022

wzhanw commented May 9, 2022

danielfbm commented Aug 30, 2022

lcarva commented Sep 11, 2023

pritidesai commented Sep 12, 2023

ghost commented Feb 4, 2021 •

edited by ghost

Loading

afrittoli commented Feb 8, 2021 •

edited

Loading

ghost commented Feb 9, 2021 •

edited by ghost

Loading

afrittoli commented Sep 16, 2021 •

edited

Loading