Orquesta task transition on failure for parallel tasks will not join #4810

Closed
MichaelMcClure opened this issue Nov 4, 2019 · 6 comments

Comments

@MichaelMcClure

SUMMARY

When tasks run in parallel in Orquesta (a forked workflow), the parallel branches do not transition to the barrier task (the join task) if one of the parallel tasks defines a task transition based on task failure.
Also working with @jdmeyer3 on this.

STACKSTORM VERSION

Output of st2 --version:

st2 3.1.0, on Python 2.7.5

OS, environment, install method

ENV: Docker
OS: CentOS 7.6

Steps to reproduce the problem

With the workflow defined as follows:

---
version: '1.0'
input:
  - list1
  - a
vars:
  - p_task_1: unknown
output:
  - p_task_1: '{{ ctx().p_task_1 }}'
tasks:
  setup_task:
    action: core.noop
    # Run tasks in parallel
    next:
      - do: parallel_task_1, parallel_task_2

  parallel_task_1:
    action: core.noop
    next:
      - when: "{{ succeeded() }}"
        publish: p_task_1="succeeded"
        do: barrier_task
      - when: "{{ failed() }}"
        publish: p_task_1="failed"
        #do: barrier_task
      - when: "{{ completed() }}"
        publish: p_task_1="completed"
        do: barrier_task

  parallel_task_2:
    action: core.noop
    next:
      - do: barrier_task

  barrier_task:
    join: all
    action: core.noop

And the workflow action defined as follows:

---
name: test_fork_o
pack: modem_clone
runner_type: orquesta
entry_point: workflows/test_fork_o.yaml
enabled: true
parameters:
  list1:
    type: array
    description: "['one','two']"
    required: true
  a:
    type: string
    description: ""
    required: true

1. Execute the workflow. It will execute tasks: setup_task, parallel_task_1, parallel_task_2, and barrier_task. (This is expected and correct, not a defect.)

2. Uncomment line 27 (#do: barrier_task) to include a task transition in the case of failure.

3. Execute the workflow.

Expected Results

I expect the workflow to execute all the same tasks, including barrier_task.

Actual Results

It executes setup_task, parallel_task_1, and parallel_task_2, then stops. It does not execute barrier_task.

Attached are the workflows and the log output at INFO level. The first block is the success case; the second block is the failure case.
task_transition_bug.txt

@m4dcoder
Contributor

m4dcoder commented Nov 4, 2019

In the current Orquesta implementation, barrier_task waits on all incoming branches. That means it waits on every when transition branching out from parallel_task_1.

@m4dcoder
Contributor

m4dcoder commented Nov 4, 2019

Please see previous documented issue at StackStorm/orquesta#120. There is a workaround that you can try.

@m4dcoder
Contributor

m4dcoder commented Nov 4, 2019

In case the workaround is not clear: instead of using join: all, use join: 2, where 2 is the number of tasks that will transition into the barrier task.
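As a minimal sketch of that workaround, applied to the barrier_task from the workflow in this issue (only the join value changes; the rest of the workflow is assumed to stay as posted):

```yaml
tasks:
  barrier_task:
    # Workaround: wait for a fixed number of inbound tasks (here 2,
    # parallel_task_1 and parallel_task_2) instead of join: all.
    join: 2
    action: core.noop
```

The value 2 is specific to this workflow. A workflow with a different number of tasks transitioning into the barrier would need a different count.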

@MichaelMcClure
Author

@m4dcoder Ah. In concrete terms, you are saying it is waiting on all defined incoming branches: parallel_task_1 has 2 branches (one do is commented out), plus one branch from parallel_task_2, for a total of 3.
Correct?
That seems to indicate that, when the workflow does run fully, 3 paths reached barrier_task, as specified by "join: all".
Given parallel_task_1:

  parallel_task_1:
    action: core.noop
    next:
      - when: "{{ succeeded() }}"
        publish: p_task_1="succeeded"
        do: barrier_task
      - when: "{{ failed() }}"
        publish: p_task_1="failed"
        #do: barrier_task
      - when: "{{ completed() }}"
        publish: p_task_1="completed"
        do: barrier_task

That means the above task had to succeed AND complete? Is that how tasks work (succeed --> complete and fail --> complete)? (I cannot find a state machine diagram for tasks).

If so, I do understand that part.

Regarding StackStorm/orquesta#120: I agree with that one. That would be useful.

@m4dcoder
Contributor

m4dcoder commented Nov 5, 2019

@MichaelMcClure Yes, that's how the current join: all works. It is based on the incoming branches that are defined. In the current implementation, the workflow engine does not look at the context of the incoming branches. We may change that in the future. But for now, and for your use case, where you know the number of tasks that need to transition to the barrier task, you can simply use join: 2 instead to work around the current limitation.
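To make the counting concrete, here is an annotated sketch of the failing case from this issue, with the failure transition uncommented. The numbering in the comments is an illustration of the behavior described in this thread, not output from the engine, and the publish lines are omitted for brevity:

```yaml
tasks:
  parallel_task_1:
    action: core.noop
    next:
      - when: "{{ succeeded() }}"
        do: barrier_task   # defined inbound transition 1
      - when: "{{ failed() }}"
        do: barrier_task   # defined inbound transition 2 (not taken when core.noop succeeds)
      - when: "{{ completed() }}"
        do: barrier_task   # defined inbound transition 3

  parallel_task_2:
    action: core.noop
    next:
      - do: barrier_task   # defined inbound transition 4

  barrier_task:
    # join: all is based on every defined inbound transition, so the
    # failed() branch above is presumably what keeps the join waiting.
    join: all
    action: core.noop
```

With the failure transition commented out, only three inbound transitions are defined and all of them are taken on success, which matches the run that reached barrier_task.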

@m4dcoder
Contributor

m4dcoder commented Nov 5, 2019

I will be closing this issue since this is already tracked in the orquesta repo.

@m4dcoder m4dcoder closed this as completed Nov 5, 2019