Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove active option of tasks? #1064

Open
BoPeng opened this issue Sep 23, 2018 · 8 comments
Open

Remove active option of tasks? #1064

BoPeng opened this issue Sep 23, 2018 · 8 comments

Comments

@BoPeng
Copy link
Contributor

BoPeng commented Sep 23, 2018

There is an active option of task. The usage is generally as demontrated in the example

!echo "hello" > a.txt
task: active=not path('a.txt').exists()
sh:
    echo "Task executed"

or

task: active=-1
    echo "Only the last substep"

However, both usages can be replaced by the stop_if action

!echo "hello" > a.txt

stop_if(path('a.txt').exists())
task:
sh:
    echo "Task executed"

with an important difference.

For example,

input: for_each={'i': range(5)}
output: f'test_{i}.txt'

task: active=1
sh: expand=True
 touch {_output}

will generate the following

$ sos run test
INFO: Running default:
INFO: a647808578f06138 started
INFO: a647808578f06138 completed
INFO: output:   test_0.txt, test_1.txt, ... (5 items)
ERROR: [default]: Output target test_0.txt does not exist after the completion of step default

because the step expects output test_0.txt, test_1.txt etc but only test_1.txt will be generated by the task.

However, the following works:

input: for_each={'i': range(5)}
output: f'test_{i}.txt'

stop_if(_index != 1)
task: 
sh: expand=True
 touch {_output}

because action stop_if will actually allow sos to discard the entire substep and associated _output.

There are a few solutions to improve the situation:

  1. Remove active and use stop_if to avoid duplicated feature.
  2. Move active to input so that the option can be processed before _output is generated.
  3. Enhance active to discard _output as well. However,

Similar to the task option, we also have an active option for actions, which was designed for something like

input: multiple_input, group_by=1
sh:
do_something
report: active=-1
  generate a report for the entire step

So in this case, allowing active=-1 to discard _output of all other substeps does not make sense.

In summary, active is currently a questionable option that is error prone, has function overlapping with other mechanisms, and should be removed before it can be implemented better.

BoPeng pushed a commit that referenced this issue Sep 23, 2018
@gaow
Copy link
Member

gaow commented Sep 23, 2018

Good move! But I did not realize we can do it for input? I have tried your example:

[1]
input: for_each={'i': range(5)}, active=1
output: f'test_{i}.txt'
sh: expand=True
 touch {_output}

Got this using sos-0.16.11, though

ERROR: [default_1]: Failed to process input statement for_each={'i': range(5)}, active=1
: Unrecognized input option active

@BoPeng
Copy link
Contributor Author

BoPeng commented Sep 23, 2018

Sorry, I mixed what I planned and what is actually implemented. I will update the ticket.

@gaow
Copy link
Member

gaow commented Sep 23, 2018

Okay am trying to dig up related ticket:

#206

seems like we were once attempted that. I'm trying to find the ticket that motivated this feature in the first place and try to see if it was something we needed but cannot be implemented easily using existing features ... do you have an impression?

@BoPeng
Copy link
Contributor Author

BoPeng commented Sep 23, 2018

active was introduced to allow the writing of

if _index == 1:
  sh(script)

as

sh: active=1
  script

because the former does not allow the use of script format. It was later extended to task to allow controlling entire tasks.

My major concern is

input: for_each={'i': range(5)}
output: f'test_{i}.txt'

task: active=1
sh: expand=True
 touch {_output}

because it causes a problem that is not easy to fix, and I am not particularly happy with either of the solutions.

@gaow
Copy link
Member

gaow commented Sep 23, 2018

Ahh sure, I actually use this a lot:

if _index == 1:
  sh(script)

(yes, without script it is not very convenient. That's why I often wrap it with sos_run).

So is this usercase the only reason active now exists? If so, we might instead try to think of how to make this case easier, then we get rid of active

@BoPeng
Copy link
Contributor Author

BoPeng commented Oct 30, 2018

My point was that task is always the last part of the substep, and there is no syntax to wrap it in if or for. Therefore,

task: active =1

can be achieved by

stop_if(_index!=1)
task:

and the latter is safer because it removes _output of _index==1 from step_output.

Basically, task is usually the main part of the substep that produces useful _output, Cancelling it with active= without removing _output from step_output makes this option more harmful than useful.

@gaow
Copy link
Member

gaow commented Oct 30, 2018

Okay, but I cannot do:

stop_if(_index!=1)
R: ...

right? I'd have no hesitation to remove active if i can do this. But I guess we want to remove it anyways because as you pointed out it is more harmful than useful, given all things considered.

@BoPeng
Copy link
Contributor Author

BoPeng commented Oct 30, 2018

Nothing stops you from doing

stop_if(_index!=1)
R: ...

but in this case you might not want the side effect of removing _output from step_output.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants