cli: workflow continue #202

Open
tiborsimko opened this issue Nov 9, 2018 · 3 comments

Comments

@tiborsimko
Member

Use case scenario

A user submits a serial workflow consisting of, say, seven steps, and workflow run number ten fails at "step four" with an exception caused by an error in the analysis code. The user spots the problem in the big myfilter.py code that affects the execution of step four, inspects the temporary results to make sure that the previous steps are not affected, uploads the corrected version of myfilter.py, and would like to resume the execution from the last known good state, i.e. from "step three".

Notes

Note that we cannot rely on the job cache to skip the previous steps, because the same big file myfilter.py is used by those steps as well, so changing it invalidates their cached results.
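To illustrate the note above, here is a minimal sketch of why a content-based job cache would miss for the earlier steps. It assumes a cache key computed from the command plus the contents of every input file; the `cache_key` function is hypothetical and is not REANA's actual cache implementation.

```python
import hashlib
from typing import Dict


def cache_key(command: str, input_files: Dict[str, bytes]) -> str:
    """Hypothetical cache key: the command plus the contents of all inputs."""
    digest = hashlib.sha256()
    digest.update(command.encode())
    for name in sorted(input_files):
        digest.update(name.encode())
        digest.update(hashlib.sha256(input_files[name]).digest())
    return digest.hexdigest()


# Only the step-four logic changes, but the whole file is an input of step one.
old_code = b"def step4(x): return 1 / x\n"
new_code = b"def step4(x): return 1 / x if x else 0\n"
step1_command = "python myfilter.py --step1 --param P1 V1"

key_before = cache_key(step1_command, {"myfilter.py": old_code})
key_after = cache_key(step1_command, {"myfilter.py": new_code})

# The keys differ, so step one would be recomputed instead of served from cache.
cache_hit = key_before == key_after
print(cache_hit)
```

Under this assumption, any edit to myfilter.py changes the key of every step that uses the file, which is why an explicit "continue" mechanism is needed.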

Implementation

We should decide whether we shall:

  • reuse the same run number 10 and simply continue as if nothing happened; this is akin to what users would typically do locally when developing workflows;

  • create a new run number 11 using new workspace directory and hard-copy or symlink files from run number 10's workspace to seed it;

  • create a new run number 11 using the same workspace directory as run number 10; this would require creating a DB table mapping run-number <--> workdir;

Note that the functionality from the last option may come in handy later for sharing workspace directories amongst several people.
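As a rough sketch of the third option, the run-number <--> workdir mapping could look like the following. The table name `run_workdir`, its columns, the helper functions and the workspace path are all hypothetical, and an in-memory SQLite database stands in for the real DB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE run_workdir ("
    " workflow TEXT NOT NULL,"
    " run_number INTEGER NOT NULL,"
    " workdir TEXT NOT NULL,"
    " PRIMARY KEY (workflow, run_number))"
)


def register_run(workflow, run_number, workdir):
    """Record which workspace directory a given run uses."""
    conn.execute(
        "INSERT INTO run_workdir VALUES (?, ?, ?)",
        (workflow, run_number, workdir),
    )


def workdir_of(workflow, run_number):
    """Return the workspace directory of a run, or None if unknown."""
    row = conn.execute(
        "SELECT workdir FROM run_workdir WHERE workflow = ? AND run_number = ?",
        (workflow, run_number),
    ).fetchone()
    return row[0] if row else None


# Run 10 gets its own workspace; run 11 is created on top of the same directory.
register_run("workflow", 10, "/workspaces/workflow/10")
register_run("workflow", 11, workdir_of("workflow", 10))
print(workdir_of("workflow", 11))
```

Because several runs may point at the same directory, the same table could later back shared workspaces as well.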

CLI

A hypothetical usage scenario:

$ wget http://example.org/mymeasurement/mydata.csv
$ vim myfilter.py
$ vim myplot.py
$ vim reana.yaml
$ reana-client create
$ reana-client upload ./myfilter.py ./myplot.py ./mydata.csv
$ reana-client start
[Step 1 of 7] python myfilter.py --step1 --param P1 V1
OK myfiltereddata1.csv created
[Step 2 of 7] python myfilter.py --step2 --param P2 V2
OK myfiltereddata2.csv created
[Step 3 of 7] python myfilter.py --step3 --param P3 V3
OK myfiltereddata3.csv created
[Step 4 of 7] python myfilter.py --step4 --param P4 V4
[ERROR] ZeroDivisionError: division by zero
$ reana-client list
$ reana-client download myfiltereddata3.csv
$ less myfiltereddata3.csv
$ vim myfilter.py
$ reana-client upload myfilter.py
$ reana-client continue --from step 3

Note that the continue command could instead be called resume or restart, or perhaps even become an option of the already existing start command.
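The resume semantics of the scenario above can be sketched as a toy serial runner. This assumes that "--from step 3" means "treat step three as the last known good state and re-execute from step four onwards"; `run_serial`, `last_good` and `make_step` are hypothetical names, not part of reana-client.

```python
def run_serial(steps, last_good=0):
    """Run steps in order, skipping those up to and including `last_good`.

    Returns the names of the steps that were actually executed.
    """
    executed = []
    total = len(steps)
    for number, (name, action) in enumerate(steps, start=1):
        if number <= last_good:
            continue  # outputs are assumed to still be in the workspace
        print(f"[Step {number} of {total}] {name}")
        action()
        executed.append(name)
    return executed


state = {"fixed": False}  # stands in for the corrected myfilter.py


def make_step(n):
    def action():
        if n == 4 and not state["fixed"]:
            raise ZeroDivisionError("division by zero")
    return (f"step{n}", action)


steps = [make_step(n) for n in range(1, 8)]

failed = False
try:
    run_serial(steps)  # first attempt fails at step four
except ZeroDivisionError:
    failed = True

state["fixed"] = True  # the user uploads the fix
resumed = run_serial(steps, last_good=3)  # steps one to three are not re-run
```

The sketch only works if the outputs of the skipped steps are still present, which is what reusing the workspace is meant to guarantee.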

@diegodelemos
Member

As discussed live, we will opt for the first approach, using a single workflow run with a single workspace. This will lead to the creation of a new "dirty" state, since users should be able to change a stopped/failed workflow and continue:

$ reana-client run
$ # Workflow fails
$ reana-client status
NAME       RUN_NUMBER   CREATED               STATUS    PROGRESS
workflow   4            2018-11-21T08:03:02   failed    1/1
$ # user modifies the spec or files
$ reana-client status
NAME       RUN_NUMBER   CREATED               STATUS    PROGRESS
workflow   4            2018-11-21T08:03:02   dirty     1/1
$ reana-client continue
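One possible way to derive the "dirty" status shown above is to compare file checksums against a snapshot taken when the run failed. This is only a sketch under that assumption; the `snapshot` and `status` helpers are hypothetical, and the thread does not say how the state would actually be detected.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(workspace):
    """Map each file in the workspace to the SHA-256 of its contents."""
    return {
        str(path.relative_to(workspace)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(workspace.rglob("*"))
        if path.is_file()
    }


def status(recorded_status, recorded_snapshot, workspace):
    """Report "dirty" when a failed/stopped run's workspace has changed."""
    changed = snapshot(workspace) != recorded_snapshot
    if recorded_status in ("failed", "stopped") and changed:
        return "dirty"
    return recorded_status


with tempfile.TemporaryDirectory() as tmp:
    ws = Path(tmp)
    (ws / "myfilter.py").write_text("print(1 / 0)\n")
    at_failure = snapshot(ws)  # taken when the run fails

    status_before = status("failed", at_failure, ws)
    (ws / "myfilter.py").write_text("print(1)\n")  # user fixes the code
    status_after = status("failed", at_failure, ws)

print(status_before, status_after)
```

A spec change could be detected in the same way by including the workflow specification in the snapshot.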

@tiborsimko
Member Author

Note that incremental development is possible via the new CWL "target" option. We are going to add this option to Serial and Yadage as part of the Live-Mode-Plus improvements. The "dirty" status can be addressed then.

@tiborsimko
Member Author

The aims behind this issue can be achieved by allowing users to rerun the same workflow on the same workspace. (Useful for debugging code until it is ready without having to upload files etc.)
