Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement named flows, and improve task logging. #4287

Closed
wants to merge 1 commit into from

Conversation

hjoliver
Copy link
Member

@hjoliver hjoliver commented Jul 2, 2021

This has been discussed, but we don't seem to have an issue up.

Motivation: while investigating "pre-initial dependency ignore" for individual flows (viewed as essential by @dpmatthews) I realized several problems with the current automatic flow-label management ... which makes user-named flows more than just a nice-to-have. Currently we use a set pool of ascii characters for flow names: users don't have to do anything, just use the given names, and merged flows can be labeled by simply concatenating characters rather than dealing with sets of multiple separate names. BUT it turns out:

  • you can't automatically prune flows that (seem to have) taken over the whole graph, because they might not have taken over, in which case pruning them can result in "conditional reflow" problems [UPDATE I'm no longer sure this is true, but see next point anyway]
  • and even if you could prune them, you can't return pruned labels for re-use because flow labels have to be unique for the life of the workflow (unless you can guarantee the same-named flows won't overlap ... or you want them to overlap), which means a finite pool of automatic names can run out. (And non-finite methods require sets for merged flows ... so let's go straight to user-named flows already).

named flows

To trigger a new flow you have to give it a name:

$ cylc trigger --flow=warm1 <workflow> foo.2021  # trigger a new flow

Otherwise, it's a one-off trigger (no reflow):

$ cylc trigger <workflow> foo.2021  # trigger task only

The original flow is called original. Merged flows are represented by the set of constituent flow names. E.g. if a warm1 task catches up with its original counterpart, the resulting merged task will be labelled with flows = {"original", "warm1"} and downstream spawning is considered to belong to both flows (for conditional reflow prevention, for example).

To stop a flow:

$  cylc stop --flow=warm1 <workflow>

This removes the warm1 flow label from all task proxies. A task with no flow labels won't spawn children, which stops the flow. (Flows can also just peter out, e.g. if they affect only a sub-graph off the side of the main graph).

Flow labels are task proxy attributes (also in the datastore). At the moment, to see them you have to use cylc dump --flows or watch the scheduler log.

logging changes

This sort of change always results in a lot of log-following to analyse why functional tests broke. This time the annoyingness of the current logging mess got the better of me so I made some improvements. All task state (and state attribute) changes are now logged automatically (via the state reset method) and consistently, e.g.:

$ cylc cat-log demo/run1 | grep foo
2021-07-02T16:36:35+12:00 INFO - [foo.1 waiting(runahead) job:00] => waiting
2021-07-02T16:36:35+12:00 INFO - [foo.1 waiting job:00] => waiting(queued)
2021-07-02T16:36:35+12:00 INFO - [foo.1 waiting(queued) job:00] => waiting
2021-07-02T16:36:35+12:00 INFO - [foo.1 waiting job:00] => preparing
2021-07-02T16:36:35+12:00 INFO - [foo.1 preparing job:01] host=niwa-1007823l
2021-07-02T16:36:35+12:00 INFO - [foo.1] -triggered off []
2021-07-02T16:36:36+12:00 INFO - [foo.1 preparing job:01] (internal)submitted at 2021-07-02T16:36:36+12:00
2021-07-02T16:36:36+12:00 INFO - [foo.1 preparing job:01] submitted to localhost:background[4953]
2021-07-02T16:36:36+12:00 INFO - [foo.1 preparing job:01] => submitted
2021-07-02T16:36:36+12:00 INFO - [foo.1 submitted job:01] health: submission timeout=None, polling intervals=PT15M,...
2021-07-02T16:36:37+12:00 INFO - [foo.1 submitted job:01] (received)started at 2021-07-02T16:36:36+12:00
2021-07-02T16:36:37+12:00 INFO - [foo.1 submitted job:01] => running
2021-07-02T16:36:37+12:00 INFO - [foo.1 running job:01] health: execution timeout=None, polling intervals=PT15M,...
2021-07-02T16:36:42+12:00 INFO - [foo.1 running job:01] (received)succeeded at 2021-07-02T16:36:41+12:00

Flows are ignored in logging if original is the only one (above). Otherwise:

...
2021-07-02T16:36:37+12:00 INFO - [foo.4 submitted job:01 flows:warm1] => running
...
2021-07-02T16:42:21+12:00 INFO - [foo.5 running job:01 flows:original,warm1] Merged flows warm1
2021-07-02T16:36:37+12:00 INFO - [foo.5 submitted job:01 flows:original,warm1] => running

Requirements check-list

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.py and conda-environment.yml. (None)
  • Already covered by (modified) existing tests.
  • Appropriate change log entry included.
  • (master branch) I have opened a documentation PR at cylc/cylc-doc/pull/XXXX.
  • No documentation update required.

@hjoliver hjoliver added this to the cylc-8.0.0 milestone Jul 2, 2021
@hjoliver hjoliver self-assigned this Jul 2, 2021
@hjoliver hjoliver force-pushed the named-flows branch 2 times, most recently from c74819f to aa4a7ea Compare July 2, 2021 08:49
@wxtim
Copy link
Member

wxtim commented Jul 5, 2021

@hjoliver What is the status of this PR? Would you like me to:

  • Resolve the conflict with changes.md?
  • Review?

@hjoliver
Copy link
Member Author

hjoliver commented Jul 6, 2021

@hjoliver What is the status of this PR? Would you like me to:

* Resolve the conflict with changes.md?

* Review?

Thanks for asking @wxtim . It's blocked for a short while pending an off-side meeting with @dpmatthews and @oliver-sanders on several spawn-on-demand topics (scheduled for Wednesday). I should have converted to draft ..will do that now.

@hjoliver hjoliver marked this pull request as draft July 6, 2021 00:58
@oliver-sanders
Copy link
Member

This makes a lot of sense, nice.

All task state (and state attribute) changes are now logged automatically (via the state reset method)

👍

To stop a flow ... This removes the warm1 flow label from all task proxies

🎉

@hjoliver
Copy link
Member Author

hjoliver commented Jul 6, 2021

Outstanding questions:

  • is a user-supplied name sufficient as flow metadata? (I think it probably is)
    • or do we need additional metadata to be associated with the flow?
  • is it a problem that users need to supply a name that must be unique in the lifetime of the workflow (unless there is no chance of overlap)?
    • maybe not, they could just increment an integer suffix each time (--flow=warm9) or use the current datetime (--flow=$(date +%Y/%m/%d:%H:%m))
    • we could automatically maintain an integer flow counter, to use with a user-supplied name prefix
    • or automatically add the datetime, but resulting flow labels become pretty big and harder to read and type out accurately

@hjoliver
Copy link
Member Author

hjoliver commented Jul 7, 2021

Discussed in meeting:

  • general approach agreed
  • automatically incrementing integer flow labels will do
    • much easier to read and type than massive datetime strings
    • with user-supplied name as an option

@oliver-sanders
Copy link
Member

I think we may want to accumulate a bit of metadata around [re]flows so happy with a short auto-generated integer rather than a timestamp (nicer for logging and the CLI).

Some metadata we could add in due course:

  • timestamp (time of the first occasion the flow label was created)
  • user (if actioned from the WUI, the user who actioned it)
  • start tasks? (the tasks [or cycles?] at the head of the flow)
  • description (unlikely to be provided in most cases, however, might be of interest to operations centres)

Looking ahead we will likely want to merge the --flow CLI opt into the universal identifier for a few reasons:

  • This allows the WUI to filter by flow
    • e.g. user might want want a graph with the two flows shown separately
    • related issue reflow cylc-ui#468
  • Possible future [re]flow development.
  • Multi workflow functionality (cylc stop a// b// 'c*//')
  • General completeness.
# current proposal / reference implementation
~user/workflow//cycle/task/job
~user/workflow:status//cycle:status/task:status/job:status  # with all selectors

# possible extension for [re]flow
~user/workflow^flow//

# example
cylc stop myflow^orig

@hjoliver
Copy link
Member Author

Closing as superseded by #4300

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants