custom xtriggers: undesired behaviour w/ illegal dict #3237
Comments
@sadielbartholomew - see my comments in #3232. If an xtrigger function returns an illegal result, it is technically broken and your suite is presumably going to stall with tasks waiting on it. (Because we - correctly - don't treat a broken xtrigger as "satisfied"). So what to do about that? We could:
Number 2. may be preferable so long as you can fix the trigger function in the live suite, but I don't think that 1. is "strange behaviour" 😁
(Sorry, I see your "strange behaviour" comment doesn't relate to the suite shutdown).
Re-reading more carefully (sorry)! I guess we need to distinguish between "broken" xtrigger functions that return totally illegal results (i.e. anything but In the latter case, it is correct that the downstream task triggered (because the trigger was satisfied) but the illegal results dict caused problems.
I ran into this as well while trying to write some custom xtrigger functions. In my case, the suite didn't stall with tasks waiting on it (as @hjoliver described) but rather crashed in a JSON decode, which left the suite controller itself unresponsive; probably not the intended behavior. I'm OK with the second option above, but an xtrigger function returning totally illegal results shouldn't freeze things up.
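To make the JSON decode failure concrete, here is a minimal sketch of decoding defensively. It assumes the xtrigger function's output reaches the scheduler as a JSON string holding a two-element [satisfied, results] list, which this thread implies but does not spell out. The names parse_xtrigger_output and LOG are hypothetical and are not part of Cylc.

```python
import json
import logging

LOG = logging.getLogger("xtrigger-sketch")  # hypothetical logger name


def parse_xtrigger_output(raw):
    """Decode raw xtrigger output into a (satisfied, results) pair.

    Assumption: the output is a JSON string holding [bool, dict].
    Anything else is logged as an error and reported as "not satisfied"
    rather than being allowed to raise.
    """
    try:
        decoded = json.loads(raw)
    except (TypeError, ValueError) as exc:  # JSONDecodeError is a ValueError
        LOG.error("xtrigger returned undecodable output: %s", exc)
        return False, {}
    if (
        not isinstance(decoded, list)
        or len(decoded) != 2
        or not isinstance(decoded[0], bool)
        or not isinstance(decoded[1], dict)
    ):
        LOG.error("xtrigger returned an illegal result: %r", decoded)
        return False, {}
    satisfied, results = decoded
    return satisfied, results
```

With this shape, a broken function leaves its dependent tasks waiting and puts an error in the log, but the decode failure never propagates to the caller.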
Thanks for reporting your observations @trwhitcomb. That is definitely a prompt for us to investigate further. In general I agree that a bad
is correct, so would not be a bug. Actually when I wrote up the Issue (probably not very clearly, sorry!), as well as the shutdown it appeared that there were tasks emerging or persisting in states that were not right, as depicted somewhat in the Cylc Review screenshot, but I still need to get to the bottom of the 'strangeness' occurring in that case. Also, for the clean shutdown, I think a specific error regarding an illegal xtrigger return To go forward, I'll try out various cases for an illegal (Sorry @hjoliver for the delay in responding to your previous comments, & those on the other xtrigger Issue I raised at a similar time, #3232. When I went back to look into these after some priority assigned work, I ran into #3275, which seemed the most pressing xtrigger aspect to address!)
@sadielbartholomew @trwhitcomb - yes, agreed that:
I vote for the suite staying up and remaining responsive in all cases. The events should be logged as errors, like task failures.
Should be closed with #3497
In the results dictionary returned in the satisfied signature (True, results) from a custom external trigger function, certain "bad" dictionary values will cause the suite to shut down.
I will at some point survey the nature of what constitutes a "bad" value; possibly they are just values which are other structures such as (further) dictionaries (forming an overall nested results dict), as in the example used here.
We can document that results must be in a certain form, but can we also catch such errors for invalid return values from xtriggers so that they do not propagate & cause strange task states & suite shutdown, as described below? Validating the environment variables and their values produced by the results dict to ensure they are sound before passing them to downstream tasks seems like the ideal solution, but that may be too tricky (or inefficient, etc.) to set up.
Release version(s) and/or repository branch(es) affected?
master branch i.e. 8.0a0 (+ earlier versions).
Steps to reproduce the bug
Example with dictionary values in results
Using the xtrigger given in cylc/cylc-doc#41 in this example suite, which with the "nested" argument returns results dict values that are themselves dictionaries:
produces the logically wrong task state even for the independent task to_get_some_cycles_going, as shown in the screenshot below, & a KeyError shuts down the suite, with the following in the log/suite/log:
Expected behavior
As above, I'm not sure of the best way to manage this, but there should be a way to prevent bad xtrigger return values from making the suite behave strangely at runtime (not just for downstream tasks) &/or shutting down.
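As one possible shape for such a check, the sketch below validates a return value before it is handed on to downstream tasks. It is an illustration only: find_bad_results is a hypothetical helper, not a Cylc function, and the rule that every value must be a flat scalar is an assumption, since what exactly counts as a "bad" value has not been surveyed yet.

```python
def find_bad_results(result):
    """Return a list of problems with an xtrigger return value.

    An empty list means the value passed every check in this sketch.
    """
    if not (isinstance(result, tuple) and len(result) == 2):
        return ["return value must be a (satisfied, results) tuple"]
    satisfied, results = result
    problems = []
    if not isinstance(satisfied, bool):
        problems.append("first element must be a bool")
    if not isinstance(results, dict):
        problems.append("second element must be a dict")
        return problems
    for key, value in results.items():
        if not isinstance(key, str):
            problems.append(f"key {key!r} is not a string")
        # Assumption: only flat scalar values are acceptable, so nested
        # dicts, lists and other containers are reported as problems.
        if not isinstance(value, (str, int, float, bool)):
            problems.append(
                f"value for {key!r} is a {type(value).__name__}, not a scalar"
            )
    return problems


# A nested results dict, like the one in the reproduction, is flagged:
print(find_bad_results((True, {"nested": {"a": 1}})))
```

Returning a list of problems instead of raising lets the caller decide whether to log an error and keep the suite running, or to treat the trigger as unsatisfied.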
Screenshots