Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config: handle error on config (re)load #4216

Closed
oliver-sanders opened this issue May 17, 2021 · 11 comments · Fixed by #4720
Closed

config: handle error on config (re)load #4216

oliver-sanders opened this issue May 17, 2021 · 11 comments · Fixed by #4720
Assignees
Labels
could be better Not exactly a bug, but not ideal.
Milestone

Comments

@oliver-sanders
Copy link
Member

oliver-sanders commented May 17, 2021

  • A ParsecError raised in cylc.flow.config should be an "expected error" that gets formatted nicely for the user.
  • A ParsecError raised during workflow runtime is an "unexpected error" (i.e. bug) that should be reported and fixed.

Additionally, a ParsecError (or CylcError) raised during config reload will cause the scheduler to crash. This is a change in behaviour for those moving from Rose 2019 as rose suite-run --reload invoked cylc validate --strict before attempting reload.

Suggest:

  • The scheduler should not die on config errors at reload.
  • ParsecError should be handled differently at config-time as at run-time.

One approach for separating config-time errors is to add a flag on the exception:

try:
    self.cfg = WorkflowConfig(...)
except ParsecError as exc:
    exc.config_time_error = True
    raise

Pull requests welcome!

@MetRonnie
Copy link
Member

A ParsecError raised during workflow runtime is an "expected error" (i.e. bug)

Did you mean "unexpected error"?

Regardless, I don't think we should be logging traceback for ParsecErrors, as the traceback is irrelevant to the user. They just need to know what config item is illegal

@oliver-sanders
Copy link
Member Author

oliver-sanders commented May 17, 2021

Yes.

If the ParsecError occurs at runtime then we do need the traceback because:

  • This error should not be possible, we need to know where it is coming from.
  • Users may report scary traceback but are likely to treat error messages as their own problem.

That is unless there is some legit reason why a ParsecError would be raised at runtime?

@MetRonnie
Copy link
Member

MetRonnie commented May 17, 2021

Why should it not be possible? What if a user runs a workflow with bad setting = P1 - they will get an IllegalItemError (currently with traceback)

@oliver-sanders
Copy link
Member Author

Such errors should only surface in cylc.flow.config when the configuration is loaded, they should not be able to occur whilst the workflow is running after the configuration has been loaded.

@MetRonnie
Copy link
Member

Ah ok, so the workflow config should be loaded before the scheduler starts, instead of in Scheduler.configure()?

@oliver-sanders
Copy link
Member Author

oliver-sanders commented May 17, 2021

That would be nice but isn't really possible ATM (see also reloads which must happen whilst the scheduler is running), this issue suggests catching and handling errors raised from these config loads so they can be formatted nicely whilst leaving any errors raised elsewhere to fall out as traceback.

@oliver-sanders
Copy link
Member Author

oliver-sanders commented May 17, 2021

Aside

Long term it would be nice to move to an interface more like Scheduler(config) than Scheduler(reg)
this would help with the transition to Python API and running workflows from in-memory configs
rather than requiring the flow to be configured on the filesystem. Python API is currently chalked up against Cylc 9.

@MetRonnie
Copy link
Member

MetRonnie commented Feb 10, 2022

@oliver-sanders I'm a little confused - how can we get a ParsecError other than during loading of config?

Assuming ParsecErrors can only occur during loading of config, and we don't want traceback for ParsecErrors during loading of config, then why not just treat them as expected during Scheduler.shutdown()?

@oliver-sanders
Copy link
Member Author

I'm a little confused - how can we get a ParsecError other than during loading of config?

A ParsecError shouldn't occur at runtime, however, lots of things shouldn't happen at runtime, that's why we catch unexpected errors and log them.

(note we do reload the config with cylc reload and reload the global config as part of the auto restart functionality, if a ParsecError were to slip out of one of them, that would be a bug).

@MetRonnie
Copy link
Member

MetRonnie commented Feb 10, 2022

Wait so if a user makes a syntax mistake and reloads, what is supposed to happen? I would have thought:

  • log a single line warning
  • don't shutdown

Currently we

  • log (at warning level) the whole traceback
  • don't shutdown

@oliver-sanders
Copy link
Member Author

  • log a single line warning
  • don't exit/shutdown

Sounds good.

If a ParsecError were to somehow slip outside of the reload context and get into the workflow bubbling up in another location, then we need to log the whole traceback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
could be better Not exactly a bug, but not ideal.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants