Confine collection errors at the lowest level possible #6042
Comments
The way to provide error handling at this granular level might be tricky. I wonder if we can solve this architecturally by detaching As an example, we could build dependencies/Output and track them separately, and add them to the stage later on:
Then, Fixing these in |
I think there might be a way to provide that without altering the collection process that much, if we allow the error handler to decide how things are settled. Something like conditions and restarts, maybe. |
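To make the "conditions and restarts" idea concrete, here is a minimal sketch, not DVC code. The low-level loader offers named recovery options (restarts) and a handler passed in from above chooses one. All names here (`load_output`, `load_outputs`, `skip_handler`, the `known` dict standing in for real lookup logic) are hypothetical and only illustrate the shape of the approach.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical types and names: none of this is DVC API.
Restarts = Dict[str, Callable[[], Optional[str]]]
Handler = Callable[[Exception, Restarts], Optional[str]]


def raise_handler(exc: Exception, restarts: Restarts) -> Optional[str]:
    """Default policy: no recovery, propagate the original error."""
    raise exc


def skip_handler(exc: Exception, restarts: Restarts) -> Optional[str]:
    """Lenient policy: choose the 'skip' restart offered by the loader."""
    return restarts["skip"]()


def load_output(path: str, known: Dict[str, str], handler: Handler) -> Optional[str]:
    """Low-level loader: offers restarts but does not pick one itself."""
    try:
        return known[path]
    except KeyError as exc:
        restarts: Restarts = {
            "skip": lambda: None,                   # drop this output
            "use_placeholder": lambda: "<broken>",  # keep a marker instead
        }
        return handler(exc, restarts)


def load_outputs(
    paths: List[str], known: Dict[str, str], handler: Handler = raise_handler
) -> List[str]:
    """Load every output, dropping the ones the handler decided to skip."""
    loaded = (load_output(p, known, handler) for p in paths)
    return [out for out in loaded if out is not None]
```

The point of the split is that the collection code stays the same for every caller: a strict caller keeps the default handler and still gets the exception, while a best-effort caller such as Studio could pass `skip_handler`.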
Handled #6082 in Studio with some monkey patches, which hints at how outs might be ignored by a custom error handler:

import logging

import dvc.dependency
import dvc.output
from funcy import monkey, post_processing

logger = logging.getLogger(__name__)


@monkey(dvc.output, "_get")
def _get_out(stage, p, info=None, **options):
    try:
        return _get_out.original(stage, p, info, **options)
    except KeyError:  # See https://github.com/iterative/dvc/issues/6082
        logger.warning("Skipping out '%s' cause no remote found", p)
        return None  # dropped later by removing_nones


@monkey(dvc.dependency, "_get")
def _get_dep(stage, p, info):
    try:
        return _get_dep.original(stage, p, info)
    except KeyError:  # See https://github.com/iterative/dvc/issues/6082
        logger.warning("Skipping dep '%s' cause no remote found", p)
        return None  # dropped later by removing_nones


# Skip failed outputs and deps
removing_nones = post_processing(lambda xs: [x for x in xs if x is not None])
for func in ["load_from_pipeline", "loadd_from", "loads_from"]:
    setattr(dvc.output, func, removing_nones(getattr(dvc.output, func)))
for func in ["loadd_from", "loads_from"]:
    setattr(dvc.dependency, func, removing_nones(getattr(dvc.dependency, func)))

What I mean is that there are some boundary points at which we can stop errors from propagating. So it is a question of how we pass a handler down there. |
Closing as stale. |
There are situations when a DVC repo may be broken, sometimes retroactively, i.e. old commits are broken by a new version of DVC. In this situation DVC simply refuses to work when we try to do anything with a broken rev. For many operations, however, we don't need things to be completely correct, i.e. we may still do dvc metrics show when stage.cmd has a broken interpolation lookup like here, and we can even show git tracked things in the absence of a lock file like here. This could be used to make DVC do best effort, but it is also especially important for handling errors in Studio.
There is a draft PR #5984, which seeks to confine this to a single revision, i.e. dvc exp show will show one line as errored out but correctly display the other ones. This is a good start, but not as granular as it might be. Ideally one needs a way to stop error propagation on key level, out level, stage level and file level so that the rest of the repo will be usable. Of course, "usable" is defined by the task:
dvc.yaml and hence certain keys readable (but not all of the file)
dvc.lock file
To make this error handling appropriate for each task we need to make it customizable, i.e. have an error handler on all of the aforementioned levels and provide it a way to decide how to proceed.
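As a rough illustration of what a per-level handler could look like, here is a small sketch. It is not DVC's implementation: `Level`, `collect_stages`, `strict`, `make_lenient` and the dict-based stage format are all hypothetical stand-ins. The handler is told at which level an error occurred and returns whether to skip the broken item. An error that is not confined at the out level escalates to the stage level, where the handler is asked again.

```python
from enum import Enum
from typing import Callable, Dict, List, Set, Tuple


class Level(Enum):
    """Granularity at which an error can be confined (hypothetical)."""
    KEY = "key"
    OUT = "out"
    STAGE = "stage"
    FILE = "file"


# A handler returns True to skip the broken item, False to let the error propagate.
ErrorHandler = Callable[[Level, str, Exception], bool]
ErrorLog = List[Tuple[Level, str, Exception]]


def strict(level: Level, name: str, exc: Exception) -> bool:
    """Default policy: never confine, always propagate."""
    return False


def make_lenient(skippable: Set[Level], errors: ErrorLog) -> ErrorHandler:
    """Build a handler that records every error and skips the chosen levels."""
    def handler(level: Level, name: str, exc: Exception) -> bool:
        errors.append((level, name, exc))
        return level in skippable
    return handler


def collect_stages(raw_stages: Dict[str, dict], on_error: ErrorHandler = strict) -> Dict[str, dict]:
    """Toy collection: a stage needs a 'cmd' key, and outs must be strings."""
    stages: Dict[str, dict] = {}
    for name, raw in raw_stages.items():
        try:
            outs = []
            for out in raw.get("outs", []):
                try:
                    if not isinstance(out, str):
                        raise TypeError(f"invalid out {out!r}")
                    outs.append(out)
                except TypeError as exc:
                    if not on_error(Level.OUT, f"{name}:{out!r}", exc):
                        raise  # not confined here, escalate to stage level
            stages[name] = {"cmd": raw["cmd"], "outs": outs}
        except (KeyError, TypeError) as exc:
            if not on_error(Level.STAGE, name, exc):
                raise  # not confined here either, fail the whole collection
    return stages
```

With a handler that skips at both out and stage level, a stage with one bad out keeps its good outs and a stage without `cmd` is dropped, while the recorded errors remain available for display. With the default `strict` handler the first error propagates as it does today.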