import: intercept and rephrase OutputNotFound message #2777

Merged: 4 commits, Nov 20, 2019

@pared pared commented Nov 11, 2019

  • ❗ Have you followed the guidelines in the Contributing to DVC list?

  • 📖 Check this box if this PR does not require documentation updates, or if it does and you have created a separate PR in dvc.org with such updates (or at least opened an issue about it in that repo). Please link below to your PR (or issue) in the dvc.org repo.

  • ❌ Have you checked DeepSource, CodeClimate, and other sanity checks below? We consider their findings recommendatory and don't expect everything to be addressed. Please review them carefully and fix those that actually improve code or fix bugs.

Thank you for the contribution - we'll try to review it as soon as possible. 🙏

Related to #2602

pared commented Nov 11, 2019

Example:

#!/bin/bash

rm -rf repo
mkdir repo

pushd repo
dvc init --no-scm

dvc import git@github.com:iterative/dataset-registry.git invalid/path 

was:
https://asciinema.org/a/280576

after change:
https://asciinema.org/a/280577

@pared pared requested a review from jorgeorpinel November 11, 2019 15:20

Suor commented Nov 11, 2019

The thing with this is that it affects not only import, but also get, as well as dvc.api.open()/read() calls. The idea expressed here was for remotes missing in external repos.

It basically says to wrap the yield in external_repo().

pared commented Nov 12, 2019

Thanks @Suor for pointing that out. I'll fix that.

@@ -418,10 +418,15 @@ def stages(self):
return get_stages(self.graph)

def find_outs_by_path(self, path, outs=None, recursive=False):
abs_path = (
Contributor

So why do we need this if we have find_outs_by_relpath already?

Contributor

Ah, I see you've removed it.

@@ -49,10 +49,10 @@ def _make_repo(self, **overrides):

     def status(self):
         with self._make_repo() as repo:
-            current = repo.find_out_by_relpath(self.def_path).info
+            current = repo.find_out_by_path(self.def_path).info
Contributor

Not sure why this was required?

Contributor Author

Technically it's not required, but given the change in find_outs_by_path, find_out_by_relpath now works with both absolute and relative paths, which is why I decided to rename it.

Contributor

@efiop efiop Nov 12, 2019

@pared yeah, but why do we need to make find_out_by_path accept relpaths? Just wondering.

Contributor Author

To pass the original path to OutputNotFoundError. It makes sense because the error then shows the path exactly as the user entered it, which is easier to understand. Calculating relpaths/abspaths was the original reason for ambiguous paths flooding stderr when an output could not be found.
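A minimal sketch of the idea in this reply, not the actual DVC code: the absolute path is computed only for the lookup, while the path the user typed is what ends up in the error. The `OutputNotFoundError` class here is a simplified stand-in, and `known_outs` plus the message wording are assumptions made for illustration.

```python
import os


class OutputNotFoundError(Exception):
    """Simplified stand-in for the exception discussed in this thread."""

    def __init__(self, output):
        self.output = output
        super().__init__("unable to find output '{}'".format(output))


def find_outs_by_path(known_outs, path):
    # `known_outs` is a hypothetical list of absolute output paths.
    # Resolve an absolute path for matching only; `path` itself is untouched.
    abs_path = os.path.abspath(path)
    matched = [out for out in known_outs if out == abs_path]
    if not matched:
        # Report what the user typed, not the computed absolute path.
        raise OutputNotFoundError(path)
    return matched
```

With this shape, a failed lookup for `invalid/path` mentions `invalid/path` in the message rather than a long absolute or curdir-relative path.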

@pared pared force-pushed the 2602 branch 3 times, most recently from 27db275 to 5c59d74 Compare November 12, 2019 12:29
Comment on lines 421 to 423
abs_path = (
    os.path.join(self.root_dir, path)
    if not os.path.isabs(path)
    else path
)
Contributor

It used to support paths relative to curdir; it won't after this change. Now it treats non-absolute paths as relative to root dir. So this is a change in semantics. Will it break something? Why do we need it?

Contributor Author

Oh, you are right, I should make an abspath out of it.
I moved the abspath calculation here so that I could pass path to OutputNotFoundError. When the user passes a relative path, OutputNotFoundError can use that input for the error message, which is more meaningful than printing the full path or a path relative to curdir, as it used to.
https://asciinema.org/a/280576
vs
https://asciinema.org/a/280577

Contributor

But it should stay relative to curdir for outs within the current repo. Like it or not, that is what we are doing everywhere now.
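To make the semantic difference in this thread concrete, here is a small sketch comparing the two ways of resolving a non-absolute path. Both helper names are hypothetical, and `curdir` is passed explicitly (instead of calling `os.getcwd()`) only to keep the example deterministic.

```python
import os


def resolve_relative_to_root(root_dir, path):
    # Behaviour questioned in the review: non-absolute paths are joined
    # onto the repository root.
    return path if os.path.isabs(path) else os.path.join(root_dir, path)


def resolve_relative_to_curdir(curdir, path):
    # Behaviour the reviewer wants to keep: non-absolute paths are
    # interpreted relative to the current working directory.
    if os.path.isabs(path):
        return path
    return os.path.normpath(os.path.join(curdir, path))


if __name__ == "__main__":
    root = os.path.join(os.sep, "repo")
    cwd = os.path.join(root, "sub")
    print(resolve_relative_to_root(root, "data.csv"))
    print(resolve_relative_to_curdir(cwd, "data.csv"))
```

When the command is run from a subdirectory of the repo, the two functions return different files for the same input, which is the change in semantics being pointed out.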

Comment on lines 24 to 31
try:
    yield repo
except OutputNotFoundError as cause:
    raise NoOutputInExternalRepoError(url, cause.failed_output)
Contributor

I know I suggested it, but it is fragile. We don't know whether that OutputNotFoundError we catch here comes from the yielded repo. To fix this we will probably need to pass repo to OutputNotFoundError and check it here:

if cause.repo is repo:
    # wrap
else:
    raise

I also wouldn't call it cause but simply exc: it only becomes a cause when we wrap it, not when we catch it.
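The suggestion above could look roughly like the following self-contained sketch. It is not the code that was merged: both exception classes are simplified stand-ins, `FakeRepo` and its `find_out` method are hypothetical, and the `output` and `repo` attributes are assumptions chosen for illustration.

```python
from contextlib import contextmanager


class OutputNotFoundError(Exception):
    """Stand-in exception that remembers which repo raised it."""

    def __init__(self, output, repo=None):
        super().__init__("unable to find output '{}'".format(output))
        self.output = output
        self.repo = repo


class NoOutputInExternalRepoError(Exception):
    """Stand-in for the rephrased, user-facing error."""

    def __init__(self, url, output):
        super().__init__("output '{}' not found in '{}'".format(output, url))
        self.url = url
        self.output = output


class FakeRepo:
    """Hypothetical repo object whose lookup always fails."""

    def find_out(self, path):
        raise OutputNotFoundError(path, repo=self)


@contextmanager
def external_repo(url):
    repo = FakeRepo()
    try:
        yield repo
    except OutputNotFoundError as exc:
        if exc.repo is repo:
            # The error came from the yielded repo: wrap it, keeping the
            # original exception as the cause.
            raise NoOutputInExternalRepoError(url, exc.output) from exc
        # The error came from some other repo: let it propagate unchanged.
        raise
```

The identity check (`is`) rather than equality is deliberate here: it only asks whether the failing repo is the very object this context manager handed out.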

@pared pared force-pushed the 2602 branch 2 times, most recently from efd50fc to a55292b Compare November 13, 2019 17:22
@pared pared requested a review from Suor November 13, 2019 17:29
@pared pared force-pushed the 2602 branch 4 times, most recently from 291dfc3 to 822f5c1 Compare November 15, 2019 15:52
@pared pared changed the title import: intercept and rephrase OutputNotFound message [WIP] import: intercept and rephrase OutputNotFound message Nov 15, 2019
@efiop efiop requested a review from jorgeorpinel November 19, 2019 13:37

efiop commented Nov 19, 2019

@Suor @jorgeorpinel Please review.

EDIT: sorry, never mind, this seems to be a WIP.

@pared pared changed the title [WIP] import: intercept and rephrase OutputNotFound message import: intercept and rephrase OutputNotFound message Nov 19, 2019

pared commented Nov 19, 2019

@Suor, @jorgeorpinel I think it should be ready now.

efiop commented Nov 20, 2019

For the record: the tests that failed are unrelated and are already fixed on master.

4 participants