Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix import issue without remote config in the target #3120

Merged
merged 1 commit into from
Jan 19, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 9 additions & 1 deletion dvc/dependency/repo.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
from dvc.exceptions import OutputNotFoundError
from dvc.exceptions import NoOutputInExternalRepoError
from dvc.exceptions import PathMissingError
from dvc.config import NoRemoteError
from dvc.utils.fs import fs_copy
from dvc.path_info import PathInfo
from dvc.scm import SCM
Expand Down Expand Up @@ -91,7 +92,14 @@ def fetch(self):

out = repo.find_out_by_relpath(self.def_path)
with repo.state:
repo.cloud.pull(out.get_used_cache())
try:
repo.cloud.pull(out.get_used_cache())
except NoRemoteError:
# It would not be good idea to raise exception if the
# file is already present in the cache
if not self.repo.cache.local.changed_cache(out.checksum):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a correct logic for dir outs, it should be just out.changed_cache().

However, I don't understand the overall approach, we just created that repo it has empty cache. What's the point?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, this is for local import/shared cache case. I would say we should change .pull() implementation to check local cache first and only go for upstream when something is missing in local cache.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Suor please see #3120 (comment) . out.change_cache() would be better, true.

return out
raise
Comment on lines +95 to +102
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a side note: we might later consider making pull in general not require remote if everything is already locally available. This is out of focus for this PR, current implementation works great for what we need it for πŸ‘ So no actions required here.


return out

Expand Down
22 changes: 22 additions & 0 deletions tests/func/test_import.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,10 +10,12 @@
from dvc.exceptions import DownloadError
from dvc.exceptions import PathMissingError
from dvc.exceptions import NoOutputInExternalRepoError
from dvc.config import NoRemoteError
from dvc.stage import Stage
from dvc.system import System
from dvc.utils.fs import makedirs
from dvc.compat import fspath
import dvc.data_cloud as cloud
from tests.utils import trees_equal


Expand Down Expand Up @@ -56,6 +58,26 @@ def test_import_git_file(erepo_dir, tmp_dir, dvc, scm, src_is_dvc):
}


def test_import_cached_file(erepo_dir, tmp_dir, dvc, scm, monkeypatch):
src = "some_file"
dst = "some_file_imported"

with erepo_dir.chdir():
erepo_dir.dvc_gen({src: "hello"}, commit="add a regular file")

tmp_dir.dvc_gen({dst: "hello"})
(tmp_dir / dst).unlink()

remote_exception = NoRemoteError("dvc import")
with patch.object(cloud.DataCloud, "pull", side_effect=remote_exception):
tmp_dir.dvc.imp(fspath(erepo_dir), src, dst)

assert (tmp_dir / dst).is_file()
assert filecmp.cmp(
fspath(erepo_dir / src), fspath(tmp_dir / dst), shallow=False
)


@pytest.mark.parametrize("src_is_dvc", [True, False])
def test_import_git_dir(erepo_dir, tmp_dir, dvc, scm, src_is_dvc):
if not src_is_dvc:
Expand Down