add: do not verify hardlink if file is empty #3428

Merged
merged 3 commits into from
Mar 11, 2020

Conversation

skshetry
Member

@skshetry skshetry commented Mar 2, 2020

Fixes #3390

  • ❗ Have you followed the guidelines in the Contributing to DVC list?

  • 📖 Check this box if this PR does not require documentation updates, or if it does and you have created a separate PR in dvc.org with such updates (or at least opened an issue about it in that repo). Please link below to your PR (or issue) in the dvc.org repo.

  • ❌ Have you checked DeepSource, CodeClimate, and other sanity checks below? We consider their findings recommendatory and don't expect everything to be addressed. Please review them carefully and fix those that actually improve code or fix bugs.

Thank you for the contribution - we'll try to review it as soon as possible. 🙏

@skshetry skshetry self-assigned this Mar 2, 2020
dvc/remote/local.py (outdated, resolved)
@codecov

codecov bot commented Mar 2, 2020

Codecov Report

Merging #3428 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #3428   +/-   ##
=======================================
  Coverage   93.08%   93.08%           
=======================================
  Files         140      140           
  Lines        8515     8515           
=======================================
  Hits         7926     7926           
  Misses        589      589           

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 05cc023...4e9289f. Read the comment docs.

dvc/remote/local.py (outdated, resolved)
Contributor

@gurobokum gurobokum left a comment

What do you think about fixing the issue another way?

  1. In _verify_link, check for the case where the file is not a link and its size is 0 bytes.
  2. If that is true, just don't raise an exception.
  3. Add a comment that references the issue and briefly describes the case.

This fixes the problem in one place without passing around an inconsistent return value.
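
A minimal sketch of the alternative described above, not DVC's actual code. The names `verify_hardlink`, `is_hardlink`, and `LinkVerificationError` are hypothetical, and the sketch assumes empty files can end up as plain files rather than hardlinks, which is what the linked issue suggests:

```python
import os


class LinkVerificationError(Exception):
    """Hypothetical error type; not DVC's actual exception."""


def is_hardlink(path):
    # A file whose inode has more than one name is hardlinked.
    return os.stat(path).st_nlink > 1


def verify_hardlink(path):
    """Raise unless `path` is a hardlink, tolerating empty files."""
    if is_hardlink(path):
        return
    if os.path.getsize(path) == 0:
        # Assumption: empty files may be created as plain files instead of
        # links (the case from the linked issue), so this is not a failure.
        return
    raise LinkVerificationError(f"{path} is not a hardlink")
```

The special case lives inside the verification function, so callers do not need to know about it.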

UPD
I see @pared proposed the same in the comment above.

@skshetry
Member Author

We (@efiop and I) discussed this in a 1:1. A better fix would be to create a temporary file and use it to check whether the system supports {ref,hard,sym}link. That will need some refactoring, though, so I'll create an issue to work on it later.
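
A rough sketch of what such a probe could look like, using only the standard library. The function name `supported_link_types` is hypothetical, and reflinks are left out because they need platform-specific calls:

```python
import os
import tempfile


def supported_link_types(directory):
    """Return the link types that work in `directory`, found by trying each.

    Only "hardlink" and "symlink" are probed in this sketch.
    """
    probes = {"hardlink": os.link, "symlink": os.symlink}
    supported = []
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        src = os.path.join(tmp, "src")
        with open(src, "w") as fobj:
            # Non-empty on purpose, so empty-file special cases do not apply.
            fobj.write("probe")
        for name, make_link in probes.items():
            dst = os.path.join(tmp, name)
            try:
                make_link(src, dst)
            except (OSError, NotImplementedError):
                continue  # this link type is not usable here
            supported.append(name)
    return supported
```

Probing with a throwaway non-empty file means the result does not depend on the size of whatever file happens to be linked first.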

Comment on lines +96 to +98
if link_type == "hardlink" and self.getsize(path_info) == 0:
    return

Contributor

@efiop efiop Mar 10, 2020

Not great that it will be checking the size for each link it creates, might get expensive. Though, it does that in hardlink anyway...

Member Author

We already did that in hardlink() anyway, though.

Contributor

@skshetry Maybe there is some nicer way to do this? Like making hardlink (and other link class methods) verify themselves? This is an honest question, I don't know myself either 🙂

Contributor

@skshetry Could keep it as is and create a ticket for it to reconsider later. Just asking.

Member Author

I like this suggestion #3428 (comment) more. Running the verification inside hardlink itself only once (by setting self.cache_type_confirmed) is also hackish.

I'd say we revisit this in the next sprint and fix it?

Contributor

@skshetry Ok, please create an issue and add it to the next sprint.

Contributor

We could just check for cache_type_verified at the beginning again. I know that is duplication, but I don't see any obvious workaround.
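
One way to picture that early check, as a sketch only. `LinkVerifier` and its `check` callable are hypothetical stand-ins, and the flag is spelled `cache_type_confirmed` as in the earlier comment; the real attribute name and class layout may differ:

```python
import os


class LinkVerifier:
    """Hypothetical helper; the names are illustrative, not DVC's API."""

    def __init__(self, check):
        self._check = check  # callable(path, link_type); raises on mismatch
        self.cache_type_confirmed = False

    def verify(self, path, link_type):
        if self.cache_type_confirmed:
            return  # checked first, so a confirmed cache pays no getsize() cost
        if link_type == "hardlink" and os.path.getsize(path) == 0:
            return  # inconclusive for empty files; leave the flag unset
        self._check(path, link_type)
        self.cache_type_confirmed = True
```

With the flag tested first, the size check only runs until the first non-empty file confirms the cache type.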

@skshetry skshetry requested review from pared and gurobokum March 10, 2020 10:00
@efiop
Contributor

efiop commented Mar 10, 2020

@skshetry Check your tests, travis failed.

Contributor

@pared pared left a comment

So, the conclusion for this issue is that we will create an issue to fix the way we handle confirmation of the cache type?

tests/func/test_add.py (outdated, resolved)
Contributor

@pared pared left a comment

One additional comment in discussion with @efiop, not anything major.

tests/func/test_add.py (outdated, resolved)
@efiop
Contributor

efiop commented Mar 11, 2020

@skshetry Please check the tests.

@efiop efiop merged commit 682275d into iterative:master Mar 11, 2020
@skshetry skshetry deleted the fix-3390 branch March 12, 2020 00:40
Development

Successfully merging this pull request may close these issues.

add: empty files add broken when cache mode is hardlinks
4 participants