Fix tree hashing with nested empty directories #1522

Merged: 1 commit, Dec 3, 2019

Member

@staticfloat staticfloat commented Dec 3, 2019

When testing for directories to exclude from hashing, we must exclude
not only empty directories, but also directories that themselves contain
nothing but empty directories; in essence, we suppress adding
directories that have no files contained within their entire subtree.

While fixing this, it seemed prudent to eliminate the `names` argument
to `tree_hash()`, especially as it was never actually iterated over.

This will fix the error reported in JuliaLang/julia#33979 (comment)
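
The exclusion rule described above can be sketched as a small recursive check. This is an illustration in Python, not the Julia code from this PR; `contains_files` and `hashable_entries` are hypothetical names, and treating symlinks as files is an assumption based on how git stores them.

```python
import os
import tempfile


def contains_files(path):
    """True if `path` is a file or symlink, or a directory that holds
    at least one file somewhere in its entire subtree."""
    # Assumption: a symlink counts as a file, even if it points at a directory.
    if os.path.islink(path) or not os.path.isdir(path):
        return True
    # A directory qualifies only if some child qualifies; an empty
    # directory yields any([]) == False, and so does a directory that
    # contains nothing but (nested) empty directories.
    return any(contains_files(os.path.join(path, name)) for name in os.listdir(path))


def hashable_entries(root):
    """Names directly under `root` that would take part in hashing."""
    return sorted(name for name in os.listdir(root)
                  if contains_files(os.path.join(root, name)))


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b", "c"))   # nested empty directories
    os.makedirs(os.path.join(root, "d", "e"))
    open(os.path.join(root, "d", "e", "f.txt"), "w").close()
    print(hashable_entries(root))  # ['d']
```

Checking only whether a directory is empty would keep `a` here, because `a` contains `b`; the recursive check is what drops it.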

@staticfloat
Member Author

I'm explicitly requesting a review from @StefanKarpinski to make sure that changing the signature of `tree_hash()` doesn't affect anything elsewhere. I figured the `names` parameter was intended only for efficiency, but since it wasn't used anyway (and has even less chance of being used now), I figure it's okay to drop it.

fredrikekre pushed a commit that referenced this pull request Dec 3, 2019
(cherry picked from commit 648d66c, PR #1522)
@codecov

codecov bot commented Dec 3, 2019

Codecov Report

Merging #1522 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1522      +/-   ##
==========================================
+ Coverage   86.73%   86.74%   +<.01%     
==========================================
  Files          25       25              
  Lines        5451     5454       +3     
==========================================
+ Hits         4728     4731       +3     
  Misses        723      723
Impacted Files Coverage Δ
src/GitTools.jl 90.71% <100%> (+0.2%) ⬆️


@StefanKarpinski
Member

Not using the `names` argument was a mistake; I meant to replace the call to `readdir` with it. Looks good to me. I have to say, this makes it seem even wonkier that git doesn't just allow empty directories.

@KristofferC KristofferC merged commit 49ab53e into master Dec 3, 2019
@KristofferC KristofferC deleted the sf/fix_empty_tree_hashing branch December 3, 2019 15:20
fredrikekre pushed a commit that referenced this pull request Dec 3, 2019
(cherry picked from commit 49ab53e, PR #1522)
@staticfloat
Member Author

Yeah, git is very "files"-focused, so it does feel like directories are a strange addition that was bolted on rather than designed in from the beginning.

@StefanKarpinski
Member

It's bizarre for a number of reasons:

  1. it's a feature that people want
  2. it would be more straightforward to compute tree hashes if it were allowed
  3. there exist workarounds, like putting an empty .gitignore file in the directory to commit it
  4. it would be mostly non-breaking for git to start allowing it
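
To make the tree-hashing point concrete, here is a rough Python sketch of git-style tree hashing that has to skip file-free subtrees. It is not Pkg's implementation (which is Julia, in src/GitTools.jl); the helper names are hypothetical, and the mode handling (100644, 100755, 120000, 40000) and directory sort order follow my understanding of git's object format.

```python
import hashlib
import os
import stat
import tempfile


def contains_files(path):
    """True if `path` is a file or symlink, or a directory with a file somewhere below it."""
    if os.path.islink(path) or not os.path.isdir(path):
        return True
    return any(contains_files(os.path.join(path, n)) for n in os.listdir(path))


def git_object_hash(kind, payload):
    """SHA-1 over git's object framing: kind, a space, the payload size, a NUL byte, the payload."""
    header = kind + b" " + str(len(payload)).encode() + b"\0"
    return hashlib.sha1(header + payload).digest()


def blob_hash(path):
    # A symlink is stored as a blob whose content is the link target.
    if os.path.islink(path):
        data = os.fsencode(os.readlink(path))
    else:
        with open(path, "rb") as f:
            data = f.read()
    return git_object_hash(b"blob", data)


def tree_hash(root):
    entries = []
    for name in os.listdir(root):
        path = os.path.join(root, name)
        raw_name = os.fsencode(name)
        if os.path.isdir(path) and not os.path.islink(path):
            if not contains_files(path):
                continue  # git cannot represent a file-free subtree, so skip it
            # Directories sort as if their name ended in "/".
            mode, digest, sort_key = b"40000", tree_hash(path), raw_name + b"/"
        else:
            if os.path.islink(path):
                mode = b"120000"
            elif os.stat(path).st_mode & stat.S_IXUSR:
                mode = b"100755"
            else:
                mode = b"100644"
            digest, sort_key = blob_hash(path), raw_name
        entries.append((sort_key, mode + b" " + raw_name + b"\0" + digest))
    payload = b"".join(entry for _, entry in sorted(entries))
    return git_object_hash(b"tree", payload)


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b", "c"))  # nothing but nested empty directories
    print(tree_hash(root).hex())  # 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```

The printed value is git's empty-tree hash: once `a` is skipped, the root has no entries left. If git allowed empty directories, the `contains_files` pre-pass would be unnecessary and every directory could simply be hashed as encountered.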
