Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

git status returns different results when core.fscache enabled #679

Closed
1 task done
latkin opened this issue Mar 1, 2016 · 19 comments
Closed
1 task done

git status returns different results when core.fscache enabled #679

latkin opened this issue Mar 1, 2016 · 19 comments

Comments

@latkin
Copy link

latkin commented Mar 1, 2016

  • I was not able to find an open
    or closed issue
    matching what I'm seeing

Setup

Win10, x64
git version 2.6.3.windows.1
Default install as far as I recall
Problem repros in all shells

Details

  • Clone https://github.com/latkin/filetest
    • foo:bar file has invalid Windows filename. It is copied as size-0 file foo.
    • Add foo to gitignore
  • With core.fscache false, working directory is shown as clean by git status
  • With core.fscache true, file foo:bar is flagged with deleted status.

Expected: Working directory status is reported the same with/without core.fscache, in all cases.

Motivation: I work on windows but most of my team is on OSX. They put files named like foo:bar in the repo. I can happily ignore them with a private gitignore when core.fscache is turned off, but they pollute my working directory when I try to use core.fscache true (which I'd like to do for perf reasons).

@kgybels
Copy link

kgybels commented Mar 1, 2016

@latkin gitignore is only for untracked files. Use git update-index --assume-unchanged foo:bar instead, see git-update-index for more information.

The behavior with core.fscache on seems to be correct, there is no file foo:bar, so it is reported as deleted. That being said, having git deal with paths in your tree, that it can't represent in the working copy seems futile.

@latkin
Copy link
Author

latkin commented Mar 1, 2016

git update-index --assume-unchanged solves my problem perfectly, thank you for the pointer.

I will be on my merry way, but it does seem like something to investigate that fscache true/false changes results here. FWIW I agree the fscache=true result seems more accurate in this case.

@latkin
Copy link
Author

latkin commented Mar 2, 2016

Poked around a bit and see where the root cause is (but not how to fix it).

For whatever reason, _wfopen creates files on Windows even if there are bogus chars in the path (at least :, haven't tested others), and you can write content into such file.

If you look up such a file by full path with GetFileAttributesEx (which is what mingw_lstat does), Windows tells you the file is there, has nonzero length, etc. Thus in fscache = false case, Windows says the file is there so git does not report it as deleted.

If you instead enumerate files in a directory via FindFirstFile/FindNextFile (which is what fscache_lstsat ultimately does), the file is not reported. Thus with fscache = true, the file is reported as deleted.

There is also a non-hidden file that's created, matching the filename before the : char. e.g. create file foo:bar and a standard file foo is created with 0 content, and also a phantom file foo:bar that's invisible to FindNextFile

@latkin
Copy link
Author

latkin commented Mar 2, 2016

Alright, finally have real clarity. Git for windows doesn't currently handle Alternate File Streams correctly, this issue is simply a compilcation due to that.

File paths in Windows can't truly contain a : character, but : can be used in a path-like way when calling various system APIs, in order to access alternate streams. e.g. fopen("C:\\src\\foo:bar") opens the bar stream of the file C:\src\foo. The file c:\src\foo will be created if it doesn't already exist.

As mentioned above, if you call APIs like fopen, GetFileAttributesEx, etc using such a stream-qualified path, they will not report an invalid file name, they will simply start working with the custom stream. An API like FindFirstFile will only return proper file names, i.e. stream-unqualified. Thus there is inconsistency between behavior of fscache true/false.

Without being too expert on mingw or git internals, it seems to me like it would make sense for the various mingw_XYZ and fscache_XYZ functions to check for : in filenames that could be misinterpreted as a stream qualifier, and report the same error as if it was an invalid file path. I don't see any scenario where direct access to alternate file streams is the right behavior in git.

I'm happy to provide such a patch if this sounds right.

@dscho
Copy link
Member

dscho commented Mar 3, 2016

We already disallow tracking such illegal file names in Git for Windows, so I do not really see why we need to care all that much: you are talking about untracked files which are completely uninteresting to Git until the time when they are staged.

Please read this as my vote to close this ticket as WONTFIX.

@PhilipOakley
Copy link

I think the original problem is that the 'file with colon' is already in the repo he is trying to use - see the original

Motivation:I work on windows but most of my team is on OSX. They put files named likefoo:barin the repo. I can happily ignore them with a privategitignorewhencore.fscacheis turned off, but they pollute my working directory when I try to usecore.fscache` true (which I'd like to do for perf reasons).'

However, he has an existing workaround, which isn't great but works, so it may be useful if it's possible to give @latkin a pointer as to if his line of reasoning as to a suggested 'fix' (to the particular problem of ignoring those colon-files under core.fscache true) is in the right direction (it's his itch to scratch after that ..)

@PhilipOakley
Copy link

@latkin Seeing as you have a possible solution, a draft pull request may be a suitable method of exploring the issue (of ignoring colon-files when core.fscache is true). At least then @dscho can see if it is a small tweak, or a major code churn, and how far it reaches.

@latkin
Copy link
Author

latkin commented Mar 3, 2016

@PhilipOakley yes, that's correct - the issue is mysterious/confusing behavior for unsuspecting Windows users when working with repos of unix origin. I'm not concerned with malicious/determined Windows users who are deliberately trying to do bogus stuff.

Consider this toy repo: https://github.com/latkin/filetest It contains 3 files: foo:bar1, foo:bar2, and foo:bar3.

If you clone that repo today on Windows, the clone succeeds and no errors are reported. However there are problems:

  • A mysterious untracked, size-0 file foo has been created
  • Although none of the actual files has been created, for some reason nothing is reported as deleted
    • Until you change to core.fscache = true and now suddenly all 3 files are reported as deleted

This is very confusing and unintuitive. Expected experience is to simply report all 3 files as having invalid paths.

I've sent a proposal PR which is very tiny and fixes this for the 99% case - it simply blocks checkout of files with : in the name. Malicious windows users can still screw with things by deliberately creating alternate streams, but that's probably not worth worrying about.

@PhilipOakley
Copy link

----- Original Message -----

@PhilipOakley yes, that's correct - the issue is mysterious/confusing behavior for unsuspecting Windows users when working with repos of unix origin. I'm not concerned with malicious/determined Windows users who are deliberately trying to do bogus stuff.

Consider this toy repo: https://github.com/latkin/filetest It contains 3 files: foo:bar1, foo:bar2, and foo:bar3.

If you clone that repo today on Windows, the clone succeeds and no errors are reported. However there are problems:

a.. A mysterious untracked, size-0 file foo has been created 
b.. Although none of the actual files has been created, for some reason nothing is reported as deleted 
  a.. Until you change to core.fscache = true and now suddenly all 3 files are reported as deleted 

This is very confusing and unintuitive. Expected experience is to simply report all 3 files as having invalid paths.

I've sent a proposal PR which is very tiny and fixes this for the 99% case - it simply blocks checkout of files with : in the name. Malicious windows users can still screw with things by deliberately creating alternate streams, but that's probably not worth worrying about.

I think the commit message is missing the original "why" explanation that in the wild, in mixed communities, G4W should be able to cope with repositories where files-with-colons are already present.

The commit message should also cover the "how to use it" after the fix - does it still require a local gitignore to be set?, and confirm the patch fixes the core.fscache=true issue (and it still works when false).

Finally, as an aside, have you considered the sparse checkout options to mark those files as --assume-unchanged, so they can be in the index, but not in the file system (most descriptions on the web get that option wrong - it's actually about personal FS 'memory' management).

If you amend the commit message you can force push the commit to github and re-use the existing PR - this keeps the thread together.

If some of the confirmatory points don't need to be in the main part of the commit, then stick them after a three dashes line.

Philip

latkin added a commit to latkin/git that referenced this issue Mar 4, 2016
ref git-for-windows#679

Windows disallows the colon `:` character in file names. However many
win32 file APIs allow path specifications of the form `<file path>:<stuff>`
when reading or writing files. These are interpreted as pointing to
the *alternate data stream* named `<stuff>` within the `<file path>` file.

Documentation on alternate data streams:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa364404(v=vs.85).aspx

Git for Windows, ignorant of file streams, will incorrectly map a Unix
file named like `foo:bar` into the `bar` alternate stream of a file
`foo`. This results in an unexpected file `foo` with size 0 in the working
tree, and (depending on core.fscache setting), the expected "foo:bar" file
being flagged as deleted (or maybe not).

It would be preferrable if Git for Windows detected such files and issued
errors, similar to how it does for various other invalid path situations.
This would help reduce pain and make things less confusing for those
working in a mixed Unix/Windows team.

This change adds a check for ':' so that we never accidentally unpack a file
into an alternate stream by accident. Any file path with a ':' is considered
invalid, which is perfectly sensible for the purposes of git.

If such a file is indeed detected and blocked, users can instruct git to
totally ignore it via `git update-index --assume-unchanged`, just like
they need to today for other invalid path situations.

NB - a determined Windows user can still confuse the system in certain ways
by explicitly creating alternate streams, but that requires exceptional
user effort and is judged to be not worth pursuing at this time.

Signed-off-by: Lincoln Atkinson <lincoln.atkinson@hotmail.com>
@latkin
Copy link
Author

latkin commented Mar 4, 2016

Thanks @PhilipOakley, I've amended the commit message and PR comment.

The change doesn't fix the underlying inconsistency between standard lstat and fscache lstat, it simply blocks problematic files from being checked out in the first place. Not perfect, but it makes the fix considerably less invasive, while still covering the primary scenario.

A full fix would require changes to a bunch of mingw APIs, which seemed too much for an admittedly minor bug.

@LunNova
Copy link

LunNova commented Jul 5, 2016

I am experiencing a similar problem when using sparse checkout. Is it the same or should I open a new issue?

core.sparseCheckout is set to true.
core.fscache can be either true or false
core.preloadindex can be either true or false
git version shows git version 2.9.0.windows.1
A sparse-checkout file is used to exclude a folder with many small files in it.

I checkout a branch. git status shows no changes. I cherry-pick a commit from a different branch. The cherry-pick succeeds without needing to correct any conflicts. Now running git status shows a deleted file for every single file which was excluded by the sparse-checkout file. None of these files were changed by the cherry-picked commit.

Running git reset --hard HEAD at this point (since there are no actual changes and it's fine to reset) works as a workaround to fix the broken git status results.

Does this sound like the same problem?

edit: Sorry, after reading through the comments it seems obvious that this doesn't have the same root cause. Not sure why I posted this here. Opened a new issue #811

@amalic
Copy link

amalic commented Jul 21, 2016

Same here when checking out cBioPortal on Windows.
--> https://github.com/cBioPortal

@dscho
Copy link
Member

dscho commented May 15, 2017

So in case anybody wondered why I did not do anything about this ticket yet: you can thank the people who gave me the thumbs down on a wontfix, without spending any time nor effort to work toward a real fix. That was a real demotivator.

The real fix may actually be a fallout of something I try to work on, but do not get the time: it would appear that accessing paths via the \\?\ name space can actually be faster than via plain files, as the \\?\ name space sidesteps the Win32 file name checks (including the test for a colon, IIUC). This would allow Git for Windows to write those files as a side effect of a performance enhancement.

The major holdup in that endeavor is that the current core.longpaths feature (which does use the \\?\ name space to create files with an overly long path), which would almost provide everything needed for that performance improvement, uses a slow Win32 API to normalize the file path first, which goes against the goal of speeding everything up.

@jeffhostetler
Copy link

Something to keep in mind though is that even if git does successfully create "foo:bar" (using "\?" or whatever), you're still likely to have problems if the other tools in your tool chain don't support them too. I'm thinking here about your compiler, editor, make/build scripts, and etc.

It may be better to spend the time educating the Mac folks.

@dscho
Copy link
Member

dscho commented May 16, 2017

Something to keep in mind though is that even if git does successfully create "foo:bar" (using "?" or whatever), you're still likely to have problems if the other tools in your tool chain don't support them too. I'm thinking here about your compiler, editor, make/build scripts, and etc.

Right. I still want to support core.longpaths = false (maybe with a synonym that explains better what this does), but I would like to change the default to true. Granted, many tools do not handle those files gracefully. But that's not Git for Windows' problem, and hence it does not add to my maintenance burden anymore 😁

It may be better to spend the time educating the Mac folks.

I do not believe that will work.

@h110hawk
Copy link

h110hawk commented Oct 5, 2018

Just wanted to say this appears to still be an issue. I was going through my usual "I know how to solve git problems" troubleshooting (ahem rm -rf and reclone) but it persisted. I could not reproduce it on my linux vm doing the same "troubleshooting", just to make sure I wasn't crazy. Went to update git-for-windows and found a new build cut just 3 hours ago and am still encountering this.

It would be nice if there is not a long term fix to add something which detects and warns users of this behavior. I realize it's not strictly gits "job" to warn users like this, but it's certainly unexpected behavior for most users.

Thanks! I do really like the availability of git-for-windows, it certainly helps my workflow out.

$ git --version
git version 2.19.1.windows.1

@PhilipOakley
Copy link

@h110hawk Would you have any spare time to help with curating this issue? Even if it is just nailing down a specific aspect of the issue to allow the multiple views to be cross-compared.

A 'good' solution is better than one that only resolves the issue(s) for half the users. The Windows compatibility (relative to Linux expectations) is part of the problem.

@dscho
Copy link
Member

dscho commented Feb 27, 2019

I still want to support core.longpaths = false (maybe with a synonym that explains better what this does), but I would like to change the default to true.

In the meantime, I performed some testing and it seems that the performance benefits I have come to expect did not materialize. Worse, since Git for Windows typically works on relative paths, switching to longpaths=true would impact performance negatively.

Coming back to the reported problem, I would tend to mark this as a "Known Issue". And the underlying issue is not that git status returns different results when core.fscache is enabled vs when it is disabled, but the underlying issue is that Git thinks it just wrote a file called with:colons when it wrote a stream called colons for a file called with.

dscho added a commit to dscho/git that referenced this issue Dec 19, 2019
This topic branch refuses to create files with reserved file names, such
as `aux.c` or `nul.txt`.

This addresses git-for-windows#679.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
dscho added a commit to dscho/git that referenced this issue Dec 27, 2019
This topic branch refuses to create files with reserved file names, such
as `aux.c` or `nul.txt`.

This addresses git-for-windows#679.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
git-for-windows-ci pushed a commit that referenced this issue Dec 29, 2019
This topic branch refuses to create files with reserved file names, such
as `aux.c` or `nul.txt`.

This addresses #679.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
dscho added a commit that referenced this issue Dec 29, 2019
This topic branch refuses to create files with reserved file names, such
as `aux.c` or `nul.txt`.

This addresses #679.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho
Copy link
Member

dscho commented Jan 2, 2020

  • With core.fscache false, working directory is shown as clean by git status

  • With core.fscache true, file foo:bar is flagged with deleted status.

In the next version, foo:bar will trigger an error: the colon is not a legal character in a file name on Windows.

@dscho dscho closed this as completed Jan 2, 2020
dscho added a commit that referenced this issue Jan 4, 2020
This topic branch refuses to create files with reserved file names, such
as `aux.c` or `nul.txt`.

This addresses #679.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
dscho added a commit that referenced this issue Jan 13, 2020
This topic branch refuses to create files with reserved file names, such
as `aux.c` or `nul.txt`.

This addresses #679.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
git-for-windows-ci pushed a commit that referenced this issue Jan 16, 2020
This topic branch refuses to create files with reserved file names, such
as `aux.c` or `nul.txt`.

This addresses #679.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
git-for-windows-ci pushed a commit that referenced this issue Jan 17, 2020
This topic branch refuses to create files with reserved file names, such
as `aux.c` or `nul.txt`.

This addresses #679.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
git-for-windows-ci pushed a commit that referenced this issue Jan 22, 2020
This topic branch refuses to create files with reserved file names, such
as `aux.c` or `nul.txt`.

This addresses #679.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants