mingw: Fix unlink for open files #1666

Closed
wants to merge 1 commit into from

@orgads orgads commented May 8, 2018

This is an attempt to implement @cbuchacher's idea in #1653

Files that were opened by a (possibly another) process, either for reading or for writing, failed to be replaced by Git on checkout.

If a file is opened with FILE_SHARE_DELETE share mode, it is possible to delete or rename the file, but it is not possible to create a new file until the file's handle is closed.

To overcome this, unlink now renames the file before actually unlinking it.

If rename fails, then the file either doesn't exist, or it is open without share permissions. In that case it is still possible to unlink the file, and the file will effectively be deleted when it is closed, so the existing behavior is preserved.

If rename succeeds, unlink the temporary file, making it possible for the real file name to be reused.
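
As a rough illustration of the rename-then-unlink flow described above, here is a minimal portable C sketch. It is not the code from this PR: the actual change lives in compat/mingw.c and works with Windows APIs, whereas this sketch uses POSIX `rename()` and `unlink()`. The helper name `unlink_via_rename` and the `.deleting` suffix are made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: rename `path` to a temporary name in the same
 * directory, then unlink the temporary name. If the rename cannot be
 * attempted or fails, fall back to unlinking `path` directly (the
 * pre-existing behavior). Returns 0 on success, -1 on failure.
 */
static int unlink_via_rename(const char *path)
{
	char tmp[4096];
	int len = snprintf(tmp, sizeof(tmp), "%s.%ld.deleting",
			   path, (long)getpid());

	if (len < 0 || (size_t)len >= sizeof(tmp))
		return unlink(path); /* temp name too long: old behavior */

	if (rename(path, tmp) != 0)
		return unlink(path); /* rename failed: old behavior */

	/* Once the rename has succeeded, the original name is free again. */
	return unlink(tmp);
}
```

Note that if the final `unlink()` fails, the file is left behind under the temporary name; a real implementation would have to decide how to handle that.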

Fixes #1653.

@orgads orgads changed the title mingw: Fix unlink for open files - take 2 RFC: mingw: Fix unlink for open files - take 2 May 8, 2018
@orgads orgads force-pushed the fix-unlink-by-rename branch 2 times, most recently from 51233d7 to 2dc671e Compare May 9, 2018 20:27
@orgads orgads changed the title RFC: mingw: Fix unlink for open files - take 2 mingw: Fix unlink for open files May 9, 2018
@orgads orgads force-pushed the fix-unlink-by-rename branch 3 times, most recently from fabf929 to 6662cf1 Compare May 9, 2018 20:39
@orgads orgads mentioned this pull request May 9, 2018
1 task
@drizzd drizzd left a comment

I think this change is reasonable, and it fixes a real issue. It would be good to add a test as confirmation and to detect future regressions. The test could look similar to the shell scripts we have been using to reproduce the issue.

I would still like to check the impact on performance. The rename adds a second operation on top of the unlink for each checked-out file, so checkout performance could degrade by up to a factor of 2. Let's run a few tests with largish repos.

We should perhaps also limit this behavior to the checkout operation. In particular, I don't think it is necessary to handle .git directory files in this way. We could add an unlink_safely operation which we only use in checkout_entry for now.

@orgads
Author

orgads commented May 10, 2018

Thanks for the thorough review. I'll address your comments later.

Regarding performance, I ran a test with the Qt Creator repository, and it does have a significant impact (although less than a factor of 2):

I did several checkouts back and forth between 3.0 and 4.7 to warm the system's cache. The result was 4s for direct unlink, and 6.5s for rename+unlink.

$ time git.orig checkout origin/3.0
Checking out files: 100% (12332/12332), done.
Previous HEAD position was 8e7c1bf1ab Android: Remove m_extraAppParams and m_extraEnvVars from Runnable
HEAD is now at 470f7973d2 ios: simulator support for Xcode 5.1

real    0m4.062s
user    0m0.000s
sys     0m0.031s

$ time git checkout origin/3.0
Checking out files: 100% (12332/12332), done.
Previous HEAD position was 8e7c1bf1ab Android: Remove m_extraAppParams and m_extraEnvVars from Runnable
HEAD is now at 470f7973d2 ios: simulator support for Xcode 5.1

real    0m6.444s
user    0m0.015s
sys     0m0.015s

@drizzd

drizzd commented May 10, 2018

Another option would be to overwrite the existing file by opening it for writing without unlinking it. There is a comment saying that we unlink so that the file mode will be handled by the system. But maybe that is not an issue on Windows.
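
For comparison, a minimal sketch of the overwrite-in-place idea in portable C. The helper name `overwrite_in_place` is hypothetical, and the sketch uses standard `fopen()` rather than anything from Git's code base. Because the file is never removed, it keeps whatever mode it already had, which is the concern the code comment mentioned above is about. Whether the open succeeds on Windows while another process holds the file depends on that process's sharing mode, which this sketch does not address.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: replace the contents of `path` by truncating and
 * rewriting the existing file instead of unlinking and recreating it.
 * Returns 0 on success, -1 on failure.
 */
static int overwrite_in_place(const char *path, const void *buf, size_t len)
{
	FILE *f = fopen(path, "wb"); /* "wb" truncates an existing file */

	if (!f)
		return -1;
	if (fwrite(buf, 1, len, f) != len) {
		fclose(f);
		return -1;
	}
	/* fclose() flushes buffered data, so its result matters too. */
	return fclose(f) == 0 ? 0 : -1;
}
```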

@orgads orgads force-pushed the fix-unlink-by-rename branch 3 times, most recently from beabf3b to 3ec215a Compare May 29, 2018 21:12
@orgads orgads force-pushed the fix-unlink-by-rename branch 2 times, most recently from 3edae20 to 231bd3f Compare June 10, 2018 19:47
@orgads orgads force-pushed the fix-unlink-by-rename branch from 231bd3f to da2ad8b Compare June 22, 2018 04:56
@orgads
Author

orgads commented Jun 22, 2018

I pushed a revised change that tests for exclusive writability before renaming. Please review it. I'll run some benchmarks later.

@orgads
Author

orgads commented Jun 22, 2018

Ok, a similar benchmark as before with Qt Creator 3.0/4.7:

$ time git.orig checkout origin/3.0
Checking out files: 100% (12308/12308), done.
Previous HEAD position was ced5f89235 Kit: When loading from a map, allow empty IDs
HEAD is now at 470f7973d2 ios: simulator support for Xcode 5.1

real    0m5.533s
user    0m0.015s
sys     0m0.015s

$ time git checkout origin/3.0
Checking out files: 100% (12308/12308), done.
Previous HEAD position was ced5f89235 Kit: When loading from a map, allow empty IDs
HEAD is now at 470f7973d2 ios: simulator support for Xcode 5.1

real    0m5.631s
user    0m0.000s
sys     0m0.031s

The difference is negligible (less than 2%). Please consider accepting this change. It doesn't require any special configuration, and is platform-specific, so I believe it is less likely to be rejected by the Git maintainers. @dscho: Do you have an opinion about this?
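
To make the comparison with the earlier run explicit, the relative overhead can be computed from the `real` times reported in the two benchmarks. The helper below only illustrates that arithmetic; `relative_overhead` is a made-up name, and each figure comes from a single timed run.

```c
#include <assert.h>
#include <stdio.h>

/* Relative slowdown of `patched` versus `baseline`, as a fraction. */
static double relative_overhead(double baseline, double patched)
{
	return (patched - baseline) / baseline;
}
```

With the times above, `relative_overhead(5.533, 5.631)` is about 0.018 (1.8%), while the May 10 run gives `relative_overhead(4.062, 6.444)`, about 0.59 (59%).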

@orgads orgads force-pushed the fix-unlink-by-rename branch from da2ad8b to d3af0cf Compare June 22, 2018 12:01
@ghost

ghost commented Jun 22, 2018

I went to great lengths to prepare and submit a patch to the mailing list. There were no fundamental concerns about the patch, the only question was "what do we need this for?" You gave the following answer:

https://public-inbox.org/git/CAGHpTBJ9WiWdJw=SgxJpWqP9CucANatafx6iwCRCRY15wTBsVg@mail.gmail.com/

Anyway, with Qt Creator 4.7 this should be a non-issue, so I'm reluctant about
this change here.

So, I am wondering why you brought this non-issue up in the first place, or why you are still going on about this.

@orgads
Author

orgads commented Jun 22, 2018

Hi,

I appreciate your efforts.

It did look to me like the issue was resolved in clang, but apparently it is still incomplete, specifically for rebase, and possibly other scenarios.

The original problem was that clang never released handles. Now it does release them, but the files are still locked while it is parsing.

See the last comment in https://bugreports.qt.io/browse/QTCREATORBUG-15449

There was some negative feedback in the ML about adding a configuration variable (and changing behavior) for solving a problem of a specific tool. I didn't have good arguments against that, especially when I was convinced that the issue is already solved.

I still tend to agree that it is better to fix the issue in Qt Creator, but until this is done, and in case other similar issues arise, a fix in Git will be useful.

@ghost

ghost commented Jun 22, 2018

The same concern applies to this PR here. Especially Edward's reluctance to support this in libgit2 should worry us, since libgit2 is also used on Windows.

The fact that you make changes only in the Windows-specific sections is not an advantage. On the contrary, it puts the maintenance burden on Git for Windows, which has far fewer contributors than Git as a whole.

@orgads
Author

orgads commented Jun 22, 2018

Ok, I can reply on the ML and say that the problem is still relevant. If my patch is acceptable, I'll gladly post it to core Git. I wanted to start here because I preferred to get some feedback, and possibly merge it to GfW before posting upstream, which is typically a longer process.

First, let's try to decide which of these changes we would like to merge.

As I see it, your change should be more efficient than mine (I didn't run my "benchmark" on it yet), but it requires extra configuration, and doesn't cover real unlink (like git clean, or checkout that removes a used file).

My change covers these cases, but it costs some extra system calls.

We can try to merge one of them or both. How would you like to proceed?

@ghost

ghost commented Jun 22, 2018

To decide this I think we need to better understand the use case. What prevents us from fixing this on Qt Creator side? If I understand correctly other IDEs do not have this problem with Git. If those IDEs use mmap too, what is different about Qt Creator? If other IDEs do not use mmap, why does Qt Creator have to use it?

@orgads
Author

orgads commented Jun 22, 2018

Qt Creator uses Clang as a backend for its code model. Earlier versions of Qt Creator that did not use Clang did not have this problem. Other IDEs that don't use Clang don't have this problem. If another IDE uses Clang (e.g. with the clangd language server), it is likely to hit the same problem if it tries to refresh open files when they are changed.

Qt Creator supports rebase, and the rebase runs while Qt Creator is in focus. It is a non-blocking operation (making it blocking would be hard, nearly impossible, since rebase sometimes needs an editor, such as for reword or interactive rebase, and this editor is the IDE itself), and files are reparsed when they change. If you perform a rebase that changes a file more than once, Git fails because Clang keeps the file's handle open until it finishes parsing (the previous bug was that it never released this handle until you closed the file). So other IDEs can hit this bug if they reparse files even when they're in the background, or if they support rebase.

I proposed a way to work around this problem in Qt Creator (particularly for rebase), but I don't have enough experience with the code model to implement it myself, and Ivan, who currently maintains the code model alone (the 2 other maintainers are on long vacations), has other priorities.

@drizzd

drizzd commented Jun 23, 2018

If another IDE uses Clang (e.g. with clangd language server), it is likely to hit the same problem if it tries to refresh open files when they are changed

The fact that Qt Creator uses Clang is an implementation detail which makes no difference from Git's point of view. For Git we only need to ask ourselves if we should support deleting work tree files which are somehow locked by other processes. We should not support this if it can be easily avoided in the other process, for example by not using mmap. Other IDEs don't do it, so this may indicate that it can be solved for Qt Creator too. If it's Clang doing this underneath, then the question is why is Clang mmap'ing the files. If there is a good reason to do this, and Qt Creator benefits from this method (otherwise the behavior could be disabled with a Clang option), then we have a use case which Git should support.

return res;
}

static int safe_unlink(const wchar_t *wpathname, int (*unlinker)(const wchar_t *))

This comment was marked as off-topic.


@orgads orgads force-pushed the fix-unlink-by-rename branch from 692ec48 to 7bb5581 Compare November 11, 2018 20:38
@orgads orgads force-pushed the fix-unlink-by-rename branch from 7bb5581 to d6d3710 Compare November 12, 2018 09:51
Member

@dscho dscho left a comment

I hope that my comments are helpful. If the git reset --hard/git checkout performance is degraded by this patch, we will have to see whether we can hide it behind a config flag.

@dscho
Member

dscho commented Nov 26, 2018

@orgads where are we at with this PR? Do you think you can test the performance impact and address @drizzd's and @piscisaureus' suggestions/comments before v2.20.0 is due (on, or shortly after, December 3rd)?

@orgads
Author

orgads commented Nov 26, 2018

Regarding performance, I ran some tests on a rather large repository, and the impact was negligible IMO (search the previous comments).

I'll try to address the comments on time. Thanks for the heads up.

@orgads orgads force-pushed the fix-unlink-by-rename branch from d6d3710 to 0344641 Compare November 28, 2018 09:39
@dscho
Member

dscho commented Nov 28, 2018

Thank you for the update. It reads quite nicely.

@drizzd your review is still blocking this, could you have a quick look whether you still have concerns? I think we should take this for v2.20.0.

@drizzd

drizzd commented Dec 9, 2018

@dscho My main concern with this change is the added complexity of the retry loop and the lack of coverage for the retry code paths. We do not yet have a common understanding of the scenario which leads to retries. As I understand it, it should not be difficult to add a test for it.

@dscho
Member

dscho commented Dec 10, 2018

@drizzd okay. I also think that we could probably use some more time to hammer things out, e.g. seeing whether the SetFileInformationByHandle() function can be used (as we already call CreateFileW()), and I would love to have some perf numbers on larger trees.

So unfortunately, it did not make it into v2.20.0... But I am confident that we can get this polished together before long.

@orgads
Author

orgads commented Dec 25, 2018

Sorry for the delay. I still hit these errors frequently, even with this fix. I'll need to hunt and fix this case before proceeding. I'll try to address all your comments, but it might take time.

@dscho
Member

dscho commented Dec 27, 2018

I'll need to hunt and fix this case before proceeding.

Thank you for being so diligent!

@orgads
Author

orgads commented Feb 3, 2019

Hi,

I don't know when I'll be able to complete this. If any of you is willing to take over this patch, please go ahead.

Some observations:

  • Renaming in the same directory can create new problems. For example, rsync fails for a temp-that-is-about-to-be-deleted file. I think it might be better to use the recycle bin instead.
  • The initial CreateFile can be changed to use FILE_ATTRIBUTE_NORMAL | FILE_ATTRIBUTE_DIRECTORY | FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT | FILE_FLAG_DELETE_ON_CLOSE (without any sharing), instead of calling unlink and rmdir.
  • I sometimes get a message like error: unable to unlink old '<file>': No error. This needs further investigation.
  • Different kinds of open and sharing modes need to be tested. I wrote a test helper that can do this, but I don't have it here. I'll post it later.

Can you please assist?

@git-for-windows-ci git-for-windows-ci changed the base branch from master to main June 17, 2020 18:11
Files that were opened by a (possibly another) process either for
reading or for writing failed to be replaced by Git on checkout.

If a file is opened with FILE_SHARE_DELETE share mode, it is possible
to delete or rename the file, but it is *not possible* to create a
new file until the file's handle is closed.

To overcome this, unlink now renames files that are not writable before
actually unlinking them.

If rename fails, then the file either doesn't exist, or it is open
without share permissions. In that case, it is still possible to unlink
the file. The file will effectively be deleted when it is closed, so
the existing behavior is preserved.

If rename succeeds, unlink the temporary file, making it possible for
the real file name to be reused.

Fixes git#1653.

Signed-off-by: Orgad Shaneh <orgads@gmail.com>
@orgads orgads force-pushed the fix-unlink-by-rename branch from 0344641 to 431bb40 Compare March 16, 2021 10:11

@Pearlish111 Pearlish111 left a comment

(invalid)

@dscho
Member

dscho commented Dec 2, 2023

Closing in favor of #4719.

@dscho dscho closed this Dec 2, 2023
Successfully merging this pull request may close these issues.

Open files are deleted by checkout
5 participants