exp run: cannot clean up temp directory runs on Linux + NFS #5641
Comments
For the record: Another user is running into this issue https://discord.com/channels/485586884165107732/485596304961962003/823940098458517581 |
Having the exact same issue! Not able to use the --run-all functionality, but can run experiments individually. |
Looks like this is something specific to NFS on Linux (as in the NFS filesystem, not network-mounted storage in general). |
It looks like |
Issue is specific to NFS and pygit2 combination. The NFS client works by creating Disabling the pygit2 backend (so that CLI git is used as a replacement) makes this issue go away, so it appears that whatever is keeping an open file handle to We currently call pygit2's The simplest solution on the DVC side is probably for us to open an issue in pygit2 and/or libgit2, and then direct users running with |
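For anyone trying to confirm the open-handle theory on their own machine, here is a minimal diagnostic sketch. It assumes the usual Linux NFS client behavior: when a file is deleted while a process still has it open, a hidden placeholder whose name starts with `.nfs` stays in the directory until the handle is closed, so the directory looks empty to a plain `ls` but cannot be removed. The helper name `find_nfs_placeholders` is hypothetical and not part of DVC or pygit2.

```python
import os


def find_nfs_placeholders(directory):
    """Return the sorted names of entries in `directory` that start with '.nfs'.

    Assumption: on a Linux NFS mount, such entries are placeholders left
    behind for files that were deleted while still open in some process.
    """
    return sorted(
        name for name in os.listdir(directory) if name.startswith(".nfs")
    )
```

Running this against the pack directory named in the error while the failing command is still alive should show whether such placeholders are what keeps the directory from being removed.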
Also for the record, with regard to the original issue:
|
After some more investigation, it looks like libgit2's So it's a side effect of mixing git backends. When only using pygit2/libgit2, pygit2's As a workaround in DVC, we can explicitly free the pygit2 repo after any operations that would potentially touch packfiles, so that the file handles are released immediately when we are done with them. |
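The workaround described above could be sketched roughly as follows. This is not DVC's actual implementation. It only assumes that the repo object exposes a `free()` method, as `pygit2.Repository` does, which releases handles to the Git database without deallocating the repository. The name `released_after` is hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def released_after(repo):
    """Yield `repo`, then release its open file handles.

    Assumption: `repo` has a free() method (pygit2.Repository provides one).
    """
    try:
        yield repo
    finally:
        # Runs on both normal exit and exceptions, so packfile handles are
        # released as soon as the operation is done.
        repo.free()
```

With pygit2 installed, this might be used as `with released_after(pygit2.Repository(path)) as repo: ...` around any operation that could touch packfiles.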
Hello, I am still having this issue. When I check the pack directories, one is empty and one has files in it:
Below is the debug output:
|
@pmrowla, sorry, I don't have two computers to simulate this condition. Any ideas about it? |
There are new calls that have been implemented in pygit2 since this issue was originally closed; we likely just need to add the wrapper to free the handles in the new calls. Also, you don't really need two computers to test this; you just have to mount an export using NFS inside a VM or container to reproduce the issue. |
Bug Report
Description
When using the exp run --run-all feature, the command can never finish due to an error deleting a git folder, even though that folder seems to be empty.
Reproduce
dvc exp run --queue -S training.loss.name=mse
dvc exp run --run-all -j1 --verbose
Expected
I expect the same output as running dvc exp run -S training.loss.name=mse, that is, a successful experiment with proper metrics shown in dvc exp show. Instead, when using the queueing functionality, the command errors and the metrics are not properly saved. The JSON representation of dvc exp show also states the experiment is still queued.
Environment information
Output of dvc doctor:
Additional Information (if any):
Please note that at the end of this, it states that the .git/objects/pack directory is not empty, even though it can be deleted using rmdir without issue.