-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WinError 5] Access is denied during cleanup/rmtree #7280
Comments
I view this as an issue with the virus scanner. If it's not able to scan the file quickly enough to avoid issues with programs using temporary files, it should open those files in a mode that does not prevent deletion. If it's not possible to get the virus scanner fixed, you could exclude pip's temporary directory from virus scanning. Increasing the timeouts on deletions would cause problems for cases where files genuinely couldn't be deleted, resulting in longer delays for the user before a useful error was produced. |
@pfmoore What is actually happening is this: the download succeeded. but pip died because it could not delete the temporary directory it was using. IMHO the install should not fail because the cleanup handler for the download failed. That said, I can see your point about longer delays for useful errors on to address your suggestion that users reconfigure their virus scanner - in many workplaces, the user does not have control over virus scanner configuration. so it's not possible to exclude any directory. besides |
In principle, that sounds reasonable.
I work in precisely such an environment, and Python is highly non-core to our business, so requesting changes for the benefit of Python is unlikely to get any sympathy. So I understand your point here, but I still strongly believe that pip shouldn't be catering for the quirks and failures of (IMO badly written or badly configured) virus scanners. General principles like "we shouldn't fail just because cleanup errored" are OK, but that's different. |
Typically when I execute a command, I expect the behavior to be the same across multiple invocations. In the event that the behavior is not the same I would want that indicated. A warning may be good enough when executed from the command line by a user, but we have to consider that pip's CLI is it's only public API. Add on to this that we make hardly any guarantees about output format, then the exit code is essentially the only way we have to communicate that something unexpected happened and someone should take a look. A command being unable to clean up after itself, especially if the files may be large, is something unexpected that I'd want to know about if running in an automated way. |
@chrahunt so then to ensure backward compatibility perhaps we need to add CLI options so that users can opt-in to dealing with system software that scans downloaded files. So I'll propose the following CLI options:
either of them will work. second one is probably nicer. but more code i think. |
I'm generally weakly against adding more options given the number and disorganization of the ones we have (see #6221). That does bring up a good question though, which is why don't we see these kinds of options in other apps? How do they handle it? Maybe if we find good examples of common applications that have made these kinds of decisions, they'll have different approaches or justification than what we're considering here. Conda could be one example. |
In the most recent code base, the section of code that fails can be found here: pip/src/pip/_internal/operations/prepare.py Lines 253 to 281 in de736c6
It fails on the |
In my experience, it's because this type of error simply isn't that common, and people typically accept that when anti-virus programs do this type of thing, it's down to the antivirus.
I'm definitely against proliferating options for edge cases like this. We should be making the decision, not avoiding the issue and making the user choose. In this case, I personally think the current behaviour is fine, but if the consensus is to change, then OK, as long as we don't dump the choice on the user. |
@chrahunt what conda is doing is rather elaborate and sophisticated - essentially doing whatever it takes to remove the directory structure as fast as possible, with special optimizations for windows. it even does an exponential backoff to optimize the waiting between retries. in the worst case (for windows only), they will rename the directory with a special batch script. I would prefer to keep this change minimal and just set pip/src/pip/_internal/utils/misc.py Lines 136 to 137 in de736c6
Hacking this file is one of the approaches I've been recommending for folks who have this problem. The pip/src/pip/_vendor/retrying.py Lines 59 to 71 in de736c6
but I haven't looked into it yet. |
Well I was able to try out the exponential backoff by modifying the decoration like so @retry(stop_max_delay=XXXX, wait_exponential_multiplier=100, wait_exponential_max=2000)
def rmtree(dir, ignore_errors=False): Not sure this buys us anything except slightly faster retry on transient delete problems. How do you want to proceed? Shall I submit a PR with basic changes to the retry decorator? |
IMO the exponential backoff isn't critical, I was hoping there was some fool-proof solution with no downsides that would just need to be implemented. :) At this point we just need to haggle over the max delay. You suggested 15s above, did that number come from somewhere specific? Given that not many people are complaining about this, the increase should probably be pretty modest. |
15 seconds was an attempt at balancing what I'd prefer (60 sec) and what @pfmoore wants (leave as-is at 3 sec). Based on some testing, 12 seconds is probably fine too. I just did an unscientific experiment to measure the delay between python attempting to delete the file and the virus scanner cylance releasing the file. I did this measurement 10 times on my machine. In seconds, these were the timings: 2, 16, 8, 18, 48, 11, 12, 13, 15, 17. If I did my math correctly, the 25th, 50th, 75th, and 95th percentile are 11.25, 14, 16.75, and 34.5 respectively. The testing methodology is as follows:
By both reducing the max delay and the lowering virus scanner priority, I can reproduce the problem consistently. Ideally, the test would download the largest file hosted at files.pythonhosted.org on a slower windows machine and/or network. Slowing down the virus scanner is not really a fair test because it inflates the difference between pip and the virus scanner. But it could be considered a simulation of a worst case scenario. As a side note, for a few of the test runs I measured the time between when python started the download and when the virus scanner started reading the file. This delay ranged from 1 to 16 sec but mostly under 10 sec. I would argue that there have been complaints about this bug for a long time and occasional mumblings about this over the internet. For example, not sure what happened to this issue #2892 |
I've written a utility that can create similar problems even without Cylance Protect, and can make the failure more consistent. This is https://github.com/danizen/watchandlock. I can run it with my own package, confsecrets, which is quite small.
In another command-prompt in the same virtual environment pip install confsecrets TL;DR - I'm not getting exactly the same error, possibly because I'm opening the file and not the directory. I've also confirmed using procmon that Cylance Protect is simply doing the following:
|
I wonder whether the answer is simply to ignore the error on Windows only, and attempt to delete a matching directory before doing and installation of de-installation: shutil.rmtree(..., ignore_errors=platform.platform().startswith('Windows')) Or something like the above. |
I have several questions:
|
Also, after extending @choyrim's pull request to add a test and code to work-around WinError 32 as well as 5, I see that the problem moves (in the real world) to another location:
|
Please ignore these questions... I see now the |
The more I think about this, the more I like simply ignoring the error when it happens if we are on Windows. There are a couple of reasons to like this:
I prototyped a simple change to I will pursue this alternate fix and be ready to test pip with @choyrim's fix once it has resolved. I will submit a pull request if I think it is good behavior, and we can see; I like that @choyrim's fix helps the error/warning message. |
- Try to delete the temporary directory, as most of the time this works - If it fails on windows due to an Access error, it clearly is not an Access error because we just created the directory. Warn the user and continue.
- Add issue description file
wow, i missed a lot. thank you for joining the fun.
I also found that spawning another process to hold open the file did not reproduce the "access denied" error and instead got a "... being used by another process ..." error which is what is reported by python 2.7 i think. I've gone over the procmon logs in detail to see if there is some other special thing that cylance is doing and couldn't find anything. the only difference is that cylance is running under a different user (SYSTEM). but i haven't figured out how to test that. |
I suggested something similar earlier in the thread - to ignore the error on failure. @pfmoore and @chrahunt felt that it was not appropriate to accomodate virus scanners. however, I had suggested ignoring the failure in general. @pfmoore @chrahunt would you be more open to ignoring the failure on windows only? |
@pfmoore @chrahunt , I'm talking about a very specific ignore in --- a/src/pip/_internal/utils/temp_dir.py
+++ b/src/pip/_internal/utils/temp_dir.py
@@ -4,6 +4,7 @@ import errno
import itertools
import logging
import os.path
+import sys
import tempfile
from contextlib import contextmanager
@@ -124,7 +125,21 @@ class TempDirectory(object):
"""
self._deleted = True
if os.path.exists(self._path):
- rmtree(self._path)
+ try:
+ rmtree(self._path)
+ except OSError as e:
+ skip_error = (
+ sys.platform == 'win32' and
+ e.errno == errno.EACCES and
+ getattr(e, 'winerror', 0) in {5, 32}
+ )
+ if skip_error:
+ logger.warning(
+ "%s (virus scanner may be holding it)."
+ "cannot remove '%s'",
+ e.strerror, e.filename)
+ else:
+ raise |
Also note if this problem occurs and is not ignored, pip fails, and the pip temporary directory in %TEMP% remains anyway, so we are making it better not worse:
|
From MSDN:
I've not tested it, but this is clearly intended to allow deletion by others. |
Both python (pip) and CylanceSvc.exe are opening the wheel file with this flag. Like @pfmoore said, I'd rather avoid going way too deep in a solution. |
- TempDirectory() tries to delete the directory as normal - if cleanup fails on Windows due to EACCES, warn about virus scanner - This is a more specific error than previous error, but does not change the behavior when a user is attempting to pip uninstall in a system directory - This changes the messaging, but risks leaving the directory behind. - Leaving the directory behind, however, is what is already happening - The fix for pypa#7479 gives the Virus scanner more time to finish its work, but there is still a possibility for a race condition that leaves the impression that there is an error in pip itself.
@pfmoore @chrahunt, pull request #7544 is my attempt to fix by issuing a warning. It complements the change for issue #7479; thanks. It is very important that the original behavior of warning about permissions remains, even on Windows, because pip install is globally used by my engineers, despite my rants about virtual environments. So, I like fixing this at the For my engineers who do use virtual environments, this problem has just become too frequent for me to ignore, but from the level of pip collaborators, I can see why it is minor. |
- IOError persists as an issue in Python 2.7.17 on win32
- TempDirectory() tries to delete the directory as normal - if cleanup fails on Windows due to EACCES, warn about virus scanner - This is a more specific error than previous error, but does not change the behavior when a user is attempting to pip uninstall in a system directory - This changes the messaging, but risks leaving the directory behind. - Leaving the directory behind, however, is what is already happening - The fix for pypa#7479 gives the Virus scanner more time to finish its work, but there is still a possibility for a race condition that leaves the impression that there is an error in pip itself.
One of my programmers posts this morning to slack (8:33 am):
|
I have only seen this issue twice since pip 20 was released, and in each case, the virtual environment involved had not been upgraded. Checking in - should this issue be resolved then? |
If this still happens occasionally for users, we should keep this open IMO - to indicate that this needs investigation eventually. |
Is there some way for to diagnose / infer that the issue is caused by a virus scanner or something? @pfmoore Would it make sense to include "maybe it's your virus scanner" in the error message we're printing? |
Due to the changes in pip 20+, when/if this happens, it will not prevent the package from installing, but instead prevent the directory from being removed. |
Looking at #7479 it seems like this problem isnt an issue anymore. I've moved on to using WSL and then finally getting a linux laptop. so closing this seems appropriate. |
I'm mildly against doing so. Virus scanners are the most likely culprit, but it would be awfully easy for a message like that to get seen as "pip is advising people to turn off their virus scanners" 🙁 The OP's message shows that the user was aware that it was the virus scanner - so adding a clarification to the message wouldn't have helped here in any case. |
hinting that the antivirus software may be preventing the deletion would've helped a lot. When I first saw the message, I would just keep retrying. Sometimes older versions would install more easily (depending on the size of the download I suppose). I eventually hacked up a workaround by changing my local copy. But when I heard about other ppl at work with the same issue, I dug in and tried to find a root cause - which was far from obvious. however, it looks like the change in #7479 no longer deletes at the end of the download so it should let the install finish, right? so then this becomes a non-issue if i understand correctly. |
Agreed.
pip still tries to cleanup the directory, but after the installation completes. So, what happens if this happens again then? The developer/user gets the error and retries... I'm for closing this because if the developers retries the install, they will see that the install already happened. And ... they still have the message to indicate that the directory was unable to be deleted. |
That's True. OTOH, I think we can phrase this nicely. if WINDOWS:
print("We've received reports that aggressive antivirus software can also result in such errors, in which case, re-running the pip command usually succeeds.") ;) |
Apparently the order of operations goes:
If 4 gets interrupted because of an antivirus, 5 doesn't happen. If the order of 4 and 5 also get flipped, that'll be great! Then the only issue will be be that a Temp directory wasn't deleted and there'd be an error message to indicate it. When upgrading multiple packages in one command, the cleanup error prevents the subsequent upgrades from installing. If the download directory cleanup is done together after installing all the upgrades, that'll be great. But not sure how much work that would be. Besides, re-running the upgrade command is not really a big deal. |
Environment
Description
pip install
ordownload
fails on Windows during cleanup because (real-time) virus protection is still scanning files that pip is trying to remove.using Process Monitor (procmon), I was able to confirm that
CylanceSvc.exe
(virus scanner) was reading the files while python was trying to the same files.I tried both increasing the retry wait (in
misc.py
) and ignoring the rmtree error (in temp_dir.py). both approaches worked.This issue indicates that other anti-virus scanners cause similar problems:
aws/aws-cli#2654
Expected behavior
The package should install.
How to Reproduce
pip install
ordownload
a large package. I was usually able to reproduce when installing or downloadingawscli
. If I have difficulty reproducing, I can lower the priority of theCylance.exe
to Background (4), and the symptoms are easier to reproduce consistently.using install:
pip install awscli
using download:
pip download awscli
pip download
is better because it provides a stacktrace.Output
using
pip install
, the error message is quite terse.using
pip download
, the stacktrace is more helpful.The text was updated successfully, but these errors were encountered: