-
-
Notifications
You must be signed in to change notification settings - Fork 31.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pathlib.Path().rglob() is fooled by symlink loops #70200
Comments
I created a symlink loop as follows: mkdir tmp Then I tried to list it recursively using rglob():
This caused an infinite regress:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Users/guido/src/cpython36/Lib/pathlib.py", line 1065, in rglob
for p in selector.select_from(self):
File "/Users/guido/src/cpython36/Lib/pathlib.py", line 549, in _select_from
for p in successor_select(starting_point, is_dir, exists, listdir):
File "/Users/guido/src/cpython36/Lib/pathlib.py", line 548, in _select_from
for starting_point in self._iterate_directories(parent_path, is_dir, listdir):
File "/Users/guido/src/cpython36/Lib/pathlib.py", line 538, in _iterate_directories
for p in self._iterate_directories(path, is_dir, listdir):
[...]
File "/Users/guido/src/cpython36/Lib/pathlib.py", line 538, in _iterate_directories
for p in self._iterate_directories(path, is_dir, listdir):
File "/Users/guido/src/cpython36/Lib/pathlib.py", line 537, in _iterate_directories
if is_dir(path):
File "/Users/guido/src/cpython36/Lib/pathlib.py", line 1303, in is_dir
return S_ISDIR(self.stat().st_mode)
File "/Users/guido/src/cpython36/Lib/pathlib.py", line 1111, in stat
return self._accessor.stat(self)
File "/Users/guido/src/cpython36/Lib/pathlib.py", line 371, in wrapped
return strfunc(str(pathobj), *args)
OSError: [Errno 62] Too many levels of symbolic links: 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz' |
So would the goal of a fix be something like dev/inode memoization to prevent traversing the same link twice? To provide an argument to prevent following symlinks at all (as in stuff like os.walk), possibly defaulting to followlinks=False? It looks like languages like Ruby never follow symlinks when performing recursive globs to avoid this issue ( https://stackoverflow.com/questions/357754/can-i-traverse-symlinked-directories-in-ruby-with-a-glob ). It's probably easiest to just block recursing through symlinks (at least by default); memoizing risks unbounded memory overhead for large directory trees. |
glob.glob() just stops too deep recursion. >>> import glob
>>> glob.glob('tmp/**/*')
['tmp/baz/baz']
>>> glob.glob('tmp/**/*', recursive=True)
['tmp/baz', 'tmp/baz/baz', 'tmp/baz/baz/baz', 'tmp/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz', 'tmp/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz/baz'] |
I agree it's easiest just not to traverse symlinks matching **. glob.py |
I'm going to fix this by skipping all symlinks in _RecursiveWildcardSelector. |
New changeset 18f5b125a863 by Guido van Rossum in branch '3.4': New changeset 9826dbad1252 by Guido van Rossum in branch '3.5': New changeset 36864abbfe02 by Guido van Rossum in branch 'default': |
This seems to have broken test_pathlib, both on my computer and some Linux and BSD buildbots: ====================================================================== Traceback (most recent call last):
File "/media/disk/home/proj/python/cpython/Lib/test/test_pathlib.py", line 1455, in test_rglob_symlink_loop
self.assertEqual(given, {p / x for x in expect})
AssertionError: Items in the second set but not the first:
PosixPath('/media/disk/home/proj/python/cpython/build/test_python_27756/@test_27756_tmp/dirA/linkC')
PosixPath('/media/disk/home/proj/python/cpython/build/test_python_27756/@test_27756_tmp/dirB/fileB')
PosixPath('/media/disk/home/proj/python/cpython/build/test_python_27756/@test_27756_tmp/dirB/linkD') ====================================================================== Traceback (most recent call last):
File "/media/disk/home/proj/python/cpython/Lib/test/test_pathlib.py", line 1455, in test_rglob_symlink_loop
self.assertEqual(given, {p / x for x in expect})
AssertionError: Items in the second set but not the first:
PosixPath('/media/disk/home/proj/python/cpython/build/test_python_27756/@test_27756_tmp/dirA/linkC')
PosixPath('/media/disk/home/proj/python/cpython/build/test_python_27756/@test_27756_tmp/dirB/fileB')
PosixPath('/media/disk/home/proj/python/cpython/build/test_python_27756/@test_27756_tmp/dirB/linkD') |
Oops, sorry about that. I will see if I can figure out how to repro this -- on my own Mac it succeeds. In the mean time if you could have a look yourself (since you can repro it on your own computer) I'd be grateful. Initial hunches: maybe a case-insensitivity crept into the test? |
Looks like the root cause of the breakage is actually that issue bpo-24120 isn't quite fixed yet. I'm now able to repro on a Linux VM. |
New changeset 4043e08e6e52 by Guido van Rossum in branch '3.4': New changeset 8a3b0c1fb3d3 by Guido van Rossum in branch '3.5': New changeset 398cb8c183da by Guido van Rossum in branch 'default': |
Should be fixed for real now. |
Thanks, your latest change seems to have fixed it on my Linux computer, and most of the buildbots. However now there is a failure on <http://buildbot.python.org/all/builders/AMD64%20Windows7%20SP1%203.x\>: ====================================================================== Traceback (most recent call last):
File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_pathlib.py", line 1460, in test_rglob_common
"linkB/fileB", "dirA/linkC/fileB"])
File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_pathlib.py", line 1451, in _check
self.assertEqual(set(glob), { P(BASE, q) for q in expected })
AssertionError: Items in the second set but not the first:
WindowsPath('C:/buildbot.python.org/3.x.kloth-win64/build/build/test_python_1700/@test_1700_tmp/linkB/fileB')
WindowsPath('C:/buildbot.python.org/3.x.kloth-win64/build/build/test_python_1700/@test_1700_tmp/dirB/linkD/fileB')
WindowsPath('C:/buildbot.python.org/3.x.kloth-win64/build/build/test_python_1700/@test_1700_tmp/dirA/linkC/fileB') ====================================================================== Traceback (most recent call last):
File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_pathlib.py", line 1460, in test_rglob_common
"linkB/fileB", "dirA/linkC/fileB"])
File "C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_pathlib.py", line 1451, in _check
self.assertEqual(set(glob), { P(BASE, q) for q in expected })
AssertionError: Items in the second set but not the first:
WindowsPath('C:/buildbot.python.org/3.x.kloth-win64/build/build/test_python_1700/@test_1700_tmp/linkB/fileB')
WindowsPath('C:/buildbot.python.org/3.x.kloth-win64/build/build/test_python_1700/@test_1700_tmp/dirB/linkD/fileB')
WindowsPath('C:/buildbot.python.org/3.x.kloth-win64/build/build/test_python_1700/@test_1700_tmp/dirA/linkC/fileB') I don’t have a Windows setup so I can’t really help much with this one. |
I think I understand the Windows failure. I uncommented some tests that were previously broken due to the symlink loop and/or PermissionError, but one of these has a different expected outcome if symlinks don't work. I've pushed a hopeful fix for that (no Windows box here either, but the buildbots are good for this :-). |
Looks like that Windows buildbot is now green. Of the stable bots only OpenIndiana is red, and it's unrelated. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: