Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-37609: Update ntpath.splitdrive to use native functionality where possible #25261

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions Doc/whatsnew/3.10.rst
Original file line number Diff line number Diff line change
Expand Up @@ -793,6 +793,11 @@ Added :data:`~os.O_EVTONLY`, :data:`~os.O_FSYNC`, :data:`~os.O_SYMLINK`
and :data:`~os.O_NOFOLLOW_ANY` for macOS.
(Contributed by Dong-hee Na in :issue:`43106`.)

Changed :func:`os.path.splitdrive()` on Windows to rely on a native API.
This has improved support for special prefixes, but may have changed the
results for some invalid UNC paths.
(Contributed by Steve Dower in :issue:`37609`.)

pathlib
-------

Expand Down
14 changes: 7 additions & 7 deletions Lib/importlib/_bootstrap_external.py
Original file line number Diff line number Diff line change
Expand Up @@ -102,9 +102,9 @@ def _path_join(*path_parts):
root = ""
path = []
for new_root, tail in map(_os._path_splitroot, path_parts):
if new_root.startswith(path_sep_tuple) or new_root.endswith(path_sep_tuple):
root = new_root.rstrip(path_separators) or root
path = [path_sep + tail]
if new_root.startswith(path_sep_tuple) or tail.startswith(path_sep_tuple):
root = new_root or root
path = [tail]
elif new_root.endswith(':'):
if root.casefold() != new_root.casefold():
# Drive relative paths have to be resolved by the OS, so we reset the
Expand All @@ -116,8 +116,8 @@ def _path_join(*path_parts):
else:
root = new_root or root
path.append(tail)
path = [p.rstrip(path_separators) for p in path if p]
if len(path) == 1 and not path[0]:
path = [p for p in (p1.rstrip(path_separators) for p1 in path) if p]
if not path:
# Avoid losing the root's trailing separator when joining with nothing
return root + path_sep
return root + path_sep.join(path)
Comment on lines +119 to 123
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • _path_join('//server/share', 'spam') should return "//server/share/spam", not "//server/sharespam"
  • _path_join("C:/", "spam") should return "C:/spam", not "C:spam"
  • _path_join("C:", "") should return "C:", not "C:\".
  • _path_join("C:", "D:") should return "D:", not "D:\"

I don't like the change to make _path_splitroot() shift the separator into the path. It goes against what I expect given the function name, in terms of separating the root from the rest of the path. The root variable here actually has no root path, which is counterintuitive.

As it stands, the following fixes the above case by shifting the separator back into root. The UNC + relative case has to be handled specially at the end.

    if path and path[0].startswith(path_sep_tuple):
        root += path[0][0]
        path[0] = path[0][1:]
    tail = path_sep.join(c for p in path if (c := p.rstrip(path_separators)))
    # Fix up appending a relative path to a UNC root
    if (tail and root.startswith(path_sep_tuple) and
          not root.endswith(path_sep_tuple) and
          not tail.startswith(path_sep_tuple)):
        root += path_sep
    return root + tail

Expand Down Expand Up @@ -173,8 +173,8 @@ def _path_isabs(path):
"""Replacement for os.path.isabs."""
if not path:
return False
root = _os._path_splitroot(path)[0].replace('/', '\\')
return len(root) > 1 and (root.startswith('\\\\') or root.endswith('\\'))
root, tail = _os._path_splitroot(path)
return root.startswith(('\\\\', '//')) or tail.startswith(('\\', '/'))

else:
def _path_isabs(path):
Expand Down
78 changes: 52 additions & 26 deletions Lib/ntpath.py
Original file line number Diff line number Diff line change
Expand Up @@ -58,8 +58,31 @@ def normcase(s):
# volume), or if a pathname after the volume-letter-and-colon or UNC-resource
# starts with a slash or backslash.

try:
from nt import _path_splitroot
except ImportError:
_path_splitroot = _splitroot = None
else:
def _splitroot(p):
p = os.fspath(p)
r = _path_splitroot(p)
if isinstance(p, bytes):
return tuple(map(os.fsencode, r))
return r
Comment on lines +65 to +71
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe change the name to nt._path_splitdrive, since that makes more sense now that the root is shifted over to the path side. Implement the POSIX fallback as _path_splitdrive, which only returns str, just like the builtin function. Move the above bytes support code into a common implementation of ntpath.splitdrive(), which is likely the only place it will ever be needed. _path_splitdrive() will be used directly when bytes support isn't required.

I'm sure you know the os.fspath() call isn't necessary since it's handled for a path_t argument in C, so why is it called explicitly here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The isinstance at 69 needs to recognise values for p that have __fspath__ that returns bytes, as well as actual bytes, which is why we fspath once instead of doing it multiple times.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The isinstance at 69 needs to recognise values for p that have __fspath__ that returns bytes, as well as actual bytes, which is why we fspath once instead of doing it multiple times.

Sigh... sorry about that. :-$



def isabs(s):
"""Test whether a path is absolute"""
if _path_splitroot:
try:
root, tail = _path_splitroot(s)
except ValueError:
pass
else:
# UNC roots are always absolute
if root.startswith(sep + sep) or root.startswith(altsep + altsep):
return True
return tail.startswith((sep, altsep))
s = os.fspath(s)
# Paths beginning with \\?\ are always absolute, but do not
# necessarily contain a drive.
Expand Down Expand Up @@ -141,17 +164,24 @@ def splitdrive(p):

"""
p = os.fspath(p)
if _splitroot:
try:
return _splitroot(p)
except ValueError:
pass

if isinstance(p, bytes):
sep = b'\\'
altsep = b'/'
colon = b':'
else:
sep = '\\'
altsep = '/'
colon = ':'

if len(p) >= 2:
if isinstance(p, bytes):
sep = b'\\'
altsep = b'/'
colon = b':'
else:
sep = '\\'
altsep = '/'
colon = ':'
normp = p.replace(altsep, sep)
if (normp[0:2] == sep*2) and (normp[2:3] != sep):
if (normp[0:2] == sep*2):
# is a UNC path:
# vvvvvvvvvvvvvvvvvvvv drive letter or UNC path
# \\machine\mountpoint\directory\etc\...
Expand All @@ -160,10 +190,11 @@ def splitdrive(p):
if index == -1:
return p[:0], p
index2 = normp.find(sep, index + 1)
# a UNC path can't have two slashes in a row
# (after the initial two)
# a UNC path shouldn't have two slashes in a row
# (after the initial two), but to be consistent with
# the native function, we split before them.
if index2 == index + 1:
return p[:0], p
return p[:index], p[index:]
if index2 == -1:
index2 = len(p)
return p[:index2], p[index2:]
Expand Down Expand Up @@ -262,19 +293,14 @@ def lexists(path):
def ismount(path):
"""Test whether a path is a mount point (a drive root, the root of a
share, or a mounted volume)"""
path = os.fspath(path)
seps = _get_bothseps(path)
path = abspath(path)
path = os.fsdecode(path)
root, rest = splitdrive(path)
if root and root[0] in seps:
return (not rest) or (rest in seps)
if rest in seps:
return True
if root:
return rest in ("/", "\\", "") or root.startswith(("\\\\", "//"))

if _getvolumepathname:
return path.rstrip(seps) == _getvolumepathname(path).rstrip(seps)
else:
return False
return path.rstrip("/\\") == _getvolumepathname(path).rstrip("/\\")
return False


# Expand paths beginning with '~' or '~user'.
Expand Down Expand Up @@ -505,7 +531,7 @@ def _abspath_fallback(path):
"""

path = os.fspath(path)
if not isabs(path):
if not splitdrive(path)[0]:
if isinstance(path, bytes):
cwd = os.getcwdb()
else:
Expand Down Expand Up @@ -671,9 +697,9 @@ def realpath(path):
return path


# Win9x family and earlier have no Unicode filename support.
supports_unicode_filenames = (hasattr(sys, "getwindowsversion") and
sys.getwindowsversion()[3] >= 2)
# All supported platforms have Unicode filenames
supports_unicode_filenames = True


def relpath(path, start=None):
"""Return a relative version of a path"""
Expand Down
Loading