GH-127236: pathname2url(): generate RFC 1738 URL for absolute POSIX path #127194

Merged 4 commits on Nov 25, 2024
Changes from 3 commits
10 changes: 6 additions & 4 deletions Doc/library/urllib.request.rst
@@ -159,12 +159,14 @@ The :mod:`urllib.request` module defines the following functions:
       'file:///C:/Program%20Files'

    .. versionchanged:: 3.14
-      Windows drive letters are no longer converted to uppercase.
+      Paths beginning with a slash are converted to URLs with authority
+      sections. For example, the path ``/etc/hosts`` is converted to
+      the URL ``///etc/hosts``.

    .. versionchanged:: 3.14
-      On Windows, ``:`` characters not following a drive letter are quoted. In
-      previous versions, :exc:`OSError` was raised if a colon character was
-      found in any position other than the second character.
+      Windows drive letters are no longer converted to uppercase, and ``:``
+      characters not following a drive letter no longer cause an
+      :exc:`OSError` exception to be raised on Windows.


 .. function:: url2pathname(url)
20 changes: 12 additions & 8 deletions Lib/nturl2path.py
@@ -55,13 +55,17 @@ def pathname2url(p):
         p = p[4:]
         if p[:4].upper() == 'UNC/':
             p = '//' + p[4:]
-    drive, tail = ntpath.splitdrive(p)
-    if drive[1:] == ':':
-        # DOS drive specified. Add three slashes to the start, producing
-        # an authority section with a zero-length authority, and a path
-        # section starting with a single slash.
-        drive = f'///{drive}'
+    drive, root, tail = ntpath.splitroot(p)
+    if drive:
+        if drive[1:] == ':':
+            # DOS drive specified. Add three slashes to the start, producing
+            # an authority section with a zero-length authority, and a path
+            # section starting with a single slash.
+            drive = f'///{drive}'
+        drive = urllib.parse.quote(drive, safe='/:')
+    elif root:
+        # Add explicitly empty authority to path beginning with one slash.
+        root = f'//{root}'

-    drive = urllib.parse.quote(drive, safe='/:')
     tail = urllib.parse.quote(tail)
-    return drive + tail
+    return drive + root + tail
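
Reviewer note: the branch logic in this hunk can be tried out on any platform with a small standalone sketch. The function name `nt_pathname2url_sketch` is hypothetical, and the sketch deliberately swaps `ntpath.splitroot()` for `ntpath.splitdrive()` plus a manual split of the leading slash, on the assumption that this lets it run on interpreters that predate `splitroot()`. It also omits the `//?/` prefix handling that precedes this hunk, so it is an approximation of the patched behaviour, not the patched function itself.

```python
import ntpath
import urllib.parse


def nt_pathname2url_sketch(p):
    """Approximate the patched Windows branch (hypothetical helper)."""
    p = p.replace('\\', '/')
    # Stand-in for ntpath.splitroot(): split the drive, then peel off
    # a single leading slash as the root.
    drive, rest = ntpath.splitdrive(p)
    if rest[:1] == '/':
        root, tail = rest[:1], rest[1:]
    else:
        root, tail = '', rest
    if drive:
        if drive[1:] == ':':
            # DOS drive: empty authority plus a path starting with a slash.
            drive = f'///{drive}'
        drive = urllib.parse.quote(drive, safe='/:')
    elif root:
        # Rooted path without a drive: add an explicitly empty authority.
        root = f'//{root}'
    tail = urllib.parse.quote(tail)
    return drive + root + tail


for sample in ('C:\\foo:bar', 'foo:bar', '\\folder\\test\\'):
    print(repr(sample), '->', repr(nt_pathname2url_sketch(sample)))
```

The three sample inputs are taken from `test_pathname2url_win` in this PR; only the rooted, drive-less path changes its output compared with the previous code.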
10 changes: 5 additions & 5 deletions Lib/test/test_urllib.py
@@ -1434,7 +1434,7 @@ def test_pathname2url_win(self):
         self.assertEqual(fn('C:\\foo:bar'), '///C:/foo%3Abar')
         self.assertEqual(fn('foo:bar'), 'foo%3Abar')
         # No drive letter
-        self.assertEqual(fn("\\folder\\test\\"), '/folder/test/')
+        self.assertEqual(fn("\\folder\\test\\"), '///folder/test/')
         self.assertEqual(fn("\\\\folder\\test\\"), '//folder/test/')
         self.assertEqual(fn("\\\\\\folder\\test\\"), '///folder/test/')
         self.assertEqual(fn('\\\\some\\share\\'), '//some/share/')
@@ -1447,7 +1447,7 @@ def test_pathname2url_win(self):
         self.assertEqual(fn('//?/unc/server/share/dir'), '//server/share/dir')
         # Round-tripping
         urls = ['///C:',
-                '/folder/test/',
+                '///folder/test/',
                 '///C:/foo/bar/spam.foo']
         for url in urls:
             self.assertEqual(fn(urllib.request.url2pathname(url)), url)
@@ -1456,12 +1456,12 @@ def test_pathname2url_win(self):
                      'test specific to POSIX pathnames')
     def test_pathname2url_posix(self):
         fn = urllib.request.pathname2url
-        self.assertEqual(fn('/'), '/')
-        self.assertEqual(fn('/a/b.c'), '/a/b.c')
+        self.assertEqual(fn('/'), '///')
+        self.assertEqual(fn('/a/b.c'), '///a/b.c')
         self.assertEqual(fn('//a/b.c'), '////a/b.c')
         self.assertEqual(fn('///a/b.c'), '/////a/b.c')
         self.assertEqual(fn('////a/b.c'), '//////a/b.c')
-        self.assertEqual(fn('/a/b%#c'), '/a/b%25%23c')
+        self.assertEqual(fn('/a/b%#c'), '///a/b%25%23c')

     @unittest.skipUnless(os_helper.FS_NONASCII, 'need os_helper.FS_NONASCII')
     def test_pathname2url_nonascii(self):
5 changes: 2 additions & 3 deletions Lib/urllib/request.py
@@ -1667,9 +1667,8 @@ def url2pathname(pathname):
     def pathname2url(pathname):
         """OS-specific conversion from a file system path to a relative URL
         of the 'file' scheme; not recommended for general use."""
-        if pathname[:2] == '//':
-            # Add explicitly empty authority to avoid interpreting the path
-            # as authority.
+        if pathname[:1] == '/':
+            # Add explicitly empty authority to absolute path.
             pathname = '//' + pathname
         encoding = sys.getfilesystemencoding()
         errors = sys.getfilesystemencodeerrors()
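
Reviewer note: a minimal sketch of the new POSIX rule, for checking the expectations in `test_pathname2url_posix` without a patched interpreter. The name `posix_pathname2url_sketch` is hypothetical, and the sketch assumes UTF-8 quoting via the defaults of `urllib.parse.quote()` rather than the filesystem encoding and error handler that the real function looks up.

```python
import urllib.parse


def posix_pathname2url_sketch(pathname):
    """Approximate the patched POSIX branch (hypothetical helper)."""
    if pathname[:1] == '/':
        # Absolute path: prepend an explicitly empty authority.
        pathname = '//' + pathname
    # Assumption: default UTF-8 quoting instead of the filesystem encoding.
    return urllib.parse.quote(pathname)


for sample in ('/', '/a/b.c', '//a/b.c', '/a/b%#c', 'a/b.c'):
    print(repr(sample), '->', repr(posix_pathname2url_sketch(sample)))
```

Because the check changed from `pathname[:2] == '//'` to `pathname[:1] == '/'`, every absolute path now gains the two-slash prefix, while relative paths such as `a/b.c` are only quoted.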
@@ -0,0 +1,5 @@
+:func:`urllib.request.pathname2url` now adds an empty authority when
+generating a URL for a path that begins with exactly one slash. For example,
+the path ``/etc/hosts`` is converted to the scheme-less URL ``///etc/hosts``.
+As a result of this change, URLs without authorities are only generated for
+relative paths.
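
Reviewer note: to illustrate what "scheme-less URL with an empty authority" means in practice, the sketch below prefixes the example value from this NEWS entry with `file:` and parses the result. The helper `to_file_url` is hypothetical and only concatenates strings; the input is the literal `///etc/hosts` from the entry, so the snippet does not depend on which Python version is running.

```python
import urllib.parse


def to_file_url(schemeless):
    """Prefix a scheme-less URL with the 'file' scheme (hypothetical helper)."""
    return 'file:' + schemeless


url = to_file_url('///etc/hosts')
parts = urllib.parse.urlsplit(url)
# The authority (netloc) is present but empty; the path keeps one slash.
print(url, parts.scheme, repr(parts.netloc), parts.path)
```

The empty string between the second and third slash is the authority, which is the RFC 1738 `file://<host>/<path>` form with the host left out.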