Improve pathname2url() and url2pathname() docs (#127125)
These functions have long sown confusion among Python developers. The
existing documentation says they deal with URL path components, but that
doesn't fit the evidence on Windows:

    >>> pathname2url(r'C:\foo')
    '///C:/foo'
    >>> pathname2url(r'\\server\share')
    '////server/share'  # or '//server/share' as of quite recently

If these were URL path components, they would imply complete URLs like
`file://///C:/foo` and `file://////server/share`. Clearly this isn't right.
Yet the implementation in `nturl2path` is deliberate, and the 
`url2pathname()` function correctly inverts it.

On non-Windows platforms, the behaviour until quite recently was to simply
quote/unquote the path without adding or removing any leading slashes. This
behaviour is compatible with *both* interpretations -- 1) the value is a
URL path component (existing docs), and 2) the value is everything
following `file:` (this commit).
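
A minimal sketch of why plain quoting fits both readings. It assumes the historical non-Windows behaviour amounts to calling `urllib.parse.quote()` on the path, so it models that behaviour rather than calling `pathname2url()` itself. The path `/tmp/my file.txt` is a made-up example.

```python
from urllib.parse import quote, unquote, urlsplit

# Assumption: plain quoting stands in for the historical non-Windows
# behaviour described above; this is not the pathname2url() implementation.
path = "/tmp/my file.txt"  # hypothetical example path
value = quote(path)

# Interpretation 1: the value is a URL path component (append to "file://").
as_path_component = urlsplit("file://" + value)

# Interpretation 2: the value is everything following "file:".
as_scheme_remainder = urlsplit("file:" + value)

# Both readings parse to the same path component, so the non-Windows
# behaviour alone cannot tell the two interpretations apart.
print(value)
print(as_path_component.path == as_scheme_remainder.path)
print(unquote(as_scheme_remainder.path) == path)
```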

The conclusion I draw is that these functions operate on everything after
the `file:` prefix, which may include an authority section. This is the
only explanation that fits both the Windows and non-Windows behaviour.
It's also a better match for the function names.
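
The conclusion can be checked on any platform by parsing the Windows return values quoted above with `urllib.parse.urlsplit()`. This is only an illustrative sketch: the strings are copied from the examples in this message rather than produced by calling `pathname2url()`, and the variable names are made up.

```python
from urllib.parse import urlsplit

# Return values quoted earlier in this message for Windows paths.
drive_value = "///C:/foo"      # from pathname2url(r'C:\foo')
unc_value = "//server/share"   # newer result for the \\server\share UNC path

# Reading the value as "everything after file:" gives sensible URLs.
drive = urlsplit("file:" + drive_value)
unc = urlsplit("file:" + unc_value)
print(repr(drive.netloc), drive.path)  # empty authority, path /C:/foo
print(repr(unc.netloc), unc.path)      # authority "server", path /share

# Reading the value as a path component appended to "file://" does not.
odd = urlsplit("file://" + drive_value)
print(odd.geturl(), odd.path)          # path keeps three leading slashes
```

Under the first reading the UNC server name lands in the authority section, which is what the phrase "may include an authority section" refers to.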
barneygale authored Nov 24, 2024
1 parent 97b2cea commit 307c633
Showing 1 changed file with 19 additions and 7 deletions.
26 changes: 19 additions & 7 deletions Doc/library/urllib.request.rst
@@ -148,9 +148,15 @@ The :mod:`urllib.request` module defines the following functions:

.. function:: pathname2url(path)

-Convert the pathname *path* from the local syntax for a path to the form used in
-the path component of a URL. This does not produce a complete URL. The return
-value will already be quoted using the :func:`~urllib.parse.quote` function.
+Convert the given local path to a ``file:`` URL. This function uses
+:func:`~urllib.parse.quote` function to encode the path. For historical
+reasons, the return value omits the ``file:`` scheme prefix. This example
+shows the function being used on Windows::
+
+>>> from urllib.request import pathname2url
+>>> path = 'C:\\Program Files'
+>>> 'file:' + pathname2url(path)
+'file:///C:/Program%20Files'

.. versionchanged:: 3.14
Windows drive letters are no longer converted to uppercase.
@@ -161,11 +167,17 @@ The :mod:`urllib.request` module defines the following functions:
found in any position other than the second character.


-.. function:: url2pathname(path)
+.. function:: url2pathname(url)

+Convert the given ``file:`` URL to a local path. This function uses
+:func:`~urllib.parse.unquote` to decode the URL. For historical reasons,
+the given value *must* omit the ``file:`` scheme prefix. This example shows
+the function being used on Windows::
+
-Convert the path component *path* from a percent-encoded URL to the local syntax for a
-path. This does not accept a complete URL. This function uses
-:func:`~urllib.parse.unquote` to decode *path*.
+>>> from urllib.request import url2pathname
+>>> url = 'file:///C:/Program%20Files'
+>>> url2pathname(url.removeprefix('file:'))
+'C:\\Program Files'

.. versionchanged:: 3.14
Windows drive letters are no longer converted to uppercase.