Document various options for getting the absolute path from pathlib.Path objects #83271

Closed
brettcannon opened this issue Dec 18, 2019 · 16 comments
Labels: 3.9 (only security fixes), docs (Documentation in the Doc dir), topic-pathlib, type-feature (A feature request or enhancement)

Comments

@brettcannon
Member

BPO 39090
Nosy @brettcannon, @pfmoore, @eryksun, @vedgar, @PythonCHB, @designerzim, @florisla, @barneygale, @John-Hennig

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2019-12-18.18:20:50.641>
labels = ['type-feature', '3.9', 'docs']
title = 'Document various options for getting the absolute path from pathlib.Path objects'
updated_at = <Date 2022-04-04.13:40:35.399>
user = 'https://github.com/brettcannon'

bugs.python.org fields:

activity = <Date 2022-04-04.13:40:35.399>
actor = 'barneygale'
assignee = 'docs@python'
closed = False
closed_date = None
closer = None
components = ['Documentation']
creation = <Date 2019-12-18.18:20:50.641>
creator = 'brett.cannon'
dependencies = []
files = []
hgrepos = []
issue_num = 39090
keywords = []
message_count = 15.0
messages = ['358638', '358854', '361698', '361877', '361878', '361885', '362030', '362578', '362579', '362627', '387634', '416465', '416470', '416488', '416668']
nosy_count = 11.0
nosy_names = ['brett.cannon', 'paul.moore', 'docs@python', 'eryksun', 'veky', 'ChrisBarker', 'Zim', '4-launchpad-kalvdans-no-ip-org', 'florisla', 'barneygale', 'John-Hennig']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'needs patch'
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue39090'
versions = ['Python 3.9']

@brettcannon
Member Author

The question on how best to get an absolute path from a pathlib.Path object keeps coming up (see https://bugs.python.org/issue29688, https://discuss.python.org/t/add-absolute-name-to-pathlib-path/2882/, and https://discuss.python.org/t/pathlib-absolute-vs-resolve/2573 as examples).

As pointed out across those posts, getting the absolute path is surprisingly subtle and varied depending on your needs. As such we should probably add a section somewhere in the pathlib docs explaining the various ways and why you would choose one over the other.
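
As a rough illustration of why the choice is subtle, the sketch below applies three standard-library options to the same relative path. The path name is hypothetical and need not exist; the comments describe the behaviour on a current Python 3 release, and Windows on 3.9 and earlier may differ, as discussed later in this thread.

```python
import os
from pathlib import Path

# Hypothetical relative path with a ".." segment; it does not need to exist.
relative = Path("some_dir") / ".." / "example.txt"

# 1. Path.absolute(): prepend the current working directory, keep ".." as is.
via_absolute = relative.absolute()

# 2. Path.resolve(): make the path absolute, follow symlinks, collapse "..".
via_resolve = relative.resolve()

# 3. os.path.abspath(): make the path absolute and collapse ".." lexically,
#    without consulting the filesystem for symlinks.
via_abspath = Path(os.path.abspath(relative))

print(via_absolute)
print(via_resolve)
print(via_abspath)
```

All three results are absolute, but only the first keeps the ".." segment, and only the second looks at symlinks.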

@brettcannon brettcannon added docs Documentation in the Doc dir type-feature A feature request or enhancement labels Dec 18, 2019
@PythonCHB
Mannequin

PythonCHB mannequin commented Dec 24, 2019

Yes Please!

I'd offer to help, but I really don't get the intricacies involved. I will offer to proofread and copy-edit though, if that's helpful.

And I note that coincidentally, just in the last week, I needed to make an absolute path from a Path, and it took me far too long to figure out that .resolve() would do it for me. Then I needed to do it again three days later, and it again took a while -- "resolve" is simply not mnemonic for me, and I'm guessing a lot of people have the same issue.

And I didn't find .absolute(), cause it's not documented. I see in issue bpo-29688 that there are reasons for that, but I'll make a plea:

Please document .absolute(), even if those docs say something like "may not work in all circumstances, not well tested". Alternatively, if it's decided that folks should just use .resolve() in all cases anyway, then make .absolute() an alias for .resolve().

Or if that's not a good option, then at least put some prominent notes in resolve() so people will find it.

Also -- I needed to read the resolve() docs carefully (and then test) to see if it was what I wanted - which I know, is what this issue is about.

In short -- I understand that this is a complex issue, but making an absolute path is a pretty common use case, and we've had os.path.abspath() for decades, so there should be one obvious way to do it, and it should be easily discoverable.

NOTE: even if there is no one to do the work of properly testing .absolute() at this point, it would be nice to at least decide now what the long term goal is -- will there be an absolute() or is resolve() all we really need?

@pfmoore
Member

pfmoore commented Feb 10, 2020

In short -- I understand that this is a complex issue, but making an absolute path is a pretty common use case, and we've had os.path.abspath() for decades, so there should be one obvious way to do it, and it should be easily discoverable.

+1 on this.

Given that (as far as I can tell from the various discussions) resolve works fine as long as the file exists, maybe the key distinction to make is whether you have an existing file or not.
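
One way to make that distinction explicit is sketched below. It relies on the documented `strict` parameter of `Path.resolve()`; the file name and the fallback to `absolute()` are assumptions for illustration, not something proposed in this thread.

```python
from pathlib import Path

# Hypothetical file name that is expected not to exist.
candidate = Path("definitely-missing-file.txt")

try:
    # strict=True raises FileNotFoundError when the path does not exist.
    resolved = candidate.resolve(strict=True)
except FileNotFoundError:
    # Fall back to anchoring the path at the current working directory.
    resolved = candidate.absolute()

print(resolved)
```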

(More subtle questions like UNC path vs drive letter, mentioned on the Discourse thread, are probably things that we can defer to a "more advanced cases" discussion in the docs).

@florisla
Mannequin

florisla mannequin commented Feb 12, 2020

I've written an "Absolute paths" section based on the knowledge I found in the various threads.

Any review is appreciated.

https://github.com/florisla/cpython/tree/pathlib-chapter-absolute-paths

With some related documentation changes:

https://github.com/florisla/cpython/tree/absolute-path-related-improvements

@pfmoore
Member

pfmoore commented Feb 12, 2020

You've provided links to your branches, but not to the specific text you're proposing to add. Can you link to a diff or something that shows what you've added more precisely?

@florisla
Mannequin

florisla mannequin commented Feb 12, 2020

This is the new chapter:

florisla@c146ad3

@vedgar
Mannequin

vedgar mannequin commented Feb 15, 2020

If we want something mnemonic, I'm sure nothing beats __abs__. (Hey, we have __truediv__ already!;)

@florisla
Mannequin

florisla mannequin commented Feb 24, 2020

@chrisbarker,

Could you review the proposed addition to the documentation?

florisla@c146ad3

@florisla
Mannequin

florisla mannequin commented Feb 24, 2020

(sorry, didn't see the GitHub comments before... I'll process those first.)

@florisla
Mannequin

florisla mannequin commented Feb 25, 2020

Based on the feedback received in GitHub here:
florisla@c146ad3

I made a new revision of the 'Absolute paths' chapter here:
https://github.com/florisla/cpython/blob/pathlib-chapter-absolute-paths-2/Doc/library/pathlib.rst#absolute-paths

Further feedback is welcome.

Changes:

  • Be more 'in your face' about Path.resolve() being the recommended
    approach.
  • Add separate section on Windows considerations
  • Explain difference between Path.resolve() and os.path.isabs() w.r.t.
    checking for drive.
  • Refer to 'mapped share' instead of 'mapped network share'.
  • Explain replacement of substitute drive with final path.
  • Mention os.path.abspath's upcasing of drive letter in case of
    a path missing a root.
  • Mention different handling of junctions versus symlinks w.r.t.
    relative parts.

For brevity, I've kept the wording on substitute drive and handling of
junctions very short.

For the same reason I did not include eryksun's (interesting!) info
on why mapped and substitute drives are non-canonical.

Not mentioning Path.resolve()'s behavior w.r.t. non-existing files since
that's documented in resolve() itself.

@John-Hennig
Mannequin

John-Hennig mannequin commented Feb 24, 2021

@Floris:

Not mentioning Path.resolve()'s behavior w.r.t. non-existing files since
that's documented in resolve() itself.

I don't see it mentioned in the documentation of resolve(), or anywhere else in the docs, that on Windows (but not on other platforms) resolve() does not resolve a relative path to an absolute path if the file does not exist. As opposed to absolute(), which works as expected on any platform.

Linux:

Python 3.6.9 (default, Oct  8 2020, 12:12:24)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pathlib import Path
>>> file = Path('new.txt')
>>> file.exists()
False
>>> file.resolve()
PosixPath('/home/user/new.txt')

Windows:

Python 3.9.2 (tags/v3.9.2:1a79785, Feb 19 2021, 13:44:55) [MSC v.1928 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from pathlib import Path
>>> file = Path('new.txt')
>>> file.exists()
False
>>> file.resolve()
WindowsPath('new.txt')
>>> file.absolute()
WindowsPath('d:/home/new.txt')
>>> file.touch()
>>> file.resolve()
WindowsPath('D:/home/new.txt')
>>> file.unlink()
>>> file.resolve()
WindowsPath('new.txt')
>>>
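
A possible workaround for the Windows behaviour shown above is to anchor the path at the current working directory before resolving it. This is only a sketch: `resolve_portably` is a hypothetical helper name, and the assumption that it sidesteps the issue on Windows with Python 3.9 and earlier has not been verified here.

```python
from pathlib import Path


def resolve_portably(path: Path) -> Path:
    """Hypothetical helper: make the path absolute first, then resolve it."""
    # absolute() only prepends the current working directory, so resolve()
    # always receives a path that starts from an existing directory.
    return path.absolute().resolve()


missing = Path("new.txt")  # same file name as in the sessions above
result = resolve_portably(missing)
print(result, result.is_absolute())
```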

@designerzim
Mannequin

designerzim mannequin commented Mar 31, 2022

First, I hope we all agree:
'C:\Windows' and '/usr/bin' == absolute path
'Windows' and 'bin' == relative path
'C:\Program Files' and '/bin' == absolute path
'C:\Windows\..\Program Files' and '/usr/../bin' == relative path

It is very confusing between these two, but despite claims otherwise, absolute() does not work as expected. However, to sum up the findings below:

  • absolute() fails to resolve paths with relative steps (esp "..") but will always add the dir structure even if the file doesn't exist.
  • resolve() will always give an absolute path.
    ** unless the file doesn't exist --
    ** unless unless the path includes a '..'!
  • There's also a related problem with is_absolute() being incorrect with relative paths (such as '..'), which usually results in it saying absolute().is_absolute() is True when it is obviously False.

Done on Windows 10, python 3.9.5

>>> ini
WindowsPath('desktop.ini/..')
>>> ini.resolve().is_absolute()
True
>>> ini.absolute()
WindowsPath('C:/Users/zim/Downloads/desktop.ini/..')
>>> ini.absolute().is_absolute()
True

This second should not be True, there is a trailing '..' not resolved by absolute()

Now let's create a truly messy path:
>>> ini.resolve()
WindowsPath('C:/Users/zim/Downloads')
>>> ini = ini / "ntuser.ini"
>>> ini.exists()
False
>>> ini.resolve()
WindowsPath('C:/Users/zim/Downloads/ntuser.ini')
>>> ini = ini / "../ntuser.ini"
>>> ini.exists()
False
>>> ini.resolve()
WindowsPath('C:/Users/zim/Downloads/ntuser.ini')
>>> ini = ini / "../../ntuser.ini"
>>> ini.resolve()
WindowsPath('C:/Users/zim/ntuser.ini')
>>> ini.exists()
True
>>> ini.absolute()
WindowsPath('C:/Users/zim/Downloads/desktop.ini/../ntuser.ini/../ntuser.ini/../../ntuser.ini')
>>> ini.absolute().is_absolute()
True

absolute() not only doesn't give an absolute path, but is_absolute() is somehow ok with that.

Now a file that doesn't exist:
>>> mike = Path("palin.jpg")
>>> mike.resolve()
WindowsPath('palin.jpg')
>>> mike.resolve().is_absolute()
False
>>> mike.absolute()
WindowsPath('C:/Users/zim/Downloads/palin.jpg')
>>> mike.absolute().is_absolute()
True

Finally, absolute() is right about the right thing, but resolve() is not terribly wrong. is_absolute() is correctly False here (for once).

The problem is that after a resolve() call, a Path object can still be used to create a file (good), but if resolve() is used before file creation, then the full path will not be there as should be expected (bad). This seems like a bug with resolve().

What if a file is non existent AND relative? Things get more confusing.

>>> badrel = Path('../circus.jpg')
>>> badrel
WindowsPath('../circus.jpg')
>>> badrel.absolute()
WindowsPath('C:/Users/zim/Downloads/../circus.jpg')
>>> badrel.resolve()
WindowsPath('C:/Users/zim/circus.jpg')
>>> badrel.exists()
False

So, absolute() still acts like the normal trash fire it is with relative paths, but what's this, resolve() actually gives an absolute path?!

I should note resolve() only behaves unpredictably on Windows. It correctly resolves non-existent files no matter what on macOS and Linux (caveat: my linux test was done with python 3.6). However, absolute() always fails to distill paths with relative steps regardless of OS.

So, it seems clear:
Bug 1: On Windows, resolve() should handle non-existent files with incomplete paths the same way it does on *nix platforms, and the same way it already handles existing files and non-existent ones with parent path notation on Windows.
Bug 2: Obviously if absolute() is supposed to be in the lib, it should be documented, and it likely should be distinct from resolve(), but most of all: it should return actual absolute paths! If these cannot be fulfilled, it should be set to be deprecated (after resolve() is fixed, hopefully)
Bug 3: is_absolute() should actually detect absolute paths; instead, it seems to report True if the path contains a root starting point, but ignores relative changes in between. (This issue exists on all three major OSs.)

@designerzim designerzim mannequin added 3.9 only security fixes labels Mar 31, 2022
@vedgar
Mannequin

vedgar mannequin commented Apr 1, 2022

First, I hope we all agree:
'C:\Windows\..\Program Files' and '/usr/../bin' == relative path

I don't agree. To me, absolute means regardless of a reference point. So, absolute path would be a path that refers to the same entity from whichever directory you reference it. And that is surely the case for these two.

@eryksun
Contributor

eryksun commented Apr 1, 2022

Now a file that doesn't exist:
>>> mike = Path("palin.jpg")
>>> mike.resolve()
WindowsPath('palin.jpg')

This is a bug in resolve(). It was fixed in 3.10+ by switching to ntpath.realpath(). I don't remember why a fix for 3.9 was never applied. Work on the PR may have stalled due to a minor disagreement.

'C:\Windows\..\Program Files' and '/usr/../bin' == relative path

No, a relative path depends on either the current working directory or, for a symlink target, the path of the directory that contains the symlink.

In Windows, a rooted path such as r"\spam" is a relative path because it depends on the drive of the current working directory. For example, if the current working directory is r"Z:\eggs", then r"\spam" resolves to r"Z:\spam". Also, a drive-relative path such as "Z:spam" depends on the working directory of the given drive. Windows supports a separate working directory for each drive. For example, if the working directory of drive "Z:" is r"Z:\eggs", then "Z:spam" resolves to r"Z:\eggs\spam".
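
The two Windows cases can be inspected on any platform with `PureWindowsPath`, which parses paths without touching the filesystem. The snippet below is a small sketch using the example paths from this comment.

```python
from pathlib import PureWindowsPath

rooted = PureWindowsPath(r"\spam")          # has a root but no drive
drive_relative = PureWindowsPath("Z:spam")  # has a drive but no root
full = PureWindowsPath(r"Z:\eggs\spam")     # has both a drive and a root

for p in (rooted, drive_relative, full):
    # Only a path with both a drive and a root is reported as absolute.
    print(repr(p.drive), repr(p.root), p.is_absolute())
```

Only the last path is reported as absolute; the first two still depend on the current drive or on the per-drive working directory.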

@barneygale
Mannequin

barneygale mannequin commented Apr 4, 2022

The docs for PurePath.is_absolute() say:

A path is considered absolute if it has both a root and (if the flavour allows) a drive

This does not preclude it from having ".." segments.
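
A quick check with pure paths, using the two example paths disputed earlier in the thread, illustrates this: both have a root (and a drive where applicable), so both count as absolute even though the ".." segments are kept.

```python
from pathlib import PurePosixPath, PureWindowsPath

posix = PurePosixPath("/usr/../bin")
windows = PureWindowsPath(r"C:\Windows\..\Program Files")

# is_absolute() only looks at the root (and the drive on Windows).
print(posix.is_absolute(), posix.parts)
print(windows.is_absolute(), windows.parts)
```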

Path.absolute() is documented as of bpo-29688 / 3.11, see: https://docs.python.org/3.11/library/pathlib.html#pathlib.Path.absolute

The documentation for the absolute() method is deliberately placed alongside resolve() for ease of comparison. Both methods make a path absolute, but resolve() also follows symlinks, and consequently is able to safely elide ".." segments.
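
The sketch below illustrates why eliding ".." is only safe after following symlinks. It builds a throwaway directory tree (the names `real`, `inner`, and `link` are made up for the example) and compares a purely lexical cleanup via `os.path.normpath()` with `Path.resolve()`. It assumes a platform where the current user may create symlinks, which on Windows can require extra privileges.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()
    (base / "real" / "inner").mkdir(parents=True)
    # base/link is a symlink pointing at base/real/inner.
    (base / "link").symlink_to(base / "real" / "inner", target_is_directory=True)

    path = base / "link" / ".."

    # Lexical cleanup drops "link/.." and lands in base.
    lexical = Path(os.path.normpath(path))
    # resolve() follows the symlink first, so ".." means the parent of inner.
    resolved = path.resolve()

    print(lexical == base)
    print(resolved == base / "real")
```

The two results name different directories, which is why `absolute()` leaves ".." segments alone.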

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
olsen232 added a commit to koordinates/kart that referenced this issue Nov 11, 2022
on python3.9 - it should make the path absolute, but it doesn't.
python/cpython#83271
@barneygale
Contributor

Completed in #26153
