Optimize joining of pathlib.PurePath() arguments. #104996

Closed
barneygale opened this issue May 26, 2023 · 1 comment
Labels
3.13 (bugs and security fixes), performance (Performance or resource usage), topic-pathlib, type-feature (A feature request or enhancement)

Comments


barneygale commented May 26, 2023

In Python 3.12, when multiple arguments are given to PurePath(), the initialiser calls os.path.join() to join them. This is relatively slow. For Python 3.13 we can make it faster by:

  1. Deferring joining of arguments until strictly needed
  2. (Maybe) re-implementing os.path.join(), as pathlib did before #95450 (gh-94909: fix joining of absolute and relative Windows paths in pathlib).
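
To illustrate the first point, here is a minimal sketch of deferred joining. `LazyPath`, `_raw_args`, `_load` and `join_calls` are hypothetical names made up for this example; this is not pathlib's actual implementation, and it uses `posixpath.join()` only to keep the example platform-independent.

```python
import posixpath


class LazyPath:
    """Toy stand-in for a path class that joins its arguments lazily."""

    def __init__(self, *args):
        # Store the arguments untouched; no joining happens here.
        self._raw_args = list(args)
        self._joined = None
        self.join_calls = 0  # instrumentation for the example only

    def _load(self):
        # Join once, the first time a joined path is actually needed.
        if self._joined is None:
            self._joined = posixpath.join(*self._raw_args) if self._raw_args else ""
            self.join_calls += 1
        return self._joined

    def joinpath(self, *others):
        # Extending the path only concatenates argument lists.
        return LazyPath(*self._raw_args, *others)

    def __str__(self):
        return self._load()


p = LazyPath("/usr", "lib").joinpath("python3")
```

In this sketch, building `p` performs no join at all; the single join happens on the first `str(p)` and the result is cached afterwards.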

Linked PRs

barneygale added the type-feature (A feature request or enhancement), performance (Performance or resource usage), topic-pathlib, and 3.13 (bugs and security fixes) labels on May 26, 2023
barneygale added a commit to barneygale/cpython that referenced this issue May 26, 2023
Joining of arguments is moved to `_load_parts`, which is called when a
normalized path is needed.

barneygale commented May 26, 2023

The reason for (maybe) re-implementing os.path.join():

Any implementation of path joining must keep a running record of the current drive, root and tail, and update them by calling splitroot() on each argument. The final drive, root and tail would be useful in pathlib to implement drive, root and _tail, but os.path.join() only makes available the joined path, which is a combination of the drive, root and tail. As a consequence, we need to call splitroot() all over again in _parse_path().
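
A rough sketch of that idea, restricted to POSIX-style paths for brevity. `split_root` and `join_keeping_parts` are hypothetical helpers written for this example (the splitter is deliberately simplified and is not `os.path.splitroot()`); the point is only that the joining loop already holds the drive, root and tail, so they can be returned instead of being thrown away.

```python
import posixpath


def split_root(part):
    """Hypothetical, simplified POSIX splitter: return (drive, root, tail)."""
    if part.startswith("/"):
        return "", "/", part.lstrip("/")
    return "", "", part


def join_keeping_parts(*parts):
    """Join like posixpath.join(), but return the running (drive, root, tail)."""
    drive, root, tail = "", "", ""
    for part in parts:
        p_drive, p_root, p_tail = split_root(part)
        if p_root:
            # An absolute argument discards everything seen so far.
            drive, root, tail = p_drive, p_root, p_tail
        elif tail and not tail.endswith("/"):
            tail = tail + "/" + p_tail
        else:
            tail = tail + p_tail
    return drive, root, tail


drive, root, tail = join_keeping_parts("/usr", "lib", "python3")
# The joined string is just the concatenation of the three pieces.
joined = drive + root + tail
```

With `os.path.join()` only `joined` would be available, and the caller would have to split it again to recover `root` and `tail`.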

barneygale added a commit to barneygale/cpython that referenced this issue Jun 7, 2023
barneygale added a commit that referenced this issue Jun 7, 2023
Joining of arguments is moved to `_load_parts`, which is called when a
normalized path is needed.
barneygale added a commit to barneygale/cpython that referenced this issue Jun 7, 2023
…thonGH-104999)

Joining of arguments is moved to `_load_parts`, which is called when a
normalized path is needed.

(cherry picked from commit ffeaec7)
barneygale added a commit to barneygale/cpython that referenced this issue Jun 7, 2023
…ts. (pythonGH-104999)

Joining of arguments is moved to `_load_parts`, which is called when a
normalized path is needed.
(cherry picked from commit ffeaec7)

Co-authored-by: Barney Gale <barney.gale@gmail.com>
barneygale added a commit that referenced this issue Jun 7, 2023
…H-104999) (GH-105483)

Joining of arguments is moved to `_load_parts`, which is called when a
normalized path is needed.

(cherry picked from commit ffeaec7)
barneygale added a commit to barneygale/cpython that referenced this issue Jun 8, 2023
Copy the `ntpath.join()` algorithm into pathlib and adjust it to remove
string concatenation. The resulting drive, root and tail are stored on the
path object without creating an intermediate joined path.
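
As an illustration of what that commit message describes, here is a simplified sketch of a Windows-style join that collects tail pieces in a list instead of concatenating strings. `split_root` and `join_parts` are hypothetical names for this example; the code follows the general drive/root rules of `ntpath.join()` but is not the code from the commit and skips several edge cases.

```python
import ntpath


def split_root(part):
    """Hypothetical helper: split a Windows-style path into (drive, root, tail)."""
    drive, rest = ntpath.splitdrive(part.replace("/", "\\"))
    if rest.startswith("\\"):
        return drive, "\\", rest.lstrip("\\")
    return drive, "", rest


def join_parts(*parts):
    """Return (drive, root, [names]) without building a joined string."""
    drive, root, tail = "", "", []
    for part in parts:
        p_drive, p_root, p_tail = split_root(part)
        if p_root:
            # Rooted argument: replaces root and tail; the drive is kept
            # unless the argument names one (or none was set yet).
            if p_drive or not drive:
                drive = p_drive
            root, tail = p_root, [p_tail]
        elif p_drive and p_drive.lower() != drive.lower():
            # Different drive: earlier arguments are ignored entirely.
            drive, root, tail = p_drive, "", [p_tail]
        else:
            if p_drive:
                drive = p_drive  # same drive, possibly different case
            tail.append(p_tail)
    # Split the collected pieces into names, dropping empty and "." entries.
    names = [n for chunk in tail for n in chunk.split("\\") if n and n != "."]
    return drive, root, names
```

For example, `join_parts("C:\\Users", "barney", "file.txt")` yields the drive `"C:"`, the root `"\\"` and the names `["Users", "barney", "file.txt"]`, which a path object could store directly.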
barneygale added a commit to barneygale/cpython that referenced this issue Jul 19, 2023
barneygale added a commit to barneygale/cpython that referenced this issue Oct 13, 2023