GH-110109: Drop use of new regex features in pathlib._abc. #113292

Closed
wants to merge 2 commits into from

Conversation

barneygale
Contributor

@barneygale barneygale commented Dec 19, 2023

A regex group with a ?+ possessive quantifier was used to match empty paths, which were represented as a single dot, preventing the dot from being matched by other wildcards. This quantifier is only available from Python 3.11+, but pathlib's _abc.py file will be made available as a PyPI package for Python 3.8+.

This commit adds a new private _pattern_str property that works like __str__() but represents empty paths as '' rather than '.'. This string is used for pattern matching, which removes the need for the possessive group. We also replace re.NOFLAG with 0, again for backwards-compatibility with older Python.
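As a rough illustration of the idea (not the code in this PR), the sketch below uses a hypothetical `TinyPath` class and a hypothetical one-character wildcard regex. It assumes only what the description states: `__str__()` renders an empty path as `'.'`, while `_pattern_str` renders it as `''`, so a wildcard can no longer match the dot by accident.

```python
import re


class TinyPath:
    """Hypothetical stand-in for a path class; not pathlib's real code."""

    def __init__(self, raw=""):
        self._raw = raw

    def __str__(self):
        # Empty paths are displayed as a single dot.
        return self._raw or "."

    @property
    def _pattern_str(self):
        # Like str(), except that an empty path stays empty.
        text = str(self)
        return "" if text == "." else text


# Hypothetical regex for a single-character wildcard.
# flags=0 is the portable spelling of re.NOFLAG, which only exists in 3.11+.
single_char = re.compile(r"[^/]\Z", flags=0)

empty = TinyPath()
print(single_char.match(str(empty)) is not None)       # the '.' is matched
print(single_char.match(empty._pattern_str) is None)   # '' is not matched
```

Matching against `_pattern_str` means the regex itself needs no special handling for the dot, which is why the possessive group can go.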

Improves compatibility with older Python; no other change of behaviour.

See external pathlib_abc issue: barneygale/pathlib-abc#8

@barneygale
Contributor Author

I think this is slower, but only very slightly:

$ ./python -m timeit -s "from pathlib import Path; p = Path('foo.py')" "p.match('*.py')"
50000 loops, best of 5: 4.75 usec per loop  # before
50000 loops, best of 5: 4.81 usec per loop  # after

$ ./python -m timeit -s "from pathlib import Path" "list(Path.cwd().glob('**/*', follow_symlinks=True))"
5 loops, best of 5: 81 msec per loop  # before
5 loops, best of 5: 81 msec per loop  # after

@barneygale
Contributor Author

@AlexWaygood pointed me towards this video, which shows how to implement atomic groups/possessive quantifiers in older Python using a capturing lookahead and a backref. It's a one-line patch so I'll apply it only in the external pathlib_abc package. Withdrawing this PR.
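For readers unfamiliar with the trick, here is a minimal sketch of the general technique, not the actual patch applied to pathlib_abc. The patterns are made up for illustration. A possessive `X?+` can be emulated as `(?=(X?))\1`: the lookahead captures what `X?` would match and, once it has succeeded, is not re-entered on backtracking, and the backreference then consumes exactly that text.

```python
import re

# Hypothetical patterns: an optional leading dot followed by one wildcard char.
# On Python 3.11+ the possessive form would be r"\.?+[^/]\Z".
GREEDY = re.compile(r"\.?[^/]\Z")
EMULATED = re.compile(r"(?=(\.?))\1[^/]\Z")  # works on older Python too

# Greedy: the optional dot backtracks to empty, so the wildcard takes the dot.
print(GREEDY.match(".") is not None)

# Emulated possessive: the dot is consumed and never given back.
print(EMULATED.match(".") is None)

# Ordinary inputs still match as expected.
print(EMULATED.match(".a") is not None)
print(EMULATED.match("a") is not None)
```

One trade-off of this approach is that it introduces an extra capturing group, which shifts the numbering of any later groups in the pattern.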

@barneygale barneygale closed this Dec 22, 2023