GH-73435: Add pathlib.PurePath.full_match() #114350

Merged
merged 9 commits into from
Jan 26, 2024
Changes from 2 commits
2 changes: 1 addition & 1 deletion Doc/library/glob.rst
@@ -147,7 +147,7 @@ The :mod:`glob` module defines the following functions:

    .. seealso::

-     :meth:`pathlib.PurePath.match` and :meth:`pathlib.Path.glob` methods,
+     :meth:`pathlib.PurePath.globmatch` and :meth:`pathlib.Path.glob` methods,
      which call this function to implement pattern matching and globbing.

    .. versionadded:: 3.13
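The cross-reference changed in this hunk says the pathlib methods call `glob.translate()`, and the `_compile_pattern` helper touched by this PR passes it `recursive`, `include_hidden` and `seps`. The following is a minimal sketch of that call, assuming Python 3.13 or later. `compile_glob` is a hypothetical helper name, not part of the PR, and it returns `None` on older interpreters where `glob.translate` does not exist.

```python
import glob
import re


def compile_glob(pattern, recursive=True, sep="/"):
    """Compile a glob pattern to a regex, or return None before Python 3.13."""
    # Hypothetical helper: looks the function up dynamically so the sketch
    # still imports on interpreters that lack glob.translate().
    translate = getattr(glob, "translate", None)
    if translate is None:
        return None
    regex = translate(pattern, recursive=recursive, include_hidden=True, seps=sep)
    return re.compile(regex)


rx = compile_glob("**/*.py")
if rx is not None:
    # The translated regex is anchored at the end, so match() tests the whole string.
    print(rx.match("/a/b/c.py") is not None)
    print(rx.match("a/b.txt") is not None)
```

Passing `recursive=False` is what the reworked `match()` relies on to make "``**``" behave like "``*``".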
59 changes: 29 additions & 30 deletions Doc/library/pathlib.rst
@@ -552,55 +552,54 @@ Pure paths provide the following methods and properties:
       PureWindowsPath('c:/Program Files')


-.. method:: PurePath.match(pattern, *, case_sensitive=None)
+.. method:: PurePath.globmatch(pattern, *, case_sensitive=None)

    Match this path against the provided glob-style pattern. Return ``True``
-   if matching is successful, ``False`` otherwise.
-
-   If *pattern* is relative, the path can be either relative or absolute,
-   and matching is done from the right::
+   if matching is successful, ``False`` otherwise. For example::

-      >>> PurePath('a/b.py').match('*.py')
-      True
-      >>> PurePath('/a/b/c.py').match('b/*.py')
+      >>> PurePath('a/b.py').globmatch('a/*.py')
       True
-      >>> PurePath('/a/b/c.py').match('a/*.py')
+      >>> PurePath('a/b.py').globmatch('*.py')
       False
+      >>> PurePath('/a/b/c.py').globmatch('/a/**')
+      True
+      >>> PurePath('/a/b/c.py').globmatch('**/*.py')
+      True

-   If *pattern* is absolute, the path must be absolute, and the whole path
-   must match::
+   As with other methods, case-sensitivity follows platform defaults::

-      >>> PurePath('/a.py').match('/*.py')
-      True
-      >>> PurePath('a/b.py').match('/*.py')
+      >>> PurePosixPath('b.py').globmatch('*.PY')
       False
+      >>> PureWindowsPath('b.py').globmatch('*.PY')
+      True

-   The *pattern* may be another path object; this speeds up matching the same
-   pattern against multiple files::
+   Set *case_sensitive* to ``True`` or ``False`` to override this behaviour.

-      >>> pattern = PurePath('*.py')
-      >>> PurePath('a/b.py').match(pattern)
-      True
+   .. versionadded:: 3.13

-   .. versionchanged:: 3.12
-      Accepts an object implementing the :class:`os.PathLike` interface.

-   As with other methods, case-sensitivity follows platform defaults::
+.. method:: PurePath.match(pattern, *, case_sensitive=None)

-      >>> PurePosixPath('b.py').match('*.PY')
-      False
-      >>> PureWindowsPath('b.py').match('*.PY')
+   Match this path against the provided non-recursive glob-style pattern.
+   Return ``True`` if matching is successful, ``False`` otherwise.
+
+   This method is similar to :meth:`~PurePath.globmatch`, but the recursive
+   wildcard "``**``" is not supported (it acts like non-recursive "``*``"),
+   and if a relative pattern is given, then matching is done from the right::
+
+      >>> PurePath('a/b.py').match('*.py')
       True
+      >>> PurePath('/a/b/c.py').match('b/*.py')
+      True
+      >>> PurePath('/a/b/c.py').match('a/*.py')
+      False

-   Set *case_sensitive* to ``True`` or ``False`` to override this behaviour.
+   .. versionchanged:: 3.12
+      The *pattern* parameter accepts a :term:`path-like object`.

    .. versionchanged:: 3.12
       The *case_sensitive* parameter was added.

-   .. versionchanged:: 3.13
-      Support for the recursive wildcard "``**``" was added. In previous
-      versions, it acted like the non-recursive wildcard "``*``".
-

 .. method:: PurePath.relative_to(other, walk_up=False)

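The documented behaviour of the new method (named `globmatch` at this point in the PR, `full_match` in the PR title) can be illustrated without the new API. The following is a minimal sketch under stated assumptions, not the implementation in this PR, which delegates to `glob.translate()`. `translate_whole` and `match_whole` are hypothetical names, only `/`-separated strings are handled, and only `*`, `?` and whole-segment `**` are supported (character classes such as `[abc]` are treated literally).

```python
import re


def translate_whole(pattern):
    """Translate a '/'-separated glob into a regex for the whole path.

    Hypothetical helper: supports '*', '?' and whole-segment '**' only.
    """
    segments = pattern.split("/")
    out = []
    for index, segment in enumerate(segments):
        is_last = index == len(segments) - 1
        if segment == "**":
            # A trailing '**' matches everything that remains; elsewhere it
            # matches zero or more whole segments followed by a separator.
            out.append(".*" if is_last else "(?:.+/)?")
            continue
        if "**" in segment:
            raise ValueError("'**' must be an entire path segment")
        # '*' and '?' never cross a separator; everything else is literal.
        body = "".join(
            "[^/]*" if ch == "*" else "[^/]" if ch == "?" else re.escape(ch)
            for ch in segment
        )
        out.append(body if is_last else body + "/")
    return re.compile("".join(out), re.DOTALL)


def match_whole(path, pattern):
    """Return True if the entire path string matches the pattern."""
    return translate_whole(pattern).fullmatch(path) is not None


# The same four cases as the documentation example in the hunk above.
print(match_whole("a/b.py", "a/*.py"))
print(match_whole("a/b.py", "*.py"))
print(match_whole("/a/b/c.py", "/a/**"))
print(match_whole("/a/b/c.py", "**/*.py"))
```

Unlike `match()`, nothing here is anchored to the right-hand end of the path: a relative pattern such as `*.py` has to account for every segment, which is why it rejects `a/b.py`.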
3 changes: 2 additions & 1 deletion Doc/whatsnew/3.13.rst
@@ -336,7 +336,8 @@ pathlib
   object from a 'file' URI (``file:/``).
   (Contributed by Barney Gale in :gh:`107465`.)

-* Add support for recursive wildcards in :meth:`pathlib.PurePath.match`.
+* Add :meth:`pathlib.PurePath.globmatch` for matching paths with
+  shell-style wildcards, including the recursive wildcard "``**``".
   (Contributed by Barney Gale in :gh:`73435`.)

 * Add *follow_symlinks* keyword-only argument to :meth:`pathlib.Path.glob`,
32 changes: 24 additions & 8 deletions Lib/pathlib/_abc.py
@@ -48,15 +48,15 @@ def _is_case_sensitive(pathmod):


 @functools.lru_cache(maxsize=256)
-def _compile_pattern(pat, sep, case_sensitive):
+def _compile_pattern(pat, sep, case_sensitive, recursive=True):
     """Compile given glob pattern to a re.Pattern object (observing case
     sensitivity)."""
     global re, glob
     if re is None:
         import re, glob

     flags = re.NOFLAG if case_sensitive else re.IGNORECASE
-    regex = glob.translate(pat, recursive=True, include_hidden=True, seps=sep)
+    regex = glob.translate(pat, recursive=recursive, include_hidden=True, seps=sep)
     # The string representation of an empty path is a single dot ('.'). Empty
     # paths shouldn't match wildcards, so we consume it with an atomic group.
     regex = r'(\.\Z)?+' + regex
@@ -450,13 +450,29 @@ def match(self, path_pattern, *, case_sensitive=None):
         if case_sensitive is None:
             case_sensitive = _is_case_sensitive(self.pathmod)
         sep = path_pattern.pathmod.sep
-        if path_pattern.anchor:
-            pattern_str = str(path_pattern)
-        elif path_pattern.parts:
-            pattern_str = str('**' / path_pattern)
-        else:
+        our_parts = self.parts[::-1]
+        pat_parts = path_pattern.parts[::-1]
+        if not pat_parts:
             raise ValueError("empty pattern")
-        match = _compile_pattern(pattern_str, sep, case_sensitive)
+        if len(our_parts) < len(pat_parts):
+            return False
+        if len(our_parts) > len(pat_parts) and path_pattern.anchor:
+            return False
+        for our_part, pat_part in zip(our_parts, pat_parts):
+            match = _compile_pattern(pat_part, sep, case_sensitive, recursive=False)
+            if match(our_part) is None:
+                return False
+        return True
+
+    def globmatch(self, pattern, *, case_sensitive=None):
+        """
+        Return True if this path matches the given glob-style pattern.
+        """
+        if not isinstance(pattern, PurePathBase):
+            pattern = self.with_segments(pattern)
+        if case_sensitive is None:
+            case_sensitive = _is_case_sensitive(self.pathmod)
+        match = _compile_pattern(str(pattern), pattern.pathmod.sep, case_sensitive)
         return match(str(self)) is not None


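The reworked `match()` compares reversed path parts against reversed pattern parts, one segment at a time. The following is a minimal standalone sketch of that loop, assuming POSIX-style paths and case-sensitive matching. `match_from_right` is a hypothetical name, and `fnmatch.fnmatchcase` stands in for the PR's `_compile_pattern(..., recursive=False)`, so this approximates the logic and is not the code under review.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def match_from_right(path, pattern):
    """Non-recursive, right-anchored matching in the style of PurePath.match()."""
    pat = PurePosixPath(pattern)
    our_parts = PurePosixPath(path).parts[::-1]
    pat_parts = pat.parts[::-1]
    if not pat_parts:
        raise ValueError("empty pattern")
    # The path needs at least as many parts as the pattern.
    if len(our_parts) < len(pat_parts):
        return False
    # An anchored (absolute) pattern has to cover the whole path.
    if len(our_parts) > len(pat_parts) and pat.anchor:
        return False
    # Compare segment by segment, starting from the final component.
    # Within a single segment '**' behaves like '*'.
    return all(
        fnmatchcase(our_part, pat_part)
        for our_part, pat_part in zip(our_parts, pat_parts)
    )


print(match_from_right("/a/b/c.py", "b/*.py"))
print(match_from_right("/a/b/c.py", "a/*.py"))
print(match_from_right("/a/b/c.py", "/**/*.py"))
```

Because every "``**``" consumes exactly one segment here, `/**/*.py` cannot match `/a/b/c.py`, which is the behaviour the updated `test_match_common` asserts.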
91 changes: 68 additions & 23 deletions Lib/test/test_pathlib/test_pathlib_abc.py
@@ -249,39 +249,84 @@ def test_match_common(self):
         self.assertFalse(P('/ab.py').match('/a/*.py'))
         self.assertFalse(P('/a/b/c.py').match('/a/*.py'))
         # Multi-part glob-style pattern.
-        self.assertTrue(P('a').match('**'))
-        self.assertTrue(P('c.py').match('**'))
-        self.assertTrue(P('a/b/c.py').match('**'))
-        self.assertTrue(P('/a/b/c.py').match('**'))
-        self.assertTrue(P('/a/b/c.py').match('/**'))
-        self.assertTrue(P('/a/b/c.py').match('/a/**'))
-        self.assertTrue(P('/a/b/c.py').match('**/*.py'))
-        self.assertTrue(P('/a/b/c.py').match('/**/*.py'))
+        self.assertFalse(P('/a/b/c.py').match('/**/*.py'))
         self.assertTrue(P('/a/b/c.py').match('/a/**/*.py'))
-        self.assertTrue(P('/a/b/c.py').match('/a/b/**/*.py'))
-        self.assertTrue(P('/a/b/c.py').match('/**/**/**/**/*.py'))
-        self.assertFalse(P('c.py').match('**/a.py'))
-        self.assertFalse(P('c.py').match('c/**'))
-        self.assertFalse(P('a/b/c.py').match('**/a'))
-        self.assertFalse(P('a/b/c.py').match('**/a/b'))
-        self.assertFalse(P('a/b/c.py').match('**/a/b/c'))
-        self.assertFalse(P('a/b/c.py').match('**/a/b/c.'))
-        self.assertFalse(P('a/b/c.py').match('**/a/b/c./**'))
-        self.assertFalse(P('a/b/c.py').match('**/a/b/c./**'))
-        self.assertFalse(P('a/b/c.py').match('/a/b/c.py/**'))
-        self.assertFalse(P('a/b/c.py').match('/**/a/b/c.py'))
-        self.assertRaises(ValueError, P('a').match, '**a/b/c')
-        self.assertRaises(ValueError, P('a').match, 'a/b/c**')
         # Case-sensitive flag
         self.assertFalse(P('A.py').match('a.PY', case_sensitive=True))
         self.assertTrue(P('A.py').match('a.PY', case_sensitive=False))
         self.assertFalse(P('c:/a/B.Py').match('C:/A/*.pY', case_sensitive=True))
         self.assertTrue(P('/a/b/c.py').match('/A/*/*.Py', case_sensitive=False))
         # Matching against empty path
         self.assertFalse(P('').match('*'))
-        self.assertTrue(P('').match('**'))
+        self.assertFalse(P('').match('**'))
         self.assertFalse(P('').match('**/*'))
+
+    def test_globmatch_common(self):
+        P = self.cls
+        # Simple relative pattern.
+        self.assertTrue(P('b.py').globmatch('b.py'))
+        self.assertFalse(P('a/b.py').globmatch('b.py'))
+        self.assertFalse(P('/a/b.py').globmatch('b.py'))
+        self.assertFalse(P('a.py').globmatch('b.py'))
+        self.assertFalse(P('b/py').globmatch('b.py'))
+        self.assertFalse(P('/a.py').globmatch('b.py'))
+        self.assertFalse(P('b.py/c').globmatch('b.py'))
+        # Wildcard relative pattern.
+        self.assertTrue(P('b.py').globmatch('*.py'))
+        self.assertFalse(P('a/b.py').globmatch('*.py'))
+        self.assertFalse(P('/a/b.py').globmatch('*.py'))
+        self.assertFalse(P('b.pyc').globmatch('*.py'))
+        self.assertFalse(P('b./py').globmatch('*.py'))
+        self.assertFalse(P('b.py/c').globmatch('*.py'))
+        # Multi-part relative pattern.
+        self.assertTrue(P('ab/c.py').globmatch('a*/*.py'))
+        self.assertFalse(P('/d/ab/c.py').globmatch('a*/*.py'))
+        self.assertFalse(P('a.py').globmatch('a*/*.py'))
+        self.assertFalse(P('/dab/c.py').globmatch('a*/*.py'))
+        self.assertFalse(P('ab/c.py/d').globmatch('a*/*.py'))
+        # Absolute pattern.
+        self.assertTrue(P('/b.py').globmatch('/*.py'))
+        self.assertFalse(P('b.py').globmatch('/*.py'))
+        self.assertFalse(P('a/b.py').globmatch('/*.py'))
+        self.assertFalse(P('/a/b.py').globmatch('/*.py'))
+        # Multi-part absolute pattern.
+        self.assertTrue(P('/a/b.py').globmatch('/a/*.py'))
+        self.assertFalse(P('/ab.py').globmatch('/a/*.py'))
+        self.assertFalse(P('/a/b/c.py').globmatch('/a/*.py'))
+        # Multi-part glob-style pattern.
+        self.assertTrue(P('a').globmatch('**'))
+        self.assertTrue(P('c.py').globmatch('**'))
+        self.assertTrue(P('a/b/c.py').globmatch('**'))
+        self.assertTrue(P('/a/b/c.py').globmatch('**'))
+        self.assertTrue(P('/a/b/c.py').globmatch('/**'))
+        self.assertTrue(P('/a/b/c.py').globmatch('/a/**'))
+        self.assertTrue(P('/a/b/c.py').globmatch('**/*.py'))
+        self.assertTrue(P('/a/b/c.py').globmatch('/**/*.py'))
+        self.assertTrue(P('/a/b/c.py').globmatch('/a/**/*.py'))
+        self.assertTrue(P('/a/b/c.py').globmatch('/a/b/**/*.py'))
+        self.assertTrue(P('/a/b/c.py').globmatch('/**/**/**/**/*.py'))
+        self.assertFalse(P('c.py').globmatch('**/a.py'))
+        self.assertFalse(P('c.py').globmatch('c/**'))
+        self.assertFalse(P('a/b/c.py').globmatch('**/a'))
+        self.assertFalse(P('a/b/c.py').globmatch('**/a/b'))
+        self.assertFalse(P('a/b/c.py').globmatch('**/a/b/c'))
+        self.assertFalse(P('a/b/c.py').globmatch('**/a/b/c.'))
+        self.assertFalse(P('a/b/c.py').globmatch('**/a/b/c./**'))
+        self.assertFalse(P('a/b/c.py').globmatch('**/a/b/c./**'))
+        self.assertFalse(P('a/b/c.py').globmatch('/a/b/c.py/**'))
+        self.assertFalse(P('a/b/c.py').globmatch('/**/a/b/c.py'))
+        self.assertRaises(ValueError, P('a').globmatch, '**a/b/c')
+        self.assertRaises(ValueError, P('a').globmatch, 'a/b/c**')
+        # Case-sensitive flag
+        self.assertFalse(P('A.py').globmatch('a.PY', case_sensitive=True))
+        self.assertTrue(P('A.py').globmatch('a.PY', case_sensitive=False))
+        self.assertFalse(P('c:/a/B.Py').globmatch('C:/A/*.pY', case_sensitive=True))
+        self.assertTrue(P('/a/b/c.py').globmatch('/A/*/*.Py', case_sensitive=False))
+        # Matching against empty path
+        self.assertFalse(P('').globmatch('*'))
+        self.assertTrue(P('').globmatch('**'))
+        self.assertFalse(P('').globmatch('**/*'))

     def test_parts_common(self):
         # `parts` returns a tuple.
         sep = self.sep