string: Use 're.Pattern.search' instead of 're.Pattern.match'
Rust's 'Regex.is_match' behaves like Python's 're.Pattern.search'.
Here's a snippet from the docs of the former:

 Returns true if and only if there is a match for the regex *anywhere*
 in the haystack given.

Because 'validators.string.Pattern' calls 're.Pattern.match' when it is
given a compiled Python pattern, this leads to inconsistent behavior:

 import re
 from pydantic import BaseModel, Field

 class A(BaseModel):
     b: str = Field(pattern=r"[a-z]")
     c: str = Field(pattern=re.compile(r"[a-z]"))

 A.model_validate({"b": "Abc", "c": "Abc"})

In this snippet of code, 'b' will validate fine, but 'c' won't:
're.Pattern.match' only matches at the start of the string, and 'A' is
not in '[a-z]'.
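
The difference can be reproduced with the standard 're' module alone,
without pydantic. This is only a minimal sketch; the variable names are
made up for illustration:

```python
import re

pattern = re.compile(r"[a-z]")
target = "Abc"

# match() only tries the pattern at position 0, where "A" is not in [a-z].
anchored = pattern.match(target)

# search() scans the whole string and finds "b" at index 1.
anywhere = pattern.search(target)

print(anchored)          # None
print(anywhere.group())  # b
```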

Since the test cases for string already establish the expected behavior
(Rust's `Regex.is_match` case), let's use `re.Pattern.search`
instead of `re.Pattern.match` to unify the results when a `re.Pattern`
object is passed.
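
As a rough illustration of the unified semantics, here is a pure-Python
sketch. 'is_match_anywhere' is a hypothetical helper, not part of
pydantic-core, and it uses Python's 're' for string patterns as well,
whereas the real code hands those to the Rust engine. The cases mirror
the pattern rows in tests/validators/test_string.py:

```python
import re
from typing import Union


def is_match_anywhere(pattern: Union[str, re.Pattern], target: str) -> bool:
    """Hypothetical helper: True if the pattern matches anywhere in target."""
    compiled = re.compile(pattern) if isinstance(pattern, str) else pattern
    # search() gives "match anywhere" semantics for both input kinds.
    return compiled.search(target) is not None


# (pattern, input, should the pattern check pass?)
cases = [
    (r"^\d+$", "12345", True),
    (r"\d+$", "foobar 123", True),
    (r"^\d+$", "12345a", False),
    (r"[a-z]", "Abc", True),
    (re.compile(r"[a-z]"), "Abc", True),
]

for pattern, target, expected in cases:
    assert is_match_anywhere(pattern, target) is expected
```

Anyone who relied on the old start-anchored behavior of a compiled
pattern can still get it by writing the anchor explicitly, e.g.
re.compile(r"^[a-z]").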

Signed-off-by: Marcin Sobczyk <msobczyk@redhat.com>
tinez committed Jul 13, 2024
1 parent 0e6b377 commit d02dc18
Showing 2 changed files with 3 additions and 1 deletion.
2 changes: 1 addition & 1 deletion src/validators/string.rs
@@ -277,7 +277,7 @@ impl Pattern {
         match &self.engine {
             RegexEngine::RustRegex(regex) => Ok(regex.is_match(target)),
             RegexEngine::PythonRe(py_regex) => {
-                Ok(!py_regex.call_method1(py, intern!(py, "match"), (target,))?.is_none(py))
+                Ok(!py_regex.call_method1(py, intern!(py, "search"), (target,))?.is_none(py))
             }
         }
     }
2 changes: 2 additions & 0 deletions tests/validators/test_string.py
@@ -80,6 +80,8 @@ def test_str_not_json(input_value, expected):
         ({'pattern': r'^\d+$'}, '12345', '12345'),
         ({'pattern': r'\d+$'}, 'foobar 123', 'foobar 123'),
         ({'pattern': r'^\d+$'}, '12345a', Err("String should match pattern '^\\d+$' [type=string_pattern_mismatch")),
+        ({'pattern': r'[a-z]'}, 'Abc', 'Abc'),
+        ({'pattern': re.compile(r'[a-z]')}, 'Abc', 'Abc'),
         # strip comes after length check
         ({'max_length': 5, 'strip_whitespace': True}, '1234 ', '1234'),
         # to_upper and strip comes after pattern check