Commit 468f944

Fix wildcard query with escaped backslash followed by wildcard (#19719) (#19756)
The getNonWildcardSequence method incorrectly handled cases where an escaped backslash was followed by a wildcard character. It would check if the character before a wildcard was a backslash, but didn't account for that backslash itself being escaped.

For example, the query "*some\\*" (matching strings containing "some\") would incorrectly return "some\\*" instead of "some\\", causing invalid ngrams containing the wildcard character to be generated.

The fix counts consecutive backslashes before wildcards:
- Even count: wildcard is NOT escaped
- Odd count: wildcard IS escaped

---------

Signed-off-by: Deven Prajapati <pdevens09@gmail.com>
Signed-off-by: Deven009 <pdevens09@gmail.com>
Signed-off-by: Deven009 <37845204+Deven009@users.noreply.github.com>
1 parent 9ca9ee8 commit 468f944
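
The even/odd rule from the commit message can be illustrated with a minimal, self-contained sketch. The class `EscapeParity` and the helper `isEscaped` are hypothetical names used only for illustration; they are not part of `WildcardFieldMapper`, which scans forward rather than backward.

```java
public class EscapeParity {
    // Hypothetical helper (not part of WildcardFieldMapper): reports whether the
    // character at `index` is escaped, by counting the backslashes directly before it.
    static boolean isEscaped(String pattern, int index) {
        int backslashes = 0;
        for (int i = index - 1; i >= 0 && pattern.charAt(i) == '\\'; i--) {
            backslashes++;
        }
        // Odd count: the last backslash escapes the character.
        // Even count: the backslashes pair up and escape each other.
        return backslashes % 2 == 1;
    }

    public static void main(String[] args) {
        String escapedStar = "some\\*";        // pattern text: some\*
        String escapedBackslash = "some\\\\*"; // pattern text: some\\*
        System.out.println(isEscaped(escapedStar, escapedStar.length() - 1));           // true
        System.out.println(isEscaped(escapedBackslash, escapedBackslash.length() - 1)); // false
    }
}
```

In the second pattern the two backslashes form one escaped backslash, so the trailing `*` remains a real wildcard.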

3 files changed: +39 -2 lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -77,6 +77,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 - Fix issue with updating core with a patch number other than 0 ([#19377](https://github.com/opensearch-project/OpenSearch/pull/19377))
 - [Java Agent] Allow JRT protocol URLs in protection domain extraction ([#19683](https://github.com/opensearch-project/OpenSearch/pull/19683))
 - Fix potential concurrent modification exception when updating allocation filters ([#19701])(https://github.com/opensearch-project/OpenSearch/pull/19701))
+- Fix wildcard query with escaped backslash followed by wildcard character ([#19719](https://github.com/opensearch-project/OpenSearch/pull/19719))
 - Fix file-based ingestion consumer to handle start point beyond max line number([#19757])(https://github.com/opensearch-project/OpenSearch/pull/19757))
 - Fix IndexOutOfBoundsException when running include/exclude on non-existent prefix in terms aggregations ([#19637](https://github.com/opensearch-project/OpenSearch/pull/19637))
 - Fixed assertion unsafe use of ClusterService.state() in ResourceUsageCollectorService ([#19775])(https://github.com/opensearch-project/OpenSearch/pull/19775))

server/src/main/java/org/opensearch/index/mapper/WildcardFieldMapper.java

Lines changed: 8 additions & 2 deletions
@@ -505,10 +505,16 @@ static Set<String> getRequiredNGrams(String value, boolean regexpMode) {
     }
 
     private static String getNonWildcardSequence(String value, int startFrom) {
+        int consecutiveBackslashes = 0;
         for (int i = startFrom; i < value.length(); i++) {
             char c = value.charAt(i);
-            if ((c == '?' || c == '*') && (i == 0 || value.charAt(i - 1) != '\\')) {
-                return value.substring(startFrom, i);
+            if (c == '\\') {
+                consecutiveBackslashes++;
+            } else {
+                if ((c == '?' || c == '*') && consecutiveBackslashes % 2 == 0) {
+                    return value.substring(startFrom, i);
+                }
+                consecutiveBackslashes = 0;
             }
         }
         // Made it to the end. No more wildcards.
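
To see the behavioral difference in isolation, the following sketch copies the removed and added loop bodies into two standalone methods and runs both on the pattern from the commit message. The class name `NonWildcardSequenceDemo` and the method names `before` and `after` are hypothetical. The final `return value.substring(startFrom)` is an assumption, since the diff does not show the end of the real method.

```java
public class NonWildcardSequenceDemo {
    // Logic from the removed lines: only looks at the single preceding character.
    static String before(String value, int startFrom) {
        for (int i = startFrom; i < value.length(); i++) {
            char c = value.charAt(i);
            if ((c == '?' || c == '*') && (i == 0 || value.charAt(i - 1) != '\\')) {
                return value.substring(startFrom, i);
            }
        }
        // Assumption: the tail of the real method is not shown in the diff.
        return value.substring(startFrom);
    }

    // Logic from the added lines: tracks the run of backslashes before each character.
    static String after(String value, int startFrom) {
        int consecutiveBackslashes = 0;
        for (int i = startFrom; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\\') {
                consecutiveBackslashes++;
            } else {
                if ((c == '?' || c == '*') && consecutiveBackslashes % 2 == 0) {
                    return value.substring(startFrom, i);
                }
                consecutiveBackslashes = 0;
            }
        }
        // Assumption: same tail handling as above.
        return value.substring(startFrom);
    }

    public static void main(String[] args) {
        String pattern = "*some\\\\*"; // pattern text: *some\\*
        System.out.println(before(pattern, 1)); // prints some\\*  (wildcard leaks into the sequence)
        System.out.println(after(pattern, 1));  // prints some\\   (stops at the unescaped wildcard)
    }
}
```

Both versions agree on a pattern such as `some\*`, where a single backslash escapes the wildcard; they differ only when the backslash run before the wildcard has even length.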

server/src/test/java/org/opensearch/index/mapper/WildcardFieldTypeTests.java

Lines changed: 30 additions & 0 deletions
@@ -142,6 +142,36 @@ public void testMultipleWildcardsInQuery() {
         assertFalse(actualMatchingQuery.getSecondPhaseMatcher().test("abcdzzzefgqqh"));
     }
 
+    public void testEscapedBackslashFollowedByWildcard() {
+        MappedFieldType ft = new WildcardFieldMapper.WildcardFieldType("field");
+
+        // Test case from issue #19719
+        // Pattern: *some\\* means "wildcard + 'some\' + wildcard"
+        // Should match strings like "some\string", "awesome\stuff", etc.
+
+        // Verify ngram generation doesn't include wildcard characters
+        Set<String> ngrams = WildcardFieldMapper.WildcardFieldType.getRequiredNGrams("*some\\\\*", false);
+        assertFalse("Ngrams should not contain wildcard characters", ngrams.stream().anyMatch(s -> s.contains("*")));
+        assertTrue(ngrams.contains("som"));
+        assertTrue(ngrams.contains("ome"));
+        assertTrue(ngrams.contains("me\\"));
+
+        // Test the query
+        Query query = ft.wildcardQuery("*some\\\\*", null, null);
+        assertTrue(query instanceof WildcardFieldMapper.WildcardMatchingQuery);
+
+        WildcardFieldMapper.WildcardMatchingQuery wildcardQuery = (WildcardFieldMapper.WildcardMatchingQuery) query;
+
+        // Second phase matcher should correctly match strings with backslash
+        assertTrue(wildcardQuery.getSecondPhaseMatcher().test("some\\string"));
+        assertTrue(wildcardQuery.getSecondPhaseMatcher().test("some\\"));
+        assertTrue(wildcardQuery.getSecondPhaseMatcher().test("prefix_some\\suffix"));
+
+        // Should not match strings without backslash
+        assertFalse(wildcardQuery.getSecondPhaseMatcher().test("somestring"));
+        assertFalse(wildcardQuery.getSecondPhaseMatcher().test("some/string"));
+    }
+
     public void testRegexpQuery() {
         String pattern = ".*apple.*";
         MappedFieldType ft = new WildcardFieldMapper.WildcardFieldType("field");
