Skip to content

Conversation

@ahkcs
Copy link
Contributor

@ahkcs ahkcs commented Nov 20, 2025

Description

This PR implements the mvfind eval function for PPL, which searches a multivalue array and returns the 0-based index of the first element matching a regular expression pattern. Returns NULL if no match is found.

Implementation Approach

Attempted: Native Calcite SQL Functions

We initially explored using native Calcite SQL functions to avoid creating a UDF. We tried ARRAY-related Calcite functions like ARRAY_POSITION, but we didn't find any functions that support REGEX. For functions that do support REGEX (like REGEXP_CONTAINS, REGEXP_LIKE), they all take string as a parameter instead of array.

Final Solution: UDF Implementation

Implemented as a UDF following the pattern of EXISTS and FORALL functions, using Java's Pattern and Matcher for regex matching.

Examples

-- Basic pattern matching
source=people | eval result = mvfind(array('apple', 'banana', 'apricot'), 'ban.*') | fields result
> 1

-- No match returns NULL
source=people | eval result = mvfind(array('cat', 'dog', 'bird'), 'fish') | fields result
> null

-- Regex with character class
source=people | eval result = mvfind(array('error123', 'info', 'error456'), 'error[0-9]+') | fields result
> 0

-- Case-insensitive matching
source=people | eval result = mvfind(array('Apple', 'Banana', 'Cherry'), '(?i)banana') | fields result
> 1

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

*/
public class MVFindFunctionImpl extends ImplementorUDF {
public MVFindFunctionImpl() {
super(new MVFindImplementor(), NullPolicy.ALL);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be NullPolicy.ANY? (looks it returns null when one of the parameter is null)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to NullPolicy.ANY

@ahkcs ahkcs requested a review from ykmr1224 November 21, 2025 17:52
@dai-chen dai-chen added enhancement New feature or request PPL Piped processing language labels Nov 21, 2025
Signed-off-by: Kai Huang <ahkcs@amazon.com>

# Conflicts:
#	core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java
#	core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
#	integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
#	ppl/src/main/antlr/OpenSearchPPLLexer.g4
#	ppl/src/main/antlr/OpenSearchPPLParser.g4
#	ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java

# Conflicts:
#	core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java
#	docs/user/ppl/functions/collection.rst

# Conflicts:
#	core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java
#	core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java
#	core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
#	integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
#	ppl/src/main/antlr/OpenSearchPPLLexer.g4
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java (1)

218-301: Consider adding LogicalProject verification and edge-case tests.

The tests cover core functionality and correctly verify Spark SQL translation. However, for consistency with tests like testMvjoinWithStringArray (lines 19-40) and better Calcite integration coverage:

  1. Add LogicalProject verification: Include verifyLogical(root, expectedLogical) assertions to ensure the logical plan structure is correct. This helps catch planning-time issues.

  2. Consider additional edge cases (optional):

    • Empty array: mvfind(array(), 'pattern')
    • Array with null elements: mvfind(array('a', null, 'b'), '.*')
    • Special regex characters: mvfind(array('a.b', 'a*b'), 'a\\.b')
    • Case sensitivity: patterns with different case behavior

As per learnings, testing SQL generation and optimization paths is important for Calcite integration changes.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between adc6736 and b163546.

📒 Files selected for processing (11)
  • core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java (1 hunks)
  • core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java (1 hunks)
  • core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java (2 hunks)
  • core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java (2 hunks)
  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java (1 hunks)
  • docs/user/ppl/functions/collection.md (1 hunks)
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java (1 hunks)
  • ppl/src/main/antlr/OpenSearchPPLLexer.g4 (1 hunks)
  • ppl/src/main/antlr/OpenSearchPPLParser.g4 (1 hunks)
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java (1 hunks)
  • ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • ppl/src/main/antlr/OpenSearchPPLLexer.g4
  • core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
  • ppl/src/main/antlr/OpenSearchPPLParser.g4
  • docs/user/ppl/functions/collection.md
  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
  • core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java
  • ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java
  • core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java

⚙️ CodeRabbit configuration file

**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)

  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure methods are under 20 lines with single responsibility
  • Verify proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java
  • core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java
**/*Test.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*Test.java: All new business logic requires unit tests
Name unit tests with *Test.java suffix in OpenSearch SQL

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify test coverage for new business logic

  • Check test naming follows conventions (*Test.java for unit, *IT.java for integration)
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java
**/ppl/**/*.java

⚙️ CodeRabbit configuration file

**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases

  • Check AST generation for new syntax
  • Ensure corresponding AST builder classes are updated
  • Validate edge cases and boundary conditions

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java
**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor

  • Verify SQL generation and optimization paths
  • Document any Calcite-specific workarounds
  • Test compatibility with Calcite version constraints

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java
🧠 Learnings (5)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Flag generic exception handling during code review

Applied to files:

  • core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Flag redundant or obvious comments during code review

Applied to files:

  • core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code

Applied to files:

  • core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Use specific exception types with meaningful messages for error handling

Applied to files:

  • core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java
🧬 Code graph analysis (2)
core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java (1)
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java (1)
  • MVFindFunctionImpl (37-178)
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java (1)
core/src/main/java/org/opensearch/sql/expression/function/ImplementorUDF.java (1)
  • ImplementorUDF (17-40)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
  • GitHub Check: security-it-linux (21)
  • GitHub Check: security-it-linux (25)
  • GitHub Check: build-linux (25, integration)
  • GitHub Check: bwc-tests-full-restart (25)
  • GitHub Check: bwc-tests-rolling-upgrade (25)
  • GitHub Check: build-linux (25, doc)
  • GitHub Check: build-linux (21, doc)
  • GitHub Check: build-linux (21, unit)
  • GitHub Check: build-linux (21, integration)
  • GitHub Check: build-linux (25, unit)
  • GitHub Check: bwc-tests-full-restart (21)
  • GitHub Check: bwc-tests-rolling-upgrade (21)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 25, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, unit)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 25, unit)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (macos-14, 25, doc)
  • GitHub Check: build-windows-macos (macos-14, 21, doc)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, integration)
  • GitHub Check: security-it-windows-macos (windows-latest, 21)
  • GitHub Check: security-it-windows-macos (macos-14, 21)
  • GitHub Check: security-it-windows-macos (macos-14, 25)
  • GitHub Check: security-it-windows-macos (windows-latest, 25)
  • GitHub Check: test-sql-cli-integration (21)
  • GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (6)
core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java (1)

50-50: LGTM! MVFIND operator follows existing patterns.

The import and operator declaration follow the established pattern for collection functions (MVAPPEND, MVZIP, etc.). Naming conventions are correct with PascalCase for the implementation class and lowercase for the UDF name.

Also applies to: 398-398

core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java (5)

26-51: Well-documented class with correct metadata.

The JavaDoc clearly explains the function's purpose, usage, and behavior with a concrete example. The operand metadata correctly accepts ARRAY and ANY (with runtime coercion), and the comment accurately reflects this. NullPolicy.ANY is appropriate since the function returns null when any parameter is null.


53-104: Excellent planning-time optimization with proper error handling.

The implementor correctly:

  • Compiles literal patterns at planning time for better performance
  • Uses getValueAs(String.class) to properly unwrap Calcite string literals (with fallback for other types)
  • Documents the Calcite-specific approach with a clear comment
  • Converts PatternSyntaxException to IllegalArgumentException for proper 400-error semantics

This addresses past review feedback and follows Calcite best practices.

As per coding guidelines, Calcite-specific workarounds are properly documented.


107-118: Clean core search logic.

The implementation correctly:

  • Iterates the array and skips null elements
  • Converts elements to strings for pattern matching
  • Uses Pattern.find() for substring matching (appropriate for regex search)
  • Returns 0-based index of the first match or null

The method is focused, concise, and follows the single-responsibility principle.


120-133: LGTM! Proper handling of pre-compiled patterns.

The method is correctly marked public static (required for Calcite code generation) and has appropriate JavaDoc. The implementation performs null checks and delegates directly to mvfindCore without unnecessary exception wrapping, as noted in the documentation.

This addresses past review feedback about avoiding generic exception handling.


135-177: Runtime path correctly handles type coercion and errors.

The implementation properly:

  • Supports type coercion via toString() for numeric and other types in evalWithString
  • Catches only PatternSyntaxException (specific exception type) and converts it to IllegalArgumentException with a descriptive message
  • Avoids generic exception handling, letting other runtime exceptions propagate

The separation between evalWithString (type coercion) and mvfind (pattern compilation) provides clear responsibilities. This addresses past review feedback and follows error-handling best practices.

As per learnings, specific exception types with meaningful messages are used for error handling.

Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java (1)

1-214: Consider adding test coverage for evalWithPattern.

While the tests comprehensively cover evalWithString, the implementation also exposes evalWithPattern(List<Object> array, Pattern pattern) for literal patterns compiled at planning time. Adding a few tests that directly call evalWithPattern with pre-compiled Pattern objects would improve unit test coverage of this optimization path.

Example test to add:

@Test
public void testEvalWithPrecompiledPattern() {
  List<Object> array = Arrays.asList("apple", "banana", "cherry");
  Pattern pattern = Pattern.compile("ban.*");
  Object result = MVFindFunctionImpl.evalWithPattern(array, pattern);
  assertEquals(1, result);
}

@Test
public void testEvalWithPatternAndNullArray() {
  Pattern pattern = Pattern.compile("test");
  Object result = MVFindFunctionImpl.evalWithPattern(null, pattern);
  assertNull(result);
}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a72da4d and 5d67541.

📒 Files selected for processing (1)
  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java

⚙️ CodeRabbit configuration file

**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)

  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure methods are under 20 lines with single responsibility
  • Verify proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
**/*Test.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*Test.java: All new business logic requires unit tests
Name unit tests with *Test.java suffix in OpenSearch SQL

Files:

  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify test coverage for new business logic

  • Check test naming follows conventions (*Test.java for unit, *IT.java for integration)
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
🧠 Learnings (2)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL

Applied to files:

  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Flag redundant or obvious comments during code review

Applied to files:

  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
🧬 Code graph analysis (1)
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java (1)
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java (1)
  • MVFindFunctionImpl (37-168)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
  • GitHub Check: security-it-linux (21)
  • GitHub Check: security-it-linux (25)
  • GitHub Check: build-linux (21, integration)
  • GitHub Check: build-linux (25, integration)
  • GitHub Check: build-linux (25, unit)
  • GitHub Check: build-linux (21, doc)
  • GitHub Check: build-linux (25, doc)
  • GitHub Check: build-linux (21, unit)
  • GitHub Check: bwc-tests-rolling-upgrade (21)
  • GitHub Check: bwc-tests-full-restart (25)
  • GitHub Check: bwc-tests-rolling-upgrade (25)
  • GitHub Check: bwc-tests-full-restart (21)
  • GitHub Check: build-windows-macos (macos-14, 25, doc)
  • GitHub Check: build-windows-macos (macos-14, 25, unit)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, integration)
  • GitHub Check: build-windows-macos (macos-14, 25, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 21, doc)
  • GitHub Check: build-windows-macos (macos-14, 21, unit)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
  • GitHub Check: security-it-windows-macos (macos-14, 21)
  • GitHub Check: security-it-windows-macos (macos-14, 25)
  • GitHub Check: security-it-windows-macos (windows-latest, 25)
  • GitHub Check: security-it-windows-macos (windows-latest, 21)
  • GitHub Check: CodeQL-Scan (java)
  • GitHub Check: test-sql-cli-integration (21)
🔇 Additional comments (3)
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java (3)

1-105: LGTM! Comprehensive coverage of basic functionality, null handling, and edge cases.

The test structure is clear and well-organized. Basic matching scenarios, null handling, and edge cases (empty array, empty pattern, single element) are thoroughly tested with appropriate assertions.


107-158: LGTM! Excellent regex pattern coverage.

Tests thoroughly verify wildcard patterns, character classes, digit classes, case-insensitive flags, and anchors. The case-sensitivity test confirms the default behavior is correct.


177-214: LGTM! Thorough type conversion and coercion testing.

Tests verify correct behavior with numeric, mixed-type, and boolean elements, demonstrating that the function properly converts elements to strings for regex matching.

Signed-off-by: Kai Huang <ahkcs@amazon.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java (1)

18-216: Add unit tests for the evalWithPattern public method.

All current tests exercise MVFindFunctionImpl.evalWithString, which thoroughly validates the runtime pattern compilation path. However, evalWithPattern(List<Object>, Pattern) is also a public method used by the planning-time compilation path (when literal regex patterns are pre-compiled). Consider adding tests that directly call evalWithPattern with pre-compiled Pattern objects to ensure complete unit test coverage of the public API.

Example tests to add:

@Test
public void testMvfindWithPrecompiledPattern() {
  List<Object> array = Arrays.asList("apple", "banana", "cherry");
  Pattern pattern = Pattern.compile("banana");
  Object result = MVFindFunctionImpl.evalWithPattern(array, pattern);
  assertEquals(1, result);
}

@Test
public void testMvfindWithPrecompiledPatternAndNullArray() {
  Pattern pattern = Pattern.compile("test");
  Object result = MVFindFunctionImpl.evalWithPattern(null, pattern);
  assertNull(result);
}

@Test
public void testMvfindWithPrecompiledPatternAndNullPattern() {
  List<Object> array = Arrays.asList("apple", "banana");
  Object result = MVFindFunctionImpl.evalWithPattern(array, null);
  assertNull(result);
}

@Test
public void testMvfindWithPrecompiledCaseInsensitivePattern() {
  List<Object> array = Arrays.asList("Apple", "Banana", "Cherry");
  Pattern pattern = Pattern.compile("banana", Pattern.CASE_INSENSITIVE);
  Object result = MVFindFunctionImpl.evalWithPattern(array, pattern);
  assertEquals(1, result);
}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d67541 and 072f610.

📒 Files selected for processing (1)
  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java

⚙️ CodeRabbit configuration file

**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)

  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure methods are under 20 lines with single responsibility
  • Verify proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
**/*Test.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*Test.java: All new business logic requires unit tests
Name unit tests with *Test.java suffix in OpenSearch SQL

Files:

  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify test coverage for new business logic

  • Check test naming follows conventions (*Test.java for unit, *IT.java for integration)
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
🧠 Learnings (4)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL

Applied to files:

  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Flag redundant or obvious comments during code review

Applied to files:

  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL

Applied to files:

  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : All new business logic requires unit tests

Applied to files:

  • core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
🧬 Code graph analysis (1)
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java (1)
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java (1)
  • MVFindFunctionImpl (37-168)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
  • GitHub Check: build-linux (25, integration)
  • GitHub Check: build-linux (25, unit)
  • GitHub Check: build-linux (21, integration)
  • GitHub Check: build-linux (25, doc)
  • GitHub Check: build-linux (21, doc)
  • GitHub Check: build-linux (21, unit)
  • GitHub Check: bwc-tests-full-restart (25)
  • GitHub Check: bwc-tests-rolling-upgrade (21)
  • GitHub Check: bwc-tests-rolling-upgrade (25)
  • GitHub Check: bwc-tests-full-restart (21)
  • GitHub Check: security-it-linux (21)
  • GitHub Check: security-it-linux (25)
  • GitHub Check: build-windows-macos (macos-14, 25, doc)
  • GitHub Check: build-windows-macos (macos-14, 21, integration)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (macos-14, 25, unit)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 21, doc)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 25, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, unit)
  • GitHub Check: security-it-windows-macos (windows-latest, 21)
  • GitHub Check: security-it-windows-macos (macos-14, 21)
  • GitHub Check: security-it-windows-macos (macos-14, 25)
  • GitHub Check: CodeQL-Scan (java)
  • GitHub Check: security-it-windows-macos (windows-latest, 25)
  • GitHub Check: test-sql-cli-integration (21)
🔇 Additional comments (8)
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java (8)

1-19: LGTM! Test class structure is well-organized.

The imports, naming conventions, and overall structure follow OpenSearch SQL and Java best practices.


20-56: LGTM! Basic functionality tests are comprehensive.

These tests effectively cover the core behaviors: exact matches, no matches, positional matches (first/last), and first-match priority.


57-84: LGTM! Null handling is thoroughly tested.

The tests cover null inputs (array, regex, both) and null elements within the array, aligning with the expected behavior of skipping null elements during the search.


85-107: LGTM! Edge cases are appropriately tested.

The tests cover empty arrays, empty patterns, and single-element arrays. The empty pattern behavior (matching the first element due to zero-width match) is correctly validated.


108-151: LGTM! Regex pattern tests are comprehensive.

These tests effectively validate various regex features including wildcards, character classes, digit patterns, case-insensitive matching, and anchors. The test data appropriately includes multiple elements to verify correct index identification.


152-160: LGTM! Case sensitivity test validates default behavior.

This test confirms that pattern matching is case-sensitive by default, complementing the earlier test with the explicit case-insensitive flag.


161-174: LGTM! Invalid regex error handling is properly tested.

The test correctly validates that invalid regex patterns throw IllegalArgumentException with descriptive error messages. This aligns with the implementation's error handling strategy.


175-216: LGTM! Type conversion tests are thorough.

These tests effectively validate that the implementation correctly handles non-string types through toString() conversion, covering numeric, boolean, and mixed-type scenarios.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request PPL Piped processing language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants