-
Notifications
You must be signed in to change notification settings - Fork 181
Support mvfind eval function
#4839
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| */ | ||
| public class MVFindFunctionImpl extends ImplementorUDF { | ||
| public MVFindFunctionImpl() { | ||
| super(new MVFindImplementor(), NullPolicy.ALL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be NullPolicy.ANY? (looks it returns null when one of the parameter is null)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to NullPolicy.ANY
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
Outdated
Show resolved
Hide resolved
...c/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
Outdated
Show resolved
Hide resolved
cb4db3a to
b254cf6
Compare
Signed-off-by: Kai Huang <ahkcs@amazon.com> # Conflicts: # core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java # core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java # integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java # ppl/src/main/antlr/OpenSearchPPLLexer.g4 # ppl/src/main/antlr/OpenSearchPPLParser.g4 # ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java # Conflicts: # core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java # docs/user/ppl/functions/collection.rst # Conflicts: # core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java # core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java # core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java # integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java # ppl/src/main/antlr/OpenSearchPPLLexer.g4
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
adc6736 to
b163546
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java (1)
218-301: Consider adding LogicalProject verification and edge-case tests.The tests cover core functionality and correctly verify Spark SQL translation. However, for consistency with tests like
testMvjoinWithStringArray(lines 19-40) and better Calcite integration coverage:
Add LogicalProject verification: Include
verifyLogical(root, expectedLogical)assertions to ensure the logical plan structure is correct. This helps catch planning-time issues.Consider additional edge cases (optional):
- Empty array:
mvfind(array(), 'pattern')- Array with null elements:
mvfind(array('a', null, 'b'), '.*')- Special regex characters:
mvfind(array('a.b', 'a*b'), 'a\\.b')- Case sensitivity: patterns with different case behavior
As per learnings, testing SQL generation and optimization paths is important for Calcite integration changes.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java(1 hunks)core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java(1 hunks)core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java(2 hunks)core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java(2 hunks)core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java(1 hunks)docs/user/ppl/functions/collection.md(1 hunks)integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java(1 hunks)ppl/src/main/antlr/OpenSearchPPLLexer.g4(1 hunks)ppl/src/main/antlr/OpenSearchPPLParser.g4(1 hunks)ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java(1 hunks)ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
- ppl/src/main/antlr/OpenSearchPPLLexer.g4
- core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
- ppl/src/main/antlr/OpenSearchPPLParser.g4
- docs/user/ppl/functions/collection.md
- core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
- core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java
- ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.javacore/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.javappl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java
⚙️ CodeRabbit configuration file
**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure methods are under 20 lines with single responsibility
- Verify proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.javacore/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.javappl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java
**/*Test.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*Test.java: All new business logic requires unit tests
Name unit tests with*Test.javasuffix in OpenSearch SQL
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify test coverage for new business logic
- Check test naming follows conventions (*Test.java for unit, *IT.java for integration)
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java
**/ppl/**/*.java
⚙️ CodeRabbit configuration file
**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases
- Check AST generation for new syntax
- Ensure corresponding AST builder classes are updated
- Validate edge cases and boundary conditions
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Verify SQL generation and optimization paths
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java
🧠 Learnings (5)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Flag generic exception handling during code review
Applied to files:
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Flag redundant or obvious comments during code review
Applied to files:
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Use specific exception types with meaningful messages for error handling
Applied to files:
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java
🧬 Code graph analysis (2)
core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java (1)
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java (1)
MVFindFunctionImpl(37-178)
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java (1)
core/src/main/java/org/opensearch/sql/expression/function/ImplementorUDF.java (1)
ImplementorUDF(17-40)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: security-it-linux (21)
- GitHub Check: security-it-linux (25)
- GitHub Check: build-linux (25, integration)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (25, unit)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: test-sql-cli-integration (21)
- GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (6)
core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java (1)
50-50: LGTM! MVFIND operator follows existing patterns.The import and operator declaration follow the established pattern for collection functions (MVAPPEND, MVZIP, etc.). Naming conventions are correct with PascalCase for the implementation class and lowercase for the UDF name.
Also applies to: 398-398
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java (5)
26-51: Well-documented class with correct metadata.The JavaDoc clearly explains the function's purpose, usage, and behavior with a concrete example. The operand metadata correctly accepts
ARRAYandANY(with runtime coercion), and the comment accurately reflects this.NullPolicy.ANYis appropriate since the function returns null when any parameter is null.
53-104: Excellent planning-time optimization with proper error handling.The implementor correctly:
- Compiles literal patterns at planning time for better performance
- Uses
getValueAs(String.class)to properly unwrap Calcite string literals (with fallback for other types)- Documents the Calcite-specific approach with a clear comment
- Converts
PatternSyntaxExceptiontoIllegalArgumentExceptionfor proper 400-error semanticsThis addresses past review feedback and follows Calcite best practices.
As per coding guidelines, Calcite-specific workarounds are properly documented.
107-118: Clean core search logic.The implementation correctly:
- Iterates the array and skips null elements
- Converts elements to strings for pattern matching
- Uses
Pattern.find()for substring matching (appropriate for regex search)- Returns 0-based index of the first match or null
The method is focused, concise, and follows the single-responsibility principle.
120-133: LGTM! Proper handling of pre-compiled patterns.The method is correctly marked
public static(required for Calcite code generation) and has appropriate JavaDoc. The implementation performs null checks and delegates directly tomvfindCorewithout unnecessary exception wrapping, as noted in the documentation.This addresses past review feedback about avoiding generic exception handling.
135-177: Runtime path correctly handles type coercion and errors.The implementation properly:
- Supports type coercion via
toString()for numeric and other types inevalWithString- Catches only
PatternSyntaxException(specific exception type) and converts it toIllegalArgumentExceptionwith a descriptive message- Avoids generic exception handling, letting other runtime exceptions propagate
The separation between
evalWithString(type coercion) andmvfind(pattern compilation) provides clear responsibilities. This addresses past review feedback and follows error-handling best practices.As per learnings, specific exception types with meaningful messages are used for error handling.
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java (1)
1-214: Consider adding test coverage forevalWithPattern.While the tests comprehensively cover
evalWithString, the implementation also exposesevalWithPattern(List<Object> array, Pattern pattern)for literal patterns compiled at planning time. Adding a few tests that directly callevalWithPatternwith pre-compiledPatternobjects would improve unit test coverage of this optimization path.Example test to add:
@Test public void testEvalWithPrecompiledPattern() { List<Object> array = Arrays.asList("apple", "banana", "cherry"); Pattern pattern = Pattern.compile("ban.*"); Object result = MVFindFunctionImpl.evalWithPattern(array, pattern); assertEquals(1, result); } @Test public void testEvalWithPatternAndNullArray() { Pattern pattern = Pattern.compile("test"); Object result = MVFindFunctionImpl.evalWithPattern(null, pattern); assertNull(result); }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
⚙️ CodeRabbit configuration file
**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure methods are under 20 lines with single responsibility
- Verify proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
**/*Test.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*Test.java: All new business logic requires unit tests
Name unit tests with*Test.javasuffix in OpenSearch SQL
Files:
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify test coverage for new business logic
- Check test naming follows conventions (*Test.java for unit, *IT.java for integration)
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
🧠 Learnings (2)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL
Applied to files:
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Flag redundant or obvious comments during code review
Applied to files:
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
🧬 Code graph analysis (1)
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java (1)
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java (1)
MVFindFunctionImpl(37-168)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: security-it-linux (21)
- GitHub Check: security-it-linux (25)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (21, unit)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: CodeQL-Scan (java)
- GitHub Check: test-sql-cli-integration (21)
🔇 Additional comments (3)
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java (3)
1-105: LGTM! Comprehensive coverage of basic functionality, null handling, and edge cases.The test structure is clear and well-organized. Basic matching scenarios, null handling, and edge cases (empty array, empty pattern, single element) are thoroughly tested with appropriate assertions.
107-158: LGTM! Excellent regex pattern coverage.Tests thoroughly verify wildcard patterns, character classes, digit classes, case-insensitive flags, and anchors. The case-sensitivity test confirms the default behavior is correct.
177-214: LGTM! Thorough type conversion and coercion testing.Tests verify correct behavior with numeric, mixed-type, and boolean elements, demonstrating that the function properly converts elements to strings for regex matching.
...c/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Kai Huang <ahkcs@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java (1)
18-216: Add unit tests for theevalWithPatternpublic method.All current tests exercise
MVFindFunctionImpl.evalWithString, which thoroughly validates the runtime pattern compilation path. However,evalWithPattern(List<Object>, Pattern)is also a public method used by the planning-time compilation path (when literal regex patterns are pre-compiled). Consider adding tests that directly callevalWithPatternwith pre-compiledPatternobjects to ensure complete unit test coverage of the public API.Example tests to add:
@Test public void testMvfindWithPrecompiledPattern() { List<Object> array = Arrays.asList("apple", "banana", "cherry"); Pattern pattern = Pattern.compile("banana"); Object result = MVFindFunctionImpl.evalWithPattern(array, pattern); assertEquals(1, result); } @Test public void testMvfindWithPrecompiledPatternAndNullArray() { Pattern pattern = Pattern.compile("test"); Object result = MVFindFunctionImpl.evalWithPattern(null, pattern); assertNull(result); } @Test public void testMvfindWithPrecompiledPatternAndNullPattern() { List<Object> array = Arrays.asList("apple", "banana"); Object result = MVFindFunctionImpl.evalWithPattern(array, null); assertNull(result); } @Test public void testMvfindWithPrecompiledCaseInsensitivePattern() { List<Object> array = Arrays.asList("Apple", "Banana", "Cherry"); Pattern pattern = Pattern.compile("banana", Pattern.CASE_INSENSITIVE); Object result = MVFindFunctionImpl.evalWithPattern(array, pattern); assertEquals(1, result); }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
⚙️ CodeRabbit configuration file
**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure methods are under 20 lines with single responsibility
- Verify proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
**/*Test.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*Test.java: All new business logic requires unit tests
Name unit tests with*Test.javasuffix in OpenSearch SQL
Files:
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify test coverage for new business logic
- Check test naming follows conventions (*Test.java for unit, *IT.java for integration)
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
🧠 Learnings (4)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL
Applied to files:
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Flag redundant or obvious comments during code review
Applied to files:
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : All new business logic requires unit tests
Applied to files:
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java
🧬 Code graph analysis (1)
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java (1)
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImpl.java (1)
MVFindFunctionImpl(37-168)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (21, unit)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: security-it-linux (21)
- GitHub Check: security-it-linux (25)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: CodeQL-Scan (java)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: test-sql-cli-integration (21)
🔇 Additional comments (8)
core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MVFindFunctionImplTest.java (8)
1-19: LGTM! Test class structure is well-organized.The imports, naming conventions, and overall structure follow OpenSearch SQL and Java best practices.
20-56: LGTM! Basic functionality tests are comprehensive.These tests effectively cover the core behaviors: exact matches, no matches, positional matches (first/last), and first-match priority.
57-84: LGTM! Null handling is thoroughly tested.The tests cover null inputs (array, regex, both) and null elements within the array, aligning with the expected behavior of skipping null elements during the search.
85-107: LGTM! Edge cases are appropriately tested.The tests cover empty arrays, empty patterns, and single-element arrays. The empty pattern behavior (matching the first element due to zero-width match) is correctly validated.
108-151: LGTM! Regex pattern tests are comprehensive.These tests effectively validate various regex features including wildcards, character classes, digit patterns, case-insensitive matching, and anchors. The test data appropriately includes multiple elements to verify correct index identification.
152-160: LGTM! Case sensitivity test validates default behavior.This test confirms that pattern matching is case-sensitive by default, complementing the earlier test with the explicit case-insensitive flag.
161-174: LGTM! Invalid regex error handling is properly tested.The test correctly validates that invalid regex patterns throw
IllegalArgumentExceptionwith descriptive error messages. This aligns with the implementation's error handling strategy.
175-216: LGTM! Type conversion tests are thorough.These tests effectively validate that the implementation correctly handles non-string types through
toString()conversion, covering numeric, boolean, and mixed-type scenarios.
Description
This PR implements the
mvfindeval function for PPL, which searches a multivalue array and returns the 0-based index of the first element matching a regular expression pattern. Returns NULL if no match is found.Implementation Approach
Attempted: Native Calcite SQL Functions
We initially explored using native Calcite SQL functions to avoid creating a UDF. We tried ARRAY-related Calcite functions like
ARRAY_POSITION, but we didn't find any functions that support REGEX. For functions that do support REGEX (likeREGEXP_CONTAINS,REGEXP_LIKE), they all take string as a parameter instead of array.Final Solution: UDF Implementation
Implemented as a UDF following the pattern of
EXISTSandFORALLfunctions, using Java'sPatternandMatcherfor regex matching.Examples
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.