-
Notifications
You must be signed in to change notification settings - Fork 181
Support mvmap eval function
#4856
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstExpressionBuilder.java
Outdated
Show resolved
Hide resolved
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstExpressionBuilder.java
Outdated
Show resolved
Hide resolved
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstExpressionBuilder.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java(5 hunks)core/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.java(4 hunks)core/src/main/java/org/opensearch/sql/calcite/QualifiedNameResolver.java(1 hunks)core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/TransformFunctionImpl.java(2 hunks)docs/user/ppl/functions/collection.md(1 hunks)integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java(1 hunks)ppl/src/main/antlr/OpenSearchPPLParser.g4(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
- docs/user/ppl/functions/collection.md
- ppl/src/main/antlr/OpenSearchPPLParser.g4
🧰 Additional context used
📓 Path-based instructions (2)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/TransformFunctionImpl.javacore/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.javacore/src/main/java/org/opensearch/sql/calcite/QualifiedNameResolver.java
⚙️ CodeRabbit configuration file
**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure methods are under 20 lines with single responsibility
- Verify proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/TransformFunctionImpl.javacore/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.javacore/src/main/java/org/opensearch/sql/calcite/QualifiedNameResolver.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Verify SQL generation and optimization paths
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.javacore/src/main/java/org/opensearch/sql/calcite/QualifiedNameResolver.java
🧠 Learnings (3)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Development requires JDK 21 for the OpenSearch SQL project
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/QualifiedNameResolver.java
🧬 Code graph analysis (1)
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java (1)
core/src/main/java/org/opensearch/sql/ast/expression/LambdaFunction.java (1)
Getter(20-47)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: security-it-linux (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (21, unit)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (25, integration)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: test-sql-cli-integration (21)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: CodeQL-Scan (java)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
🔇 Additional comments (7)
core/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.java (3)
300-312: LGTM for captured variable propagation into lambda parameters.The logic correctly retrieves captured variables from the context and adds corresponding
RexLambdaRefentries to the lambda arguments. This aligns with the capture mechanism inCalcitePlanContext.captureVariable()which stores refs with"__captured_" + indexkeys.
426-427: Captured variables from only the last lambda are retained.If a function has multiple lambda arguments (e.g.,
reducehas accumulator and finish lambdas),capturedVarsis reassigned for each lambda iteration, so only the last lambda's captured variables are kept. This is acceptable formvmap/transform(single lambda), but verify this doesn't affectreduceif captured variables are needed there.
438-445: LGTM for appending captured variables to function arguments.The captured variables are correctly appended to the argument list only for
mvmapandtransformfunctions, maintaining backward compatibility with other functions.core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java (2)
65-73: LGTM for new state fields.The
capturedVariableslist andinLambdaContextflag are well-documented and appropriately use Lombok annotations for accessors.
91-102: LGTM for the lambda context cloning constructor.Sharing
relBuilderandrexBuilderwhile isolating lambda-specific state (rexLambdaRefMap,capturedVariables) is the correct approach. This allows lambda expressions to resolve fields from the parent context while maintaining their own variable bindings.core/src/main/java/org/opensearch/sql/calcite/QualifiedNameResolver.java (1)
67-92: LGTM for the refactored resolution with lambda-context handling.The two-phase approach improves clarity:
- Lambda variables take precedence (preventing accidental capture of lambda params)
- Regular fields are resolved and captured when in lambda context
This correctly enables external field references in
mvmapexpressions likemvmap(arr, arr * multiplier)wheremultipliercomes from the outer scope.core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/TransformFunctionImpl.java (1)
76-80: Clarify captured variable handling in the args structure.The comment documents args structure as
[array, lambda, captured1, captured2, ..., returnType]supporting multiple captures, but verify that the implementation at lines 101-106 correctly processes all captured variables if multiple are present. If only the first captured variable (args[2]) is intentionally used, update the comment to reflect this single-capture design; otherwise, ensure the loop logic handles all captured arguments.
...rc/main/java/org/opensearch/sql/expression/function/CollectionUDF/TransformFunctionImpl.java
Show resolved
Hide resolved
Updated to support singe-value field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java (1)
182-208: Consider minor refactorings for maintainability.The logic correctly handles variable capture and deduplication. However, a few optional improvements could enhance code quality:
Extract magic string constant: The string
"__captured_"is used in three places (lines 187, 200, 205). Consider extracting it to a private static final constant for maintainability.Cache lambda parameter count: The stream/filter operation at lines 199-200 recalculates the lambda parameter count on each capture. For better performance, consider tracking this value separately or caching it.
Method length: The method is 27 lines, slightly exceeding the 20-line guideline. Consider extracting helper methods (e.g.,
findExistingCapture,createCapturedLambdaRef) to improve readability.Example for constant extraction:
+ private static final String CAPTURED_VAR_PREFIX = "__captured_"; + public RexLambdaRef captureVariable(RexNode fieldRef, String fieldName) { // Check if this variable is already captured for (int i = 0; i < capturedVariables.size(); i++) { if (capturedVariables.get(i).equals(fieldRef)) { - return rexLambdaRefMap.get("__captured_" + i); + return rexLambdaRefMap.get(CAPTURED_VAR_PREFIX + i); } } int captureIndex = capturedVariables.size(); capturedVariables.add(fieldRef); int lambdaParamCount = (int) - rexLambdaRefMap.keySet().stream().filter(key -> !key.startsWith("__captured_")).count(); + rexLambdaRefMap.keySet().stream().filter(key -> !key.startsWith(CAPTURED_VAR_PREFIX)).count(); RexLambdaRef lambdaRef = new RexLambdaRef(lambdaParamCount + captureIndex, fieldName, fieldRef.getType()); - rexLambdaRefMap.put("__captured_" + captureIndex, lambdaRef); + rexLambdaRefMap.put(CAPTURED_VAR_PREFIX + captureIndex, lambdaRef); return lambdaRef; }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java(5 hunks)core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/TransformFunctionImpl.java(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/TransformFunctionImpl.java
🧰 Additional context used
📓 Path-based instructions (2)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java
⚙️ CodeRabbit configuration file
**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure methods are under 20 lines with single responsibility
- Verify proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Verify SQL generation and optimization paths
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Development requires JDK 21 for the OpenSearch SQL project
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: security-it-linux (21)
- GitHub Check: security-it-linux (25)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (25, integration)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: test-sql-cli-integration (21)
- GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (3)
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java (3)
65-73: Well-designed lambda capture fields.The new fields for lambda support are well documented and appropriately scoped. The JavaDoc clearly explains their purpose in capturing external variables for lambda expressions.
87-102: Excellent design for lambda context isolation.The private constructor properly shares the relBuilder with the parent while maintaining separate lambda variable mappings. This design correctly enables lambda expressions to reference fields from the current row while having independent variable scopes.
133-140: LGTM - clone method properly updated.The updated clone method correctly uses the new private constructor to create lambda contexts that share the relBuilder while maintaining independent lambda state.
dai-chen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a minor comment. Thanks for the changes!
| public QualifiedName visitFunction(Function node, Void context) { | ||
| List<UnresolvedExpression> funcArgs = node.getFuncArgs(); | ||
| if (!funcArgs.isEmpty()) { | ||
| return funcArgs.get(0).accept(this, context); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we visit each funcArg and return on first non-null (qualified name found)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the visitFunction method to iterate through all function arguments and return on the first non-null qualified name found
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This will also help answer Tomo's comment #4856 (review) which also ask clarify for the behavior.
Signed-off-by: Kai Huang <ahkcs@amazon.com> # Conflicts: # docs/user/ppl/functions/collection.rst # integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java # ppl/src/main/antlr/OpenSearchPPLLexer.g4 # ppl/src/main/antlr/OpenSearchPPLParser.g4 # ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java # Conflicts: # core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
513c11c to
874db06
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstExpressionBuilder.java (1)
490-519: Consider documenting the first-argument-only traversal for functions.The
FIELD_NAME_EXTRACTORvisitor correctly extracts field names from expressions. However, thevisitFunctionmethod only examines the first argument of nested functions (line 509). While this is intentional and appropriate for mvmap's use case, consider adding a comment explaining this behavior to help future maintainers understand why other arguments are not checked.Example addition:
@Override public QualifiedName visitFunction(Function node, Void context) { + // For mvmap, the array field is expected in the first argument position + // e.g., mvindex(arr, 1, 3), mvappend(arr, value), etc. List<UnresolvedExpression> funcArgs = node.getFuncArgs(); if (!funcArgs.isEmpty()) { return funcArgs.get(0).accept(this, context); } return null; }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java(5 hunks)core/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.java(4 hunks)core/src/main/java/org/opensearch/sql/calcite/QualifiedNameResolver.java(1 hunks)core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java(1 hunks)core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/TransformFunctionImpl.java(2 hunks)core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java(2 hunks)docs/user/ppl/functions/collection.md(1 hunks)integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java(1 hunks)ppl/src/main/antlr/OpenSearchPPLLexer.g4(1 hunks)ppl/src/main/antlr/OpenSearchPPLParser.g4(2 hunks)ppl/src/main/java/org/opensearch/sql/ppl/parser/AstExpressionBuilder.java(2 hunks)ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java(2 hunks)ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java(1 hunks)ppl/src/test/java/org/opensearch/sql/ppl/parser/AstBuilderTest.java(1 hunks)ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
- ppl/src/main/antlr/OpenSearchPPLLexer.g4
- ppl/src/test/java/org/opensearch/sql/ppl/parser/AstBuilderTest.java
- core/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.java
- core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.javacore/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/TransformFunctionImpl.javappl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.javappl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.javappl/src/main/java/org/opensearch/sql/ppl/parser/AstExpressionBuilder.javacore/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.javacore/src/main/java/org/opensearch/sql/calcite/QualifiedNameResolver.java
⚙️ CodeRabbit configuration file
**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure methods are under 20 lines with single responsibility
- Verify proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.javacore/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/TransformFunctionImpl.javappl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.javappl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.javappl/src/main/java/org/opensearch/sql/ppl/parser/AstExpressionBuilder.javacore/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.javacore/src/main/java/org/opensearch/sql/calcite/QualifiedNameResolver.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Verify SQL generation and optimization paths
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.javacore/src/main/java/org/opensearch/sql/calcite/QualifiedNameResolver.java
**/*Test.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*Test.java: All new business logic requires unit tests
Name unit tests with*Test.javasuffix in OpenSearch SQL
Files:
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify test coverage for new business logic
- Check test naming follows conventions (*Test.java for unit, *IT.java for integration)
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
**/ppl/**/*.java
⚙️ CodeRabbit configuration file
**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases
- Check AST generation for new syntax
- Ensure corresponding AST builder classes are updated
- Validate edge cases and boundary conditions
Files:
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.javappl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.javappl/src/main/java/org/opensearch/sql/ppl/parser/AstExpressionBuilder.java
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Development requires JDK 21 for the OpenSearch SQL project
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Applied to files:
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstExpressionBuilder.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/QualifiedNameResolver.java
🧬 Code graph analysis (2)
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java (1)
core/src/main/java/org/opensearch/sql/ast/expression/LambdaFunction.java (1)
Getter(20-47)
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstExpressionBuilder.java (1)
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java (1)
AbstractNodeVisitor(94-464)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: build-linux (21, doc)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: build-linux (25, doc)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (21, unit)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: security-it-linux (25)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: test-sql-cli-integration (21)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (19)
core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/TransformFunctionImpl.java (2)
76-80: Captured variable detection logic is correct.The detection
args.length > 3correctly identifies when captured variables are present based on the args structure[array, lambda, captured1, ..., returnType].
99-120: LGTM! Limitation is now documented.The branching logic correctly separates captured-variable mode from index-based mode. The documentation comment (lines 101-106) adequately addresses the prior review concern about the single captured variable limitation.
Consider whether
capturedVarbeingnull(when the captured field has a null value) should be handled specially or is acceptable runtime behavior.docs/user/ppl/functions/collection.md (1)
728-794: Documentation is clear and comprehensive.The MVMAP section follows the existing documentation patterns and covers:
- Basic usage with implicit binding
- Arithmetic transformations
- External field references (captured variables)
- Note about nested expressions
The examples are well-structured and demonstrate key use cases.
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java (1)
900-906: Test coverage for mvmap anonymization is appropriate.The test follows the established pattern of other mv* function tests and validates that field names are masked as
identifierand literals as***.core/src/main/java/org/opensearch/sql/calcite/QualifiedNameResolver.java (1)
67-92: Lambda context handling is well-structured.The refactored resolution flow correctly:
- Prioritizes lambda variable resolution
- Attempts field resolution with external variable capture when in lambda context
- Falls back to correlation fields and coalesce handling
The explicit control flow with early returns is clearer than Optional chaining for this multi-step resolution logic.
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (2)
918-926: Mvmap unwrapping logic is appropriate.The special handling correctly detects
mvmapcalls with an implicit lambda (second argument isLambdaFunction) and renders the original expression formatmvmap(firstArg, lambdaBody)rather than the internal lambda representation.
935-943: Lambda function anonymization follows established patterns.The new
visitLambdaFunctionmethod correctly:
- Masks lambda parameter names using
maskField()- Recursively analyzes the lambda body
- Returns the standard lambda notation format
core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java (1)
81-81: LGTM! MVMAP constant correctly added.The new
MVMAPenum constant follows the established pattern for collection functions and is properly positioned in the Collection functions section. It will be automatically included inALL_NATIVE_FUNCTIONSthrough enum iteration.ppl/src/main/antlr/OpenSearchPPLParser.g4 (2)
851-860: Grammar correctly defines mvmap with precedence over evalFunctionCall.The mvmapFunctionCall alternative is placed before evalFunctionCall, ensuring mvmap syntax is matched specifically rather than falling through to the general eval function handler. This ordering is correct for specialized function syntax.
Based on learnings: AST generation should be verified for this new PPL parser syntax (will check in AstExpressionBuilder.java).
862-864: mvmapFunctionCall production correctly enforces 2-argument structure.The grammar properly defines mvmap syntax as
MVMAP(arg1, arg2)with exactly twofunctionArgparameters separated by a comma. This matches the intended syntaxmvmap(array, expression)described in the PR objectives.ppl/src/main/java/org/opensearch/sql/ppl/parser/AstExpressionBuilder.java (3)
25-26: LGTM! Necessary imports for the FIELD_NAME_EXTRACTOR visitor.The
AbstractNodeVisitorandNodeimports support the new visitor pattern used for extracting field names from expressions for mvmap's implicit lambda binding.
453-473: LGTM! Implicit lambda binding correctly implemented.The
visitMvmapFunctionCallmethod properly transforms mvmap syntax into a lambda-based function call:
- Extracts the field name from the first argument (even from nested expressions like
mvindex(arr, 1, 3))- Creates a
LambdaFunctionusing that field name as the parameter- Validates that the second argument is not already a lambda
- Provides clear error messages for invalid syntax
This matches the PR objectives for implicit field binding (e.g.,
mvmap(arr, arr * 10)implicitly createsarr -> arr * 10).Based on learnings: The dedicated
visitMvmapFunctionCallmethod follows the pattern correctly established after previous review feedback.
481-488: LGTM! Helper method with clear documentation.The
extractFieldNamehelper correctly delegates to the visitor and includes a helpful JavaDoc example showing how field names are extracted from nested function calls.core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java (6)
11-11: LGTM! Necessary import for ArrayList.The
ArrayListimport is required for initializing thecapturedVariableslist used in lambda variable capture.
65-73: LGTM! Well-documented fields for lambda variable capture.The new fields
capturedVariablesandinLambdaContextare properly declared with clear JavaDoc explaining their purpose. The captured variables list stores external field references used in lambda expressions, and the context flag tracks lambda scope.
84-85: LGTM! Proper initialization of capturedVariables.The
capturedVariableslist is correctly initialized as an emptyArrayListin the constructor, consistent with other field initializations.
87-102: LGTM! Well-designed constructor for lambda context isolation.The new private constructor correctly creates a child context that:
- Shares
relBuilderandrexBuilderwith the parent (enabling field resolution from current row)- Creates isolated
rexLambdaRefMapandcapturedVariables(preventing variable binding conflicts)- Sets
inLambdaContext=trueto mark lambda scopeThis design enables proper variable scoping for nested lambda expressions.
133-140: LGTM! Simplified clone() delegates to private constructor.The modified
clone()method now uses the private constructor, providing a cleaner implementation with well-documented sharing semantics. The JavaDoc clearly explains that lambda contexts can reference fields from the parent context while maintaining their own variable mappings.
172-208: LGTM! captureVariable correctly implements deduplication and index calculation.The
captureVariablemethod properly handles external variable capture for lambda functions:
- Deduplication (lines 183-189): Efficiently reuses existing captures by checking if the
RexNodewas already captured- Index calculation (lines 198-202): Correctly filters out captured variables when counting lambda parameters, fixing the bug noted in past review comments
- Storage (line 205): Uses
"__captured_"prefix to distinguish captured variables from lambda parametersThe index calculation fix ensures that multiple captured variables get sequential indices after the lambda parameters:
- 1 lambda param + 0 captures → captured var at index 1 ✓
- 1 lambda param + 1 capture → next captured var at index 2 ✓
* Support mvmap eval function Signed-off-by: Kai Huang <ahkcs@amazon.com> # Conflicts: # docs/user/ppl/functions/collection.rst # integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java # ppl/src/main/antlr/OpenSearchPPLLexer.g4 # ppl/src/main/antlr/OpenSearchPPLParser.g4 # ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java # Conflicts: # core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java * update UT Signed-off-by: Kai Huang <ahkcs@amazon.com> * update error handling and test cases Signed-off-by: Kai Huang <ahkcs@amazon.com> * fixes Signed-off-by: Kai Huang <ahkcs@amazon.com> * update collection.md Signed-off-by: Kai Huang <ahkcs@amazon.com> * Update to support referencing single-value field Signed-off-by: Kai Huang <ahkcs@amazon.com> * update to use visitor pattern Signed-off-by: Kai Huang <ahkcs@amazon.com> * Addressing CodeRabbit Signed-off-by: Kai Huang <ahkcs@amazon.com> * Update Signed-off-by: Kai Huang <ahkcs@amazon.com> --------- Signed-off-by: Kai Huang <ahkcs@amazon.com> (cherry picked from commit 11727a4) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Support mvmap eval function # Conflicts: # docs/user/ppl/functions/collection.rst # integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java # ppl/src/main/antlr/OpenSearchPPLLexer.g4 # ppl/src/main/antlr/OpenSearchPPLParser.g4 # ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLArrayFunctionTest.java # Conflicts: # core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java * update UT * update error handling and test cases * fixes * update collection.md * Update to support referencing single-value field * update to use visitor pattern * Addressing CodeRabbit * Update --------- (cherry picked from commit 11727a4) Signed-off-by: Kai Huang <ahkcs@amazon.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Description
This PR adds support for the
mvmapeval function, which iterates over each element of a multivalue array, applies a given expression, and returns a new array containing the transformed results.Syntax:
mvmap(array, expression)During iteration, the field name in the expression is implicitly bound to each element of the input array (e.g.,
mvmap(arr, arr * 10)).Implementation Details
Core Approach
mvmapis implemented as an alias for PPL's existingTRANSFORMoperator, which applies a lambda function to each array element.To support implicit binding syntax—
mvmap(arr, arr * 10)instead of requiring explicit lambda syntax (mvmap(arr, x -> x * 10))—AST transformation logic was added inAstExpressionBuilder.java.AST Transformation Steps
mvmapfunction calls invisitEvalFunctionCall()arr)LambdaFunction, using the extracted field name as the parameterCheck List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.