feat: add support for array_position expression#3172

Open
andygrove wants to merge 6 commits into apache:main from andygrove:feature/array-position
@andygrove andygrove commented Jan 15, 2026

Closes #3157

Summary

  • Adds native Comet support for Spark's array_position function
  • Returns the 1-based position of an element in an array, or 0 if not found

Implementation Details

This required a custom Rust implementation because DataFusion's array_position returns UInt64 and null when not found, while Spark returns Int64 (LongType) and 0.

Key implementation details:

  • Returns Int64 to match Spark's LongType
  • Returns 0 when element is not found (Spark behavior)
  • Returns null when array is null or search element is null
  • Supports both List and LargeList array types
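
The semantics listed above can be sketched in plain Rust. This is only an illustrative model, not the code in this PR: the `array_position` helper below is hypothetical, uses `Option` to stand in for SQL NULL, and works on slices rather than Arrow `List`/`LargeList` arrays. It also assumes that null entries inside the array never match, which is how Spark's `array_position` is understood to behave.

```rust
/// Spark-style `array_position` over a nullable list of nullable values.
/// `None` models SQL NULL. Hypothetical helper, not Comet's actual kernel.
fn array_position<T: PartialEq>(
    array: Option<&[Option<T>]>,
    element: Option<&T>,
) -> Option<i64> {
    // A NULL array or a NULL search element yields NULL.
    let (values, needle) = (array?, element?);

    // Null entries inside the array never match (assumption); find the first hit.
    let index = values.iter().position(|v| v.as_ref() == Some(needle));

    // 1-based position as i64 (Spark's LongType), or 0 when not found.
    Some(index.map_or(0, |i| i as i64 + 1))
}
```

Under this model, searching `[1, NULL, 3]` for `3` gives `3`, searching for a missing value gives `0` rather than NULL, and a NULL array or NULL search element gives NULL. The not-found case and the signed return type are the two points where the description says DataFusion's `array_position` differs.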

Test Plan

  • Added unit tests for array_position in CometArrayExpressionSuite
  • Tests cover:
    • Finding elements in integer arrays
    • Finding elements in string arrays
    • Element not found (returns 0)
    • Arrays with null elements
    • Column-based queries (not just literals)
  • All existing tests pass

Note: This PR was generated with AI assistance.

Closes #3153

Implements Spark's array_position function which returns the 1-based
position of an element in an array, returning 0 if not found.

This required a custom Rust implementation because DataFusion's
array_position returns UInt64 and null when not found, while Spark
returns Int64 (LongType) and 0.

Key implementation details:
- Returns Int64 to match Spark's LongType
- Returns 0 when element is not found (Spark behavior)
- Returns null when array is null or search element is null
- Supports both List and LargeList array types

Closes apache#3153

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@andygrove andygrove marked this pull request as draft January 15, 2026 02:28
codecov-commenter commented Jan 15, 2026

Codecov Report

❌ Patch coverage is 76.92308% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.96%. Comparing base (f09f8af) to head (36c3320).
⚠️ Report is 907 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...src/main/scala/org/apache/comet/serde/arrays.scala | 75.00% | 2 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3172      +/-   ##
============================================
+ Coverage     56.12%   59.96%   +3.84%     
- Complexity      976     1462     +486     
============================================
  Files           119      175      +56     
  Lines         11743    16180    +4437     
  Branches       2251     2684     +433     
============================================
+ Hits           6591     9703    +3112     
- Misses         4012     5128    +1116     
- Partials       1140     1349     +209     

@andygrove andygrove marked this pull request as ready for review January 15, 2026 23:36
# Conflicts:
#	docs/source/user-guide/latest/configs.md
#	native/spark-expr/src/comet_scalar_funcs.rs
@andygrove andygrove marked this pull request as draft January 30, 2026 01:48
andygrove (Member, Author) commented

Moving this to draft until #3328 is merged

andygrove and others added 2 commits February 10, 2026 11:07
Move array_position tests from CometArrayExpressionSuite to a SQL file
test and fall back to Spark when all arguments are literals.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@andygrove andygrove marked this pull request as ready for review February 11, 2026 00:08

Development

Successfully merging this pull request may close these issues.

  • [Feature] Support Spark expression: array_position
  • [Feature] Support Spark expression: length_of_json_array

2 participants