Avoid handling JSON_ARRAY as multi value JSON during transformation #14738
The transform pipeline fails with an `ArrayIndexOutOfBoundsException` when it encounters a JSON column whose value is an empty JSON array, because a JSON value is not standardized (empty array -> null): https://github.com/apache/pinot/blob/master/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/DataTypeTransformer.java#L97

Earlier, an empty array for the JSON data type was extracted as a string. It is now extracted as `Object[]` due to the change at https://github.com/apache/pinot/pull/14547/files#diff-7ac5349f9d75e27a62a063dbf81db3ed30c8de052b4ffa7719187e4babaa60baR66, which leads to `isMultiValue` returning true for an empty JSON array. `convertMultiValue` returns `Object[]`, while `convertSingleValue` returns a string: https://github.com/apache/pinot/blob/master/pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/BaseRecordExtractor.java#L39
This PR updates the transform logic to handle an empty array for the JSON data type.
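
The failure mode can be illustrated with a minimal, standalone sketch. It is not Pinot's code: the class `JsonArraySketch`, its methods, and the `isJsonColumn` flag are hypothetical names chosen for illustration, and mapping an empty array to the string `"[]"` is an assumption about one reasonable guard, not necessarily what this PR does.

```java
// Hypothetical stand-in for a transform step; none of these names come from Pinot.
final class JsonArraySketch {
  private JsonArraySketch() {
  }

  // Treats any Object[] as multi-value, so an empty JSON array also counts as multi-value.
  static boolean isMultiValue(Object value) {
    return value instanceof Object[];
  }

  // Naive single-value conversion: unwraps the first element of an array.
  // For new Object[0] the index access throws ArrayIndexOutOfBoundsException.
  static Object naiveToSingleValue(Object value) {
    if (isMultiValue(value)) {
      return ((Object[]) value)[0];
    }
    return value;
  }

  // Guarded conversion (assumption): for a JSON column, keep an empty array
  // as the JSON text "[]" instead of indexing into it.
  static Object guardedToSingleValue(Object value, boolean isJsonColumn) {
    if (isJsonColumn && value instanceof Object[] && ((Object[]) value).length == 0) {
      return "[]";
    }
    return naiveToSingleValue(value);
  }
}
```

Calling `JsonArraySketch.naiveToSingleValue(new Object[0])` throws, while `JsonArraySketch.guardedToSingleValue(new Object[0], true)` returns a string, which matches the single-value shape a JSON column expects.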