Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Array and ExprValue Parsing With Inner Hits #1737

Conversation

forestmvey
Copy link
Collaborator

@forestmvey forestmvey commented Jun 14, 2023

Description

With the addition of the nested function support in the V2 engine the inner_hits portion of the OS response needs to be parsed for OS types. Previously arrays were not entirely parsed due to lack of support.

Parsing Functionality

Parsing of each value in arrays needs to be supported for proper nested functionality. The following gives examples of the current response formats when parsing arrays.


Object Type - Array Of Objects

Dataset:
"obj": [{"key": "val1"}, {"key": "val2"}]

SQL Query:

SELECT obj FROM objects;

V1 Engine Response:
[{'key': 'val1'},{'key': 'val2'}]

V2 Engine Response:
{'key': 'val1'}


Nested Type - Array Of Objects

Dataset:
"obj": [{"key": "val1"}, {"key": "val2"}]

SQL Query:

SELECT obj FROM nested_objects;

V1 and V2 Engine Response:
[{'key': 'val1'},{'key': 'val2'}]


Array Of Values (Nested only supports objects and not arrays of values):

Dataset:
"obj": ["val1", "val2"]

SQL Query:

SELECT obj FROM objects;

V1 Engine Response:
['val1','val2']

V2 Engine Response:
{'val1'}


Issues Resolved

Issue: 1686

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@forestmvey forestmvey force-pushed the integ-support-inner-hits-exprvalue branch from 644ef89 to 71147ff Compare June 14, 2023 16:42
@forestmvey forestmvey changed the title Improve Array and ExprValue Parsing With Inner Hits Support Array and ExprValue Parsing With Inner Hits Jun 14, 2023
@codecov
Copy link

codecov bot commented Jun 14, 2023

Codecov Report

Merging #1737 (5d657c6) into main (c7dfdb3) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##               main    #1737   +/-   ##
=========================================
  Coverage     97.29%   97.30%           
- Complexity     4408     4432   +24     
=========================================
  Files           388      388           
  Lines         10944    10982   +38     
  Branches        774      782    +8     
=========================================
+ Hits          10648    10686   +38     
  Misses          289      289           
  Partials          7        7           
Flag Coverage Δ
sql-engine 97.30% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...earch/sql/opensearch/data/utils/ObjectContent.java 100.00% <100.00%> (ø)
...l/opensearch/data/utils/OpenSearchJsonContent.java 100.00% <100.00%> (ø)
...nsearch/data/value/OpenSearchExprValueFactory.java 100.00% <100.00%> (ø)
...ch/sql/opensearch/response/OpenSearchResponse.java 100.00% <100.00%> (ø)
...ensearch/storage/script/core/ExpressionScript.java 100.00% <100.00%> (ø)

Yury-Fridlyand
Yury-Fridlyand previously approved these changes Jun 14, 2023
GumpacG
GumpacG previously approved these changes Jun 14, 2023
acarbonetto
acarbonetto previously approved these changes Jun 15, 2023
@@ -21,6 +21,14 @@
"ignore_above": 256
}
}
},
"moveInDate" : {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did we want to test this with a nested type too?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a inner object to the nested object address. We have IT tests to test with the nested function in NestedIT.java#L373-L400. Is this what you were thinking of testing?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I think I was wondering if we had nested within nested.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we already have tests for nested in nested too.
NestedIT.java#L106-L118

JSONObject result = executeJdbcRequest(query);

assertEquals(11, result.getInt("total"));
verifyDataRows(result,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would be the data type of nested(address.moveInDate), object or array?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be of type object. address would be of type nested, I'll add the schema verification.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see. one more question.
some rows is JSONObject, some rows are JSONArray, how does JDBC handle it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some cases this breaks JDBC, but there is no change in the expected output with this PR. An example of this would be in ObjectFieldSelectIT where we do a select on an OBJECT type array, and a NESTED type array.

ObjectFieldSelectIT.java#L69-L93
people.json
people-mapping

When selecting on an array that is not of type NESTED, the first value in the array is returned and is functional with JDBC. As shown by the query:

SELECT accounts FROM people;

While on the other hand the query that does a basic SELECT on an NESTED type array returns the whole array which breaks JDBC.

Query:

SELECT projects FROM people;

A user can get around this by using the nested function to flatten the response for a correct JDBC output.

@forestmvey forestmvey dismissed stale reviews from acarbonetto, GumpacG, and Yury-Fridlyand via 5c97d43 June 21, 2023 21:06
)
)
),
tupleValueWithArraySupport(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ignore my comments below.
what if the data is? the expected stringV is [["zz", "au"], ["ss"], ["z"]], right?

{ "stringV": ["z", ["s"], ["zz", "au"]] }

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we execute on this:

tupleValueWithArraySupport("{"stringV": ["z", ["s"], ["zz", "au"]]}").get("stringV");

The resulting value is:

["z", ["s"], ["zz", "au"]]

In your example the order is reversed which is not the case. Does this answer your question?

Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
@forestmvey forestmvey force-pushed the integ-support-inner-hits-exprvalue branch from 0394fdd to 5d657c6 Compare June 22, 2023 21:07
} else if (content.isDouble()) {
return parse(content, prefix, Optional.of(OpenSearchDataType.of(DOUBLE)), supportArrays);
} else if (content.isNumber()) {
return parse(content, prefix, Optional.of(OpenSearchDataType.of(INTEGER)), supportArrays);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are using something with type of SHORT or BYTE does it get cast/treated as an INTEGER before this point or does it need to be supported?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Treated as an integer. No need to cast or treat differently.

@forestmvey forestmvey merged commit 9fbcf11 into opensearch-project:main Jun 27, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jun 27, 2023
* Add support for Array and ExprValue Parsing With Inner Hits

Signed-off-by: forestmvey <forestv@bitquilltech.com>

* Adding schema validation for IT test, and another UT for nested arrays.

Signed-off-by: forestmvey <forestv@bitquilltech.com>

* Making handleAggregationResponse a private function.

Signed-off-by: forestmvey <forestv@bitquilltech.com>

---------

Signed-off-by: forestmvey <forestv@bitquilltech.com>
(cherry picked from commit 9fbcf11)
forestmvey added a commit that referenced this pull request Jun 27, 2023
* Add support for Array and ExprValue Parsing With Inner Hits

Signed-off-by: forestmvey <forestv@bitquilltech.com>

* Adding schema validation for IT test, and another UT for nested arrays.

Signed-off-by: forestmvey <forestv@bitquilltech.com>

* Making handleAggregationResponse a private function.

Signed-off-by: forestmvey <forestv@bitquilltech.com>

---------

Signed-off-by: forestmvey <forestv@bitquilltech.com>
(cherry picked from commit 9fbcf11)
Signed-off-by: forestmvey <forestv@bitquilltech.com>
forestmvey added a commit that referenced this pull request Jun 27, 2023
* Add support for Array and ExprValue Parsing With Inner Hits

Signed-off-by: forestmvey <forestv@bitquilltech.com>

* Adding schema validation for IT test, and another UT for nested arrays.

Signed-off-by: forestmvey <forestv@bitquilltech.com>

* Making handleAggregationResponse a private function.

Signed-off-by: forestmvey <forestv@bitquilltech.com>

---------

Signed-off-by: forestmvey <forestv@bitquilltech.com>
(cherry picked from commit 9fbcf11)
Signed-off-by: forestmvey <forestv@bitquilltech.com>
@forestmvey forestmvey deleted the integ-support-inner-hits-exprvalue branch June 27, 2023 22:31
forestmvey added a commit that referenced this pull request Jun 28, 2023
* Add support for Array and ExprValue Parsing With Inner Hits



* Adding schema validation for IT test, and another UT for nested arrays.



* Making handleAggregationResponse a private function.



---------


(cherry picked from commit 9fbcf11)

Signed-off-by: forestmvey <forestv@bitquilltech.com>
Co-authored-by: Forest Vey <forestv@bitquilltech.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants