
Add missing data types to IngestDocument deep copy #14380

Merged
merged 1 commit into opensearch-project:main from andrross:ingest-script-data-types on Jun 17, 2024

Conversation

andrross
Member

PR #11725 added a new deep copy in the ScriptProcessor flow. If a script uses a Short or Byte data type then this new deep copy introduced a regression. This commit fixes that regression.

However, it appears there has been an existing bug where using a Character type in the same way will fail (this failed before PR 11725). The failure is different, and appears to be related to something deeper in the XContent serialization layer. For now, I have fixed the regression but not yet dug into the failure with the Character data type. I have added a test that expects this failure.

Resolves #14379
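To illustrate the shape of the fix: assuming the copy routine dispatches on runtime type (the actual IngestDocument implementation differs), the regression amounts to Byte and Short missing from the list of immutable scalars the copy knows how to handle. This is a minimal standalone sketch, not OpenSearch code; all names are illustrative:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DeepCopySketch {
    // Illustrative deep copy: recurse into containers, share immutable scalars.
    static Object deepCopy(Object value) {
        if (value == null) {
            return null;
        } else if (value instanceof Map) {
            Map<Object, Object> copy = new HashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                copy.put(e.getKey(), deepCopy(e.getValue()));
            }
            return copy;
        } else if (value instanceof List) {
            List<Object> copy = new ArrayList<>();
            for (Object o : (List<?>) value) {
                copy.add(deepCopy(o));
            }
            return copy;
        } else if (value instanceof byte[]) {
            // Arrays are mutable, so they must be cloned.
            return ((byte[]) value).clone();
        } else if (value instanceof String || value instanceof Integer || value instanceof Long
                || value instanceof Float || value instanceof Double || value instanceof Boolean
                // The regression: types like Byte and Short were missing from this list.
                || value instanceof Byte || value instanceof Short || value instanceof Character) {
            // Immutable boxed values are safe to share between copies.
            return value;
        }
        throw new IllegalArgumentException("unexpected value type [" + value.getClass() + "]");
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("byteField", (byte) 1);
        doc.put("shortField", (short) 2);
        Object copy = deepCopy(doc);
        System.out.println(copy.equals(doc) && copy != doc); // prints "true"
    }
}
```

Because boxed primitives are immutable, returning them unchanged is safe; only containers (and mutable arrays such as byte[]) need a real copy, which is why the unhandled-type branch throws rather than sharing unknown objects.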

Check List

  • Functionality includes testing.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

PR opensearch-project#11725 added a new deep copy in the ScriptProcessor flow. If a script
uses a Short or Byte data type then this new deep copy introduced a
regression. This commit fixes that regression.

However, it appears there has been an existing bug where using a
Character type in the same way will fail (this failed before PR 11725).
The failure is different, and appears to be related to something deeper
in the XContent serialization layer. For now, I have fixed the
regression but not yet dug into the failure with the Character data
type. I have added a test that expects this failure.

Resolves opensearch-project#14379

Signed-off-by: Andrew Ross <andrross@amazon.com>
@andrross added the backport 2.x (Backport to 2.x branch) label Jun 15, 2024
Contributor

❌ Gradle check result for dc0ccdc: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

❕ Gradle check result for 40b9e07: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.gateway.RecoveryFromGatewayIT.testMultipleReplicaShardAssignmentWithDelayedAllocationAndDifferentNodeStartTimeInBatchMode

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.


codecov bot commented Jun 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.68%. Comparing base (b15cb0c) to head (40b9e07).
Report is 433 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #14380      +/-   ##
============================================
+ Coverage     71.42%   71.68%   +0.26%     
- Complexity    59978    62066    +2088     
============================================
  Files          4985     5118     +133     
  Lines        282275   291829    +9554     
  Branches      40946    42188    +1242     
============================================
+ Hits         201603   209190    +7587     
- Misses        63999    65304    +1305     
- Partials      16673    17335     +662     

☔ View full report in Codecov by Sentry.

@vikasvb90
Contributor

vikasvb90 commented Jun 16, 2024

PR #11725 added a new deep copy in the ScriptProcessor flow.

No, it did not. If you look at the code again, you will see that the PR only reused an existing deepCopyMap. This means the issue has existed since the Byte data type was introduced in the processor without being reflected in the deep copy map, and surprisingly there were no tests to validate a byte field in an ingest document in the script processor.

Without my PR, the code would still break in toXContent in WriteableIngestDocument in the simulate flow.
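The separate Character failure discussed in this thread is consistent with a serializer that dispatches on runtime type and simply has no branch for Character. The following is a purely illustrative sketch of that failure mode, not the actual XContent code:

```java
public class XContentSketch {
    // Illustrative serializer with one branch per supported scalar type.
    static String writeValue(Object value) {
        if (value instanceof String) {
            return "\"" + value + "\"";
        } else if (value instanceof Integer || value instanceof Long
                || value instanceof Short || value instanceof Byte
                || value instanceof Float || value instanceof Double) {
            return value.toString();
        } else if (value instanceof Boolean) {
            return value.toString();
        }
        // Character has no branch above, so it falls through to the error case.
        throw new IllegalArgumentException("cannot write value of type [" + value.getClass() + "]");
    }

    public static void main(String[] args) {
        System.out.println(writeValue((short) 7)); // prints "7"
        try {
            writeValue('x');
        } catch (IllegalArgumentException e) {
            System.out.println("char failed as expected");
        }
    }
}
```

In a sketch like this, fixing the copy path alone cannot help a Character field: the value survives the copy but still reaches a serializer branch that does not exist.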

@andrross
Member Author

PR #11725 added a new deep copy in the ScriptProcessor flow.

No, it did not. If you look at the code again

@vikasvb90 My initial read of the code, and all subsequent readings, suggest that a new invocation of deep copy was added in the ScriptProcessor code flow that was not there before, which is why this fix is needed now. I'm happy to update the description if that is incorrect. I would also appreciate a review of this code so I can address any comments. Thanks!

@vikasvb90
Contributor

@andrross Yes, the invocation is new in the ingest flow. It wasn't there before. I thought you meant that I added the deep copy method itself. The code looks fine to me and I ran the REST tests myself. There are two things, though, which we can probably take in a separate PR as well.

  1. We may need tests for all data types on top of deepCopy method.
  2. Char data type will still not be fixed with this I think right? I can try digging in for it.

@andrross
Member Author

We may need tests for all data types on top of deepCopy method.

I intended to do this with the new IngestDocumentTests::testCopy method. Let me know if you think we need more coverage of that method.
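A test along the lines described, one document holding every scalar type, copied and compared field by field, could be sketched like this (a shallow HashMap copy stands in for the real deep copy under test; the class and field names are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

public class CopyAllTypesSketch {
    public static void main(String[] args) {
        Map<String, Object> source = new HashMap<>();
        source.put("string", "hello");
        source.put("int", 1);
        source.put("long", 2L);
        source.put("float", 3.0f);
        source.put("double", 4.0d);
        source.put("boolean", true);
        source.put("byte", (byte) 5);
        source.put("short", (short) 6);
        source.put("char", 'c');

        // Stand-in for the real deep copy under test.
        Map<String, Object> copy = new HashMap<>(source);

        // Every field must survive the copy with an equal value.
        for (Map.Entry<String, Object> e : source.entrySet()) {
            if (!e.getValue().equals(copy.get(e.getKey()))) {
                throw new AssertionError("mismatch for field [" + e.getKey() + "]");
            }
        }
        System.out.println("all fields preserved"); // prints "all fields preserved"
    }
}
```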

Char data type will still not be fixed with this I think right?

Correct. As far as I can tell this was broken in all previous versions as well. I opened #14382 to track it separately.

Thanks @vikasvb90!

@andrross andrross merged commit 112704b into opensearch-project:main Jun 17, 2024
33 of 34 checks passed
@andrross andrross deleted the ingest-script-data-types branch June 17, 2024 20:33
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jun 17, 2024
(cherry picked from commit 112704b)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
dblock pushed a commit that referenced this pull request Jun 18, 2024
(cherry picked from commit 112704b)

Signed-off-by: Andrew Ross <andrross@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
harshavamsi pushed a commit to harshavamsi/OpenSearch that referenced this pull request Jul 12, 2024

Signed-off-by: Andrew Ross <andrross@amazon.com>
kkewwei pushed a commit to kkewwei/OpenSearch that referenced this pull request Jul 24, 2024
…t#14380) (opensearch-project#14413)


(cherry picked from commit 112704b)

Signed-off-by: Andrew Ross <andrross@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Signed-off-by: kkewwei <kkewwei@163.com>
wdongyu pushed a commit to wdongyu/OpenSearch that referenced this pull request Aug 22, 2024

Signed-off-by: Andrew Ross <andrross@amazon.com>
Labels
  • backport 2.x: Backport to 2.x branch
  • bug: Something isn't working
  • Indexing: Indexing, Bulk Indexing and anything related to indexing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] ScriptProcessor fails with Byte and Short data types
3 participants