Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable numeric sort optimization support for all numeric types #6424

Merged
merged 17 commits into from
Mar 27, 2023

Conversation

gashutos
Copy link
Contributor

@gashutos gashutos commented Feb 21, 2023

Description

Enable sort optimization for all remaining Numeric types which were not covered in our last PR 6321
The main issue supporting these remaining NumericTypes was its merging capabilities during field mapping change for indices. That is resolved in this PR by introducing wider size comparator during merge.

Below remainig types were enabled as a part of this PR.

  1. Boolean
  2. Bytes
  3. Short
  4. Int
  5. Half_float (it always uses custom comparator though)
  6. Float

The performance gain we had with this is very huge, almost 10x gain on server side latency comparison. I have listed down few benchmarks in testing section.

Performance improvement

Below is server side latency comparison. Units are in ms.
Dataset as of now I have are smaller for float & short types, but improvement in 'x' wise is pretty wide.

NumericType New sort latency Old sort latency OSB workload
SHORT 13 130 eventdata
INT 150 1411 http_logs
FLOAT 15 240 noaa

http_logs is having largest ingested data, so latency we have there higher compare to other workload during sorting. But in case of relative latency w.r.t w/out sort optimization in 'x' multiple wise gives us similar improvement.

Issues Resolved

6326

Tested

  • Added extensive unit tests to check if merge works properly in case indices have different numeric types sorted.
  • Tested creating managed service domain and did performance testing, below were the results.
  • Reindexed indices with different mapping and checked things are working fine with optimization gains.

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: gashutos <gashutos@amazon.com>
Signed-off-by: gashutos <gashutos@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

…ling

Signed-off-by: gashutos <gashutos@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: gashutos <gashutos@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.indices.replication.SegmentReplicationAllocationIT.testAllocationWithDisruption

@dblock
Copy link
Member

dblock commented Mar 27, 2023

I resolved the CHANGELOG conflict. Code LGTM, let's merge on green.

@gashutos
Copy link
Contributor Author

I resolved the CHANGELOG conflict. Code LGTM, let's merge on green.

Thanks @dblock !

Copy link
Collaborator

@nknize nknize left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Just a couple nits.

@nknize
Copy link
Collaborator

nknize commented Mar 27, 2023

Unrelated precommit failure:

Execution failed for task ':modules:lang-expression:compileTestJava'.

Refired precommit job.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: gashutos <gashutos@amazon.com>
Copy link
Collaborator

@nknize nknize left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once green! Nice work!

@gashutos
Copy link
Contributor Author

LGTM once green! Nice work!

Awesome , thanks Nick !

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@dblock dblock merged commit e8425fc into opensearch-project:main Mar 27, 2023
@dblock dblock added the backport 2.x Backport to 2.x branch label Mar 27, 2023
@dblock
Copy link
Member

dblock commented Mar 27, 2023

@gashutos watch the auto-backport (might need to be done manually if it fails)

opensearch-trigger-bot bot pushed a commit that referenced this pull request Mar 27, 2023
* Adding numeric optimization support for all numeric types

Signed-off-by: gashutos <gashutos@amazon.com>

* modifying CHANGELOG.md

Signed-off-by: gashutos <gashutos@amazon.com>

* Handling multi-cluster scenario where SortField serialization was failing

Signed-off-by: gashutos <gashutos@amazon.com>

* Fixing javadoc errors

Signed-off-by: gashutos <gashutos@amazon.com>

* Fixing nested sort integ tests

Signed-off-by: gashutos <gashutos@amazon.com>

* Stremlining behaviour of custom comparator tests too

Signed-off-by: gashutos <gashutos@amazon.com>

* Adding more integ tests for IntValuesComparatorSource & fixing few ITs

Signed-off-by: gashutos <gashutos@amazon.com>

* Fixing few more integ tests

Signed-off-by: gashutos <gashutos@amazon.com>

* Streamlining applySortWidening method with CreateSort and avoid modifying cteated objects of sort

Signed-off-by: gashutos <gashutos@amazon.com>

* Correcting licence header

Signed-off-by: gashutos <gashutos@amazon.com>

---------

Signed-off-by: gashutos <gashutos@amazon.com>
Co-authored-by: Daniel (dB.) Doubrovkine <dblock@amazon.com>
(cherry picked from commit e8425fc)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
reta pushed a commit that referenced this pull request Mar 28, 2023
#6849)

* Adding numeric optimization support for all numeric types



* modifying CHANGELOG.md



* Handling multi-cluster scenario where SortField serialization was failing



* Fixing javadoc errors



* Fixing nested sort integ tests



* Stremlining behaviour of custom comparator tests too



* Adding more integ tests for IntValuesComparatorSource & fixing few ITs



* Fixing few more integ tests



* Streamlining applySortWidening method with CreateSort and avoid modifying cteated objects of sort



* Correcting licence header



---------



(cherry picked from commit e8425fc)

Signed-off-by: gashutos <gashutos@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Daniel (dB.) Doubrovkine <dblock@amazon.com>
mitrofmep pushed a commit to mitrofmep/OpenSearch that referenced this pull request Apr 5, 2023
…earch-project#6424)

* Adding numeric optimization support for all numeric types

Signed-off-by: gashutos <gashutos@amazon.com>

* modifying CHANGELOG.md

Signed-off-by: gashutos <gashutos@amazon.com>

* Handling multi-cluster scenario where SortField serialization was failing

Signed-off-by: gashutos <gashutos@amazon.com>

* Fixing javadoc errors

Signed-off-by: gashutos <gashutos@amazon.com>

* Fixing nested sort integ tests

Signed-off-by: gashutos <gashutos@amazon.com>

* Stremlining behaviour of custom comparator tests too

Signed-off-by: gashutos <gashutos@amazon.com>

* Adding more integ tests for IntValuesComparatorSource & fixing few ITs

Signed-off-by: gashutos <gashutos@amazon.com>

* Fixing few more integ tests

Signed-off-by: gashutos <gashutos@amazon.com>

* Streamlining applySortWidening method with CreateSort and avoid modifying cteated objects of sort

Signed-off-by: gashutos <gashutos@amazon.com>

* Correcting licence header

Signed-off-by: gashutos <gashutos@amazon.com>

---------

Signed-off-by: gashutos <gashutos@amazon.com>
Co-authored-by: Daniel (dB.) Doubrovkine <dblock@amazon.com>
Signed-off-by: Valentin Mitrofanov <mitrofmep@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants