
Add resizable search queue to OpenSearch (picking up #826) #3207

Merged
merged 6 commits into from
May 16, 2022

Conversation

reta
Collaborator

@reta reta commented May 5, 2022

Description

Create a new "RESIZABLE" thread pool type to dynamically adjust the search queue size at runtime. The current thread pools can only be updated via opensearch.yml. Picking up the work from #826.
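
The listing below is presumably the output of the cat thread pool API (the exact request here is my assumption, not part of this PR), e.g.:

GET _cat/thread_pool?v&h=id,name,active,rejected,completed,size,type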

id                     name                active rejected completed size type
Zwz7ZFYIToOULW9DgFBdfw analyze                  0        0         0    1 fixed
Zwz7ZFYIToOULW9DgFBdfw fetch_shard_started      0        0         0      scaling
Zwz7ZFYIToOULW9DgFBdfw fetch_shard_store        0        0         0      scaling
Zwz7ZFYIToOULW9DgFBdfw flush                    0        0         0      scaling
Zwz7ZFYIToOULW9DgFBdfw force_merge              0        0         0    1 fixed
Zwz7ZFYIToOULW9DgFBdfw generic                  0        0       173      scaling
Zwz7ZFYIToOULW9DgFBdfw get                      0        0         0   12 fixed
Zwz7ZFYIToOULW9DgFBdfw listener                 0        0         0    6 fixed
Zwz7ZFYIToOULW9DgFBdfw management               1        0        21      scaling
Zwz7ZFYIToOULW9DgFBdfw refresh                  0        0         0      scaling
Zwz7ZFYIToOULW9DgFBdfw search                   0        0         0   19 resizable
Zwz7ZFYIToOULW9DgFBdfw search_throttled         0        0         0    1 resizable
Zwz7ZFYIToOULW9DgFBdfw snapshot                 0        0         0      scaling
Zwz7ZFYIToOULW9DgFBdfw system_read              0        0         0    5 fixed
Zwz7ZFYIToOULW9DgFBdfw system_write             0        0         0    5 fixed
Zwz7ZFYIToOULW9DgFBdfw warmer                   0        0         0      scaling
Zwz7ZFYIToOULW9DgFBdfw write                    0        0         0   12 fixed

This PR goes side by side with #2595: we are replacing the SEARCH_XXX pools with ones whose queue size can be adjusted at runtime. Right now this is not exposed to the outside world through an API, but plugins can make such adjustments.
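
As a rough illustration of the plugin-side usage (a minimal sketch: QueueResizableOpenSearchThreadPoolExecutor and resize(int) come from this PR, while the package path, class name, and surrounding wiring are my assumptions):

import java.util.concurrent.ExecutorService;

import org.opensearch.common.util.concurrent.QueueResizableOpenSearchThreadPoolExecutor;
import org.opensearch.threadpool.ThreadPool;

public final class SearchQueueResizer {
    /** Adjusts the search queue capacity at runtime, assuming the plugin has a ThreadPool instance injected. */
    public static void resizeSearchQueue(ThreadPool threadPool, int newCapacity) {
        ExecutorService executor = threadPool.executor(ThreadPool.Names.SEARCH);
        if (executor instanceof QueueResizableOpenSearchThreadPoolExecutor) {
            // resize(int) adjusts the work queue capacity of the pool at runtime
            ((QueueResizableOpenSearchThreadPoolExecutor) executor).resize(newCapacity);
        }
    }
}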

Issues Resolved

Closes #476

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

rguo-aws added 2 commits May 6, 2022 04:11
Signed-off-by: Ruizhen <ruizhen@amazon.com>
Signed-off-by: Ruizhen <ruizhen@amazon.com>
@reta reta changed the title Add resizable write/search queue to OpenSearch (picking up #826) Add resizable search queue to OpenSearch (picking up #826) May 5, 2022
@reta reta added the v3.0.0 Issues and PRs related to version 3.0.0 label May 5, 2022
@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 5a2545794f668c9c2fe95b24f8586cac2c81f620
Log 5050

Reports 5050

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 15ff5f07270ce4fa52e2d295e983758233f05b4f
Log 5052

Reports 5052

@peterzhuamazon
Member

start gradle check

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 15ff5f07270ce4fa52e2d295e983758233f05b4f
Log 5066

Reports 5066

@dreamer-89
Member

start gradle check

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 15ff5f07270ce4fa52e2d295e983758233f05b4f
Log 5071

Reports 5071

@opensearch-ci-bot
Collaborator

✅   Gradle Check success 5f64ad05fb82d3cfd9deeedc2cc37a8195b88f69
Log 5081

Reports 5081

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure edff7a9965c997939a9860b59fd0b465200bab9a
Log 5086

Reports 5086

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 67df4c756ecc17b6144a8ee07b4156b46e4254f6
Log 5087

Reports 5087

@reta reta marked this pull request as ready for review May 6, 2022 16:49
@reta reta requested a review from a team as a code owner May 6, 2022 16:49
Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 2a15da8b7bda031a6f47f7d6fb156b7b3bdb650b
Log 5092

Reports 5092

@opensearch-ci-bot
Collaborator

✅   Gradle Check success 44c01c0
Log 5094

Reports 5094

@saratvemulapalli saratvemulapalli self-requested a review May 6, 2022 19:26
@saratvemulapalli saratvemulapalli dismissed their stale review May 6, 2022 19:26

Accident :)

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
@reta
Collaborator Author

reta commented May 6, 2022

I have some minor comments.

Let's spell out the side effect of this change for the user in the PR description? How does one use this?

Thanks @dblock!

Let's spell out the side effect of this change for the user in the PR description?

This PR goes side by side with #2595: we are replacing the SEARCH_XXX pools with ones whose queue size can be adjusted at runtime. Right now this is not exposed to the outside world through an API, but plugins can make such adjustments.

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure cd0f9e4
Log 5114

Reports 5114

@reta
Collaborator Author

reta commented May 7, 2022

start gradle check

@opensearch-ci-bot
Collaborator

❌   Gradle Check failure cd0f9e4
Log 5122

Reports 5122

@reta
Collaborator Author

reta commented May 7, 2022

start gradle check

@opensearch-ci-bot
Collaborator

✅   Gradle Check success cd0f9e4
Log 5126

Reports 5126

@dblock dblock requested a review from Bukhtawar May 10, 2022 15:28
Comment on lines 15 to 17
// This is a random starting point alpha. TODO: revisit this with actual testing and/or make it configurable
double EWMA_ALPHA = 0.3;

Collaborator

Should this be configurable, maybe more of an expert setting?

Collaborator Author

This is the default, but I will add the constructors to allow configuration

Collaborator Author

Done
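
For illustration, the "default plus overriding constructor" pattern being described could look roughly like this (a self-contained sketch with made-up names, not the merged OpenSearch code):

public class EwmaTrackingExample {
    // Default alpha, matching the "random starting point" discussed above
    public static final double DEFAULT_EWMA_ALPHA = 0.3;

    private final double ewmaAlpha;

    public EwmaTrackingExample() {
        this(DEFAULT_EWMA_ALPHA); // previous behaviour stays the default
    }

    public EwmaTrackingExample(double ewmaAlpha) {
        this.ewmaAlpha = ewmaAlpha; // expert settings may supply their own value
    }

    public double ewmaAlpha() {
        return ewmaAlpha;
    }
}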

) {

if (queueCapacity <= 0) {
throw new IllegalArgumentException("queue capacity for [" + name + "] executor must be positive, got: " + queueCapacity);
Collaborator

Maybe consider mentioning 0 in the exception

Collaborator Author

Sorry, didn't get this one: positive means > 0, and 0 is not an acceptable value. Makes sense?

Comment on lines +23 to +26
public final class QueueResizableOpenSearchThreadPoolExecutor extends OpenSearchThreadPoolExecutor
implements
EWMATrackingThreadPoolExecutor {

Collaborator

nit: formatting

Collaborator Author

Not me - spotless

* Resizes the work queue capacity of the pool
* @param capacity the new capacity
*/
public synchronized int resize(int capacity) {
Collaborator

For my understanding, who calls resize?

Collaborator Author

AFAIK that could be done from the plugin(s), as per the attached issue

Collaborator

Should the plugin have the capability to override the resize logic? Do you think we could expose a contract?

Collaborator Author

@reta reta May 12, 2022

I believe this is the whole purpose of the issue and the change (I am finalizing #826 since that pull request was abandoned). From my own perspective, it could be useful in certain cases since thread pools are otherwise not adjustable at runtime.
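
As a purely illustrative sketch of the underlying idea (a toy, self-contained class, not the queue implementation merged here), a queue whose capacity bound can change at runtime could look like this:

import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

final class ResizableCapacityQueue<E> {
    private final ConcurrentLinkedQueue<E> delegate = new ConcurrentLinkedQueue<>();
    private final AtomicInteger capacity;
    private final AtomicInteger size = new AtomicInteger();

    ResizableCapacityQueue(int initialCapacity) {
        this.capacity = new AtomicInteger(initialCapacity);
    }

    boolean offer(E element) {
        // Reject new work once the current capacity bound is reached
        if (size.incrementAndGet() > capacity.get()) {
            size.decrementAndGet();
            return false;
        }
        return delegate.offer(element);
    }

    E poll() {
        E element = delegate.poll();
        if (element != null) {
            size.decrementAndGet();
        }
        return element;
    }

    int resize(int newCapacity) {
        // Already-queued elements stay even if the bound shrinks below the current size
        capacity.set(newCapacity);
        return newCapacity;
    }
}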

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
@opensearch-ci-bot
Collaborator

❌   Gradle Check failure 3dfe946
Log 5259

Reports 5259

@reta
Collaborator Author

reta commented May 12, 2022

x Gradle Check failure 3dfe946 Log 5259

Reports 5259

#1703

@reta
Collaborator Author

reta commented May 12, 2022

start gradle check

@opensearch-ci-bot
Collaborator

✅   Gradle Check success 3dfe946
Log 5260

Reports 5260

@reta
Collaborator Author

reta commented May 13, 2022

@saratvemulapalli anything left on your side? thank you!

@saratvemulapalli
Member

@saratvemulapalli anything left on your side? thank you!

Thanks @reta, no I don't have anything else.
I was just trying to read the PR and understand the context :)

@dblock dblock merged commit 38fb1d9 into opensearch-project:main May 16, 2022
@dblock
Member

dblock commented May 16, 2022

@reta Any reason not to backport to 2.x? No breaking changes here AFAIK.

@reta
Collaborator Author

reta commented May 16, 2022

@reta Any reason not to backport to 2.x? No breaking changes here AFAIK.

It changes the pool behind SEARCH and SEARCH_THROTTLED; @andrross and I considered that a breaking change in #2595. I am about to run the benchmarks to evaluate the impact (if any). We could target 2.1.0; for 2.0.0 it would be great to backport after the benchmarking, wdyt?

@dblock
Member

dblock commented May 16, 2022

If it's a net performance improvement without any visible backwards incompatible changes, then I don't see why not. Breaking changes are only about user experience, APIs, interfaces.

@reta
Collaborator Author

reta commented May 16, 2022

If it's a net performance improvement without any visible backwards incompatible changes, then I don't see why not. Breaking changes are only about user experience, APIs, interfaces.

Got it, thanks @dblock, I will run the benchmarks shortly and update the issue on backport plans, thanks!

@reta
Collaborator Author

reta commented May 25, 2022

If it's a net performance improvement without any visible backwards incompatible changes, then I don't see why not. Breaking changes are only about user experience, APIs, interfaces.

Got it, thanks @dblock, I will run the benchmarks shortly and update the issue on backport plans, thanks!

@dblock sorry for the delay, I finally ran the tests I wanted. Regarding

... any visible backwards incompatible changes,

There is only one: removal of the deprecated properties for the SEARCH / SEARCH_THROTTLED pools [1].

thread_pool:
    search:
        size: 30
        queue_size: 500
        min_queue_size: 10  --> removed
        max_queue_size: 1000  --> removed
        auto_queue_frame_size: 2000  --> removed
        target_response_time: 1s  --> removed
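
For illustration, a configuration that remains valid after this change would presumably keep only the non-deprecated settings, e.g.:

thread_pool:
    search:
        size: 30
        queue_size: 500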

If you think this is a backward compatibility concern, we could still bring the change to 2.1.0 but without changing SEARCH / SEARCH_THROTTLED pool types. Would it make sense?

Regarding performance, no regressions were shown by the benchmarks:

  • nyc_taxis (search and aggregation only)
Metric Task 2.0.0 3.0.0 Diff Unit
... ... ... ... ... ...
Min Throughput default 3.01603 3.01586 -0.00017 ops/s
Mean Throughput default 3.02609 3.02584 -0.00025 ops/s
Median Throughput default 3.02374 3.02358 -0.00015 ops/s
Max Throughput default 3.04603 3.04554 -0.00048 ops/s
50th percentile latency default 10.512 10.9957 0.48375 ms
90th percentile latency default 12.1891 12.6665 0.47739 ms
99th percentile latency default 12.8956 13.5526 0.657 ms
100th percentile latency default 12.9211 13.565 0.64393 ms
50th percentile service time default 8.76475 9.15046 0.38571 ms
90th percentile service time default 10.3444 10.6486 0.30422 ms
99th percentile service time default 10.8272 11.174 0.34685 ms
100th percentile service time default 10.8565 11.3419 0.48544 ms
error rate default 0 0 0 %
Min Throughput range 0.703045 0.703475 0.00043 ops/s
Mean Throughput range 0.705008 0.70571 0.0007 ops/s
Median Throughput range 0.70456 0.705193 0.00063 ops/s
Max Throughput range 0.709045 0.710306 0.00126 ops/s
50th percentile latency range 418.294 190.774 -227.52 ms
90th percentile latency range 454.41 213.684 -240.726 ms
99th percentile latency range 563.515 225.804 -337.711 ms
100th percentile latency range 565.653 230.017 -335.636 ms
50th percentile service time range 416.174 188.381 -227.793 ms
90th percentile service time range 451.55 210.951 -240.598 ms
99th percentile service time range 561.716 223.472 -338.245 ms
100th percentile service time range 564.861 227.001 -337.86 ms
error rate range 0 0 0 %
Min Throughput distance_amount_agg 2.0118 2.01191 0.00012 ops/s
Mean Throughput distance_amount_agg 2.01942 2.0196 0.00018 ops/s
Median Throughput distance_amount_agg 2.01764 2.01781 0.00016 ops/s
Max Throughput distance_amount_agg 2.03494 2.03518 0.00025 ops/s
50th percentile latency distance_amount_agg 8.43376 7.63267 -0.80109 ms
90th percentile latency distance_amount_agg 9.05376 8.67633 -0.37743 ms
99th percentile latency distance_amount_agg 9.87221 9.10477 -0.76745 ms
100th percentile latency distance_amount_agg 9.96799 9.44583 -0.52216 ms
50th percentile service time distance_amount_agg 6.33497 6.05554 -0.27943 ms
90th percentile service time distance_amount_agg 6.68171 6.60702 -0.07468 ms
99th percentile service time distance_amount_agg 7.09549 7.39716 0.30166 ms
100th percentile service time distance_amount_agg 7.21325 7.46483 0.25157 ms
error rate distance_amount_agg 0 0 0 %
Min Throughput autohisto_agg 1.49243 1.49562 0.00319 ops/s
Mean Throughput autohisto_agg 1.49578 1.49757 0.0018 ops/s
Median Throughput autohisto_agg 1.49616 1.4978 0.00164 ops/s
Max Throughput autohisto_agg 1.49742 1.49851 0.00109 ops/s
50th percentile latency autohisto_agg 466.632 485.789 19.1568 ms
90th percentile latency autohisto_agg 486.674 539.443 52.7692 ms
99th percentile latency autohisto_agg 523.54 553.441 29.901 ms
100th percentile latency autohisto_agg 539.967 555.939 15.9721 ms
50th percentile service time autohisto_agg 465.333 484.183 18.8497 ms
90th percentile service time autohisto_agg 485.29 537.177 51.8877 ms
99th percentile service time autohisto_agg 521.557 552.085 30.5282 ms
100th percentile service time autohisto_agg 537.714 553.973 16.2587 ms
error rate autohisto_agg 0 0 0 %
Min Throughput date_histogram_agg 1.50285 1.50154 -0.00131 ops/s
Mean Throughput date_histogram_agg 1.50462 1.5025 -0.00212 ops/s
Median Throughput date_histogram_agg 1.50422 1.50228 -0.00194 ops/s
Max Throughput date_histogram_agg 1.50816 1.50442 -0.00375 ops/s
50th percentile latency date_histogram_agg 479.104 490.881 11.7763 ms
90th percentile latency date_histogram_agg 512.327 551.553 39.2264 ms
99th percentile latency date_histogram_agg 555.329 574.225 18.8962 ms
100th percentile latency date_histogram_agg 558.072 581.807 23.7346 ms
50th percentile service time date_histogram_agg 477.227 489.498 12.2716 ms
90th percentile service time date_histogram_agg 511.139 550.112 38.9724 ms
99th percentile service time date_histogram_agg 553.898 572.41 18.5116 ms
100th percentile service time date_histogram_agg 556.489 580.048 23.5587 ms
error rate date_histogram_agg 0 0 0 %
  • http_logs (search and aggregation only)
Metric Task 2.0.0 3.0.0 Diff Unit
... ... ... ... ... ...
Min Throughput default 8.00221 8.00439 0.00218 ops/s
Mean Throughput default 8.00248 8.00482 0.00234 ops/s
Median Throughput default 8.00249 8.00481 0.00232 ops/s
Max Throughput default 8.00274 8.00525 0.0025 ops/s
50th percentile latency default 7.68398 8.37626 0.69229 ms
90th percentile latency default 9.17801 8.9582 -0.21981 ms
99th percentile latency default 9.99378 9.47531 -0.51848 ms
100th percentile latency default 10.0425 9.88431 -0.15818 ms
50th percentile service time default 6.21694 6.78643 0.56949 ms
90th percentile service time default 6.96319 7.13032 0.16714 ms
99th percentile service time default 7.52822 7.43142 -0.09681 ms
100th percentile service time default 8.12879 7.62242 -0.50637 ms
error rate default 0 0 0 %
Min Throughput term 49.7882 49.8224 0.0342 ops/s
Mean Throughput term 49.7966 49.8298 0.03319 ops/s
Median Throughput term 49.7966 49.8298 0.03319 ops/s
Max Throughput term 49.8051 49.8373 0.03217 ops/s
50th percentile latency term 6.54982 12.4655 5.9157 ms
90th percentile latency term 12.9127 13.3714 0.45871 ms
99th percentile latency term 14.8837 15.3168 0.43316 ms
100th percentile latency term 27.8102 16.4798 -11.3304 ms
50th percentile service time term 5.35667 11.0563 5.69963 ms
90th percentile service time term 11.377 11.8741 0.49705 ms
99th percentile service time term 12.3291 13.7963 1.46724 ms
100th percentile service time term 26.5797 14.8792 -11.7005 ms
error rate term 0 0 0 %
Min Throughput range 1.00463 1.00476 0.00014 ops/s
Mean Throughput range 1.00641 1.00659 0.00018 ops/s
Median Throughput range 1.00616 1.00634 0.00018 ops/s
Max Throughput range 1.00921 1.00947 0.00027 ops/s
50th percentile latency range 13.1732 13.2035 0.03027 ms
90th percentile latency range 16.8576 17.1525 0.29494 ms
99th percentile latency range 18.4478 18.036 -0.41177 ms
100th percentile latency range 20.6701 21.8114 1.14129 ms
50th percentile service time range 10.7207 10.8275 0.10689 ms
90th percentile service time range 14.7411 14.6425 -0.09859 ms
99th percentile service time range 15.5539 15.6802 0.12634 ms
100th percentile service time range 18.6388 19.019 0.38021 ms
error rate range 0 0 0 %
Min Throughput 200s-in-range 32.8873 32.9331 0.04576 ops/s
Mean Throughput 200s-in-range 32.8941 32.9369 0.04281 ops/s
Median Throughput 200s-in-range 32.894 32.9371 0.0431 ops/s
Max Throughput 200s-in-range 32.9008 32.9404 0.03958 ops/s
50th percentile latency 200s-in-range 12.4071 14.643 2.23589 ms
90th percentile latency 200s-in-range 16.3007 15.7014 -0.59928 ms
99th percentile latency 200s-in-range 18.1677 16.3358 -1.83186 ms
100th percentile latency 200s-in-range 19.0266 16.4531 -2.5735 ms
50th percentile service time 200s-in-range 10.602 12.9783 2.37633 ms
90th percentile service time 200s-in-range 14.5145 14.0507 -0.46381 ms
99th percentile service time 200s-in-range 16.1185 14.4248 -1.69374 ms
100th percentile service time 200s-in-range 17.0706 14.901 -2.1696 ms
error rate 200s-in-range 0 0 0 %
Min Throughput 400s-in-range 49.9987 49.903 -0.09565 ops/s
Mean Throughput 400s-in-range 49.999 49.9065 -0.0925 ops/s
Median Throughput 400s-in-range 49.9989 49.9065 -0.09239 ops/s
Max Throughput 400s-in-range 49.9995 49.91 -0.08946 ops/s
50th percentile latency 400s-in-range 8.18514 8.94803 0.76289 ms
90th percentile latency 400s-in-range 9.54274 9.5797 0.03696 ms
99th percentile latency 400s-in-range 10.306 9.96981 -0.33623 ms
100th percentile latency 400s-in-range 10.4425 15.2209 4.77837 ms
50th percentile service time 400s-in-range 6.77338 7.42973 0.65635 ms
90th percentile service time 400s-in-range 7.83353 7.81598 -0.01754 ms
99th percentile service time 400s-in-range 8.34765 8.37274 0.02509 ms
100th percentile service time 400s-in-range 8.42711 13.0866 4.65947 ms
error rate 400s-in-range 0 0 0 %
Min Throughput hourly_agg 0.200447 0.200423 -2e-05 ops/s
Mean Throughput hourly_agg 0.200618 0.200585 -3e-05 ops/s
Median Throughput hourly_agg 0.200594 0.200562 -3e-05 ops/s
Max Throughput hourly_agg 0.200886 0.200839 -5e-05 ops/s
50th percentile latency hourly_agg 2546.73 2657.02 110.296 ms
90th percentile latency hourly_agg 2592.32 2703.39 111.076 ms
99th percentile latency hourly_agg 2681.31 2740.59 59.2801 ms
100th percentile latency hourly_agg 2689.59 2749.19 59.6026 ms
50th percentile service time hourly_agg 2543.37 2655.01 111.637 ms
90th percentile service time hourly_agg 2588.75 2701.26 112.506 ms
99th percentile service time hourly_agg 2678.43 2736.52 58.0876 ms
100th percentile service time hourly_agg 2687 2746.34 59.3386 ms
error rate hourly_agg 0 0 0 %
Min Throughput scroll 25.0507 25.0527 0.00201 pages/s
Mean Throughput scroll 25.0834 25.0868 0.00339 pages/s
Median Throughput scroll 25.076 25.079 0.00309 pages/s
Max Throughput scroll 25.1513 25.1574 0.00617 pages/s
50th percentile latency scroll 234.563 245.951 11.388 ms
90th percentile latency scroll 265.414 287.358 21.9441 ms
99th percentile latency scroll 311.755 326.597 14.8421 ms
100th percentile latency scroll 330.247 351.546 21.2992 ms
50th percentile service time scroll 231.239 242.784 11.5456 ms
90th percentile service time scroll 262.547 284.44 21.8924 ms
99th percentile service time scroll 308.578 324.156 15.5784 ms
100th percentile service time scroll 327.591 348.721 21.1297 ms
error rate scroll 0 0 0 %
Min Throughput desc_sort_timestamp 0.501015 0.501025 1e-05 ops/s
Mean Throughput desc_sort_timestamp 0.501232 0.501245 1e-05 ops/s
Median Throughput desc_sort_timestamp 0.501215 0.501228 1e-05 ops/s
Max Throughput desc_sort_timestamp 0.501515 0.501531 2e-05 ops/s
50th percentile latency desc_sort_timestamp 652.535 683.276 30.7416 ms
90th percentile latency desc_sort_timestamp 675.574 703.809 28.2351 ms
99th percentile latency desc_sort_timestamp 714.458 764.219 49.7612 ms
100th percentile latency desc_sort_timestamp 716.489 772.286 55.7966 ms
50th percentile service time desc_sort_timestamp 649.752 680.292 30.5407 ms
90th percentile service time desc_sort_timestamp 673.166 701.23 28.0636 ms
99th percentile service time desc_sort_timestamp 711.741 761.242 49.5004 ms
100th percentile service time desc_sort_timestamp 714.197 769.173 54.9769 ms
error rate desc_sort_timestamp 0 0 0 %
Min Throughput asc_sort_timestamp 0.50164 0.501641 0 ops/s
Mean Throughput asc_sort_timestamp 0.501992 0.501993 0 ops/s
Median Throughput asc_sort_timestamp 0.501965 0.501965 0 ops/s
Max Throughput asc_sort_timestamp 0.502453 0.502455 0 ops/s
50th percentile latency asc_sort_timestamp 30.0054 35.0532 5.04773 ms
90th percentile latency asc_sort_timestamp 33.7782 52.3356 18.5575 ms
99th percentile latency asc_sort_timestamp 57.08 58.5949 1.51486 ms
100th percentile latency asc_sort_timestamp 57.1303 60.0232 2.8929 ms
50th percentile service time asc_sort_timestamp 27.0778 31.8819 4.80416 ms
90th percentile service time asc_sort_timestamp 30.375 48.8441 18.4692 ms
99th percentile service time asc_sort_timestamp 53.881 54.9683 1.08728 ms
100th percentile service time asc_sort_timestamp 54.0996 56.4284 2.32881 ms
error rate asc_sort_timestamp 0 0 0 %
Min Throughput desc_sort_with_after_timestamp 0.502476 0.502214 -0.00026 ops/s
Mean Throughput desc_sort_with_after_timestamp 0.506505 0.505812 -0.00069 ops/s
Median Throughput desc_sort_with_after_timestamp 0.504518 0.504044 -0.00047 ops/s
Max Throughput desc_sort_with_after_timestamp 0.525909 0.523097 -0.00281 ops/s
50th percentile latency desc_sort_with_after_timestamp 866.508 952.86 86.3517 ms
90th percentile latency desc_sort_with_after_timestamp 907.291 987.509 80.2176 ms
99th percentile latency desc_sort_with_after_timestamp 1018.87 1046.59 27.7263 ms
100th percentile latency desc_sort_with_after_timestamp 1028.18 1093.72 65.5354 ms
50th percentile service time desc_sort_with_after_timestamp 863.539 951.523 87.9846 ms
90th percentile service time desc_sort_with_after_timestamp 905.088 986.212 81.1242 ms
99th percentile service time desc_sort_with_after_timestamp 1016.47 1044.26 27.794 ms
100th percentile service time desc_sort_with_after_timestamp 1025.64 1090.68 65.0482 ms
error rate desc_sort_with_after_timestamp 0 0 0 %
Min Throughput asc_sort_with_after_timestamp 0.502376 0.502259 -0.00012 ops/s
Mean Throughput asc_sort_with_after_timestamp 0.506246 0.505929 -0.00032 ops/s
Median Throughput asc_sort_with_after_timestamp 0.504337 0.504122 -0.00022 ops/s
Max Throughput asc_sort_with_after_timestamp 0.524844 0.523545 -0.0013 ops/s
50th percentile latency asc_sort_with_after_timestamp 928.733 1009.39 80.6548 ms
90th percentile latency asc_sort_with_after_timestamp 958.256 1035.28 77.0265 ms
99th percentile latency asc_sort_with_after_timestamp 993.756 1071.2 77.4397 ms
100th percentile latency asc_sort_with_after_timestamp 1091.93 1080.84 -11.0903 ms
50th percentile service time asc_sort_with_after_timestamp 926.65 1007.06 80.407 ms
90th percentile service time asc_sort_with_after_timestamp 955.901 1033.54 77.6397 ms
99th percentile service time asc_sort_with_after_timestamp 992.325 1069.3 76.9757 ms
100th percentile service time asc_sort_with_after_timestamp 1090.52 1079.36 -11.1592 ms
error rate asc_sort_with_after_timestamp 0 0 0 %
Min Throughput wait-until-merges-1-seg-finish 71.1768 45.1326 -26.0442 ops/s
Mean Throughput wait-until-merges-1-seg-finish 71.1768 45.1326 -26.0442 ops/s
Median Throughput wait-until-merges-1-seg-finish 71.1768 45.1326 -26.0442 ops/s
Max Throughput wait-until-merges-1-seg-finish 71.1768 45.1326 -26.0442 ops/s
100th percentile latency wait-until-merges-1-seg-finish 13.4659 21.3753 7.9094 ms
100th percentile service time wait-until-merges-1-seg-finish 13.4659 21.3753 7.9094 ms
error rate wait-until-merges-1-seg-finish 0 0 0 %
Min Throughput desc-sort-timestamp-after-force-merge-1-seg 1.52894 1.52204 -0.0069 ops/s
Mean Throughput desc-sort-timestamp-after-force-merge-1-seg 1.53478 1.52494 -0.00984 ops/s
Median Throughput desc-sort-timestamp-after-force-merge-1-seg 1.53532 1.52425 -0.01107 ops/s
Max Throughput desc-sort-timestamp-after-force-merge-1-seg 1.53998 1.52979 -0.01019 ops/s
50th percentile latency desc-sort-timestamp-after-force-merge-1-seg 38346.4 39658.1 1311.75 ms
90th percentile latency desc-sort-timestamp-after-force-merge-1-seg 44843.8 45993.4 1149.59 ms
99th percentile latency desc-sort-timestamp-after-force-merge-1-seg 46533.7 47367.6 833.887 ms
100th percentile latency desc-sort-timestamp-after-force-merge-1-seg 46686.4 47541.8 855.399 ms
50th percentile service time desc-sort-timestamp-after-force-merge-1-seg 656.122 655.751 -0.37095 ms
90th percentile service time desc-sort-timestamp-after-force-merge-1-seg 701.464 697.192 -4.27179 ms
99th percentile service time desc-sort-timestamp-after-force-merge-1-seg 784.255 744.138 -40.1172 ms
100th percentile service time desc-sort-timestamp-after-force-merge-1-seg 787.343 758.515 -28.828 ms
error rate desc-sort-timestamp-after-force-merge-1-seg 0 0 0 %
Min Throughput asc-sort-timestamp-after-force-merge-1-seg 2.00635 2.00642 7e-05 ops/s
Mean Throughput asc-sort-timestamp-after-force-merge-1-seg 2.00772 2.0078 7e-05 ops/s
Median Throughput asc-sort-timestamp-after-force-merge-1-seg 2.00762 2.00769 7e-05 ops/s
Max Throughput asc-sort-timestamp-after-force-merge-1-seg 2.00948 2.00958 0.0001 ops/s
50th percentile latency asc-sort-timestamp-after-force-merge-1-seg 29.1459 33.7639 4.618 ms
90th percentile latency asc-sort-timestamp-after-force-merge-1-seg 49.2804 56.4928 7.2124 ms
99th percentile latency asc-sort-timestamp-after-force-merge-1-seg 54.3399 57.8095 3.46968 ms
100th percentile latency asc-sort-timestamp-after-force-merge-1-seg 56.3336 57.8883 1.55473 ms
50th percentile service time asc-sort-timestamp-after-force-merge-1-seg 26.8871 31.8396 4.95252 ms
90th percentile service time asc-sort-timestamp-after-force-merge-1-seg 46.8394 54.319 7.47961 ms
99th percentile service time asc-sort-timestamp-after-force-merge-1-seg 52.5184 56.0072 3.48884 ms
100th percentile service time asc-sort-timestamp-after-force-merge-1-seg 53.0127 56.2684 3.25575 ms
error rate asc-sort-timestamp-after-force-merge-1-seg 0 0 0 %
Min Throughput desc-sort-with-after-timestamp-after-force-merge-1-seg 0.502488 0.502397 -9e-05 ops/s
Mean Throughput desc-sort-with-after-timestamp-after-force-merge-1-seg 0.50654 0.506296 -0.00024 ops/s
Median Throughput desc-sort-with-after-timestamp-after-force-merge-1-seg 0.504543 0.504376 -0.00017 ops/s
Max Throughput desc-sort-with-after-timestamp-after-force-merge-1-seg 0.526029 0.525035 -0.00099 ops/s
50th percentile latency desc-sort-with-after-timestamp-after-force-merge-1-seg 928.58 977.832 49.2514 ms
90th percentile latency desc-sort-with-after-timestamp-after-force-merge-1-seg 985.915 1055.96 70.0458 ms
99th percentile latency desc-sort-with-after-timestamp-after-force-merge-1-seg 1059.71 1102.05 42.342 ms
100th percentile latency desc-sort-with-after-timestamp-after-force-merge-1-seg 1073.87 1107.76 33.8872 ms
50th percentile service time desc-sort-with-after-timestamp-after-force-merge-1-seg 926.498 975.437 48.9392 ms
90th percentile service time desc-sort-with-after-timestamp-after-force-merge-1-seg 983.137 1053.57 70.4342 ms
99th percentile service time desc-sort-with-after-timestamp-after-force-merge-1-seg 1057.37 1099.72 42.3518 ms
100th percentile service time desc-sort-with-after-timestamp-after-force-merge-1-seg 1072.36 1104.98 32.6262 ms
error rate desc-sort-with-after-timestamp-after-force-merge-1-seg 0 0 0 %
Min Throughput asc-sort-with-after-timestamp-after-force-merge-1-seg 0.502249 0.502145 -0.0001 ops/s
Mean Throughput asc-sort-with-after-timestamp-after-force-merge-1-seg 0.505902 0.505626 -0.00028 ops/s
Median Throughput asc-sort-with-after-timestamp-after-force-merge-1-seg 0.504106 0.503913 -0.00019 ops/s
Max Throughput asc-sort-with-after-timestamp-after-force-merge-1-seg 0.523439 0.522306 -0.00113 ops/s
50th percentile latency asc-sort-with-after-timestamp-after-force-merge-1-seg 998.301 1058.08 59.7813 ms
90th percentile latency asc-sort-with-after-timestamp-after-force-merge-1-seg 1054.35 1098.29 43.9415 ms
99th percentile latency asc-sort-with-after-timestamp-after-force-merge-1-seg 1141.02 1120.78 -20.236 ms
100th percentile latency asc-sort-with-after-timestamp-after-force-merge-1-seg 1155.65 1123.34 -32.3125 ms
50th percentile service time asc-sort-with-after-timestamp-after-force-merge-1-seg 996.547 1055.03 58.4851 ms
90th percentile service time asc-sort-with-after-timestamp-after-force-merge-1-seg 1052.17 1096 43.8278 ms
99th percentile service time asc-sort-with-after-timestamp-after-force-merge-1-seg 1137.57 1118.51 -19.0669 ms
100th percentile service time asc-sort-with-after-timestamp-after-force-merge-1-seg 1153.22 1120.02 -33.1953 ms
error rate asc-sort-with-after-timestamp-after-force-merge-1-seg 0 0 0 %

[1] https://www.elastic.co/guide/en/elasticsearch/reference/7.17/modules-threadpool.html#fixed-auto-queue-size

@dblock
Member

dblock commented May 26, 2022

... any visible backwards incompatible changes,

There is only one: removal of the deprecated properties for the SEARCH / SEARCH_THROTTLED pools [1].

thread_pool:
    search:
        size: 30
        queue_size: 500
        min_queue_size: 10  --> removed
        max_queue_size: 1000  --> removed
        auto_queue_frame_size: 2000  --> removed
        target_response_time: 1s  --> removed

If you think this is a backward compatibility concern, we could still bring the change to 2.1.0 but without changing SEARCH / SEARCH_THROTTLED pool types. Would it make sense?

So, if a user has this in a config, will it break? Or just not use these? It's okay to deprecate settings with warnings (e.g. this setting is no longer used), but users' existing configuration should continue loading.

@reta
Collaborator Author

reta commented May 26, 2022

So, if a user has this in a config, will it break?

Yes

It's okay to deprecate settings with warnings (e.g. this setting is no longer used), but users' existing configuration should continue loading.

They are already deprecated (and technically should have been gone in 2.0).
