[BUG] OpenSearch sort query performance regression #5534

gkamat · 2022-12-13T00:06:57Z

Summary

OpenSearch performance for sort queries appears to have degraded since version 7.10. A simple timestamp sort query was executed on a 100 GB data set in Amazon's managed service on a few different versions: AES 7.10, AOS 1.3 and AOS 2.3. The clusters used were of two types: the first configured with t3.medium data nodes, and the other with r5.large data nodes.

AES 7.10 had the best sort performance with regard to latency, while AOS 2.3 was the worst of the three.

It was presumed that the memory footprint might have increased as newer OpenSearch versions were released and that this might have had an impact on performance, especially since t3.medium has only 4 GB RAM (and 2 vCPUs).

Therefore, the clusters were upgraded to r5.large, which has 2 vCPUs and 16 GB RAM. Latencies did not change substantially for AES 7.10, degraded somewhat for AOS 1.3 and improved substantially for AOS 2.3 (compared to the smaller instance type.) It seems likely that AOS 2.3 has a larger memory footprint, and would benefit from memory-optimized instance types.

It is true that the sort query used in the tests is rather simple-minded and that sophisticated users will use a more optimized query; however, not all users have that level of sophistication. The reference query used here was:

        "query": {
          "match_all": {}
        },
        "sort" : [
          {"@timestamp" : "asc"}
        ]

The other point to note is the difference in performance with regard to sort ordering (ascending vs. descending):

There is a substantial degradation of descending sort performance vis-a-vis ascending in AOS 1.3.
On AOS 2.3 with the r5.large instance type, ascending sort performance is worse than descending in memory-constrained mode with t3’s. Notably, the performance of both is similar with the r5 instances, but still worse than in AOS 1.3.

Methodology

Performance was measured using OpenSearch Benchmark (OSB). The data set was a synthetically generated one based on the http_logs workload packaged with OSB. Each document was relatively small, ~128 bytes.

After the data was ingested and a force-merge carried out, the query above was run for 200 iterations after some warmup iterations.

The clusters each had 3 master nodes (r5.large) and 2 data nodes.

Results

All numbers are latencies in ms. There were two runs in each mode to ensure repeatability.

Data Node Instance Type	Sort Mode	AES 7.10	AOS 1.3	AOS 2.3
t3.medium	desc	47	4207	11541
t3.medium	asc	181	389	43773
r5.large	desc	53	6133	7894
r5.large	asc	168	459	7229

The text was updated successfully, but these errors were encountered:

gkamat · 2022-12-13T01:30:52Z

It should be reiterated that the tests above were conducted on Amazon's managed service.

Carrying out tests on standalone clusters will provide more definitive information.

dblock · 2022-12-13T13:45:54Z

@gkamat Are you planning to run the benchmarks on open-source OpenSearch to confirm that this is in OpenSearch OSS?

Bukhtawar · 2022-12-13T16:05:51Z

The benchmarks doesn't say anything about the latency percentiles, do you mind adding p50/p90 latencies and output from the _profile API

Also the below numbers assuming avg latencies seems too off.

Data Node Instance Type	Sort Mode	AES 7.10	AOS 1.3	AOS 2.3
t3.medium	asc	181	389	43773
r5.large	asc	168	459	7229

It seems likely that AOS 2.3 has a larger memory footprint, and would benefit from memory-optimized instance types.

Can you also report what led to this observation?

gkamat · 2022-12-20T18:33:32Z

Will add percentile numbers once I have them. The comment regarding memory footprint is a surmise based on the fact that memory usage has typically increased with newer releases, that the t3 instance type is memory constrained and that there was a substantial latency improvement moving to the r5 instance type. Evidently, more testing is required to confirm this.

ketanv3 · 2022-12-27T12:02:14Z

Ruling out search back-pressure (#5039) and resource tracking framework changes (#3982) from causing this regression.

I benchmarked a standalone OpenSearch cluster with the following configuration:

Instance-type: t3.medium
EBS volume: 100 GB (gp3)
OS: Amazon Linux 2 (amzn2-ami-kernel-5.10-hvm-2.0.20221210.1-x86_64-gp2)
Heap size: 2 GiB

Ran opensearch-benchmark using the http_logs workload and did not observe any significant performance difference.

The contender (f2f809ea280) corresponds to an OS 2.4.1 cluster (most recent release).
The baseline (32fe5eb51d5) corresponds to an OS 2.4.1 cluster with search back-pressure and resource tracking framework changes removed.

opensearch-benchmark compare --baseline t3m-32fe5eb51d5 --contender t3m-f2f809ea280

   ____                  _____                      __       ____                  __                         __
  / __ \____  ___  ____ / ___/___  ____ ___________/ /_     / __ )___  ____  _____/ /_  ____ ___  ____ ______/ /__
 / / / / __ \/ _ \/ __ \\__ \/ _ \/ __ `/ ___/ ___/ __ \   / __  / _ \/ __ \/ ___/ __ \/ __ `__ \/ __ `/ ___/ //_/
/ /_/ / /_/ /  __/ / / /__/ /  __/ /_/ / /  / /__/ / / /  / /_/ /  __/ / / / /__/ / / / / / / / / /_/ / /  / ,<
\____/ .___/\___/_/ /_/____/\___/\__,_/_/   \___/_/ /_/  /_____/\___/_/ /_/\___/_/ /_/_/ /_/ /_/\__,_/_/  /_/|_|
    /_/


Comparing baseline
  TestExecution ID: t3m-32fe5eb51d5
  TestExecution timestamp: 2022-12-27 07:37:24
  TestProcedure: append-no-conflicts
  ProvisionConfigInstance: external

with contender
  TestExecution ID: t3m-f2f809ea280
  TestExecution timestamp: 2022-12-27 07:38:55
  TestProcedure: append-no-conflicts
  ProvisionConfigInstance: external

------------------------------------------------------
    _______             __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
------------------------------------------------------

|                                                        Metric |                                                   Task |    Baseline |   Contender |     Diff |    Unit |
|--------------------------------------------------------------:|-------------------------------------------------------:|------------:|------------:|---------:|--------:|
|                    Cumulative indexing time of primary shards |                                                        |      133.65 |     133.758 |  0.10797 |     min |
|             Min cumulative indexing time across primary shard |                                                        |           0 |           0 |        0 |     min |
|          Median cumulative indexing time across primary shard |                                                        |     1.13611 |      1.1219 | -0.01421 |     min |
|             Max cumulative indexing time across primary shard |                                                        |      21.581 |     21.7054 |  0.12433 |     min |
|           Cumulative indexing throttle time of primary shards |                                                        |           0 |           0 |        0 |     min |
|    Min cumulative indexing throttle time across primary shard |                                                        |           0 |           0 |        0 |     min |
| Median cumulative indexing throttle time across primary shard |                                                        |           0 |           0 |        0 |     min |
|    Max cumulative indexing throttle time across primary shard |                                                        |           0 |           0 |        0 |     min |
|                       Cumulative merge time of primary shards |                                                        |     46.7262 |     51.3937 |   4.6675 |     min |
|                      Cumulative merge count of primary shards |                                                        |          77 |          81 |        4 |         |
|                Min cumulative merge time across primary shard |                                                        |           0 |           0 |        0 |     min |
|             Median cumulative merge time across primary shard |                                                        |     0.13315 |      0.1127 | -0.02045 |     min |
|                Max cumulative merge time across primary shard |                                                        |     8.68735 |     10.0372 |  1.34983 |     min |
|              Cumulative merge throttle time of primary shards |                                                        |     8.09188 |     8.00913 | -0.08275 |     min |
|       Min cumulative merge throttle time across primary shard |                                                        |           0 |           0 |        0 |     min |
|    Median cumulative merge throttle time across primary shard |                                                        |           0 |           0 |        0 |     min |
|       Max cumulative merge throttle time across primary shard |                                                        |      1.6858 |     1.68585 |    5e-05 |     min |
|                     Cumulative refresh time of primary shards |                                                        |     30.3383 |     30.2785 | -0.05987 |     min |
|                    Cumulative refresh count of primary shards |                                                        |        1025 |        1035 |       10 |         |
|              Min cumulative refresh time across primary shard |                                                        |           0 |           0 |        0 |     min |
|           Median cumulative refresh time across primary shard |                                                        |    0.261633 |    0.276967 |  0.01533 |     min |
|              Max cumulative refresh time across primary shard |                                                        |     4.65508 |     4.88513 |  0.23005 |     min |
|                       Cumulative flush time of primary shards |                                                        |      3.7726 |     3.84177 |  0.06917 |     min |
|                      Cumulative flush count of primary shards |                                                        |         130 |         130 |        0 |         |
|                Min cumulative flush time across primary shard |                                                        |           0 |           0 |        0 |     min |
|             Median cumulative flush time across primary shard |                                                        |    0.003825 |    0.008275 |  0.00445 |     min |
|                Max cumulative flush time across primary shard |                                                        |    0.890833 |    0.822983 | -0.06785 |     min |
|                                       Total Young Gen GC time |                                                        |     138.809 |     139.562 |    0.753 |       s |
|                                      Total Young Gen GC count |                                                        |        5477 |        5551 |       74 |         |
|                                         Total Old Gen GC time |                                                        |           0 |           0 |        0 |       s |
|                                        Total Old Gen GC count |                                                        |           0 |           0 |        0 |         |
|                                                    Store size |                                                        |     16.7331 |     16.7311 | -0.00199 |      GB |
|                                                 Translog size |                                                        | 2.04891e-06 | 2.04891e-06 |        0 |      GB |
|                                        Heap used for segments |                                                        |           0 |           0 |        0 |      MB |
|                                      Heap used for doc values |                                                        |           0 |           0 |        0 |      MB |
|                                           Heap used for terms |                                                        |           0 |           0 |        0 |      MB |
|                                           Heap used for norms |                                                        |           0 |           0 |        0 |      MB |
|                                          Heap used for points |                                                        |           0 |           0 |        0 |      MB |
|                                   Heap used for stored fields |                                                        |           0 |           0 |        0 |      MB |
|                                                 Segment count |                                                        |          35 |          35 |        0 |         |
|                                                Min Throughput |                                           index-append |     44944.3 |       44235 | -709.306 |  docs/s |
|                                               Mean Throughput |                                           index-append |       48058 |     48170.4 |  112.401 |  docs/s |
|                                             Median Throughput |                                           index-append |     47826.6 |     47926.3 |  99.7073 |  docs/s |
|                                                Max Throughput |                                           index-append |     49754.3 |     49926.5 |  172.219 |  docs/s |
|                                       50th percentile latency |                                           index-append |     782.066 |     763.596 | -18.4699 |      ms |
|                                       90th percentile latency |                                           index-append |     1112.49 |     1119.03 |  6.54091 |      ms |
|                                       99th percentile latency |                                           index-append |     1459.01 |     1682.33 |  223.321 |      ms |
|                                     99.9th percentile latency |                                           index-append |     1919.58 |     2287.28 |  367.697 |      ms |
|                                    99.99th percentile latency |                                           index-append |     2253.49 |     2800.16 |  546.669 |      ms |
|                                      100th percentile latency |                                           index-append |     5225.22 |     3899.51 |  -1325.7 |      ms |
|                                  50th percentile service time |                                           index-append |     782.066 |     763.596 | -18.4699 |      ms |
|                                  90th percentile service time |                                           index-append |     1112.49 |     1119.03 |  6.54091 |      ms |
|                                  99th percentile service time |                                           index-append |     1459.01 |     1682.33 |  223.321 |      ms |
|                                99.9th percentile service time |                                           index-append |     1919.58 |     2287.28 |  367.697 |      ms |
|                               99.99th percentile service time |                                           index-append |     2253.49 |     2800.16 |  546.669 |      ms |
|                                 100th percentile service time |                                           index-append |     5225.22 |     3899.51 |  -1325.7 |      ms |
|                                                    error rate |                                           index-append |           0 |           0 |        0 |       % |
|                                                Min Throughput |                               wait-until-merges-finish |   0.0361434 |     9.23497 |  9.19883 |   ops/s |
|                                               Mean Throughput |                               wait-until-merges-finish |   0.0361434 |     9.23497 |  9.19883 |   ops/s |
|                                             Median Throughput |                               wait-until-merges-finish |   0.0361434 |     9.23497 |  9.19883 |   ops/s |
|                                                Max Throughput |                               wait-until-merges-finish |   0.0361434 |     9.23497 |  9.19883 |   ops/s |
|                                      100th percentile latency |                               wait-until-merges-finish |     27667.2 |     107.916 | -27559.3 |      ms |
|                                 100th percentile service time |                               wait-until-merges-finish |     27667.2 |     107.916 | -27559.3 |      ms |
|                                                    error rate |                               wait-until-merges-finish |           0 |           0 |        0 |       % |
|                                                Min Throughput |                                                default |      7.9576 |     7.96255 |  0.00495 |   ops/s |
|                                               Mean Throughput |                                                default |     7.96119 |     7.96564 |  0.00445 |   ops/s |
|                                             Median Throughput |                                                default |     7.96125 |     7.96574 |  0.00449 |   ops/s |
|                                                Max Throughput |                                                default |     7.96442 |      7.9685 |  0.00408 |   ops/s |
|                                       50th percentile latency |                                                default |     4.35699 |     4.52117 |  0.16418 |      ms |
|                                       90th percentile latency |                                                default |     5.71354 |     5.87754 |    0.164 |      ms |
|                                       99th percentile latency |                                                default |     9.00001 |     11.5014 |  2.50141 |      ms |
|                                      100th percentile latency |                                                default |     16.3645 |     25.8167 |  9.45219 |      ms |
|                                  50th percentile service time |                                                default |      3.4697 |      3.5243 |   0.0546 |      ms |
|                                  90th percentile service time |                                                default |     4.50287 |     4.69612 |  0.19325 |      ms |
|                                  99th percentile service time |                                                default |     7.05913 |      9.2611 |  2.20196 |      ms |
|                                 100th percentile service time |                                                default |     15.4009 |     25.1296 |  9.72871 |      ms |
|                                                    error rate |                                                default |           0 |           0 |        0 |       % |
|                                                Min Throughput |                                                   term |     45.9722 |     45.3873 | -0.58491 |   ops/s |
|                                               Mean Throughput |                                                   term |     46.1405 |     45.5643 | -0.57623 |   ops/s |
|                                             Median Throughput |                                                   term |     46.1405 |     45.5643 | -0.57623 |   ops/s |
|                                                Max Throughput |                                                   term |     46.3088 |     45.7413 | -0.56755 |   ops/s |
|                                       50th percentile latency |                                                   term |     8.29684 |     8.20256 | -0.09428 |      ms |
|                                       90th percentile latency |                                                   term |      12.807 |     9.01175 | -3.79522 |      ms |
|                                       99th percentile latency |                                                   term |     13.5406 |     20.3601 |  6.81946 |      ms |
|                                      100th percentile latency |                                                   term |      19.467 |     31.3674 |  11.9004 |      ms |
|                                  50th percentile service time |                                                   term |     6.71006 |     6.68616 |  -0.0239 |      ms |
|                                  90th percentile service time |                                                   term |      11.968 |     7.50711 | -4.46087 |      ms |
|                                  99th percentile service time |                                                   term |     12.4708 |     12.1399 | -0.33087 |      ms |
|                                 100th percentile service time |                                                   term |     18.6128 |     30.1796 |  11.5668 |      ms |
|                                                    error rate |                                                   term |           0 |           0 |        0 |       % |
|                                                Min Throughput |                                                  range |     1.00369 |     1.00368 |   -1e-05 |   ops/s |
|                                               Mean Throughput |                                                  range |      1.0051 |      1.0051 |       -0 |   ops/s |
|                                             Median Throughput |                                                  range |     1.00491 |      1.0049 |   -1e-05 |   ops/s |
|                                                Max Throughput |                                                  range |     1.00732 |     1.00732 |       -0 |   ops/s |
|                                       50th percentile latency |                                                  range |     11.9522 |     11.8257 |  -0.1265 |      ms |
|                                       90th percentile latency |                                                  range |      17.319 |     18.1631 |  0.84408 |      ms |
|                                       99th percentile latency |                                                  range |     32.0656 |     29.8594 | -2.20626 |      ms |
|                                      100th percentile latency |                                                  range |     69.5222 |     31.9329 | -37.5893 |      ms |
|                                  50th percentile service time |                                                  range |     10.0611 |     9.82817 | -0.23292 |      ms |
|                                  90th percentile service time |                                                  range |     15.3572 |     15.9587 |  0.60145 |      ms |
|                                  99th percentile service time |                                                  range |     29.8846 |     27.9582 | -1.92649 |      ms |
|                                 100th percentile service time |                                                  range |     67.8041 |     30.0253 | -37.7788 |      ms |
|                                                    error rate |                                                  range |           0 |           0 |        0 |       % |
|                                                Min Throughput |                                          200s-in-range |     32.6105 |     32.7768 |   0.1663 |   ops/s |
|                                               Mean Throughput |                                          200s-in-range |     32.6321 |     32.7894 |  0.15738 |   ops/s |
|                                             Median Throughput |                                          200s-in-range |     32.6329 |     32.7895 |  0.15662 |   ops/s |
|                                                Max Throughput |                                          200s-in-range |     32.6528 |      32.802 |  0.14923 |   ops/s |
|                                       50th percentile latency |                                          200s-in-range |     8.87515 |     7.66125 |  -1.2139 |      ms |
|                                       90th percentile latency |                                          200s-in-range |     12.4001 |     8.36776 | -4.03233 |      ms |
|                                       99th percentile latency |                                          200s-in-range |     16.0511 |     11.0825 | -4.96858 |      ms |
|                                      100th percentile latency |                                          200s-in-range |     19.1668 |     15.7937 | -3.37303 |      ms |
|                                  50th percentile service time |                                          200s-in-range |     8.24776 |     6.85664 | -1.39111 |      ms |
|                                  90th percentile service time |                                          200s-in-range |     11.7808 |     7.76568 | -4.01511 |      ms |
|                                  99th percentile service time |                                          200s-in-range |     15.3935 |       10.76 | -4.63349 |      ms |
|                                 100th percentile service time |                                          200s-in-range |     18.0104 |     15.0852 | -2.92525 |      ms |
|                                                    error rate |                                          200s-in-range |           0 |           0 |        0 |       % |
|                                                Min Throughput |                                          400s-in-range |     49.5973 |     49.9128 |  0.31552 |   ops/s |
|                                               Mean Throughput |                                          400s-in-range |     49.6141 |     49.9156 |  0.30147 |   ops/s |
|                                             Median Throughput |                                          400s-in-range |     49.6141 |     49.9156 |  0.30147 |   ops/s |
|                                                Max Throughput |                                          400s-in-range |     49.6309 |     49.9183 |  0.28743 |   ops/s |
|                                       50th percentile latency |                                          400s-in-range |     5.51915 |     4.71566 | -0.80349 |      ms |
|                                       90th percentile latency |                                          400s-in-range |     15.0413 |     6.69279 | -8.34848 |      ms |
|                                       99th percentile latency |                                          400s-in-range |     29.5704 |     8.34689 | -21.2235 |      ms |
|                                      100th percentile latency |                                          400s-in-range |     31.6191 |     8.60873 | -23.0103 |      ms |
|                                  50th percentile service time |                                          400s-in-range |     4.69139 |     3.77392 | -0.91747 |      ms |
|                                  90th percentile service time |                                          400s-in-range |     13.3581 |     6.00293 | -7.35516 |      ms |
|                                  99th percentile service time |                                          400s-in-range |     21.7973 |     6.45925 |  -15.338 |      ms |
|                                 100th percentile service time |                                          400s-in-range |     27.8186 |     6.63805 | -21.1805 |      ms |
|                                                    error rate |                                          400s-in-range |           0 |           0 |        0 |       % |
|                                                Min Throughput |                                             hourly_agg |    0.118578 |    0.124143 |  0.00556 |   ops/s |
|                                               Mean Throughput |                                             hourly_agg |    0.118768 |    0.124236 |  0.00547 |   ops/s |
|                                             Median Throughput |                                             hourly_agg |    0.118793 |    0.124237 |  0.00544 |   ops/s |
|                                                Max Throughput |                                             hourly_agg |    0.118886 |    0.124323 |  0.00544 |   ops/s |
|                                       50th percentile latency |                                             hourly_agg |      518063 |      462547 |   -55516 |      ms |
|                                       90th percentile latency |                                             hourly_agg |      652485 |      582435 | -70049.6 |      ms |
|                                       99th percentile latency |                                             hourly_agg |      682632 |      609184 | -73447.6 |      ms |
|                                      100th percentile latency |                                             hourly_agg |      685981 |      612128 | -73853.2 |      ms |
|                                  50th percentile service time |                                             hourly_agg |     8383.34 |     8022.53 | -360.812 |      ms |
|                                  90th percentile service time |                                             hourly_agg |     8422.21 |     8102.55 | -319.661 |      ms |
|                                  99th percentile service time |                                             hourly_agg |      8465.1 |     8220.37 | -244.727 |      ms |
|                                 100th percentile service time |                                             hourly_agg |     8573.09 |     8250.08 | -323.016 |      ms |
|                                                    error rate |                                             hourly_agg |           0 |           0 |        0 |       % |
|                                                Min Throughput |                                                 scroll |     24.8358 |      24.888 |  0.05222 | pages/s |
|                                               Mean Throughput |                                                 scroll |     24.9155 |     24.9376 |  0.02206 | pages/s |
|                                             Median Throughput |                                                 scroll |     24.9231 |     24.9434 |  0.02034 | pages/s |
|                                                Max Throughput |                                                 scroll |     24.9486 |     24.9622 |  0.01359 | pages/s |
|                                       50th percentile latency |                                                 scroll |         787 |     801.549 |  14.5491 |      ms |
|                                       90th percentile latency |                                                 scroll |     805.924 |     828.946 |  23.0219 |      ms |
|                                       99th percentile latency |                                                 scroll |     899.402 |     1069.66 |  170.254 |      ms |
|                                      100th percentile latency |                                                 scroll |     996.741 |     1094.37 |  97.6332 |      ms |
|                                  50th percentile service time |                                                 scroll |     785.715 |     800.062 |  14.3465 |      ms |
|                                  90th percentile service time |                                                 scroll |     804.382 |     827.175 |  22.7932 |      ms |
|                                  99th percentile service time |                                                 scroll |     897.682 |      1046.5 |  148.821 |      ms |
|                                 100th percentile service time |                                                 scroll |     961.681 |     1068.68 |  107.001 |      ms |
|                                                    error rate |                                                 scroll |           0 |           0 |        0 |       % |
|                                                Min Throughput |                                    desc_sort_timestamp |    0.284275 |    0.308557 |  0.02428 |   ops/s |
|                                               Mean Throughput |                                    desc_sort_timestamp |     0.28491 |    0.308615 |  0.02371 |   ops/s |
|                                             Median Throughput |                                    desc_sort_timestamp |    0.284946 |    0.308614 |  0.02367 |   ops/s |
|                                                Max Throughput |                                    desc_sort_timestamp |    0.285536 |    0.308657 |  0.02312 |   ops/s |
|                                       50th percentile latency |                                    desc_sort_timestamp |      379510 |      312564 |   -66946 |      ms |
|                                       90th percentile latency |                                    desc_sort_timestamp |      437751 |      361580 |   -76171 |      ms |
|                                       99th percentile latency |                                    desc_sort_timestamp |      450873 |      372664 |   -78209 |      ms |
|                                      100th percentile latency |                                    desc_sort_timestamp |      452316 |      373914 | -78401.3 |      ms |
|                                  50th percentile service time |                                    desc_sort_timestamp |     3470.53 |     3234.89 |  -235.64 |      ms |
|                                  90th percentile service time |                                    desc_sort_timestamp |     3494.37 |        3251 | -243.369 |      ms |
|                                  99th percentile service time |                                    desc_sort_timestamp |     3740.53 |     3284.33 | -456.194 |      ms |
|                                 100th percentile service time |                                    desc_sort_timestamp |     6247.54 |     3288.61 | -2958.93 |      ms |
|                                                    error rate |                                    desc_sort_timestamp |           0 |           0 |        0 |       % |
|                                                Min Throughput |                                     asc_sort_timestamp |    0.501602 |    0.501591 |   -1e-05 |   ops/s |
|                                               Mean Throughput |                                     asc_sort_timestamp |    0.501946 |    0.501933 |   -1e-05 |   ops/s |
|                                             Median Throughput |                                     asc_sort_timestamp |     0.50192 |    0.501907 |   -1e-05 |   ops/s |
|                                                Max Throughput |                                     asc_sort_timestamp |    0.502397 |     0.50238 |   -2e-05 |   ops/s |
|                                       50th percentile latency |                                     asc_sort_timestamp |     83.5583 |      74.787 | -8.77128 |      ms |
|                                       90th percentile latency |                                     asc_sort_timestamp |     140.182 |     131.646 | -8.53633 |      ms |
|                                       99th percentile latency |                                     asc_sort_timestamp |     210.622 |      178.28 | -32.3424 |      ms |
|                                      100th percentile latency |                                     asc_sort_timestamp |     213.416 |     184.096 | -29.3204 |      ms |
|                                  50th percentile service time |                                     asc_sort_timestamp |     80.5067 |     71.8407 | -8.66597 |      ms |
|                                  90th percentile service time |                                     asc_sort_timestamp |     136.731 |     130.159 | -6.57241 |      ms |
|                                  99th percentile service time |                                     asc_sort_timestamp |     207.746 |     174.972 | -32.7737 |      ms |
|                                 100th percentile service time |                                     asc_sort_timestamp |     211.066 |     181.238 | -29.8283 |      ms |
|                                                    error rate |                                     asc_sort_timestamp |           0 |           0 |        0 |       % |
|                                                Min Throughput |                         desc_sort_with_after_timestamp |    0.186735 |    0.198405 |  0.01167 |   ops/s |
|                                               Mean Throughput |                         desc_sort_with_after_timestamp |    0.189769 |    0.200925 |  0.01116 |   ops/s |
|                                             Median Throughput |                         desc_sort_with_after_timestamp |    0.190099 |    0.201179 |  0.01108 |   ops/s |
|                                                Max Throughput |                         desc_sort_with_after_timestamp |    0.190436 |    0.201473 |  0.01104 |   ops/s |
|                                       50th percentile latency |                         desc_sort_with_after_timestamp |      198682 |      181220 | -17462.1 |      ms |
|                                       90th percentile latency |                         desc_sort_with_after_timestamp |      326975 |      298233 | -28741.8 |      ms |
|                                       99th percentile latency |                         desc_sort_with_after_timestamp |      355923 |      324579 | -31343.2 |      ms |
|                                      100th percentile latency |                         desc_sort_with_after_timestamp |      359126 |      327508 | -31617.7 |      ms |
|                                  50th percentile service time |                         desc_sort_with_after_timestamp |      5236.8 |     4951.09 | -285.709 |      ms |
|                                  90th percentile service time |                         desc_sort_with_after_timestamp |     5246.46 |     4962.16 | -284.303 |      ms |
|                                  99th percentile service time |                         desc_sort_with_after_timestamp |     5282.88 |     4990.87 | -292.008 |      ms |
|                                 100th percentile service time |                         desc_sort_with_after_timestamp |     5348.33 |     5090.54 | -257.791 |      ms |
|                                                    error rate |                         desc_sort_with_after_timestamp |           0 |           0 |        0 |       % |
|                                                Min Throughput |                          asc_sort_with_after_timestamp |     0.18616 |    0.226284 |  0.04012 |   ops/s |
|                                               Mean Throughput |                          asc_sort_with_after_timestamp |    0.187478 |    0.226967 |  0.03949 |   ops/s |
|                                             Median Throughput |                          asc_sort_with_after_timestamp |    0.187544 |    0.227026 |  0.03948 |   ops/s |
|                                                Max Throughput |                          asc_sort_with_after_timestamp |    0.187675 |    0.227161 |  0.03949 |   ops/s |
|                                       50th percentile latency |                          asc_sort_with_after_timestamp |      203388 |      147372 | -56015.9 |      ms |
|                                       90th percentile latency |                          asc_sort_with_after_timestamp |      335109 |      242415 | -92694.1 |      ms |
|                                       99th percentile latency |                          asc_sort_with_after_timestamp |      364583 |      263770 |  -100813 |      ms |
|                                      100th percentile latency |                          asc_sort_with_after_timestamp |      367894 |      266142 |  -101752 |      ms |
|                                  50th percentile service time |                          asc_sort_with_after_timestamp |     5329.11 |     4396.73 | -932.381 |      ms |
|                                  90th percentile service time |                          asc_sort_with_after_timestamp |     5375.04 |      4416.5 | -958.539 |      ms |
|                                  99th percentile service time |                          asc_sort_with_after_timestamp |     5410.96 |     4447.96 | -963.005 |      ms |
|                                 100th percentile service time |                          asc_sort_with_after_timestamp |     5443.42 |     4456.82 | -986.601 |      ms |
|                                                    error rate |                          asc_sort_with_after_timestamp |           0 |           0 |        0 |       % |
|                                                Min Throughput |                         wait-until-merges-1-seg-finish |     3.72822 |     11.9643 |  8.23613 |   ops/s |
|                                               Mean Throughput |                         wait-until-merges-1-seg-finish |     3.72822 |     11.9643 |  8.23613 |   ops/s |
|                                             Median Throughput |                         wait-until-merges-1-seg-finish |     3.72822 |     11.9643 |  8.23613 |   ops/s |
|                                                Max Throughput |                         wait-until-merges-1-seg-finish |     3.72822 |     11.9643 |  8.23613 |   ops/s |
|                                      100th percentile latency |                         wait-until-merges-1-seg-finish |      267.84 |      83.228 | -184.612 |      ms |
|                                 100th percentile service time |                         wait-until-merges-1-seg-finish |      267.84 |      83.228 | -184.612 |      ms |
|                                                    error rate |                         wait-until-merges-1-seg-finish |           0 |           0 |        0 |       % |
|                                                Min Throughput |            desc-sort-timestamp-after-force-merge-1-seg |    0.273701 |    0.280344 |  0.00664 |   ops/s |
|                                               Mean Throughput |            desc-sort-timestamp-after-force-merge-1-seg |    0.273835 |    0.280459 |  0.00662 |   ops/s |
|                                             Median Throughput |            desc-sort-timestamp-after-force-merge-1-seg |    0.273829 |    0.280447 |  0.00662 |   ops/s |
|                                                Max Throughput |            desc-sort-timestamp-after-force-merge-1-seg |    0.274069 |    0.280573 |   0.0065 |   ops/s |
|                                       50th percentile latency |            desc-sort-timestamp-after-force-merge-1-seg |      789121 |      767766 | -21355.9 |      ms |
|                                       90th percentile latency |            desc-sort-timestamp-after-force-merge-1-seg |      913417 |      888811 | -24606.5 |      ms |
|                                       99th percentile latency |            desc-sort-timestamp-after-force-merge-1-seg |      941077 |      916047 | -25029.5 |      ms |
|                                      100th percentile latency |            desc-sort-timestamp-after-force-merge-1-seg |      944151 |      919050 |   -25101 |      ms |
|                                  50th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     3635.83 |     3543.53 | -92.3042 |      ms |
|                                  90th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     3655.17 |     3614.88 | -40.2965 |      ms |
|                                  99th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     3730.66 |     3648.46 | -82.2032 |      ms |
|                                 100th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     3852.31 |     3651.05 | -201.259 |      ms |
|                                                    error rate |            desc-sort-timestamp-after-force-merge-1-seg |           0 |           0 |        0 |       % |
|                                                Min Throughput |             asc-sort-timestamp-after-force-merge-1-seg |     2.00562 |     2.00442 |  -0.0012 |   ops/s |
|                                               Mean Throughput |             asc-sort-timestamp-after-force-merge-1-seg |     2.00682 |     2.00536 | -0.00146 |   ops/s |
|                                             Median Throughput |             asc-sort-timestamp-after-force-merge-1-seg |     2.00673 |     2.00528 | -0.00144 |   ops/s |
|                                                Max Throughput |             asc-sort-timestamp-after-force-merge-1-seg |     2.00838 |     2.00657 | -0.00181 |   ops/s |
|                                       50th percentile latency |             asc-sort-timestamp-after-force-merge-1-seg |     69.3626 |     67.6345 | -1.72807 |      ms |
|                                       90th percentile latency |             asc-sort-timestamp-after-force-merge-1-seg |     98.1814 |     120.536 |  22.3549 |      ms |
|                                       99th percentile latency |             asc-sort-timestamp-after-force-merge-1-seg |     149.508 |      136.45 | -13.0583 |      ms |
|                                      100th percentile latency |             asc-sort-timestamp-after-force-merge-1-seg |     149.817 |      140.76 | -9.05761 |      ms |
|                                  50th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     68.2351 |     66.2148 | -2.02031 |      ms |
|                                  90th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     96.9693 |     118.949 |  21.9794 |      ms |
|                                  99th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     148.156 |      134.72 | -13.4361 |      ms |
|                                 100th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     148.567 |     139.287 | -9.27939 |      ms |
|                                                    error rate |             asc-sort-timestamp-after-force-merge-1-seg |           0 |           0 |        0 |       % |
|                                                Min Throughput | desc-sort-with-after-timestamp-after-force-merge-1-seg |    0.178115 |    0.174843 | -0.00327 |   ops/s |
|                                               Mean Throughput | desc-sort-with-after-timestamp-after-force-merge-1-seg |    0.180498 |    0.180201 |  -0.0003 |   ops/s |
|                                             Median Throughput | desc-sort-with-after-timestamp-after-force-merge-1-seg |    0.180812 |    0.180745 |   -7e-05 |   ops/s |
|                                                Max Throughput | desc-sort-with-after-timestamp-after-force-merge-1-seg |    0.181078 |    0.181924 |  0.00085 |   ops/s |
|                                       50th percentile latency | desc-sort-with-after-timestamp-after-force-merge-1-seg |      215071 |      213698 | -1372.82 |      ms |
|                                       90th percentile latency | desc-sort-with-after-timestamp-after-force-merge-1-seg |      354219 |      352338 | -1881.33 |      ms |
|                                       99th percentile latency | desc-sort-with-after-timestamp-after-force-merge-1-seg |      385453 |      381967 | -3485.78 |      ms |
|                                      100th percentile latency | desc-sort-with-after-timestamp-after-force-merge-1-seg |      389078 |      386568 | -2509.79 |      ms |
|                                  50th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |      5502.6 |     5327.98 | -174.615 |      ms |
|                                  90th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     5528.73 |      6174.5 |  645.767 |      ms |
|                                  99th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     5664.35 |     7365.75 |   1701.4 |      ms |
|                                 100th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     5971.12 |     7382.26 |  1411.14 |      ms |
|                                                    error rate | desc-sort-with-after-timestamp-after-force-merge-1-seg |           0 |           0 |        0 |       % |
|                                                Min Throughput |  asc-sort-with-after-timestamp-after-force-merge-1-seg |    0.186273 |    0.194642 |  0.00837 |   ops/s |
|                                               Mean Throughput |  asc-sort-with-after-timestamp-after-force-merge-1-seg |    0.187272 |    0.195415 |  0.00814 |   ops/s |
|                                             Median Throughput |  asc-sort-with-after-timestamp-after-force-merge-1-seg |    0.187362 |    0.195446 |  0.00808 |   ops/s |
|                                                Max Throughput |  asc-sort-with-after-timestamp-after-force-merge-1-seg |    0.187442 |    0.195616 |  0.00817 |   ops/s |
|                                       50th percentile latency |  asc-sort-with-after-timestamp-after-force-merge-1-seg |      203710 |      190563 |   -13147 |      ms |
|                                       90th percentile latency |  asc-sort-with-after-timestamp-after-force-merge-1-seg |      335699 |      313451 |   -22248 |      ms |
|                                       99th percentile latency |  asc-sort-with-after-timestamp-after-force-merge-1-seg |      365359 |      341271 | -24087.8 |      ms |
|                                      100th percentile latency |  asc-sort-with-after-timestamp-after-force-merge-1-seg |      368673 |      344336 | -24336.7 |      ms |
|                                  50th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     5330.56 |     5101.09 | -229.466 |      ms |
|                                  90th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     5342.68 |      5147.7 | -194.976 |      ms |
|                                  99th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     5366.04 |     5219.83 | -146.202 |      ms |
|                                 100th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     5367.06 |     5371.07 |  4.01395 |      ms |
|                                                    error rate |  asc-sort-with-after-timestamp-after-force-merge-1-seg |           0 |           0 |        0 |       % |


-------------------------------
[INFO] SUCCESS (took 0 seconds)
-------------------------------

gashutos · 2023-02-14T17:20:01Z

Lucene gives us in-built ability to optimise sorting on certain sort field types where its point type is matching. i.e for fields with data type Date or Long, we will able to use this optimisation.
There already exists a check in our code to enable this optimisation for only Date/Long data types. This was introduced as part of PR https://github.com/opensearch-project/OpenSearch/pull/1974/files
As part of PR (during upgrade of Lucene 9.0.0) https://github.com/opensearch-project/OpenSearch/pull/1109/files, we have removed the numeric sort optimisations thus causing a regression in sort performance. I think this has been removed due to its deprecated method.
but Lucene didn't say it's numeric optimisation would be deprecated, but the method introduced newly to enable/disable optimisation would be deprecated. apache/lucene@cc58c51. Since these optimisations are enabled by default, earlier these optimisations were disabled by default.

I tried adding back the code in https://github.com/opensearch-project/OpenSearch/pull/1974/files with sortField.setOptimizeSortWithPoints(true); for Date and Long data types and with that we were able to achieve similar performance (before the removal) in Open Search 2.3.

Below are the results from OS_2.3. (ran on managed AOS)
Units in ms.

Data Node Type.         Sort Mode             OS 2.3             OS 2.3(before)
r6g.2xlarge.               asc                264                  428
r6g.2xlarge.               desc               639                  2422

gashutos · 2023-02-14T17:22:23Z

@nknize @Rishikesh1159 shold we enable this optimisation back for below 4 types as of now ?

DATE
DATE_NANOSECONDS
LONG
DOUBLE

For above 4 types, we have same sort type and point type. So should not be any haarm doing like that.

dblock · 2023-02-14T22:09:08Z

Nice work @gashutos.

nknize · 2023-02-15T20:51:56Z

Updating here:

#6321 was merged to main and the backport to 2.x has also been merged. This solves issues with the four common numeric data types (LONG, DOUBLE, DATE, and DATE_NANOSECONDS) but if a user sorts by a different type (e.g., SHORT, FLOAT, ...) the BKD points optimization will still be bypassed. @gashutos is volunteering to explore a bigger POC to expand this fix to all data types; which will have a broader impact. In the meantime the merged patch includes deprecated method contracts to ensure any new numeric type explicitly specifies whether or not to apply the BKD optimization.

gashutos · 2023-03-28T09:06:08Z

Closing this since this #6321 is merged and backported to 2.3. As mentioned in PR description, sort latencies are back to normal level what we had earlier.

gashutos · 2023-03-28T09:08:22Z

@gkamat can we close this issue ?

gkamat · 2023-03-29T15:38:13Z

Issue has been resolved.

gkamat added bug Something isn't working untriaged labels Dec 13, 2022

dblock changed the title ~~[BUG] OpenSearch sort query performance has degraded since version 7.10~~ [BUG] OpenSearch sort query performance regression Dec 13, 2022

dblock added the Performance This is for any performance related enhancements or bugs label Dec 13, 2022

dreamer-89 added needs more info untriaged distributed framework Search Search query, autocomplete ...etc and removed untriaged labels Dec 19, 2022

anasalkouz removed the untriaged label Dec 21, 2022

gashutos mentioned this issue Feb 14, 2023

Enable numeric sort optimisation for few numerical sort types #6321

Merged

6 tasks

anasalkouz added Migration:Pending Input and removed Migration:Pending Input labels Mar 16, 2023

gkamat closed this as completed Mar 29, 2023

gkamat mentioned this issue Apr 4, 2023

[RFC] Enhancements for OSB Workloads opensearch-project/opensearch-benchmark#253

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] OpenSearch sort query performance regression #5534

[BUG] OpenSearch sort query performance regression #5534

gkamat commented Dec 13, 2022

gkamat commented Dec 13, 2022 •

edited by Bukhtawar

Loading

dblock commented Dec 13, 2022 •

edited

Loading

Bukhtawar commented Dec 13, 2022 •

edited

Loading

gkamat commented Dec 20, 2022

ketanv3 commented Dec 27, 2022

gashutos commented Feb 14, 2023 •

edited

Loading

gashutos commented Feb 14, 2023

dblock commented Feb 14, 2023

nknize commented Feb 15, 2023

gashutos commented Mar 28, 2023

gashutos commented Mar 28, 2023

gkamat commented Mar 29, 2023

[BUG] OpenSearch sort query performance regression #5534

[BUG] OpenSearch sort query performance regression #5534

Comments

gkamat commented Dec 13, 2022

Summary

Methodology

Results

gkamat commented Dec 13, 2022 • edited by Bukhtawar Loading

dblock commented Dec 13, 2022 • edited Loading

Bukhtawar commented Dec 13, 2022 • edited Loading

gkamat commented Dec 20, 2022

ketanv3 commented Dec 27, 2022

gashutos commented Feb 14, 2023 • edited Loading

gashutos commented Feb 14, 2023

dblock commented Feb 14, 2023

nknize commented Feb 15, 2023

gashutos commented Mar 28, 2023

gashutos commented Mar 28, 2023

gkamat commented Mar 29, 2023

gkamat commented Dec 13, 2022 •

edited by Bukhtawar

Loading

dblock commented Dec 13, 2022 •

edited

Loading

Bukhtawar commented Dec 13, 2022 •

edited

Loading

gashutos commented Feb 14, 2023 •

edited

Loading