
[BUG] Faster search queries might see a timeout discrepancy in search response due to cached time #2000

Closed
Bukhtawar opened this issue Jan 27, 2022 · 10 comments · Fixed by #3454
Labels: bug (Something isn't working), distributed framework

Comments

@Bukhtawar
Collaborator

Bukhtawar commented Jan 27, 2022

Describe the bug
With the search timeout set to 200ms, a user can end up seeing the response below, which looks inconsistent: the timeout was set to 200ms, the reported took time is below that timeout, and yet the query is reported as timed out.

{
    "took": 60,
    "timed_out": true,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 0,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    }
}

The bigger issue is that the wrong timeout can be enforced, either prematurely or too late, depending on the estimated time interval. For example, a 200ms timeout might fire anywhere between 0ms and 400ms.

This happens because the elapsed-time computation uses an optimization over System#nanoTime that caches the time, refreshing it every 200ms by default, as controlled by the setting thread_pool.estimated_time_interval.
Latency-sensitive queries can see a discrepancy with these defaults.

We need to determine a reasonable default for the estimated time interval based on JMH benchmarks, since today it is a static value with essentially no documentation on what to expect from a search timeout. We could make it a dynamic setting with a sensible default and let users choose the interval within appropriate limits.
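For context, a minimal sketch of the caching idea, with illustrative names (this is not the actual ThreadPool cached-time code): a background thread samples System#nanoTime once per interval, and callers read that stale sample, so any reading can lag real time by up to one interval.

import java.util.concurrent.TimeUnit;

class CachedNanoClock implements Runnable {
    // Refresh period for the cached sample, e.g. 200ms like thread_pool.estimated_time_interval.
    private final long intervalMillis;
    private volatile long cachedNanos = System.nanoTime();

    CachedNanoClock(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    // Cheap volatile read instead of a System.nanoTime() call; may lag real time
    // by up to intervalMillis, which is exactly the discrepancy described above.
    long relativeNanos() {
        return cachedNanos;
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            cachedNanos = System.nanoTime();   // refresh once per interval
            try {
                TimeUnit.MILLISECONDS.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}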

@Bukhtawar added the "bug (Something isn't working)" and "untriaged" labels on Jan 27, 2022
@Bukhtawar changed the title from "[BUG] Faster search queries might see a discrepancy in search response due to cached time" to "[BUG] Faster search queries might see a timeout discrepancy in search response due to cached time" on Jan 28, 2022
@dblock
Member

dblock commented Feb 2, 2022

So in this case is the value of "took" plain incorrect?

@Bukhtawar
Collaborator Author

Bukhtawar commented Feb 2, 2022

@dblock there are two problems

  1. Premature or too relaxed timeouts: a requested timeout of 200ms could kick in at 0ms or 400ms in the worst case. So essentially, at low timeout values, the actual timeout can be completely off, even resulting in missing results.
  2. The took time can be off by up to 200ms regardless.

Both are due to the relative time being cached at 200ms intervals.

Yes, the actual took time looks wrong here.

@dblock
Member

dblock commented Feb 2, 2022

Caching time at any rate seems problematic. There must have been a good reason it was introduced in the first place, though?

@Bukhtawar
Collaborator Author

I think System#nanoTime is a somewhat expensive system call, and long-running queries can probably live with the inaccuracy; that's my understanding of the motivation behind the cache.

@dblock
Member

dblock commented Feb 3, 2022

Sounds like we use nanoTime to get more resolution, but then cache it, so should this be swapped for System.currentTimeMillis()? I am just brainstorming - worried that we optimize a thing that shouldn't exist.

https://stackoverflow.com/questions/19052316/why-is-system-nanotime-way-slower-in-performance-than-system-currenttimemill

@Bukhtawar
Collaborator Author

Bukhtawar commented Feb 3, 2022

System.currentTimeMillis is based on wall-clock time, which can give negative deltas if the clock goes back due to NTP. System.nanoTime uses a monotonically increasing clock, which is the right choice for measuring elapsed time.
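A toy example of the difference (hypothetical code, not from the code base): an elapsed-time delta taken from the wall clock can come out negative if NTP steps the clock back, while the nanoTime delta cannot.

public class ElapsedTimeDemo {
    public static void main(String[] args) throws InterruptedException {
        long startWall = System.currentTimeMillis();   // wall clock, can be stepped by NTP
        long startMono = System.nanoTime();            // monotonic, only valid for durations

        Thread.sleep(50);                              // stand-in for the work being timed

        long wallElapsedMs = System.currentTimeMillis() - startWall;        // can go negative after a clock step
        long monoElapsedMs = (System.nanoTime() - startMono) / 1_000_000L;  // always >= 0

        System.out.println("wall: " + wallElapsedMs + "ms, monotonic: " + monoElapsedMs + "ms");
    }
}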

@reta
Collaborator

reta commented Feb 3, 2022

Agree with @Bukhtawar, nanoTime is the best option in this case; there is a larger essay on the subject [1], highly recommended. Thinking out loud: setting thread_pool.estimated_time_interval to -1 should disable time caching altogether, correct?

[1] https://shipilev.net/blog/2014/nanotrusting-nanotime/#_building_performance_models

@dblock
Member

dblock commented Feb 3, 2022

> No. System.currentTimeMillis is based on physical clock time that can give negative deltas if clocks go back due to NTP. System.nanoTime uses a monotonically increasing time which is the right choice for measuring elapsed times

Yes, of course. I should have read the documentation. Thx.

@aabukhalil
Contributor

I'm looking into this.

@aabukhalil
Contributor

aabukhalil commented May 26, 2022

Root cause:

Right now, when the timeout cancellation Runnable is created in QueryPhase.java, it uses searchContext.getRelativeTimeInMillis() both as startTime and to check whether the elapsed time has passed the timeout (multiple calls). searchContext.getRelativeTimeInMillis() returns a cached time and will always lag the actual System.nanoTime(). The cached time's lifespan is controlled by thread_pool.estimated_time_interval.

These premature false-positive timeouts can happen when timeout < thread_pool.estimated_time_interval AND startTime + timeout falls in the next cached time slot AND the actual query execution end time falls in the next cached time slot.

E.g.

time_cache_lifespan = 200ms 
cached_start_time = X, where X is multiple of time_cache_lifespan
timeout = 30ms 
max_end_time = X + 30ms
actual_time = X + 190ms (System.nanoTime() when the execution started)
actual_query_execution_time = 20ms

Cached Timeline:  |1X|------------------------------|2X|------------------------------|3X|
Events Timeline:  -----------------------------E1----E2----E3---------------------------

E1: at 1X + 190ms, the timeout cancellation runnable is created with start_time = 1X (because the time is cached)
E2: at 2X the cached time becomes 2X
E3: at 2X + 5ms, the cancellation runnable is called to check whether the execution time exceeded the timeout,
      which evaluates (current_cached_time - start_cached_time > timeout); in this
      edge case 2X - 1X = 200ms, which is > timeout, so it triggers a timeout. That is
      invalid, because at E3 the query has only been executing for 15ms.

To fix this, only startTime should use System.nanoTime() (which would be called only once per query). This eliminates the issue, but it might still produce false-negative timeouts (no timeout when there should be one) within the resolution of thread_pool.estimated_time_interval. In other words, timeouts can still be delayed from the specified value by up to thread_pool.estimated_time_interval, which is more acceptable, I guess.
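Roughly, that direction would look like the sketch below (illustrative names only; timeoutCheck and cachedRelativeNanos are not the actual QueryPhase/SearchContext code): the start time comes from a single System.nanoTime() call, so a stale cached start slot can no longer cause a premature timeout.

import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

class TimeoutCheckSketch {
    static Runnable timeoutCheck(long timeoutMillis, LongSupplier cachedRelativeNanos) {
        final long startNanos = System.nanoTime();     // one real clock read per query
        final long timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        return () -> {
            // The "now" side may still come from the cached clock, so the worst case
            // is a late timeout (by up to thread_pool.estimated_time_interval),
            // never a premature one caused by a stale start time.
            long elapsedNanos = cachedRelativeNanos.getAsLong() - startNanos;
            if (elapsedNanos > timeoutNanos) {
                throw new RuntimeException("query timed out");   // placeholder for the real cancellation path
            }
        };
    }
}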

Steps to consistently reproduce:

# change thread_pool.estimated_time_interval to 10ms instead of 200ms to increase the probability of false-positive timeouts when executing fast queries with a short query timeout
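# One way to apply this (an assumption about the setup, not spelled out in this issue):
# add the line below to opensearch.yml and restart the node
#   thread_pool.estimated_time_interval: 10ms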

# index data
for i in {1..20}; do
    curl -XPOST localhost:9200/hyd_10000_01/_bulk?pretty --data-binary @hyd_ready_to_index_10000.json -H 'Content-Type: application/json'
done

# will result in
health status index        uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   hyd_10000_01 rxPDNe0oSImDAZgmplgr0g   1   1      99720            0     73.2mb         73.2mb


# run the test (query below typically takes 2~3ms)
for i in {1..1000}; do
    curl -s --location --request POST 'localhost:9200/hyd_10000_01/_search' --header 'Content-Type: application/json' --data-raw '{
        "timeout": "5ms",
      "query": {
        "fuzzy": {
          "tweet": {
            "value": "moh"
          }
        }
      }
    }' | jq '.took,.timed_out' | tr '\n' ',' >> output.csv
    echo >> output.csv
done

# check result stats 
cat output.csv | wc -l
1000
cat output.csv | grep -i false | wc -l
938
cat output.csv | grep -i true | wc -l
62 # false positives, as all "took" values are below the timeout

After switching to System.nanoTime() as startTime, I didn't see any false-positive timeouts.
