Is your feature request related to a problem? Please describe
Following up to #14004. Currently we use the total space of the file cache path to validate the user-defined file cache size setting (OpenSearch/server/src/main/java/org/opensearch/node/Node.java, line 2033 in c71fd4a).
For the cache scenario this is not reasonable. The file cache only tries to evict blocks once its usage rises above the user-defined size. At the same time, a query task may need new blocks that are located in remote storage; if the tolerance space (totalSpace - fileCacheSize) is very small, there will be no free space for caching the new blocks and the query task will fail.
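The arithmetic can be sketched as follows (a toy illustration; none of these names exist in OpenSearch):

```java
// Toy sketch of the headroom ("tolerance space") the current validation can
// leave. These names are illustrative only, not OpenSearch APIs.
public class ToleranceSketch {
    static final long GB = 1024L * 1024 * 1024;

    // Space left on the volume for newly fetched remote blocks once the
    // user-defined cache size is carved out.
    static long toleranceBytes(long totalSpace, long fileCacheSize) {
        return totalSpace - fileCacheSize;
    }

    public static void main(String[] args) {
        long total = 100 * GB;
        long cacheSize = 99 * GB; // passes the current check: cacheSize <= total
        // Only 1 GiB of headroom; a query that needs to pull in more new
        // blocks than this cannot cache them and fails.
        System.out.println("tolerance = " + toleranceBytes(total, cacheSize) / GB + " GiB");
    }
}
```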
Describe the solution you'd like
There are some options:
- require that the user-defined file cache size be less than a specific, hard-coded percentage of the total space, e.g. 95%;
- add another tolerance size setting (non-dynamic, settable as a byte size or a percentage).
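A minimal sketch of how either option might be enforced at setting-validation time (the 95% figure is the example from the option above; the class and method names are hypothetical, not existing OpenSearch code):

```java
// Hypothetical validators for the two proposed options.
public class FileCacheSizeValidator {
    // Option 1: hard-coded cap, e.g. the cache may use at most 95% of the volume.
    static final double MAX_FRACTION = 0.95;

    static boolean validWithHardCap(long fileCacheSize, long totalSpace) {
        return fileCacheSize <= (long) (totalSpace * MAX_FRACTION);
    }

    // Option 2: user-supplied tolerance, resolved to bytes from either a byte
    // size or a percentage of total space; the cache must leave at least that
    // much free.
    static boolean validWithTolerance(long fileCacheSize, long totalSpace,
                                      long toleranceBytes) {
        return totalSpace - fileCacheSize >= toleranceBytes;
    }
}
```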
Related component
Storage:Snapshots
Describe alternatives you've considered
Treat the current file cache capacity as the maximum size allowed, and introduce another setting, evict_watermark, expressed as a percentage of the file cache capacity. When the total size of cache entries exceeds that proportion of the capacity, the file cache begins trying to evict entries. When a new block needs to be cached and there is no free space left in the file cache, we can either fail the corresponding query or hold the block in an on-heap memory buffer and serve the query from it.
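Roughly, the alternative could look like this (evict_watermark is the proposed setting; the class, the admission logic, and the on-heap fallback are only a sketch of the idea, not an implementation):

```java
// Hypothetical watermark-driven eviction decision for the file cache.
public class WatermarkCache {
    final long capacityBytes;    // maximum size allowed
    final double evictWatermark; // e.g. 0.9 = start evicting at 90% of capacity
    long usedBytes;

    WatermarkCache(long capacityBytes, double evictWatermark) {
        this.capacityBytes = capacityBytes;
        this.evictWatermark = evictWatermark;
    }

    // Eviction starts once usage crosses the watermark, before the cache is full.
    boolean shouldEvict() {
        return usedBytes > (long) (capacityBytes * evictWatermark);
    }

    // Outcome when a new block of the given size must be cached.
    enum Admit { CACHED, ON_HEAP_FALLBACK }

    Admit admit(long blockBytes) {
        if (usedBytes + blockBytes <= capacityBytes) {
            usedBytes += blockBytes;
            return Admit.CACHED;
        }
        // No free space and nothing evictable right now: either fail the
        // query or serve it from a temporary on-heap copy of the block.
        return Admit.ON_HEAP_FALLBACK;
    }
}
```

The point of the watermark is that eviction begins before the cache is full, so admission above never has to block on eviction; the hard capacity only decides between caching on disk and the on-heap (or fail-the-query) path.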
In this way, the behavior of the file cache becomes more predictable, especially when the search node is co-located with other node roles on the same host: the file cache is guaranteed not to encroach on disk space needed for other purposes, which would otherwise affect the normal operation of the node. When implementing a writable warm index, we may also need to place a file cache alongside a local directory; if a sudden burst of queries, or one big query, lets the file cache take up all the available space, writes can fail.
Additional context
No response
> the user-defined file cache size must be less than a specific percentage of total space (hard code), like 95%;
> add another tolerance size setting (non-dynamic, can be set as byte size or percentage).

@bugmakerrrrrr These ideas seem reasonable. One thing to consider is the behavior of the system once it crosses into the disk watermark thresholds. Should we even allow a cache size to be larger than the watermark thresholds?