Description
Hi,
OpenSearch had to change the IOContext from READONCE to DEFAULT for its remote store feature, because with remote store the thread that closes the IndexInput is different from the one that opens it (a minimal sketch of this pattern follows below).
Reference: opensearch-project/OpenSearch#17502
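For illustration, a minimal sketch (with a hypothetical directory path and file name) of the cross-thread close pattern that, as we understand it, is incompatible with READONCE and forced the move to DEFAULT:

import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.store.MMapDirectory;

public class ReadOnceCrossThreadClose {
  public static void main(String[] args) throws Exception {
    // Hypothetical local index directory and file name, just to show the pattern.
    Directory dir = new MMapDirectory(Path.of("/tmp/lucene-index"));
    ExecutorService closer = Executors.newSingleThreadExecutor();

    // Opened with READONCE in the current thread.
    IndexInput in = dir.openInput("_0.cfs", IOContext.READONCE);
    in.readByte();

    // Close happens on a different thread. With READONCE this is expected to
    // fail (the mapping is confined to the opening thread), which is why the
    // remote store code had to use IOContext.DEFAULT instead.
    closer.submit(() -> {
      try {
        in.close();
      } catch (Exception e) {
        e.printStackTrace();
      }
      return null;
    }).get();

    closer.shutdown();
    dir.close();
  }
}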
Since switching to DEFAULT, we are seeing exhaustion of memory maps in our production workloads. So far we have not been able to reproduce the issue ourselves.
One related issue/repro is opensearch-project/k-NN#2665 (comment), which points to a single open IndexInput preventing the whole shared arena from being freed, and ultimately to exhaustion of maps.
[2025-05-24T07:00:02,721][WARN ][o.o.i.s.RemoteStoreRefreshListener] [24b9ade6de8c7c3be1382651135d04a1] [vpc-flowlogs-2025.05.24-000673][0] Exception while uploading new segments to the remote segment store
java.io.IOException: Map failed: MemorySegmentIndexInput(path="/hdd1/mnt/env/root/ES-PATH/var/es/data/nodes/0/indices/index-uuid/0/index/_6yh.cfs") [this may be caused by lack of enough unfragmented virtual address space or too restrictive virtual memory limits enforced by the operating system, preventing us to map a chunk of 2714694334 bytes. Please review 'ulimit -v', 'ulimit -m' (both should return 'unlimited'), and 'sysctl vm.max_map_count'. More information: https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html]
at java.base/sun.nio.ch.FileChannelImpl.mapInternal(FileChannelImpl.java:1319)
at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1218)
at org.apache.lucene.store.MemorySegmentIndexInputProvider.map(MemorySegmentIndexInputProvider.java:134)
at org.apache.lucene.store.MemorySegmentIndexInputProvider.openInput(MemorySegmentIndexInputProvider.java:76)
at org.apache.lucene.store.MemorySegmentIndexInputProvider.openInput(MemorySegmentIndexInputProvider.java:33)
at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:394)
at org.opensearch.index.store.FsDirectoryFactory$HybridDirectory.openInput(FsDirectoryFactory.java:181)
at org.apache.lucene.store.FilterDirectory.openInput(FilterDirectory.java:101)
at org.apache.lucene.store.FilterDirectory.openInput(FilterDirectory.java:101)
at org.opensearch.index.store.Store$MetadataSnapshot.checksumFromLuceneFile(Store.java:1213)
at org.opensearch.index.store.Store$MetadataSnapshot.loadMetadata(Store.java:1183)
at org.opensearch.index.store.Store.getSegmentMetadataMap(Store.java:392)
at org.opensearch.index.shard.IndexShard.computeReplicationCheckpoint(IndexShard.java:1887)
at org.opensearch.index.shard.RemoteStoreRefreshListener.syncSegments(RemoteStoreRefreshListener.java:251)
at org.opensearch.index.shard.RemoteStoreRefreshListener.performAfterRefreshWithPermit(RemoteStoreRefreshListener.java:159)
at org.opensearch.index.shard.ReleasableRetryableRefreshListener.runAfterRefreshWithPermit(ReleasableRetryableRefreshListener.java:167)
at org.opensearch.index.shard.ReleasableRetryableRefreshListener.lambda$scheduleRetry$2(ReleasableRetryableRefreshListener.java:127)
at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:964)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
With IOContext.DEFAULT, we are seeing exhaustion of maps. System limits are as below:
❯ ulimit -m
unlimited
❯ ulimit -v
unlimited
❯ sysctl vm.max_map_count
vm.max_map_count = 1048576
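For reference, a small sketch of how mapping usage can be tracked against vm.max_map_count on Linux (it counts the mappings of whatever JVM runs it; for the OpenSearch process itself one would read /proc/<pid>/maps instead):

import java.nio.file.Files;
import java.nio.file.Path;

public class MapCount {
  public static void main(String[] args) throws Exception {
    // Each line in /proc/self/maps is one mapping; the kernel rejects new
    // mmap calls once the count reaches vm.max_map_count (1048576 here).
    long mappings = Files.lines(Path.of("/proc/self/maps")).count();
    long maxMapCount = Long.parseLong(
        Files.readString(Path.of("/proc/sys/vm/max_map_count")).trim());
    System.out.printf("mappings=%d, vm.max_map_count=%d%n", mappings, maxMapCount);
  }
}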
To work around this we tried -Dorg.apache.lucene.store.MMapDirectory.sharedArenaMaxPermits=1. But surprisingly, we saw the same issue again after some time.
We had to disable the MemorySegment index input (-Dorg.apache.lucene.store.MMapDirectory.enableMemorySegments=false) to get past this. But this fallback option is no longer available in the latest Lucene version.
Does anyone have any pointers on how to approach this issue? One option we are exploring is to switch back to IOContext.READONCE by closing the IndexInput in the same thread that opened it. However, even the read is performed in a different thread, so we are not sure READONCE will work for us.
Is there any way we can make MemorySegmentIndexInput work for us?
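For illustration, this is the kind of change we are considering: a hypothetical helper (our naming, not existing OpenSearch code) that confines open, read, and close to the calling thread so READONCE could be used at least for the checksum path:

import java.io.IOException;

import org.apache.lucene.codecs.CodecUtil;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexInput;

public final class SameThreadChecksum {
  private SameThreadChecksum() {}

  // Open, read, and close entirely in the calling thread, so a READONCE
  // (thread-confined) mapping never has to be touched from another thread.
  public static long checksumOf(Directory dir, String fileName) throws IOException {
    try (IndexInput in = dir.openInput(fileName, IOContext.READONCE)) {
      return CodecUtil.retrieveChecksum(in);
    }
  }
}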
Version and environment details
OpenSearch Version - 2.19
Lucene Version - 9.12