-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move away from file watcher for releasing memory #1885
Comments
hi @jmazanec15 i am writing a DocValuesProducer for KNN80Codec for save store usage. i like this issues very much. it can release memory efficiently. i can help contribute my DocValuesProducer if you like |
Is the lifecycle of native memory(native knn index) same as the lifecycle of DocValuesProducer or KnnVectorsReader? Does |
Awesome! What custom functionality did you add in DVProducer to reduce storage? I think the change for this issue would be quite small so I do not see it conflicting.
DocValuesProducer or KnnVectorsReader are both per segment. They two producer/reader classes are opened (or shared) in https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/index/SegmentReader.java ctor. On close, the close methods will eventually be called:
This reader will be required in the search path: https://github.com/opensearch-project/k-NN/blob/main/src/main/java/org/opensearch/knn/index/query/KNNWeight.java#L206 |
in #1571 we can save the storage for and i think we can save the docValues storage like #1571 (comment) we create 2 types docValues in Lucene engine. also we can get docValues from native engines, in faiss we can use |
@jmazanec15 Right. However, does producer/reader classes are singleton and stay open until index is closed or deleted? |
@luyuncheng thats interesting. I think @navneet1v may have been thinking around that. We are in process of migrating away from Custom Doc Values to KNNVectorFormat. See #1855. Would you be able to do it with the KNNVectorReader instead of doc values? |
@heemin32 Oh I think I see what you are saying. Will there be multiple readers/producers per segment/search. I think the purpose of SegmentCoreReaders is to ensure that they can be shared. Worth noting that for lucene, the input is closed as well (see https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsReader.java#L325). |
Sharing is good but one other thing is whether the reader get closed and opened repetitively regardless the existence of segment file. |
Im not sure why this would happen. Need to be sure though before changing. |
I am don't know how it works internally but based on this comment, it seems like multiple class of SegmentReader can be created. https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/index/SegmentReader.java#L128
|
@jmazanec15 thanks for opening up the issue and I see already some great ideas on the gh issue. Here is my true sense here:
|
Right, refCounting would prevent issues, but make removal a little trickier. |
Read through the issue here - My proposal would be to separate the concerns and instead follow this pattern to handle lifecycle events
Thoughts? |
@kotwanikunal My issue with the index life cycle is that we would also need to keep the file watcher code, which I dont like because its a direct dependency on file system and directory implementation. Thus, if we are able to properly ref-count and free on segment docvalues producer code, then we will be able to solve both issues at one go. |
I see the problem now. We have to track at the file level and not the shard level. I was wondering if we could simply filter cache entries on receiving the shard lifecycle events and evict them (in a way - skip the file watcher all together) - but the cache operates at a more granular level. |
@kotwanikunal when we have i think
|
Before cutting out the absolute file path dependency in KNN, this should be resolved first before this issue : #2033 |
After a deep dive, we can safely remove FileWatcher. The issue it was trying to solve has already been addressed. |
Removing FSDirectory Dependency in OpenSearch KNN1. GoalThis document outlines the rationale for deprecating 2. BackgroundIssue : Move away from file watcher for releasing memory #1885 A project (#2033) is underway to introduce a loading and writing layer within native engines — FAISS and NMSLIB. This layer provides a read stream that wraps Lucene’s However, even with this change, there is still places where they rely on an absolute file path — Therefore, in order to entirely eliminate 3. Problem DefinitionsThe current approach of using FileWatcher has two problems.
4. Solution - Free LunchFortunately, luyuncheng has already done a great job in #1946 where it cleans up allocated vector indices after receiving a close notification from Lucene. Now, all we need to do is just remove
4.2.3. Lucene's Doc Value Reader CycleAll doc value readers ultimately derive from the Codec. The KNN80DocValuesProducer
Lucene uses a reference counting mechanism to determine when a resource is safe to clean up. When the count reaches zero, it indicates the resource is no longer in use and can be safely closed. This mechanism is fundamentally same to For more information, refer to:
4.2.4. Twin of
|
@0ctopus13prime thanks for adding the idea of removing the file watcher. I like the idea presented here except this
Even after the competition of proposal we will still be left with FSDirectory dependency. Is this something that will be removed when you fix the write path? or this dependency will still be there? Another point is, with 2.17 version of opensearch we moved the Vectors getting stored as BinaryDocValues to FloatVectorValues/ByteVectorValues, so if we really want to remove the FileWatcher from code we have to fix this NativeEngines990KnnVectorsFormat -> Reader too so that when the readers are getting closed we can remove the graphs from cache. Another thing is, we should also remove the filePath as key from the cache and move to a better key which we can generate on the DVProducers and KNNVectorFormatReaders per vector field, to ensure we can remove graphs easily without taking any dependency on FSDirectory. |
@navneet1v And for this These are my todo list:
After these action items, there won't be FSDirectory dependencies. |
@0ctopus13prime I think overall proposal looks good. If you have code, could you raise a draft PR so I can take a closer look? |
@jmazanec15 |
@jmazanec15 |
The PR was merged! |
Description
Right now, we free memory for segments whose files get deleted by having a resource watcher watch for those files:
This file watching can actually be completely removed and it would do some good in simplifying the code (less static initialization, decoupling between cache manager and native memory load strategy, moving away from tight coupling of FSDirectory, etc). And additionally, how we do it now is a bug - see #1012.
To fix this, we just need to ensure that for either our DocValuesProducer and our KnnVectorsReader, when close is called, we evict any indices from memory (similar to how we do it now with the filewatcher). We would actually need to implement a really light weight DocValuesProducer that delegates for everything.
Related issues
#1012
The text was updated successfully, but these errors were encountered: