Add support for reloading the SPI for KnnVectorsFormat class #13393
Comments
I have raised a PR for the fix: #13394 |
My only concern: why is this necessary? Is feature parity the only reason? |
Currently, in my application I want to extend KnnVectorsFormat, just like Codec, to plug in a completely different implementation of the vector search algorithm (one that is not written in Java). KnnVectorsFormat is part of Codec, which allows applications built on top of Lucene to extend the codec and its formats. But because KnnVectorsFormat lacks support for reloading its SPI, this is not possible. The change is straightforward and only adds extensibility support to KnnVectorsFormat. @benwtrent please let me know if you have further questions. |
I still don't understand. If the SPIs are there, it should work. If the SPI extension didn't work, none of the current formats would work either as they all rely on it. Why doesn't your application work? |
@benwtrent In my case, I need to load some other jars at runtime (consider them plugins) that contain the KnnVectorsFormat classes. Because these classes are loaded at runtime, when an instance of a KnnVectorsFormat class (an implementation present in the loaded plugins) is created via SPI, the SPI cannot find the class, as they are not present in the This is the same reloading of SPI via classpaths that is present for Codec and DocValuesFormat too, as I referenced in the description. Ref: |
The dynamic addition of formats from other class loaders seems reasonable, though I don't have a use case for it myself. Maybe I'm missing something but before adding a new API, is it not possible to implement the functionality through the existing |
One use case for this is OpenSearch, which loads plugin jars at runtime; the plugins contain classes (Codec, DocValuesFormat, etc.) that are initialized via SPI. For use cases like this, a public API to reload the SPI from a given class loader is very useful. OpenSearch already does exactly this for DocValuesFormat and Codec. Ref: https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/plugins/PluginsService.java#L763-L765
Actually, I implemented it exactly this way, where I have my Now during indexing everything works well. But when the IndexReader is opened, it needs to create an instance of I hope this clarifies. |
Is it fair to say that this change just brings
These other formats are dynamically pluggable, and this change is just making |
Yes that is correct. |
Thanks for the clarification, @navneet1v. I didn't know folks were dynamically loading jars for different vector formats. The idea sounds good to me. I haven't reviewed the PR yet; I just wanted to make sure we weren't adding code without a practical, current use. |
@benwtrent Sounds good to me. Will wait for your review on the PR. |
Thanks for the explanation - sounds like a reasonable idea. |
@benwtrent, @ChrisHegarty since we seem to agree that this is a valid use case, could you please review PR #13394? I would like to get this feature into Lucene 9.11. |
Thanks for bringing this up. Unfortunately we have no way to enforce this automatically for our NamedSPI providers (as these are static methods), but we should keep all of them in sync. At the moment the getter for the list of names is also missing. I already argued about this in the other PR. The test framework uses an expensive extra walk through ServiceLoader to get those names. This should also be brought in line with the other codec/analysis components. I will open a PR for this. The code in the test framework (the part that randomly chooses a KNN format) is inconsistent and duplicates existing logic in a different way. |
See #13428. This also fixed the concern in the other PR about SPI-named similarity functions.
@uschindler, when raising the PR I noticed this and wondered whether it was needed. To keep the scope of the PR minimal, I didn't add the getter method. I see that you have raised a PR for it. |
Description
Lucene uses SPI to obtain instances of various classes such as Codec and KnnVectorsFormat.
Currently the Codec class provides a way to reload its SPIs through a static method that takes a ClassLoader. Ref: https://github.com/apache/lucene/blob/branch_9_10/lucene/core/src/java/org/apache/lucene/codecs/Codec.java#L126-L137
Similar functionality is not present in KnnVectorsFormat, however (I checked both main and branch_9_10). Ref: https://github.com/apache/lucene/blob/branch_9_10/lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsFormat.java
I am not sure whether this is an oversight or whether there is some other way to reload the SPI of the KnnVectorsFormat class.
Solution
What I am looking for is support for an SPI reload function in the KnnVectorsFormat class too, so that external libraries and applications can load their own KnnVectorsFormat implementations.
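To make the request concrete, here is a minimal sketch of a name-keyed SPI registry with a reload method, using only `java.util.ServiceLoader`. `NamedFormat`, `FormatRegistry` and `ReloadSketch` are hypothetical names made up for illustration; this is not Lucene's code, and the actual change may look different. It assumes that formats are looked up by name and that a reload only adds newly discovered providers without replacing ones that are already registered.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.ServiceLoader;
import java.util.Set;

// Hypothetical stand-in for a named, SPI-loaded format such as KnnVectorsFormat.
interface NamedFormat {
    String getName();
}

// Simplified sketch of a name-keyed SPI registry (an assumption, not Lucene's loader).
final class FormatRegistry {
    private volatile Map<String, NamedFormat> services = Collections.emptyMap();

    // Scan the given class loader (e.g. one created for plugin jars at runtime)
    // and register every provider it exposes.
    synchronized void reload(ClassLoader loader) {
        merge(ServiceLoader.load(NamedFormat.class, loader));
    }

    // Add discovered providers; a name that is already registered keeps its provider.
    synchronized void merge(Iterable<NamedFormat> discovered) {
        Map<String, NamedFormat> updated = new LinkedHashMap<>(services);
        for (NamedFormat format : discovered) {
            updated.putIfAbsent(format.getName(), format);
        }
        services = Collections.unmodifiableMap(updated);
    }

    // Lookup by name, as a reader would do when it opens a segment.
    NamedFormat forName(String name) {
        NamedFormat format = services.get(name);
        if (format == null) {
            throw new IllegalArgumentException(
                "No format named '" + name + "'; available: " + services.keySet());
        }
        return format;
    }

    Set<String> availableNames() {
        return services.keySet();
    }
}

class ReloadSketch {
    public static void main(String[] args) {
        FormatRegistry registry = new FormatRegistry();
        // In a real application this would be the plugin's class loader.
        registry.reload(ClassLoader.getSystemClassLoader());
        System.out.println("Formats after reload: " + registry.availableNames());
    }
}
```

Without a reload hook like this, a lookup by name fails for any provider whose jar was not visible when the registry was first populated, which matches the reader-side failure described in the comments.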
I am willing to pick up this issue and contribute back.
@benwtrent please let me know your thoughts on this