-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Experimental Codec Support #13992
base: main
Are you sure you want to change the base?
Experimental Codec Support #13992
Conversation
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
14ee1c0
to
03f1246
Compare
❌ Gradle check result for 14ee1c0: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #13992 +/- ##
============================================
+ Coverage 71.42% 71.64% +0.22%
- Complexity 59978 61336 +1358
============================================
Files 4985 5065 +80
Lines 282275 288206 +5931
Branches 40946 41743 +797
============================================
+ Hits 201603 206476 +4873
- Misses 63999 64677 +678
- Partials 16673 17053 +380 ☔ View full report in Codecov by Sentry. |
@sarthakaggarwal97 I am not onboard with this change, in my opinion:
On a general note, I think we should not be tight to the notion of "experimental codec in code" but provide the mechanism to configure inclusions / exclusions. Surely this is just my opinion, @dblock @andrross @msfroh any thoughts from you guys? Thanks! |
Agree with @reta's comment above. I still don't understand what is preventing us from implementing this capability within custom-codecs behind a setting defined by that plugin? |
@andrross the issue is that the codecs are validated and made available over at two places: EngineConfig and at CustomCodecService. The validation that happens in the EngineConfig, the codecs are available via NamedSPI which in turn gets the codecs / services from resources file. I don't think there is a way to load granular resources/services/codecs based on the cluster settings. We can definitely control whether to make the codecs available or not in CustomCodecService but I dont think there is way to make codecs not available in EngineConfig once loaded via resources. If we don't make a validation change (as in the PR), we would always allow the experimental codec (as the codec was loaded through the resources), as it will be present in the NamedSPI. If the same experimental codec is not available in CodecService, the shards will fail. If there is another way out by limiting the changes in custom-codecs plugin, we should go for it. Tagging @reta @backslasht @mgodwan @shwetathareja @dblock @msfroh @ankitkala to add more thoughts. |
So we have 2 issues here:
As you rightfully mentioned, Apache Lucene has no direct support of codecs filtering (the only way we could distinguish stable and non-stable codecs is by convention, Apache Lucene uses
We could narrow the scope and try to contain the change with |
@reta What is the behavior of the system if we just implement runtime checks based on the value of some "experimental" configuration setting that dynamically chooses the codecs to add in the CodecService here? https://github.com/opensearch-project/custom-codecs/blob/main/src/main/java/org/opensearch/index/codec/customcodecs/CustomCodecService.java#L47 |
@andrross that would not "hide" them from Apache Lucene: https://github.com/opensearch-project/custom-codecs/blob/main/src/main/resources/META-INF/services/org.apache.lucene.codecs.Codec |
Right, but would they be usable to an OpenSearch user if CodecService prevent access? |
Yes and no: the validation check will pass successfully for all codec related settings (those use Codec SPI), but it will fail later on if the codec is filtered out from |
yeah, exactly. The shard will fail in this scenario. @reta @andrross what should be the next course of action on this? |
@sarthakaggarwal97 I think we have to agree on 3 things in principle:
The decisions here would shape the solution. |
@reta I'm aligned with your suggestion. If we are able to restrict the experimental codec changes to custom-codecs, then we should go for it. Granular control over experimental codec would be ideal as well. |
Thanks @sarthakaggarwal97 , let us close ti and move the discussion to opensearch-project/custom-codecs#148 |
This PR is stalled because it has been open for 30 days with no activity. |
@@ -140,15 +140,19 @@ public Supplier<RetentionLeases> retentionLeasesSupplier() { | |||
return s; | |||
default: | |||
if (Codec.availableCodecs().contains(s)) { | |||
return s; | |||
if (!isExperimentalCodec(Codec.forName(s))) { | |||
return s; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can we just rename s
to something better?
This PR is stalled because it has been open for 30 days with no activity. |
Description
We currently do not have a way to mark and validate the codecs as experimental in OpenSearch. With this, we would be able to safely introduce experimental codecs for consumption.
Related Issues
Resolves #13723
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.