Does OpenSearch need a constant_keyword field type? #9981
Comments
I had been thinking about doing something with ingest pipelines to route incoming documents to different indices based on some predicate(s) on field values, and then route queries to the right indices using search pipelines. (This assumes that both index pipelines and search pipelines can be evaluated before finalizing on a target index.) I worked on something like that to add capacity for "hot documents" a few years back. With these On the other hand, if we can just select target indices based on a search-time predicate, that feels easier to me, I think. |
The experience for
|
Also, during search it is not zero cost on
|
Can you elaborate on this approach? Should target index selection happen during both search and ingestion? Ideally, the customer would not have to deal with multiple indices at all. |
I was imagining something where you could do e.g.
Essentially, the defined pipelines would route the index and search requests to the right indices. The user would need to define the pipelines appropriately, but wouldn't need to worry about routing after that. |
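To make the routing idea concrete, here is a minimal sketch in plain Python. It is not OpenSearch pipeline syntax, only the decision logic such pipelines would need to encode. The field name `category` and the value `road` are hypothetical placeholders; the index names `bicycles` and `other_bicycles` are the ones used elsewhere in this thread.

```python
from typing import Any, Dict, List

# Hypothetical routing rule: the field name and value are illustrative only.
ROUTING_FIELD = "category"
ROUTING_VALUE = "road"
MATCH_INDEX = "bicycles"
OTHER_INDEX = "other_bicycles"


def route_document(doc: Dict[str, Any]) -> str:
    """Ingest side: pick the target index from the document's field value."""
    if doc.get(ROUTING_FIELD) == ROUTING_VALUE:
        return MATCH_INDEX
    return OTHER_INDEX


def route_search(term_filters: Dict[str, Any]) -> List[str]:
    """Search side: pick target indices from an exact-match filter, if any."""
    if ROUTING_FIELD not in term_filters:
        # No predicate on the routing field, so both indices must be searched.
        return [MATCH_INDEX, OTHER_INDEX]
    if term_filters[ROUTING_FIELD] == ROUTING_VALUE:
        return [MATCH_INDEX]
    return [OTHER_INDEX]
```

A search that filters on the routing field touches one index; any other search falls back to both, which matches the efficiency point raised in this thread.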
@msfroh - The pipeline-based ingestion experience described above is much better and more seamless for users. It also addresses the efficiency concern by querying only the required index instead of all possible ones. |
To avoid having the user specify a search or index pipeline in each request, could we create an alias on top of bicycles/other_bicycles and have the pipelines apply to any indexing or search request made to that alias? |
Oh -- incidentally, it turns out that we already almost have the We already have OpenSearch/server/src/main/java/org/opensearch/index/mapper/ConstantFieldType.java Line 57 in 7b75fb4
Right now, the only implementation is OpenSearch/server/src/main/java/org/opensearch/index/mapper/IndexFieldMapper.java Line 65 in 5bb7fa3
Essentially, that's how the |
@hasnain2808 -- you expressed some interest in working on this one in our OpenSearch Lucene Study Group meeting (https://forum.opensearch.org/t/opensearch-lucene-study-group-meeting-monday-november-20th/16729/9). Can you please respond to this issue so we can assign it to you? It can only be assigned to a maintainer or someone who participates in the issue. |
Sure @msfroh |
I came across the constant_keyword field type added by Elasticsearch. The idea is simple: maintain two indices that partition documents by a specific value of a field. All documents with value X for field F go to index I1, and all documents with any other value go to index I2. A search can target both indices; a filter on field F then rewrites to MatchAll or MatchNone on index I1, depending on the filter value. In practice this is much more efficient than the default single-index approach, where the filter matches a large number of documents within the index.
That said, I have not come across many managed-service customers asking for something like this. I would like community feedback on whether this would be useful.
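As a rough illustration of the rewrite described above, the sketch below maps a term filter to `match_all`, `match_none`, or an unchanged `term` query depending on whether the index declares a constant for the field. The `CONSTANT_VALUES` table and the `rewrite_term_filter` helper are hypothetical names for this example, not existing OpenSearch APIs; only the query shapes follow the standard query DSL.

```python
from typing import Any, Dict

# Hypothetical per-index constants. I1 holds only documents with F == "X".
# I2 holds every other value, so it declares no constant for F.
CONSTANT_VALUES: Dict[str, Dict[str, str]] = {
    "I1": {"F": "X"},
    "I2": {},
}


def rewrite_term_filter(index: str, field: str, value: str) -> Dict[str, Any]:
    """Rewrite a term filter against an index that may hold a constant field."""
    constant = CONSTANT_VALUES.get(index, {}).get(field)
    if constant is None:
        # No constant declared: the filter must be evaluated per document.
        return {"term": {field: value}}
    if constant == value:
        # Every document in the index matches, so no per-document work is needed.
        return {"match_all": {}}
    # No document in the index can match.
    return {"match_none": {}}
```

For a filter `F = X`, I1 rewrites to `match_all` and I2 still evaluates the term query (and finds nothing); for any other value, I1 is skipped cheaply via `match_none`.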