-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Search Pipelines] Split processor namespace by type (request vs response vs phase) #7576
Labels
enhancement
Enhancement or improvement to existing feature or request
Search
Search query, autocomplete ...etc
v2.8.0
'Issues and PRs related to version v2.8.0'
Comments
msfroh
added
enhancement
Enhancement or improvement to existing feature or request
untriaged
Search
Search query, autocomplete ...etc
labels
May 15, 2023
msfroh
added a commit
to msfroh/OpenSearch
that referenced
this issue
May 17, 2023
In the initial search pipelines commit, I threw request and response processor factories into one combined map. I think that was a mistake. We should embrace type-safety by making sure that the kind of processor is clear from end to end. As we add more processor types (e.g. search phase processor), throwing them all in one big map would get messier. As a bonus, we'll be able to reuse processor names across different types of processor. Closes opensearch-project#7576 Signed-off-by: Michael Froh <froh@amazon.com>
6 tasks
msfroh
added a commit
to msfroh/OpenSearch
that referenced
this issue
May 17, 2023
In the initial search pipelines commit, I threw request and response processor factories into one combined map. I think that was a mistake. We should embrace type-safety by making sure that the kind of processor is clear from end to end. As we add more processor types (e.g. search phase processor), throwing them all in one big map would get messier. As a bonus, we'll be able to reuse processor names across different types of processor. Closes opensearch-project#7576 Signed-off-by: Michael Froh <froh@amazon.com>
msfroh
added a commit
to msfroh/OpenSearch
that referenced
this issue
May 19, 2023
In the initial search pipelines commit, I threw request and response processor factories into one combined map. I think that was a mistake. We should embrace type-safety by making sure that the kind of processor is clear from end to end. As we add more processor types (e.g. search phase processor), throwing them all in one big map would get messier. As a bonus, we'll be able to reuse processor names across different types of processor. Closes opensearch-project#7576 Signed-off-by: Michael Froh <froh@amazon.com>
msfroh
added a commit
to msfroh/OpenSearch
that referenced
this issue
May 22, 2023
In the initial search pipelines commit, I threw request and response processor factories into one combined map. I think that was a mistake. We should embrace type-safety by making sure that the kind of processor is clear from end to end. As we add more processor types (e.g. search phase processor), throwing them all in one big map would get messier. As a bonus, we'll be able to reuse processor names across different types of processor. Closes opensearch-project#7576 Signed-off-by: Michael Froh <froh@amazon.com>
github-project-automation
bot
moved this from 👀 In review
to ✅ Done
in Search Project Board
May 22, 2023
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
enhancement
Enhancement or improvement to existing feature or request
Search
Search query, autocomplete ...etc
v2.8.0
'Issues and PRs related to version v2.8.0'
Is your feature request related to a problem? Please describe.
In my initial PR for search pipelines, I copied the example of
IngestService
that maintains a map from processor names to their factories.Of course with search pipelines, we already have two different kinds of processors -- request processors and response processors. Thinking about something like the
RenameResponseProcessor
(that we may rename toRenameFieldResponseProcessor
) with a possible key ofrename
orrename_field
, I was thinking "What would we call a request processor that replaces all occurrences of a given field name with another name?" We would probably also call it something likerename_field
. Right now, you wouldn't be able to register two processors (of different types) with the same name.Describe the solution you'd like
I think
SearchPipelineService
should maintain separate maps for request and response processors (or any future processor type), effectively giving them their own namespaces. This also simplifies some of the logic around pipeline construction, since we no longer have to doinstanceof
checks to make sure that a processor with a given name corresponds to the correct type.The biggest challenge I see is that search pipeline code went out with OpenSearch 2.7 behind a feature flag. That included a
NodeInfo
change so that nodes provide a flat list of supported processors (so we can check on pipeline creation that all nodes in the cluster have the required processors, since they may be provided by plugins). With this change, we would also need to split that output to list supported processors by type. Since we're not going to change 2.7 (since it's been released), we would probably need to flatten the list in the unlikely event that we're called from a 2.7 node. Similarly, if we receive aNodeInfo
from a 2.7 node, I think we should pretend that all processors are (potentially) both request and response processors (since we have no way of knowing which they are).Describe alternatives you've considered
The main alternative (I think) is to let my existing mistake stand. Going forward, we would never be able to register two search pipeline processors with the same name, regardless of what part of the search flow is being acted upon.
We could also establish (and enforce) a naming convention where all processors are named according to their type (with a prefix or suffix).
The text was updated successfully, but these errors were encountered: