
[Feature Request] Use ml_inference request processor output in normalization-processor #16775

Open
wrigleyDan opened this issue Dec 4, 2024 · 3 comments
Labels: enhancement, Search

Comments

@wrigleyDan

Is your feature request related to a problem? Please describe

I'm currently facing the challenge of using the output of an ml_inference request processor in a normalization-processor as part of a hybrid search scenario.

In this particular case, the output of the ml_inference request processor is two floats. Using the output in a query_template is currently supported; for example, using the floats as field boost values works.

Example pipeline definition:

PUT /_search/pipeline/ml_inference_pipeline
{
  "description": "search with predictions",
  "request_processors": [
    {
      "ml_inference": {
        "function_name": "remote",
        "full_response_path": true,
        "model_id": "xBEzU5MBDpIvIC8uBBh1",
        "model_input": """{ "parameters": {"input": "${input_map.features}"}}""",
        "query_template": """{
          "query": {
            "multi_match": {
              "type": "best_fields",
              "fields": [
                "my_field_a^${keywordness}",
                "my_field_b^${neuralness}"
              ],
              "operator": "and",
              "query": "iphone"
            }
          },"explain": true
        }""",
        "input_map": [
          {
            "features": "query.term.features.value"
          }
        ],
        "output_map": [
          {
            "neuralness": "$.inference_results[0]output[0]dataAsMap.neuralness",
            "keywordness": "$.inference_results[0]output[0]dataAsMap.keywordness"
          }
        ],
        "ignore_missing": false,
        "ignore_failure": false
      }
    }
  ]
}

However, using the model output in a hybrid search scenario, with the floats as weights in the normalization-processor, does not work. The query returns results, but normalization is not applied.

PUT _search/pipeline/ml_inference_pipeline
{
  "description": "search with predictions",
  "request_processors": [
    {
      "ml_inference": {
        "function_name": "remote",
        "full_response_path": true,
        "model_id": "zRG2YpMBDpIvIC8uJxhB",
        "model_input": """{ "parameters": {"input": "${input_map.features}"}}""",
        "query_template": """{
          "_source": {
            "excludes": [
              "title_embedding"
            ]
          },
          "query": {
            "hybrid": {
              "queries": [
                {
                  "multi_match": {
                    "type": "best_fields",
                    "fields": [
                      "my_field_a^20",
                      "my_field_b^10"
                    ],
                    "operator": "and",
                    "query": "iphone"
                  }
                },
                {
                  "neural": {
                    "title_embedding": {
                      "query_text": "iphone",
                      "k": 50
                    }
                  }
                }
              ]
            }
          },
         "search_pipeline": {
           "phase_results_processors": [
             {
              "normalization-processor": {
                "normalization": {
                "technique": "l2"
              },
              "combination": {
              "technique": "arithmetic_mean",
              "parameters": {
                "weights": [
                  ${keywordness},
                  ${neuralness}
                ]
              }
            }
          }
        }
      ]
    },
    "size": 100
    }""",
        "input_map": [
          {
            "features": "query.term.features.value"
          }
        ],
        "output_map": [
          {
            "neuralness": "$.inference_results[0]output[0]dataAsMap.neuralness",
            "keywordness": "$.inference_results[0]output[0]dataAsMap.keywordness"
          }
        ],
        "ignore_missing": false,
        "ignore_failure": false
      }
    },
    {
      "neural_query_enricher": {
        "description": "one of many search pipelines for experimentation",
        "default_model_id": "i6jHTZMBflg_ePyfu9EK",
        "neural_field_default_id": {
          "title_embeddings": "i6jHTZMBflg_ePyfu9EK"
        }
      }
    }
  ]
}

Describe the solution you'd like

Ideally, I'd like to be able to use the output of an ml_inference request processor anywhere in the query, including in search pipeline processors.

Related component

Search

Describe alternatives you've considered

No response

Additional context

This is potentially related to opensearch-project/ml-commons#2841

Some testing was done as part of asking this question on the #ml OpenSearch Slack channel: https://opensearch.slack.com/archives/C05BGJ1N264/p1732270893567709

GitHub branch with a notebook-based example: https://github.com/o19s/opensearch-hybrid-search-optimization/blob/ml-inference-in-os/notebooks/ML%20Inference%20in%20OS.ipynb

@wrigleyDan added the enhancement and untriaged labels on Dec 4, 2024
@github-actions bot added the Search label on Dec 4, 2024
@msfroh (Collaborator) commented Dec 4, 2024

I think we can address this with the PipelineProcessingContext, which operates as request-level storage (essentially a HashMap) to propagate state/variables from one processor to another.

Essentially, I think we would need the following changes:

  1. In the ml_inference processor, add parameters to say "Store your output in a variable called foo" (where the name of the variable is configurable). Maybe everything from output_map could be pushed into the PipelineProcessingContext?
  2. In the normalization-processor, you need a way to read from the context. Since weights are expected to be numbers, one option could be to check for string values and see whether they look like variable substitutions (like "${keywordness}" and "${neuralness}"). If they do, read those variables from the processing context (see the sketch after this list).
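
A minimal sketch of what step 2 could look like, assuming the ml_inference processor has already stored its output_map values somewhere request-scoped (step 1). A plain Map stands in for the PipelineProcessingContext so the sketch stays self-contained, and all class and method names are illustrative, not existing neural-search code:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: contextAttributes stands in for whatever the
// ml_inference processor stored in the PipelineProcessingContext (step 1).
final class WeightPlaceholderResolver {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    // Accepts the configured weights, which may mix literal numbers and strings
    // such as "${keywordness}", and returns concrete floats for the combiner.
    static List<Float> resolve(List<Object> configuredWeights, Map<String, Object> contextAttributes) {
        List<Float> weights = new ArrayList<>();
        for (Object configured : configuredWeights) {
            if (configured instanceof Number) {
                weights.add(((Number) configured).floatValue());
                continue;
            }
            Matcher matcher = PLACEHOLDER.matcher(String.valueOf(configured));
            if (matcher.matches()) {
                Object stored = contextAttributes.get(matcher.group(1));
                if (stored instanceof Number) {
                    weights.add(((Number) stored).floatValue());
                    continue;
                }
                throw new IllegalArgumentException(
                    "Variable '" + matcher.group(1) + "' is missing from the processing context or is not numeric");
            }
            throw new IllegalArgumentException("Unsupported weight value: " + configured);
        }
        return weights;
    }
}

With the pipeline above, the context would then hold keywordness and neuralness entries, and resolving the configured weights "${keywordness}" and "${neuralness}" against it would yield the two concrete floats used for the arithmetic_mean combination.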

I think OpenSearch core already provides the required building block (the PipelineProcessingContext). The real work needs to happen in the https://github.com/opensearch-project/ml-commons/ and https://github.com/opensearch-project/neural-search repositories.

@martin-gaievski, @mingshl, @owaiskazi19 -- what do y'all think?

@martin-gaievski (Member)

I think this approach is generic enough; there isn't much coupling between the two processors.

I do have one concern regarding instantiation of the normalization processor. A new processor instance is created by the Processor.Factory, and validation of initial parameters like the weights is this factory's responsibility. The factory does not have access to the PipelineProcessingContext; from what I've seen, only the processor itself has such access. This means we would need to make the validation logic more complex and take into account all possible cases, e.g. the search pipeline being created by a separate call versus being part of the search request.
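
One way to picture this concern: the factory can only check the shape of each configured weight (a literal number or a ${variable} placeholder), and the strict numeric validation has to be deferred until request time, once the placeholders have been resolved against the context. A hypothetical sketch of that split, not the actual Processor.Factory contract:

import java.util.List;
import java.util.regex.Pattern;

// Hypothetical two-phase validation; the real factory and processor signatures
// are intentionally not reproduced here.
final class WeightValidation {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{\\w+\\}");

    // Pipeline-creation time: no PipelineProcessingContext exists yet, so only
    // the shape of each weight can be checked.
    static void validateAtCreation(List<Object> configuredWeights) {
        for (Object weight : configuredWeights) {
            boolean isNumber = weight instanceof Number;
            boolean isPlaceholder = weight instanceof String && PLACEHOLDER.matcher((String) weight).matches();
            if (!isNumber && !isPlaceholder) {
                throw new IllegalArgumentException("Weight must be a number or a ${variable} placeholder: " + weight);
            }
        }
    }

    // Request time: placeholders have been resolved, so numeric checks can run;
    // the [0, 1] range used here is illustrative.
    static void validateAtRequestTime(List<Float> resolvedWeights, int subQueryCount) {
        if (resolvedWeights.size() != subQueryCount) {
            throw new IllegalArgumentException("Expected " + subQueryCount + " weights, got " + resolvedWeights.size());
        }
        for (float weight : resolvedWeights) {
            if (weight < 0.0f || weight > 1.0f) {
                throw new IllegalArgumentException("Weight out of range [0, 1]: " + weight);
            }
        }
    }
}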

@owaiskazi19 (Member)

Good to see this feature ask. However, this is a very specific case, applicable only to reading the output of the ml_inference processor and passing it as input to the normalization processor.

As part of the enhancements planned for Search Pipelines, we have been discussing a generic solution for passing the output of one processor as input to another processor. The PipelineProcessingContext in the core is definitely a good place to act as a mediator between the processors. However, we will need to incorporate the output of one processor and convert it into a readable format that is compatible with the input requirements of the next processor.
cc: @minalsha @dbwiddis
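
A rough sketch of that generic direction, again with a plain Map standing in for the request-scoped PipelineProcessingContext and with invented names: the producing processor publishes its outputs under configured variable names, and any downstream processor reads them back and converts them into the type it needs.

import java.util.HashMap;
import java.util.Map;

// Illustrative only: class and method names are invented for this sketch.
final class ProcessorOutputMediator {

    // Request-scoped attributes, standing in for the PipelineProcessingContext.
    private final Map<String, Object> attributes = new HashMap<>();

    // Producing side (e.g. ml_inference): publish each output_map entry under
    // its configured variable name so downstream processors can see it.
    void publish(Map<String, Object> outputs) {
        attributes.putAll(outputs);
    }

    // Consuming side (e.g. normalization-processor): read a variable and convert
    // it to the expected type, failing loudly if it is absent or mistyped.
    float readAsFloat(String variableName) {
        Object value = attributes.get(variableName);
        if (value instanceof Number) {
            return ((Number) value).floatValue();
        }
        throw new IllegalArgumentException("No numeric value stored for '" + variableName + "'");
    }
}

The conversion step is the part highlighted above: the producing processor writes whatever its output_map yields, while the consumer needs a concrete type such as a float, so some adapter along the lines of readAsFloat has to sit in between.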
