Explore model inference options in OpenSearch #5

Closed
wrigleyDan opened this issue Nov 8, 2024 · 3 comments

@wrigleyDan
Collaborator

wrigleyDan commented Nov 8, 2024

OpenSearch supports a couple of models directly (e.g. linear regression models, see https://opensearch.org/docs/latest/ml-commons-plugin/algorithms/#linear-regression).

Together with the ML inference processor, we want to find out whether there is a way to run the dynamic model-driven approach as part of the main query.

That would mean a low-code integration for OpenSearch users.

The idea would be to

  1. index the training data (index features in an index)
  2. train a model based on the training data
  3. create a pipeline with the ML inference request processor
    a. the request processor takes the query together with its features as its input
    b. it generates a prediction as its output
    c. the prediction is the neural search weight; we can derive the keyword search weight from that value and use both weights in the hybrid search part (basically a result processor)
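
Step 3c can be sketched roughly as follows. This is only an illustration under assumptions that are not stated above: the two weights are assumed to sum to 1 (so the keyword weight is the complement of the predicted neural weight), the keyword subquery is assumed to come first in the hybrid query, and the helper names `hybrid_weights` and `normalization_pipeline` are hypothetical. The dictionary follows the shape of a search pipeline with a `normalization-processor`, which would still have to be sent to the cluster separately.

```python
import math


def hybrid_weights(neural_weight: float) -> tuple[float, float]:
    """Return (keyword_weight, neural_weight).

    Assumption: both weights sum to 1, so the keyword weight is the
    complement of the predicted neural weight.
    """
    if math.isnan(neural_weight):
        # Guard against the NaN predictions that would make the weights useless.
        raise ValueError("model returned NaN instead of a weight")
    # Clamp the prediction to [0, 1]; a regression model may overshoot.
    neural = min(1.0, max(0.0, neural_weight))
    return 1.0 - neural, neural


def normalization_pipeline(neural_weight: float) -> dict:
    """Build a search pipeline body that combines hybrid subquery scores.

    Assumption: the keyword subquery is listed first in the hybrid query,
    so its weight comes first in the weights array.
    """
    keyword, neural = hybrid_weights(neural_weight)
    return {
        "description": "Hybrid search with model-predicted weights (sketch)",
        "phase_results_processors": [
            {
                "normalization-processor": {
                    "normalization": {"technique": "min_max"},
                    "combination": {
                        "technique": "arithmetic_mean",
                        "parameters": {"weights": [keyword, neural]},
                    },
                }
            }
        ],
    }
```

Calling `normalization_pipeline(0.25)` would yield a body whose weights array is `[0.75, 0.25]`. Note that this sketch fixes the weights per pipeline, whereas the idea above is to set them per query.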

What would still live outside of OpenSearch is feature generation. Maybe that's alright as not everyone might use an identical set of features.

@wrigleyDan
Collaborator Author

wrigleyDan commented Nov 8, 2024

Reached out to the OpenSearch community to see how to overcome the current blocker of only getting NaNs when predicting via OpenSearch models:
https://opensearch.slack.com/archives/C05BGJ1N264/p1731077205560749

Created a GitHub issue after reaching out: opensearch-project/ml-commons#3210

wrigleyDan self-assigned this on Nov 20, 2024
@wrigleyDan
Collaborator Author

wrigleyDan commented Dec 3, 2024

With the changed approach (predict NDCG instead of neuralness), we are thinking about a custom pipeline. To be discussed in the ml-commons community meeting.

The ml_inference pipeline is probably the one to use as a starting point.
https://opensearch.org/docs/latest/search-plugins/search-pipelines/ml-inference-search-request/
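
One possible reading of the changed approach, offered purely as an assumption since the details are not spelled out here: instead of predicting the neural weight directly, the model predicts NDCG for a query at a given neural weight, and the pipeline picks the candidate weight with the highest prediction. The function name `best_neural_weight` and the `predict_ndcg` callable are hypothetical stand-ins for whatever the custom pipeline would call.

```python
from typing import Callable, Mapping, Sequence


def best_neural_weight(
    predict_ndcg: Callable[[Mapping[str, float], float], float],
    query_features: Mapping[str, float],
    candidates: Sequence[float],
) -> float:
    """Return the candidate neural weight with the highest predicted NDCG.

    predict_ndcg is a hypothetical stand-in for a model call; it takes the
    query features and one candidate weight and returns a predicted NDCG.
    Raises ValueError if candidates is empty.
    """
    return max(candidates, key=lambda w: predict_ndcg(query_features, w))


def toy_model(features: Mapping[str, float], weight: float) -> float:
    # Toy predictor for illustration only: predicted NDCG peaks at 0.6.
    return 1.0 - (weight - 0.6) ** 2


candidates = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0
chosen = best_neural_weight(toy_model, {"num_terms": 3.0}, candidates)
```

With the toy predictor, `chosen` is `0.6`. A real model would replace `toy_model`, and the selected weight could then feed the hybrid search weights.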

@wrigleyDan
Collaborator Author

Feature request opened and discussed in the Search Community Triage meeting on 12/05:
opensearch-project/OpenSearch#16775
