[RFC] Configurable Automation for OpenSearch AI Use Cases #9213

Open
dbwiddis opened this issue Aug 9, 2023 · 29 comments
Labels: RFC (Issues requesting major changes) · Roadmap:Ease of Use (Project-wide roadmap label) · v2.12.0 (Issues and PRs related to version 2.12.0)

Comments

@dbwiddis (Member) commented Aug 9, 2023

Note: This RFC is for a back-end framework and templates that enable it. See No-code designer for AI-augmented workflows for the corresponding front-end RFC.

Proposal

The current process of using ML offerings in OpenSearch, such as Semantic Search, requires users to handle complex setup and pre-processing tasks and to send verbose queries, both of which can be time-consuming and error-prone.

The directional idea is to provide OpenSearch users with use case templates, which give a compact description (e.g., a JSON document) of configurations for automated workflows such as Retrieval-Augmented Generation (RAG), AI connectors, and other components that prime OpenSearch as a backend to leverage generative models. Once primed, builders can query OpenSearch directly without building middleware logic to stitch together data flows and ML models.

While a range of pre-configured templates will be available, we also envision a no-code drag-and-drop frontend application similar to LangFlow, Flowise, and other offerings, which would enable quick editing of these templates for rapid prototyping. This no-code frontend application will be covered in a separate RFC, but it is important to note that these templates will be written in user-readable form (e.g., JSON or YAML), so use of the no-code builder is not required for the backend framework to use them.

Once a workflow is uploaded to OpenSearch, it will enable a streamlined API call with only a minimal set of (required and optional) parameters provided by the user.
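
As a purely illustrative sketch (the endpoint path, workflow ID, and field names below are hypothetical, not a committed API), such a streamlined call might look like:

POST /_plugins/<backend_plugin>/workflow/<workflow_id>/_execute
{
  "index": "my-docs-index",
  "query_text": "glassware arrived broken and support never responded"
}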

This proposal is the collaboration of the following additional contributors:

Goals and Benefits

The goal is to improve the developer experience we created in 2.9 for building semantic search and GenAI chatbot (RAG) solutions, as well as future capabilities under development, by creating a framework that simplifies these tasks, hiding the complexity and enabling users to leverage ML offerings seamlessly.

This framework will provide a complete solution, streamlining the setup process with an easy-to-use interface for creating and accessing workflows that support ML use cases. Using the framework, builders can create workflows leveraging OpenSearch and external AI apps to support visual, semantic, or multi-modal search, intelligent personalization, anomaly detection, forecasting, and other AI-powered features more quickly and easily. End-user app integrations will be greatly simplified and easier to embed in third-party app frameworks.

These templates help users automatically configure search pipelines, ml-commons models, AI connectors, and other lower-level framework components through declarative-style APIs. Once a template successfully executes, the OpenSearch cluster should be fully configured and the vector database hydrated, allowing an app developer to run direct AI-enriched queries. Since the API is built on framework components (neural search, the ML framework, AI connectors, and search pipelines), there is no middleware such as LangChain microservices to manage. In some cases the ingest process may require sophisticated customizations, so templates might instead leave a builder with an environment in which to touch up workflows and control over the right time to hydrate OpenSearch.

Background

Between OpenSearch 2.4 and 2.9, we released a number of platform features that simplify the creation of semantic search apps on OpenSearch. Our intent is to continue building on these features to improve the user experience, performance, and economics. More importantly, we want to enhance our framework so that it is not limited to semantic-search-type use cases. One of our goals is to support a broader set of AI use cases suited for OpenSearch, for example the upcoming Conversational Memory (Chat History) API and Conversation Plugin. Currently, our framework is built around the following:

  1. ML Framework (ml-commons plugin): the ML framework provides the ability to manage and serve models locally and integrate with externally managed models.
  2. AI connectors (ml-commons plugin): the ML framework is extensible and allows integrators to build AI connectors between OpenSearch and 3rd-party AI services and platforms. As of 2.9, we have connectors for Amazon SageMaker Hosting, OpenAI ChatGPT, Cohere Rerank and Cohere Embed. Connectors for Amazon Bedrock Text and Amazon Bedrock Multi-modal APIs are planned in the next couple of months.
  3. K-NN plugin: the k-nn plugin allows users to index vectors generated by embedding models alongside metadata to support vector and hybrid search.
  4. Neural Search (neural-search plugin): neural search is currently designed to provide a high-level interface for semantic search. It provides a text-based search interface and APIs to hydrate k-nn indexes via an integrated text embedding model. The plugin uses the ML framework to integrate and serve text embedding models, allowing it to perform the text encoding on behalf of the developer. Thus, unlike the k-nn plugin, the developer doesn’t have to interface with OpenSearch through vector queries and index commands; the user interfaces through natural language queries and text input.
  5. Search and Ingest Pipelines: The difference between a large number of AI-powered search and analytics use cases boils down to how queries are processed. For example, a semantic search query workflow will look something like the diagram below. There might be slight variations; for example, instead of asking a question, I may want to specify a text document in an index and ask for the 10 most similar documents.
[Diagram: semantic search query workflow]

RAG extends the semantic search query workflow. The first part of RAG is the retrieval workflow, which is the same as the semantic search query flow, but instead of returning similar documents to the user it sends them to a generative LLM. The LLM processes those results to return a modified response (e.g., a summarization or recommendations as a conversational response).

[Diagram: RAG query workflow]

Search pipelines provide APIs to configure query workflows based on painless scripts. In 2.9 we leverage search pipelines to support the RAG workflow.
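
For reference, a search pipeline is created through a REST call of roughly the following shape; this sketch uses the built-in filter_query request processor and is illustrative only:

PUT /_search/pipeline/my_pipeline
{
  "request_processors": [
    {
      "filter_query": {
        "description": "Restrict results to publicly visible documents",
        "query": {
          "term": { "visibility": "public" }
        }
      }
    }
  ]
}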

These framework components provide many of the building blocks required to assemble an AI app on OpenSearch. There are still some feature gaps and variations between use cases, which depend on future OpenSearch capabilities. The initial design must anticipate this potential to extend and evolve as new features are added.

High Level Design

The design will center around use case templates, which describe the sequence of execution of building blocks in a Directed Acyclic Graph (DAG). Both series and parallel execution will be supported. Each building block implements a logical step in a workflow. Example building blocks include the following (a sketch of how one such block might be described appears after the list):

  • Start point/User Prompt: Takes specified user input which will include required and optional fields specified by the template.
  • Query an OpenSearch API. The building block would pre-populate the request body, with a user able to override some or all of the fields. Some fields will be mandatory and provided as input in the workflow. The response to the API call will provide output which can be used by the next building block in the workflow.
  • Query an external Web Service API. Conceptually similar to OpenSearch API but allows leveraging third-party APIs.
  • Process embeddings using a specified model.
  • Execute an existing search pipeline, ingest pipeline, or previously saved template workflow.
  • Process text, including filtering or modifying fields, tokenizing, etc.
  • Implement conditional logic, such as “query this API until the response status field matches the required value”.
  • End point: Returns information to the user.
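
As a sketch only (the step type and field names are hypothetical), a single "Query an OpenSearch API" building block within a template might be described as follows; the text_embedding ingest processor shown in the body is provided by the neural-search plugin:

{
  "id": "create_embedding_pipeline",
  "type": "opensearch_api",
  "inputs": {
    "method": "PUT",
    "path": "/_ingest/pipeline/embedding_pipeline",
    "body": {
      "processors": [
        {
          "text_embedding": {
            "model_id": "<model_id>",
            "field_map": { "review_body": "review_embedding" }
          }
        }
      ]
    }
  },
  "outputs": ["pipeline_id"]
}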

Drag-and-drop editors under consideration depend on ReactFlow. While we are considering using an existing one, we may also develop our own, likewise based on ReactFlow. The exported flow takes the form of a JSON object with fields that we can implement as necessary. A minimal default flow with two nodes connected by one edge shows the format identifying the graph underlying the workflow:

{
  "nodes": [
    { 
      "id": "1",
      "data": { 
        "label": "Node 1"
      },
      "position": {
        "x": 100,
        "y": 100
      }
    },
    { 
      "id": "2",
      "data": {
        "label": "Node 2"
      },
      "position": {
        "x": 100,
        "y": 200
      }
    }
  ],
  "edges": [
    {
      "id": "e1-2",
      "source": "1",
      "target": "2"
    }
  ]
}

For more detailed example templates, consider this example in Flowise using Anthropic Claude to ingest a document, or this basic example in Langflow. We intend to use a similar format, and potentially enable the ability to parse/import formats from these and other popular no-code builders.

These templates can be generated using a no-code editor, or manually constructed/edited using the required fields. The interaction between a no-code front end and the execution layer in a plugin is shown here.

[Diagram: interaction between a no-code front end and the plugin execution layer]

Template Fields

The example below outlines potential fields that we will likely include. Key components include version compatibility (some APIs require minimum OpenSearch versions), what client interfaces/APIs are needed (to permit validation of appropriate plugin installation), what external connectors are required, what integrations may be used, and other definitions. Some workflows associated with common use cases such as Retrieval-Augmented Generation (RAG) will be available from the backend framework by name, offering a streamlined template definition, while others may be assembled by the user from a selection of building block nodes.

Use Case: Generative AI Chatbot
Description: 
    This template can be used to configure and prime OpenSearch for building a Generative AI Chatbot. 
    Once this template completes execution, the user will have a fully configured OpenSearch environment 
    to begin creating a Generative AI Chatbot. The next step is to hydrate OpenSearch by running the ingest 
    pipeline that has been configured for you using the XYZ API. Once that is completed you can start
    building your application on the neural search APIs.
Version: 
    template: 1.0
    compatibility: [2.9, 3.0]
Client Interfaces:
    neural_search: [field_name1:[text, required], field_name2:[binary, optional]...]
User_Parameters:
    index_fields: [map, required]
Connectors:
    embeddings: $connector.embeddings[${credential.key}]
    generative_llm: amazon_bedrock_v1[${user_params.engine=Claude2},${credential.iam_role}]
Data_Schema: 
    index_mappings: ${user_parameters.index_fields}
    prompt_template: 
        """Use the following pieces of context to answer the question at the end. If you don't know the answer,
        just say that you don't know, don't try to make up an answer. Use three sentences maximum. Keep the
        answer as concise as possible. Always say "thanks for asking!" at the end of the answer.
Workflows:
    Query:
        retrieval_augmented_generation: 
            retriever: 
                knn:
                    encoder: $connectors.embedding
                    hybrid_scoring: yes
                    k: 10
                lexical: 
                   ranker: BM25
                   limit: 25
                   max_tokens: 4096
        generator: connectors.amazon_bedrock_v1
    Ingest:
        engine: opensearch_ingest_service | apache_spark | custom
        data_source: s3://...
        pipeline: 
            cmd: xyz[${user_parameters.pipeline_inputs}]  # just an RPC
            pre_processors: [text_splitter...], 
            vector_encoder: [$connectors.embedding[mode=batch]],  
            post_processors: [...]
            index:
                mappings: data.index_mappings
                knn: 
                    engine: faiss
                    algo: hnsw
                        params: ...

Open Questions

  • How do we protect the cluster from resource-intensive workflows? Can we throttle execution time or prevent execution based on indicators of pressure on the cluster?
  • How do workflows interact with OpenSearch security features? Can users share workflows? Can we simplify permissions with some default roles?
  • While templates can be stored in an OpenSearch index, we need a place to persist templates when working offline, importing from a library, etc. Initially we can store these on GitHub, but would like to enable easier searching of these templates.
  • Workflows need to be validated. Some validation can take place in the front end (e.g., validating models or connectors) while some must be done in the backend (e.g., validating whether required plugins are installed). We need to develop a comprehensive approach.
  • If a workflow fails in a partially-executed state, is there an ability to undo what it has already done?
  • Which plugin has the responsibility for "registering" workflows with the automation framework? Can we leverage existing features (e.g., RestController to test if API exists) or should we add a new extension point in OpenSearch?
dbwiddis added the "RFC (Issues requesting major changes)" label on Aug 9, 2023
@dbwiddis (Member Author) commented Aug 9, 2023

Proposed Implementation

The backend will contain a minimal Processor interface taking a map parameter to pass the necessary context between workflow steps. In particular, the output of one step needs to be available as the input for the next step. Other configuration may need to persist across multiple steps, and partial results may be stored during execution to provide status if requested. Processors will be similar to (and in some cases link directly to) Search Processors and Ingest Processors. If a processor needs to make a REST API call as part of its execution, it can do so similarly to this Kendra Ranking Processor.
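
To illustrate (keys and values here are hypothetical), the context map handed from one processor to the next might accumulate entries such as:

{
  "workflow_id": "semantic_search_setup",
  "completed_steps": ["create_connector", "register_model"],
  "create_connector": { "connector_id": "<connector_id>" },
  "register_model": { "model_id": "<model_id>", "model_state": "DEPLOYED" }
}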

Workflow validation and execution are demonstrated below for a Semantic Search example.

  1. The user invokes the REST API communicating the selected template (usually via the front-end).
  2. The backend plugin would validate installation of the required plugins and registration of the required APIs. Since this is a search use case, the plugins and APIs associated with Semantic Search, such as k-NN, neural-search, and ml-commons, need to be available (plugins installed and processors linking to the APIs registered).
  3. If the required APIs are not available, the user would receive a detailed error response explaining why (a hypothetical example of such a response follows this list). Otherwise the needed processors would be added to the context.
  4. Once validation is complete, the workflow/pipeline would be created and executed.
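
A hypothetical validation error response (the error type and wording are illustrative, not a defined API contract) might look like:

{
  "error": {
    "type": "workflow_validation_exception",
    "reason": "Required plugin [opensearch-knn] is not installed; cannot register processors for the semantic_search template"
  },
  "status": 400
}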

[Diagram: workflow validation and execution]

Workflow Sequence Diagram

The sequence below shows the complete interaction between user and OpenSearch.

[Diagram: workflow sequence]

  1. Use Case Selection: Users will interact with the ML Offerings Framework through the UI (probably on OpenSearch Dashboards). They can select a desired ML use case from the available options using a simple drag and drop mechanism. This will provide a default template with some customization available.
  2. Optional customization: After selecting the use case, users will be able to update details related to the use case, such as changing which model or connectors are used, including their own custom model or a pre-trained model provided by OpenSearch.
  3. Determine dependencies: Based on the chosen use case and user inputs, the frontend will map the use cases with required dependencies on OpenSearch such as required plugins, models, and connectors.
  4. REST API Interaction: The frontend will pass a JSON template including the workflow and dependencies to the Backend plugin using a REST API.
  5. Template Processing: The backend will store the template in a global context for the next steps.
  6. Template Validation: The backend will confirm the required plugins are installed, the requested dependencies are available, and the workflow properly parses.
  7. Workflow Creation: The backend will orchestrate necessary preprocessing steps using OpenSearch or Plugin APIs. These may include setting up search or ingest pipelines, deploying models or creating connectors, or any other setup requirements needed to enable future execution of the workflow.
  8. Workflow ID created: The workflow ID allows end users / application integrators to use a simplified API call to execute the workflow.
  9. An end user (via an application using the REST API) passes the workflow ID and any required parameters (and optionally additional parameters) to the back end to execute the workflow.
  10. The workflow provides a task ID to the user (the execute and status calls are sketched below this list).
  11. The user (or an application using a progress bar) can query the back end status API with the task ID to obtain progress of the workflow execution.
  12. In response to the status request with the task ID, the back end will return results to the user.
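
As a sketch of steps 9-12 (endpoint paths and response fields are hypothetical), the execute and status interaction might look like:

POST /_plugins/<backend_plugin>/workflow/<workflow_id>/_execute
{
  "index": "my-index",
  "query_text": "summarize recent customer complaints"
}

Response: { "task_id": "<task_id>" }

GET /_plugins/<backend_plugin>/workflow/task/<task_id>/_status

Response: { "state": "RUNNING", "steps_completed": 3, "steps_total": 5 }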

@dylan-tong-aws commented

@dbwiddis, thanks for putting together the RFC. I'd like to request that we adjust the name because I don't want to give the community the impression that we are building an application framework like Streamlit or business process automation software. We are building backend functionality, and we have no intention of creating a framework that makes it difficult for users to decouple app and data tier logic.

I propose we describe this work along the lines of a no-code designer and configurable automation for AI-augmented search and ingest pipelines. I also think there are at least two projects here. One is the no-code designer and one is the backend APIs.

@dbwiddis (Member Author) commented

I'd like to request that we adjust the name because I don't want to give the community the impression that we are building an application framework like Streamlit or business process automation software.

Naming things is hard! I agree we can find a better name, and I'm happy to discuss alternatives here.

I propose we describe this work along the lines of a no-code designer and configurable automation for AI-augmented search and ingest pipelines. I also think there are at least two projects here. One is the no-code designer and one is the backend APIs.

Yes, as mentioned in the first section, "This No-Code frontend application will be covered in a separate RFC". This RFC is for the templates and back-end framework.

"Configurable automation for AI-augmented search and ingest pipelines" is a rather verbose name that I think limits the scope of what we're doing here.

I do think "Configurable automation for AI-augmented workflows" could work?

dbwiddis changed the title from "[RFC] No-code Application Workflow Orchestration Framework" to "[RFC] Configurable automation for AI-augmented workflows" on Aug 10, 2023
@HenryL27 commented Aug 10, 2023

Questions about how this integrates with various things:

What would it take to make this a backend tool for Langflow or Flowise? Or whatever they call it. Will this make it easy to have OpenSearch-based AI blocks in my external AI workflow or is it an either/or - OpenSearchFlow vs Flowise?

It feels to me like there's a fair amount of overlap with Agent Framework, in terms of configuring complicated CoT/GenAI workflows. What should be done with that RFC versus with this one, and where do they meet?

@austintlee (Contributor) commented

Regarding what to call these things, how about OpenSearch [AI] Studio and OpenSearch [AI] Workflows? Because what I'm reading here sounds a lot like AWS Glue Studio and Glue Workflows. Glue also offers blueprints, which are constructs built on workflows, and I sense that's where this might be headed. I put "AI" as [optional] since none of this sounds specifically tied to AI workloads.

Is the backend going to live in OpenSearch ("core") itself or will it run as a separate thing?

Playing devil's advocate, I want to ask why this can't be all done using Apache Airflow (or something similar). I can see there being something new to bring for the frontend, but not seeing a compelling reason for the backend to be built anew unless it has to be deeply integrated inside the core?

@dylan-tong-aws commented Aug 10, 2023

It feels to me like there's a fair amount of overlap with opensearch-project/ml-commons#1161, in terms of configuring complicated CoT/GenAI workflows. What should be done with that RFC versus with this one, and where do they meet?

@ylwu-amzn, @dbwiddis. I agree. Configurable workflow support will provide a flexible but generic way to support variations of RAG with or without CoT, which may require multi-pass model invocations. The goal of our project is to leave it to the user to decide how they design the query pipelines and what belongs in application-tier logic. The intent of this feature isn't to be overly prescriptive; it is to leave app developers with full control over agent execution logic. IMHO, agents should be built in the app layer, and data stores like OpenSearch will provide users with the flexibility to determine how to execute AI-enriched information retrieval workflows for those agent applications. Let's chat.

What would it take to make this a backend tool for Langflow or Flowise? Or whatever they call it. Will this make it easy to have OpenSearch-based AI blocks in my external AI workflow or is it an either/or - OpenSearchFlow vs Flowise?

We are considering this, but there are some challenges: 1/ how do we provide compatibility and a seamless experience for LangChain backends that have functionality beyond the scope of OpenSearch (e.g., other vector stores)? 2/ how do we extend these frameworks beyond LLMs? There are a multitude of predictive time-series analytics use cases and non-LLM search use cases, like visual and multi-modal search, that our framework aims to support.

@dylan-tong-aws commented

Regarding what to call these things, how about OpenSearch [AI] Studio and OpenSearch [AI] Workflows? Because what I'm reading here sounds a lot like AWS Glue Studio and Glue Workflows. Glue also offers blueprints, which are constructs built on workflows, and I sense that's where this might be headed. I put "AI" as [optional] since none of this sounds specifically tied to AI workloads.

Hi Austin, the workflows that we are creating will cover ingest and query workflows. The latter is beyond what Glue does and is executed in real time within the OpenSearch engine (Search Pipelines). Secondly, we are not building or re-building any data processing engine. On the ingest side, we'll provide the option for users to describe ingest workflows that can be translated to some target engine like OpenSearch Ingestion or Apache Spark. We'll look to expand our ML Extensibility features so that the community can easily plug in alternative engines. Our framework handles the orchestration, integration, and a portable interface between supported engines.

Is the backend going to live in OpenSearch ("core") itself or will it run as a separate thing?

It will run on the cluster, either as a plugin or in core. As described in the RFC, we're using existing OpenSearch components, which are plugins (e.g., ml-commons) or core (e.g., Search Pipelines). The intent of this work is to simplify what could be done manually. Users could configure and build on the individual components in the RFCs (with enhancements) to accomplish the same thing. We feel we need to simplify the developer and user experiences.

Playing devil's advocate, I want to ask why this can't be all done using Apache Airflow (or something similar). I can see there being something new to bring for the frontend, but not seeing a compelling reason for the backend to be built anew unless it has to be deeply integrated inside the core?

I think we need to adjust the title of this RFC because I feared people might think we're building something like an Airflow alternative (@dbwiddis), and we are not. One way to think of this is that we are enhancing Search Pipelines. Unlike Airflow, our intent isn't to provide general-purpose batch-oriented workflows. We are building something exclusively for OpenSearch users. As well, Search Pipelines is not batch; it performs query-time processing. The ingest part could call out to Airflow, but we also don't want to build something that forces people to use Airflow with OpenSearch. As commented above, I am going to advocate for our team to make this component extensible and support multiple ingest job execution engines. A user could also build an Airflow workflow that uses this API to prep OpenSearch for a vector database hydration process. A user might choose this option because our intent is to provide a simpler, higher-level interface than if they were to build on lower-level components like the OpenSearch ingest APIs.

@dylan-tong-aws commented Aug 10, 2023

"Configurable automation for AI-augmented search and ingest pipelines" is a rather verbose name that I think limits the scope of what we're doing here. I do think "Configurable automation for AI-augmented workflows" could work?

I agree. But I think we need to be more specific than just "workflows" because I already see comments asking how this is different from general-purpose data processing and workflow engines.

Let's think of something that:
1/ clarifies that the scope is just OpenSearch and that we're not building a general-purpose workflow engine
2/ reflects that, now that I know this is the backend component, this is more than just workflows. We're automating the "priming" of OpenSearch to serve as a backend for specific AI use cases.

Some thoughts...
1/ Engine configuration automation: we do need to be clear early on in the RFC that we are initially focusing on AI use cases. I also think that if we are more specific than this in the naming, a user might think the feature is about automation for things like ISM.
2/ Configurable automation for OpenSearch AI workflows? It should be clear that the scope is OpenSearch and AI. However, this is more than workflows. The core intent is to prime the system for specific use cases, and I don't think we capture that intent in the name.
3/ Backend configuration/priming automation for AI apps?

@dblock (Member) commented Aug 10, 2023

One way to think of this is we are enhancing Search Pipelines.

This is a thoughtful proposal. After reading it I asked myself why this is not "just an enhancement over search pipelines". So that seems like the best incremental approach to take. That should solve a lot of the open questions, such as storage.

@dylan-tong-aws commented

I asked myself why this is not "just an enhancement over search pipelines".

We could consider calling this project that if it's easier for people to understand the intent. A lot of the work revolves around creating search processors, making search pipelines configurable, and improving usability. However, there are also ingest workflows and extensibility elements that are beyond the scope of search pipelines.

@dbwiddis (Member Author) commented

Responding to @HenryL27 :

What would it take to make this a backend tool for Langflow or Flowise?

Both of those products (and many others) depend on ReactFlow. Accordingly, their workflows are in the same JSON format cited in the examples earlier here (Flowise) and here (Langflow). From the backend framework's perspective, as long as the same fields are present in either product, we could use either one.

The actual selection of a front-end UI will be covered in another RFC, but just as a preview, to answer "what would it take":

  • Define the template requirements (this RFC)
  • Build modules that encode these requirements in Flowise, for that userbase
  • Build modules that encode these requirements in Langflow, for that userbase
  • Build modules that encode these requirements in chaiNNer, for that userbase
  • Build modules that encode these requirements in Automa (Patterns), for that userbase
  • Build modules that encode these requirements in future front-ends.

Certainly this can be done quickly and easily, but it would then also require OpenSearch users who want to edit these templates to download, install, and set up the appropriate editor. Both Langflow and Flowise require operating a webserver (one can deploy it locally on their own machine with Docker). So we're distributing the (repetitive) setup effort among many users, who can follow the setup steps in line with their corporate security requirements.

Or whatever they call it. Will this make it easy to have OpenSearch-based AI blocks in my external AI workflow or is it an either/or - OpenSearchFlow vs Flowise?

We can easily do both. We are defining a template format compatible with either. Getting it out there faster in these other frameworks can happen in parallel with any UI we choose to build (if we choose to do it ourselves).

It feels to me like there's a fair amount of overlap with Agent Framework, in terms of configuring complicated CoT/GenAI workflows. What should be done with that RFC versus with this one, and where do they meet?

They meet at the REST API layer. I wrote this considering the current state of API available in 2.9 and making it available easily to as wide an audience as possible.

It may be that an API is not available today but will be available in 3 months. This framework makes it easily accessible today, until first-class API support is available (conceptually, much like a set of "custom tool" modules in Flowise could do the same thing, until a first-class module replaces it.)

In my initial research I considered building this inside ml-commons, but reasoned that some use cases will be external to that plugin. From what I understand of that RFC (and other RFCs such as Conversations, and existing work improving the capabilities of Search Pipelines, etc.) there will be new APIs made available to streamline many specific processes. This is a good thing. The first few templates created may end up obsolete in 6 months, by which time we'll be building templates for even more things to enable them earlier, even if they become obsolete in 6 more months...

@dbwiddis (Member Author) commented

Responding to @austintlee :

Is the backend going to live in OpenSearch ("core") itself or will it run as a separate thing?

I'm planning initial rapid development in an OpenSearch plugin. Some general functionality may be moved into core later, particularly if it's useful for other projects.

There will be many overlapping concepts/interfaces/code with search pipelines, ingest pipelines, and others, that will probably find their way into core to be depended on by all projects.

Playing devil's advocate, I want to ask why this can't be all done using Apache Airflow (or something similar). I can see there being something new to bring for the frontend, but not seeing a compelling reason for the backend to be built anew unless it has to be deeply integrated inside the core?

I think the best counter to that is a perspective of "where does the automation live". An external server automates remotely and sends API queries. Automation inside OpenSearch (via a plugin) can be triggered by a single API call and is under control of the cluster to execute.

When using an external webserver-hosted automation framework, we are automating a bunch of API calls coming in to OpenSearch externally.

  • Each call has some amount of latency
  • We have to wait for each API call to return and then send the next one. This presents either long (silent) waiting, or the need for repeated queries to ascertain task progress to update a progress bar.
  • We have little control internally in OpenSearch over throttling/timing execution speed if the cluster has high pressure from other sources
  • We have to take lots of extra external steps (via API query) for version checks (API compatibility), verifying the configuration of plugins, etc.

Automation directly on the cluster has direct access to cluster statistics (e.g., indexing pressure), allowing more stability. Task completion can execute asynchronously with immediate chaining of the next step(s). There's no network latency for API requests, and checks of versioning/compatibility and plugin availability are easier.

@dbwiddis (Member Author) commented

Another reply to @HenryL27

It feels to me like there's a fair amount of overlap with Agent Framework, in terms of configuring complicated CoT/GenAI workflows. What should be done with that RFC versus with this one, and where do they meet?

There is some commonality in the low-level "tools" that could be leveraged/co-developed by both teams. However, we are solving fundamentally different problems.

This framework is solving known execution sequences that can be articulated in a directed graph and executed in a known order. Our primary purpose is enabling rapid prototyping by exchanging particular components of a known workflow, such as trying different models to improve semantic search.

The Agent Framework says it's solving a "complex problem, the process generally is hard to be predefined. We need to find some way to solve the problem step by step, identify potential solutions to reach a resolution." The order of execution is unknown in advance. I don't see an obvious application in rapid prototyping as you don't even know if/when a particular tool will be used.

@dbwiddis (Member Author) commented

Responding to @dylan-tong-aws

1/ Engine configuration automation: we do need to be clear early on in the RFC that we are initially focusing on AI use cases. I also think that if we are more specific than this in the naming, a user might think the feature is about automation for things like ISM.

Yes, but...

One question is "why don't we build this in ML-commons" and the answer is that the capability is more generic than ML-commons. Yes we are initially focusing on AI use cases, but we are building a generic capability that will be more easily reused/adapted outside of these cases.

2/ Configurable automation for OpenSearch AI workflows? It should be clear that the scope is OpenSearch and AI. However, this is more than workflows. The core intent is to prime the system for specific use cases, and I don't think we capture that intent in the name.

I think this is closest to where we're going. "Configurable automation for OpenSearch AI use cases"?

3/ Backend configuration/priming automation for AI apps?

I'm not sure "priming" is technically specific enough.

dbwiddis changed the title from "[RFC] Configurable automation for AI-augmented workflows" to "[RFC] Configurable Automation for OpenSearch AI Use Cases" on Aug 13, 2023
@navneet1v (Contributor) commented

@dbwiddis thanks for putting up the proposal. I tried to follow the whole conversation but might have missed a few things, so please let me know if this question is already answered and I will check that response.

So the question I have is:
I understand the template and drag-and-drop feature is great for setting up a complex use case like RAG, or even a simpler one like semantic search. But what I am not able to understand is: if a setup is made for, let's say, semantic search, and a customer wants to do semantic search via the _search API of OpenSearch, do they need to change the API to do search queries? Or what will be the user experience here?

@owaiskazi19 (Member) commented

A customer wants to do semantic search via the _search API of OpenSearch; do they need to change the API to do search queries? Or what will be the user experience here?

@navneet1v, we plan to create a pipeline, similar to Search and Ingest pipelines, for chaining all the plugins required for a specific use case, as mentioned here. Later, this pipeline can be used with the hot path of OpenSearch, be it a SearchRequest, similar to how Search Pipelines work today.

POST /my-index/_search?pipeline=my_pipeline
{
  "query" : {
    "match" : {
      "text_field" : "search text"
    }
  }
}

@navneet1v (Contributor) commented

we plan to create a pipeline similar to Search and Ingest for chaining all the plugins required for a specific use case as mentioned in #9213 (comment)

@owaiskazi19 still not clear. Looking at the above request, what I can see is that it's a simple _search query. You added a query param with the value my_pipeline. But the bigger question is: why does this query look like a text search query?

So is the expectation that we will convert this query to a neural search query with a "neural" clause and add the model ID to the payload too?

@owaiskazi19 (Member) commented Aug 16, 2023

So is the expectation that we will convert this query to a neural search query with a "neural" clause and add the model ID to the payload too?

Yes, since we would take care of uploading the model (and will have access to the model_id as well), or for that matter handle any other OpenSearch use case like multi-modal search, vector search, etc. We would mold the query based on the use case selected by the user. The only requirement from the user would be to provide us the pipeline_id with the request once the pipeline has been created using the drag/drop option. The aim of the framework is to leave the user with minimal setup for any use case, with the heavy lifting done by the plugin.

@navneet1v (Contributor) commented

The only requirement from the user would be to provide us the pipeline_id with the request once the pipeline has been created using the drag/drop option.

"with the request once": what do you mean by this? Are we saying give us a simple text query, no matter whether you added a match or term clause, and we will convert it to a neural query clause? That seems pretty weird. Also, how is this different from Search Templates?

@ohltyler (Member) commented

The RFC for the frontend no-code designer is here: opensearch-project/OpenSearch-Dashboards#4755

@dbwiddis (Member Author) commented Aug 18, 2023

Hey @navneet1v let me try to address your question.

But what I am not able to understand is: if a setup is made for, let's say, semantic search, and a customer wants to do semantic search via the _search API of OpenSearch, do they need to change the API to do search queries?

The customer does not need to change anything. The search API will remain as it is.

A customer with less experience with the _search API will be empowered to do more and experiment more easily.

Or what will be the user experience here

For "builders":

  1. The framework will primarily benefit a builder who is experimenting with different models: which generative model (or models) works best with their search use cases? They can quickly experiment, try out options, and eventually "publish" a workflow that combines their optimal choices. That workflow will include pre-existing OpenSearch API optimizations when possible, but will enable more flexibility.
  2. There will be some additional benefit from a simplified API available to end users, and/or from including OpenSearch as one part of an even more complex workflow developed in a 3P framework that does a lot more workflow coordination outside of OpenSearch. Sure, you can try to encode/templatize a complex search query inside a Flowise module, but it would be easier with only the fields you need to provide and no need to refer to the docs to properly format your query and escape literals in your scripts, etc. This is still somewhat of a "builder" experience, but a 3P builder.

For End users,

  1. Automation in multi-query cases.
  2. Simplification in single-query cases.

For a not-yet-existing experience that requires multiple queries, the user experience will be a single query. Take for example RAG. We currently don't have RAG in a single query (although one is proposed in opensearch-project/ml-commons#1150 with an open PR implementing it as a Processor in search pipelines. This is great and may be available in a future version of OpenSearch.) So a RAG template today may be obsolete in 3-6 months, but we still might have an easier builder for experimenting with processors to drag and drop into search pipelines, etc.

For an existing single-query use case such as semantic search, the primary benefit is that the user needs to provide less information and know fewer technical details. Let's consider the query used in this blog post:

GET amazon-review-index-nlp/_search?size=5
{
  "query": {
    "neural": {
      "review_embedding": {
        "query_text": "I love this product so much i bought it twice! But their customer service is TERRIBLE. I received the second glassware broken and did not receive a response for one week and STILL have not heard from anyone to receive my refund. I received it on time, but am not happy at the moment., review_title: Would recommend this product, but not the seller if something goes wrong.",
        "model_id": <model_id>,
        "k": 10
      }
    }
  },
  "fields": ["review_body", "stars"],
  "_source": false
}

A user using this API needs to know:

  • They need to add a "neural" section to their query, inside which
    • They need to know they have to provide a "k" value, and don't really know what a sensible default for this particular vector DB size might be (the builder would have a better idea of a good default)
    • They need to know they have to provide a model_id (although that requirement may go away)
    • They need to know their text search field should be named query_text
    • They can limit the query to a particular vector_field

But they also need to know lots of things about the _search API which may be relevant for this use case, such as using a size parameter to limit results, or using a match query to apply a painless script in some use cases, as in this example.

Ultimately the end user just needs to know the template/pipeline ID, and a list of required fields (like indices to search) and optional ones for which the default is pre-populated but they can override.

Search templates can help simplify some of this, but they're more limited; you could conceptualize these use case templates as somewhat of a superset of search templates, search pipelines, and more.

@navneet1v (Contributor) commented

I synced with @dbwiddis on Slack and cleared up a few things. The main contention for me was happening because of this example:

POST /my-index/_search?pipeline=my_pipeline
{
  "query" : {
    "match" : {
      "text_field" : "search text"
    }
  }
}

It touches the main search API path by hijacking the _search request.

Ideally it should have been this, which is a different API:

POST /_plugins/<plugin-name>/_search?pipeline=my_pipeline
{
  // payload with keys required by the pipeline which are configured by builder of the pipeline.
}

After talking to @dbwiddis I got clarity on this.

@msfroh (Collaborator) commented Aug 28, 2023

I'm wondering how much incremental progress could be made just by adding an optional self-describing interface to search and ingest pipeline processors. Essentially, this interface could return some kind of processor spec that includes a name and description for the processor, and enumerate the available configuration parameters (with the types, constraints, and description for each parameter).

Basically, you could ask SearchPipelineService and IngestService for their sets of processors. For each processor that implements this self-describing interface, you can show them in a UI-based workflow builder complete with a handy config dialog. (If they don't implement the self-describing interface, I guess you could present a JSON text box as an "expert" config mode.)

As a bonus, this self-describing interface could be used to generate documentation.
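
A hypothetical processor spec returned by such an interface might look like this (field names are illustrative only):

{
  "name": "filter_query",
  "type": "search_request_processor",
  "description": "Adds a filter clause to incoming search requests",
  "parameters": {
    "query": { "type": "object", "required": true, "description": "Query DSL clause applied as a filter" },
    "description": { "type": "string", "required": false, "description": "Human-readable description of the processor instance" }
  }
}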

@joshpalis (Member) commented

I'm wondering how much incremental progress could be made just by adding an optional self-describing interface to search and ingest pipeline processors. Essentially, this interface could return some kind of processor spec that includes a name and description for the processor, and enumerate the available configuration parameters (with the types, constraints, and description for each parameter).
Basically, you could ask SearchPipelineService and IngestService for their sets of processors. For each processor that implements this self-describing interface, you can show them in a UI-based workflow builder complete with a handy config dialog. (If they don't implement the self-describing interface, I guess you could present a JSON text box as an "expert" config mode.)
As a bonus, this self-describing interface could be used to generate documentation.

I think this is a great idea, as it would provide a path forward to automatically map the required inputs for any processor; that way we can support new processor types without having to manually specify the required inputs for each processor used within a use case template.

@ohltyler (Member) commented Sep 5, 2023

I'm wondering how much incremental progress could be made just by adding an optional self-describing interface to search and ingest pipeline processors. Essentially, this interface could return some kind of processor spec that includes a name and description for the processor, and enumerate the available configuration parameters (with the types, constraints, and description for each parameter).

Basically, you could ask SearchPipelineService and IngestService for their sets of processors. For each processor that implements this self-describing interface, you can show them in a UI-based workflow builder complete with a handy config dialog. (If they don't implement the self-describing interface, I guess you could present a JSON text box as an "expert" config mode.)

As a bonus, this self-describing interface could be used to generate documentation.

@msfroh I could see this as a big potential maintainability win on the frontend UI, like you've mentioned. We are exploring how to persist some of these interfaces on the frontend for this framework, including the parameters for different search and ingest processors. We want to make sure we can scale well as the processor library continues to grow.

@owaiskazi19 (Member) commented Sep 19, 2023

Final Backend Design Approach

A workflow setup would be needed to chain all the components together in one place. This setup should take care of sequencing the different plugin APIs required for a specific use case, and also of creating a Search or Ingest Pipeline based on the operation selected by the user. A sample flow is outlined below for Semantic Search.

[Diagram: semantic search workflow setup]

This will be a one-time setup, and once the above steps are completed a workflow ID would be returned to the frontend. All the responses received from the respective APIs above would be stored in the global context. This will also help a dependent component utilize the response of the previous component by reading the global context. The global context can be a system index stored in OpenSearch.
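
Purely as an illustration (the document structure is hypothetical), an entry in that global context system index might record something like:

{
  "workflow_id": "semantic_search_setup_1",
  "use_case": "semantic_search",
  "state": "COMPLETED",
  "resources_created": {
    "connector_id": "<connector_id>",
    "model_id": "<model_id>",
    "ingest_pipeline_id": "embedding_pipeline",
    "index_name": "my-knn-index"
  }
}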

Finally, the user would send a request to the backend plugin's orchestrate API, which will facilitate the actions defined in the use case template, such as search and ingest. The Orchestrator will handle transforming a request to the use case payload request and invoking the transport APIs that comprise the action, in the order in which they are configured. Once the action sequence is completed, the correct response would be returned to the user.

Collaborator: @joshpalis
Design Components

Orchestrator

The Orchestrator is responsible for mapping the workflow ID to the use case template sub-workflows to ascertain the necessary API calls and the order of execution. Upon retrieving the chosen sub-workflow, the Orchestrator then iterates over each API and uses the Payload Generator to transform the user request into the correct format. The resulting query produced by the Payload Generator is then used to invoke the API, and the subsequent response will be used to pre-fill the next API request body if applicable. Upon consolidating all the responses, the final response will be reformatted to remove unnecessary information, such as the number of hits, prior to being sent back to the user.

Payload Generator

The Payload Generator will be responsible for mapping the request to the correct query template and filling in the required fields from both the global context and the user input. The query templates will be mapped to a specific API and will format all the user provided input. Query templates will take inspiration from SearchTemplates, which allow us to specify the query body and required parameters. These query templates will be stored as mustache scripts within the cluster state, similar to how search templates are configured and stored, and will be used to define the required inputs for the user on the front end plugin. Currently, only SearchTemplates utilize mustache scripts to re-write queries, but this can be expanded by the Payload Generator such that any API payload can be re-written into the desired format.
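
For reference, a stored search template is a mustache script of roughly this form, and the Payload Generator would parameterize other API payloads in a similar way (the template name and parameters below are illustrative):

POST /_scripts/text_search_template
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "match": {
          "{{field}}": "{{query_text}}"
        }
      }
    }
  }
}

GET /my-index/_search/template
{
  "id": "text_search_template",
  "params": {
    "field": "review_body",
    "query_text": "broken glassware"
  }
}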

@austintlee (Contributor) commented

Will this support arbitrary plugins and extensions (let's say they support ML/AI use cases)?

@dylan-tong-aws commented Oct 10, 2023

@austintlee, there will be multiple integration/extension points. For instance: 1. via the external OpenSearch API; 2. by publishing a use case template; 3. by contributing building blocks, e.g., a data processor like a text chunker, or a specialized ML processor like a connector to an AI service, etc.

Let's connect sometime so I can understand how you'd like to extend this system.

@ramda1234786 commented

Will we be able to bring in Hugging Face models as well?

Since the JSON/YAML document will just have the settings below, the user will not be able to adjust the response value. Will the user have the flexibility to use any LLM model, or just pre-defined, hardcoded models?

Like we have now with pre_process_function and post_process_function:

Connectors:
    embeddings: $connector.embeddings[${credential.key}]
    generative_llm: amazon_bedrock_v1[${user_params.engine=Claude2},${credential.iam_role}]
