Orchestrator REST Microservice

This toolkit provides an orchestrator microservice that exposes PrimeQA's retriever and reader modules through a REST server and can also integrate other "search" capabilities, e.g. IBM Watson Discovery.

With this orchestrator, you can fetch documents using either a neural retriever from PrimeQA (such as ColBERT) or an external search service (such as IBM Watson Discovery), and then use PrimeQA's reader to extract answer spans from the retrieved documents.

✔️ Getting Started

Repository

✅ Prerequisites

Python 3.9

⚙️ Setup

📓 Third-party dependencies

PrimeQA: If you don't have access to a running PrimeQA instance, please refer to the PrimeQA repository for details on setting up and running a local one.
Watson Discovery (Optional): Follow the instructions on IBM Cloud to configure a Watson Discovery V2 service.

🧩 Setup Local Environment

Set up and activate a virtual environment (as shown below) or use Miniconda

# Install virtualenv
pip3 install virtualenv

# Create a new virtual environment for this project. If using pyenv, path_to_python_3.9_executable will be ~/.pyenv/versions/3.9.x/bin/python
virtualenv --python=<path_to_python_3.9_executable> venv

# Activate virtual environment
source venv/bin/activate

Install dependencies

pip install -r requirements.txt
pip install -r requirements_test.txt

🐛 grpcio and grpcio-tools have limited support on Apple Silicon (M1, M2). Please refer to grpc GitHub issue #25082 for details or download appropriate wheels from here.

📜 TLS and Certificate Management

The orchestrator's REST server supports mutual (two-way) TLS authentication, also known as mTLS. The application's config.ini file contains the default certificate paths, which can be overridden using environment variables.
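
The override behaviour can be pictured with a minimal sketch. It assumes a hypothetical option named tls_server_cert, a hypothetical file name under /opt/tls, and a hypothetical convention where the environment variable is the upper-cased option name; check config.ini and the application code for the real names.

```python
import configparser
import os

# Hypothetical stand-in for config.ini; the real option names may differ.
SAMPLE_CONFIG = """
[DEFAULT]
tls_server_cert = /opt/tls/server.crt
"""


def resolve_setting(config, option, environ=os.environ):
    """Return the environment override for `option` if set, else the config value."""
    # Assumed naming convention: tls_server_cert -> TLS_SERVER_CERT
    env_name = option.upper()
    return environ.get(env_name) or config.get("DEFAULT", option)


config = configparser.ConfigParser()
config.read_string(SAMPLE_CONFIG)

# Without an override, the value from the config file is used.
print(resolve_setting(config, "tls_server_cert", environ={}))

# With an override, the environment variable wins.
override = {"TLS_SERVER_CERT": "security/certs/server/server.crt"}
print(resolve_setting(config, "tls_server_cert", environ=override))
```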

Self-signed certificates are generated and packaged with the Docker build. Self-signed certs may be required for local development and testing. If you want to generate them, follow the steps below:

#!/usr/bin/env bash

# Make necessary directories
mkdir -p security/
mkdir -p security/certs/
mkdir -p security/certs/ca security/certs/server security/certs/client

# Generate CA key and CA cert
openssl req -x509 -days 365 -nodes -newkey rsa:4096 -subj "/C=US/ST=New York/L=Yorktown Heights/O=IBM/OU=Research/CN=example.com" -keyout security/certs/ca/ca.key -out security/certs/ca/ca.crt

# Generate Server key (without passphrase) and Server cert signing request
openssl req -nodes -new -newkey rsa:4096 -subj "/C=US/ST=New York/L=Yorktown Heights/O=IBM/OU=Research/CN=example.com" -keyout security/certs/server/server.key -out security/certs/server/server.csr

# Sign Server cert
openssl x509 -req -days 365 -in security/certs/server/server.csr -CA security/certs/ca/ca.crt -CAkey security/certs/ca/ca.key -CAcreateserial -out security/certs/server/server.crt

# Generate Client key (without passphrase) and Client cert signing request
openssl req -nodes -new -newkey rsa:4096 -subj "/C=US/ST=New York/L=Yorktown Heights/O=IBM/OU=Research/CN=example.com" -keyout security/certs/client/client.key -out security/certs/client/client.csr

# Sign Client cert
openssl x509 -req -days 365 -in security/certs/client/client.csr -CA security/certs/ca/ca.crt -CAkey security/certs/ca/ca.key -CAserial security/certs/ca/ca.srl -out security/certs/client/client.crt

# Delete signing requests
rm -rf security/certs/server/server.csr
rm -rf security/certs/client/client.csr

IMPORTANT:

By default, the application tries to load certs from /opt/tls. For local use, you will need to update the appropriate tls_* variables in config.ini.
We recommend generating certificates signed by an official certificate authority and providing them to the application container via volume mounts.
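
For local testing, a client needs to trust the CA and present its own certificate. The following is a minimal sketch using only the Python standard library; the function name build_mtls_client_context is illustrative and not part of the orchestrator. The file paths in the usage comments are the ones produced by the script above.

```python
import ssl


def build_mtls_client_context(ca_cert, client_cert, client_key, verify_hostname=True):
    """Build an SSL context that verifies the server and presents a client cert."""
    # Trust only the given CA when verifying the server certificate.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_cert)
    # Present the client certificate so the server can authenticate the caller.
    context.load_cert_chain(certfile=client_cert, keyfile=client_key)
    # The sample certificates use CN=example.com, so hostname verification
    # fails against localhost. Disable it only for local testing.
    context.check_hostname = verify_hostname
    return context


# Usage (requires the generated certificates and a running server):
#
#   import urllib.request
#   context = build_mtls_client_context(
#       "security/certs/ca/ca.crt",
#       "security/certs/client/client.crt",
#       "security/certs/client/client.key",
#       verify_hostname=False,  # local testing only
#   )
#   urllib.request.urlopen("https://localhost:<rest_port>/docs", context=context)
```

The server certificate is still checked against the CA when verify_hostname is False; only the host name comparison is skipped.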

🛠 Build & Deployment

💻 Local

Open your Python IDE and select the virtual environment created above
Open orchestrator/services/config/config.ini, set require_ssl = True (if you wish to use TLS authentication) and set rest_port

Generate the gRPC code:

#!/usr/bin/env bash
set -xeuo pipefail
python -m grpc_tools.protoc -I ./orchestrator/integrations/primeqa/protos --python_out=orchestrator/integrations/primeqa/grpc_generated --grpc_python_out=orchestrator/integrations/primeqa/grpc_generated orchestrator/integrations/primeqa/protos/indexer.proto
python -m grpc_tools.protoc -I ./orchestrator/integrations/primeqa/protos --python_out=orchestrator/integrations/primeqa/grpc_generated --grpc_python_out=orchestrator/integrations/primeqa/grpc_generated orchestrator/integrations/primeqa/protos/parameter.proto
python -m grpc_tools.protoc -I ./orchestrator/integrations/primeqa/protos --python_out=orchestrator/integrations/primeqa/grpc_generated --grpc_python_out=orchestrator/integrations/primeqa/grpc_generated orchestrator/integrations/primeqa/protos/reader.proto
python -m grpc_tools.protoc -I ./orchestrator/integrations/primeqa/protos --python_out=orchestrator/integrations/primeqa/grpc_generated --grpc_python_out=orchestrator/integrations/primeqa/grpc_generated orchestrator/integrations/primeqa/protos/retriever.proto
2to3 --fix=import --nobackups --write orchestrator/integrations/primeqa/grpc_generated

Open application.py and run/debug
Go to http://localhost:{rest_port}/docs
To use the reader, indexer and retriever services, make sure you have access to a running instance of the PrimeQA container

💻 Docker

Open config.ini and set rest_port
Open Dockerfile and set port to the same value
Run docker build -f Dockerfile -t primeqa-orchestrator:$(cat VERSION) . (creates the Docker image)
Run docker run --rm --name primeqa-orchestrator -d -p <port>:<port> --mount type=bind,source="$(pwd)"/store,target=/store -e STORE_DIR=/store primeqa-orchestrator:$(cat VERSION) (runs the Docker container)
Go to <http://{Container's public URL}:{rest_port}/docs>
To use the reader, indexer and retriever services, make sure you have access to a running instance of the PrimeQA container

🚨 Configure

Before first use, you will need to specify a few necessary configurations to connect to third-party dependencies. These settings are intentionally left blank for security purposes.
Go to the STORE_DIR directory on your local machine and copy the primeqa.json file in that directory.

You will need to add/update the settings portion of the primeqa.json file. Primarily, add the service_endpoint information (including the port) for PrimeQA in the retrievers and readers sections of settings.

a. To use an IBM® Watson Discovery based retriever, add/update the following Watson Discovery entry in the retrievers section.

    "Watson Discovery": {
        "service_endpoint": "<IBM® Watson Discovery Cloud/CP4D Instance Endpoint>",
        "service_api_key": "<API key (If using IBM® Watson Discovery Cloud instance)>",
        "service_project_id": "<IBM® Watson Discovery Project ID>"
    }

b. For PrimeQA based retrievers, add/update the PrimeQA section in retrievers as follows

    "PrimeQA": {
        "service_endpoint": "<Primeqa Instance Endpoint>:<Port>"
    }

c. For PrimeQA based readers, add/update the PrimeQA section in readers as follows

    "PrimeQA": {
        "service_endpoint": "<Primeqa Instance Endpoint>:<Port>",
        "beta": 0.7
    }

For example, to enable both the IBM® Watson Discovery based retriever and the PrimeQA based retriever, along with the PrimeQA based reader, the settings will look as follows

{
  "retrievers": {
    "Watson_Discovery": {
      "service_endpoint": "<IBM® Watson Discovery CP4D Instance Endpoint>",
      "service_api_key": "<API key (If using IBM® Watson Discovery Cloud instance)>",
      "service_project_id": "<IBM® Watson Discovery Project ID>"
    },
    "PrimeQA": {
      "service_endpoint": "<Primeqa Instance Endpoint>:<Port>"
    }
  },
  "readers": {
    "PrimeQA": {
      "service_endpoint": "<Primeqa Instance Endpoint>:<Port>",
      "beta": 0.7
    }
  }
}
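
Since the settings ship blank, it can help to check them before starting the service. The sketch below is a hypothetical helper, not part of the orchestrator: it takes a settings dictionary shaped like the example above and lists every retriever or reader entry without a service_endpoint. The endpoint value in the sample is a placeholder.

```python
import json


def missing_endpoints(settings):
    """Return "section/name" labels for entries lacking a non-empty service_endpoint."""
    missing = []
    for section in ("retrievers", "readers"):
        for name, entry in settings.get(section, {}).items():
            if not entry.get("service_endpoint"):
                missing.append(f"{section}/{name}")
    return missing


# Sample settings: the retriever is configured, the reader endpoint is still blank.
settings = json.loads("""
{
  "retrievers": {
    "PrimeQA": {"service_endpoint": "primeqa.example.com:50051"}
  },
  "readers": {
    "PrimeQA": {"service_endpoint": "", "beta": 0.7}
  }
}
""")

print(missing_endpoints(settings))  # ['readers/PrimeQA']
```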

NOTE: The final scoring and ranking is done with a weighted sum of the reader answer scores and the retriever search hit scores. The beta field is the weight assigned to the reader scores, and 1 - beta is the weight assigned to the retriever scores, i.e. final score = beta * reader score + (1 - beta) * retriever score.
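
The effect of beta can be illustrated with a short sketch. It assumes the reader and retriever scores are already on comparable scales; how the orchestrator normalizes scores internally is not covered here, and the function names and sample scores are illustrative only.

```python
def combined_score(reader_score, retriever_score, beta=0.7):
    """Weighted sum: beta weights the reader, (1 - beta) weights the retriever."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must be between 0 and 1")
    return beta * reader_score + (1.0 - beta) * retriever_score


def rank(candidates, beta):
    """Return candidate labels sorted by combined score, best first."""
    ordered = sorted(
        candidates,
        key=lambda c: combined_score(c[1], c[2], beta),
        reverse=True,
    )
    return [label for label, _, _ in ordered]


# (label, reader score, retriever score); made-up values for illustration.
candidates = [("A", 0.9, 0.2), ("B", 0.5, 0.9)]

print(rank(candidates, beta=0.7))  # ['A', 'B']: the reader score dominates
print(rank(candidates, beta=0.3))  # ['B', 'A']: the retriever score dominates
```

A higher beta favours answers the reader is confident about, while a lower beta favours answers drawn from highly ranked search hits.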

🧪 Testing

To see all available retrievers, execute [GET] /retrievers endpoint

	curl -X 'GET' 'http://{PUBLIC_IP}:50059/retrievers' -H 'accept: application/json'

To see all available readers, execute [GET] /readers endpoint

	curl -X 'GET' 'http://{PUBLIC_IP}:50059/readers' -H 'accept: application/json'
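
The same checks can be scripted. This is a minimal sketch using only the Python standard library; the helper names are illustrative, and the shape of the JSON returned by each endpoint is not assumed.

```python
import json
import urllib.request


def orchestrator_url(host, port, path):
    """Build a URL such as http://localhost:50059/retrievers."""
    return f"http://{host}:{port}/{path.lstrip('/')}"


def get_json(url, timeout=10):
    """Send a GET request with the JSON accept header and decode the response."""
    request = urllib.request.Request(url, headers={"accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (requires a running orchestrator):
#
#   for path in ("retrievers", "readers"):
#       print(path, get_json(orchestrator_url("localhost", 50059, path)))
```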

Frequently Asked Questions (FAQs)

1. How do I get feedback to fine-tune my reader model?

    curl -X 'GET' \
      'http://localhost:50059/feedbacks?application=reading&application=qa&_format=primeqa' \
      -H 'accept: application/json' > feedbacks.json

2. How do I get feedback to fine-tune my retriever model?

    curl -X 'GET' \
      'http://localhost:50059/feedbacks?application=retrieval&_format=primeqa' \
      -H 'accept: application/json' > feedbacks.json
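
Both requests differ only in their application values, which are passed as a repeated query parameter. The sketch below builds the same URLs with the standard library; the helper name feedbacks_url is illustrative.

```python
from urllib.parse import urlencode


def feedbacks_url(base_url, applications, fmt="primeqa"):
    """Build a /feedbacks URL with one `application` parameter per value."""
    pairs = [("application", app) for app in applications] + [("_format", fmt)]
    return f"{base_url.rstrip('/')}/feedbacks?{urlencode(pairs)}"


# Reader feedback: two application values, as in the first curl command.
print(feedbacks_url("http://localhost:50059", ["reading", "qa"]))
# http://localhost:50059/feedbacks?application=reading&application=qa&_format=primeqa

# Retriever feedback: a single application value.
print(feedbacks_url("http://localhost:50059", ["retrieval"]))
# http://localhost:50059/feedbacks?application=retrieval&_format=primeqa
```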

📄 Documentation Sync

Keep the PrimeQA documentation reference in sync
Any time this README file is updated, a PR must be opened on the PrimeQA repository to apply the same modifications to the associated file used on the documentation page.
Do not modify the initial image path
