This is a Cloudflare Worker that acts as an adapter, proxying requests in the OpenAI Chat Completions API format to other AI services such as Claude, Azure OpenAI, and Google PaLM.
This project draws inspiration from, and is based on, haibbo/cf-openai-azure-proxy. If you only need a single-service proxy (e.g. Claude or Azure), consider using the original project; this one is more complex because it is specifically designed to accommodate multiple services.
curl TODO
The worker allows you to make requests to the OpenAI API format and seamlessly redirect them under the hood to other services. This provides a unified API interface and abstraction layer across multiple AI providers.
- Single endpoint to access multiple AI services
- Chat with multiple models in a single request
- Unified request/response format using the OpenAI API
- Streaming response support
- Single API key for authentication
- Support for OpenAI, Claude, Azure OpenAI, and Google PaLM
- Multiple resource configurations for Azure OpenAI
- Multiple `ONE_API_KEY` support
- Per-key request logging with pricing and token counts
- Throttling
To use the adapter, simply make requests to the worker endpoint with the OpenAI JSON request payload.
Behind the scenes the worker will:
- Route requests to the appropriate backend based on the `model` specified
- Transform request payload to the destination API format
- Proxy the request and response
- Convert responses back to OpenAI format
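As a rough sketch of the routing step above, the worker has to map each model in the comma-delimited `model` field to a backend. The names below (`pickBackend`, `route`, `Backend`) are illustrative, not the worker's actual API, and the prefix rules are assumptions based on the model names this README lists:

```typescript
// Illustrative backend identifiers; the real worker's internals may differ.
type Backend = "openai" | "azure" | "claude" | "palm";

// Pick a backend from the model name. The prefix/suffix rules here are
// assumptions inferred from the supported model list in this README.
function pickBackend(model: string): Backend {
  if (model.startsWith("claude")) return "claude";
  if (model.endsWith("-bison-001")) return "palm";
  // gpt-* models could be served by OpenAI or an Azure deployment;
  // this sketch sends plain gpt-* models to OpenAI.
  return "openai";
}

// A multi-model request fans out into one backend call per model.
function route(modelField: string): Array<{ model: string; backend: Backend }> {
  return modelField.split(",").map((m) => {
    const model = m.trim();
    return { model, backend: pickBackend(model) };
  });
}
```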
For example, to use gpt-3.5-turbo:
{
  "model": "gpt-3.5-turbo",
  "stream": true,
  "messages": [
    {
      "role": "user",
      "content": "Hello there!"
    }
  ]
}
To use claude-2:
{
  "model": "claude-2",
  "stream": true,
  "messages": [...]
}
You can specify multiple models (delimited by `,`) to query in parallel:
{
  "model": "gpt-3.5-turbo,claude-2",
  "stream": true,
  "messages": [...]
}
The response will contain the concatenated output from both models streamed in the OpenAI API format.
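For illustration, the streamed server-sent events might interleave chunks from both models, each carrying its own `model` field so the client can tell them apart (the ids, ordering, and content below are made up, not actual output):

```
data: {"object": "chat.completion.chunk", "model": "gpt-3.5-turbo", "choices": [{"index": 0, "delta": {"content": "Hello"}}]}

data: {"object": "chat.completion.chunk", "model": "claude-2", "choices": [{"index": 0, "delta": {"content": "Hi"}}]}
```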
Other OpenAI parameters like `temperature`, `stream`, `stop` etc. can also be specified normally.
import openai
openai.api_key = "<your specified API_KEY>"
openai.api_base = "<your worker endpoint>/v1"
# For example, the local wrangler development endpoint
# openai.api_key = 'sk-fakekey'
# openai.api_base = "http://127.0.0.1:8787/v1"
chat_completion = openai.ChatCompletion.create(
    model="gpt-4,claude-2",
    messages=[
        {
            "role": "user",
            "content": "A brief introduction about yourself and say hello!",
        }
    ],
    stream=True,
)
for chunk in chat_completion:
    if chunk["choices"]:
        print(chunk["model"], chunk["choices"][0]["delta"].get("content", ""))
Here are the models currently supported by the adapter service:
To use a particular model, specify its ID in the `model` field of the request body.
All chat models available to your OPENAI_API_KEY.
Based on your deployment name, you will have to set the environment variable AZURE_OPENAI_API_KEY to the corresponding API key.
You can also setup multiple deployments with different API keys to access different models.
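The exact format of the multi-deployment configuration is defined by this worker's own code; one plausible shape, shown purely as an illustration, is a JSON map from deployment name to API key:

```json
{
  "my-gpt35-deployment": "<azure key 1>",
  "my-gpt4-deployment": "<azure key 2>"
}
```

Check the project source for the format it actually expects before setting the secret.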
- claude-instant-1 (claude-instant-1.2)
- claude-2 (claude-2.0)
- text-bison-001
- chat-bison-001
To deploy, you will need:
- A Cloudflare account
- API keys for each service you want to enable
npm i wrangler -g
wrangler kv:namespace create ONELLM_KV
# if you need to test in the local wrangler dev
wrangler kv:namespace create ONELLM_KV --preview
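The created namespace then needs to be bound in wrangler.toml. A typical binding looks like the following, where the ids are placeholders for the values printed by the commands above:

```toml
kv_namespaces = [
  { binding = "ONELLM_KV", id = "<namespace id>", preview_id = "<preview namespace id>" }
]
```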
Configure the worker environment variables with your secret keys.
Skip a service's key if you do not have one or do not want to deploy that service.
wrangler secret put ONE_API_KEY
wrangler secret put OPENAI_API_KEY
wrangler secret put AZURE_OPENAI_API_KEYS
wrangler secret put ANTHROPIC_API_KEY
wrangler secret put PALM_API_KEY
Alternatively, you can add the keys after deployment via the Cloudflare dashboard: Worker -> Settings -> Variables -> Environment Variables.
wrangler deploy
Create a .dev.vars file with your API keys as environment variables, then run:
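As a sketch, a minimal .dev.vars uses dotenv-style KEY=value lines; include only the keys for the services you want to test locally (the values below are placeholders):

```
ONE_API_KEY=sk-fakekey
OPENAI_API_KEY=<your OpenAI key>
ANTHROPIC_API_KEY=<your Anthropic key>
```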
wrangler dev
curl -vvv http://127.0.0.1:8787/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-fakekey" \
  -d '{
    "model": "gpt-3.5-turbo,claude-2",
    "stream": true,
    "messages": [{"role": "user", "content": "Say: Hello I am your helpful one Assistant."}]
  }'
Contributions and improvements are welcome! Please open GitHub issues or PRs.