
NumexaHQ/numexa-python-sdk



Build reliable, secure, and production-ready AI apps easily.

Install From Git

pip install git+https://github.com/NumexaHQ/numexa-python-sdk.git#egg=numexa

💡 Features

🚪 AI Gateway:

  • Unified API Signature: If you've used OpenAI, you already know how to use Numexa with any other provider.
  • Interoperability: Write once, run with any provider. Switch between any model from any provider seamlessly.
  • Automated Fallbacks & Retries: Ensure your application remains functional even if a primary service fails.
  • Load Balancing: Efficiently distribute incoming requests among multiple models.
  • Semantic Caching: Reduce costs and latency by intelligently caching results.

🔬 Observability:

  • Logging: Keep track of all requests for monitoring and debugging.
  • Request Tracing: Understand the journey of each request for optimization.
  • Custom Tags: Segment and categorize requests for better insights.

🚀 Quick Start

4️⃣ Steps to Integrate the SDK

  1. Get your Numexa API key and your virtual key for AI providers.
  2. Construct your LLM, add Numexa features, provider features, and prompt.
  3. Construct the Numexa client and set your usage mode.
  4. Call Numexa just as you would call your OpenAI constructor.

Let's dive in! If you are an advanced user and want to jump straight to a full-fledged example, skip ahead to the demo below.


Step 1️⃣ : Get your Numexa API Key and your Virtual Keys for AI providers

Numexa API Key: Log into Numexa here, then open the "API Keys" page in the left navigation and click "Generate".

import os
os.environ["NUMEXA_API_KEY"] = "NUMEXA_API_KEY"

To run Numexa without the proxy, disable it:

import os
os.environ["NUMEXA_PROXY"] = "disable"

Virtual Keys: Navigate to the "API Keys" page on Numexa and hit the "Generate" button. Choose your AI provider and assign a unique name to your key. Your virtual key is ready!

Step 2️⃣ : Construct your LLM, add Numexa features, provider features, and prompt

Numexa Features: You can find a comprehensive list of Numexa features here. This includes settings for caching, retries, metadata, and more.

Provider Features: Numexa is designed to be flexible. All the features you're familiar with from your LLM provider, like top_p, top_k, and temperature, can be used seamlessly. Check out the complete list of provider features here.

Setting the Prompt Input: This param lets you override any prompt that is passed during the completion call. Set a model-specific prompt here to optimise model performance. You can set the input in two ways: for models like Claude and GPT-3, use prompt = (str); for models like GPT-3.5 and GPT-4, use messages = [array].
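As a quick sketch of the two input styles (the model names and virtual keys below are illustrative placeholders, not values from this guide):

from numexa import LLMOptions

# Text-completion models (e.g. Claude, GPT-3): pass a plain string.
text_llm = LLMOptions(provider="anthropic", virtual_key="key_b",
                      model="claude-2", prompt="Who are you?")

# Chat models (e.g. GPT-3.5, GPT-4): pass a messages array.
chat_llm = LLMOptions(provider="openai", virtual_key="key_a",
                      model="gpt-4",
                      messages=[{"role": "user", "content": "Who are you?"}])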

Here's how you can combine everything:

from numexa import LLMOptions

# Numexa Config
provider = "openai"
virtual_key = "key_a"
trace_id = "numexa_sdk_test"

# Model Settings
model = "gpt-4"
temperature = 1

# User Prompt
messages = [{"role": "user", "content": "Who are you?"}]

# Construct LLM (messages is passed later, in the ChatCompletions call)
llm = LLMOptions(
    provider=provider,
    virtual_key=virtual_key,
    trace_id=trace_id,
    model=model,
    temperature=temperature,
)

Step 3️⃣ : Construct the Numexa Client

The Numexa client's Config takes 3 params: api_key, mode, and llms.

  • api_key: You can set your Numexa API key here or via os.environ as shown above.
  • mode: There are 3 modes - Single, Fallback, Loadbalance.
    • Single - This is the standard mode. Use it if you do not want Fallback OR Loadbalance features.
    • Fallback - Set this mode if you want to enable the Fallback feature.
    • Loadbalance - Set this mode if you want to enable the Loadbalance feature.
  • llms: This is an array where we pass our LLMs constructed using the LLMOptions constructor.
import asyncio
import os

# For observability (mandatory)
os.environ["NUMEXA_API_KEY"] = "Your Key"

# The proxy is enabled by default; disable it if you do not want one
os.environ["NUMEXA_PROXY"] = "disable"

# OPEN_API_KEY must be set when running with zero proxy overhead or
# when the Numexa free version has expired
os.environ["OPEN_API_KEY"] = "Bearer YOURKEY"

import numexa
from numexa import Config, LLMOptions

llm = LLMOptions(provider="openai", model="gpt-4", virtual_key="a")
numexa.config = Config(mode="single", llms=[llm])
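For reference, here is a minimal sketch of the Loadbalance mode. It assumes the same Config and LLMOptions API shown above, with requests distributed across the listed LLMs; the model names and virtual keys are placeholders:

llm_a = LLMOptions(provider="openai", model="gpt-4", virtual_key="a")
llm_b = LLMOptions(provider="openai", model="gpt-3.5-turbo", virtual_key="b")

# Incoming requests are spread across llm_a and llm_b.
numexa.config = Config(mode="loadbalance", llms=[llm_a, llm_b])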

Step 4️⃣ : Let's Call the Numexa Client!

The Numexa client can do ChatCompletions and Completions.

Since our LLM is GPT-4, we will use ChatCompletions:

async def jarvis():
    response = await numexa.ChatCompletions.create(
        messages=[{
            "role": "user",
            "content": "Capital of India?"
        }]
    )
    print(response)

asyncio.run(jarvis())
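If you were using a text-completion model instead, the call would presumably mirror this via Completions; a sketch, assuming Completions.create accepts a prompt string:

async def text_completion():
    # Assumes a Completions endpoint symmetric to ChatCompletions above.
    response = await numexa.Completions.create(
        prompt="Capital of India?"
    )
    print(response)

asyncio.run(text_completion())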

You have integrated Numexa Python SDK in just 4 steps!


πŸ” Demo: Implementing GPT4 to GPT3.5 Fallback Using the Numexa SDK

import asyncio
import os

# For observability (mandatory)
os.environ["NUMEXA_API_KEY"] = "Your Key"

# The proxy is enabled by default; disable it if you do not want one
os.environ["NUMEXA_PROXY"] = "disable"

# OPEN_API_KEY must be set when running with zero proxy overhead or
# when the Numexa free version has expired
os.environ["OPEN_API_KEY"] = "Bearer YOURKEY"

import numexa
from numexa import Config, LLMOptions

# Let's construct our LLMs (GPT-4 first, so it is tried before GPT-3.5).
llm1 = LLMOptions(provider="openai", model="gpt-4", virtual_key="a")
llm2 = LLMOptions(provider="openai", model="gpt-3.5-turbo-16k-0613", virtual_key="b")

# For a single LLM:
# numexa.config = Config(mode="single", llms=[llm1])

# For multiple LLMs with fallback:
numexa.config = Config(mode="fallback", llms=[llm1, llm2])

async def jarvis():
    response = await numexa.ChatCompletions.create(
        messages=[{
            "role": "user",
            "content": "Who is Anu kapoor?"
        }]
    )
    print(response)

async def main():
    await asyncio.gather(jarvis())


if __name__ == '__main__':
    asyncio.run(main())

📔 Full List of Numexa Config

| Feature | Config Key | Value (Type) | Required |
|---|---|---|---|
| Provider Name | provider | string | ✅ Required |
| Model Name | model | string | ✅ Required |
| Virtual Key OR API Key | virtual_key or api_key | string | ✅ Required (can be set externally) |
| Cache Type | cache_status | simple, semantic | ❔ Optional |
| Force Cache Refresh | cache_force_refresh | True, False (Boolean) | ❔ Optional |
| Cache Age | cache_age | integer (in seconds) | ❔ Optional |
| Trace ID | trace_id | string | ❔ Optional |
| Retries | retry | integer [0,5] | ❔ Optional |
| Metadata | metadata | JSON object | ❔ Optional |
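Putting the optional keys together, a fully configured LLM might look like the sketch below; the values are illustrative, and the metadata shape is an assumption:

llm = LLMOptions(
    provider="openai",            # required
    model="gpt-4",                # required
    virtual_key="key_a",          # or api_key
    cache_status="semantic",      # or "simple"
    cache_force_refresh=False,
    cache_age=3600,               # seconds
    trace_id="my_trace_id",
    retry=3,                      # 0-5
    metadata={"env": "prod"},     # assumed shape: arbitrary JSON object
)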

🤝 Supported Providers

| Provider | Support Status | Supported Endpoints |
|---|---|---|
| OpenAI | ✅ Supported | /completion, /embed |
| Azure OpenAI | ✅ Supported | /completion, /embed |
| Anthropic | ✅ Supported | /complete |
| Cohere | 🚧 Coming Soon | generate, embed |
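Because the API signature is unified, switching providers only changes the LLM construction; a hedged sketch (the Anthropic model name and virtual key are placeholders):

claude = LLMOptions(provider="anthropic", model="claude-2", virtual_key="c")
numexa.config = Config(mode="single", llms=[claude])

# The rest of the client code stays the same.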

Follow us on Twitter · Join our Discord.
