WIP litellm integration #320

Open · wants to merge 2 commits into base: main
Binary file added docs/images/azure_openai.png
126 changes: 126 additions & 0 deletions docs/vectorizer-api-reference.md
@@ -251,10 +251,136 @@ generated for your data.

The embedding functions are:

- [ai.embedding_litellm](#aiembedding_litellm)
- [ai.embedding_openai](#aiembedding_openai)
- [ai.embedding_ollama](#aiembedding_ollama)
- [ai.embedding_voyageai](#aiembedding_voyageai)

### ai.embedding_litellm

You call the `ai.embedding_litellm` function to use LiteLLM to generate embeddings for models from multiple providers.

The purpose of `ai.embedding_litellm` is to:
- Define the embedding model to use.
- Specify the dimensionality of the embeddings.
- Configure optional, provider-specific parameters.
- Set the name of the environment variable that holds the value of your API key.

#### Example usage

Use `ai.embedding_litellm` to create an embedding configuration object that is passed as an argument to [ai.create_vectorizer](#create-vectorizers):

1. Set the required API key for your provider.

   The API key should be set as an environment variable that is available to either the Vectorizer worker or the Postgres process.

2. Create a vectorizer that uses LiteLLM to access the `microsoft/codebert-base` embedding model on Hugging Face:

```sql
SELECT ai.create_vectorizer(
'my_table'::regclass,
embedding => ai.embedding_litellm(
'huggingface/microsoft/codebert-base',
768,
api_key_name => 'HUGGINGFACE_API_KEY',
extra_options => '{"wait_for_model": true}'::jsonb
),
-- other parameters...
);
```

#### Parameters

The function takes several parameters to customize the LiteLLM embedding configuration:

| Name | Type | Default | Required | Description |
|---------------|-------|---------|----------|----------------------------------------------------------------------------------------------------------------------------------------------------------|
| model | text | - | ✔ | Specify the name of the embedding model to use. Refer to the [LiteLLM embedding documentation] for an overview of the available providers and models. |
| dimensions | int | - | ✔ | Define the number of dimensions for the embedding vectors. This should match the output dimensions of the chosen model. |
| api_key_name | text | - | ✖ | Set the name of the environment variable that contains the API key. This allows for flexible API key management without hardcoding keys in the database. |
| extra_options | jsonb | - | ✖ | Set provider-specific configuration options. |

[LiteLLM embedding documentation]: https://docs.litellm.ai/docs/embedding/supported_embedding


#### Returns

A JSON configuration object that you can use in [ai.create_vectorizer](#create-vectorizers).

#### Provider-specific configuration examples

The following subsections show how to configure the vectorizer for all supported providers.

##### Cohere

```sql
SELECT ai.create_vectorizer(
'my_table'::regclass,
embedding => ai.embedding_litellm(
'cohere/embed-english-v3.0',
1024,
api_key_name => 'COHERE_API_KEY'
),
-- other parameters...
);
```

Note: The [Cohere documentation on input_type] specifies that the `input_type` parameter is required.
By default, LiteLLM sets it to `search_document`. You can override the input type
via `extra_options`, for example `extra_options => '{"input_type": "search_document"}'::jsonb`.

[Cohere documentation on input_type]: https://docs.cohere.com/v2/docs/embeddings#the-input_type-parameter
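Internally, the extension decodes the `extra_options` jsonb value and forwards the resulting keys as keyword arguments to `litellm.embedding`. A minimal sketch of that decoding step (the function name is illustrative, not part of the extension's API):

```python
import json

def decode_extra_options(extra_options):
    # extra_options arrives as a jsonb/text value; absent means no extras
    return json.loads(extra_options) if extra_options else {}

kwargs = decode_extra_options('{"input_type": "search_document"}')
# kwargs is then forwarded as litellm.embedding(**kwargs)
```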

##### Mistral

```sql
SELECT ai.create_vectorizer(
'my_table'::regclass,
embedding => ai.embedding_litellm(
'mistral/mistral-embed',
1024,
api_key_name => 'MISTRAL_API_KEY'
),
-- other parameters...
);
```

Note: Mistral limits the input per batch to a maximum of 16,384 tokens.
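Callers embedding many documents therefore need to split their input to stay under Mistral's per-batch token cap. A rough sketch of a greedy splitter (the whitespace word count is only a crude stand-in for the provider's real tokenizer, and the function name is illustrative):

```python
def split_batches(texts, max_tokens=16384):
    # Greedy split: start a new batch whenever adding the next text
    # would push the running (approximate) token count over the cap.
    batches, current, count = [], [], 0
    for t in texts:
        n = len(t.split())  # crude proxy for the provider's tokenizer
        if current and count + n > max_tokens:
            batches.append(current)
            current, count = [], 0
        current.append(t)
        count += n
    if current:
        batches.append(current)
    return batches

batches = split_batches(["a b", "c d e"], max_tokens=4)
# → [["a b"], ["c d e"]]: adding "c d e" to the first batch would exceed 4 tokens
```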

##### Azure OpenAI

To set up a vectorizer with Azure OpenAI, you need three values from the Azure AI Foundry console:
- the deployment name
- the endpoint
- the API version

![Azure AI Foundry console example](./images/azure_openai.png)

Configure the vectorizer:

```sql
SELECT ai.create_vectorizer(
'my_table'::regclass,
embedding => ai.embedding_litellm(
'azure/<deployment name here>',
1024,
api_key_name => 'AZURE_OPENAI_API_KEY',
extra_options => '{"api_base": "<endpoint here>", "api_version": "<version here>"}'::jsonb
),
-- other parameters...
);
```

##### AWS Bedrock

TODO

##### Vertex AI

TODO


### ai.embedding_openai

You call the `ai.embedding_openai` function to use an OpenAI model to generate embeddings.
10 changes: 9 additions & 1 deletion docs/vectorizer.md
@@ -45,11 +45,19 @@ textual, data analysis and semantic search:

## Select an embedding provider and set up your API Keys

Vectorizer supports the following vector embedding providers:
Vectorizer supports the following vector embedding providers as first-party integrations:
- [Ollama](https://ollama.com/)
- [Voyage AI](https://www.voyageai.com/)
- [OpenAI](https://openai.com/)

Additionally, through the [LiteLLM](https://litellm.ai) provider we support:
- [Cohere](https://cohere.com/)
- [HuggingFace Inference Endpoints](https://endpoints.huggingface.co/)
- [Mistral](https://mistral.ai/)
- [Azure OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service)
- [AWS Bedrock](https://aws.amazon.com/bedrock/)
- [Vertex AI](https://cloud.google.com/vertex-ai)

When using an external embedding service, you need to set up your API keys to access
the service. To store several API keys, you give each key a name and reference it
in the `embedding` section of the Vectorizer configuration. The default API key
35 changes: 35 additions & 0 deletions projects/extension/ai/litellm.py
@@ -0,0 +1,35 @@
import litellm
from typing import Optional, Generator


def embed(
    model: str,
    input: list[str],
    api_key: str,
    user: Optional[str] = None,
    dimensions: Optional[int] = None,
    timeout: Optional[int] = None,
    api_base: Optional[str] = None,
    api_version: Optional[str] = None,
    api_type: Optional[str] = None,
    organization: Optional[str] = None,
    **kwargs,
) -> Generator[tuple[int, list[float]], None, None]:
    if organization is not None:
        litellm.organization = organization
    response = litellm.embedding(
        model=model,
        input=input,
        user=user,
        dimensions=dimensions,
        timeout=timeout,
        api_type=api_type,
        api_key=api_key,
        api_base=api_base,
        api_version=api_version,
        **kwargs,
    )
    if not hasattr(response, "data"):
        return
    for idx, obj in enumerate(response["data"]):
        yield idx, obj["embedding"]
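The tail of `embed` enumerates the response's `data` list and yields `(index, embedding)` pairs. A self-contained sketch of that consumption pattern, using a hypothetical stand-in for litellm's response object (a real call needs a provider API key and network access):

```python
def iter_embeddings(response):
    # Mirrors the tail of embed(): enumerate response["data"] and yield pairs
    if not hasattr(response, "data"):
        return
    for idx, obj in enumerate(response["data"]):
        yield idx, obj["embedding"]

class FakeEmbeddingResponse(dict):
    # Hypothetical stand-in: litellm's real EmbeddingResponse supports both
    # attribute access (response.data) and item access (response["data"])
    @property
    def data(self):
        return self["data"]

resp = FakeEmbeddingResponse(data=[{"embedding": [0.1, 0.2]}, {"embedding": [0.3, 0.4]}])
pairs = list(iter_embeddings(resp))
# → [(0, [0.1, 0.2]), (1, [0.3, 0.4])]
```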
20 changes: 7 additions & 13 deletions projects/extension/ai/secrets.py
@@ -24,27 +24,21 @@ def remove_secret_from_cache(sd_cache: dict[str, str], secret_name: str):

def get_secret(
plpy,
secret: Optional[str],
secret_name: Optional[str],
secret_name_default: str,
sd_cache: Optional[dict[str, str]],
) -> str:
secret: Optional[str] = None,
secret_name: Optional[str] = None,
secret_name_default: Optional[str] = None,
sd_cache: Optional[dict[str, str]] = None,
) -> str | None:
if secret is not None:
return secret

if secret_name is None:
secret_name = secret_name_default

if secret_name is None or secret_name == "":
plpy.error("secret_name is required")

secret = reveal_secret(plpy, secret_name, sd_cache)
if secret is None:
plpy.error(f"missing {secret_name} secret")
# This line should never be reached, but it's here to make the type checker happy.
return ""
return None
> **Collaborator:** is there existing code that depends on this condition throwing an error?
>
> **Member (author):** I don't know. I still need to do a full audit of all of the callers.
>
> **Member (author):** I've now audited the callers. From what I can tell, none of our code depends on this throwing an error. The motivation for this change is that with litellm there is no sensible "default secret name" that we could configure, because every provider has different conventions for secret naming. This motivated the change on line 29 to allow `secret_name_default` to be `None`.

return secret
return reveal_secret(plpy, secret_name, sd_cache)
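The resolution order of the rewritten `get_secret` can be sketched as a plain function (the `reveal` callback stands in for `reveal_secret`, and `ValueError` for `plpy.error`; this is a sketch, not the extension's code):

```python
def resolve_secret(secret=None, secret_name=None, secret_name_default=None,
                   reveal=lambda name: None):
    # 1. an explicitly supplied secret always wins
    if secret is not None:
        return secret
    # 2. fall back to the default name when none was given
    if secret_name is None:
        secret_name = secret_name_default
    # 3. with no name at all, there is nothing to look up
    if not secret_name:
        raise ValueError("secret_name is required")
    # 4. otherwise reveal by name; the result may now legitimately be None
    return reveal(secret_name)
```

Step 4 is the behavioral change: a missing secret is returned as `None` instead of raising, which is what lets litellm providers run without a configured default secret name.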


def check_secret_permissions(plpy, secret_name: str) -> bool:
4 changes: 3 additions & 1 deletion projects/extension/requirements.txt
@@ -1,8 +1,10 @@
openai==1.44.0
openai==1.56.0
tiktoken==0.7.0
ollama==0.4.5
anthropic==0.29.0
cohere==5.5.8
backoff==2.2.1
voyageai==0.3.1
datasets==3.1.0
litellm==1.55.4
google-cloud-aiplatform==1.74.0 # required for vertexAI (don't know why litellm doesn't include this)
26 changes: 26 additions & 0 deletions projects/extension/sql/idempotent/008-embedding.sql
@@ -74,6 +74,30 @@ $func$ language plpgsql immutable security invoker
set search_path to pg_catalog, pg_temp
;

-------------------------------------------------------------------------------
-- embedding_litellm
create or replace function ai.embedding_litellm
( model pg_catalog.text
, dimensions pg_catalog.int4
, api_key_name pg_catalog.text default null
, extra_options pg_catalog.jsonb default null
) returns pg_catalog.jsonb
as $func$
begin
return json_object
( 'implementation': 'litellm'
, 'config_type': 'embedding'
, 'model': model
, 'dimensions': dimensions
, 'api_key_name': api_key_name
, 'extra_options': extra_options
absent on null
);
end
$func$ language plpgsql immutable security invoker
set search_path to pg_catalog, pg_temp
;
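The `json_object ... absent on null` construct omits any key whose value is NULL. A rough Python equivalent of the configuration object this function returns (the function name mirrors the SQL one purely for illustration):

```python
def embedding_litellm_config(model, dimensions, api_key_name=None, extra_options=None):
    cfg = {
        "implementation": "litellm",
        "config_type": "embedding",
        "model": model,
        "dimensions": dimensions,
        "api_key_name": api_key_name,
        "extra_options": extra_options,
    }
    # "absent on null": drop keys whose value is None
    return {k: v for k, v in cfg.items() if v is not None}

cfg = embedding_litellm_config("cohere/embed-english-v3.0", 1024)
# → only implementation, config_type, model, and dimensions remain
```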

-------------------------------------------------------------------------------
-- _validate_embedding
create or replace function ai._validate_embedding(config pg_catalog.jsonb) returns void
@@ -98,6 +122,8 @@ begin
-- ok
when 'voyageai' then
-- ok
when 'litellm' then
-- ok
else
if _implementation is null then
raise exception 'embedding implementation not specified';
58 changes: 58 additions & 0 deletions projects/extension/sql/idempotent/017-litellm.sql
@@ -0,0 +1,58 @@
-------------------------------------------------------------------------------
-- litellm_embed
-- generate an embedding from a text value
create or replace function ai.litellm_embed
( model text
, input_text text
, api_key text default null
, api_key_name text default null
, extra_options jsonb default null
) returns @extschema:vector@.vector
as $python$
#ADD-PYTHON-LIB-DIR
import ai.litellm
import ai.secrets
options = {}
if extra_options is not None:
    import json
    options = {k: v for k, v in json.loads(extra_options).items()}

api_key_resolved = ai.secrets.get_secret(plpy, api_key, api_key_name, None, SD)
for tup in ai.litellm.embed(model, [input_text], api_key=api_key_resolved, **options):
    return tup[1]
$python$
language plpython3u immutable parallel safe security invoker
set search_path to pg_catalog, pg_temp
;

-------------------------------------------------------------------------------
-- litellm_embed
-- generate embeddings from an array of text values
create or replace function ai.litellm_embed
( model text
, input_texts text[]
, api_key text default null
, api_key_name text default null
, extra_options jsonb default null
) returns table
( "index" int
, embedding @extschema:vector@.vector
)
as $python$
#ADD-PYTHON-LIB-DIR
import ai.litellm
import ai.secrets
options = {}
if extra_options is not None:
    import json
    options = {k: v for k, v in json.loads(extra_options).items()}

plpy.log("options", options)

api_key_resolved = ai.secrets.get_secret(plpy, api_key, api_key_name, None, SD)
for tup in ai.litellm.embed(model, input_texts, api_key=api_key_resolved, **options):
    yield tup
$python$
language plpython3u immutable parallel safe security invoker
set search_path to pg_catalog, pg_temp
;
5 changes: 4 additions & 1 deletion projects/extension/tests/contents/output16.expected
@@ -20,6 +20,7 @@ CREATE EXTENSION
function ai.create_vectorizer(regclass,name,jsonb,jsonb,jsonb,jsonb,jsonb,jsonb,name,name,name,name,name,name,name[],boolean)
function ai.disable_vectorizer_schedule(integer)
function ai.drop_vectorizer(integer,boolean)
function ai.embedding_litellm(text,integer,text,jsonb)
function ai.embedding_ollama(text,integer,text,jsonb,text)
function ai.embedding_openai(text,integer,text,text)
function ai.embedding_voyageai(text,integer,text,text)
@@ -34,6 +35,8 @@ CREATE EXTENSION
function ai.indexing_diskann(integer,text,integer,integer,double precision,integer,integer,boolean)
function ai.indexing_hnsw(integer,text,integer,integer,boolean)
function ai.indexing_none()
function ai.litellm_embed(text,text,text,text,jsonb)
function ai.litellm_embed(text,text[],text,text,jsonb)
function ai.load_dataset_multi_txn(text,text,text,name,name,text,jsonb,integer,integer,integer,jsonb)
function ai.load_dataset(text,text,text,name,name,text,jsonb,integer,integer,jsonb)
function ai.ollama_chat_complete(text,jsonb,text,text,jsonb)
@@ -94,7 +97,7 @@ CREATE EXTENSION
table ai.vectorizer_errors
view ai.secret_permissions
view ai.vectorizer_status
(90 rows)
(93 rows)

Table "ai._secret_permissions"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
5 changes: 4 additions & 1 deletion projects/extension/tests/contents/output17.expected
@@ -20,6 +20,7 @@ CREATE EXTENSION
function ai.create_vectorizer(regclass,name,jsonb,jsonb,jsonb,jsonb,jsonb,jsonb,name,name,name,name,name,name,name[],boolean)
function ai.disable_vectorizer_schedule(integer)
function ai.drop_vectorizer(integer,boolean)
function ai.embedding_litellm(text,integer,text,jsonb)
function ai.embedding_ollama(text,integer,text,jsonb,text)
function ai.embedding_openai(text,integer,text,text)
function ai.embedding_voyageai(text,integer,text,text)
@@ -34,6 +35,8 @@ CREATE EXTENSION
function ai.indexing_diskann(integer,text,integer,integer,double precision,integer,integer,boolean)
function ai.indexing_hnsw(integer,text,integer,integer,boolean)
function ai.indexing_none()
function ai.litellm_embed(text,text,text,text,jsonb)
function ai.litellm_embed(text,text[],text,text,jsonb)
function ai.load_dataset_multi_txn(text,text,text,name,name,text,jsonb,integer,integer,integer,jsonb)
function ai.load_dataset(text,text,text,name,name,text,jsonb,integer,integer,jsonb)
function ai.ollama_chat_complete(text,jsonb,text,text,jsonb)
@@ -108,7 +111,7 @@ CREATE EXTENSION
type ai.vectorizer_status[]
view ai.secret_permissions
view ai.vectorizer_status
(104 rows)
(107 rows)

Table "ai._secret_permissions"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description