WIP litellm integration #320

Open · wants to merge 2 commits into base: main
Binary file added docs/images/azure_openai.png
126 changes: 126 additions & 0 deletions docs/vectorizer-api-reference.md
@@ -251,10 +251,136 @@ generated for your data.

The embedding functions are:

- [ai.embedding_litellm](#aiembedding_litellm)
- [ai.embedding_openai](#aiembedding_openai)
- [ai.embedding_ollama](#aiembedding_ollama)
- [ai.embedding_voyageai](#aiembedding_voyageai)

### ai.embedding_litellm

You call the `ai.embedding_litellm` function to use LiteLLM to generate embeddings for models from multiple providers.

The purpose of `ai.embedding_litellm` is to:
- Define the embedding model to use.
- Specify the dimensionality of the embeddings.
- Configure optional, provider-specific parameters.
- Set the name of the environment variable that holds the value of your API key.

#### Example usage

Use `ai.embedding_litellm` to create an embedding configuration object that is passed as an argument to [ai.create_vectorizer](#create-vectorizers):

1. Set the required API key for your provider.

   The API key should be set as an environment variable that is available to either the Vectorizer worker or the Postgres process.

2. Create a vectorizer that uses LiteLLM to access the `microsoft/codebert-base` embedding model on Hugging Face:

```sql
SELECT ai.create_vectorizer(
'my_table'::regclass,
embedding => ai.embedding_litellm(
'huggingface/microsoft/codebert-base',
768,
api_key_name => 'HUGGINGFACE_API_KEY',
extra_options => '{"wait_for_model": true}'::jsonb
),
-- other parameters...
);
```

#### Parameters

The function takes several parameters to customize the LiteLLM embedding configuration:

| Name | Type | Default | Required | Description |
|---------------|-------|---------|----------|----------------------------------------------------------------------------------------------------------------------------------------------------------|
| model | text | - | ✔ | Specify the name of the embedding model to use. Refer to the [LiteLLM embedding documentation] for an overview of the available providers and models. |
| dimensions | int | - | ✔ | Define the number of dimensions for the embedding vectors. This should match the output dimensions of the chosen model. |
| api_key_name | text | - | ✖ | Set the name of the environment variable that contains the API key. This allows for flexible API key management without hardcoding keys in the database. |
| extra_options | jsonb | - | ✖ | Set provider-specific configuration options. |

[LiteLLM embedding documentation]: https://docs.litellm.ai/docs/embedding/supported_embedding


#### Returns

A JSON configuration object that you can use in [ai.create_vectorizer](#create-vectorizers).

#### Provider-specific configuration examples

The following subsections show how to configure the vectorizer for all supported providers.

##### Cohere

```sql
SELECT ai.create_vectorizer(
'my_table'::regclass,
embedding => ai.embedding_litellm(
'cohere/embed-english-v3.0',
1024,
api_key_name => 'COHERE_API_KEY'
),
-- other parameters...
);
```

Note: The [Cohere documentation on input_type] specifies that the `input_type` parameter is required.
By default, LiteLLM sets it to `search_document`. You can override the input type
via `extra_options`, for example `extra_options => '{"input_type": "search_document"}'::jsonb`.

[Cohere documentation on input_type]: https://docs.cohere.com/v2/docs/embeddings#the-input_type-parameter
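Internally, the extension decodes the `extra_options` jsonb value and forwards the resulting keys as keyword arguments to `litellm.embedding`. A minimal sketch of that decoding step (the function name is illustrative, not part of the extension's API):

```python
import json

def decode_extra_options(extra_options):
    # extra_options arrives as a jsonb/text value; absent means no extras
    return json.loads(extra_options) if extra_options else {}

kwargs = decode_extra_options('{"input_type": "search_document"}')
# kwargs is then forwarded as litellm.embedding(**kwargs)
```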

##### Mistral

```sql
SELECT ai.create_vectorizer(
'my_table'::regclass,
embedding => ai.embedding_litellm(
'mistral/mistral-embed',
1024,
api_key_name => 'MISTRAL_API_KEY'
),
-- other parameters...
);
```

Note: Mistral limits the input per batch to a maximum of 16,384 tokens.
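Callers embedding many documents therefore need to split their input to stay under Mistral's per-batch token cap. A rough sketch of a greedy splitter (the whitespace word count is only a crude stand-in for the provider's real tokenizer, and the function name is illustrative):

```python
def split_batches(texts, max_tokens=16384):
    # Greedy split: start a new batch whenever adding the next text
    # would push the running (approximate) token count over the cap.
    batches, current, count = [], [], 0
    for t in texts:
        n = len(t.split())  # crude proxy for the provider's tokenizer
        if current and count + n > max_tokens:
            batches.append(current)
            current, count = [], 0
        current.append(t)
        count += n
    if current:
        batches.append(current)
    return batches

batches = split_batches(["a b", "c d e"], max_tokens=4)
# → [["a b"], ["c d e"]]: adding "c d e" to the first batch would exceed 4 tokens
```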

##### Azure OpenAI

To set up a vectorizer with Azure OpenAI, you need three values from the Azure AI Foundry console:
- the deployment name
- the endpoint
- the API version

![Azure AI Foundry console example](./images/azure_openai.png)

Configure the vectorizer:

```sql
SELECT ai.create_vectorizer(
'my_table'::regclass,
embedding => ai.embedding_litellm(
'azure/<deployment name here>',
1024,
api_key_name => 'AZURE_OPENAI_API_KEY',
extra_options => '{"api_base": "<endpoint here>", "api_version": "<version here>"}'::jsonb
),
-- other parameters...
);
```

##### AWS Bedrock

TODO

##### Vertex AI

TODO


### ai.embedding_openai

You call the `ai.embedding_openai` function to use an OpenAI model to generate embeddings.
10 changes: 9 additions & 1 deletion docs/vectorizer.md
@@ -45,11 +45,19 @@ textual, data analysis and semantic search:

## Select an embedding provider and set up your API Keys

Vectorizer supports the following vector embedding providers:
Vectorizer supports the following vector embedding providers as first-party integrations:
- [Ollama](https://ollama.com/)
- [Voyage AI](https://www.voyageai.com/)
- [OpenAI](https://openai.com/)

Additionally, through the [LiteLLM](https://litellm.ai) provider we support:
- [Cohere](https://cohere.com/)
- [HuggingFace Inference Endpoints](https://endpoints.huggingface.co/)
- [Mistral](https://mistral.ai/)
- [Azure OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service)
- [AWS Bedrock](https://aws.amazon.com/bedrock/)
- [Vertex AI](https://cloud.google.com/vertex-ai)

When using an external embedding service, you need to set up your API keys to access
the service. To store several API keys, you give each key a name and reference it
in the `embedding` section of the Vectorizer configuration. The default API key
35 changes: 35 additions & 0 deletions projects/extension/ai/litellm.py
@@ -0,0 +1,35 @@
import litellm
from typing import Optional, Generator


def embed(
    model: str,
    input: list[str],
    api_key: str,
    user: Optional[str] = None,
    dimensions: Optional[int] = None,
    timeout: Optional[int] = None,
    api_base: Optional[str] = None,
    api_version: Optional[str] = None,
    api_type: Optional[str] = None,
    organization: Optional[str] = None,
    **kwargs,
) -> Generator[tuple[int, list[float]], None, None]:
    if organization is not None:
        litellm.organization = organization
    response = litellm.embedding(
        model=model,
        input=input,
        user=user,
        dimensions=dimensions,
        timeout=timeout,
        api_type=api_type,
        api_key=api_key,
        api_base=api_base,
        api_version=api_version,
        **kwargs,
    )
    if not hasattr(response, "data"):
        return
    for idx, obj in enumerate(response["data"]):
        yield idx, obj["embedding"]
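The tail of `embed` enumerates the response's `data` list and yields `(index, embedding)` pairs. A self-contained sketch of that consumption pattern, using a hypothetical stand-in for litellm's response object (a real call needs a provider API key and network access):

```python
def iter_embeddings(response):
    # Mirrors the tail of embed(): enumerate response["data"] and yield pairs
    if not hasattr(response, "data"):
        return
    for idx, obj in enumerate(response["data"]):
        yield idx, obj["embedding"]

class FakeEmbeddingResponse(dict):
    # Hypothetical stand-in: litellm's real EmbeddingResponse supports both
    # attribute access (response.data) and item access (response["data"])
    @property
    def data(self):
        return self["data"]

resp = FakeEmbeddingResponse(data=[{"embedding": [0.1, 0.2]}, {"embedding": [0.3, 0.4]}])
pairs = list(iter_embeddings(resp))
# → [(0, [0.1, 0.2]), (1, [0.3, 0.4])]
```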
20 changes: 7 additions & 13 deletions projects/extension/ai/secrets.py
@@ -24,27 +24,21 @@ def remove_secret_from_cache(sd_cache: dict[str, str], secret_name: str):

def get_secret(
plpy,
secret: Optional[str],
secret_name: Optional[str],
secret_name_default: str,
sd_cache: Optional[dict[str, str]],
) -> str:
secret: Optional[str] = None,
secret_name: Optional[str] = None,
secret_name_default: Optional[str] = None,
sd_cache: Optional[dict[str, str]] = None,
) -> str | None:
if secret is not None:
return secret

if secret_name is None:
secret_name = secret_name_default

if secret_name is None or secret_name == "":
plpy.error("secret_name is required")

secret = reveal_secret(plpy, secret_name, sd_cache)
if secret is None:
plpy.error(f"missing {secret_name} secret")
# This line should never be reached, but it's here to make the type checker happy.
return ""
return None
> **Collaborator:** is there existing code that depends on this condition throwing an error?
>
> **Member (author):** I don't know. I still need to do a full audit of all of the callers.
>
> **Member (author):** I've now audited the callers. From what I can tell, none of our code depends on this throwing an error. The motivation for this change is that with litellm there is no sensible "default secret name" that we could configure, because every provider has different conventions for secret naming. This motivated the change on line 29 to allow `secret_name_default` to be `None`.

return secret
return reveal_secret(plpy, secret_name, sd_cache)
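The resolution order of the rewritten `get_secret` can be sketched as a plain function (the `reveal` callback stands in for `reveal_secret`, and `ValueError` for `plpy.error`; this is a sketch, not the extension's code):

```python
def resolve_secret(secret=None, secret_name=None, secret_name_default=None,
                   reveal=lambda name: None):
    # 1. an explicitly supplied secret always wins
    if secret is not None:
        return secret
    # 2. fall back to the default name when none was given
    if secret_name is None:
        secret_name = secret_name_default
    # 3. with no name at all, there is nothing to look up
    if not secret_name:
        raise ValueError("secret_name is required")
    # 4. otherwise reveal by name; the result may now legitimately be None
    return reveal(secret_name)
```

Step 4 is the behavioral change: a missing secret is returned as `None` instead of raising, which is what lets litellm providers run without a configured default secret name.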


def check_secret_permissions(plpy, secret_name: str) -> bool:
4 changes: 3 additions & 1 deletion projects/extension/requirements.txt
@@ -1,8 +1,10 @@
openai==1.44.0
openai==1.56.0
tiktoken==0.7.0
ollama==0.4.5
anthropic==0.29.0
cohere==5.5.8
backoff==2.2.1
voyageai==0.3.1
datasets==3.1.0
litellm==1.55.4
google-cloud-aiplatform==1.74.0 # required for vertexAI (don't know why litellm doesn't include this)
26 changes: 26 additions & 0 deletions projects/extension/sql/idempotent/008-embedding.sql
@@ -74,6 +74,30 @@ $func$ language plpgsql immutable security invoker
set search_path to pg_catalog, pg_temp
;

-------------------------------------------------------------------------------
-- embedding_litellm
create or replace function ai.embedding_litellm
( model pg_catalog.text
, dimensions pg_catalog.int4
, api_key_name pg_catalog.text default null
, extra_options pg_catalog.jsonb default null
) returns pg_catalog.jsonb
as $func$
begin
return json_object
( 'implementation': 'litellm'
, 'config_type': 'embedding'
, 'model': model
, 'dimensions': dimensions
, 'api_key_name': api_key_name
, 'extra_options': extra_options
absent on null
);
end
$func$ language plpgsql immutable security invoker
set search_path to pg_catalog, pg_temp
;
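The `json_object ... absent on null` construct omits any key whose value is NULL. A rough Python equivalent of the configuration object this function returns (the function name mirrors the SQL one purely for illustration):

```python
def embedding_litellm_config(model, dimensions, api_key_name=None, extra_options=None):
    cfg = {
        "implementation": "litellm",
        "config_type": "embedding",
        "model": model,
        "dimensions": dimensions,
        "api_key_name": api_key_name,
        "extra_options": extra_options,
    }
    # "absent on null": drop keys whose value is None
    return {k: v for k, v in cfg.items() if v is not None}

cfg = embedding_litellm_config("cohere/embed-english-v3.0", 1024)
# → only implementation, config_type, model, and dimensions remain
```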

-------------------------------------------------------------------------------
-- _validate_embedding
create or replace function ai._validate_embedding(config pg_catalog.jsonb) returns void
@@ -98,6 +122,8 @@ begin
-- ok
when 'voyageai' then
-- ok
when 'litellm' then
-- ok
else
if _implementation is null then
raise exception 'embedding implementation not specified';
58 changes: 58 additions & 0 deletions projects/extension/sql/idempotent/017-litellm.sql
@@ -0,0 +1,58 @@
-------------------------------------------------------------------------------
-- litellm_embed
-- generate an embedding from a text value
create or replace function ai.litellm_embed
( model text
, input_text text
, api_key text default null
, api_key_name text default null
, extra_options jsonb default null
) returns @extschema:vector@.vector
as $python$
#ADD-PYTHON-LIB-DIR
import ai.litellm
import ai.secrets
options = {}
if extra_options is not None:
    import json
    options = {k: v for k, v in json.loads(extra_options).items()}

api_key_resolved = ai.secrets.get_secret(plpy, api_key, api_key_name, None, SD)
for tup in ai.litellm.embed(model, [input_text], api_key=api_key_resolved, **options):
    return tup[1]
$python$
language plpython3u immutable parallel safe security invoker
set search_path to pg_catalog, pg_temp
;

-------------------------------------------------------------------------------
-- litellm_embed
-- generate embeddings from an array of text values
create or replace function ai.litellm_embed
( model text
, input_texts text[]
, api_key text default null
, api_key_name text default null
, extra_options jsonb default null
) returns table
( "index" int
, embedding @extschema:vector@.vector
)
as $python$
#ADD-PYTHON-LIB-DIR
import ai.litellm
import ai.secrets
options = {}
if extra_options is not None:
    import json
    options = {k: v for k, v in json.loads(extra_options).items()}

plpy.log("options", options)

api_key_resolved = ai.secrets.get_secret(plpy, api_key, api_key_name, None, SD)
for tup in ai.litellm.embed(model, input_texts, api_key=api_key_resolved, **options):
    yield tup
$python$
language plpython3u immutable parallel safe security invoker
set search_path to pg_catalog, pg_temp
;
5 changes: 4 additions & 1 deletion projects/extension/tests/contents/output16.expected
@@ -20,6 +20,7 @@ CREATE EXTENSION
function ai.create_vectorizer(regclass,name,jsonb,jsonb,jsonb,jsonb,jsonb,jsonb,name,name,name,name,name,name,name[],boolean)
function ai.disable_vectorizer_schedule(integer)
function ai.drop_vectorizer(integer,boolean)
function ai.embedding_litellm(text,integer,text,jsonb)
function ai.embedding_ollama(text,integer,text,jsonb,text)
function ai.embedding_openai(text,integer,text,text)
function ai.embedding_voyageai(text,integer,text,text)
@@ -34,6 +35,8 @@ CREATE EXTENSION
function ai.indexing_diskann(integer,text,integer,integer,double precision,integer,integer,boolean)
function ai.indexing_hnsw(integer,text,integer,integer,boolean)
function ai.indexing_none()
function ai.litellm_embed(text,text,text,text,jsonb)
function ai.litellm_embed(text,text[],text,text,jsonb)
function ai.load_dataset_multi_txn(text,text,text,name,name,text,jsonb,integer,integer,integer,jsonb)
function ai.load_dataset(text,text,text,name,name,text,jsonb,integer,integer,jsonb)
function ai.ollama_chat_complete(text,jsonb,text,text,jsonb)
@@ -94,7 +97,7 @@ CREATE EXTENSION
table ai.vectorizer_errors
view ai.secret_permissions
view ai.vectorizer_status
(90 rows)
(93 rows)

Table "ai._secret_permissions"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
5 changes: 4 additions & 1 deletion projects/extension/tests/contents/output17.expected
@@ -20,6 +20,7 @@ CREATE EXTENSION
function ai.create_vectorizer(regclass,name,jsonb,jsonb,jsonb,jsonb,jsonb,jsonb,name,name,name,name,name,name,name[],boolean)
function ai.disable_vectorizer_schedule(integer)
function ai.drop_vectorizer(integer,boolean)
function ai.embedding_litellm(text,integer,text,jsonb)
function ai.embedding_ollama(text,integer,text,jsonb,text)
function ai.embedding_openai(text,integer,text,text)
function ai.embedding_voyageai(text,integer,text,text)
@@ -34,6 +35,8 @@ CREATE EXTENSION
function ai.indexing_diskann(integer,text,integer,integer,double precision,integer,integer,boolean)
function ai.indexing_hnsw(integer,text,integer,integer,boolean)
function ai.indexing_none()
function ai.litellm_embed(text,text,text,text,jsonb)
function ai.litellm_embed(text,text[],text,text,jsonb)
function ai.load_dataset_multi_txn(text,text,text,name,name,text,jsonb,integer,integer,integer,jsonb)
function ai.load_dataset(text,text,text,name,name,text,jsonb,integer,integer,jsonb)
function ai.ollama_chat_complete(text,jsonb,text,text,jsonb)
@@ -108,7 +111,7 @@ CREATE EXTENSION
type ai.vectorizer_status[]
view ai.secret_permissions
view ai.vectorizer_status
(104 rows)
(107 rows)

Table "ai._secret_permissions"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description