Add Amazon Bedrock Text vectorizer #248

Merged
merged 2 commits into from
Dec 2, 2024
4 changes: 4 additions & 0 deletions .github/workflows/run_tests.yml
@@ -66,6 +66,8 @@ jobs:
  AZURE_OPENAI_ENDPOINT: ${{secrets.AZURE_OPENAI_ENDPOINT}}
  AZURE_OPENAI_DEPLOYMENT_NAME: ${{secrets.AZURE_OPENAI_DEPLOYMENT_NAME}}
  OPENAI_API_VERSION: ${{secrets.OPENAI_API_VERSION}}
  AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
run: |
  poetry run test-cov

@@ -86,6 +88,8 @@ jobs:
  AZURE_OPENAI_ENDPOINT: ${{secrets.AZURE_OPENAI_ENDPOINT}}
  AZURE_OPENAI_DEPLOYMENT_NAME: ${{secrets.AZURE_OPENAI_DEPLOYMENT_NAME}}
  OPENAI_API_VERSION: ${{secrets.OPENAI_API_VERSION}}
  AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
run: |
  cd docs/ && poetry run treon -v --exclude="./examples/openai_qna.ipynb"

7 changes: 7 additions & 0 deletions conftest.py
@@ -71,6 +71,13 @@ def gcp_location():
def gcp_project_id():
    return os.getenv("GCP_PROJECT_ID")

@pytest.fixture
def aws_credentials():
    return {
        "aws_access_key_id": os.getenv("AWS_ACCESS_KEY_ID"),
        "aws_secret_access_key": os.getenv("AWS_SECRET_ACCESS_KEY"),
        "aws_region": os.getenv("AWS_REGION", "us-east-1")
    }

@pytest.fixture
def sample_data():
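
Note on the `aws_credentials` fixture in this hunk: `os.getenv` returns `None` when a variable is unset, so in an environment without the AWS secrets the fixture yields `None` for both keys and only the region falls back to a default. A minimal sketch of how a test could detect that case, assuming two hypothetical helpers (`load_aws_credentials`, `has_aws_credentials`) that are not part of this PR:

```python
import os
from typing import Dict, Mapping, Optional


def load_aws_credentials(
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Optional[str]]:
    """Mirror the fixture: read the keys from the environment, default the region."""
    env = os.environ if env is None else env
    return {
        "aws_access_key_id": env.get("AWS_ACCESS_KEY_ID"),
        "aws_secret_access_key": env.get("AWS_SECRET_ACCESS_KEY"),
        "aws_region": env.get("AWS_REGION", "us-east-1"),
    }


def has_aws_credentials(creds: Mapping[str, Optional[str]]) -> bool:
    """True only when both the access key ID and the secret key are non-empty."""
    return bool(creds.get("aws_access_key_id")) and bool(
        creds.get("aws_secret_access_key")
    )
```

A test that receives the fixture could then call `pytest.skip("AWS credentials not set")` when `has_aws_credentials(aws_credentials)` is false, rather than failing on an authentication error.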
10 changes: 10 additions & 0 deletions docs/api/vectorizer.rst
@@ -61,6 +61,16 @@ CohereTextVectorizer
   :show-inheritance:
   :members:

BedrockTextVectorizer
=====================

.. _bedrocktextvectorizer_api:

.. currentmodule:: redisvl.utils.vectorize.text.bedrock

.. autoclass:: BedrockTextVectorizer
   :show-inheritance:
   :members:

CustomTextVectorizer
====================
78 changes: 75 additions & 3 deletions docs/user_guide/vectorizers_04.ipynb
@@ -13,7 +13,8 @@
"3. Vertex AI\n",
"4. Cohere\n",
"5. Mistral AI\n",
"6. Bringing your own vectorizer\n",
"6. Amazon Bedrock\n",
"7. Bringing your own vectorizer\n",
"\n",
"Before running this notebook, be sure to\n",
"1. Have installed ``redisvl`` and have that environment active for this notebook.\n",
@@ -541,6 +542,76 @@
"# print(test[:10])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Amazon Bedrock\n",
"\n",
"Amazon Bedrock provides fully managed foundation models for text embeddings. Install the required dependencies:\n",
"\n",
"```bash\n",
"pip install 'redisvl[bedrock]' # Installs boto3\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Configure AWS credentials:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"if \"AWS_ACCESS_KEY_ID\" not in os.environ:\n",
" os.environ[\"AWS_ACCESS_KEY_ID\"] = getpass.getpass(\"Enter AWS Access Key ID: \")\n",
"if \"AWS_SECRET_ACCESS_KEY\" not in os.environ:\n",
" os.environ[\"AWS_SECRET_ACCESS_KEY\"] = getpass.getpass(\"Enter AWS Secret Key: \")\n",
"\n",
"os.environ[\"AWS_REGION\"] = \"us-east-1\" # Change as needed"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Create embeddings:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from redisvl.utils.vectorize import BedrockTextVectorizer\n",
"\n",
"bedrock = BedrockTextVectorizer(\n",
" model=\"amazon.titan-embed-text-v2:0\"\n",
")\n",
"\n",
"# Single embedding\n",
"text = \"This is a test sentence.\"\n",
"embedding = bedrock.embed(text)\n",
"print(f\"Vector dimensions: {len(embedding)}\")\n",
"\n",
"# Multiple embeddings\n",
"sentences = [\n",
" \"That is a happy dog\",\n",
" \"That is a happy person\",\n",
" \"Today is a sunny day\"\n",
"]\n",
"embeddings = bedrock.embed_many(sentences)"
]
},
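
Note on the notebook cell above: Bedrock models are invoked through the `bedrock-runtime` client's `invoke_model` call, which takes a JSON request body and returns a streamed JSON response. The sketch below is an assumption about the general shape of such a call for Titan text embeddings; it is not the `BedrockTextVectorizer` implementation from this PR. `embed_with_bedrock` is a hypothetical helper, and the client is passed in as an argument so the function can be exercised without AWS access.

```python
import json
from typing import Any, List


def embed_with_bedrock(
    client: Any,
    text: str,
    model: str = "amazon.titan-embed-text-v2:0",
) -> List[float]:
    """Request one embedding; `client` is expected to be a bedrock-runtime client."""
    response = client.invoke_model(
        modelId=model,
        body=json.dumps({"inputText": text}),  # Titan embedding request format
    )
    # The response body is a stream that contains a JSON document.
    payload = json.loads(response["body"].read())
    return payload["embedding"]
```

With real credentials configured, the client would typically come from `boto3.client("bedrock-runtime", region_name="us-east-1")`.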
{
"cell_type": "markdown",
"metadata": {},
@@ -691,7 +762,7 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": null,
"metadata": {},
"outputs": [
{
@@ -710,9 +781,10 @@
"source": [
"# load expects an iterable of dictionaries where\n",
"# the vector is stored as a bytes buffer\n",
"from redisvl.redis.utils import array_to_buffer\n",
"\n",
"data = [{\"text\": t,\n",
" \"embedding\": v}\n",
" \"embedding\": array_to_buffer(v, dtype=\"float32\")}\n",
" for t, v in zip(sentences, embeddings)]\n",
"\n",
"index.load(data)"
Expand Down
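
Note on the `array_to_buffer` change in the last notebook hunk: as the cell's own comment says, `load` expects each vector as a bytes buffer, so the embedding (a list of floats) has to be packed into bytes of the declared dtype first. A stdlib-only sketch of that idea for `float32`, assuming little-endian packing; `floats_to_float32_buffer` and `float32_buffer_to_floats` are hypothetical helpers for illustration and are not the redisvl implementation:

```python
import struct
from typing import List, Sequence


def floats_to_float32_buffer(vector: Sequence[float]) -> bytes:
    """Pack floats as consecutive little-endian 32-bit values (4 bytes each)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def float32_buffer_to_floats(buffer: bytes) -> List[float]:
    """Inverse of the packing above; values come back at float32 precision."""
    count = len(buffer) // 4
    return list(struct.unpack(f"<{count}f", buffer))


buf = floats_to_float32_buffer([0.5, -1.0, 2.25])
print(len(buf))  # 3 floats * 4 bytes each
```

The buffer length is always four bytes per dimension, which is one way to sanity-check that a vector matches the `dims` declared in the index schema before loading it.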