Welcome! The goal of this stack is to demonstrate:
- how to create a multi-tenant RAG application using AWS generative AI services.
- how to use OpenSearch Managed directly as a vector store provider.
- how to handle event-based ingestion of documents, without requiring manual or programmatic triggering of new document ingestion.
- how to implement sharing of document collections with other users.
- how to do multi-RAG query orchestration across multiple document collections.
- how to do graph RAG, with example entity extraction pipelines and prompt orchestration that routes user prompts to vector or graph databases, depending on the prompt contents.
To get started, do the following. If you already have CDK installed on your machine and working in your account, skip ahead to the pre-existing CDK installation instructions below.
- If you don't have CDK working yet, the easiest way to get going is to deploy the EC2 instance defined in optional_deployment_assets/deployment-instance-template.yaml: go to your CloudFormation console, click Create stack, and upload the deployment-instance-template.yaml file as a new template.
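  If you prefer the CLI, something like the following should also work (the stack name is just an example, and the `--capabilities` flag is an assumption based on the template creating IAM resources):

  ```sh
  aws cloudformation create-stack \
    --stack-name rag-demo-deployment-instance \
    --template-body file://optional_deployment_assets/deployment-instance-template.yaml \
    --capabilities CAPABILITY_NAMED_IAM
  ```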
- After the install completes, go to the EC2 console, click on the instance that was just launched, and click the Connect button at the top. Choose the Session Manager connect option and click Connect.
- Once in the terminal, switch to the cdkuser account and change into the repo directory. Note that the `cd` needs to run inside the cdkuser shell, so run it as a second command:

  ```sh
  sudo su - cdkuser
  cd multi-tenant-full-stack-rag-application-demo
  ```
- Use your favorite terminal editor to edit the backend/cdk.context.json file. At a minimum, you need to edit the `allowed_email_domains` configuration option, so that you'll be able to create users through the web sign-up form. You may also want to change the `app_name`, the `verification_email_subject`, and the `verification_email_body`. Finally, you can add your own IP address or range to `os_dashboards_ec2_enable_traffic_from_ip` to allow yourself to connect to the OpenSearch Dashboards proxy EC2 instance.
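  For reference, here's a minimal sketch of what those settings might look like (the values below are placeholders, not defaults; check backend/cdk.context.json in the repo for the full set of options):

  ```json
  {
    "allowed_email_domains": ["example.com"],
    "app_name": "Multi-tenant RAG demo",
    "verification_email_subject": "Verify your email address",
    "verification_email_body": "Your verification code is {####}.",
    "os_dashboards_ec2_enable_traffic_from_ip": "203.0.113.10/32"
  }
  ```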
- Then add AWS credentials for a user or role you've created to the terminal session by setting these variables:

  ```sh
  export AWS_ACCESS_KEY_ID="your aws access key ID"
  export AWS_SECRET_ACCESS_KEY="your aws secret key"
  # optionally, set the session token if you're using short-term credentials instead of an IAM user (best practice)
  export AWS_SESSION_TOKEN="your aws session token"
  ```
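  To confirm the credentials are active before proceeding, you can run:

  ```sh
  aws sts get-caller-identity
  ```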
- If you've never installed a CDK stack in this account and region, you'll need to bootstrap the region in that account. Run the following, where `account_id` is your 12-digit AWS account ID and `region` is the region code, like `us-west-2`:

  ```sh
  cdk bootstrap aws://account_id/region
  ```
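  For example, with a (made-up) account ID of 123456789012 in us-west-2, that would be:

  ```sh
  cdk bootstrap aws://123456789012/us-west-2
  ```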
- Then run:

  ```sh
  screen
  ./install_cdk.sh
  ```

  Optionally add `-y` if you want to pre-approve the IAM changes it's going to prompt you for. If you don't use `-y`, be aware that there are multiple stacks being installed, and they'll all prompt you.

  The `screen` command keeps the installation running in a session that survives terminal disconnects, so if you do get disconnected, the install will continue.
  If you do get disconnected, reconnect to the Session Manager terminal and reattach to the screen session:

  ```sh
  sudo su - cdkuser
  screen -r
  ```

  Type Ctrl+A and then D to detach from the screen session and leave it running.
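  If `screen -r` reports more than one session, list them and reattach by ID (the session ID below is just an example):

  ```sh
  screen -ls
  screen -r 12345
  ```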
- It may take up to 60 minutes to deploy the whole stack, backend and frontend. At the end of the frontend deployment, it will print out the URL of the CloudFront distribution for the frontend UI. Click that to get started.
If you're familiar with using the CDK and you already have your account bootstrapped, just do the following:

- Clone the repo and cd into it:

  ```sh
  git clone https://github.com/aws-samples/multi-tenant-full-stack-rag-application-demo.git
  cd multi-tenant-full-stack-rag-application-demo
  ```

- Paste your AWS credentials into the terminal.
- Bootstrap your account and region if you've never used CDK in this region (`cdk bootstrap aws://account_id/region_code`).
- Run `./install_cdk.sh -y` (the `-y` is optional; it approves all of the IAM changes, equivalent to running `cdk deploy --require-approval never`).
- When you first get to the login screen, use the Create Account tab to create a new user. You must use the same domain name you added to the `allowed_email_domains` list in the cdk.context.json file. If you skipped that part, you can create users in the Cognito console, or update the cdk.context.json file and rerun ./install_cdk.sh to apply the changes.
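  If you'd rather script user creation than use the Cognito console, something like this should work (the user pool ID and email address are placeholders; look up your pool ID in the Cognito console):

  ```sh
  aws cognito-idp admin-create-user \
    --user-pool-id us-west-2_EXAMPLE \
    --username someone@example.com
  ```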
- After setting up your Cognito user, you should be able to log in.
- Create a document collection and upload some documents. When creating the collection, use the description to state specifically when to search this collection and when not to, as in the example below:

  Use this document collection for (fill in the use case here). Don't use it for other topics.
- Click on Start a Conversation in the left nav. Click the first Anthropic model for Haiku (`anthropic.claude-3-haiku-20240307-v1:0`), and start asking questions about your documents!
Update coming soon after I optimize the first version a bit.
| Feature | Status | Comments |
|---|---|---|
| Event-based, highly parallel, serverless ingestion | complete | Uses S3 events -> SQS -> Lambda to process docs in real time when they hit the S3 bucket. Currently, users can upload files through the UI directly. Future functionality will include bulk upload capabilities. |
| Document collections CRUD | complete | Document collections are like Bedrock KBs, backed by OpenSearch managed. Web uploads are currently supported. Bulk uploads require backend access for now, but will be supported in the future by a cross-platform client (like the WorkDocs client) that would allow easy syncing of whole folders. Will also support accepting an S3 URI and a role with permissions to access it, to sync from a given bucket and path. |
| Prompt templates CRUD | complete | Prompt templates can be created/read/updated/deleted from the web UI and associated with one or more models. For example, you can create a default template for all the Claude 3 models. |
| Text generation | complete | Chat history and the current message are automatically evaluated for whether a RAG search is needed, given all of the user's current document collections. In the future, will add faceted search-style document collection inclusion/exclusion. |
| Model parameters in the UI | complete | Provides sliders for model parameters, like in the Bedrock chat playground. |
| Multi-RAG query orchestration with graph data and semantic search | complete | Requires descriptive text in the document collections explaining which subjects they are intended to be used for. |
| Ingestion status of files | complete | |
| Entity extraction pipeline | complete | When enabled, uses Haiku and a given extraction template to store data in a Neptune database. |
| Default prompts for each model family | complete | |
| Sharing document collections across users at the collection level | complete | Works like Quip, where you can search for named users. Doesn't yet support an asterisk for sharing with everyone who's logged in. |
| Conversation history | in progress | The current conversation has memory, but saving conversations isn't offered yet. |
| Custom logos instead of "Multi-tenant full-stack rag demo" to enable customized field demos | partially complete | No logos yet, but the app title is configurable through the backend/cdk.context.json file. |
| Saving conversation history | | |
| Paging through long lists of uploaded files | | |
| Feedback | | |
| Image generation | | |
| Multi-modal queries | | |
| Response streaming | | Currently shows a "typing" gif, but not yet streaming. |
| Allow user to specify that doc collections are optimized for long or short answers | | Currently optimized for longer answers, which can also easily retrieve shorter answers, meeting both needs in the short term. |
| Sharing document collections across users at the individual document level | unlikely to be soon | |
| Looping to solve complex code generation for Q&A tasks | | |
| Add support for other doc types besides plain text | | Right now it treats everything it doesn't recognize as plain text, so it can handle code files with any extension, for example, but doesn't do PDFs or Office docs yet. |
| Add crawlers | | |
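To make the ingestion row above concrete, here's a minimal sketch of an SQS-triggered Lambda handler that unpacks S3 event notifications and fetches each new object for processing. This is illustrative only, assuming the S3 -> SQS -> Lambda wiring described above; the function and bucket names are placeholders, and the real handlers in this repo will differ.

```python
import json
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # Each SQS record wraps an S3 event notification in its body.
    for record in event["Records"]:
        s3_event = json.loads(record["body"])
        for s3_record in s3_event.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            key = s3_record["s3"]["object"]["key"]
            # Fetch the newly uploaded document...
            obj = s3.get_object(Bucket=bucket, Key=key)
            text = obj["Body"].read().decode("utf-8", errors="replace")
            # ...then chunk, embed, and index it into the vector store
            # (omitted here; this is where the OpenSearch indexing would happen).
            print(f"Ingesting s3://{bucket}/{key} ({len(text)} chars)")
```

Because each document arrives as its own SQS message, Lambda scales the number of concurrent ingestion workers automatically, which is what makes the pipeline "highly parallel" without any servers to manage.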