Question and Answer application for competition notices using Amazon Bedrock, LangChain, Qdrant, AWS ECS, and FastAPI
The Retrieval-Augmented Generation (RAG) architecture combines a Large Language Model (LLM), the generation component, with an external vector store, the retrieval component, which holds proprietary data, to produce more accurate answers. RAG grounds the content generated by the LLM in existing information, which reduces a well-known problem with LLMs called hallucination (plausible but incorrect output).
The retrieval component finds information that matches an input query: it ranks the documents stored in the vector store and returns the best matches. RAG therefore allows an LLM to generate content about material it was never trained on, without updating its weights.
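The ranking step described above can be illustrated with a minimal sketch. It assumes cosine similarity as the ranking metric and uses tiny hand-made 3-dimensional vectors in place of real embeddings; the function names and the sample texts are hypothetical and are not taken from this project.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, treat it as unrelated
    return dot / (norm_a * norm_b)


def retrieve(query_vector, documents, top_k=2):
    """Rank (text, vector) pairs by similarity to the query and return the top_k texts."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


# Toy data: made-up texts with made-up vectors, for illustration only.
DOCUMENTS = [
    ("Registration closes on the date stated in the notice.", [0.9, 0.1, 0.0]),
    ("The exam covers mathematics and Portuguese.", [0.1, 0.9, 0.1]),
    ("The registration fee is paid by bank slip.", [0.8, 0.2, 0.1]),
]

if __name__ == "__main__":
    # A query vector pointing along the first axis matches the two registration texts.
    for text in retrieve([1.0, 0.0, 0.0], DOCUMENTS, top_k=2):
        print(text)
```

In the real application the vectors would come from an embeddings model and the ranking would be done by the vector store, so this only shows the idea.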
Reference:
https://superlinked.com/vectorhub/retrieval-augmented-generation
Many people end up giving up reading competition notices due to different factors such as too much information, inaccessible font size, and difficulty in interpretation. This project aims to build a generative AI application to help candidates quickly and easily understand competition notices.
- Upload the Terraform state file to AWS S3.
- Push the code and Terraform scripts to GitHub.
- Trigger GitHub Actions.
- Use GitHub Actions to leverage Terraform for creating S3 and Lambda Function infrastructure, and for uploading documents.
- Trigger the Lambda Function via S3 to process documents.
- Use a container image stored in ECR within the Lambda Function. This image contains all the code needed to convert PDFs to embeddings using LangChain and the Amazon Bedrock embeddings model.
- Upload the embeddings to Qdrant Cloud using the Python API client.
- Use GitHub Actions and AWS CLI to upload the Qdrant URL and API key to AWS Secrets Manager.
- Use GitHub Actions and Terraform to create an ECR repository and all other required resources, such as the AWS network (VPC, subnets, internet gateway, NAT gateway, routes, security groups, etc.), Amazon CloudWatch, Elastic Load Balancer, API Gateway, and VPC link. Log in to ECR and use the aws-actions GitHub Actions to build, tag, and push the Docker image to ECR.
- Pull the Docker image from ECR using ECS.
- Make a call to the AWS API Gateway from the user's end.
- Route the request from the AWS API Gateway to the VPC link, enabling communication between the API Gateway and the Amazon ECS service within the Amazon VPC.
- Redirect traffic via the Elastic Load Balancer to the least used node, ensuring a balanced load across each container running the same service.
- Retrieve the Qdrant Cloud credentials from the AWS Secrets Manager using the ECS service.
- Access Qdrant Cloud using its API to get the document collection via the ECS Service.
- Integrate the AWS Bedrock Foundation Model and the embeddings from Qdrant Cloud using Langchain.
- Generate an answer about the documents for the user using the embeddings from Qdrant Cloud via the LLM.
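The S3-triggered step in the list above (the Lambda Function that processes uploaded documents) could begin roughly as follows. This is a sketch, not the project's actual handler: it only reads the bucket name and object key from the standard S3 event notification structure and filters for PDFs. The helper name `extract_pdf_objects` is hypothetical, and the embedding and Qdrant upload steps are left as a placeholder comment.

```python
from urllib.parse import unquote_plus


def extract_pdf_objects(event):
    """Return (bucket, key) pairs for every PDF referenced in an S3 event notification."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record.get("s3", {})
        bucket = s3_info.get("bucket", {}).get("name")
        raw_key = s3_info.get("object", {}).get("key")
        if not bucket or not raw_key:
            continue  # skip records that are not S3 object notifications
        key = unquote_plus(raw_key)  # object keys arrive URL-encoded in the event
        if key.lower().endswith(".pdf"):
            objects.append((bucket, key))
    return objects


def lambda_handler(event, context):
    """Entry point invoked by Lambda when S3 delivers a notification."""
    pdfs = extract_pdf_objects(event)
    # Placeholder: here the real function would load each PDF, create embeddings
    # with the Bedrock embeddings model, and upload them to Qdrant Cloud.
    return {"statusCode": 200, "processed": [key for _, key in pdfs]}
```

Decoding the key with `unquote_plus` matters because a file name containing spaces reaches the function with `+` in place of each space.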
How to create a vector store with Lambda, s3, Bedrock and Qdrant Cloud
Microservice with ECS, Docker and FastAPI
How to upload the Terraform state file and PDFs
/
/ask
- Install pipx (in GitHub Codespaces, skip to Install Poetry, as pipx is already installed)
Linux
sudo apt update
sudo apt install pipx
pipx ensurepath
Windows
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
Invoke-RestMethod -Uri https://get.scoop.sh | Invoke-Expression
scoop install pipx
pipx ensurepath
Now open a new terminal to use pipx
- Install Poetry
# Install Poetry
pipx install --force poetry
# Enable tab completion for Bash
poetry completions bash >> ~/.bash_completion
# Init Poetry
poetry init
# Install Poetry dependencies
poetry install
# Check Poetry version
poetry --version
- Install Terraform (Linux). For more information see Terraform
# Install Terraform by HashiCorp
wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
# Add the official HashiCorp Linux repository
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
# Update and install
sudo apt update && sudo apt install terraform
# Verify the installation
terraform --version
Alternatively, run the Bash script install_terraform.sh in the terminal.
- Enable Bedrock Foundation Models
Then, navigate to the AWS console, open Amazon Bedrock, and go to Model access. Enable the foundation models that you want to use. The bedrock_tutorial in this project explains how to request model access.
- Install AWS CLI
Finally, install the AWS CLI so that Terraform can use the AWS provider. Refer to cliv2-linux-install for more information.
To install the AWS CLI, execute the following commands in the terminal:
# Download the AWS CLI installer
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
# Unzip the AWS CLI
unzip awscliv2.zip
# Install the AWS CLI
sudo ./aws/install
# Clean up the files
rm -rf awscliv2.zip aws
# Verify the AWS CLI
aws --version
Alternatively, you can run the provided Bash script install_aws_cli.sh in the terminal to streamline the installation process.
In the terminal run the following command:
# Configure the AWS CLI
aws configure
To verify your credentials, you can use one of the following commands in the terminal:
aws sts get-caller-identity
make aws-user
This command will retrieve details about the user, including user ID and account ID.
{
"UserId": "##############",
"Account": "############",
"Arn": "arn:aws:iam::###########:user/##########"
}
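If you want to check this output from a script, for example to confirm that the CLI is configured for the expected account before running Terraform, a sketch like the one below could be used. It assumes the JSON shape shown above; the function name is hypothetical, and the account ID and user name in the usage example are placeholders.

```python
import json


def parse_caller_identity(raw_json):
    """Extract the account ID and principal name from `aws sts get-caller-identity` output."""
    identity = json.loads(raw_json)
    # ARN layout: arn:partition:service:region:account-id:resource
    parts = identity["Arn"].split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError("unexpected ARN format: " + identity["Arn"])
    principal_type, _, principal_name = parts[5].partition("/")
    return {
        "account": identity["Account"],
        "principal_type": principal_type,
        "principal_name": principal_name,
    }


if __name__ == "__main__":
    # Placeholder values, not real credentials.
    sample = json.dumps({
        "UserId": "EXAMPLEUSERID",
        "Account": "123456789012",
        "Arn": "arn:aws:iam::123456789012:user/example-user",
    })
    print(parse_caller_identity(sample))
```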
To access Qdrant Cloud via the Client SDK, you need to create a cluster in Qdrant Cloud and obtain a Token and the cluster URL.
- Follow the instructions on how to set up a free cluster by visiting the following link:
https://qdrant.tech/documentation/cloud/quickstart-cloud/
- Export the Qdrant token and cluster URL
Use the following command in the terminal to export secrets:
export QDRANT_URL="<YOUR QDRANT CLOUD URL>"
export QDRANT_API_KEY="<YOUR API KEY>"
- (Optional) Run the app locally for testing
make run-app
Next, navigate to http://127.0.0.1:8000 or http://127.0.0.1:8000/docs in your web browser.
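When running locally, the application needs the QDRANT_URL and QDRANT_API_KEY variables exported earlier. A minimal sketch of reading them with a clear error when one is missing is shown below; the function name `load_qdrant_settings` is hypothetical and the project may read these values differently.

```python
import os


def load_qdrant_settings(environ=None):
    """Read the Qdrant URL and API key from the environment, failing early if absent."""
    env = os.environ if environ is None else environ
    required = ("QDRANT_URL", "QDRANT_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {"url": env["QDRANT_URL"], "api_key": env["QDRANT_API_KEY"]}
```

Failing at startup with the names of the missing variables is usually easier to debug than a connection error on the first request.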
This project offers two deployment options: manual execution in the terminal and CI/CD with GitHub Actions.
As the Terraform backend is configured to utilize a Terraform state file stored in AWS S3, the initial step is to upload the state file to S3.
- Execute the following command to initialize Terraform
make tf-init
- In the terminal, execute the following command to upload the state file to AWS S3:
make tf-upload
Follow the steps below to create the AWS infrastructure:
- First, update the AWS region in the src/app/main.py file if you are using another region
File: src/app/main.py
AWS_DEFAULT_REGION = "us-east-1" # Set this to your preferred AWS region, e.g. us-west-1
- Use the following command in the terminal to create all AWS resources using Terraform. This command will invoke Terraform to configure all the necessary infrastructure.
make tf-apply
- Deploy the application to ECS using the make command:
make aws-deploy
Automatically deploy using GitHub actions for Continuous Integration and Continuous Deployment (CI/CD)
If you want to deploy this application to AWS ECS using GitHub actions you will need to follow some more steps:
- Generate a Terraform API Token and a secret key in GitHub. Refer to Terraform API token inside this project.
- Save secret keys in GitHub Actions by providing your AWS credentials and Qdrant credentials. Check out Github Actions Secret Keys.
- Replace the following environment variables in the .github/workflows/ci.yml, .github/workflows/cd.yml, and src/app/main.py files if you are using a different AWS region
Directory: .github/workflows
env:
AWS_REGION: us-east-1 # Set this to your preferred AWS region, e.g. us-west-1
File: src/app/main.py
AWS_DEFAULT_REGION = "us-east-1" # Set this to your preferred AWS region, e.g. us-west-1
Congratulations! You are now ready to deploy this application using CI/CD
Terraform excels at cleanup, eliminating the need to navigate the console manually to locate each created resource. With Terraform, we can just run terraform destroy or make tf-destroy in the terminal:
cd terraform && terraform destroy
make tf-destroy
Amazon Bedrock is a fully managed service that provides a selection of high-performing foundation models (FMs) from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon. It offers a single API and a wide range of capabilities for building generative AI applications with a focus on security, privacy, and responsible AI.
Key benefits
- Offers a choice of high-performing FMs from leading AI companies, allowing users to experiment with and evaluate the best models for their use case.
- Provides the ability to privately customize FMs with user data using techniques such as fine-tuning and Retrieval Augmented Generation (RAG).
- As a serverless service, Amazon Bedrock eliminates the need for users to manage any infrastructure.
- Allows for secure integration and deployment of generative AI capabilities into user applications using familiar AWS services such as Lambda Functions and Elastic Container Service (ECS).
Terraform is an open-source Infrastructure as Code (IaC) tool, crafted for provisioning and managing cloud resources.
Key benefits:
- Declarative approach
- Enable collaboration, versioning, and integration into CI/CD pipelines
- Reusable modules
- Multi-Cloud deployment
- Automation and standardization
Amazon ECS (Elastic Container Service) is a fully managed container orchestration service facilitating the effortless deployment and scaling of containerized applications on AWS.
Key benefits:
- Simplified Operation: Eliminate the need to install or manage your container orchestration
- Auto-Scaling Configuration: Easily configure auto-scaling to match application demands
- Multiple instance types, including EC2 and Fargate, to meet specific application requirements
Fargate
- Fargate is a serverless computing engine for containers. Fargate automatically scales in and out and manages the infrastructure
- It eliminates the need to choose EC2 instances, cluster capacity, and scaling
- Fargate integrates natively with Amazon VPC, which lets you control connectivity
Amazon ECR is a managed container registry service designed to store Docker images, supporting public and private repositories.
Key benefits:
- Image Scanning for vulnerabilities within your container images
- Effectively manage image lifecycles with customizable policies
- Cross-Region and Cross-Account Replication: Facilitate seamless replication of images across regions and accounts for enhanced accessibility and redundancy
API Gateway is a fully managed service that supports containerized and web applications. API Gateway makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale.
API: A set of rules that allow different software entities to communicate with each other.
Gateway: A point of entry into a system. It often serves as a proxy that forwards requests to multiple services.
Key benefits:
- Supports RESTful APIs and WebSocket APIs
- Handles traffic management and throttling
- Handles authorization and access control
- Provides monitoring and API version management
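API Gateway throttles requests with a token bucket algorithm, in which each request consumes one token. The sketch below shows the general idea only; it is not API Gateway's implementation, and the class name, parameters, and explicit `now` argument are choices made for this illustration.

```python
class TokenBucket:
    """Simple token bucket: `rate` tokens are added per second, up to `burst` tokens."""

    def __init__(self, rate, burst):
        self.rate = float(rate)        # steady-state requests per second
        self.capacity = float(burst)   # maximum number of stored tokens
        self.tokens = float(burst)     # start with a full bucket
        self.updated_at = 0.0

    def allow(self, now):
        """Return True if a request arriving at time `now` (in seconds) is admitted."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # the caller would answer with HTTP 429 Too Many Requests


if __name__ == "__main__":
    bucket = TokenBucket(rate=1, burst=2)
    # Three requests at the same instant, then one a second later.
    print([bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)])
```

Passing the clock value in as an argument keeps the example deterministic; a real limiter would read a monotonic clock instead.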
GitHub Actions is a versatile CI/CD platform for build, test, and deployment pipelines.
Key benefits:
- Support for automatic, manual, scheduled, and event-triggered workflows
- Compatibility with Linux, Windows, and macOS virtual machines for running workflows
- Intuitive visual workflow for efficient debugging and error resolution
- Seamless integration with AWS ECS and Terraform
Docker is a platform that uses OS-level virtualization to deliver software in packages called containers. We can use Docker to create microservice applications using FastAPI and run them locally or on cloud services such as ECS.
Key benefits:
- Isolation
- Easy setup using Dockerfile
- Portability (run on on-premises servers and in the cloud)
Qdrant Cloud offers managed Qdrant instances on the cloud, serving as a powerful similarity search engine.
Key benefits:
- Seamless Integration with LangChain
- Software-as-a-Service (SaaS)
- Easy scalability
- Comprehensive Monitoring and Logging for Cluster Performance
- Availability on Major Cloud Platforms: AWS, GCP, and Azure
AWS Lambda is a serverless computing service that allows you to run code without provisioning or managing servers. It scales automatically based on workload.
Key benefits
- Eliminates the need to provision or manage servers, allowing you to focus on writing code.
- Automatically scales your applications in response to incoming requests or events, handling any scale of traffic.
- Supports various programming languages including Python, Go, Java, and more.
- Works with serverless and container tools such as Docker CLI for building, testing, and deploying functions.
Amazon S3 is an object storage service that offers industry-leading scalability, data availability, security, and performance. It caters to customers of all sizes and industries, providing storage solutions for a wide range of use cases.
Key benefits
- Offers industry-leading scalability to store and protect any amount of data.
- Provides cost-effective storage classes to help optimize costs.
AWS Secrets Manager is a service that helps manage, retrieve, and rotate database credentials, API keys, and other secrets throughout their lifecycles.
Key benefits
- Provides a centralized service to manage secrets, such as database credentials and API keys.
- Allows for secure and easy retrieval of secrets when needed.
- Supports automatic rotation of secrets to enhance security.
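Retrieving the Qdrant credentials at runtime could look like the sketch below. It assumes the secret is stored as a JSON string with the keys QDRANT_URL and QDRANT_API_KEY, which may not match how this project stores it, and the function name is hypothetical. The client is passed in as an argument so that the function can be exercised without AWS access.

```python
import json


def load_qdrant_credentials(client, secret_id):
    """Fetch a secret and return (url, api_key).

    `client` is expected to behave like boto3.client("secretsmanager"),
    whose get_secret_value(SecretId=...) call returns a dict with "SecretString".
    """
    response = client.get_secret_value(SecretId=secret_id)
    # Assumption: the secret value is a JSON object holding both entries.
    secret = json.loads(response["SecretString"])
    return secret["QDRANT_URL"], secret["QDRANT_API_KEY"]
```

In the ECS service the client would be created with boto3 and the task role would need permission to read the secret; the secret name to pass as `secret_id` depends on how it was uploaded.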
Elastic Load Balancing is a service that automatically distributes incoming traffic across multiple targets in one or more Availability Zones, ensuring high availability and fault tolerance in your applications.
Key benefits
- Automatically distributes incoming traffic across multiple targets, such as EC2 instances, containers, and IP addresses.
- Monitors the health of its registered targets and routes traffic only to the healthy ones.
- Scales as incoming traffic changes over time.
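The combination of health checks and load-aware routing can be sketched as follows. This is an illustration of the idea, not how Elastic Load Balancing is implemented: it assumes a "least outstanding requests" policy, and the target records and the function name are hypothetical.

```python
def pick_target(targets):
    """Choose the healthy target with the fewest in-flight requests.

    Each target is a dict with "name", "healthy", and "active_requests".
    """
    healthy = [target for target in targets if target["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy targets available")
    # min() returns the first target in case of a tie.
    return min(healthy, key=lambda target: target["active_requests"])["name"]


if __name__ == "__main__":
    containers = [
        {"name": "task-a", "healthy": True, "active_requests": 4},
        {"name": "task-b", "healthy": False, "active_requests": 0},
        {"name": "task-c", "healthy": True, "active_requests": 1},
    ]
    # task-b is idle but unhealthy, so it is never selected.
    print(pick_target(containers))
```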
LangChain is a robust framework designed for developing applications powered by language models. It enables the creation of context-aware applications that can reason based on the provided context.
Key benefits
- Allows the development of applications that can connect a language model to sources of context such as prompt instructions, few-shot examples, and content to ground its response in.
- Includes Python and JavaScript libraries, integrations such as Qdrant for a myriad of components, and a basic runtime for combining these components into chains and agents.
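Grounding a response in retrieved content comes down to placing that content in the prompt sent to the model. The sketch below shows the idea in plain Python without LangChain; the template wording and the function name are assumptions made for this example, not the prompt used by this project.

```python
PROMPT_TEMPLATE = """Use only the context below to answer the question.
If the answer is not in the context, say that you do not know.

Context:
{context}

Question: {question}
Answer:"""


def build_prompt(question, chunks):
    """Combine retrieved text chunks and the user question into one prompt string."""
    context = "\n\n".join(
        f"[{index}] {chunk.strip()}" for index, chunk in enumerate(chunks, start=1)
    )
    return PROMPT_TEMPLATE.format(context=context, question=question.strip())


if __name__ == "__main__":
    # Made-up chunks standing in for passages retrieved from the vector store.
    print(build_prompt(
        "How is the registration fee paid?",
        ["The registration fee is paid by bank slip.", "Registration is done online."],
    ))
```

LangChain wraps this pattern in prompt templates and retrieval chains, so in practice the framework assembles the prompt from the documents returned by the Qdrant retriever.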