A project to explore various foundation models that have vision capabilities in Amazon Bedrock.
- Image Reader uses
Claude 3 multimodal models
to interpret images or transcribe text in the images. - Image Finder uses
Titan multimodal embedding model
to find the similar images by text or image. - Image Library uses
ChromaDB
as the vector database for storing images embeddings. - Image Generator uses
Titan image generator model
to generate images.
-
Use
Python 3.11+
, and install dependencies:pip install -r requirements.txt
. -
Default bedrock region is
us-west-2
, change the value ofBEDROCK_REGION
in constant.py accordingly if you use other region. -
Request access to
Claude 3 models
andTitan models
in Bedrock if you have not done that.
Setup AWS credentials, then run cd image-reader; streamlit run Home.py
Setup AWS credentials, then run
- Customize the config.yaml
- Install dependencies
cd cdk; npm install
- Deploy
npx cdk deploy --require-approval never