
postgresql-multimodal-retrieval

Multimodal retrieval using a Vision Language Model with a PostgreSQL database - a full-stack implementation.

  • Database: PostgreSQL
  • Vision Language Model: OpenAI CLIP (transformers implementation)
  • Dataset: Hugging Face Datasets
  • Frontend: Flet / Gradio
  • Deployment: Docker
  • Infrastructure: Hugging Face Spaces

Features:

  • Image-to-image search
  • Text-to-image search
  • Hybrid search (see the sketch below for the nearest-neighbour query that underpins all three modes)
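
Under the hood, image-to-image and text-to-image search are both nearest-neighbour queries over CLIP embeddings stored in a pgvector column, and hybrid search adds a keyword leg on top. A minimal sketch of the core query using psycopg2 and the pgvector Python package (the images table and its columns are assumptions for illustration, not the repo's actual schema):

import numpy as np
import psycopg2
from pgvector.psycopg2 import register_vector

conn = psycopg2.connect(dbname="retrieval_db", user="retrieval_user")
register_vector(conn)  # lets psycopg2 send numpy arrays as pgvector values

# Stand-in for a CLIP embedding of a query image or a query sentence
query_embedding = np.random.rand(512).astype(np.float32)

with conn.cursor() as cur:
    cur.execute(
        """
        SELECT id, filename
        FROM images                -- hypothetical table: images(id, filename, embedding vector(512))
        ORDER BY embedding <=> %s  -- <=> is pgvector's cosine distance operator
        LIMIT 5
        """,
        (query_embedding,),
    )
    print(cur.fetchall())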

Setting up

Create a conda environment

conda create -n postgresql-multimodal python=3.10
conda activate postgresql-multimodal

Install PostgreSQL

conda install -c conda-forge postgresql
psql --version

Install pgvector

conda install -c conda-forge pgvector

Initialize and start PostgreSQL

initdb -D mylocal_db
pg_ctl -D mylocal_db -l logfile start

Create a database

createuser retrieval_user
createdb retrieval_db -O retrieval_user
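
Before any vector columns can be created, the pgvector extension must also be enabled inside the new database. This can be done from psql, or with a few lines of Python via psycopg2, sketched below (the repo's setup code may already handle this step):

import psycopg2

conn = psycopg2.connect(dbname="retrieval_db", user="retrieval_user")
conn.autocommit = True  # persist immediately, without an explicit commit
with conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector")  # no-op if already enabled
conn.close()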

Install packages

pip install -r requirements.txt

Install the pgmmr package in editable mode

pip install -e .
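
A quick smoke test that the editable install worked:

# The editable install should make pgmmr importable from anywhere
import pgmmr
print(pgmmr.__file__)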

Usage

Compute embeddings

Run

python compute_embeddings.py

This computes an embedding for every image and stores the vectors in the database.
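
For orientation, the core of such a script looks roughly like this; the actual model checkpoint, dataset, table schema, and batching in compute_embeddings.py may differ:

import psycopg2
from datasets import load_dataset
from pgvector.psycopg2 import register_vector
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

conn = psycopg2.connect(dbname="retrieval_db", user="retrieval_user")
register_vector(conn)

# Stand-in dataset; the repo pulls its images from Hugging Face Datasets
dataset = load_dataset("cifar10", split="train[:100]")

with conn.cursor() as cur:
    for i, row in enumerate(dataset):
        inputs = processor(images=row["img"], return_tensors="pt")
        embedding = model.get_image_features(**inputs)[0].detach().numpy()
        # Hypothetical table: images(id, embedding vector(512))
        cur.execute("INSERT INTO images (id, embedding) VALUES (%s, %s)", (i, embedding))
conn.commit()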

Query

Run a hybrid search (vector search combined with keyword search). The two result lists are fused and ranked using Reciprocal Rank Fusion (RRF).

python query.py "a cat and flower" --num_results 12
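
RRF merges two ranked lists by summing reciprocal ranks, so an image that ranks well in either the vector or the keyword results floats to the top. A minimal sketch of the fusion step (k=60 is the conventional constant; the repo's implementation may differ):

from collections import defaultdict

def reciprocal_rank_fusion(result_lists, k=60):
    """Fuse ranked lists of ids: score(d) = sum over lists of 1 / (k + rank(d))."""
    scores = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["img_3", "img_7", "img_1"]   # ids from pgvector similarity search
keyword_hits = ["img_7", "img_9", "img_3"]  # ids from full-text keyword search
print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
# img_7 and img_3 appear in both lists, so they rank first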

Gradio App

Run

python gradio_app.py
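
For orientation, a Gradio search UI of this kind is essentially one search function wrapped in an interface; gradio_app.py will differ in detail:

import gradio as gr

def search(query: str, num_results: int):
    # Placeholder: the real app runs the hybrid search against PostgreSQL
    # and returns the paths of the matching images
    return []

demo = gr.Interface(
    fn=search,
    inputs=[gr.Textbox(label="Query"), gr.Slider(1, 24, value=12, step=1, label="Results")],
    outputs=gr.Gallery(label="Matches"),
)
demo.launch()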

Demo video: hybrid-search-postgres.mp4

Flet App

flet run flet_app.py
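
Flet builds the same kind of UI from native Python controls; a minimal sketch of the pattern (what flet_app.py does in detail may differ):

import flet as ft

def main(page: ft.Page):
    page.title = "Multimodal search"
    query = ft.TextField(label="Query")
    results = ft.GridView(expand=True)

    def on_search(e):
        # Placeholder: the real app calls the hybrid search backend here
        results.controls.clear()
        page.update()

    page.add(query, ft.ElevatedButton("Search", on_click=on_search), results)

ft.app(target=main)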
