Implementation of the paper Knowledge-Enhanced Dual-stream Zero-shot Composed Image Retrieval (CVPR 2024)
- Download the CC3M dataset (we convert the images into image_byte format; the raw image data can be used as well).
- Install the GPU version of the Faiss library, then randomly sample 0.5M image-text pairs from CC3M as the bi-modality knowledge base. You can encode the database with the CLIP model first and save the features into a .pt file (refer to the code in src/eval_retrieval.py; a sketch of this step follows the list below).
- Install the Python environment:
pip install -r requirements.txt
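A minimal sketch of the knowledge-base encoding step, assuming the sampled CC3M pairs are available as `(image_path, caption)` tuples and using the OpenAI `clip` package; the file names mirror the ones mentioned below, and the pair-loading logic is a placeholder rather than the repo's exact code:

```python
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-L/14", device=device)

# Placeholder: the 0.5M (image_path, caption) pairs sampled from CC3M.
pairs = [("example.jpg", "a photo of a dog")]

image_feats, text_feats = [], []
with torch.no_grad():
    for image_path, caption in pairs:
        image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
        tokens = clip.tokenize([caption], truncate=True).to(device)
        img = model.encode_image(image)
        txt = model.encode_text(tokens)
        # L2-normalize so that inner product equals cosine similarity.
        image_feats.append((img / img.norm(dim=-1, keepdim=True)).cpu())
        text_feats.append((txt / txt.norm(dim=-1, keepdim=True)).cpu())

torch.save(torch.cat(image_feats), "cc_image_databases.pt")
torch.save(torch.cat(text_feats), "cc_text_databases.pt")
```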
For other preparation steps, please refer to the Pic2word project.
Please refer to the Hugging Face repo, where cc_image_databases.pt and cc_text_databases.pt contain the bi-modality knowledge features encoded by CLIP ViT-L/14, and image_stream.pt and text_stream.pt are example checkpoints for the two stream networks.
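A minimal sketch of how these feature databases could be loaded and indexed with GPU Faiss for nearest-neighbor search; it assumes the .pt files hold L2-normalized CLIP ViT-L/14 features of shape (N, 768), and the query batch below is random and purely illustrative:

```python
import faiss
import torch

# Load the bi-modality knowledge features (assumed L2-normalized float tensors).
image_db = torch.load("cc_image_databases.pt").float().numpy()
text_db = torch.load("cc_text_databases.pt").float().numpy()

# Flat inner-product index on GPU; on normalized vectors this is cosine similarity.
res = faiss.StandardGpuResources()
image_index = faiss.GpuIndexFlatIP(res, image_db.shape[1])
image_index.add(image_db)
# The text database can be indexed the same way for text-side knowledge.

# Illustrative query: a batch of 4 normalized feature vectors.
query = torch.randn(4, image_db.shape[1])
query = (query / query.norm(dim=-1, keepdim=True)).numpy()
scores, ids = image_index.search(query, 16)  # top-16 neighbors per query
```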
Train the two-stream networks on CC3M:
python -u src/main.py --save-frequency 1 --train-data="./cc3m/image_byte_224" --dataset-type directory --warmup 10000 --batch-size=128 --lr=1e-4 --wd=0.1 --epochs=30 --workers=6 --openai-pretrained --model ViT-L/14 --dist-url tcp://127.0.0.1:6102 --seed 999
Run the demo on an example query image:
python src/demo.py --openai-pretrained --resume ./pic2word_model.pt --retrieval-data imgnet --query_file "./data/test.jpg" --prompts "a cartoon of *" --demo-out ./demo_result --gpu 1 --model ViT-L/14
Evaluate retrieval on the CIRR benchmark:
python src/eval_retrieval.py --openai-pretrained --resume ./pic2word_model.pt --eval-mode cirr --gpu 0 --model ViT-L/14 --distributed --dist-url tcp://127.0.0.1:6101