Fine-tuning CLIP using ROCO dataset which contains image-caption pairs from PubMed articles.

sarahESL/PubMedCLIP

PubMedCLIP in Medical Visual Question Answering

This repository includes PubMedCLIP, a version of CLIP fine-tuned on ROCO image-caption pairs. We also provide pipelines for incorporating PubMedCLIP as an alternative pre-trained visual encoder in the MEVF and QCR medical visual question answering frameworks. Our experiments show that PubMedCLIP yields an improvement of up to 3% in medical visual question answering.

Citation

If you use this work in an academic publication, please cite the paper by Sedigheh Eslami, Christoph Meinel, and Gerard de Melo.

BibTeX entry:

@inproceedings{eslami2023pubmedclip,
  title={PubMedCLIP: How Much Does CLIP Benefit Visual Question Answering in the Medical Domain?},
  author={Eslami, Sedigheh and Meinel, Christoph and De Melo, Gerard},
  booktitle={Findings of the Association for Computational Linguistics: EACL 2023},
  pages={1151--1163},
  year={2023}
}
