This is an unofficial (and probably much worse) implementation of the Set-of-Mark (SoM) tools described in the paper "Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V" by Jianwei Yang, Hao Zhang, Feng Li, Xueyan Zou, Chunyuan Li, and Jianfeng Gao.
Run Google Colab with an unofficial SoM implementation. Load your image and select the appropriate `MIN_AREA_PERCENTAGE` and `MAX_AREA_PERCENTAGE` values to label the objects of interest.
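To illustrate what the two thresholds control, here is a minimal sketch of area-based mask filtering. It is not the code from this repository: the function name `filter_masks_by_area` is hypothetical, masks are assumed to be boolean NumPy arrays of the same shape as the image, and the thresholds are assumed to be fractions of the total image area (0.0 to 1.0). If your values are on a 0 to 100 scale, divide them by 100 first.

```python
import numpy as np


def filter_masks_by_area(masks, min_area_percentage, max_area_percentage):
    """Keep masks whose area, as a fraction of the image, lies within the bounds.

    Hypothetical helper for illustration only; assumes each mask is a 2D
    boolean array and both thresholds are fractions in the range [0.0, 1.0].
    """
    kept = []
    for mask in masks:
        # Fraction of image pixels covered by this mask.
        area_fraction = np.count_nonzero(mask) / mask.size
        if min_area_percentage <= area_fraction <= max_area_percentage:
            kept.append(mask)
    return kept


# Three toy masks on a 10x10 image: tiny, medium, and nearly full-frame.
tiny = np.zeros((10, 10), dtype=bool)
tiny[0, 0] = True            # 1 pixel
medium = np.zeros((10, 10), dtype=bool)
medium[:2, :] = True         # 20 pixels
large = np.zeros((10, 10), dtype=bool)
large[:9, :] = True          # 90 pixels

selected = filter_masks_by_area([tiny, medium, large], 0.05, 0.5)
```

Raising the minimum drops small, noisy segments; lowering the maximum drops masks that cover most of the frame, such as the background.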
- Wrap the current SoM implementation in a Gradio app.
We would love your help in making this repository even better! If you have run a cool experiment that you would like to share, or if you have any suggestions for improvement, feel free to open an issue or submit a pull request.