Results comparison on the Segmentation in the Wild benchmark
Segment Anything in High Quality
Lei Ke, Mingqiao Ye, Martin Danelljan, Yifan Liu, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu
ETH Zurich & HKUST
We organize the seginw folder as follows.
seginw
|____data
|____pretrained_checkpoint
|____GroundingDINO
|____segment_anything
|____test_ap_on_seginw.py
|____test_seginw.sh
|____test_seginw_hq.sh
|____logs
cd seginw
python -m pip install -e GroundingDINO
The SegInW (Segmentation in the Wild) dataset can be downloaded from Hugging Face:
cd data
wget https://huggingface.co/sam-hq-team/SegInW/resolve/main/seginw.zip
unzip seginw.zip
Expected dataset structure for SegInW
data
|____seginw
| |____Airplane-Parts
| |____Bottles
| |____Brain-Tumor
| |____Chicken
| |____Cows
| |____Electric-Shaver
| |____Elephants
| |____Fruits
| |____Garbage
| |____Ginger-Garlic
| |____Hand
| |____Hand-Metal
| |____House-Parts
| |____HouseHold-Items
| |____Nutterfly-Squireel
| |____Phones
| |____Poles
| |____Puppies
| |____Rail
| |____Salmon-Fillet
| |____Strawberry
| |____Tablets
| |____Toolkits
| |____Trash
| |____Watermelon
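Before launching an evaluation, it can help to confirm that the unzipped archive matches the tree above. The following is a minimal sketch, not part of the repository: the helper name `missing_datasets` is hypothetical, and the folder names are copied verbatim from the listing (including the `Nutterfly-Squireel` spelling used by the dataset).

```python
from pathlib import Path

# The 25 dataset folder names, copied verbatim from the expected structure above.
SEGINW_DATASETS = [
    "Airplane-Parts", "Bottles", "Brain-Tumor", "Chicken", "Cows",
    "Electric-Shaver", "Elephants", "Fruits", "Garbage", "Ginger-Garlic",
    "Hand", "Hand-Metal", "House-Parts", "HouseHold-Items",
    "Nutterfly-Squireel", "Phones", "Poles", "Puppies", "Rail",
    "Salmon-Fillet", "Strawberry", "Tablets", "Toolkits", "Trash",
    "Watermelon",
]


def missing_datasets(root):
    """Return the expected dataset folders that are absent under `root`."""
    root = Path(root)
    return [name for name in SEGINW_DATASETS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run from the seginw folder, so that data/seginw is the unzipped archive.
    missing = missing_datasets("data/seginw")
    print("missing:", missing if missing else "none")
```

An empty result means all 25 folders are present; any listed name points to an incomplete download or unzip.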
The initial checkpoints can be downloaded with:
cd pretrained_checkpoint
wget https://huggingface.co/ShilongLiu/GroundingDINO/resolve/main/groundingdino_swinb_cogcoor.pth
wget https://huggingface.co/sam-hq-team/sam-hq-training/resolve/main/pretrained_checkpoint/sam_vit_h_4b8939.pth
wget https://huggingface.co/lkeab/hq-sam/resolve/main/sam_hq_vit_h.pth
pretrained_checkpoint
|____groundingdino_swinb_cogcoor.pth
|____sam_hq_vit_h.pth
|____sam_vit_h_4b8939.pth
To evaluate on all 25 SegInW datasets:
# baseline Grounded SAM
bash test_seginw.sh
# Grounded HQ-SAM
bash test_seginw_hq.sh
To evaluate SAM 2 and HQ-SAM 2:
# baseline Grounded SAM 2
bash test_seginw_sam2.sh
# Grounded HQ-SAM 2
bash test_seginw_sam_hq2.sh
To evaluate on a single dataset, for example Airplane-Parts with Grounded HQ-SAM:
python test_ap_on_seginw.py -c GroundingDINO/groundingdino/config/GroundingDINO_SwinB.py -p pretrained_checkpoint/groundingdino_swinb_cogcoor.pth --anno_path data/seginw/Airplane-Parts/valid/_annotations_min1cat.coco.json --image_dir data/seginw/Airplane-Parts/valid/ --use_sam_hq --save_json
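The single-dataset command above can be assembled programmatically when only a few datasets need to be re-run. This is a rough sketch under two assumptions that the README does not confirm: that every dataset uses the same `valid/_annotations_min1cat.coco.json` layout as Airplane-Parts, and that omitting `--use_sam_hq` selects the baseline SAM. The helper `build_command` is hypothetical; the flags and paths are the ones shown above.

```python
import shlex

# Paths taken from the single-dataset command above.
CONFIG = "GroundingDINO/groundingdino/config/GroundingDINO_SwinB.py"
CHECKPOINT = "pretrained_checkpoint/groundingdino_swinb_cogcoor.pth"


def build_command(dataset, use_sam_hq=True, root="data/seginw"):
    """Build the argument list for evaluating one dataset.

    Assumes (not confirmed by the README) that each dataset folder has the
    same valid/ split layout as Airplane-Parts.
    """
    split_dir = f"{root}/{dataset}/valid"
    cmd = [
        "python", "test_ap_on_seginw.py",
        "-c", CONFIG,
        "-p", CHECKPOINT,
        "--anno_path", f"{split_dir}/_annotations_min1cat.coco.json",
        "--image_dir", f"{split_dir}/",
    ]
    if use_sam_hq:
        # Assumption: leaving this flag out falls back to the baseline SAM.
        cmd.append("--use_sam_hq")
    cmd.append("--save_json")
    return cmd


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for name in ["Airplane-Parts", "Bottles"]:
        print(shlex.join(build_command(name)))
```

Passing the returned list to `subprocess.run(cmd, check=True)` from the seginw folder would execute it; printing first keeps the sketch free of side effects.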
The table below shows the zero-shot image segmentation AP of Grounded SAM and Grounded HQ-SAM on the SegInW (Segmentation in the Wild) dataset.

Model Name | SAM | GroundingDINO | Mean AP | Airplane-Parts | Bottles | Brain-Tumor | Chicken | Cows | Electric-Shaver | Elephants | Fruits | Garbage | Ginger-Garlic | Hand-Metal | Hand | House-Parts | HouseHold-Items | Nutterfly-Squireel | Phones | Poles | Puppies | Rail | Salmon-Fillet | Strawberry | Tablets | Toolkits | Trash | Watermelon |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Grounded SAM | vit-h | swin-b | 48.7 | 37.2 | 65.4 | 11.9 | 84.5 | 47.5 | 71.7 | 77.9 | 82.3 | 24.0 | 45.8 | 81.2 | 70.0 | 8.4 | 60.1 | 71.3 | 35.4 | 23.3 | 50.1 | 8.7 | 32.9 | 83.5 | 29.8 | 20.8 | 30.0 | 64.2 |
Grounded HQ-SAM | vit-h | swin-b | 49.6 | 37.6 | 66.3 | 12.0 | 84.5 | 47.8 | 72.1 | 77.5 | 82.3 | 25.0 | 45.6 | 81.2 | 74.8 | 8.5 | 60.1 | 77.1 | 35.3 | 20.1 | 50.1 | 7.7 | 42.2 | 85.6 | 29.7 | 21.8 | 30.0 | 65.6 |
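The Mean AP column in the table above is numerically consistent with an unweighted average of the 25 per-dataset AP values. The sketch below checks this from the published numbers only; it is an observation about the table, not a description of how the evaluation script aggregates results, and `mean_ap` is a hypothetical helper.

```python
# Per-dataset AP values copied from the table above, in column order
# (Airplane-Parts through Watermelon).
GROUNDED_SAM = [
    37.2, 65.4, 11.9, 84.5, 47.5, 71.7, 77.9, 82.3, 24.0, 45.8,
    81.2, 70.0, 8.4, 60.1, 71.3, 35.4, 23.3, 50.1, 8.7, 32.9,
    83.5, 29.8, 20.8, 30.0, 64.2,
]
GROUNDED_HQ_SAM = [
    37.6, 66.3, 12.0, 84.5, 47.8, 72.1, 77.5, 82.3, 25.0, 45.6,
    81.2, 74.8, 8.5, 60.1, 77.1, 35.3, 20.1, 50.1, 7.7, 42.2,
    85.6, 29.7, 21.8, 30.0, 65.6,
]


def mean_ap(per_dataset_ap):
    """Unweighted mean over datasets, rounded to one decimal like the table."""
    return round(sum(per_dataset_ap) / len(per_dataset_ap), 1)


print(mean_ap(GROUNDED_SAM))     # 48.7
print(mean_ap(GROUNDED_HQ_SAM))  # 49.6
```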
The table below shows the zero-shot image segmentation AP of Grounded SAM 2 and Grounded HQ-SAM 2 on the SegInW (Segmentation in the Wild) dataset.
Model Name | SAM | GroundingDINO | Mean AP | Airplane-Parts | Bottles | Brain-Tumor | Chicken | Cows | Electric-Shaver | Elephants | Fruits | Garbage | Ginger-Garlic | Hand-Metal | Hand | House-Parts | HouseHold-Items | Nutterfly-Squireel | Phones | Poles | Puppies | Rail | Salmon-Fillet | Strawberry | Tablets | Toolkits | Trash | Watermelon |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Grounded SAM2 | vit-l | swin-b | 49.5 | 38.3 | 67.1 | 12.1 | 80.7 | 52.8 | 72.0 | 78.2 | 83.3 | 26.0 | 45.7 | 73.7 | 77.6 | 8.6 | 60.1 | 84.1 | 34.6 | 28.8 | 48.9 | 14.3 | 24.2 | 83.7 | 29.1 | 20.1 | 28.4 | 66.0 |
Grounded HQ-SAM2 | vit-l | swin-b | 50.0 | 38.6 | 66.8 | 12.0 | 81.0 | 52.8 | 71.9 | 77.2 | 83.3 | 26.1 | 45.5 | 74.8 | 79.0 | 8.6 | 60.1 | 84.7 | 34.3 | 25.5 | 48.9 | 14.1 | 34.1 | 85.7 | 29.2 | 21.5 | 28.9 | 66.6 |
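The mean AP gap between the two rows above hides considerable per-dataset variation. As an illustrative sketch (the helper `ap_deltas` is hypothetical, and the numbers are copied from the table), the per-dataset differences can be computed like this:

```python
# Dataset names and AP values copied from the table above, in column order.
DATASETS = [
    "Airplane-Parts", "Bottles", "Brain-Tumor", "Chicken", "Cows",
    "Electric-Shaver", "Elephants", "Fruits", "Garbage", "Ginger-Garlic",
    "Hand-Metal", "Hand", "House-Parts", "HouseHold-Items",
    "Nutterfly-Squireel", "Phones", "Poles", "Puppies", "Rail",
    "Salmon-Fillet", "Strawberry", "Tablets", "Toolkits", "Trash",
    "Watermelon",
]
SAM2 = [
    38.3, 67.1, 12.1, 80.7, 52.8, 72.0, 78.2, 83.3, 26.0, 45.7,
    73.7, 77.6, 8.6, 60.1, 84.1, 34.6, 28.8, 48.9, 14.3, 24.2,
    83.7, 29.1, 20.1, 28.4, 66.0,
]
HQ_SAM2 = [
    38.6, 66.8, 12.0, 81.0, 52.8, 71.9, 77.2, 83.3, 26.1, 45.5,
    74.8, 79.0, 8.6, 60.1, 84.7, 34.3, 25.5, 48.9, 14.1, 34.1,
    85.7, 29.2, 21.5, 28.9, 66.6,
]


def ap_deltas(baseline, improved):
    """Map each dataset to (improved - baseline), rounded to one decimal."""
    return {
        name: round(new - old, 1)
        for name, old, new in zip(DATASETS, baseline, improved)
    }


deltas = ap_deltas(SAM2, HQ_SAM2)
best = max(deltas.items(), key=lambda kv: kv[1])
worst = min(deltas.items(), key=lambda kv: kv[1])
print(best, worst)  # ('Salmon-Fillet', 9.9) ('Poles', -3.3)
```

In these numbers the largest gain is on Salmon-Fillet and the largest drop is on Poles, so the 0.5 mean AP improvement is driven by a few datasets rather than a uniform shift.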