- [2024-3-27]: Considering that fine-tuning YOLO-World on COCO without `mask-refine` obtains bad results, e.g., YOLO-World-L obtains 48.6 AP without `mask-refine` compared to 53.3 AP with `mask-refine`, we rethink the training process and explore new training schemes for fine-tuning without `mask-refine`. BTW, the COCO fine-tuning results are updated with higher performance (with `mask-refine`)!
NOTE:
- APZS: AP evaluated in the zero-shot setting (w/o fine-tuning on COCO dataset).
- `mask-refine`: refine the box annotations with masks, and add `CopyPaste` augmentation during training.
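
To make the box-refinement half of `mask-refine` concrete, here is a minimal sketch of deriving a tight box from an instance mask. It is an illustration only, not the code used to train these models: the helper names `box_from_mask` and `refine_box`, the nested-list mask format, and the exclusive max-edge convention are all assumptions, and the `CopyPaste` augmentation is not shown.

```python
from typing import Optional, Sequence, Tuple

# (x_min, y_min, x_max, y_max); max edges are exclusive (assumed convention).
Box = Tuple[int, int, int, int]


def box_from_mask(mask: Sequence[Sequence[int]]) -> Optional[Box]:
    """Return the tightest box enclosing all nonzero pixels, or None if the mask is empty."""
    xs = [x for row in mask for x, value in enumerate(row) if value]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def refine_box(annotated_box: Box, mask: Sequence[Sequence[int]]) -> Box:
    """Replace an annotated box with the mask-derived box; keep the original if the mask is empty."""
    tight = box_from_mask(mask)
    return tight if tight is not None else annotated_box


# Hypothetical 6x8 mask with an object covering rows 2..4 and columns 1..3.
example_mask = [[0] * 8 for _ in range(6)]
for y in range(2, 5):
    for x in range(1, 4):
        example_mask[y][x] = 1

refined = refine_box((0, 0, 8, 6), example_mask)
```

Here the loose annotated box `(0, 0, 8, 6)` shrinks to the extent of the mask, which is the sense in which the masks "refine" the box annotations.
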

model | Schedule | mask-refine | efficient neck | APZS | AP | AP50 | AP75 | weights | log |
---|---|---|---|---|---|---|---|---|---|
YOLO-World-v2-S | AdamW, 2e-4, 80e | ✔️ | ✖️ | 37.5 | 46.1 | 62.0 | 49.9 | HF Checkpoints | log |
YOLO-World-v2-M | AdamW, 2e-4, 80e | ✔️ | ✖️ | 42.8 | 51.0 | 67.5 | 55.2 | HF Checkpoints | log |
YOLO-World-v2-L | AdamW, 2e-4, 80e | ✔️ | ✖️ | 45.1 | 53.9 | 70.9 | 58.8 | HF Checkpoints | log |
YOLO-World-v2-L | AdamW, 2e-4, 80e | ✔️ | ✔️ | 45.1 | | | | HF Checkpoints | log |
YOLO-World-v2-X | AdamW, 2e-4, 80e | ✔️ | ✖️ | 46.8 | 54.7 | 71.6 | 59.6 | HF Checkpoints | log |
YOLO-World-v2-L 🔥 | SGD, 1e-3, 40e | ✖️ | ✖️ | 45.1 | 52.8 | 69.5 | 57.8 | HF Checkpoints | log |