Which dataset was used to train EVF-SAM2? #45
Comments
Same as for EVF-SAM1, we used RefCOCO/+/g, ADE20K, Objects365 (filtered & machine-annotated), PartImageNet, Humanparsing, and Pascal-part.
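For illustration only, here is a minimal PyTorch sketch of how a mixture of such referring/segmentation datasets could be combined for training. The dataset class below is a hypothetical stand-in, not the repo's actual data pipeline:

```python
import torch
from torch.utils.data import Dataset, ConcatDataset, DataLoader

class ReferringSegStub(Dataset):
    """Hypothetical placeholder for one of the datasets named above
    (RefCOCO/+/g, ADE20K, Objects365, PartImageNet, ...)."""
    def __init__(self, name: str, size: int = 100):
        self.name, self.size = name, size

    def __len__(self):
        return self.size

    def __getitem__(self, idx):
        # A real item would be (image, referring expression, target mask);
        # dummy tensors stand in here for illustration.
        return (torch.zeros(3, 224, 224),
                f"{self.name} expression",
                torch.zeros(224, 224))

names = ["refcoco", "ade20k", "objects365",
         "partimagenet", "humanparsing", "pascal_part"]
mixed = ConcatDataset([ReferringSegStub(n) for n in names])
loader = DataLoader(mixed, batch_size=8, shuffle=True)
```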
Hi, I’m a bit confused about the training scheme of EVF-SAM2. From what I can see in the code, the EvfSam2Model class is implemented in both evf_sam2.py and evf_sam2_video.py. As far as I understand, the difference between them lies in the visual model: SAM2Base in evf_sam2.py and SAM2VideoPredictor in evf_sam2_video.py. Did you train the evf_sam2.py version (with SAM2Base) on the image datasets listed above and then run inference with evf_sam2_video.py using the trained parameters? Thanks in advance!
Yes, you are right. Both SAM2Base and SAM2VideoPredictor are wrappers around the same SAM2 components (image encoder + prompt encoder + mask decoder). We train EVF-SAM2 on image datasets while keeping all SAM2 parameters frozen; the model is then able to perform zero-shot video prediction.
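As a minimal sketch of what "keeping all SAM2 params frozen" looks like in practice (assuming a standard PyTorch setup; the attribute names below are invented stand-ins, not the repo's actual fields):

```python
import torch
import torch.nn as nn

class TinyEvfLikeModel(nn.Module):
    """Toy stand-in for EvfSam2Model: a frozen 'SAM2' side plus a
    trainable multimodal encoder. The structure is illustrative only."""
    def __init__(self):
        super().__init__()
        self.visual_model = nn.Linear(16, 16)  # stands in for SAM2Base
        self.mm_encoder = nn.Linear(16, 16)    # stands in for the trainable encoder

model = TinyEvfLikeModel()

# Freeze every SAM2 parameter (image encoder, prompt encoder, mask
# decoder) so only the multimodal encoder receives gradient updates.
for p in model.visual_model.parameters():
    p.requires_grad = False
model.visual_model.eval()

# Only parameters with requires_grad=True reach the optimizer.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
```

Because the SAM2 weights never change, the checkpoint trained through evf_sam2.py can be loaded into the evf_sam2_video.py wrapper for zero-shot video inference.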
Thanks for the clarification!