Request: Open-source code for "Using OpenVCLIP features with AWT for zero-shot video recognition" #1

Closed
RichardHuang0001 opened this issue Dec 22, 2024 · 1 comment

Comments

@RichardHuang0001

Thank you for your excellent work and contributions to the community!

I was wondering if you have any plans to release the code or provide guidance on how to use OpenVCLIP to extract features and combine them with AWT for zero-shot video recognition?

Best regards

@zyuhan1999
Collaborator

Hi,

Thank you for your interest in our work! AWT comprises three key components: augment, weight, and transportation. The only difference between zero-shot image classification and video classification lies in the augmentation step. For videos, in addition to randomly cropped and flipped views of a frame, frames sampled at different timestamps of the video are also used.
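To make the video augmentation step concrete, here is a minimal sketch of how such views could be generated. It is an illustration, not the AWT code: the function names, the evenly spaced sampling, the number of timestamps, the views per frame, and the 224-pixel crop are all assumptions. Each returned view would then be passed through the Open-VCLIP image encoder, which is not shown.

```python
import numpy as np


def sample_frame_indices(num_frames, num_samples):
    """Pick evenly spaced frame indices spanning the clip (assumed strategy)."""
    return np.linspace(0, num_frames - 1, num=num_samples).round().astype(int)


def random_crop_flip(frame, crop_size, rng):
    """Random square crop, then a horizontal flip with probability 0.5."""
    height, width, _ = frame.shape
    top = rng.integers(0, height - crop_size + 1)
    left = rng.integers(0, width - crop_size + 1)
    view = frame[top:top + crop_size, left:left + crop_size]
    if rng.random() < 0.5:
        view = view[:, ::-1]  # reverse the width axis
    return view


def augment_video(video, num_timestamps=4, views_per_frame=2, crop_size=224, seed=0):
    """Build augmented views from a video array of shape (T, H, W, C).

    Returns an array of shape
    (num_timestamps * views_per_frame, crop_size, crop_size, C).
    All default values are hypothetical.
    """
    rng = np.random.default_rng(seed)
    views = []
    for idx in sample_frame_indices(len(video), num_timestamps):
        for _ in range(views_per_frame):
            views.append(random_crop_flip(video[idx], crop_size, rng))
    return np.stack(views)
```

For example, a 16-frame clip of 256x320 RGB frames yields 8 views of size 224x224 with these defaults; the set of per-view features plays the same role that augmented image crops play in the image setting.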

You can download the Open-VCLIP pre-trained checkpoint and directly perform inference. The only manual effort required is organizing the image features of each video in the specified format, as outlined here. Once organized, you can use AWT_zero_shot/evaluate.py for AWT inference.

I hope this helps!

Best regards,
Yuhan
