In the paper, an ImageNet-pretrained ResNet18 achieves a score of 77.2 using only the RGB modality. However, there is no code for UCF101 in the repo. I tried training a ResNet18 following the settings in the paper, and its accuracy is 0.43 with (batch_size, lr, epoch) = (32, 1e-3, 800). I'm confused by such a large performance gap. Could you provide some implementation details or the code for UCF101?
BTW, a 3D ResNet18 with a lot of tricks scores 74.1 in https://arxiv.org/pdf/2103.05905v2, so it seems a little weird that a ResNet18 with only the RGB modality reaches that performance so easily.
Here are our settings: batch_size = 64, lr = 1e-4, scheduler = step_LR, step = 40, decay_ratio = 0.1, optimizer = sgd, weight_decay = 1e-4
We use an ImageNet-pretrained ResNet18 as the backbone.
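For anyone trying to reproduce this, here is a minimal PyTorch sketch of how the optimizer and scheduler settings above might be wired up. It is not the authors' code: `build_optimizer` is a hypothetical helper, "step = 40" is assumed to mean 40 epochs, "decay_ratio" is assumed to map to `gamma`, and momentum is not stated in the thread, so it is left at the PyTorch default. A small linear layer stands in for the ResNet18 backbone so the snippet runs without downloading weights.

```python
import torch
from torch import nn


def build_optimizer(model: nn.Module):
    """SGD with step LR decay, using the hyperparameters quoted in the thread."""
    optimizer = torch.optim.SGD(
        model.parameters(),
        lr=1e-4,
        weight_decay=1e-4,  # momentum is not given in the thread; PyTorch default (0) is used
    )
    # Assumption: "step = 40" is counted in epochs and "decay_ratio = 0.1" is gamma.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=40, gamma=0.1)
    return optimizer, scheduler


# Stand-in for the ImageNet-pretrained ResNet18 (hypothetical placeholder module).
model = nn.Linear(512, 101)  # UCF101 has 101 classes
optimizer, scheduler = build_optimizer(model)
```

With this reading, `scheduler.step()` would be called once per epoch, so the learning rate drops from 1e-4 to 1e-5 after epoch 40. The batch size of 64 would be set on the data loader rather than here.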
For RGB modality, we evenly pick 3 frames for each sample.
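One possible reading of "evenly pick 3 frames" is to split the clip into 3 equal segments and take the middle frame of each. The sketch below assumes that interpretation; the thread does not say exactly which indices are used, and `evenly_spaced_indices` is a hypothetical helper name.

```python
def evenly_spaced_indices(num_frames: int, num_samples: int = 3) -> list[int]:
    """Return the index of the centre frame of each of `num_samples` equal segments."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    # Segment i covers [i * n / k, (i + 1) * n / k); its centre is (i + 0.5) * n / k.
    return [int((i + 0.5) * num_frames / num_samples) for i in range(num_samples)]


print(evenly_spaced_indices(30))  # [5, 15, 25]
```

Other conventions (for example, including the first and last frame) would also count as "even" sampling and give different indices.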
For the optical flow modality, we stack the horizontal component u and the vertical component v as [u, v, u] to form one three-channel frame, and select 3 frames in total.
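The [u, v, u] stacking can be sketched with NumPy as follows. This assumes `u` and `v` are 2-D arrays of shape (H, W) and uses a channel-first layout; both the layout and the helper name `flow_to_three_channels` are assumptions, not details from the thread.

```python
import numpy as np


def flow_to_three_channels(u: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Stack flow components as [u, v, u] so the result has 3 channels like an RGB frame."""
    if u.shape != v.shape:
        raise ValueError("u and v must have the same shape")
    return np.stack([u, v, u], axis=0)  # shape (3, H, W)


# Dummy flow field, for illustration only.
u = np.zeros((224, 224), dtype=np.float32)
v = np.ones((224, 224), dtype=np.float32)
frame = flow_to_three_channels(u, v)
print(frame.shape)  # (3, 224, 224)
```

Repeating u in the third channel lets a two-channel flow field feed a backbone whose first convolution expects three input channels, so the ImageNet-pretrained weights can be reused unchanged.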
Thanks a lot for providing your settings! I'll try again with these settings.