how to aggregate nx2048 features into one 2048 feature ? #2

Open
dixonhsiao opened this issue Sep 10, 2019 · 1 comment


dixonhsiao commented Sep 10, 2019

It seems that your training/eval data contains only one 2048-d 2D feature and one 2048-d 3D feature per sentence. However, the feature extractor at https://github.com/antoine77340/video_feature_extractor appears to produce nx2048 features per sentence (where n is the sentence duration in seconds for 2D, and approximately n/1.5 for 3D). How do I aggregate the nx2048 features into a single 2048 feature using temporal max-pooling, as stated in your paper? Do I just take the max value for each dimension?

dixonhsiao changed the title from "how to aggregate n*2048 features into one 2048 feature ?" to "how to aggregate nx2048 features into one 2048 feature ?" on Sep 10, 2019

bjuncek commented Dec 30, 2019

Yes, you can max-pool over the temporal axis, keeping the maximum of each dimension. For example, you could add
nn.AdaptiveMaxPool2d((1, 2048))
after feature loading.
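
To make the suggestion concrete, here is a minimal sketch of temporal max-pooling in PyTorch. It assumes the loaded features form a tensor of shape (n, 2048); the function name `temporal_max_pool` and the random `clip_features` tensor are hypothetical stand-ins, not part of this repository or of the feature extractor. It also shows one way the adaptive pooling layer might be applied, assuming the input is first reshaped to the 4D layout that layer accepts.

```python
import torch
import torch.nn as nn


def temporal_max_pool(features: torch.Tensor) -> torch.Tensor:
    """Collapse an (n, d) stack of per-segment features into one (d,) vector."""
    # Max over dim 0 (time): keep the largest value seen in each feature dimension.
    return features.max(dim=0).values


# Hypothetical stand-in for features loaded from disk: n = 7 segments, 2048 dims.
clip_features = torch.randn(7, 2048)
pooled = temporal_max_pool(clip_features)  # shape: (2048,)

# Same pooling through the adaptive layer. It pools over the last two axes,
# so the (n, 2048) tensor is viewed as (batch=1, channel=1, n, 2048) first.
adaptive = nn.AdaptiveMaxPool2d((1, 2048))
pooled_adaptive = adaptive(clip_features[None, None]).reshape(-1)  # shape: (2048,)
```

Both routes give the same vector here. The plain `max(dim=0)` form avoids the reshaping, while the layer form may be easier to drop into an existing `nn.Module`.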
