Hello, I followed the example for extracting features from speech with the AST model and adapted it to extract features from new speech using my own model. Every output I get has shape [1, 1214, 768], but I want a single feature vector per clip, i.e. shape [1, 768]. Is [1, 1214, 768] the expected shape of the final-layer output of AST, or have I made a mistake somewhere? Thank you for your assistance; I look forward to your reply.
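For context, a sequence-shaped output like [1, 1214, 768] is commonly reduced to [1, 768] by pooling over the token dimension. The sketch below is only an illustration under assumptions: `hidden_states` is a random stand-in tensor rather than real model output, and the layout [batch, tokens, hidden] with two special tokens at the front is assumed (as in the Hugging Face AST implementation, which may differ from the model used here).

```python
import torch

# Stand-in for the final-layer output; in practice this would be the
# [batch, tokens, hidden] tensor returned by the model (assumed layout).
torch.manual_seed(0)
hidden_states = torch.randn(1, 1214, 768)

# Option 1: mean-pool over all tokens (dimension 1).
mean_pooled = hidden_states.mean(dim=1)

# Option 2: average only the first two tokens, assuming they are the
# special (classification and distillation) tokens.
first_two_pooled = hidden_states[:, :2, :].mean(dim=1)

print(mean_pooled.shape)       # torch.Size([1, 768])
print(first_two_pooled.shape)  # torch.Size([1, 768])
```

Which pooling is appropriate depends on how the model was trained, so neither option should be read as the intended one for this particular model.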