RuntimeError: Tensors must have same number of dimensions: got 2 and 1 #10

Open
yachuchi opened this issue Nov 4, 2024 · 0 comments
yachuchi commented Nov 4, 2024

Hi, I get an error when I run Inference.py. I think it is just a dimension mismatch.
However, I don't understand line 46 and line 49 in Inference.py, since audio_file.shape[0] of the wav file will be "1".
You then zero-pad if "audio_file.shape[0] < (SAMPLE_RATE * set_length)", which would always be true if shape[0] is 1.
I don't understand what this part is meant to handle. Could you explain it?
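To make the question concrete, here is a minimal sketch of what I think is happening. It assumes the waveform is loaded as a 2-D tensor of shape (channels, samples); the values of SAMPLE_RATE and set_length are placeholders I picked for illustration, not necessarily the ones used in Inference.py.

```python
import torch

SAMPLE_RATE = 16000  # assumed value, for illustration only
set_length = 10      # assumed target clip length in seconds

# Stand-in for a loaded mono wav: 1 channel, 12 seconds of audio.
audio_file = torch.zeros(1, 12 * SAMPLE_RATE)

target_len = SAMPLE_RATE * set_length

# shape[0] is the channel count (1) here, not the number of samples,
# so this check passes even though the clip is longer than the target.
needs_padding_by_dim0 = audio_file.shape[0] < target_len

# Checking the last dimension compares the actual sample count.
needs_padding_by_last = audio_file.shape[-1] < target_len

print(audio_file.shape[0], needs_padding_by_dim0, needs_padding_by_last)
```

If the script expects a 1-D tensor of samples, the check on shape[0] would make sense, but with a (1, N) tensor it no longer measures the clip length.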

The error I see is as follows:
root@c4f3aefb5d65:/workspace/Prefix_AAC_ICASSP2023# python3 Inference.py 2 1 ./AudioCaps/test/c3nlaAkv9bA.wav
/opt/conda/lib/python3.10/site-packages/torchvision/datapoints/__init__.py:12: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: pytorch/vision#6753, and you can also check out pytorch/vision#7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
warnings.warn(_BETA_TRANSFORMS_WARNING)
/opt/conda/lib/python3.10/site-packages/torchvision/transforms/v2/__init__.py:54: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: pytorch/vision#6753, and you can also check out pytorch/vision#7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
warnings.warn(_BETA_TRANSFORMS_WARNING)
/opt/conda/lib/python3.10/site-packages/torchlibrosa/stft.py:686: UserWarning: Empty filters detected in mel frequency basis. Some channels will produce empty responses. Try increasing your sampling rate (and fmax) or reducing n_mels.
self.melW = librosa.filters.mel(sr=sr, n_fft=n_fft, n_mels=n_mels,
use GPT2 Tokenizer
temporal feature ver's mapping network : num_head = 8 num_layers = 4 prefix_vector_length = 15
global feature ver's mapping network : num_head = 8 num_layers = 4 prefix_vector_length = 11
Encoder freezing
GPT2 freezing
header trainable!
Traceback (most recent call last):
  File "/workspace/Prefix_AAC_ICASSP2023/Inference.py", line 63, in <module>
    audio_file = torch.cat((audio_file, pad_val), dim=0)
RuntimeError: Tensors must have same number of dimensions: got 2 and 1
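
For reference, the error can be reproduced outside the repository with the sketch below. It is only my guess at the situation: I assume audio_file arrives as a 2-D (channels, samples) tensor and pad_val is built as a 1-D tensor of zeros. SAMPLE_RATE, set_length, and the squeeze-based workaround are my own assumptions, not the author's code.

```python
import torch

SAMPLE_RATE = 16000  # assumed value, for illustration only
set_length = 10      # assumed target clip length in seconds
target_len = SAMPLE_RATE * set_length

# Stand-in for a loaded mono wav: shape (1, samples), 3 seconds long.
audio_file = torch.zeros(1, 3 * SAMPLE_RATE)

# Guess at how the padding is built: a 1-D tensor sized from shape[0].
pad_val = torch.zeros(target_len - audio_file.shape[0])

# Concatenating a 2-D tensor with a 1-D tensor raises the RuntimeError.
try:
    torch.cat((audio_file, pad_val), dim=0)
    error_message = None
except RuntimeError as err:
    error_message = str(err)
print(error_message)

# Possible workaround: drop the channel dimension first, so that
# shape[0] really is the sample count and both tensors are 1-D.
mono = audio_file.squeeze(0)
if mono.shape[0] < target_len:
    pad = torch.zeros(target_len - mono.shape[0], dtype=mono.dtype)
    mono = torch.cat((mono, pad), dim=0)
print(mono.shape)
```

With this change the short clip is padded to SAMPLE_RATE * set_length samples, but I am not sure whether the model expects a 1-D or a 2-D input afterwards.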
