RuntimeError: The size of tensor a (538) must match the size of tensor b (512) at non-singleton dimension 1 #31

j4ffle · 2022-06-08T18:54:18Z

I'm parsing conference calls and have run into this error a couple of times. I used NLTK to split the text into sentences and then passed those sentences to the classifier, following your example. It largely works, but I ran into this issue. From what I read, it arises because there are too many tokens (words) in the sentence. I manually inspected where I think the issue occurs and found a piece that is extra long. It occurs when there are a lot of semicolons. I could break up sentences at semicolons, but that doesn't seem quite right. Using word_tokenize from NLTK, there are only 488 tokens, while the error reports 538. How do you tokenize the words? I'm thinking I will truncate the sentence before passing it to the model, but to do so accurately, I need to know how many tokens the model creates.

Is my assessment of why this is happening correct, and do you have a better solution than truncating? Thanks.
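
For reference, a minimal sketch of the counting and splitting described above. It assumes the classifier uses a BERT-style subword tokenizer, which the 512 limit suggests but this post does not confirm. Such tokenizers split rare words into several pieces and add special tokens, which would explain why the model sees 538 tokens where word_tokenize sees 488. The names `count_model_tokens`, `split_into_windows`, and `whitespace_tokenize` are hypothetical helpers, and the whitespace tokenizer is only a stand-in so the example runs without downloading a model.

```python
from typing import Callable, List

MAX_LEN = 512   # sequence limit taken from the error message
RESERVED = 2    # assumption: the model adds [CLS] and [SEP], as BERT-style models do


def count_model_tokens(text: str, tokenize: Callable[[str], List[str]]) -> int:
    """Length the model would see: subword tokens plus the reserved special tokens."""
    return len(tokenize(text)) + RESERVED


def split_into_windows(
    text: str,
    tokenize: Callable[[str], List[str]],
    max_len: int = MAX_LEN,
) -> List[List[str]]:
    """Cut the token sequence into consecutive windows that each fit the limit."""
    budget = max_len - RESERVED
    if budget <= 0:
        raise ValueError("max_len must exceed the number of reserved tokens")
    tokens = tokenize(text)
    return [tokens[i:i + budget] for i in range(0, len(tokens), budget)]


def whitespace_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer (hypothetical); a real subword tokenizer yields more pieces."""
    return text.split()


# A synthetic "sentence" of 536 tokens, so the model-side count is 538.
sentence = " ".join(["word"] * 536)
print(count_model_tokens(sentence, whitespace_tokenize))
windows = split_into_windows(sentence, whitespace_tokenize)
print([len(w) for w in windows])
```

If the model is loaded through Hugging Face transformers (an assumption), passing `tokenizer.tokenize` in place of the stand-in gives the real subword count, and calling the tokenizer with `truncation=True, max_length=512` truncates without manual counting. Windowing is one alternative to truncation: each window can be classified separately and the results combined, at the cost of occasionally cutting a word or clause at a window boundary.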
