Environment info
transformers version: 4.3.3

Who can help
@patrickvonplaten

Information
Model I am using (Bert, XLNet ...): T5
The problem arises when using: my own modified scripts

To reproduce
Steps to reproduce the behavior: run this simple script
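The script itself is not included here; the following is only a minimal sketch of the kind of call that triggers the failure. It assumes a t5-small checkpoint, num_beams=2 with num_beam_groups=2 (consistent with the beam count in the traceback below), and a placeholder prefix_allowed_tokens_fn; the checkpoint, input text, and constraint function are illustrative, not the originals.

from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

input_ids = tokenizer(
    "translate English to German: The house is wonderful.", return_tensors="pt"
).input_ids


def allow_all_tokens(batch_id, sent):
    # Placeholder constraint: allow the whole vocabulary at every step.
    # (A real constraint function would return a restricted list of token ids.)
    return list(range(tokenizer.vocab_size))


out = model.generate(
    input_ids,
    num_beams=2,
    num_beam_groups=2,  # grouped (diverse) beam search
    diversity_penalty=0.5,
    prefix_allowed_tokens_fn=allow_all_tokens,  # adds PrefixConstrainedLogitsProcessor
)
print(tokenizer.decode(out[0], skip_special_tokens=True))

With num_beams equal to num_beam_groups, each beam group contains a single beam, which is what exposes the reshape bug.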
This produces the following error:

Traceback (most recent call last):
  File "debugging/grouped_beam_search.py", line 14, in <module>
    out = model.generate(
  File "/usr/local/lib/python3.8/dist-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "/mounts/Users/student/martin/.local/lib/python3.8/site-packages/transformers/generation_utils.py", line 1041, in generate
    return self.group_beam_search(
  File "/mounts/Users/student/martin/.local/lib/python3.8/site-packages/transformers/generation_utils.py", line 2161, in group_beam_search
    next_token_scores = logits_processor(
  File "/mounts/Users/student/martin/.local/lib/python3.8/site-packages/transformers/generation_logits_process.py", line 89, in __call__
    scores = processor(input_ids, scores)
  File "/mounts/Users/student/martin/.local/lib/python3.8/site-packages/transformers/generation_logits_process.py", line 458, in __call__
    for batch_id, beam_sent in enumerate(input_ids.view(-1, self._num_beams, input_ids.shape[-1])):
RuntimeError: shape '[-1, 2, 1]' is invalid for input of size 1

Expected behavior
No error.

As far as I can tell, the PrefixConstrainedLogitsProcessor still receives the original total number of beams even when grouped beam search is used, but it should receive the number of beams per group (the sub-beams): group beam search calls the logits processors on one group of beams at a time, so the input_ids.view(-1, self._num_beams, ...) call inside the processor fails when self._num_beams is the full beam count. Replacing num_beams with num_beams // num_beam_groups in the constructor call of PrefixConstrainedLogitsProcessor in the method _get_logits_processor in the file generation_utils.py should fix it (a sketch of the change is included below).

What do you think?
Thanks for your bug report! Yes, you're right -> I think we should indeed replace num_beams with num_beams // num_beam_groups. Do you want to open a PR to fix it? :-) Otherwise, I can do it as well.
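For reference, a minimal sketch of the proposed one-line change, assuming the processor list in _get_logits_processor is built roughly as in the 4.3.x source (the guard and list name are abridged and the exact surrounding code may differ):

# In GenerationMixin._get_logits_processor (transformers/generation_utils.py), abridged.
if prefix_allowed_tokens_fn is not None:
    # Before (current behaviour): the processor gets the total beam count, so its
    # input_ids.view(-1, self._num_beams, ...) breaks under group beam search.
    # processors.append(PrefixConstrainedLogitsProcessor(prefix_allowed_tokens_fn, num_beams))

    # After (proposed): pass the number of beams per group instead.
    processors.append(
        PrefixConstrainedLogitsProcessor(prefix_allowed_tokens_fn, num_beams // num_beam_groups)
    )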