
Fix beam_scores shape when token scores shape changes after logits_processor #25980

Merged: 2 commits into huggingface:main on Sep 13, 2023

Conversation

@BakerBunker (Contributor)

What does this PR do?

When the token scores' shape changes after logits_processor, next_token_scores_processed has a different shape from beam_scores[:, None].expand_as(next_token_scores); this PR fixes that issue.
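
For context, the beam-search step in generation adds the running beam_scores to the token scores returned by the logits processors. The sketch below is a paraphrase of the fix, not the exact diff, using the variable names from transformers' beam search loop:

next_token_scores_processed = logits_processor(input_ids, next_token_scores)
# Expand beam_scores against the *processed* tensor, so the addition still works
# when a custom logits processor changes the vocab dimension of the scores.
next_token_scores = next_token_scores_processed + beam_scores[:, None].expand_as(
    next_token_scores_processed  # previously: next_token_scores
)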

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [x] Did you read the contributor guideline, Pull Request section?
  • [ ] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • [ ] Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@amyeroberts (Collaborator)

cc @gante

@gante (Member) commented Sep 13, 2023

Hi @BakerBunker 👋

You wrote "When token scores shape changes after logits_processor" as the cause for the proposed changes -- this situation should not happen 🤔

Would you be able to share an example?

@BakerBunker (Contributor, Author)

Sure. I trained a model with different sizes of input and output embeddings, because the output vocab of the model is much smaller than the input vocab. Since the input and output embeddings make up a large percentage of the model's parameters, this saves a lot of GPU memory during training. However, during generation I need to align the input and output input_ids to call the generate() interface properly. Here is my code for the alignment:

import torch
from transformers import LogitsProcessor

class TokenAlignProcessor(LogitsProcessor):
    def __call__(self, input_ids, scores):
        # Scores come from the smaller output vocab; build a full-vocab tensor
        # filled with -inf so tokens outside the output vocab are never selected.
        new_score = torch.empty(scores.shape[0], len(tokenizer), device=DEVICE).fill_(
            -torch.inf
        )
        # The output vocab maps onto the last OVERLAP_TOKEN_NUMS ids of the input vocab.
        new_score[:, -OVERLAP_TOKEN_NUMS:] = scores
        return new_score
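
For illustration, a processor like this would be passed to generate() through a LogitsProcessorList; the sketch below assumes model, input_ids, tokenizer, DEVICE, and OVERLAP_TOKEN_NUMS are defined as in the setup described above:

from transformers import LogitsProcessorList

# Hypothetical call: after TokenAlignProcessor widens the scores to len(tokenizer),
# beam search must expand beam_scores against the processed (wider) tensor,
# which is what this PR changes.
outputs = model.generate(
    input_ids,
    num_beams=4,
    logits_processor=LogitsProcessorList([TokenAlignProcessor()]),
)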

@gante (Member) left a comment

@BakerBunker I see, thank you for the explanation! 🤗

Normally we don't accept changes for custom-code use cases, but since this one is harmless (under normal operation, next_token_scores_processed.shape == next_token_scores.shape), I'm in favor of accepting it to also enable your custom case :)

To make CI go green, you need to run make fixup in your local transformers root folder and commit the changes.

@gante merged commit 0fced06 into huggingface:main on Sep 13, 2023
@gante (Member) commented Sep 13, 2023

Thank you for the contribution, @BakerBunker 💛

parambharat pushed a commit to parambharat/transformers that referenced this pull request Sep 26, 2023
blbadger pushed a commit to blbadger/transformers that referenced this pull request Nov 8, 2023
EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 18, 2023