
RepetitionPenaltyLogitsProcessor and EncoderRepetitionPenaltyLogitsProcessor contain incorrect and unclear docstrings #25970

@larekrow

Description

  1. The class docstring of RepetitionPenaltyLogitsProcessor says that "tokens with higher scores are less likely to be selected". However, the paper states that "this penalized sampling works by discounting the scores of previously generated tokens", and the code lowers the score of penalized tokens (e.g. by multiplying a negative score by a penalty of 1.2, a value the paper highlights). The docstring should therefore be corrected to say that "tokens with higher scores are more likely to be selected" (see the sketch after the snippet below).

score = torch.gather(scores, 1, input_ids)
# if score < 0 then repetition penalty has to be multiplied to reduce the previous token probability
score = torch.where(score < 0, score * self.penalty, score / self.penalty)
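
For illustration, here is a minimal sketch of point 1, assuming RepetitionPenaltyLogitsProcessor can be imported from the top-level transformers namespace and applied directly to toy tensors (the vocabulary and scores below are made up):

import torch
from transformers import RepetitionPenaltyLogitsProcessor

# Toy vocabulary of 4 tokens; token 2 was already generated.
input_ids = torch.tensor([[2]])
scores = torch.tensor([[1.0, -1.0, 2.0, 0.5]])

processor = RepetitionPenaltyLogitsProcessor(penalty=1.2)
penalized = processor(input_ids, scores)
# Token 2 had a positive score, so it is divided by 1.2 (2.0 -> ~1.67):
# the repeated token ends up with a *lower* score and is therefore
# *less* likely to be selected, i.e. a higher score means more likely.
print(penalized)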

  2. EncoderRepetitionPenaltyLogitsProcessor requires an additional encoder_input_ids argument, whose docstring says it holds "the encoder_input_ids that should not be repeated within the decoder ids". However, according to the class docstring, Add hallucination filter in generate() #18354 (comment), and the code, which increases the score of tokens found within the original input ids (e.g. by multiplying a negative score by 1 / 2 = 0.5, where hallucination_penalty = 2 is the value the PR author used), these are in fact the ids that should be repeated within the decoder ids (see the sketch after the snippet below).

self.penalty = 1 / penalty
self.encoder_input_ids = encoder_input_ids
@add_start_docstrings(LOGITS_PROCESSOR_INPUTS_DOCSTRING)
def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
score = torch.gather(scores, 1, self.encoder_input_ids)
# if score < 0 then repetition penalty has to be multiplied to reduce the previous token probability
score = torch.where(score < 0, score * self.penalty, score / self.penalty)
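
Again for illustration, a minimal sketch of point 2 under the same assumptions (top-level import, made-up tensors):

import torch
from transformers import EncoderRepetitionPenaltyLogitsProcessor

# Toy vocabulary of 4 tokens; the encoder (original) input contained token 1.
encoder_input_ids = torch.tensor([[1]])
decoder_input_ids = torch.tensor([[3]])
scores = torch.tensor([[0.5, -1.0, 0.0, 2.0]])

processor = EncoderRepetitionPenaltyLogitsProcessor(penalty=2.0, encoder_input_ids=encoder_input_ids)
boosted = processor(decoder_input_ids, scores)
# Token 1 had a negative score, and self.penalty is 1 / 2 = 0.5,
# so -1.0 * 0.5 = -0.5: the token from the encoder input becomes
# *more* likely to be repeated in the decoder output, not less.
print(boosted)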

  3. Both RepetitionPenaltyLogitsProcessor and EncoderRepetitionPenaltyLogitsProcessor take a penalty argument, which is only validated to be a strictly positive float. However, the argument only acts as a penalty when penalty > 1. If 0 < penalty < 1 is given, the "penalty" becomes a "reward". The docstrings do not mention this in any way (see the sketch after the snippets below).

RepetitionPenaltyLogitsProcessor

if not isinstance(penalty, float) or not (penalty > 0):
raise ValueError(f"`penalty` has to be a strictly positive float, but is {penalty}")

EncoderRepetitionPenaltyLogitsProcessor

if not isinstance(penalty, float) or not (penalty > 0):
raise ValueError(f"`penalty` has to be a strictly positive float, but is {penalty}")
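
A sketch of point 3, under the same assumptions as above: a penalty of 0.5 passes the validation shown here but rewards repetition instead of penalizing it.

import torch
from transformers import RepetitionPenaltyLogitsProcessor

# Same toy setup as before; token 2 was already generated.
input_ids = torch.tensor([[2]])
scores = torch.tensor([[1.0, -1.0, 2.0, 0.5]])

# penalty = 0.5 is a strictly positive float, so no error is raised,
# but the repeated token's positive score is divided by 0.5 (2.0 -> 4.0),
# turning the "penalty" into a "reward".
reward = RepetitionPenaltyLogitsProcessor(penalty=0.5)
print(reward(input_ids, scores))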

@gante

Before delving into the source code and other resources, I was genuinely confused by these contradictory messages. I hope the docstrings can be rectified for the benefit of other users.
