fix repetition penalty error in modeling_utils.py #2303
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2303 +/- ##
==========================================
- Coverage 73.54% 73.52% -0.02%
==========================================
Files 87 87
Lines 14789 14793 +4
==========================================
Hits 10876 10876
- Misses 3913 3917 +4
Continue to review full report at Codecov.
Good catch.
I checked the code in https://github.com/salesforce/ctrl/blob/0f30306a8947ce0ede62e79c7e1f05a585cc56c9/generation.py#L217: in the original code, division is always used, no matter what sign the logit has. Going a bit deeper and looking at the actual logit values for different models, the following can be observed: for the models ctrl and xlm, the logit values tend to be positive, which explains why dividing by the penalty works there. For the models gpt2, openai-gpt and xlnet, the logit values tend to be negative, in which case dividing by a penalty > 1 makes a repeated token *more* likely instead of less likely. In the proposed PR, both cases are handled correctly from a logical point of view.
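The sign-aware behavior described above can be sketched as follows. This is an illustrative snippet, not the exact `modeling_utils.py` implementation; the function name `apply_repetition_penalty` is made up for this example.

```python
import torch

def apply_repetition_penalty(logits: torch.Tensor, generated_ids, penalty: float = 1.2) -> torch.Tensor:
    """Penalize already-generated tokens regardless of logit sign:
    positive logits are divided by the penalty, negative logits are
    multiplied by it, so repeated tokens always become less likely."""
    for token_id in set(generated_ids):
        score = logits[token_id]
        if score > 0:
            logits[token_id] = score / penalty
        else:
            logits[token_id] = score * penalty
    return logits

# Example: a positive and a negative logit both move toward "less likely".
logits = torch.tensor([2.0, -2.0, 1.0])
penalized = apply_repetition_penalty(logits.clone(), [0, 1], penalty=2.0)
```

With the old divide-only rule, the negative logit `-2.0` would have become `-1.0` (more likely); here it becomes `-4.0` (less likely), which is the fix the PR proposes.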
Ok, I see, thanks for documenting this. Let's go for this solution for now.
Is this fix included in the pip package? If we install via pip, will this be covered, or do we still have to install from source?
Reading this after it was mentioned in the PPLM example PR.
If we apply the same penalty to both, we would want the probabilities to stay the same, but this is what happens:
On the other hand, if we apply the penalty to the probabilities after the softmax (and we renormalize) this is what happens:
The probabilities are intact, as we want, because we don't want to penalize negative values more than we penalize positive values.
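The alternative described above can be sketched as follows: apply the penalty to the probabilities after the softmax and renormalize. The function name `penalize_probs` is hypothetical; the point is that the ratio between two penalized tokens is preserved, whereas penalizing raw logits of opposite signs distorts it.

```python
import torch

def penalize_probs(logits: torch.Tensor, repeated_ids, penalty: float = 1.3) -> torch.Tensor:
    """Penalize repeated tokens in probability space: softmax first,
    divide the repeated tokens' probabilities by the penalty, then
    renormalize so the result is still a valid distribution."""
    probs = torch.softmax(logits, dim=-1)
    for token_id in repeated_ids:
        probs[token_id] = probs[token_id] / penalty
    return probs / probs.sum()

# Example: tokens 0 and 1 have logits of opposite sign; after the
# probability-space penalty, their relative likelihood is unchanged.
logits = torch.tensor([3.0, -3.0, 0.5])
penalized = penalize_probs(logits, [0, 1], penalty=2.0)
```

Since both penalized probabilities are divided by the same constant before a common renormalization, `penalized[0] / penalized[1]` equals the original softmax ratio, which is exactly the "probabilities are intact" property argued for above.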
Sorry for the late response @w4nderlust! I think what you are saying makes a lot of sense. To implement your solution with minimal code change, one could simply change Eq. (1) to the equivalent Eq. (2). One question that remains is how to choose the new repetition penalty so that it has a similar effect on the softmax as the old one. For the different LMHead models, I calculated by how much the penalty shifts the softmax output.
It can be seen that the two parameterizations scale quite differently across models.
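The comparison above can be illustrated with a small experiment (the logit values here are made up for illustration, not the ones measured in the thread): the same sign-aware penalty rule shifts the softmax probability by a different amount depending on whether the penalized logit is positive or negative, which is why old and new penalty values are not directly comparable.

```python
import torch

def prob_drop(logit_value: float, penalty: float) -> float:
    """Return how much the softmax probability of a penalized token drops
    under the sign-aware rule (divide if positive, multiply if negative),
    in a toy two-token vocabulary [logit_value, 0.0]."""
    logits = torch.tensor([logit_value, 0.0])
    before = torch.softmax(logits, dim=-1)[0]
    penalized = logits.clone()
    if penalized[0] > 0:
        penalized[0] = penalized[0] / penalty
    else:
        penalized[0] = penalized[0] * penalty
    after = torch.softmax(penalized, dim=-1)[0]
    return (before - after).item()

# Both signs are pushed toward "less likely", but by different amounts.
drop_pos = prob_drop(2.0, 1.5)
drop_neg = prob_drop(-2.0, 1.5)
```

Both drops are positive (the fix works in both regimes), but their magnitudes differ, supporting the point that the penalty's effect depends on the typical logit range of each model.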
So I think there are three possibilities:
I would tend toward solution 2, with a clear explanation of the variable in the argument section of the language generation function. What do you think @w4nderlust and @thomwolf?
Thank you for the thorough analysis @patrickvonplaten! I believe 2 would be fine. It just scales things differently, and there's no specific reason to keep the old scaling, as it is a user-tunable parameter anyway. The fact that the default would be 0 instead of 1 could be explained, and one could point to this conversation in a comment to give the full picture. Although I understand this is not a huge issue (because of what you say in 3), I still believe 2 is better, as there could potentially be a future model that actually outputs both positive and negative logits, and in that case this could make a substantial difference in the quality of the sampling.
fix bug mentioned in #2302