Instability with the sentiment analysis example #417
Comments
Can you try without batched generation (in case you are using it)?
https://wandb.ai/costa-huang/trl/runs/dak1k9il/code?workspace=user-costa-huang has the code associated with the run. The relevant code is
Does it use batched generation?
This should have been solved with #487, so closing for now.
Hi all, I ran the latest TRL (after merging #410) with the sentiment analysis example for ten random seeds. I noticed that 3 out of 10 experiments experienced some degree of negative KL explosion, though one of them recovered 🫠... Related: #235 (comment)
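For readers unfamiliar with how a KL term can be negative at all, here is a minimal sketch of one possible mechanism. It is not TRL's code and not a diagnosis of the runs above. It assumes the KL is estimated as the mean of `log pi(x) - log pi_ref(x)` over sampled tokens. The toy distributions `policy` and `ref` and the helper names are hypothetical. When tokens are drawn from the policy, the estimate is positive in expectation; when they are drawn from some other distribution (here, the reference itself, standing in for any sampling mismatch), it can turn negative.

```python
import math
import random


def sample(probs, rng):
    """Draw one token index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def kl_estimate(policy, ref, tokens):
    """Sample-based estimate: mean of log pi(x) - log pi_ref(x) over tokens."""
    diffs = [math.log(policy[t]) - math.log(ref[t]) for t in tokens]
    return sum(diffs) / len(diffs)


rng = random.Random(0)

# Hypothetical three-token vocabularies, chosen only for illustration.
policy = [0.7, 0.2, 0.1]
ref = [0.4, 0.3, 0.3]

# Tokens actually drawn from the policy: the estimate approximates KL(policy || ref).
on_policy_tokens = [sample(policy, rng) for _ in range(10_000)]

# Assumed mismatch: tokens drawn from a different distribution than the one scored.
off_policy_tokens = [sample(ref, rng) for _ in range(10_000)]

kl_on = kl_estimate(policy, ref, on_policy_tokens)
kl_off = kl_estimate(policy, ref, off_policy_tokens)

print(f"on-policy estimate:  {kl_on:.3f}")
print(f"off-policy estimate: {kl_off:.3f}")
```

In this toy setup the on-policy estimate is positive and the mismatched one is negative. Whether anything similar happened in the seeds reported here is not established by this sketch.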