Error while saving checkpoint during training #26732
Comments
Hmmm, we have very little visibility into the error because of how your code logs it. Would it be possible to let it raise fully so that we get the traceback? Also, could you try installing from source to see if your problem is fixed? You can do so with |
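To illustrate the traceback request, here is a minimal sketch of the difference between logging only the exception message and logging the full traceback. It is not the reporter's code: `save_checkpoint`, `Dummy`, `run`, and `capture` are hypothetical names, and the failing step is simulated with `json.dumps` on a bound method, which produces the same message as in the issue.

```python
import io
import json
import logging


class Dummy:
    # Hypothetical class; its bound method stands in for the bad value.
    def method(self):
        return None


def save_checkpoint(state):
    # Hypothetical stand-in for the failing save step. json.dumps raises
    # TypeError when a value in `state` is not JSON serializable.
    return json.dumps(state)


def run(logger, state, with_traceback):
    try:
        save_checkpoint(state)
    except TypeError as err:
        if with_traceback:
            # logger.exception logs at ERROR level and appends the traceback.
            logger.exception("Error in Logs due to %s", err)
        else:
            # Only the message is kept; the failing call site is lost.
            logger.error("Error in Logs due to %s", err)


def capture(with_traceback):
    # Collect log output in memory so the two styles can be compared.
    stream = io.StringIO()
    logger = logging.getLogger(f"demo.{with_traceback}")
    logger.handlers.clear()
    logger.propagate = False
    logger.addHandler(logging.StreamHandler(stream))
    run(logger, {"callback": Dummy().method}, with_traceback)
    return stream.getvalue()


short_log = capture(with_traceback=False)
full_log = capture(with_traceback=True)
```

In this sketch `short_log` contains only the one-line message, while `full_log` also contains the stack that leads to the `json.dumps` call. Re-raising the exception with a bare `raise` inside the `except` block would expose the same information.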
@LysandreJik Please find the full traceback below
Temporary fix 🔧 The issue occurs when the tokenizer is saved as part of saving a checkpoint. I was able to work around it by removing the tokenizer parameter from the trainer, as below:
|
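For context, the error message itself can be reproduced without transformers. The sketch below assumes that some entry in the configuration being written to JSON held a bound method instead of a plain value. That is one plausible reading of the message and is not confirmed by this thread. `FakeTokenizer`, `get_pad_token`, `find_unserializable`, and the config keys are hypothetical and are not part of the transformers API.

```python
import json


class FakeTokenizer:
    # Hypothetical stand-in, not the transformers tokenizer class.
    def get_pad_token(self):
        return "<pad>"


def find_unserializable(config):
    """Return the keys whose values json.dumps cannot encode."""
    bad_keys = []
    for key, value in config.items():
        try:
            json.dumps(value)
        except TypeError:
            bad_keys.append(key)
    return bad_keys


tok = FakeTokenizer()
config = {
    "model_max_length": 4096,
    "pad_token": tok.get_pad_token,  # assumed bug: method object, call is missing
}

# Serializing the whole config fails with the message seen in the issue.
try:
    json.dumps(config)
    error_message = None
except TypeError as err:
    error_message = str(err)

# Narrow the failure down to the offending key(s).
bad = find_unserializable(config)

# Storing the returned value instead of the method makes the config serializable.
fixed = dict(config, pad_token=tok.get_pad_token())
fixed_json = json.dumps(fixed)
```

A helper like `find_unserializable` can be pointed at any dictionary that is about to be written to JSON to see which key is responsible before the save step runs.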
Hey! I think this will be fixed by #26570! Will keep you updated |
Hey @humza-sami could you try running your script with #26570? |
Hi @ArthurZucker, I followed this:
The error is still the same when I save the tokenizer.
ERROR
|
Bit strange, this worked for me |
@ArthurZucker If possible, can you share the test snippet you are using so that I can compare it with my code?
It's giving me an error. I am using the latest version of transformers, 4.34.1 |
Alright, I can indeed reproduce now, the |
I'm still working on the PR 😉 |
It's planned for this release! 🤗 One small test to fix and will be merged |
Thanks for your patience @ghost (oops), this is now fixed |
System Info
transformers version: 4.34.0
Who can help?
No response
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
I am training a CodeLlama model on a custom dataset. Training starts, but when it tries to save a checkpoint it raises an error and training stops.
ERROR:
2023-10-11 11:34:18,589 - ERROR - Error in Logs due to Object of type method is not JSON serializable
CODE:
Expected behavior
I expect the model to continue training without stopping when checkpoints are saved.