use duck-typing to ensure underlying optimizer supports schedulefree hooks #3055
Conversation
friendly ping cc @amyeroberts @muellerzr
Thanks, solution makes sense to me. cc @BenjaminBossan
Thanks, LGTM.
Just for my understanding: this is necessary because in transformers, we want to do stuff like:

```python
if hasattr(self.optimizer, "eval") and callable(self.optimizer.eval):
    self.optimizer.eval()
```

In accelerate, it was assumed that optimizer.train() and optimizer.eval() are only called if the underlying optimizer supports them, but with the proposed change to transformers they are called whenever the methods exist, which breaks that assumption.
IMO this could be confusing to debug, and it would be better if there were a dedicated method to check this, like optimizer.supports_train_eval_mode() or so (see the sketch below). But overall, the proposed solution is also okay.
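A minimal sketch of what such a dedicated check could look like on an optimizer wrapper; the class name and supports_train_eval_mode are hypothetical names from this discussion, not an existing accelerate API:

```python
class ScheduleFreeAwareOptimizer:
    """Hypothetical wrapper stub illustrating the dedicated check."""

    def __init__(self, optimizer):
        self.optimizer = optimizer

    def supports_train_eval_mode(self) -> bool:
        # One explicit place to answer: does the wrapped optimizer
        # implement the schedule-free train()/eval() hooks?
        return callable(getattr(self.optimizer, "train", None)) and callable(
            getattr(self.optimizer, "eval", None)
        )
```

Callers could then write `if optimizer.supports_train_eval_mode(): optimizer.eval()` instead of probing the methods directly.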
Agreed - this would be a nicer way to handle it!
@BenjaminBossan while that's good, there's the problem of minimum accelerate versions. We can do this; in a follow-up I'll include the
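One hedged reading of the version concern: a dedicated method only helps once the installed accelerate actually ships it, so a caller such as transformers would still need a duck-typed fallback for older versions. supports_train_eval_mode remains a hypothetical name here:

```python
def optimizer_supports_eval(optimizer) -> bool:
    # Prefer the explicit capability check if the wrapper provides one
    # (hypothetical newer accelerate); otherwise fall back to duck-typing
    # so older accelerate versions keep working.
    check = getattr(optimizer, "supports_train_eval_mode", None)
    if callable(check):
        return check()
    return callable(getattr(optimizer, "eval", None))
```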
The new duck-typing approach in huggingface/transformers#30079 is causing build failures over there because of the code path through accelerate. We need to replicate that logic here, ensuring this code path is safe to call for all optimizers (see the sketch below).
cc @amyeroberts @muellerzr @winglian #2631
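A minimal sketch of the guard described above, assuming a wrapper that holds the real optimizer in self.optimizer (as accelerate's optimizer wrapper does); the class name is illustrative, not the actual implementation in this PR:

```python
class GuardedOptimizer:
    """Illustrative wrapper showing the duck-typed guard."""

    def __init__(self, optimizer):
        self.optimizer = optimizer

    def train(self):
        # Forward the schedule-free hook only if the underlying optimizer
        # defines it, so the call is a safe no-op for all other optimizers.
        if callable(getattr(self.optimizer, "train", None)):
            self.optimizer.train()

    def eval(self):
        if callable(getattr(self.optimizer, "eval", None)):
            self.optimizer.eval()
```

With this guard in place, Trainer-style code can call optimizer.eval() unconditionally without crashing on optimizers that lack the hook.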