-
Notifications
You must be signed in to change notification settings - Fork 27.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Explicit arguments in from_pretrained
#24306
Conversation
The documentation is not available anymore as the PR was closed or merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for owrking on this. I think ignore_mismatched_sizes
and use_safetensors
might be worth exposing as well. We should also rework the docstring to group the arguments exposed first and the the others second (with titles like main arguments and power user arguments).
TODO:
|
@sgugger Would be nice if you can take a quick look 🙏 . And do you want me to deal with all framework (TF/Flax), tokenizer/processor, and also |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to add tests of the new token
argument as it's not passed along properly (unless I missed something).
from_pretrained
from_pretrained
You have a lot of tests failing to fix 😅 , sure you want a review yet? |
@sgugger No, I didn't request a new review since last time you have a look. But the changes pushed triggered you 😆 |
def _set_token_in_kwargs(self, kwargs, token=None): | ||
"""Temporary method to deal with `token` and `use_auth_token`. | ||
|
||
This method is to avoid apply the same changes in all model config classes that overwrite `from_pretrained`. | ||
|
||
Need to clean up `use_auth_token` in a follow PR. | ||
""" | ||
# Some model config classes like CLIP define their own `from_pretrained` without the new argument `token` yet. | ||
if token is None: | ||
token = kwargs.pop("token", None) | ||
use_auth_token = kwargs.pop("use_auth_token", None) | ||
|
||
if use_auth_token is not None: | ||
warnings.warn( | ||
"The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers.", FutureWarning | ||
) | ||
if token is not None: | ||
raise ValueError( | ||
"`token` and `use_auth_token` are both specified. Please set only the argument `token`." | ||
) | ||
token = use_auth_token | ||
|
||
if token is not None: | ||
# change to `token` in a follow-up PR | ||
kwargs["use_auth_token"] = token |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have quite some model classes (e.g. clip-like family) whose config classes have their own from_pretrained
.
This private _set_token_in_kwargs
method is to make life easier when dealing with token
and use_auth_token
.
We will see what the best way is in a follow up PR when we want to make those customized from_pretrained
with explicit arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sgugger Now it's the good time if you are motivated after 🦷.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh... Annoying! This fix works in the meantime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot!
def _set_token_in_kwargs(self, kwargs, token=None): | ||
"""Temporary method to deal with `token` and `use_auth_token`. | ||
|
||
This method is to avoid apply the same changes in all model config classes that overwrite `from_pretrained`. | ||
|
||
Need to clean up `use_auth_token` in a follow PR. | ||
""" | ||
# Some model config classes like CLIP define their own `from_pretrained` without the new argument `token` yet. | ||
if token is None: | ||
token = kwargs.pop("token", None) | ||
use_auth_token = kwargs.pop("use_auth_token", None) | ||
|
||
if use_auth_token is not None: | ||
warnings.warn( | ||
"The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers.", FutureWarning | ||
) | ||
if token is not None: | ||
raise ValueError( | ||
"`token` and `use_auth_token` are both specified. Please set only the argument `token`." | ||
) | ||
token = use_auth_token | ||
|
||
if token is not None: | ||
# change to `token` in a follow-up PR | ||
kwargs["use_auth_token"] = token |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh... Annoying! This fix works in the meantime.
What does this PR do?
[still incomplete]
Need to apply the same changes to other files containing
from_pretrained
(other framework, other classes like config, processor, auto, etc.) but @sgugger let me know if I am not lost already in the early stage.