
Use code on the Hub from another repo #22698

Merged
sgugger merged 6 commits into main from other_repo_code on Apr 17, 2023
Conversation

sgugger (Collaborator) commented Apr 10, 2023

What does this PR do?

This makes it easier to maintain only one source of ground truth when using the code on the Hub feature by storing the repo ID on top of the module containing the class inside the config. Thus, when saving and re-pushing a model using code on the Hub, the code is not copied over anymore, but a reference to the original repo containing the code is put.
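
Concretely, a hedged sketch of the new convention (the repo ID, module, and class names below are invented for illustration): each auto_map value in the config goes from a bare module path to a "<repo_id>--<module>.<Class>" reference:

# Illustrative auto_map entries only, not copied from the PR.
before = {"AutoModel": "modeling_custom.MyCustomModel"}                    # resolved against code copied next to the config
after = {"AutoModel": "user/custom-model--modeling_custom.MyCustomModel"}  # resolved against the source repo on the Hub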

This might be breaking if some users relied on the code being copied over when save_pretrained(xxx) is executed. To restore that old behavior, one only needs to call the register_for_auto_class method (see the second example below). With the new default, if you run:

from transformers import AutoModel

model = AutoModel.from_pretrained("hf-internal-testing/test_dynamic_model", trust_remote_code=True)
model.save_pretrained(some_path)

then some_path only contains the config and weights of the model. The config contains a reference to the repo where the code of the model is defined (hf-internal-testing/test_dynamic_model), so that it can be reloaded via

AutoModel.from_pretrained(some_path)
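
For example, here is one way to check the stored reference after saving (a sketch only; some_path is the placeholder directory from the example above, and the exact auto_map contents depend on the repo):

import json
import os

some_path = "saved_dynamic_model"  # placeholder for the directory passed to save_pretrained above

# The saved config.json is expected to carry "<repo_id>--<module>.<Class>" entries
# pointing back at hf-internal-testing/test_dynamic_model instead of copied code files.
with open(os.path.join(some_path, "config.json")) as f:
    print(json.load(f).get("auto_map"))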

To get the custom code file copied over (the behavior before this PR), just do:

from transformers import AutoModel

model = AutoModel.from_pretrained("hf-internal-testing/test_dynamic_model", trust_remote_code=True)
model.register_for_auto_class("AutoModel")
model.save_pretrained(some_path)
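
With register_for_auto_class called, the saved folder should then also contain the custom code file(s) next to the config and weights. A quick hedged check (file names depend on the repo):

import os

some_path = "saved_dynamic_model"  # same placeholder directory as above
# The listing should now include the Python file(s) defining the custom model class,
# in addition to config.json and the weight file(s).
print(sorted(os.listdir(some_path)))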

HuggingFaceDocBuilderDev commented Apr 10, 2023

The documentation is not available anymore as the PR was closed or merged.

amyeroberts (Collaborator) left a comment


Very nice 🔥 Looking forward to seeing this being used. Thanks for adding!

@@ -667,6 +667,11 @@ def _get_config_dict(
        else:
            logger.info(f"loading configuration file {configuration_file} from cache at {resolved_config_file}")

        if "auto_map" in config_dict and not is_local:
            config_dict["auto_map"] = {
                k: (f"{pretrained_model_name_or_path}--{v}" if "--" not in v else v)
A collaborator commented on the diff:

Is v always a single object here and never a list or tuple, like in the conversion that happens in tokenization_utils_base.py?

sgugger (Collaborator, Author) replied:
Should be, though it doesn't hurt to be careful. Will adapt.
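
For reference, one possible tuple-safe variant (a sketch of the adaptation only, not necessarily what the follow-up commit does):

def _prefix_auto_map_value(value, repo_id):
    # Leave None and already-prefixed values alone; recurse into lists/tuples
    # (e.g. slow/fast tokenizer pairs); otherwise prepend "<repo_id>--".
    if value is None:
        return None
    if isinstance(value, (list, tuple)):
        return type(value)(_prefix_auto_map_value(v, repo_id) for v in value)
    return value if "--" in value else f"{repo_id}--{value}"

print(_prefix_auto_map_value(("tok.MyTokenizer", "tok.MyTokenizerFast"), "user/custom-model"))
# ('user/custom-model--tok.MyTokenizer', 'user/custom-model--tok.MyTokenizerFast')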

LysandreJik (Member) left a comment

Great, LGTM! Played a bit with it locally, seems to work well. Will play a bit more with it once it lands on main.

@sgugger sgugger merged commit ea7b0a5 into main Apr 17, 2023
@sgugger sgugger deleted the other_repo_code branch April 17, 2023 15:36
@sgugger sgugger restored the other_repo_code branch April 17, 2023 18:21
sgugger added a commit that referenced this pull request Apr 17, 2023
Revert "Use code on the Hub from another repo (#22698)"

This reverts commit ea7b0a5.
novice03 pushed a commit to novice03/transformers that referenced this pull request Jun 23, 2023
* initial work

* Add other classes

* Refactor code

* Move warning and fix dynamic pipeline

* Issue warning when necessary

* Add test
novice03 pushed a commit to novice03/transformers that referenced this pull request Jun 23, 2023
Revert "Use code on the Hub from another repo (huggingface#22698)"

This reverts commit ea7b0a5.