Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix AutoModel.from_pretrained when the module name ends with y or p #31106

Conversation

monochromegane
Copy link

What does this PR do?

This PR fixes a loading error in AutoModel.from_pretrained where module names ending with y or p were incorrectly altered due to get_class_in_module removing these characters.

In the current implementation, rstrip is used to remove ".py" as a suffix from the file name, but according to the Python documentation, rstrip should not be used for this purpose.

The chars argument is not a suffix; rather, all combinations of its values are stripped

Therefore, in this pull request, I replace rstrip with str.removesuffix() as recommended by the documentation.
Below are the steps to reproduce the issue and the results after the fix.

Steps to Reproduce of get_class_in_module

test_get_class_in_module.py

from transformers import dynamic_module_utils

print(dynamic_module_utils.get_class_in_module("Foo", "test_ends_with_foo.py"))
print(dynamic_module_utils.get_class_in_module("Yp", "test_ends_with_yp.py"))
print(dynamic_module_utils.get_class_in_module("Y", "test_ends_with_y.py"))
print(dynamic_module_utils.get_class_in_module("P", "test_ends_with_p.py"))
$ HF_MODULES_CACHE="." python test_get_class_in_module.py
<class 'test_ends_with_foo.Foo'> # Correct module name
<class 'test_ends_with_.Yp'>  # Removed yp from module name
<class 'test_ends_with_.Y'>  #  Removed y from module name
<class 'test_ends_with_.P'>. #  Removed p from module name

test_ends_with_foo.py

class Foo:
    ...

test_ends_with_yp.py

class Yp:
    ...

test_ends_with_y.py

class Y:
    ...

test_ends_with_p.py

class P:
    ...

Steps to Reproduce of AutoModel.from_pretrained

test_auto_model.py

from transformers import AutoModel

AutoModel.from_pretrained(
    "line-corporation/clip-japanese-base",
    revision="ff0d63968516eeb4ccec05e11bf679215cdd67d7",
    trust_remote_code=True,
)
$ pip install torch timm
$ python test_auto_model.py
Traceback (most recent call last):
  File "BASE_DIR/test_auto_model.py", line 3, in <module>
    AutoModel.from_pretrained(
  File "BASE_DIR/src/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    cls.register(config.__class__, model_class, exist_ok=True)
  File "BASE_DIR/src/transformers/models/auto/auto_factory.py", line 584, in register
    raise ValueError(
ValueError: The model class you are passing has a `config_class` attribute that is not consistent with the config class you passed (model has <class 'transformers_modules.line-corporation.clip-japanese-base.ff0d63968516eeb4ccec05e11bf679215cdd67d7.configuration_clyp.CLYPConfig'> and you passed <class 'transformers_modules.line-corporation.clip-japanese-base.ff0d63968516eeb4ccec05e11bf679215cdd67d7.configuration_cl.CLYPConfig'>. Fix one of those so they match!

As indicated in the error message, the library expects configuration_clyp, but it fails to load because configuration_cl (with yp removed) is provided.

Results after the fix.

$ HF_MODULES_CACHE="." python test_get_class_in_module.py
<class 'test_ends_with_foo.Foo'>
<class 'test_ends_with_yp.Yp'>
<class 'test_ends_with_y.Y'>
<class 'test_ends_with_p.P'>
$ python test_auto_model.py
model.safetensors: 100%|███████████████████████████████████████████████████████████| 787M/787M [00:29<00:00, 26.9MB/s]

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@monochromegane
Copy link
Author

@amyeroberts @ydshieh
Based on the results of git blame, I have mentioned the two reviewers involved in the related pull request.
I would appreciate it if you could review this pull request.

@ydshieh
Copy link
Collaborator

ydshieh commented May 29, 2024

Previously it was .replace(".py", ""), it's my bad sorry.

@@ -198,7 +198,7 @@ def get_class_in_module(class_name: str, module_path: Union[str, os.PathLike]) -
Returns:
`typing.Type`: The class looked for.
"""
name = os.path.normpath(module_path).rstrip(".py").replace(os.path.sep, ".")
name = os.path.normpath(module_path).removesuffix(".py").replace(os.path.sep, ".")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get

AttributeError: 'str' object has no attribute 'removesuffix'

however (on python 3.8)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your review.
Oh, I failed to consider Python 3.8 support.

@Rocketknight1
Copy link
Member

I have a PR open for this that works on Python 3.8 here

@monochromegane
Copy link
Author

I support the pull request which also supports Python 3.8.

@monochromegane
Copy link
Author

Thank you @Rocketknight1 @ydshieh

@monochromegane monochromegane deleted the fix-broken-module-name-endswith-yp branch May 29, 2024 12:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants