Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Falcon Models saved with save_pretrained no longer get saved with python files #24737

Closed
4 tasks
fadynakhla opened this issue Jul 10, 2023 · 13 comments · Fixed by #24785
Closed
4 tasks

Falcon Models saved with save_pretrained no longer get saved with python files #24737

fadynakhla opened this issue Jul 10, 2023 · 13 comments · Fixed by #24785
Assignees

Comments

@fadynakhla
Copy link
Contributor

System Info

  • transformers version: 4.30.2
  • Platform: Linux-5.15.0-75-generic-x86_64-with-glibc2.35
  • Python version: 3.10.3
  • Huggingface_hub version: 0.16.4
  • Safetensors version: 0.3.1
  • PyTorch version (GPU?): 2.0.1+cu117 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: No or N/A
  • Using distributed or parallel set-up in script?: No or N/A

Who can help?

@ArthurZucker @younesbelkada

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

When saving tiiuae/falcon models using

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("tiiuae/falcon-7b-instruct")
model.save_pretrained("/path/to/save")

the python files configuration_RW.py and modelling_RW.py are no longer saved. Loading the model with from_pretrained(...) results in the following error:

>>> model = AutoModelForCausalLM.from_pretrained("/data/test-models/falcon-40b-instruct", trust_remote_code=True)
Could not locate the configuration_RW.py inside /data/test-models/falcon-40b-instruct.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/recoverx/.cache/pypoetry/virtualenvs/test-tgi-yWaeKVH5-py3.10/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 456, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
  File "/home/recoverx/.cache/pypoetry/virtualenvs/test-tgi-yWaeKVH5-py3.10/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 953, in from_pretrained
    config_class = get_class_from_dynamic_module(class_ref, pretrained_model_name_or_path, **kwargs)
  File "/home/recoverx/.cache/pypoetry/virtualenvs/test-tgi-yWaeKVH5-py3.10/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 431, in get_class_from_dynamic_module
    final_module = get_cached_module_file(
  File "/home/recoverx/.cache/pypoetry/virtualenvs/test-tgi-yWaeKVH5-py3.10/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 247, in get_cached_module_file
    resolved_module_file = cached_file(
  File "/home/recoverx/.cache/pypoetry/virtualenvs/test-tgi-yWaeKVH5-py3.10/lib/python3.10/site-packages/transformers/utils/hub.py", line 388, in cached_file
    raise EnvironmentError(
OSError: /data/test-models/falcon-40b-instruct does not appear to have a file named configuration_RW.py. Checkout 'https://huggingface.co//data/test-models/falcon-40b-instruct/None' for available files.

Expected behavior

To be able to load the model with from_pretrained after saving it with save_pretrained either by having the python files saved or pulling them from the hub.

With transformers version = 4.27.4 using save_pretrained() does actually save the python files and the saved model can be loaded right away

@ydshieh ydshieh self-assigned this Jul 11, 2023
@ydshieh
Copy link
Collaborator

ydshieh commented Jul 12, 2023

Hi @sgugger

I checked the code snippet and indeed only config and model bin files are saved. (tested on main branch of July 10th)
I am more than happy to help and learn, but I would like to know if this behavior is expected before taking action.
(and if you want to fix directly, ok for me)

total 27038084
-rw-r--r-- 1 root root        773 Jul 12 12:41 config.json
-rw-r--r-- 1 root root        116 Jul 12 12:41 generation_config.json
-rw-r--r-- 1 root root 9962615667 Jul 12 12:41 pytorch_model-00001-of-00003.bin
-rw-r--r-- 1 root root 9939388767 Jul 12 12:42 pytorch_model-00002-of-00003.bin
-rw-r--r-- 1 root root 7784945757 Jul 12 12:42 pytorch_model-00003-of-00003.bin
-rw-r--r-- 1 root root      16924 Jul 12 12:42 pytorch_model.bin.index.json

@sgugger
Copy link
Collaborator

sgugger commented Jul 12, 2023

This is expected as the config will keep references to where the code lives, you can see it has:

  "auto_map": {
    "AutoConfig": "tiiuae/falcon-7b-instruct--configuration_RW.RWConfig",
    "AutoModelForCausalLM": "tiiuae/falcon-7b-instruct--modelling_RW.RWForCausalLM"
  },

Saving then reloading with from_pretrained from the local dir works without issue on main. I don't know what exact code sample caused the issue but on my side:

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("tiiuae/falcon-7b-instruct", trust_remote_code=True)
model.save_pretrained("/path/to/save")

new_model = AutoModelForCausalLM.from_pretrained("/path/to/save", trust_remote_code=True)

works.

@fadynakhla
Copy link
Contributor Author

fadynakhla commented Jul 12, 2023

Hey @sgugger apologies for the misunderstanding you're right I was mistaken and over simplified the code snippet causing the issue; after taking another look I've realized that the issue is how I've downloaded the model. Rather than using

AutoModelForCausalLM.from_pretrained("tiiuae/falcon-7b-instruct", trust_remote_code=True)

I first download the model locally with

git lfs install
git clone git@hf.co:tiiuae/falcon-7b-instruct

if I inspect config.json I see this:

"auto_map": {
  "AutoConfig": "configuration_RW.RWConfig",
  "AutoModelForCausalLM": "modelling_RW.RWForCausalLM"
},

which matches what is in the hub here: https://huggingface.co/tiiuae/falcon-7b-instruct/blob/main/config.json.
Then when running

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("/local/falcon-7b-instruct", trust_remote_code=True)
model.save_pretrained("/path/to/save")

new_model = AutoModelForCausalLM.from_pretrained("/path/to/save", trust_remote_code=True)

I get the error above. It may be that this is the expected behavior but it works fine with version 4.27.4 as in that case save_pretrained() actually copies over configuration_RW.py and modelling_RW.py.

My assumption is that this is issue is due to RWConfig and RWModel being defined within the model repo as opposed to within the transformers library but I may be mistaken.

@sgugger
Copy link
Collaborator

sgugger commented Jul 12, 2023

That I can reproduce. This should be fixed by the PR mentioned above.

@fadynakhla
Copy link
Contributor Author

fadynakhla commented Jul 12, 2023

That's awesome thanks, just a question or two if that's alright so I can see if I understand what's going on here:

            if os.path.isdir(pretrained_model_name_or_path):
                model_class.register_for_auto_class(cls.__name__)
            else:
                cls._model_mapping.register(config.__class__, model_class, exist_ok=True)

in case we are loading from a local trust remote code repo model_class.register_for_auto_class() sets model_class._auto_class = cls.__name__ which I believe in the case of falcon results in RWForCausalLM._auto_class = "RWForCausalLM"

Then in the call to save_pretrained() this block:

        if self._auto_class is not None:
            custom_object_save(self, save_directory, config=self.config)

get's executed which results in the modelling files being saved along with the the weights and config files. Is that correct?

Edit: one other question is there a reason why this cls._model_mapping.register(config.__class__, model_class, exist_ok=True) is used in stead of cls.register(config.__class__, model_class, exist_ok=True)?

@sgugger
Copy link
Collaborator

sgugger commented Jul 12, 2023

That's completely correct!

As for the second question, I haven't deep-dived to make sure the two do exactly the same thing, but it's probably the same yes. This line is only there so that pipeline does not complain that the model doesn't belong to the corresponding auto class when using remote code.

@fadynakhla
Copy link
Contributor Author

fadynakhla commented Jul 12, 2023

Thanks again for all your help really appreciate it! Tested this with your PR and works on my end for local falcon models!

Also cls.register() just calls cls._model_mapping.register() with an additional check

    @classmethod
    def register(cls, config_class, model_class, exist_ok=False):
        if hasattr(model_class, "config_class") and model_class.config_class != config_class:
            raise ValueError(
                "The model class you are passing has a `config_class` attribute that is not consistent with the "
                f"config class you passed (model has {model_class.config_class} and you passed {config_class}. Fix "
                "one of those so they match!"
            )
        cls._model_mapping.register(config_class, model_class, exist_ok=exist_ok)

Switching that line out to cls.register doesn't cause the above value error at least when loading falcon with from_pretrained but not sure if there are cases where it would be benificial to not have the restriction that model_class.config_class == config_class

@sgugger
Copy link
Collaborator

sgugger commented Jul 13, 2023

I think it would be fine if we add an and not exist_ok in the test. Would you like to make a PR with those changes?

@fadynakhla
Copy link
Contributor Author

Yeah would love to just want to make sure I understand the rationale behind adding and not exist_ok.
Correct me if I'm wrong but I think the reason is that if exists_ok = True we will overwrite _model_mapping anyway so we don't want to enforce the restriction that model_class.config_class == config_class; is that the right idea?

@sgugger
Copy link
Collaborator

sgugger commented Jul 13, 2023

Oh I completely misread your comment, thanks for asking a clarification. The test should be left as is, it is a consistency check, not an exist ok check. We can do the switch without adding anything.

@fadynakhla
Copy link
Contributor Author

Ok makes sense; more than happy to still make a PR for that switch if it would be helpful

@sgugger
Copy link
Collaborator

sgugger commented Jul 13, 2023

Please go ahead!

@fadynakhla
Copy link
Contributor Author

PR is linked above! One of us will have to rebase/fix conflicts as I've made these changes on top of main which hasn't incorporated your PR yet

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants