
failed to load model through jaxformer #2

Open

damengdameng opened this issue Jul 11, 2023 · 0 comments

damengdameng commented Jul 11, 2023

First of all, thank you for your excellent work and effort! I have some questions regarding running this project, and I hope you can help me out.

I followed the README to create the environment, and I am confident that it meets all the requirements of this project and of the jaxformer project. However, when I tried to load the pretrained CodeGen-6B-mono model with jaxformer.hf.codegen.modeling_codegen.CodeGenForCausalLM.from_pretrained, like this:

from jaxformer.hf import sample
from jaxformer.hf.codegen import modeling_codegen

# "codegen-6B-mono" is the locally downloaded checkpoint directory
model = modeling_codegen.CodeGenForCausalLM.from_pretrained(
    "codegen-6B-mono", low_cpu_mem_usage=True
)

I encountered the following error:

Traceback (most recent call last):
  File "jaxtest.py", line 16, in <module>
    from jaxformer.hf import sample
  File "/xxxx/ILF-for-code-generation-main/src/jaxformer/jaxformer/hf/sample.py", line 29, in <module>
    from jaxformer.hf.codegen.modeling_codegen import CodeGenForCausalLM
  File "/xxxx/ILF-for-code-generation-main/src/jaxformer/jaxformer/hf/codegen/modeling_codegen.py", line 27, in <module>
    from transformers.utils import add_code_sample_docstrings, add_start_docstrings, add_start_docstrings_to_model_forward, logging
ImportError: cannot import name 'add_code_sample_docstrings' from 'transformers.utils' (/xxxx/llm_env/ilf/lib/python3.7/site-packages/transformers/utils/__init__.py)
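Before changing anything, I double-checked which transformers installation the environment was actually resolving, and that the failing symbol really is missing there (a quick sanity check run in the same virtualenv; the comments reflect what I saw in my setup):

import transformers

print(transformers.__version__)   # 4.12.5 in my environment
print(transformers.__file__)      # points into /xxxx/llm_env/ilf/...

# On 4.12.5 this reproduces the ImportError from the traceback above;
# newer transformers releases re-export the helper from transformers.utils.
from transformers.utils import add_code_sample_docstrings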

Following this jaxformer pull request (https://github.com/salesforce/jaxformer/pull/30), I bumped transformers from 4.12.5 to 4.30.0. The ImportError went away, but loading the checkpoint then printed the following messages:

Some weights of the model checkpoint at /xxxx/ILF-for-code-generation-main/checkpoints/codegen-6B-mono were not used when initializing CodeGenForCausalLM: ['transformer.h.15.attn.masked_bias', 'transformer.h.30.attn.bias', 'transformer.h.17.attn.masked_bias', 'transformer.h.3.attn.masked_bias', 'transformer.h.18.attn.bias', 'transformer.h.13.attn.masked_bias', 'transformer.h.12.attn.masked_bias', 'transformer.h.23.attn.bias', 'transformer.h.28.attn.bias', 'transformer.h.26.attn.masked_bias', 'transformer.h.9.attn.bias', 'transformer.h.15.attn.bias', 'transformer.h.21.attn.bias', 'transformer.h.19.attn.masked_bias', 'transformer.h.19.attn.bias', 'transformer.h.8.attn.bias', 'transformer.h.21.attn.masked_bias', 'transformer.h.30.attn.masked_bias', 'transformer.h.1.attn.bias', 'transformer.h.29.attn.bias', 'transformer.h.25.attn.bias', 'transformer.h.25.attn.masked_bias', 'transformer.h.22.attn.bias', 'transformer.h.17.attn.bias', 'transformer.h.0.attn.masked_bias', 'transformer.h.6.attn.masked_bias', 'transformer.h.31.attn.bias', 'transformer.h.13.attn.bias', 'transformer.h.14.attn.masked_bias', 'transformer.h.10.attn.masked_bias', 'transformer.h.2.attn.bias', 'transformer.h.6.attn.bias', 'transformer.h.20.attn.bias', 'transformer.h.4.attn.bias', 'transformer.h.26.attn.bias', 'transformer.h.0.attn.bias', 'transformer.h.27.attn.bias', 'transformer.h.20.attn.masked_bias', 'transformer.h.11.attn.masked_bias', 'transformer.h.4.attn.masked_bias', 'transformer.h.5.attn.bias', 'transformer.h.10.attn.bias', 'transformer.h.31.attn.masked_bias', 'transformer.h.7.attn.masked_bias', 'transformer.h.16.attn.masked_bias', 'transformer.h.8.attn.masked_bias', 'transformer.h.5.attn.masked_bias', 'transformer.h.27.attn.masked_bias', 'transformer.h.24.attn.bias', 'transformer.h.29.attn.masked_bias', 'transformer.h.32.attn.bias', 'transformer.h.11.attn.bias', 'transformer.h.1.attn.masked_bias', 'transformer.h.16.attn.bias', 'transformer.h.3.attn.bias', 'transformer.h.28.attn.masked_bias', 'transformer.h.23.attn.masked_bias', 'transformer.h.7.attn.bias', 'transformer.h.32.attn.masked_bias', 'transformer.h.24.attn.masked_bias', 'transformer.h.12.attn.bias', 'transformer.h.9.attn.masked_bias', 'transformer.h.14.attn.bias', 'transformer.h.22.attn.masked_bias', 'transformer.h.2.attn.masked_bias', 'transformer.h.18.attn.masked_bias']
- This IS expected if you are initializing CodeGenForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing CodeGenForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of CodeGenForCausalLM were not initialized from the model checkpoint at /DATA/disk1/wanming/ILF-for-code-generation-main/checkpoints/codegen-6B-mono and are newly initialized: ['transformer.h.32.attn.causal_mask', 'transformer.h.24.attn.causal_mask', 'transformer.h.18.attn.causal_mask', 'transformer.h.14.attn.causal_mask', 'transformer.h.2.attn.causal_mask', 'transformer.h.11.attn.causal_mask', 'transformer.h.27.attn.causal_mask', 'transformer.h.1.attn.causal_mask', 'transformer.h.25.attn.causal_mask', 'transformer.h.22.attn.causal_mask', 'transformer.h.29.attn.causal_mask', 'transformer.h.30.attn.causal_mask', 'transformer.h.21.attn.causal_mask', 'transformer.h.10.attn.causal_mask', 'transformer.h.0.attn.causal_mask', 'transformer.h.17.attn.causal_mask', 'transformer.h.26.attn.causal_mask', 'transformer.h.23.attn.causal_mask', 'transformer.h.31.attn.causal_mask', 'transformer.h.3.attn.causal_mask', 'transformer.h.12.attn.causal_mask', 'transformer.h.5.attn.causal_mask', 'transformer.h.8.attn.causal_mask', 'transformer.h.4.attn.causal_mask', 'transformer.h.13.attn.causal_mask', 'transformer.h.9.attn.causal_mask', 'transformer.h.15.attn.causal_mask', 'transformer.h.6.attn.causal_mask', 'transformer.h.28.attn.causal_mask', 'transformer.h.16.attn.causal_mask', 'transformer.h.20.attn.causal_mask', 'transformer.h.7.attn.causal_mask', 'transformer.h.19.attn.causal_mask']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
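For what it's worth, once loading looks right I would sanity-check the model with a quick generation run, passing an explicit attention_mask and pad_token_id as the last warning suggests. This is only a minimal sketch: using the Hugging Face Salesforce/codegen-6B-mono tokenizer here is my assumption, since jaxformer ships its own tokenizer setup in jaxformer.hf.sample.

import torch
from transformers import AutoTokenizer

# Assumption: the HF hub tokenizer matches the local checkpoint's vocabulary
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-6B-mono")

inputs = tokenizer("def hello_world():", return_tensors="pt")
with torch.no_grad():
    output = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],   # addresses the attention-mask warning
        pad_token_id=tokenizer.eos_token_id,       # addresses the pad-token warning
        max_new_tokens=32,
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))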

It seems like the model did not load correctly. Could you please help me identify what went wrong in the process?
