First of all, thank you for your excellent work and effort! I have some questions regarding running this project, and I hope you can help me out.
I followed the README to create the environment, and I am confident that the environment meets all the requirements of this project and the jaxformer project. However, when I tried to load the pretrained CodeGen-6B-mono model with jaxformer.hf.codegen.modeling_codegen.CodeGenForCausalLM.from_pretrained like this:
from jaxformer.hf import sample
from jaxformer.hf.codegen import modeling_codegen
model = modeling_codegen.CodeGenForCausalLM.from_pretrained("codegen-6B-mono", low_cpu_mem_usage=True)
I encountered the following error:
Traceback (most recent call last):
File "jaxtest.py", line 16, in <module>
from jaxformer.hf import sample
File "/xxxx/ILF-for-code-generation-main/src/jaxformer/jaxformer/hf/sample.py", line 29, in <module>
from jaxformer.hf.codegen.modeling_codegen import CodeGenForCausalLM
File "/xxxx/ILF-for-code-generation-main/src/jaxformer/jaxformer/hf/codegen/modeling_codegen.py", line 27, in <module>
from transformers.utils import add_code_sample_docstrings, add_start_docstrings, add_start_docstrings_to_model_forward, logging
ImportError: cannot import name 'add_code_sample_docstrings' from 'transformers.utils' (/xxxx/llm_env/ilf/lib/python3.7/site-packages/transformers/utils/__init__.py)
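(Side note for anyone hitting the same ImportError: transformers has shipped its own CodeGen implementation since v4.21, so one possible workaround is to bypass jaxformer's vendored modeling_codegen entirely. A minimal sketch, assuming the local checkpoint directory is in Hugging Face format; the checkpoint path is illustrative:)

from transformers import AutoTokenizer, CodeGenForCausalLM

# Native transformers CodeGen class; avoids jaxformer's copy of modeling_codegen.
model = CodeGenForCausalLM.from_pretrained("codegen-6B-mono", low_cpu_mem_usage=True)
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-6B-mono")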
Following this pull request (https://github.com/salesforce/jaxformer/pull/30), I bumped transformers from 4.12.5 to 4.30.0. Then I got the following messages:
Some weights of the model checkpoint at /xxxx/ILF-for-code-generation-main/checkpoints/codegen-6B-mono were not used when initializing CodeGenForCausalLM: ['transformer.h.15.attn.masked_bias', 'transformer.h.30.attn.bias', 'transformer.h.17.attn.masked_bias', 'transformer.h.3.attn.masked_bias', 'transformer.h.18.attn.bias', 'transformer.h.13.attn.masked_bias', 'transformer.h.12.attn.masked_bias', 'transformer.h.23.attn.bias', 'transformer.h.28.attn.bias', 'transformer.h.26.attn.masked_bias', 'transformer.h.9.attn.bias', 'transformer.h.15.attn.bias', 'transformer.h.21.attn.bias', 'transformer.h.19.attn.masked_bias', 'transformer.h.19.attn.bias', 'transformer.h.8.attn.bias', 'transformer.h.21.attn.masked_bias', 'transformer.h.30.attn.masked_bias', 'transformer.h.1.attn.bias', 'transformer.h.29.attn.bias', 'transformer.h.25.attn.bias', 'transformer.h.25.attn.masked_bias', 'transformer.h.22.attn.bias', 'transformer.h.17.attn.bias', 'transformer.h.0.attn.masked_bias', 'transformer.h.6.attn.masked_bias', 'transformer.h.31.attn.bias', 'transformer.h.13.attn.bias', 'transformer.h.14.attn.masked_bias', 'transformer.h.10.attn.masked_bias', 'transformer.h.2.attn.bias', 'transformer.h.6.attn.bias', 'transformer.h.20.attn.bias', 'transformer.h.4.attn.bias', 'transformer.h.26.attn.bias', 'transformer.h.0.attn.bias', 'transformer.h.27.attn.bias', 'transformer.h.20.attn.masked_bias', 'transformer.h.11.attn.masked_bias', 'transformer.h.4.attn.masked_bias', 'transformer.h.5.attn.bias', 'transformer.h.10.attn.bias', 'transformer.h.31.attn.masked_bias', 'transformer.h.7.attn.masked_bias', 'transformer.h.16.attn.masked_bias', 'transformer.h.8.attn.masked_bias', 'transformer.h.5.attn.masked_bias', 'transformer.h.27.attn.masked_bias', 'transformer.h.24.attn.bias', 'transformer.h.29.attn.masked_bias', 'transformer.h.32.attn.bias', 'transformer.h.11.attn.bias', 'transformer.h.1.attn.masked_bias', 'transformer.h.16.attn.bias', 'transformer.h.3.attn.bias', 'transformer.h.28.attn.masked_bias', 'transformer.h.23.attn.masked_bias', 'transformer.h.7.attn.bias', 'transformer.h.32.attn.masked_bias', 'transformer.h.24.attn.masked_bias', 'transformer.h.12.attn.bias', 'transformer.h.9.attn.masked_bias', 'transformer.h.14.attn.bias', 'transformer.h.22.attn.masked_bias', 'transformer.h.2.attn.masked_bias', 'transformer.h.18.attn.masked_bias']
- This IS expected if you are initializing CodeGenForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing CodeGenForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of CodeGenForCausalLM were not initialized from the model checkpoint at /DATA/disk1/wanming/ILF-for-code-generation-main/checkpoints/codegen-6B-mono and are newly initialized: ['transformer.h.32.attn.causal_mask', 'transformer.h.24.attn.causal_mask', 'transformer.h.18.attn.causal_mask', 'transformer.h.14.attn.causal_mask', 'transformer.h.2.attn.causal_mask', 'transformer.h.11.attn.causal_mask', 'transformer.h.27.attn.causal_mask', 'transformer.h.1.attn.causal_mask', 'transformer.h.25.attn.causal_mask', 'transformer.h.22.attn.causal_mask', 'transformer.h.29.attn.causal_mask', 'transformer.h.30.attn.causal_mask', 'transformer.h.21.attn.causal_mask', 'transformer.h.10.attn.causal_mask', 'transformer.h.0.attn.causal_mask', 'transformer.h.17.attn.causal_mask', 'transformer.h.26.attn.causal_mask', 'transformer.h.23.attn.causal_mask', 'transformer.h.31.attn.causal_mask', 'transformer.h.3.attn.causal_mask', 'transformer.h.12.attn.causal_mask', 'transformer.h.5.attn.causal_mask', 'transformer.h.8.attn.causal_mask', 'transformer.h.4.attn.causal_mask', 'transformer.h.13.attn.causal_mask', 'transformer.h.9.attn.causal_mask', 'transformer.h.15.attn.causal_mask', 'transformer.h.6.attn.causal_mask', 'transformer.h.28.attn.causal_mask', 'transformer.h.16.attn.causal_mask', 'transformer.h.20.attn.causal_mask', 'transformer.h.7.attn.causal_mask', 'transformer.h.19.attn.causal_mask']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
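(If I understand correctly, the flagged keys — attn.bias, attn.masked_bias, and attn.causal_mask — are causal-mask buffers rather than learned parameters, so these warnings may be benign after a version bump, but I would like to confirm. A quick generation sanity check I could run, with an illustrative prompt and settings:)

import torch

# `model` and `tokenizer` as loaded above.
inputs = tokenizer("def hello_world():", return_tensors="pt")
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=32,
        pad_token_id=tokenizer.eos_token_id,  # also addresses the pad-token warning above
    )
print(tokenizer.decode(out[0]))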
It seems like the model did not load correctly. Could you please help me identify what went wrong in the process?