
ValueError: 150000 is not in list #5

Open
superhg opened this issue Mar 25, 2023 · 5 comments

Comments

@superhg

superhg commented Mar 25, 2023

I hit this error when training on the BELLE data. Looking into it, the training text contains the number 150000, and it appears many times.

File "/tal-vePFS/LLM/hegang/workspace/ChatGLM-chinese-insturct/modeling_chatglm.py", line 836, in forward
    mask_position = seq.index(mask_token)
ValueError: 150000 is not in list
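For readers following along: the failing line calls `list.index`, which raises `ValueError` whenever the value is absent from the list. The sketch below reproduces that behaviour outside the model. It assumes that 150000 is the mask token id (taken from the error message) and uses made-up token ids; `find_mask_position` is a hypothetical helper, not a function from `modeling_chatglm.py`.

```python
from typing import List

MASK_TOKEN_ID = 150000  # assumption: the id named in the ValueError above


def find_mask_position(seq: List[int], mask_token: int = MASK_TOKEN_ID) -> int:
    """Return the index of mask_token in seq, with a clearer error if it is absent."""
    try:
        return seq.index(mask_token)
    except ValueError:
        raise ValueError(
            f"mask token id {mask_token} not found in input_ids; "
            "the tokenizer that encoded this sample may not match the model code"
        ) from None


# Hypothetical token ids, for illustration only.
good_sample = [5, 74874, 150000, 150004]
bad_sample = [5, 74874, 130001, 130004]

print(find_mask_position(good_sample))  # 2
```

Calling `find_mask_position(bad_sample)` raises the same kind of `ValueError` as in the traceback, which may be a quick way to check whether encoded training samples actually contain the id the model code looks for.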

@superhg
Author

superhg commented Mar 25, 2023

I hit this error when training on the BELLE data:

File "/tal-vePFS/LLM/hegang/workspace/ChatGLM-chinese-insturct/modeling_chatglm.py", line 836, in forward
    mask_position = seq.index(mask_token)
ValueError: 150000 is not in list

@yanqiangmiffy
Owner

Adding one more problem: model.config.eos_token_id=20002, so the tokenizer's end-of-sequence id and the end-of-sequence id in model.config do not match.
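A mismatch like this can be checked by comparing the special-token ids on the two objects. The following is a minimal sketch: `mismatched_special_ids` is a hypothetical helper, and the `SimpleNamespace` objects stand in for a loaded tokenizer and `model.config` (in Hugging Face transformers both normally expose `eos_token_id`). The value 20002 comes from the comment above; 150005 is a made-up tokenizer id used only for illustration.

```python
from types import SimpleNamespace


def mismatched_special_ids(tokenizer, config,
                           names=("eos_token_id", "bos_token_id", "pad_token_id")):
    """Return {name: (tokenizer_value, config_value)} for every id that differs."""
    mismatches = {}
    for name in names:
        tok_value = getattr(tokenizer, name, None)
        cfg_value = getattr(config, name, None)
        if tok_value != cfg_value:
            mismatches[name] = (tok_value, cfg_value)
    return mismatches


# Stand-ins for a real tokenizer and model.config (values partly hypothetical).
fake_tokenizer = SimpleNamespace(eos_token_id=150005)
fake_config = SimpleNamespace(eos_token_id=20002)

print(mismatched_special_ids(fake_tokenizer, fake_config, names=("eos_token_id",)))
# {'eos_token_id': (150005, 20002)}
```

With real objects, the same function could be called as `mismatched_special_ids(tokenizer, model.config)`; an empty dict would mean the ids agree.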

@yanqiangmiffy
Owner

Are you using the official tokenizer? I am using this one: https://huggingface.co/THUDM/chatglm-6b

@ty33123

ty33123 commented Apr 12, 2023

Has this been resolved? I am getting this error too.

@ty33123

ty33123 commented Apr 13, 2023

Has this been resolved? I am getting this error too.

Downloading the latest version of the weights from Hugging Face fixes it.


3 participants