ValueError: Non-consecutive added token '<extra_id_99>' found. Should have index 32100 but has index 32000 in saved vocabulary. #213

Closed
GallenShao opened this issue Aug 30, 2024 · 5 comments

@GallenShao

System Info

diffusers==0.30.1

Information

  • The official example scripts
  • My own modified scripts and tasks

Reproduction

pipe = diffusers.DiffusionPipeline.from_pretrained('THUDM/CogVideoX-2b')
text_encoder/config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 783/783 [00:00<00:00, 105kB/s]
vae/config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 799/799 [00:00<00:00, 114kB/s]
Fetching 14 files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 14/14 [00:07<00:00, 1.85it/s]
Loading pipeline components...: 20%|██████████████████████▊ | 1/5 [00:09<00:38, 9.70s/it]
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/local/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/diffusers/pipelines/pipeline_utils.py", line 876, in from_pretrained
    loaded_sub_model = load_sub_model(
  File "/usr/local/lib/python3.8/site-packages/diffusers/pipelines/pipeline_loading_utils.py", line 700, in load_sub_model
    loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
  File "/usr/local/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1805, in from_pretrained
    return cls._from_pretrained(
  File "/usr/local/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 2015, in _from_pretrained
    raise ValueError(
ValueError: Non-consecutive added token '<extra_id_99>' found. Should have index 32100 but has index 32000 in saved vocabulary.

Expected behavior

Downloading the 2b model directly with the latest diffusers raises this error. What is the cause?

@zRzRzRzRzRzRzR
Member

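How about downloading the model to a local directory first and then loading it from there? I have not run into this error; that is how I run it in the Hugging Face Space, and it works without problems.
Also, Python 3.8 may be too old. We use Python 3.10.14 or later for inference.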
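
A minimal sketch of the "download locally, then load from the local path" suggestion. The helper name `download_then_load` and the target directory `./CogVideoX-2b` are made up for illustration; `snapshot_download` comes from `huggingface_hub` and `DiffusionPipeline.from_pretrained` from `diffusers`. This only changes where the files are read from, so it may not help if the error comes from the installed `transformers` version.

```python
def download_then_load(repo_id: str = "THUDM/CogVideoX-2b",
                       local_dir: str = "./CogVideoX-2b"):
    """Download a full repo snapshot to local_dir, then load the pipeline from it.

    The function name and default local_dir are illustrative assumptions.
    """
    # Imported inside the function so that defining it does not require
    # the (large) dependencies to be importable yet.
    from huggingface_hub import snapshot_download
    from diffusers import DiffusionPipeline

    # snapshot_download returns the path of the folder holding the files.
    local_path = snapshot_download(repo_id=repo_id, local_dir=local_dir)

    # Load from the local folder instead of the Hub repo id.
    return DiffusionPipeline.from_pretrained(local_path)


# Usage (downloads several GB, so it is left commented out):
# pipe = download_then_load()
```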

@zRzRzRzRzRzRzR zRzRzRzRzRzRzR self-assigned this Aug 30, 2024
@Aliang-CN

I am getting the same error. zRzRzRzRzRzRzR, could you check whether the tokenizer_file file is present?

@Aliang-CN

It worked once I deleted the added_tokens.json file.
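
A hedged sketch of that workaround, assuming the model has already been downloaded to a local folder. It renames the file to a backup instead of deleting it, so the change can be undone. The helper name `disable_added_tokens` and the `.bak` suffix are illustrative choices, not part of any library.

```python
from pathlib import Path
from typing import List


def disable_added_tokens(model_dir: str) -> List[Path]:
    """Rename every added_tokens.json under model_dir to added_tokens.json.bak.

    Returns the list of backup paths that were created.
    """
    renamed = []
    # sorted() materializes the matches before any file is renamed.
    for path in sorted(Path(model_dir).rglob("added_tokens.json")):
        backup = path.with_name(path.name + ".bak")
        path.rename(backup)
        renamed.append(backup)
    return renamed


# Usage, with a hypothetical local model folder:
# print(disable_added_tokens("./CogVideoX-2b"))
```

To restore the original state, rename the `.bak` file back to `added_tokens.json`.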

@Aliang-CN

I don't know what the <extra_id_01> ~ <extra_id_99> entries are for.
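
For context, tokens named `<extra_id_N>` are the sentinel tokens of T5-style tokenizers, used as placeholders for masked spans during T5 pretraining. Assuming the tokenizer here follows the standard T5 layout (a 32000-piece SentencePiece vocabulary plus 100 sentinels, which the thread does not confirm), the sentinels are numbered downward from the top of the vocabulary. The sketch below reproduces that arithmetic; `sentinel_id` is a made-up helper, not a library function.

```python
def sentinel_id(n: int, sp_vocab_size: int = 32000, extra_ids: int = 100) -> int:
    """Id a T5-style tokenizer assigns to <extra_id_n>.

    Assumes the standard T5 layout: sentinels occupy the top `extra_ids`
    slots of the vocabulary and count downward.
    """
    if not 0 <= n < extra_ids:
        raise ValueError(f"n must be in [0, {extra_ids}), got {n}")
    return sp_vocab_size + extra_ids - 1 - n


print(sentinel_id(0))   # 32099
print(sentinel_id(99))  # 32000
```

Under that assumption, index 32000 for `<extra_id_99>` is the expected T5 value, while the loader in the traceback wanted the next consecutive index, 32100.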

@zsh2000

zsh2000 commented Oct 8, 2024

You need to upgrade transformers to 4.44.2 or later.
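
A small sketch for checking the installed version against that minimum before loading the pipeline. The helpers `parse_release` and `meets_minimum` are illustrative and use only the standard library; pre-release suffixes such as `.dev0` are ignored in the comparison.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(v: str) -> tuple:
    """Turn '4.44.2' or '4.45.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "4.44.2") -> bool:
    """True if the installed release is at least the minimum release."""
    return parse_release(installed) >= parse_release(minimum)


try:
    installed = version("transformers")
    if meets_minimum(installed):
        print(f"transformers {installed} is new enough")
    else:
        print(f"transformers {installed} is too old; try: pip install -U transformers")
except PackageNotFoundError:
    print("transformers is not installed")
```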
