
ValueError: ChatGLMForConditionalGeneration does not support gradient checkpointing. #25

Open
deepeye opened this issue Apr 19, 2023 · 7 comments

Comments

@deepeye

deepeye commented Apr 19, 2023

The following error appears during training:

(venv) [xinjingjing@dev-gpu-node-09 InstructGLM]$ python train_lora.py \
>     --dataset_path data/belle \
>     --lora_rank 8 \
>     --per_device_train_batch_size 2 \
>     --gradient_accumulation_steps 1 \
>     --max_steps 52000 \
>     --save_steps 1000 \
>     --save_total_limit 2 \
>     --learning_rate 2e-5 \
>     --fp16 \
>     --remove_unused_columns false \
>     --logging_steps 50 \
>     --output_dir output

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA SETUP: CUDA runtime path found: /usr/local/cuda-11.2/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 112
CUDA SETUP: Loading binary /data/chat/InstructGLM/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda112.so...
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
Overriding torch_dtype=None with `torch_dtype=torch.float16` due to requirements of `bitsandbytes` to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument at all to remove this warning.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:12<00:00,  1.51s/it]
Traceback (most recent call last):
  File "/data/chat/InstructGLM/train_lora.py", line 170, in <module>
    main()
  File "/data/chat/InstructGLM/train_lora.py", line 128, in main
    model.gradient_checkpointing_enable()
  File "/data/chat/InstructGLM/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1584, in gradient_checkpointing_enable
    raise ValueError(f"{self.__class__.__name__} does not support gradient checkpointing.")
ValueError: ChatGLMForConditionalGeneration does not support gradient checkpointing.
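For context, the check that raises this lives in transformers' `PreTrainedModel` (paraphrased below from modeling_utils.py; the exact body varies by transformers version). The model class never sets `supports_gradient_checkpointing = True`, so the call fails:

```python
from functools import partial

# Paraphrased from transformers/modeling_utils.py (version-dependent):
def gradient_checkpointing_enable(self):
    if not self.supports_gradient_checkpointing:
        raise ValueError(f"{self.__class__.__name__} does not support gradient checkpointing.")
    # When supported, this flips the checkpointing flag on every submodule.
    self.apply(partial(self._set_gradient_checkpointing, value=True))
```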
@amazingtanm

I'm running into the same thing. Can anyone explain what to do?

@CapSOSkw

CapSOSkw commented May 1, 2023

+1, same problem here.

@cat1222

cat1222 commented May 10, 2023

+1

@moliqingwa

Replace it with the latest chatglm-6b code.
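For reference, a minimal sketch of what the updated modeling_chatglm.py adds so the transformers check passes. The class and module names (`ChatGLMPreTrainedModel`, `ChatGLMModel`) follow the ChatGLM repo, but the exact upstream revision may differ, so treat this as an assumption rather than the verbatim patch:

```python
from transformers import PreTrainedModel

class ChatGLMPreTrainedModel(PreTrainedModel):
    # Flag read by PreTrainedModel.gradient_checkpointing_enable().
    supports_gradient_checkpointing = True

    def _set_gradient_checkpointing(self, module, value=False):
        # Toggle the checkpointing flag on the core transformer module.
        # ChatGLMModel is assumed to be the module defined elsewhere in
        # modeling_chatglm.py that consults self.gradient_checkpointing.
        if isinstance(module, ChatGLMModel):
            module.gradient_checkpointing = value
```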

@riichg5

riichg5 commented May 30, 2023

> Replace it with the latest chatglm-6b code.

Can you be more specific? I can't find the relevant code in the chatglm-6b repo.

@Nicole19960319

Has anyone solved this? Looking for an answer.

@riichg5

riichg5 commented Jun 2, 2023

> Has anyone solved this? Looking for an answer.

Just comment out the line that raises the error.
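Alternatively, guard the call instead of deleting it, so checkpointing is still used once the model supports it. `supports_gradient_checkpointing` is the attribute transformers checks (it defaults to False):

```python
# train_lora.py, replacing the unconditional call:
if getattr(model, "supports_gradient_checkpointing", False):
    model.gradient_checkpointing_enable()
else:
    # Training still works without checkpointing; it just uses more memory.
    print("Model does not support gradient checkpointing; skipping.")
```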
