
ValueError: ChatGLMForConditionalGeneration does not support gradient checkpointing. #25

Open
deepeye opened this issue Apr 19, 2023 · 7 comments

Comments

@deepeye

deepeye commented Apr 19, 2023

The following error appears during training:

(venv) [xinjingjing@dev-gpu-node-09 InstructGLM]$ python train_lora.py \
>     --dataset_path data/belle \
>     --lora_rank 8 \
>     --per_device_train_batch_size 2 \
>     --gradient_accumulation_steps 1 \
>     --max_steps 52000 \
>     --save_steps 1000 \
>     --save_total_limit 2 \
>     --learning_rate 2e-5 \
>     --fp16 \
>     --remove_unused_columns false \
>     --logging_steps 50 \
>     --output_dir output

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA SETUP: CUDA runtime path found: /usr/local/cuda-11.2/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 112
CUDA SETUP: Loading binary /data/chat/InstructGLM/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda112.so...
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
Overriding torch_dtype=None with `torch_dtype=torch.float16` due to requirements of `bitsandbytes` to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument at all to remove this warning.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:12<00:00,  1.51s/it]
Traceback (most recent call last):
  File "/data/chat/InstructGLM/train_lora.py", line 170, in <module>
    main()
  File "/data/chat/InstructGLM/train_lora.py", line 128, in main
    model.gradient_checkpointing_enable()
  File "/data/chat/InstructGLM/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1584, in gradient_checkpointing_enable
    raise ValueError(f"{self.__class__.__name__} does not support gradient checkpointing.")
ValueError: ChatGLMForConditionalGeneration does not support gradient checkpointing.
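For context, the check that raises this lives in transformers' `PreTrainedModel` (paraphrased below from modeling_utils.py; the exact body varies by transformers version). The model class never sets `supports_gradient_checkpointing = True`, so the call fails:

```python
from functools import partial

# Paraphrased from transformers/modeling_utils.py (version-dependent):
def gradient_checkpointing_enable(self):
    if not self.supports_gradient_checkpointing:
        raise ValueError(f"{self.__class__.__name__} does not support gradient checkpointing.")
    # When supported, this flips the checkpointing flag on every submodule.
    self.apply(partial(self._set_gradient_checkpointing, value=True))
```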
@amazingtanm

I'm running into the same thing. Can anyone explain what to do?

@CapSOSkw

CapSOSkw commented May 1, 2023

+1, same problem here.

@cat1222

cat1222 commented May 10, 2023

+1

@moliqingwa

Replace it with the latest chatglm-6b code.
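For reference, a minimal sketch of what the updated modeling_chatglm.py adds so the transformers check passes. The class and module names (`ChatGLMPreTrainedModel`, `ChatGLMModel`) follow the ChatGLM repo, but the exact upstream revision may differ, so treat this as an assumption rather than the verbatim patch:

```python
from transformers import PreTrainedModel

class ChatGLMPreTrainedModel(PreTrainedModel):
    # Flag read by PreTrainedModel.gradient_checkpointing_enable().
    supports_gradient_checkpointing = True

    def _set_gradient_checkpointing(self, module, value=False):
        # Toggle the checkpointing flag on the core transformer module.
        # ChatGLMModel is assumed to be the module defined elsewhere in
        # modeling_chatglm.py that consults self.gradient_checkpointing.
        if isinstance(module, ChatGLMModel):
            module.gradient_checkpointing = value
```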

@riichg5

riichg5 commented May 30, 2023

> Replace it with the latest chatglm-6b code.

Can you be more specific? I can't find the relevant code in the chatglm-6b repo.

@Nicole19960319

Has anyone solved this? Looking for an answer.

@riichg5

riichg5 commented Jun 2, 2023

> Has anyone solved this? Looking for an answer.

Just comment out the line that raises the error.
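Alternatively, guard the call instead of deleting it, so checkpointing is still used once the model supports it. `supports_gradient_checkpointing` is the attribute transformers checks (it defaults to False):

```python
# train_lora.py, replacing the unconditional call:
if getattr(model, "supports_gradient_checkpointing", False):
    model.gradient_checkpointing_enable()
else:
    # Training still works without checkpointing; it just uses more memory.
    print("Model does not support gradient checkpointing; skipping.")
```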
