
[Usage] ValueError: The generation config instance is invalid #1144

Closed
shipengai opened this issue Feb 18, 2024 · 8 comments

Comments

@shipengai

Describe the issue

Issue: Saving the model fails during the finetune stage.
transformers==4.37.2
deepspeed==0.12.6

Command:

deepspeed llava/train/train_mem.py \
    --deepspeed ./scripts/zero3.json \
    --model_name_or_path pretained_weights/vicuna-7b-v1.5 \
    --version v1 \
    --data_path llava_v1_5_mix665k.json \
    --image_folder llava-stage2 \
    --vision_tower pretained_weights/clip-vit-large-patch14-336 \
    --pretrain_mm_mlp_adapter checkpoints/llava-v1.5-7b-pretrain/mm_projector.bin \
    --mm_projector_type mlp2x_gelu \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --image_aspect_ratio pad \
    --group_by_modality_length True \
    --bf16 True \
    --output_dir checkpoints/llava-v1.5-7b \
    --num_train_epochs 1 \
    --per_device_train_batch_size 16 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 50000 \
    --save_total_limit 1 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --dataloader_num_workers 4 \
    --lazy_preprocess True \
    --report_to tensorboard

Log:

Traceback (most recent call last):
  File "/workdir/conda_envs/llava16/lib/python3.10/site-packages/transformers/trainer.py", line 2873, in save_model
    self._save(output_dir, state_dict=state_dict)
  File "/workdir/llava/train/llava_trainer.py", line 255, in _save
    super(LLaVATrainer, self)._save(output_dir, state_dict)
  File "/workdir/conda_envs/llava16/lib/python3.10/site-packages/transformers/trainer.py", line 2958, in _save
    self.model.save_pretrained(
  File "/workdir/conda_envs/llava16/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2364, in save_pretrained
    model_to_save.generation_config.save_pretrained(save_directory)
  File "/workdir/conda_envs/llava16/lib/python3.10/site-packages/transformers/generation/configuration_utils.py", line 560, in save_pretrained
    raise ValueError(
ValueError: The generation config instance is invalid -- `.validate()` throws warnings and/or exceptions. Fix these issues to save the configuration.

Thrown during validation:
[UserWarning('`do_sample` is set to `False`. However, `temperature` is set to `0.9` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`.'), UserWarning('`do_sample` is set to `False`. However, `top_p` is set to `0.6` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`.')]
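
The failure is easy to reproduce in isolation: GenerationConfig.save_pretrained() calls .validate() and turns any warnings into this ValueError. A minimal sketch against transformers 4.37, using the flag values reported in the warnings above:

import warnings

from transformers import GenerationConfig

# vicuna's shipped flags: greedy decoding plus the sampling knobs
cfg = GenerationConfig(do_sample=False, temperature=0.9, top_p=0.6)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cfg.validate()  # save_pretrained() runs this and fails on any warning
for w in caught:
    print(w.message)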


@jsg921019

This error appears to be caused by upgrading the transformers version. I fixed it by manually adding
do_sample: true
to vicuna's generation_config.json file.
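
For reference, a minimal sketch of the edited generation_config.json (the temperature and top_p values are the ones reported in the validation warnings above; leave any other fields in your copy unchanged):

{
  "do_sample": true,
  "temperature": 0.9,
  "top_p": 0.6
}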

@wentaoyuan

Hi, I met the same issue. Where did you guys find vicuna's generation_config.json? What I did was reset the model's generation_config attributes after loading the model with its from_pretrained method at https://github.com/haotian-liu/LLaVA/blob/main/llava/train/train.py#L842, but doing this gives me the following warning:

Your generation config was originally created from the model config, but the model config has changed since then. Unless you pass the `generation_config` argument to this model's `generate` calls, they will revert to the legacy behavior where the base `generate` parameterization is loaded from the model config instead. To avoid this behavior and this warning, we recommend you to overwrite the generation config model attribute before calling the model's `save_pretrained`, preferably also removing any generation kwargs from the model config. This warning will be raised to an exception in v4.41.

@ppx-hub

ppx-hub commented Mar 5, 2024

Where did you guys find vicuna's generation_config.json?

On my system I found it as a symlink named generation_config.json in the Hugging Face cache, under the /root/.cache/huggingface/hub/models--lmsys--vicuna-13b-v1.5/snapshots/3deb0106f72a3a433f0c6ea0cb978bdf14bcd3a6/ directory. I added do_sample: true there manually; I'm not sure yet whether this works and am experimenting.
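
If you'd rather not hunt through the cache by hand, the cached copy can be located programmatically. A small sketch using huggingface_hub (already a dependency of transformers); it resolves to the same snapshot symlink:

from huggingface_hub import hf_hub_download

# Prints the local cache path of vicuna's generation_config.json
path = hf_hub_download("lmsys/vicuna-13b-v1.5", "generation_config.json")
print(path)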

@ppx-hub

ppx-hub commented Mar 6, 2024

Update: it works, and the model is saved successfully.

@boolmriver

This error appears to be caused by upgrading the transformers version. I fixed it by manually adding do_sample: true to vicuna's generation_config.json file.

Thank you, bro!

@XindiWu

XindiWu commented Apr 16, 2024

I fixed this problem by manually adding
do_sample: true
to vicuna's generation_config.json file.

Thanks! I changed do_sample in configuration_utils.py and that also works. On my local machine it's located at anaconda3/envs/llava/lib/python3.10/site-packages/transformers/generation/configuration_utils.py; you can find the corresponding file based on your env. In GenerationConfig.__init__ (class GenerationConfig(PushToHubMixin)), I changed self.do_sample = kwargs.pop("do_sample", False) to self.do_sample = kwargs.pop("do_sample", True).
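
The one-line change described above, shown in context; note this is a local workaround that patches your installed transformers copy, changes the default for every model in that environment, and is undone by any upgrade:

# transformers/generation/configuration_utils.py, inside GenerationConfig.__init__
self.do_sample = kwargs.pop("do_sample", False)  # before
self.do_sample = kwargs.pop("do_sample", True)   # after: default to sampling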

@dsn01

dsn01 commented Jul 14, 2024

Thanks! I changed do_sample in configuration_utils.py and that also works.

Thank you, bro!

@dacian7

dacian7 commented Aug 14, 2024

class GenerationConfig(PushToHubMixin), __init__,

Thanks!
