
TypeError: argument of type 'NoneType' is not iterable when merging weights to 16bit and pushing to hub #666

Open
premsa opened this issue Jun 19, 2024 · 6 comments
Labels
currently fixing Am fixing now!

Comments

@premsa

premsa commented Jun 19, 2024

Hey guys, after successfully fine-tuning, I get the following error message when trying to merge the weights and push to the hub:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/x/.venv/lib/python3.10/site-packages/unsloth/save.py", line 1211, in unsloth_push_to_hub_merged
    unsloth_save_model(**arguments)
  File "/x/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/x/.venv/lib/python3.10/site-packages/unsloth/save.py", line 686, in unsloth_save_model
    internal_model.save_pretrained(**save_pretrained_settings)
  File "/x/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2634, in save_pretrained
    model_card = create_and_tag_model_card(
  File "/x/projects/mistral-finetune/.venv/lib/python3.10/site-packages/transformers/utils/hub.py", line 1144, in create_and_tag_model_card
    if model_tag not in model_card.data.tags:
TypeError: argument of type 'NoneType' is not iterable

Here is the script I am running:

from unsloth import FastLanguageModel
import torch
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported


from utils import dataset 

max_seq_length = 1048

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/mistral-7b-instruct-v0.3", # Choose ANY! eg teknium/OpenHermes-2.5-Mistral-7B
    max_seq_length = 1048,
    dtype = None,
    load_in_4bit = True,
    )

model = FastLanguageModel.get_peft_model(
    model,
    r = 16, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 239,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
    )


trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False, # Can make training 5x faster for short sequences.
    args = TrainingArguments(
        per_device_train_batch_size = 10,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        #num_train_epochs = 1, 
        max_steps = 1, # Set num_train_epochs = 1 for full training runs
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 239,
        output_dir = "outputs",
    ),
)

trainer_stats = trainer.train()


model.push_to_hub_merged("user/this-is-my-project", tokenizer, save_method = "merged_16bit", token = token)

The above code creates the config files, but fails before the weights are stored.

When saving the adapter without merging, the script does not fail and stores the adapter weights.

model.push_to_hub("user/this-is-my-project", token = token) 
tokenizer.push_to_hub("user/this-is-my-project", token = token) 

My environment:

accelerate==0.31.0
aiohttp==3.9.5
aiosignal==1.3.1
async-timeout==4.0.3
attrs==23.2.0
bitsandbytes==0.43.1
certifi==2024.6.2
charset-normalizer==3.3.2
datasets==2.20.0
dill==0.3.7
docstring_parser==0.16
einops==0.8.0
filelock==3.15.1
flash-attn==2.5.9.post1
frozenlist==1.4.1
fsspec==2024.5.0
huggingface-hub==0.23.4
idna==3.7
Jinja2==3.1.4
markdown-it-py==3.0.0
MarkupSafe==2.1.5
mdurl==0.1.2
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.15
networkx==3.3
ninja==1.11.1.1
numpy==2.0.0
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.5.40
nvidia-nvtx-cu12==12.1.105
packaging==24.1
pandas==2.2.2
peft==0.11.1
protobuf==3.20.3
psutil==5.9.8
pyarrow==16.1.0
pyarrow-hotfix==0.6
Pygments==2.18.0
python-dateutil==2.9.0.post0
pytz==2024.1
PyYAML==6.0.1
regex==2024.5.15
requests==2.32.3
rich==13.7.1
safetensors==0.4.3
sentencepiece==0.2.0
shtab==1.7.1
six==1.16.0
sympy==1.12.1
tokenizers==0.19.1
torch==2.3.0
tqdm==4.66.4
transformers==4.41.2
triton==2.3.0
trl==0.8.6
typing_extensions==4.12.2
tyro==0.8.4
tzdata==2024.1
unsloth @ git+https://github.com/unslothai/unsloth.git@87703089fa0ad60f008b7a7990f5cf3e77ccd26e
urllib3==2.2.2
xformers==0.0.26.post1
xxhash==3.4.1
yarl==1.9.4

Any ideas what could be going wrong?

@danielhanchen
Contributor

Ok that's weird - I tried on Colab and it's fine - did you add extra tags?

@premsa
Author

premsa commented Jun 19, 2024

> Ok that's weird - I tried on Colab and it's fine - did you add extra tags?

No, I did not add anything else besides what is in the above code! I am running the script on an H100 with the Ampere build of unsloth.

@danielhanchen
Contributor

Hmm weird

@whranclast

I've had the same issue. It comes from not having a tag in your Hugging Face repo's model card; you need to create one manually.
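
For anyone hitting this, here is a minimal sketch of that manual workaround using huggingface_hub's ModelCard API (the repo id mirrors the placeholder from the script above, and the token is read from the environment purely for illustration). Pushing a card whose YAML front matter already has a tags: list means create_and_tag_model_card finds a list instead of None.

import os

from huggingface_hub import ModelCard

# Sketch of the manual workaround: push a model card whose YAML front matter
# already defines a `tags:` list, so model_card.data.tags is a list rather than None.
token = os.environ["HF_TOKEN"]  # your Hugging Face write token

card = ModelCard("""---
tags:
- unsloth
---
# this-is-my-project
""")
card.push_to_hub("user/this-is-my-project", token=token)

After that, re-running push_to_hub_merged should get past the tags check.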

@danielhanchen
Contributor

Oh interesting, I'll check the tag issue.

@danielhanchen danielhanchen added the currently fixing Am fixing now! label Jul 2, 2024
@rishiraj

rishiraj commented Sep 5, 2024

@danielhanchen @premsa this should fix the issue huggingface/transformers#33315
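
For context, a small self-contained sketch of the kind of guard that fix is about (an assumption about its shape, not the actual diff in huggingface/transformers#33315): treat a missing tag list as empty instead of iterating over None.

from typing import List, Optional

def merge_tags(existing: Optional[List[str]], new: List[str]) -> List[str]:
    # Treat a missing (None) tag list as empty instead of iterating over it,
    # which is what raises "argument of type 'NoneType' is not iterable".
    merged = list(existing) if existing is not None else []
    for tag in new:
        if tag not in merged:
            merged.append(tag)
    return merged

print(merge_tags(None, ["unsloth", "trl", "sft"]))  # ['unsloth', 'trl', 'sft'] - no TypeError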
