How to load the model and the checkpoint after training the model? #674
Comments
Are you using PEFT for fine-tuning? And can you share the code you are using to load the model?
Following is my code:
from datasets import load_dataset
from trl import SFTTrainer
from transformers import AutoModel, DataCollatorForLanguageModeling, AutoTokenizer, TrainingArguments, AutoModelForCausalLM
from peft import LoraConfig
# Load the model and tokenizer
MODEL_PATH = "/home/qiji/chatglm2-6b"
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, trust_remote_code=True).half().cuda()
# model = AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
# tokenizer.padding_side = 'right'
# Set up the fine-tuning arguments
training_arguments = TrainingArguments(
    output_dir='/home/qiji/Container/jinkundong/SFT/results',
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    save_steps=5000,
    logging_steps=1000,
    learning_rate=2e-4,
    fp16=True,
    max_grad_norm=0.3,
    max_steps=5000,
    warmup_ratio=0.03,
    group_by_length=True,
    lr_scheduler_type='constant',
)
model.config.use_cache = False
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False,
)
dataset = load_dataset("/home/qiji/Container/jinkundong/SFT/SFT_dataset", split="train")
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    dataset_text_field="input",
    max_seq_length=512,
    peft_config=peft_config,
    args=training_arguments,
    data_collator=data_collator,
    packing=False,
)
trainer.train()
model_save_path = "/home/qiji/Container/jinkundong/SFT_2"
trainer.save_model(model_save_path)
As I know,
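For reference, a minimal sketch (assuming the training run above completed) to check what trainer.save_model actually wrote; when a peft_config is passed to SFTTrainer, the saved directory normally contains only the LoRA adapter files (adapter_config.json plus the adapter weights), not a full model with its own config.json:
import os

model_save_path = "/home/qiji/Container/jinkundong/SFT_2"  # save path from the script above
# List the saved files; expect adapter files rather than full model weights
print(sorted(os.listdir(model_save_path)))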
Hi @ccwdb
import torch
from peft import AutoPeftModelForCausalLM

# output_dir is the directory you passed to trainer.save_model
model = AutoPeftModelForCausalLM.from_pretrained(output_dir, torch_dtype=torch.float16)
Make sure to install peft.
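As a follow-up, a minimal sketch (the output path is hypothetical) showing how the loaded adapter can be merged into the base weights with peft's merge_and_unload, so the result reloads later as a plain transformers model without peft:
import torch
from peft import AutoPeftModelForCausalLM

model = AutoPeftModelForCausalLM.from_pretrained(output_dir, torch_dtype=torch.float16)
# Fold the LoRA deltas into the base weights and drop the PEFT wrapper
merged_model = model.merge_and_unload()
merged_model.save_pretrained("/path/to/merged_model")  # hypothetical output path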
I tried your code, but it says I don't have a file named config.json.
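That error is consistent with the directory holding only adapter files: AutoPeftModelForCausalLM reads adapter_config.json and then pulls in the base model named there, while a plain AutoModelForCausalLM load expects a full config.json. A minimal sketch (paths taken from the script earlier in this thread) that loads the base model explicitly and attaches the saved adapter:
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

BASE_MODEL = "/home/qiji/chatglm2-6b"                   # base model from the training script
ADAPTER_DIR = "/home/qiji/Container/jinkundong/SFT_2"   # directory written by trainer.save_model

# Load the base model first, then attach the saved LoRA adapter
base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, trust_remote_code=True, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, ADAPTER_DIR)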
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread.
After checking the source code, I think you should load the model like this:
from transformers.trainer_utils import get_last_checkpoint

last_checkpoint = get_last_checkpoint(script_args.output_dir)
trainer.train(resume_from_checkpoint=last_checkpoint)
Hello, could you show how to save the model? Can I directly use trainer.save_model?
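On saving: a minimal sketch of the usual pattern, assuming the trainer and tokenizer objects from the training script above; trainer.save_model writes the final weights (adapter files when a peft_config was used), and saving the tokenizer alongside makes the directory self-contained:
# After trainer.train(...) finishes, persist the final model for later loading
trainer.save_model(script_args.output_dir)
# Keep the tokenizer next to the weights
tokenizer.save_pretrained(script_args.output_dir)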
I trained my model using the code in sft_trainer.py, and I saved the checkpoint and the model in the same directory.
But I don't know how to load the model from the checkpoint. Or I just want to know whether
trainer.save_model(script_args.output_dir)
means I have saved a trained model, not just a checkpoint. I tried many ways to load the trained model, but I got errors like
So, how do I load the model?