System Info
Environment: Google Colab
Python 3.11.13
torch==2.6.0+cu124
transformers==4.55.2
bitsandbytes==0.47.0
peft==0.17.0
accelerate==1.10.0
numpy==1.26.4
scipy==1.14.1
GPU
NVIDIA L4
Driver Version: 550.54.15
CUDA Version: 12.4
Model Quantized with QLoRA
Dataset:
Train and validation splits (identical schemas):
{'text': Value('string'), 'embeddings': List(Value('float64')), 'tfidf_vector': List(Value('float64')), 'roberta_sent_neg': Value('float64'), 'roberta_sent_pos': Value('float64'), 'names': Value('int64'), 'organizations': Value('int64'), 'dates': Value('int64'), 'count_tokens': Value('int64'), 'label': Value('int64'), 'input_ids': List(Value('int32')), 'token_type_ids': List(Value('int8')), 'attention_mask': List(Value('int8'))}
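Both splits carry feature-engineering columns (embeddings, tfidf_vector, sentiment scores, entity counts) alongside the tokenizer outputs. DataCollatorWithPadding pads only tokenizer fields, and list-valued extras can break tensor conversion at collation time. Below is a minimal sketch of trimming to the model-facing columns, assuming the splits are datasets.Dataset objects named as in the Trainer call further down (the trimming itself is not part of the original report):

# Keep only the columns the model and collator expect; drop engineered features.
keep = ["input_ids", "token_type_ids", "attention_mask", "label"]
train_dataset = train_dataset.remove_columns(
    [c for c in train_dataset.column_names if c not in keep]
)
dev_train_dataset = dev_train_dataset.remove_columns(
    [c for c in dev_train_dataset.column_names if c not in keep]
)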
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Code Setup
import torch
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    BitsAndBytesConfig,
    DataCollatorWithPadding,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

model = "google-bert/bert-base-uncased"
tokenizer_bert = AutoTokenizer.from_pretrained(model)
# bert-base-uncased ships a [PAD] token, so this fallback is a no-op here.
if tokenizer_bert.pad_token is None:
    tokenizer_bert.pad_token = tokenizer_bert.eos_token
tokenizer_bert.padding_side = "right"
compute_dtype = torch.float16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=compute_dtype,
)
original_model_bert = AutoModelForSequenceClassification.from_pretrained(
    model,
    num_labels=2,
    quantization_config=bnb_config,
)
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type=TaskType.SEQ_CLS,
)
kbit_model_bert = prepare_model_for_kbit_training(original_model_bert)
kbit_model_bert.gradient_checkpointing_enable()
peft_model_bert = get_peft_model(kbit_model_bert, lora_config)
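# Quick sanity check (not in the original report): PEFT can report the
# trainable-parameter fraction, which should be small here (the LoRA
# adapters plus the classification head).
peft_model_bert.print_trainable_parameters()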
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    accuracy = accuracy_score(labels, predictions)
    precision = precision_score(labels, predictions)
    recall = recall_score(labels, predictions)
    f1 = f1_score(labels, predictions, average="binary")
    print(f1)
    print("\n")
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
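# For illustration only: the metric function can be exercised standalone on
# made-up logits/labels (these values are not from the report).
dummy_logits = np.array([[0.1, 0.9], [2.0, -1.0]])
dummy_labels = np.array([1, 0])
print(compute_metrics((dummy_logits, dummy_labels)))
# -> {'accuracy': 1.0, 'precision': 1.0, 'recall': 1.0, 'f1': 1.0}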
output_dir = '/content/drive/'  # path truncated in the original report
args = TrainingArguments(
    output_dir=output_dir,
    weight_decay=0.22511642804764023,
    warmup_ratio=0.12890328790683203,
    adam_beta1=0.9348819720458172,
    adam_beta2=0.9285998615546803,
    adam_epsilon=1.9972958061508847e-07,
    max_grad_norm=4.222172817940239,
    gradient_accumulation_steps=2,
    max_steps=712,  # overrides num_train_epochs when > 0
    do_train=True,
    do_eval=True,
    lr_scheduler_type='polynomial',
    warmup_steps=488,  # takes precedence over warmup_ratio when both are set
    metric_for_best_model="eval_f1",
    optim='paged_adamw_32bit',
    learning_rate=2.1106713456200193e-05,
    num_train_epochs=40,
    logging_dir="./logs/",
    logging_strategy="epoch",
    eval_strategy="epoch",
    save_strategy="epoch",
    label_names=["label"],  # note: DataCollatorWithPadding renames "label" to "labels"
    load_best_model_at_end=True,
    save_total_limit=3,
)
trainer = Trainer(
    model=peft_model_bert,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=dev_train_dataset,
    compute_metrics=compute_metrics,
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer_bert, padding=True),
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
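# If training completes, the LoRA adapter can be saved on its own (a sketch;
# the subdirectory name is illustrative, not from the report).
peft_model_bert.save_pretrained(output_dir + "lora_adapter")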
Expected behavior
Error Message:
Expected Behavior:
The model should train without errors.