Add PEFT training and explicit kwarg passthrough #3480
Conversation
Hello @janpf, could you provide a small test script showing how to train a model (for instance for NER) using PEFT? That would make it easier to test.
Will do, hopefully this week :)
Ok, I got a minimal example. I adapted this tutorial: https://flairnlp.github.io/docs/tutorial-training/how-to-train-text-classifier

The requirements are:

git+https://github.com/flairNLP/flair.git@refs/pull/3480/merge
bitsandbytes
peft
scipy==1.10.1

The code then looks like this:

from flair.data import Corpus
from flair.datasets import TREC_6
from flair.embeddings import TransformerDocumentEmbeddings
from flair.models import TextClassifier
from flair.trainers import ModelTrainer
corpus: Corpus = TREC_6()
label_type = "question_class"
label_dict = corpus.make_label_dictionary(label_type=label_type)
# this is new
from peft import LoraConfig, TaskType
import torch
import bitsandbytes as bnb
# set the quantization config (bitsandbytes)
bnb_config = {
    "device_map": "auto",
    "load_in_8bit": True,
}

# set lora config (peft)
peft_config = LoraConfig(
    task_type=TaskType.FEATURE_EXTRACTION,
    inference_mode=False,
)
document_embeddings = TransformerDocumentEmbeddings(
    "uklfr/gottbert-base",
    fine_tune=True,
    # pass both configs using the newly introduced kwargs
    transformers_model_kwargs=bnb_config,
    peft_config=peft_config,
)

classifier = TextClassifier(
    document_embeddings, label_dictionary=label_dict, label_type=label_type
)
trainer = ModelTrainer(classifier, corpus)
trainer.fine_tune(
    "resources/taggers/question-classification-with-transformer",
    learning_rate=5.0e-5,
    mini_batch_size=4,
    # I believe explicitly swapping out the optimizer is recommended
    optimizer=bnb.optim.adamw.AdamW,
    max_epochs=1,
)

The resulting model is quite bad, but all QLoRA hyperparameters have been kept at their original values.
Hi @janpf, this looks good. I tested with a standard BERT model (for which quantization does not seem to be available), and I'm getting results competitive with full fine-tuning when setting a slightly higher learning rate for LoRA:

from peft import LoraConfig, TaskType
document_embeddings = TransformerDocumentEmbeddings(
    "bert-base-uncased",
    fine_tune=True,
    # set LoRA config
    peft_config=LoraConfig(
        task_type=TaskType.FEATURE_EXTRACTION,
        inference_mode=False,
    ),
)
classifier = TextClassifier(document_embeddings, label_dictionary=label_dict, label_type=label_type)
trainer = ModelTrainer(classifier, corpus)
trainer.fine_tune(
    "resources/taggers/question-classification-with-transformer",
    learning_rate=5.0e-4,
    mini_batch_size=4,
    max_epochs=1,
)

Unfortunately, I don't know what is causing the storage error. This is now affecting all PRs.
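As a quick sanity check after training, the saved model can be loaded and used for prediction. A minimal sketch, assuming the fine-tuned classifier is loadable like any other Flair model and that the trainer writes final-model.pt into the output directory as usual:

# Sketch only: load the fine-tuned classifier and run a prediction.
from flair.data import Sentence
from flair.models import TextClassifier

classifier = TextClassifier.load(
    "resources/taggers/question-classification-with-transformer/final-model.pt"
)
sentence = Sentence("Who wrote the novel Moby-Dick?")
classifier.predict(sentence)
print(sentence.labels)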
Thanks again for adding this @janpf! Since the tests are now passing, we can merge!
This PR adds the ability to train models using PEFT (LoRA and QLoRA), along with cleaner handling of explicit model and config kwargs. For example, passing kwargs through to the model but not to the config was not possible before.
If PEFT is not installed and not used, no error is thrown.
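Since the review above asked about token-level tasks, here is a hedged sketch of how the same mechanism might be used for NER. It assumes that TransformerWordEmbeddings accepts the same peft_config argument as TransformerDocumentEmbeddings does in the examples above; the dataset, hyperparameters, and output path are illustrative:

# Illustrative sketch only: LoRA fine-tuning for NER, assuming
# TransformerWordEmbeddings exposes the same peft_config kwarg.
from flair.datasets import WNUT_17
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer
from peft import LoraConfig, TaskType

corpus = WNUT_17()
label_type = "ner"
label_dict = corpus.make_label_dictionary(label_type=label_type)

embeddings = TransformerWordEmbeddings(
    "bert-base-uncased",
    fine_tune=True,
    peft_config=LoraConfig(
        task_type=TaskType.FEATURE_EXTRACTION,
        inference_mode=False,
    ),
)

tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type=label_type,
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "resources/taggers/ner-with-lora",
    learning_rate=5.0e-4,
    mini_batch_size=4,
    max_epochs=1,
)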