This project fine-tunes the GPT-2 model to classify emotions in tweets. The process includes loading a pre-trained foundation model, performing parameter-efficient fine-tuning (PEFT) with LoRA, evaluating performance, and comparing the foundation model with the fine-tuned model.
- Model:
  - GPT-2 is used because it supports the sequence classification task and is compatible with LoRA.
- PEFT Technique:
  - LoRA (Low-Rank Adaptation) is used because it allows efficient fine-tuning while leaving the original model weights frozen.
- Evaluation:
  - Hugging Face's `Trainer.evaluate` method is used to compare the performance of both the foundation and the fine-tuned models.
- Dataset:
  - The dataset consists of labeled tweets on emotion, provided through the Hugging Face `datasets` library (see the loading sketch after this list).
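A minimal sketch of the setup above, assuming the tweets come from the `dair-ai/emotion` dataset on the Hugging Face Hub; the dataset id, label count, and tokenization settings are illustrative assumptions rather than the notebook's exact values:

```python
# Sketch: load an emotion-labeled tweet dataset and GPT-2 with a
# sequence-classification head. "dair-ai/emotion" is an assumed dataset id.
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

dataset = load_dataset("dair-ai/emotion")           # splits: train / validation / test
num_labels = dataset["train"].features["label"].num_classes

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token           # GPT-2 has no pad token by default

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

tokenized = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained("gpt2", num_labels=num_labels)
model.config.pad_token_id = tokenizer.pad_token_id  # keep padding consistent with the tokenizer
```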
- Choose the dataset: DONE
- Choose the foundation model: DONE
- Perform inference for the text classification task with the foundation model: DONE
- Evaluate the performance of the foundation model: DONE
- Load the foundation model as a PEFT model: DONE
- Define the PEFT/LORA configuration: DONE
- Train the LoRA model with Hugging Face Trainer: DONE
- Evaluate the PEFT model: DONE
- Save the PEFT model: DONE
- Load the saved PEFT model from local storage: DONE
- Run inference and generate text/label with the tuned model: DONE
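A minimal sketch of the PEFT steps in the checklist above, assuming the `peft` and `transformers` APIs and reusing `dataset`, `tokenized`, `tokenizer`, and `model` from the previous sketch; the LoRA hyperparameters, training arguments, and output paths are illustrative assumptions, not the notebook's exact values:

```python
# Sketch: wrap GPT-2 with LoRA adapters, train with the Hugging Face Trainer,
# evaluate, save the adapters, reload them from local storage, and classify a tweet.
import numpy as np
import torch
from peft import AutoPeftModelForSequenceClassification, LoraConfig, TaskType, get_peft_model
from transformers import Trainer, TrainingArguments

def compute_metrics(eval_pred):
    # Accuracy over the evaluation split, reported by Trainer as "eval_accuracy".
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

# Define the LoRA configuration for sequence classification and wrap the base model.
lora_config = LoraConfig(task_type=TaskType.SEQ_CLS, r=8, lora_alpha=32, lora_dropout=0.1)
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()             # only the adapter weights are trainable

trainer = Trainer(
    model=peft_model,
    args=TrainingArguments(output_dir="gpt2-lora-emotion",
                           num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())                           # e.g. {'eval_accuracy': ..., ...}

# Save only the LoRA adapter weights, then reload them from local storage.
peft_model.save_pretrained("gpt2-lora-emotion")
tuned = AutoPeftModelForSequenceClassification.from_pretrained(
    "gpt2-lora-emotion", num_labels=num_labels)
tuned.config.pad_token_id = tokenizer.pad_token_id

# Run inference on a single tweet and map the predicted id back to a label name.
inputs = tokenizer("i feel absolutely wonderful today", return_tensors="pt")
with torch.no_grad():
    pred = tuned(**inputs).logits.argmax(dim=-1).item()
print(dataset["train"].features["label"].int2str(pred))
```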
- Full results can be found in the notebook run.
- The evaluation accuracy of the foundation model on this task is `'eval_accuracy': 0.096`.
- The evaluation accuracy of the tuned model is `'eval_accuracy': 0.9225`.
- This is almost a 10x increase in accuracy (0.9225 / 0.096 ≈ 9.6x).
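For comparison, the foundation-model baseline reported above can be measured with the same `Trainer.evaluate` call before any fine-tuning. This sketch reuses `model`, `tokenized`, `tokenizer`, and `compute_metrics` from the earlier sketches; the batch size and output path are illustrative assumptions:

```python
# Sketch: evaluate the untuned GPT-2 sequence-classification model to obtain
# the baseline eval_accuracy that the LoRA-tuned model is compared against.
from transformers import Trainer, TrainingArguments

baseline_trainer = Trainer(
    model=model,                                    # foundation GPT-2, no LoRA adapters
    args=TrainingArguments(output_dir="gpt2-baseline", per_device_eval_batch_size=16),
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
print(baseline_trainer.evaluate())                  # e.g. {'eval_accuracy': ...}
```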