(From a Hackathon hosted by Deka)
This repository contains a Jupyter Notebook that demonstrates the fine-tuning and evaluation of a BERT-based model for sequence classification tasks using PyTorch and Hugging Face's Transformers library. The notebook walks through the process of training a BERT model on a custom dataset, monitoring the training process, and evaluating model performance using accuracy metrics.
This notebook is from my participation in a workshop event at Deka and is intended for educational purposes only.
## Table of Contents

- Setup and Configuration
- Data Preparation
- Model Training
- Model Evaluation
- Results and Analysis
## Setup and Configuration

Ensure you have the following Python packages installed:

- `numpy`
- `pandas`
- `torch`
- `transformers`
- `keras`
- `matplotlib`
- `scikit-learn`
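They can be installed with pip, for example:

```bash
pip install numpy pandas torch transformers keras matplotlib scikit-learn
```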
- **Reproducibility**: The notebook sets seeds for `random`, `numpy`, and `torch` to ensure reproducibility across different runs. The seed value used is `42` (see the sketch after this list).
- **Data Preparation**: The notebook assumes you have a dataset ready for training and validation. The data should be preprocessed and loaded into PyTorch DataLoaders.
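A minimal sketch of the seeding step; the notebook's exact cell may differ:

```python
import random

import numpy as np
import torch

seed_val = 42  # the seed value used in the notebook

random.seed(seed_val)
np.random.seed(seed_val)
torch.manual_seed(seed_val)
torch.cuda.manual_seed_all(seed_val)  # also seed every GPU, if any are available
```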
To run the notebook:

1. **Load the Notebook**: Open the Jupyter Notebook file (`.ipynb`) in your Jupyter environment.
2. **Adjust Hyperparameters**: Modify parameters such as the number of epochs, learning rate, and batch size according to your needs (see the sketch below).
3. **Execute Cells**: Run all the cells in the notebook to execute the training and evaluation pipeline.
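For illustration, such parameters might be collected in a single cell; the values below are common defaults for BERT fine-tuning, not necessarily the ones used in the notebook (later sketches reuse these names):

```python
# Illustrative hyperparameters; adjust to your dataset and hardware.
epochs = 4            # number of passes over the training set
learning_rate = 2e-5  # AdamW learning rate
batch_size = 32       # samples per DataLoader batch
```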
## Data Preparation

- **Tokenization**: Convert text data into input IDs and attention masks suitable for BERT (see the sketch below).
- **DataLoaders**: Create PyTorch DataLoaders for batching and shuffling training and validation data.
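A sketch of both steps, assuming `sentences` is a list of raw strings, `labels` is a list of integer class IDs, and the `bert-base-uncased` checkpoint is used; the notebook's exact settings may differ:

```python
import torch
from torch.utils.data import DataLoader, RandomSampler, TensorDataset
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# Tokenize every sentence into input IDs and attention masks.
encoded = tokenizer(
    sentences,               # assumed: a list of raw text strings
    padding='max_length',
    truncation=True,
    max_length=64,
    return_tensors='pt',
)

train_dataset = TensorDataset(
    encoded['input_ids'],
    encoded['attention_mask'],
    torch.tensor(labels),    # assumed: a list of integer class labels
)

# Batch and shuffle the training data; a validation loader is built the same way.
train_dataloader = DataLoader(
    train_dataset,
    sampler=RandomSampler(train_dataset),
    batch_size=batch_size,
)
```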
## Model Training

- **Initialization**: Load a pre-trained BERT model and configure the optimizer and learning rate scheduler.
- **Training Loop**: Perform fine-tuning by iterating over the training data, calculating the loss, and updating model parameters through backpropagation (see the sketch below).
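A condensed sketch of both steps, reusing `train_dataloader`, `epochs`, and `learning_rate` from the sketches above; the details are illustrative rather than copied from the notebook:

```python
import torch
from torch.optim import AdamW
from transformers import BertForSequenceClassification, get_linear_schedule_with_warmup

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Load a pre-trained BERT model with a sequence-classification head.
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
model.to(device)

optimizer = AdamW(model.parameters(), lr=learning_rate)
total_steps = len(train_dataloader) * epochs
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=total_steps
)

for epoch in range(epochs):
    model.train()
    total_loss = 0.0
    for batch in train_dataloader:
        input_ids, attention_mask, batch_labels = (t.to(device) for t in batch)
        model.zero_grad()
        outputs = model(input_ids, attention_mask=attention_mask, labels=batch_labels)
        total_loss += outputs.loss.item()
        outputs.loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # guard against exploding gradients
        optimizer.step()
        scheduler.step()
    print(f'Epoch {epoch + 1}: average training loss = {total_loss / len(train_dataloader):.4f}')
```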
## Model Evaluation

- **Validation**: Evaluate the model on the validation dataset to measure performance.
- **Accuracy Calculation**: Use a custom accuracy function to compute the proportion of correct predictions (see the sketch below).
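A minimal version of such an accuracy function and validation pass, following the common argmax-over-logits pattern; `validation_dataloader` is assumed to be built like the training loader above:

```python
import numpy as np
import torch

def flat_accuracy(logits, label_ids):
    """Proportion of correct predictions, given raw logits and integer labels."""
    preds = np.argmax(logits, axis=1).flatten()
    return np.sum(preds == label_ids.flatten()) / len(label_ids.flatten())

model.eval()
total_accuracy = 0.0
with torch.no_grad():
    for batch in validation_dataloader:
        input_ids, attention_mask, batch_labels = (t.to(device) for t in batch)
        outputs = model(input_ids, attention_mask=attention_mask)
        logits = outputs.logits.detach().cpu().numpy()
        total_accuracy += flat_accuracy(logits, batch_labels.cpu().numpy())

print(f'Validation accuracy: {total_accuracy / len(validation_dataloader):.4f}')
```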
## Results and Analysis

- **Training Loss**: Monitored during each epoch to observe the convergence of the model.
- **Validation Accuracy**: Calculated at the end of each epoch to evaluate the model's performance on unseen data.
- **Matthews Correlation Coefficient (MCC)**: In the end we achieved an MCC of ~0.375, which is decent considering we did not do any hyperparameter tuning (see the sketch after this list).
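For reference, the MCC can be computed with scikit-learn; `all_labels` and `all_preds` below are assumed to be flat arrays of true and predicted class indices collected over the validation set:

```python
from sklearn.metrics import matthews_corrcoef

# assumed: flat arrays of true and predicted class indices
mcc = matthews_corrcoef(all_labels, all_preds)
print(f'MCC: {mcc:.3f}')
```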
## Acknowledgments

- **Hugging Face**: For providing the BERT model and the Transformers library.
- **PyTorch**: For its deep learning framework that enables efficient training and evaluation.
- **Deka**: For hosting the workshop.
- **Chris McCormick's Blog**