This repository contains the code for the paper "Data Augmentation for Automated Essay Scoring using Transformer Models" by Kshitij Gupta. The paper is available at IEEE Xplore.
Automated essay scoring (AES), also known as automated essay grading or machine scoring, refers to the use of artificial intelligence and natural language processing techniques to assess and evaluate essays written by humans. This technology aims to streamline the grading process, providing rapid and consistent feedback to students.
To use the code and scripts in this repository, please follow these steps:
- Clone the repository:
  git clone https://github.com/kjgpta/Data-Augmentation-for-Automated-Essay-Scoring-using-Transformer-Models.git
- Install the required dependencies. You can use pip:
  pip install -r requirements.txt
- Set up any additional configuration or environment variables as necessary.
This section describes how to use the code and scripts provided in this repository.
- First, we summarize the topic of each essay using the BART model.
- Next, we normalize the training and validation scores to a scale of 0 to 10, which gives 11 score brackets.
- We then add the summary of each topic to the training and validation data. The data augmentation process is shown below:
- We train the model on the augmented data and evaluate it on the validation data.
- After thorough analysis, we add the test data to the training data and retrain the model on the augmented data using the updated hyperparameters.
- Finally, we evaluate the model on the test data.
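The score normalization and augmentation steps can be illustrated with a minimal sketch. It is not the repository's actual implementation: the per-topic score ranges, the ` [SEP] ` separator, the choice to prepend the summary to the essay, and all function and field names (`normalize_score`, `augment_essay`, `build_augmented_dataset`, `topic_id`, `essay`, `score`) are assumptions made for illustration. The topic summaries are plain strings here; in practice they would come from a BART summarizer.

```python
import math
from typing import Dict, List, Tuple


def normalize_score(score: float, min_score: float, max_score: float) -> int:
    """Map a raw score from [min_score, max_score] to an integer bracket 0..10."""
    if max_score <= min_score:
        raise ValueError("max_score must be greater than min_score")
    scaled = 10 * (score - min_score) / (max_score - min_score)
    # Round half up, then clamp so out-of-range inputs still land in 0..10.
    return max(0, min(10, math.floor(scaled + 0.5)))


def augment_essay(essay: str, topic_summary: str, sep: str = " [SEP] ") -> str:
    """Prepend the topic summary to the essay (assumed augmentation format)."""
    return f"{topic_summary.strip()}{sep}{essay.strip()}"


def build_augmented_dataset(
    rows: List[dict],
    topic_summaries: Dict[int, str],
    score_ranges: Dict[int, Tuple[float, float]],
) -> List[dict]:
    """Attach the topic summary and the normalized label to every essay row."""
    augmented = []
    for row in rows:
        topic = row["topic_id"]
        low, high = score_ranges[topic]
        augmented.append(
            {
                "topic_id": topic,
                "text": augment_essay(row["essay"], topic_summaries[topic]),
                "label": normalize_score(row["score"], low, high),
            }
        )
    return augmented


# Hypothetical toy data; the real essays, summaries, and ranges will differ.
rows = [
    {"topic_id": 1, "essay": "Computers help people learn new skills.", "score": 8},
    {"topic_id": 2, "essay": "Libraries should keep all books available.", "score": 3},
]
topic_summaries = {
    1: "Effects of computers on society.",
    2: "Censorship in libraries.",
}
score_ranges = {1: (2, 12), 2: (1, 6)}

augmented = build_augmented_dataset(rows, topic_summaries, score_ranges)
```

Each entry in `augmented` holds the summary-prefixed essay under `text` and an integer from 0 to 10 under `label`, so topics with different raw score ranges share one label space.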
The results of the experiments are shown below for all four topics from the test set:
This project is licensed under the CC0-1.0 License.