A deep learning NLP project using the ByT5 model for grammatical error correction.
The model has been fine-tuned with a sentence-2-sentence approach, starting from Google's pre-trained ByT5-small model [https://huggingface.co/google/byt5-small].
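For orientation, here is a minimal sketch of how such sentence-2-sentence fine-tuning can be set up with Hugging Face transformers; the Danish example pair and the single gradient step are illustrative assumptions, not the exact training loop used in this repo:

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/byt5-small")
model = T5ForConditionalGeneration.from_pretrained("google/byt5-small")

# Input is the erroneous sentence, target is the corrected sentence
# (this example pair is illustrative only).
inputs = tokenizer("han gå i skole", return_tensors="pt")
labels = tokenizer("han går i skole", return_tensors="pt").input_ids

loss = model(**inputs, labels=labels).loss  # standard seq2seq cross-entropy
loss.backward()                             # one step of an ordinary training loop
```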
The so-called multilexnorm model builds on [https://huggingface.co/ufal/byt5-small-multilexnorm2021-da]. It also starts from ByT5-small, but it has already been fine-tuned on Twitter data with a masked-word-2-word approach; in this project we fine-tune it further.
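Loading that checkpoint as the starting point for further fine-tuning can be sketched as follows; the exact masked-word input/output format is defined by the upstream model card and is not reproduced here:

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

ckpt = "ufal/byt5-small-multilexnorm2021-da"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = T5ForConditionalGeneration.from_pretrained(ckpt)
# Further fine-tuning then proceeds as for the sentence-2-sentence model,
# but on word-level (masked-word-2-word) input/output pairs.
```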
The data we fine-tuned on is not public, so we can only upload a limited part of the data set: the test set, so that the main results can be reproduced.
We use word error rate (WER), BLEU, and GLEU as performance metrics.
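As a sketch, the three metrics can be computed on a single reference/hypothesis pair as below; jiwer and NLTK are assumed implementations here, and the notebooks may use other libraries:

```python
from jiwer import wer
from nltk.translate.bleu_score import sentence_bleu
from nltk.translate.gleu_score import sentence_gleu

reference = "han går i skole"   # gold correction (illustrative)
hypothesis = "han gå i skole"   # model output (illustrative)

print("WER :", wer(reference, hypothesis))                               # word error rate
print("BLEU:", sentence_bleu([reference.split()], hypothesis.split()))  # n-gram precision
print("GLEU:", sentence_gleu([reference.split()], hypothesis.split()))  # Google-BLEU variant
```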
Our main findings are gathered in two Jupyter notebooks, one per approach; they should be viewed as demos. For the "working" code, see the folders named "approach"_individual_files, which contain the fine-tuning, inference, and evaluation code for the models.
sentence-2-sentence approach:
- See ByT5.ipynb for training, inference, evaluation, and visualisation of the model (a minimal inference sketch follows the list below).
masked-word-2-word approach:
- See MLN.ipynb for training, inference, and evaluation.
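A minimal inference sketch for the fine-tuned sentence-2-sentence model; the checkpoint path and input sentence are placeholders:

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

ckpt = "path/to/fine-tuned-byt5"  # placeholder for the fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = T5ForConditionalGeneration.from_pretrained(ckpt)

batch = tokenizer("jeg har gå hjem", return_tensors="pt")  # illustrative input
out = model.generate(**batch, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))  # corrected sentence
```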
Contact us at s183568@student.dtu.dk or s193713@student.dtu.dk.