Have you ever wanted to train a Transformer that's significantly slower, and probably less correct, than the many existing libraries? Well then you've come to the right place. This entire architecture is written purely in NumPy; no other dependency (except PyYAML) is required.
Simply run `pipenv install` and `pipenv shell` to create the virtual environment and get started.
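The setup steps above, as a shell fragment (this assumes a Pipfile is checked in at the repository root):

```shell
# from the repository root
pipenv install   # create the virtualenv and install the dependencies (NumPy, PyYAML)
pipenv shell     # activate the virtualenv in the current terminal
```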
This repository contains all the tools you need to construct a basic transformer from the layers provided. This is for LEARNING PURPOSES ONLY! Please do not try to build a production-ready transformer with this code.
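To give a feel for the kind of pure-NumPy building block this repository is made of, here is a minimal sketch of scaled dot-product attention, the core operation inside a transformer layer. This is an illustrative standalone example, not the repository's actual implementation; the function and variable names are my own.

```python
import numpy as np

def softmax(x, axis=-1):
    # subtract the max for numerical stability before exponentiating
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (seq_len, d_k) arrays
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (seq_len, seq_len) similarity scores
    weights = softmax(scores, axis=-1)   # each row is a probability distribution
    return weights @ V                   # weighted sum of value vectors

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Every layer in a from-scratch transformer bottoms out in small matrix operations like these, which is what makes NumPy sufficient (if slow).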
If you spot any errors in my gradient calculations, or anything else that doesn't make sense, PLEASE MAKE A PULL REQUEST! I'm almost certain I made some mistakes in my calculations and I would love your help.
Use the `train.py` script, and check out the `configs` folder for a sample training configuration.
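A training configuration might look something like the YAML below. The field names here are illustrative assumptions for the sketch, not necessarily the schema used by the files in `configs`; see that folder for the real layout.

```yaml
# Illustrative training config -- field names are assumptions,
# not necessarily the schema this repo's configs folder uses.
model:
  num_layers: 2
  d_model: 128
  num_heads: 4
training:
  batch_size: 16
  learning_rate: 0.0003
  epochs: 10
```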
Just run `get_vocab.py` to download the merges and vocab files to the specified folder.
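For context on what those files contain: in GPT-2-style byte-pair encoding, the merges file lists pair-merge rules in priority order and the vocab file maps token strings to integer ids. The sketch below uses toy inline data (not the real downloaded files) and a deliberately naive merge loop to show how the two fit together.

```python
# Toy stand-ins for the downloaded files -- the real merges file and
# vocab file are much larger, and this encoder is deliberately naive.
merges_txt = "l o\nlo w\n"  # one merge rule per line, highest priority first
vocab = {"low": 0, "lo": 1, "l": 2, "o": 3, "w": 4}

merges = [tuple(line.split()) for line in merges_txt.strip().splitlines()]

def bpe(word):
    # start from individual characters and apply merges in priority order
    symbols = list(word)
    for a, b in merges:
        i = 0
        while i < len(symbols) - 1:
            if symbols[i] == a and symbols[i + 1] == b:
                symbols[i:i + 2] = [a + b]  # replace the pair with its merge
            else:
                i += 1
    return symbols

tokens = bpe("low")
ids = [vocab[t] for t in tokens]
print(tokens, ids)  # ['low'] [0]
```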
Please create a pull request if you'd like to contribute to this project. I'm a busy student but I'll be sure to review it as soon as possible!
Write clear unit tests for each module (right now each module just has ad-hoc testing code that runs when the file is executed directly).
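One useful pattern for those unit tests is numerical gradient checking: compare a layer's hand-derived `backward` against a central finite difference of its `forward`. The `Sigmoid` class below is a toy stand-in written for this sketch, not a class from this repository; the same test structure would apply to the real layers.

```python
import numpy as np

class Sigmoid:
    """Toy example layer with a forward pass and a hand-derived gradient."""
    def forward(self, x):
        self.out = 1.0 / (1.0 + np.exp(-x))
        return self.out

    def backward(self, grad_out):
        # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))
        return grad_out * self.out * (1.0 - self.out)

def test_sigmoid_gradient():
    rng = np.random.default_rng(0)
    x = rng.normal(size=(3, 5))
    layer = Sigmoid()
    layer.forward(x)
    analytic = layer.backward(np.ones_like(x))
    # central finite difference as the ground truth
    eps = 1e-6
    numeric = (Sigmoid().forward(x + eps) - Sigmoid().forward(x - eps)) / (2 * eps)
    assert np.allclose(analytic, numeric, atol=1e-6)

test_sigmoid_gradient()
```

Gradient checks like this are exactly how errors in hand-derived backward passes (the kind the author asks for help with above) get caught.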