## Setup 🤖

Set up the local development environment:

```bash
./scripts/setup_env.sh     # Create virtual env & download dependencies
source .venv/bin/activate  # Activate it
```
- `notebooks/transformer_from_scratch_exercise.ipynb` is a practice notebook for building a transformer from scratch by filling in the missing portions of the code. It's a good way to review your knowledge of Transformers (a sketch of the kind of attention code you'd write is shown after this list).
- The solution is in `notebooks/transformer_from_scratch_solution.ipynb`.
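For a taste of what the exercise involves, here is a minimal sketch of scaled dot-product attention in PyTorch. This is an illustration of the core mechanism only, not the notebook's actual solution code:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, d_k)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        # Mask out disallowed positions (e.g. padding or future tokens)
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)  # the attention matrix
    return weights @ v, weights
```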
## Training for the GPU poor T_T
- Upload the `notebooks/train_model.ipynb` notebook to Colab (or Kaggle) and run it on a GPU; you can verify the runtime actually has one with the snippet below.
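A quick sanity check that the runtime is GPU-backed (this assumes PyTorch is available, which the training notebook relies on):

```python
import torch

# Should print True on a correctly configured GPU runtime
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```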
If you're GPU self-sufficient, you can run the training locally:

- Log in to Hugging Face with `huggingface-cli login`
- Run `python transformer/train.py`
## Resources

A variety of resources that really helped us understand and implement the Transformer model:
- Attention Is All You Need
- Coding a Transformer from scratch on PyTorch, with full explanation, training and inference by Umar Jamil
- The Annotated Transformer
- Visualizing Attention, a Transformer's Heart by 3Blue1Brown
- Understanding and Coding the Self-Attention Mechanism of Large Language Models From Scratch by Sebastian Raschka
- Self Attention in Transformer Neural Networks (with Code!) by CodeEmporium
- Visualizing attention matrix using BertViz (a minimal usage sketch follows below)
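If you want to try the BertViz route, here is a minimal sketch to run in a notebook. It assumes `bertviz` and `transformers` are installed; `bert-base-uncased` is used purely as an example model, not something this repo prescribes:

```python
from transformers import AutoModel, AutoTokenizer
from bertviz import head_view

# Any model that returns attentions works; bert-base-uncased is just an example
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tokenizer("The transformer attends to every token", return_tensors="pt")
outputs = model(**inputs)

# Interactive per-head view of the attention matrices (renders inside a notebook)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
head_view(outputs.attentions, tokens)
```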