An implementation of Transformers in PyTorch.
A single-layer Transformer encoder with a linear classifier is trained end-to-end for sentiment analysis on the IMDb dataset (~70% accuracy).
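As a rough illustration of the architecture described above, here is a minimal sketch built on PyTorch's `nn.TransformerEncoder`. It is not the code in this repository: the class name `SentimentClassifier`, the hyperparameters (`vocab_size`, `d_model`, `n_heads`), the learned positional embeddings, the use of index 0 for padding, and the mean pooling are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class SentimentClassifier(nn.Module):
    """Hypothetical single-layer encoder + linear head (illustrative only)."""

    def __init__(self, vocab_size=20000, d_model=128, n_heads=4,
                 max_len=512, n_classes=2):
        super().__init__()
        # Assumption: token id 0 is reserved for padding.
        self.tok_emb = nn.Embedding(vocab_size, d_model, padding_idx=0)
        # Assumption: learned positional embeddings, one per position.
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model,
            nhead=n_heads,
            dim_feedforward=4 * d_model,
            batch_first=True,
        )
        # num_layers=1 matches the "single-layer" description.
        self.encoder = nn.TransformerEncoder(
            layer, num_layers=1, enable_nested_tensor=False
        )
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer tensor, seq_len <= max_len.
        pad_mask = token_ids == 0
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        x = self.tok_emb(token_ids) + self.pos_emb(positions)
        x = self.encoder(x, src_key_padding_mask=pad_mask)
        # Mean-pool over non-padding positions only.
        keep = (~pad_mask).unsqueeze(-1).float()
        pooled = (x * keep).sum(dim=1) / keep.sum(dim=1).clamp(min=1.0)
        return self.classifier(pooled)  # (batch, n_classes) logits


model = SentimentClassifier()
batch = torch.tensor([[5, 17, 42, 8, 0, 0],
                      [9, 3, 11, 27, 64, 2]])
model.eval()
with torch.no_grad():
    logits = model(batch)
```

Training end-to-end would then amount to minimizing `nn.CrossEntropyLoss` on these logits, with the embeddings, encoder, and classifier updated together.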
- Use pre-trained word embeddings like GloVe
- Handle strings with more than 512 tokens in a better way
- Use a deeper network for better accuracy
- Implement Vision Transformer (ViT)
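
For the long-input item above, one possible approach (an assumption, not something this repository implements) is to split a tokenized review into overlapping windows instead of truncating it. The helper name `chunk_tokens` and the `stride` value are hypothetical.

```python
def chunk_tokens(token_ids, max_len=512, stride=256):
    """Split a token list into windows of at most max_len tokens.

    Consecutive windows start `stride` tokens apart, so they overlap
    by max_len - stride tokens. Short inputs yield a single window.
    """
    if max_len <= 0 or not 0 < stride <= max_len:
        raise ValueError("require max_len > 0 and 0 < stride <= max_len")
    chunks = []
    start = 0
    while True:
        chunks.append(token_ids[start:start + max_len])
        # Stop once the current window reaches the end of the input.
        if start + max_len >= len(token_ids):
            break
        start += stride
    return chunks


ids = list(range(1000))      # stand-in for a 1000-token review
chunks = chunk_tokens(ids)   # windows start at positions 0, 256, 512
```

Each window could then be classified separately and the per-window logits averaged to get one prediction per review; whether this improves accuracy over truncation would need to be measured.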