This repository provides a detailed yet simple explanation of the transformer architecture. It also contains experiments and bits of code for building and training transformer networks.
At the moment the repo contains explanations of vanilla transformers, but it will grow to include spin-off architectures such as Vision Transformers.
- A Jupyter notebook displaying the workflow for obtaining embeddings from sentences, including tokenization of the sentence and computation of the positional encodings
- The README detailing the theory behind embeddings, tokenization and positional encodings
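The positional-encoding step mentioned above can be sketched as follows. This is a minimal stand-alone implementation of the sinusoidal encodings from "Attention Is All You Need", not the notebook's actual code; the function name and shapes are illustrative:

```python
import math

def positional_encoding(seq_len, d_model):
    # Sinusoidal positional encodings (illustrative sketch):
    #   PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    #   PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)      # even dimensions use sine
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)  # odd dimensions use cosine
    return pe

# Each row is added to the token embedding at the same position,
# so the model can distinguish identical tokens at different positions.
encodings = positional_encoding(seq_len=4, d_model=8)
```

In practice this would be computed with a tensor library and added to the embedding matrix in one vectorized operation; the loops here are only for clarity.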
Seminal papers:
Other cool papers:
Pre-trained transformers for acoustic data:
- BEATs (note: our repository contains a running example on ESC50)
- AST: Audio Spectrogram Transformer
- PaSST: Efficient Training of Audio Transformers with Patchout
Tutorials related to transformers:
Other cool resources on transformers: