Attention Is All You Need - Implementation from Scratch

This repository contains a Python implementation of the original Transformer model, as described in the paper "Attention Is All You Need" by Vaswani et al.

The implementation closely follows the Pytorch Transformers from Scratch (Attention is all you need) video by Aladdin Persson.

Overview
The transformer model has been a significant breakthrough in machine learning, particularly in Natural Language Processing (NLP). It relies entirely on the attention mechanism, which lets the model weigh the relevant parts of the input sequence when producing each output, greatly improving its ability to handle long sequences.
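Concretely, the scaled dot-product attention at the heart of the paper computes, for query, key, and value matrices Q, K, V with key dimension d_k:

```math
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^\top}{\sqrt{d_k}}\right)V
```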

In this project, we implement the transformer model from scratch in order to build a detailed understanding of its inner workings.

Code Structure
The main implementation of the self-attention mechanism, which is the core of the transformer model, lives in main.py. It defines a SelfAttention class that extends PyTorch's nn.Module; a sketch of what such a class typically looks like is shown below.
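The following is a minimal sketch of a multi-head SelfAttention module in the style used in the Aladdin Persson video; the exact code in main.py may differ in details such as the scaling constant and layer names:

```python
import torch
import torch.nn as nn


class SelfAttention(nn.Module):
    def __init__(self, embed_size, heads):
        super().__init__()
        self.embed_size = embed_size
        self.heads = heads
        self.head_dim = embed_size // heads
        assert self.head_dim * heads == embed_size, "embed_size must be divisible by heads"

        # Linear projections for values, keys, and queries (applied per head)
        self.values = nn.Linear(self.head_dim, self.head_dim, bias=False)
        self.keys = nn.Linear(self.head_dim, self.head_dim, bias=False)
        self.queries = nn.Linear(self.head_dim, self.head_dim, bias=False)
        self.fc_out = nn.Linear(heads * self.head_dim, embed_size)

    def forward(self, values, keys, query, mask):
        N = query.shape[0]  # batch size
        value_len, key_len, query_len = values.shape[1], keys.shape[1], query.shape[1]

        # Split the embedding into `heads` pieces of size head_dim
        values = self.values(values.reshape(N, value_len, self.heads, self.head_dim))
        keys = self.keys(keys.reshape(N, key_len, self.heads, self.head_dim))
        queries = self.queries(query.reshape(N, query_len, self.heads, self.head_dim))

        # Raw attention scores: shape (N, heads, query_len, key_len)
        energy = torch.einsum("nqhd,nkhd->nhqk", [queries, keys])

        # Optional mask (e.g. padding or causal) blocks out positions before softmax
        if mask is not None:
            energy = energy.masked_fill(mask == 0, float("-1e20"))

        # Scaled softmax over the key dimension
        attention = torch.softmax(energy / (self.embed_size ** 0.5), dim=3)

        # Weighted sum of values, then merge the heads back together
        out = torch.einsum("nhql,nlhd->nqhd", [attention, values]).reshape(
            N, query_len, self.heads * self.head_dim
        )
        return self.fc_out(out)
```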

How to Run
To run the code, you need Python and PyTorch installed on your machine. You can then run the main.py file with a Python interpreter, e.g. `python main.py`.
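As a quick smoke test, assuming main.py defines a SelfAttention class with the constructor sketched above (the signature is an assumption, not confirmed by this repository):

```python
import torch
from main import SelfAttention  # assumes main.py exposes SelfAttention

attention = SelfAttention(embed_size=256, heads=8)
x = torch.rand(2, 10, 256)           # (batch, sequence length, embedding)
out = attention(x, x, x, mask=None)  # self-attention: values, keys, queries are all x
print(out.shape)                     # expected: torch.Size([2, 10, 256])
```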
