
Transformer Architecture from Scratch in Python

Overview

This project implements the Transformer architecture from scratch in Python. It includes the following components:

  • Self-Attention
  • Causal Attention
  • Cross Attention
  • GPT (Generative Pre-trained Transformer) Model
  • BERT (Bidirectional Encoder Representations from Transformers) Model

Components

Self-Attention

The self-attention mechanism computes an attention score between every pair of positions in the input sequence; each position's output is then a weighted sum of the values at all positions.
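As a rough sketch (attention.py may organize this differently; the projection matrices w_q, w_k, w_v here are illustrative), scaled dot-product self-attention looks like:

```python
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    # x: (batch, seq_len, d_model); w_*: (d_model, d_k) projection matrices
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # scaled dot-product scores: (batch, seq_len, seq_len)
    scores = q @ k.transpose(-2, -1) / (k.size(-1) ** 0.5)
    weights = F.softmax(scores, dim=-1)  # one distribution per query position
    return weights @ v
```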

Causal Attention

Causal attention is a variant of self-attention in which each position can attend only to itself and earlier positions, never to future ones; this is what makes autoregressive generation possible.
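A minimal sketch, assuming the same scaled dot-product setup as above; the upper-triangular mask is what hides future positions:

```python
import torch
import torch.nn.functional as F

def causal_attention(q, k, v):
    # q, k, v: (batch, seq_len, d_k)
    seq_len = q.size(-2)
    scores = q @ k.transpose(-2, -1) / (k.size(-1) ** 0.5)
    # True above the diagonal marks future positions; they receive -inf scores
    mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v  # softmax turns -inf into weight 0
```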

Cross Attention

Cross attention computes attention scores between two different sequences: queries come from one sequence, while keys and values come from the other. It is typically used in encoder-decoder tasks such as machine translation.
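The only structural change from self-attention is where q, k, and v come from; a sketch (again with illustrative projection matrices):

```python
import torch.nn.functional as F

def cross_attention(decoder_x, encoder_out, w_q, w_k, w_v):
    # queries come from the decoder; keys and values from the encoder output
    q = decoder_x @ w_q      # (batch, tgt_len, d_k)
    k = encoder_out @ w_k    # (batch, src_len, d_k)
    v = encoder_out @ w_v    # (batch, src_len, d_k)
    # scores: (batch, tgt_len, src_len) -- each target position over the source
    scores = q @ k.transpose(-2, -1) / (k.size(-1) ** 0.5)
    return F.softmax(scores, dim=-1) @ v
```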

GPT Model

The Generative Pre-trained Transformer (GPT) is a decoder-only variant of the Transformer architecture: a stack of causally masked self-attention blocks trained to predict the next token, which makes it well suited to text generation and other natural language processing tasks.
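GPT.py's exact structure may differ, but a typical pre-norm decoder block (sketched here with PyTorch's built-in nn.MultiheadAttention) looks roughly like:

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    # one pre-norm GPT-style block: masked self-attention, then an MLP,
    # each wrapped in a residual connection
    def __init__(self, d_model, n_heads):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        seq_len = x.size(1)
        # boolean causal mask: True = not allowed to attend (future position)
        mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device),
            diagonal=1,
        )
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out               # residual around attention
        x = x + self.mlp(self.ln2(x))  # residual around the MLP
        return x
```

A full GPT model stacks several such blocks between a token/position embedding layer and a final linear projection over the vocabulary.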

BERT Model

BERT (Bidirectional Encoder Representations from Transformers) is an encoder-only Transformer: its self-attention is unmasked, so each position attends to context on both sides, making it well suited to language-understanding tasks.
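The key architectural difference from GPT is the absence of the causal mask; with PyTorch's built-in attention module that is simply:

```python
import torch
import torch.nn as nn

# bidirectional self-attention: the same attention call as in the GPT block,
# but with no attn_mask, so every position sees both left and right context
attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
x = torch.randn(2, 10, 64)  # (batch, seq_len, d_model)
out, _ = attn(x, x, x)      # full bidirectional attention
```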

Implementation

The project is implemented in Python, using PyTorch for tensor computation.

Files

  • attention.py: Implementations of the self-attention, causal attention, and cross attention mechanisms.
  • GPT.py: Implementation of the GPT model.
  • bert.py: Implementation of the BERT model.

Dependencies

  • Python 3.10.12
  • PyTorch

Acknowledgments

This project is inspired by the Transformer architecture introduced in the paper "Attention Is All You Need" by Vaswani et al. (2017).
