This repository contains implementations of the Reformer, as described in the following paper - https://openreview.net/pdf?id=rkgNKkHtvB. Implementations are provided in both PyTorch and TensorFlow Keras. We will be performing more experiments on these over time.
Clone the repository locally and install the dependencies listed in the requirements.txt file with pip install -r requirements.txt.
ReformerLM language model example (PyTorch):

import torch
from reformers import ReformerLM

model = ReformerLM(
    num_tokens = 20000,
    emb = 512,
    depth = 12,
    max_seq_len = 8192,
    heads = 8,
    lsh_dropout = 0.1,
    causal = True,          # auto-regressive or not
    bucket_size = 64,       # average size of qk per bucket, 64 was recommended in the paper
    n_hashes = 4,           # 4 is permissible per the authors, 8 is better but slower
    ff_chunks = 200,        # number of chunks for the feedforward layer; increase if you hit memory issues
    weight_tie = False,     # tie parameters of each layer, so additional depth costs no extra memory
    attn_chunks = 8,        # process LSH attention in chunks; the only way to fit in memory when scaling to 16k tokens
    use_full_attn = False   # use full self-attention, for comparison
).cuda()
x = torch.randint(0, 20000, (1, 8192)).long().cuda()
y = model(x) # (1, 8192, 20000)
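For training, the returned logits can be fed into a standard cross-entropy loss. The snippet below is a minimal, hypothetical sketch; the optimizer, learning rate, and the next-token target construction via torch.roll are illustrative and not part of the library:

import torch
import torch.nn.functional as F

optimizer = torch.optim.Adam(model.parameters(), lr = 1e-4)

logits = model(x)                               # (1, 8192, 20000)
targets = torch.roll(x, shifts = -1, dims = 1)  # shift left by one position for next-token targets (illustrative)
loss = F.cross_entropy(logits.reshape(-1, 20000), targets.reshape(-1))
loss.backward()
optimizer.step()
optimizer.zero_grad()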
Plain Reformer example (PyTorch), operating directly on embeddings:

import torch
from reformers import Reformer

model = Reformer(
    emb = 512,
    depth = 12,
    max_seq_len = 8192,
    heads = 8,
    lsh_dropout = 0.1,
    causal = True
).cuda()
x = torch.randn(1, 8192, 512).cuda()
y = model(x) # (1, 8192, 512)
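Because Reformer consumes and returns embedding-sized vectors rather than token ids, it can be used as a building block inside a larger model. A minimal, hypothetical sketch that adds a mean-pooled classification head on top (the linear head and the class count are purely illustrative):

import torch.nn as nn

classifier = nn.Linear(512, 10).cuda()   # illustrative 10-class head

features = model(x)                      # (1, 8192, 512), output of the Reformer above
pooled = features.mean(dim = 1)          # (1, 512)
class_logits = classifier(pooled)        # (1, 10)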
TFReformerLM example (TensorFlow Keras):

from reformers import TFReformerLM

model_tf = TFReformerLM(
    num_tokens = 20000,
    emb = 512,
    depth = 1,
    max_seq_len = 32000,
    heads = 8,
    lsh_dropout = 0.1,
    causal = True,          # auto-regressive or not
    bucket_size = 64,       # average size of qk per bucket, 64 was recommended in the paper
    n_hashes = 4,           # 4 is permissible per the authors, 8 is better but slower
    ff_chunks = 1600,       # number of chunks for the feedforward layer; increase if you hit memory issues
    weight_tie = False,     # tie parameters of each layer, so additional depth costs no extra memory
    attn_chunks = 8,        # process LSH attention in chunks; the only way to fit in memory when scaling to 16k tokens
    use_full_attn = False   # use full self-attention, for comparison
)

model_tf.build(input_shape = (1, 32000))
model_tf.summary()
# Model: "tf_reformer_lm"
# _________________________________________________________________
# Layer (type)                 Output Shape              Param #
# =================================================================
# embedding (Embedding)        multiple                  10240000
# _________________________________________________________________
# embedding_1 (Embedding)      multiple                  16384000
# _________________________________________________________________
# tf_reformer (TFReformer)     multiple                  2888704
# _________________________________________________________________
# dense_5 (Dense)              multiple                  10260000
# =================================================================
# Total params: 39,772,704
# Trainable params: 39,772,704
# Non-trainable params: 0
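Once built, the Keras model can be called on integer token ids like any other tf.keras.Model. The following is a minimal, hypothetical sketch of a forward pass and a compile step; the random inputs, optimizer, and loss are illustrative and assume the model returns raw logits:

import tensorflow as tf

x_tf = tf.random.uniform((1, 32000), minval = 0, maxval = 20000, dtype = tf.int32)
y_tf = model_tf(x_tf)   # expected shape: (1, 32000, 20000)

model_tf.compile(
    optimizer = tf.keras.optimizers.Adam(1e-4),
    loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits = True)
)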
Source for the PyTorch code - https://github.com/lucidrains/reformer-pytorch