fangyuan-ksgk/selective-attention-transformer

Selective Attention Transformer

This repository contains an unofficial implementation of the Selective Attention Transformer, a Transformer variant that modifies the standard attention mechanism.

For details, see the arXiv paper.

Overview

The Selective Attention Transformer aims to make attention both cheaper and more accurate. It does so by reducing the attention paid to elements of the context that are no longer needed, so the model focuses on the relevant parts of the input. This implementation is based on the original paper by Yaniv Leviathan et al.
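To make the idea concrete, here is a minimal NumPy sketch of one way a selective-attention penalty can be applied to a single causal attention head. It is an illustration under stated assumptions, not this repository's code: the function name `selective_attention_weights` is hypothetical, the head's own logits double as the selection scores, and the steps (ReLU, exempting the first token and the diagonal, shifting by one position, accumulating over earlier tokens) follow one reading of the paper's description.

```python
import numpy as np


def selective_attention_weights(logits: np.ndarray) -> np.ndarray:
    """Causal attention weights for one head with a selective-attention penalty.

    logits[i, j] is the pre-softmax score of query i attending to key j.
    Assumption: the same logits are reused as selection scores.
    """
    n = logits.shape[0]
    causal = np.tril(np.ones((n, n), dtype=bool))

    # Selection scores: S[i, j] is how strongly token i votes to mask token j.
    # Only non-negative votes about past-or-present positions are kept.
    S = np.where(causal, np.maximum(logits, 0.0), 0.0)
    S[:, 0] = 0.0              # never mask the first token (e.g. <BOS>)
    np.fill_diagonal(S, 0.0)   # a token does not mask itself

    # Shift down one row so a vote cast by token i only affects later queries.
    S = np.roll(S, 1, axis=0)
    S[0, :] = 0.0

    # F[i, j]: masking accumulated against token j by tokens before query i.
    F = np.cumsum(S, axis=0)

    # Subtract the penalty, apply the causal mask, then a stable softmax.
    penalised = np.where(causal, logits - F, -np.inf)
    penalised -= penalised.max(axis=-1, keepdims=True)
    weights = np.exp(penalised)
    return weights / weights.sum(axis=-1, keepdims=True)
```

With all-zero logits the penalty is zero and the result is ordinary uniform causal attention. If one token assigns a large positive score to an earlier token, every later query attends to that earlier token much less, while the voting token's own row is unchanged.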

Features

  • Selective Attention Mechanism: Reduces the attention paid to parts of the input that are no longer needed, so the model focuses on the most relevant ones.
  • Scalable Architecture: Designed to work efficiently with large datasets.
  • Modular Design: Easy to integrate with existing deep learning frameworks.
  • Key-Value Cache (kv_cache): Caches the keys and values of earlier tokens during inference so they are not recomputed at every decoding step.
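The key-value cache item above can be illustrated with a short NumPy sketch of plain (non-selective) cached decoding for one head. The repository's actual `kv_cache` interface is not shown on this page, so the `KVCache` class and `decode_step` function below are hypothetical names chosen for illustration only.

```python
import numpy as np


class KVCache:
    """Append-only store of per-token keys and values for one attention head."""

    def __init__(self, head_dim: int):
        self.keys = np.empty((0, head_dim))
        self.values = np.empty((0, head_dim))

    def append(self, k: np.ndarray, v: np.ndarray) -> None:
        # Add the newest token's key and value as one extra row each.
        self.keys = np.vstack([self.keys, k[None, :]])
        self.values = np.vstack([self.values, v[None, :]])


def decode_step(q: np.ndarray, k: np.ndarray, v: np.ndarray, cache: KVCache) -> np.ndarray:
    """Attend from the newest token over all cached tokens, itself included."""
    cache.append(k, v)
    scores = cache.keys @ q / np.sqrt(q.shape[-1])
    scores -= scores.max()          # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum()
    return weights @ cache.values
```

Each call to `decode_step` reuses the stored keys and values of all earlier tokens, so only the newest token's projections have to be computed. Calling it once per token gives the same outputs as recomputing full causal attention over the whole sequence.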
