I wrote custom CUDA kernels for `flip`, which (as far as I am aware) has no official PyTorch implementation, and for `cumsum`, which has one, but which I reimplemented in the same style for consistency. Both were written for the CoPE paper; the core CoPE implementation is taken from there.
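For context, here is a minimal PyTorch sketch (my own naming, not this repo's API) of the reverse-cumulative-sum pattern from the CoPE paper that motivates these kernels: sigmoid gates are summed from each key position up to the query position, which is conveniently expressed as flip, then cumsum, then flip back.

```python
import torch

def cope_positions(q, k):
    """Contextual positions p[i, j] = sum_{t=j..i} sigmoid(q_i . k_t)."""
    T = q.shape[-2]
    gates = torch.sigmoid(q @ k.transpose(-1, -2))           # (..., T, T)
    causal = torch.tril(torch.ones(T, T, dtype=torch.bool, device=q.device))
    gates = gates.masked_fill(~causal, 0.0)
    # Reverse cumulative sum over the key axis: flip -> cumsum -> flip.
    # This is the pattern the custom flip/cumsum kernels accelerate.
    return gates.flip(-1).cumsum(-1).flip(-1)

q = torch.randn(2, 8, 16)  # (batch, seq_len, head_dim)
k = torch.randn(2, 8, 16)
pos = cope_positions(q, k)  # (2, 8, 8); pos[b, i, j] is valid for j <= i
```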
This repository is currently in toy status, so use it at your own risk.
```sh
pip install -e .
```
After that, you should be able to import the custom ops into your Python code.
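As a rough illustration of what that might look like, here is a sketch in which the module and op names (`cope_cuda`, `flip`, `cumsum`) are placeholders I made up, not necessarily this repo's actual API; check the installed package for the real names.

```python
import torch
# NOTE: `cope_cuda`, `flip`, and `cumsum` are hypothetical names used only
# to illustrate the usage pattern of the custom ops.
import cope_cuda

x = torch.randn(4, 128, device="cuda")
y = cope_cuda.flip(x, dim=-1)    # custom CUDA flip kernel
z = cope_cuda.cumsum(x, dim=-1)  # custom CUDA cumsum kernel
```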
## Run Tests
```sh
pytest tests
```
## Train MNIST
```sh
python train_mmnist.py
```
- Introduce a benchmark to guide further kernel optimization
- Make kernels faster
- Implement the entire forward pass in CUDA
- Introduce einops (DONE), just because it's einops
- A very interesting idea was mentioned by https://www.youtube.com/@marinepower in the comments of https://www.youtube.com/watch?v=qcMsvU-wYZA, a video by Gabriel Mongaras:
"Wonder if this method could be improved by having a new projection matrix of size [hidden_dim x 1] that computes the width of each token. We take the sigmoid, the cumulative sum, we do the interpolation as described, but we add it to the queries and keys, then do normal attention."
  - This requires a new (small) projection matrix but would allow us to use flash attention directly, without needing a new CoPE kernel; see the sketch after this list.
- Implement automatic testing
- Better README.md
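Here is a hedged sketch of that suggestion. All module, parameter, and shape choices below are my own assumptions; only the recipe itself comes from the quote: a [hidden_dim x 1] width projection, sigmoid, cumulative sum, interpolation of position embeddings, added to the query/key inputs before vanilla attention.

```python
import torch
import torch.nn as nn

class WidthPositionalEncoding(nn.Module):
    """Sketch of the idea quoted above (names/shapes are my own assumptions).
    A learned [hidden_dim x 1] projection assigns each token a "width";
    cumulative widths give fractional positions; a position-embedding table
    is linearly interpolated at those positions and the result is added to
    the token states feeding the query/key projections."""

    def __init__(self, hidden_dim, max_pos):
        super().__init__()
        self.width_proj = nn.Linear(hidden_dim, 1)        # hidden_dim -> 1
        self.pos_emb = nn.Embedding(max_pos, hidden_dim)  # integer positions

    def forward(self, x):                                 # x: (B, T, D)
        widths = torch.sigmoid(self.width_proj(x))        # (B, T, 1), in (0, 1)
        pos = widths.cumsum(dim=1).squeeze(-1)            # fractional positions
        lo = pos.floor().long().clamp(max=self.pos_emb.num_embeddings - 2)
        frac = (pos - lo.float()).unsqueeze(-1)           # (B, T, 1)
        # Linear interpolation between neighbouring integer embeddings.
        e = (1 - frac) * self.pos_emb(lo) + frac * self.pos_emb(lo + 1)
        return x + e  # apply to the states used for queries and keys
```

Since the positional signal is baked into the query/key inputs, the attention itself stays standard, so flash attention can be used unchanged.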
Pull Requests are encouraged : )