Generalize SwiGLU related python code #1158

Open
warpuv opened this issue Nov 21, 2024 · 0 comments · May be fixed by #1160

warpuv (Contributor) commented Nov 21, 2024

🚀 Feature

Generalize the SwiGLU-related Python code. Create base classes and generalized functions that can be reused by SwiGLU and by other GLU-like activation functions that may be implemented in the future.
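To illustrate the kind of sharing this refers to, here is a minimal sketch and not the project's actual code. The names `GLUBase`, `matvec`, and the toy `SwiGLU` / `GeGLU` classes below are hypothetical, plain Python lists stand in for tensors, and biases, fused ops, and packed weights are left out. It assumes the usual GLU form `W3 @ (act(W1 @ x) * (W2 @ x))`, where only `act` differs between variants.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # one row per output feature


def silu(v: float) -> float:
    """SiLU (swish): v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def gelu(v: float) -> float:
    """Exact GELU: v * Phi(v), with Phi the standard normal CDF."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def matvec(w: Matrix, x: Vector) -> Vector:
    """Matrix-vector product on plain lists (illustrative helper)."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


class GLUBase:
    """Hypothetical shared base: out = W3 @ (act(W1 @ x) * (W2 @ x))."""

    def __init__(self, w1: Matrix, w2: Matrix, w3: Matrix) -> None:
        self.w1, self.w2, self.w3 = w1, w2, w3

    @staticmethod
    def activation(v: float) -> float:
        # Subclasses supply the gate activation; nothing else changes.
        raise NotImplementedError

    def __call__(self, x: Vector) -> Vector:
        gate = [self.activation(g) for g in matvec(self.w1, x)]
        value = matvec(self.w2, x)
        hidden = [g * v for g, v in zip(gate, value)]
        return matvec(self.w3, hidden)


class SwiGLU(GLUBase):
    activation = staticmethod(silu)


class GeGLU(GLUBase):
    activation = staticmethod(gelu)
```

In this sketch a new GLU-like variant is a two-line subclass, which is the code-duplication saving the request is after. The real implementation would additionally have to share the fused-op and packed-weight paths.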

Motivation

This change, together with a few related ones, allows other similar activation functions (such as GeGLU) to be implemented on top of the same codebase.

Additional context

I plan to add a GeGLU implementation with fused ops and packed weights, based on the SwiGLU code. This change is necessary to avoid code duplication.

warpuv linked a pull request Nov 21, 2024 that will close this issue