Question about discrepancy between implementations available in the repo and related papers #5
Comments

Hi, I'm a bit confused about the current implementations in the repo and the implementations used/discussed in the related papers. I'll just state what I think is true; please correct me if I'm wrong.

- FlashConv from H3: the fused kernel is implemented in fftconv_cuda.cu, but it is not using block FFT.
- FlashButterfly in "Simple Hardware-Efficient Long Convolutions for Sequence Modeling": long_conv.py uses BlockFFT (which is the same as the Butterfly decomposition, sketched below) with support for learnable parameters for dft_matrix. But it is not using a fused kernel, and the three-pass algorithm is also not implemented.
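For concreteness, here is a minimal sketch (mine, not the repo's code; the names block_fft, n1, n2 are illustrative) of the Cooley–Tukey block decomposition that a BlockFFT / Butterfly decomposition computes: a length-N FFT with N = N1 * N2 evaluated as two batches of short FFTs plus a pointwise twiddle multiply, checked against torch.fft.fft. In a learnable variant like the one long_conv.py is described as supporting, the short DFTs and twiddle factors would presumably be materialized as matrices and stored as nn.Parameters; this sketch keeps them fixed.

```python
import torch

def block_fft(x: torch.Tensor, n1: int, n2: int) -> torch.Tensor:
    """Length-(n1*n2) FFT over the last dim of x via the four-step
    Cooley-Tukey decomposition (illustrative; not the repo's BlockFFT)."""
    n = n1 * n2
    assert x.shape[-1] == n
    # View the signal as an (n1, n2) matrix: A[i1, i2] = x[i1 + n1 * i2].
    a = x.reshape(*x.shape[:-1], n2, n1).transpose(-1, -2)
    # Short FFTs of length n2 along the i2 axis.
    a = torch.fft.fft(a, dim=-1)
    # Pointwise twiddle factors exp(-2*pi*j * i1 * i2 / n).
    i1 = torch.arange(n1).unsqueeze(-1)
    i2 = torch.arange(n2).unsqueeze(0)
    ang = -2 * torch.pi * (i1 * i2).float() / n
    a = a * torch.polar(torch.ones_like(ang), ang)
    # Short FFTs of length n1 along the i1 axis, then flatten:
    # the (k1, k2) entry is output frequency bin n2 * k1 + k2.
    a = torch.fft.fft(a, dim=-2)
    return a.reshape(*x.shape[:-1], n)

x = torch.randn(4, 64, dtype=torch.complex64)
assert torch.allclose(block_fft(x, 8, 8), torch.fft.fft(x, dim=-1), atol=1e-3)
```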
These are correct. We are cleaning up the code/algorithms for the fast block FFT and the three-pass algorithm into a single package; this repository is focused on the architecture pieces for now. Will update this issue when released!
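For reference, here is a minimal unfused sketch in plain PyTorch (mine, with illustrative names; not fftconv_cuda.cu) of the computation an FFT convolution kernel performs: forward FFTs, a pointwise multiply, and an inverse FFT, zero-padded so the convolution is causal rather than circular. A fused kernel computes the same result while keeping intermediates in on-chip memory, and a block FFT changes how each FFT is evaluated, not what it returns.

```python
import torch

def fftconv_ref(u: torch.Tensor, k: torch.Tensor) -> torch.Tensor:
    """Causal depthwise convolution of u (batch, channel, L) with a
    per-channel filter k (channel, L), computed via padded FFTs."""
    seqlen = u.shape[-1]
    fft_size = 2 * seqlen                       # zero-pad to avoid circular wrap-around
    u_f = torch.fft.rfft(u, n=fft_size)         # FFT of the input
    k_f = torch.fft.rfft(k, n=fft_size)         # FFT of the filter
    y = torch.fft.irfft(u_f * k_f, n=fft_size)  # pointwise multiply, inverse FFT
    return y[..., :seqlen]                      # keep the causal part

u = torch.randn(2, 4, 1024)
k = torch.randn(4, 1024)
y = fftconv_ref(u, k)   # shape (2, 4, 1024)
```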
Thanks for verifying :) When can I expect this performance update? Will it happen anytime soon?
Hopefully soon! I’ve been traveling for a bit, but have some time to code
again soon.