Fast Fourier Transform #313
I'll change the base branch of this PR to |
ad0918b
to
7fc0f11
LGTM. Let me know if you want to apply the minor comments, or if you'd like to merge this.
I'd also like to better understand the data flow in this algorithm to see if we could eliminate some of the unsafe indexing, but we can also do that later.
def listToTable ((AsList n xs): List a) : (Fin n)=>a = xs

def odd_sized_palindrome (mid:a) (seq:n=>a) :
  ({backward:n | mid:Unit | zforward:n}=>a) =  -- Alphabetical order matters here.
Oh... Right. I think that the order being alphabetical is a bit of a coincidence and in reality it is left "implementation defined"? I guess that this is fine for now, although things like that might deserve their own user-space defined index sets.
I agree this is bad, and when we get user-space defined index sets I'd like to do what you suggest.
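To illustrate why the field order matters, here is a rough Python sketch of the idea. It assumes (the excerpt does not show the function body) that the `backward` part holds the reversed sequence, `mid` holds the single middle element, and `zforward` holds the sequence itself, and that the parts are laid out in alphabetical order of their field names. The `names` parameter is hypothetical and exists only to show what a different spelling would do.

```python
def odd_sized_palindrome(mid, seq, names=("backward", "mid", "zforward")):
    """Build reversed(seq) + [mid] + seq, with parts laid out by sorted field name.

    `names` gives the (hypothetical) field names for the backward, middle,
    and forward parts, in that logical order.
    """
    backward_name, mid_name, forward_name = names
    parts = {
        backward_name: list(reversed(seq)),
        mid_name: [mid],
        forward_name: list(seq),
    }
    out = []
    # Assumption: the layout follows alphabetical order of the field names.
    for name in sorted(parts):
        out.extend(parts[name])
    return out


palindrome = odd_sized_palindrome(1, [2, 3, 4, 5])
# Naming the last field "forward" would sort it before "mid" and break the layout.
misordered = odd_sized_palindrome(1, [2, 3, 4, 5], names=("backward", "mid", "forward"))
```

Under this assumption, the `z` prefix is what pushes the forward part after `mid`.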
examples/fft.dx (outdated)

def reflect (i:n) : n =
  s = size n
  (s - 1 - ordinal i)@n
nit: `unsafeFromOrdinal` should be good here
Fixed. I also got rid of all the rest of the `@` in this file. Surprisingly to me, it made compilation much faster.
I haven't looked at the IR in quite a while, but when I did recently I realized that since we've made errors use IO for printing in the frontend, any invocation of `@` got really complicated. cc @dougalm
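As a small illustration of why an unchecked conversion is safe for `reflect`, here is a Python sketch (not Dex) of the same index arithmetic. The function name mirrors the snippet above; the explicit bounds check on the input stands in for the guarantee that `i` already belongs to the index set, which is an assumption of this sketch.

```python
def reflect(i, n):
    """Map index i in [0, n) to its mirror image n - 1 - i."""
    if not 0 <= i < n:
        raise IndexError(f"index {i} is outside [0, {n})")
    # For 0 <= i < n we have 0 <= n - 1 - i < n, so the result never
    # needs a second bounds check.
    return n - 1 - i


reflected = [reflect(i, 5) for i in range(5)]
```

Because the result is always in range whenever the input is, a checked conversion here can only add an error path that is never taken.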
examples/fft.dx (outdated)

for i:(Fin log2n).
  ipow2 = intpow 2 (ordinal i)
  copy = get refOuter
  refOuter := yieldAccum (AddMonoid Complex) \ref.
The updates to `State` should be atomic, so you can keep `get`ting from `refOuter` in the body FWIW.
Wow, I didn't realize that. Thanks for the tip, it made the code shorter, avoided an extra copy, and made it a bit more readable, imo.
I addressed the ones I could. I'd like to merge now, I think.
There are 4 remaining unsafe indexes besides |
for i:(Fin log2n).
  ipow2 = intpow 2 (ordinal i)
  xRef := yieldAccum (AddMonoid Complex) \bufRef.
    for j:(Fin halfn).  -- Executes in parallel.
You wish 😆 It reads from a stateful reference, so I doubt we would ever try to parallelize this. Of course we could (and should) optimize it so that reads are ok, but it's not implemented at the moment!
Sorry, I think it's an effect of me advocating for moving the `copy` part into the loop 😕
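For readers following the loop structure in this thread, here is a rough Python sketch of a radix-2 FFT with the same shape: `log2n` outer stages, each accumulating `halfn` butterflies into a fresh zero-initialized buffer. The butterfly indexing follows the Stockham scheme, which is an assumption on my part since the excerpt does not show the loop body. The variable names echo the snippet; `fft_pow2` and `naive_dft` are hypothetical names.

```python
import cmath


def fft_pow2(x):
    """Forward FFT of a sequence whose length is a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("input length must be a power of two")
    log2n = n.bit_length() - 1
    halfn = n // 2
    x = [complex(v) for v in x]
    for i in range(log2n):
        ipow2 = 2 ** i
        # Fresh buffer per stage, updated only by addition (like an accumulator).
        buf = [0j] * n
        for j in range(halfn):
            # Each j writes to two slots that no other j touches in this stage.
            angle = -2 * cmath.pi * (j % ipow2) / (2 * ipow2)
            v0 = x[j]
            v1 = x[j + halfn] * cmath.exp(1j * angle)
            dst = (j // ipow2) * ipow2 * 2 + (j % ipow2)
            buf[dst] += v0 + v1
            buf[dst + ipow2] += v0 - v1
        x = buf
    return x


def naive_dft(x):
    """O(n^2) reference DFT, used only to check the sketch."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * k * m / n) for k in range(n))
        for m in range(n)
    ]
```

In this sketch the inner loop reads only from the previous stage's array and adds into disjoint slots of the new buffer, which is the property that would make it order-independent.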
The inner loop is now parallelizable by the compiler, since it uses `yieldAccum` over `AddMonoid Complex`.

It would be nice to be able to enforce that the input size is a power of two using the type system. I worked out how to do this in a way that typechecks: `((Fin pow)=>m)=>a` expresses arrays of size `m^pow`. But this crashes currently due to #146.

At some point, I'm hoping to make a blog post comparing it to the Futhark FFT example that I initially based this one off of. The main emphasis would be on the additional safety that comes from types (and maybe more generality once `Complex` can be parameterized by its floating-point representation). But the coding style is also very different - just like when writing the fluidsim, I originally started writing lots of helper functions for vectorized operations, like `gather`, `scatter`, and different flavors of `zip`/`unzip`. But then I realized it would be easier just to write fused for loops. So the code for the same function ends up looking very different.

Misc notes:
`@`s to marked unsafe one-liners, following @apaszke's style, or remove them by using tables as index sets.
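To make the `((Fin pow)=>m)=>a` idea above concrete, here is a Python sketch of indexing an array by a length-`pow` tuple of digits, each drawn from a set of size `m`. There are exactly `m^pow` such tuples, so an array indexed this way has a power-of-`m` size by construction. The function names and the row-major flattening order are assumptions for illustration, not how Dex orders such an index set.

```python
from itertools import product


def digit_indices(m, pow_):
    """All index tuples of length pow_ with each digit in range(m)."""
    return list(product(range(m), repeat=pow_))


def flat_ordinal(digits, m):
    """Flatten a digit tuple to an integer, most significant digit first."""
    ordinal = 0
    for d in digits:
        ordinal = ordinal * m + d
    return ordinal


indices = digit_indices(2, 3)
ordinals = [flat_ordinal(d, 2) for d in indices]
```

With `m = 2`, each index is a tuple of bits, so the array size is a power of two without any runtime check.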