Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add FFT (fast fourier transform) operator #218

Merged
merged 12 commits into from
Aug 22, 2023
Merged

Add FFT (fast fourier transform) operator #218

merged 12 commits into from
Aug 22, 2023

Conversation

achoum
Copy link
Collaborator

@achoum achoum commented Aug 10, 2023

Commonly used operation in TS processing.

Open question:

  • The window size is expressed in number of events (instead of time duration). The name of this argument is num_events. FFT is the first operation like that, and we can probably expect a few more. Q: How to make sure the user does not use "num_events" as a window_length.

@achoum achoum marked this pull request as ready for review August 11, 2023 00:06
Copy link
Collaborator

@DonBraulio DonBraulio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really great!
Left some comments that maybe should be tackled in future PRs.

temporian/core/operators/fft.py Outdated Show resolved Hide resolved
if window is not None:
selected_src_values = selected_src_values * window
fft_res = np.fft.fft(selected_src_values)[:num_spectral_lines]
ft_ampl = np.abs(fft_res).astype(np.float32)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to have an option to return real and imaginary parts instead of magnitude only (phase information is lost).

Maybe we can add the arg. in a future PR. But we have to consider that the number of output features changes, and naming too. Thoughts?

Copy link
Collaborator Author

@achoum achoum Aug 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for phrase / real / img parts. There are multiple options available e.g., return multiple event-sets, return a dataclass of event set with the corresponding fields, having a input arg to specify what the output should be, rename the function "rfft", etc.

While we don't have complex operators like that for now, we will likely have some in the future. How to address this case is a great brainstorming question :)

temporian/core/operators/fft.py Outdated Show resolved Hide resolved

The operator returns the amplitude of each spectral line as
separate tp.float32 features named "a0", "a1", "a2", etc. By default,
`num_events // 2` spectral line are returned.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually, the FFT only makes sense with uniform sampling.
I'm not sure but maybe we should add a note here, or introduce a resample or check_sampling arguments. Any thoughts on this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gonzalo found out there's a non-uniform fft:
https://www.mathworks.com/help/matlab/ref/double.nufft.html#mw_f93dca1a-78a2-4dd4-8b98-84462b3233c3

In python, there's this:
https://jyhmiinlin.github.io/pynufft/overview/pynufft.html

I didn't know about any of that but seems exciting! Let's talk about it tomorrow.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I propose:

  • I'll rename the op from "fft" to "fast_fourier_transform" and require for "num_events" to be passed by wargs.
  • In a next PR, I'll implement the "fourier_transform" operator (that support non-uniform sampling).
  • Once we have a solution from the brainstorming, in a next PR, I'll make the update.

Does this make sense?

num_events: int,
window: Optional[str] = None,
num_spectral_lines: Optional[int] = None,
) -> EventSetOrNode:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most common case is to have a hop_size parameter as well, usually a half or quarter of window_size.
Maybe it should be added in a future PR but the num_events name sounds weird to me. Maybe window_events or window_n? The fact that it's an int already makes pretty clear that this is not duration in seconds.

Copy link
Collaborator Author

@achoum achoum Aug 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Adding "hop_size" argument.
  • For the window_size, I want to make it super clear that this is not a window_length (both for code writer and readers). While I don't think "num_events" is very good, I think "window_size" is too close to "window_length". However, if we use "hop_size", "window_size" make sense...

WDYT?

@achoum achoum merged commit 7bd7a91 into main Aug 22, 2023
12 checks passed
@achoum achoum deleted the gbm_fft branch August 22, 2023 15:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants