It looks like there is only an ANSI C implementation of dsps_bit_rev_sc16 at this time. By contrast, dsps_bit_rev2r_fc32 has an optimized implementation, even though the fc32 FFT itself is not accelerated, whereas the sc16 one is.
Given the optimized dsps_fft2r_sc16 on the S3, the lack of an optimized bit reversal function blocks any effective, fast fixed-point FFT on the S3. Currently, if the first step of the FFT takes time T, the final (ANSI) bit reversal alone takes 2T.
Please consider adding one. I am not sure about the effort, but maybe it is similar to the floating-point version with just type changes?
P.S. There is a similar issue with dsps_cplx2real_sc16, but it is lower priority.
tom-borcin changed the title from "No optimized dsps_bit_rev_sc16 resulting slow sc16 S3 FFT performance overall" to "No optimized dsps_bit_rev_sc16 resulting slow sc16 S3 FFT performance overall" on Apr 3, 2023
github-actions bot changed the title from "No optimized dsps_bit_rev_sc16 resulting slow sc16 S3 FFT performance overall" to "No optimized dsps_bit_rev_sc16 resulting slow sc16 S3 FFT performance overall (DSP-98)" on Apr 3, 2023
In the float version of the FFT we have a table-based implementation of bit reversal, and that's why it's faster.
We will consider adding the same functionality for the int16 version.
For dsps_cplx2real_sc16 the situation is different. The ANSI version and an asm version would not differ as much as they do for the FFT or other functions, which is why we have only an ANSI version of this function.
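For readers unfamiliar with the table-based approach mentioned above, here is a minimal sketch in plain C of how a precomputed swap table could drive bit reversal on interleaved int16 complex data. This is not the esp-dsp implementation: the function names build_bitrev_pairs and bitrev_sc16_with_table, the table layout (pairs of uint16_t indices), and the interleaved re/im buffer layout are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of esp-dsp): fill `table` with index pairs
 * (i, j) where j is the bit-reversed value of i and i < j.
 * `n` is the number of complex points and must be a power of two.
 * `table` needs room for at most n entries (n/2 pairs).
 * Returns the number of pairs written. */
static size_t build_bitrev_pairs(uint16_t *table, size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0 && n <= 65536);

    size_t bits = 0;
    while (((size_t)1 << bits) < n) {
        bits++;
    }

    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        size_t j = 0;
        for (size_t b = 0; b < bits; b++) {
            if (i & ((size_t)1 << b)) {
                j |= (size_t)1 << (bits - 1 - b);
            }
        }
        if (i < j) { /* store each swap once; skip self-mapped indices */
            table[2 * count] = (uint16_t)i;
            table[2 * count + 1] = (uint16_t)j;
            count++;
        }
    }
    return count;
}

/* Hypothetical helper (not part of esp-dsp): reorder `data`, assumed to hold
 * interleaved complex int16 samples (re0, im0, re1, im1, ...), by swapping
 * the complex points listed in `table`. No index math happens at run time. */
static void bitrev_sc16_with_table(int16_t *data, const uint16_t *table,
                                   size_t pairs)
{
    for (size_t k = 0; k < pairs; k++) {
        size_t i = table[2 * k];
        size_t j = table[2 * k + 1];
        int16_t re = data[2 * i];
        int16_t im = data[2 * i + 1];
        data[2 * i] = data[2 * j];
        data[2 * i + 1] = data[2 * j + 1];
        data[2 * j] = re;
        data[2 * j + 1] = im;
    }
}
```

The table would be built once per FFT size (or stored as a constant), so the per-transform cost is only the swaps themselves. Whether this closes the gap to the optimized dsps_fft2r_sc16 on the S3 would need to be measured on the target.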