Skip to content
Phil Schatzmann edited this page Jul 17, 2024 · 90 revisions

The Fast Fourier Transform (FFT) is a mathematical operation that changes the domain (x-axis) of a signal from time to frequency. This is particularly useful for determining the frequency of a signal or for decomposing a signal consisting of multiple frequencies. In other words: you feed an audio signal and get the frequencies and the corresponding strength as a result. That's very handy e.g. if you want to determine the related musical notes that are played.

Length and Stride

The length defines the number of samples that are used to run the FFT. The length must be a value of the power of 2 (e.g. 1024). This parameter impacts

  • the calculation speed and frequency of calculation
  • the memory which is needed
  • the frequency resolution of the result (number of bins)

The length is determining the number of frequency bins with the following formula: frequency bins = length / 2. A big length is consuming a lot of memory and uses more processing time, so make sure that you keep the length as small as possible: 512 might be a good starting value!

The stride defines the number of samples that we use to advance at each step. If we do not define any stride we just consume the samples sequentially: FFT1

If we define a stride, we move ahead in steps as follows

FFT1

As you can see in the image above the last length-stride samples are reprocessed with each step.

RealFFT

In our framework we can use the AudioRealFFT class as copy destination. You start the processing by calling the begin method which is expecting configuration information where we define the length and stride. We also need to provide the regular audio parameters (channels, sample_rate, bits_per_sample). Finally we can define a callback method which will be called when we get a new result.

In order to determine the result

  • we can use the result() method which provides the best AudioFFTResult or
  • if we want to get the N best values we can call the resultArray() method to which we pass an array of AudioFFTResult.
  • The magnitudes() method returns the array of all magnitudes and has the length that can be determinded by the size() method. You can determine the corresponding frequencies by calling frequency(int idx)
  • If you are limited on memory you can loop from 0 to size() and call the magnitude(int idx) and frequency(int idx) methods instead.

The AudioFFTResult contains

  • the frequency
  • the magnitude (of the frequency)
  • the musical note which is corresponding to the frequency (frequencyAsNote())

An example can be found on Github

Window Functions

FFT windows reduce the effects of leakage but can not eliminate leakage entirely. In effect, they only change the shape of the leakage. In addition, each type of window affects the spectrum in a slightly different way. For further details, I recommend to consult Wikipedia. We support the the following Window Functions, but you can easily add your own subclasses. If you have enough memory, I recommend that you use the buffered implementation:

RealFFT fft;
auto cfg = fft.defaultConfig();
// buffered
cfg.window_function = new BufferedWindow(new Hamming());
// not buffered
cfg.window_function = new Hamming();

Other FFT Implementation

Alternatively you can use the KissFFT, ESP32-FFT which you will need to install separately: Some additional implementations are based on ARM CMSIS DSP and Espressif DSP Library which are part of the corresponding Arduino implementations.


Ext Library Include Class Name Comment
n/a AudioLibs/AudioRealFFT.h AudioRealFFT included in AudioTools
KissFFT Library AudioLibs/AudioKissFFT.h AudioKissFFT
ESP32-FFT AudioLibs/AudioESP32FFT.h AudioESP32FFT
n/a AudioLibs/AudioEspressifFFT.h AudioEspressifFFT included in Arduino (esp-dsp)
n/a AudioLibs/AudioCmsisFFT.h AudioCmsisFFT Included in Arduino RP2040

Or you can easily integrate your own implementation. Just have a look how the above includes have been implemented.

Performance

I created some test cases to measure the speed of FFT. The table below gives the speed in ms of fft with 4096 samples:


ESP32 ESP32-S3 STM32F411 RP2040 STM32H743
AudioRealFFT 3.3 2.6 12.1 71.0 1.1
AudioKissFFT 5.9 n/a 26.9 20.5 2.8
AudioESP32FFT 1.1 1.25 5.9 68.2 1.0
AudioCmsisFFT n/a n/a 8.2 86.1 1.0
AudioEspressifFFT 3.5 3.2 n/a n/a n/a

Please note, that the performance of the different libraries dependes on the sample size. AudioRealFFT seems to perform better, with bigger values!

Further Information

Clone this wiki locally