FFT

The Fast Fourier Transform (FFT) is a mathematical operation that changes the domain (x-axis) of a signal from time to frequency. This is particularly useful for determining the frequency of a signal or for decomposing a signal consisting of multiple frequencies. In other words: you feed an audio signal and get the frequencies and the corresponding strength as a result. That's very handy e.g. if you want to determine the related musical notes that are played.

Length and Stride

The length defines the number of samples that are used to run the FFT. The length must be a value of the power of 2 (e.g. 1024). This parameter impacts

the calculation speed and frequency of calculation
the memory which is needed
the frequency resolution of the result (number of bins)

The length is determining the number of frequency bins with the following formula: frequency bins = length / 2. A big length is consuming a lot of memory and uses more processing time, so make sure that you keep the length as small as possible: 512 might be a good starting value!

The stride defines the number of samples that we use to advance at each step. If we do not define any stride we just consume the samples sequentially: FFT1

If we define a stride, we move ahead in steps as follows

As you can see in the image above the last length-stride samples are reprocessed with each step.

RealFFT

In our framework we can use the AudioRealFFT class as copy destination. You start the processing by calling the begin method which is expecting configuration information where we define the length and stride. We also need to provide the regular audio parameters (channels, sample_rate, bits_per_sample). Finally we can define a callback method which will be called when we get a new result.

In order to determine the result

we can use the result() method which provides the best AudioFFTResult or
if we want to get the N best values we can call the resultArray() method to which we pass an array of AudioFFTResult.
The magnitudes() method returns the array of all magnitudes and has the length that can be determinded by the size() method. You can determine the corresponding frequencies by calling frequency(int idx)
If you are limited on memory you can loop from 0 to size() and call the magnitude(int idx) and frequency(int idx) methods instead.

The AudioFFTResult contains

the frequency
the magnitude (of the frequency)
the musical note which is corresponding to the frequency (frequencyAsNote())

An example can be found on Github

Window Functions

FFT windows reduce the effects of leakage but can not eliminate leakage entirely. In effect, they only change the shape of the leakage. In addition, each type of window affects the spectrum in a slightly different way. For further details, I recommend to consult Wikipedia. We support the the following Window Functions, but you can easily add your own subclasses. If you have enough memory, I recommend that you use the buffered implementation:

RealFFT fft;
auto cfg = fft.defaultConfig();
// buffered
cfg.window_function = new BufferedWindow(new Hamming());
// not buffered
cfg.window_function = new Hamming();

Channels

FFT can only be executed on one channel. If you provide autio with multiple channels you need to indicate which channel should be used to analyse the data. By default we use the first channel (= 0)

cfg.channel_used = 0;

Other FFT Implementation

Alternatively you can use the KissFFT, ESP32-FFT which you will need to install separately: Some additional implementations are based on ARM CMSIS DSP and Espressif DSP Library which are part of the corresponding Arduino implementations.

Ext Library	Include	Class Name	Comment
n/a	AudioLibs/AudioRealFFT.h	AudioRealFFT	included in AudioTools
KissFFT Library	AudioLibs/AudioKissFFT.h	AudioKissFFT
ESP32-FFT	AudioLibs/AudioESP32FFT.h	AudioESP32FFT
n/a	AudioLibs/AudioEspressifFFT.h	AudioEspressifFFT	included in Arduino (esp-dsp)
n/a	AudioLibs/AudioCmsisFFT.h	AudioCmsisFFT	Included in Arduino RP2040

Or you can easily integrate your own implementation. Just have a look how the above includes have been implemented.

Performance

I created some test cases to measure the speed of FFT. The table below gives the speed in ms of fft with 4096 samples:

	ESP32	ESP32-S3	STM32F411	RP2040	STM32H743
AudioRealFFT	3.3	2.6	12.1	71.0	1.1
AudioKissFFT	5.9	n/a	26.9	20.5	2.8
AudioESP32FFT	1.1	1.25	5.9	68.2	1.0
AudioCmsisFFT	n/a	n/a	8.2	86.1	1.0
AudioEspressifFFT	3.5	3.2	n/a	n/a	n/a

Please note, that the performance of the different libraries dependes on the sample size. AudioRealFFT seems to perform better, with bigger values!

Inverse FFT

The inverse FFT is generating audio samples in the time domain from the spektrum information. This functionality is executed automatically when we read the data from the fft source: The spektrum information can be set in the callback.

This implementation also supports a stride and a window function: the samples are combined using the overlap add method.

Here is an example that generates a tone across the full spektrum:

#include "AudioTools.h"
#include "AudioTools/AudioLibs/AudioRealFFT.h" // using RealFFT

AudioInfo info(44100, 2, 16);
AudioRealFFT afft; // or AudioKissFFT
//CsvOutput<int16_t> out(Serial);
I2SStream out;
StreamCopy copier(out, afft);
int bin_idx = 0;

// privide fft data
void fftFillData(AudioFFTBase &fft) {
  fft.clearBins();
  FFTBin bin{1.0f,1.0f};
  fft.setBin(bin_idx, bin);

  // restart from first bin
  if (++bin_idx>=fft.size()) bin_idx = 0;
}

void setup() {
  AudioToolsLogger.begin(Serial, AudioToolsLogLevel::Warning);

  // Setup FFT
  auto tcfg = afft.defaultConfig(RX_MODE);
  tcfg.copyFrom(info);
  tcfg.length = 1024;
  tcfg.callback = fftFillData;
  afft.begin(tcfg);

  // setup output
  auto ocfg = out.defaultConfig(TX_MODE);
  ocfg.copyFrom(info);
  out.begin(ocfg);
}

void loop() { copier.copy(); }

We just copy the data from the fft source to the i2s sink: the fftFillData() callback is just setting the bin that represents the frequency. At each call the bin (and frequency) is increased.

Uh oh!

FFT

Length and Stride

RealFFT

Window Functions

Channels

Other FFT Implementation

Performance

Inverse FFT

Further Information

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally