Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disabled SDCARD as the default mode for Feather board, CODEC support Added. Updated weight… #266

Merged
merged 20 commits into from
Nov 30, 2022
Merged
Show file tree
Hide file tree
Changes from 14 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 8 additions & 1 deletion .github/workflows/clang-format-check.yml
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ jobs:
CLANG_VERSION=14
cat .clang-format

CHANGE_FILES=$(git diff --ignore-submodules --name-only remotes/origin/main '*.c' '*.h' ':!*_regs.h' ':!Libraries/FCL' ':!Libraries/FreeRTOS' ':!Libraries/lwIP' ':!Libraries/littlefs' ':!Libraries/FreeRTOS-Plus' ':!Libraries/LC3' ':!Libraries/SDHC' ':!Libraries/MAXUSB')
CHANGE_FILES=$(git diff --ignore-submodules --name-only remotes/origin/main '*.c' '*.h' ':!*_regs.h' ':!*weights.h' ':!*cnn.h' ':!*cnn.c' ':!*sampledata.h' ':!*sampleoutput.h' ':!*softmax.c' ':!Libraries/FCL' ':!Libraries/FreeRTOS' ':!Libraries/lwIP' ':!Libraries/littlefs' ':!Libraries/FreeRTOS-Plus' ':!Libraries/LC3' ':!Libraries/SDHC' ':!Libraries/MAXUSB')
CHECK_FAIL=0

set +e
Expand All @@ -64,10 +64,17 @@ jobs:
clang-format-${CLANG_VERSION} --verbose --style=file -n -Werror ${change_file}
if [ $? != 0 ];
then
echo ""
echo "===================================================="
echo "FAIL: ${change_file}"
echo "----------------------------------------------------"
echo "Expected:"
clang-format-${CLANG_VERSION} --style=file ${change_file} > ${change_file}.clang
diff -u --color=always ${change_file} ${change_file}.clang
rm ${change_file}.clang
CHECK_FAIL=1
echo "===================================================="
echo ""
fi
fi
done
Expand Down
50 changes: 31 additions & 19 deletions Examples/MAX78000/CNN/kws20_demo/README.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# MAX78000 Keyword Spotting Demo v.3.1
# MAX78000 Keyword Spotting Demo v.3.2



Expand All @@ -18,15 +18,17 @@ The following 20 keyword subset from the complete dataset is used for this demo:

['**up', 'down', 'left', 'right', 'stop', 'go', 'yes', 'no', 'on', 'off', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'zero**']

Rest of keywords and unrecognized words fall into "**Unknown**" category.
The rest of the keywords and unrecognized words fall into the "**Unknown**" category.

To improve the unknown detection, the model in version 3.2 is trained with an additional 100hrs speech data from LibriSpeech (https://www.openslr.org/resources/12/train-clean-100.tar.gz), segmented to 1-sec audio data and labeled as unknown.



## Keyword Spotting Demo Software

### Building firmware

Navigate directory where KWS20 demo software is located and build the project:
Navigate the directory where the KWS20 demo software is located and build the project:

```bash
$ cd /Examples/MAX78000/CNN/kws20_demo
Expand Down Expand Up @@ -137,9 +139,9 @@ If using Linux, perform this step:

### MAX78000 Feather operations

The KWS20 demo starts automatically after power-up or pressing reset button (SW4).
The KWS20 demo starts automatically after power-up or pressing the reset button (SW4).
The TFT display is optional and not supplied with the MAX78000 Feather board.
User should use PC terminal program to observe KWS20 demo result as described in "Using Debug Terminal" section.
Users should use PC terminal program to observe KWS20 demo result as described in the "Using Debug Terminal" section.

The MAX78000 Feather compatible 2.4'' TFT FeatherWing display can be ordered here:

Expand Down Expand Up @@ -327,15 +329,27 @@ KWS20 demo works in two modes: Using microphone (real-time), or offline process

### Microphone Mode

In this mode, EVKIT I2S Mic is initialized to operate at 16KHz 32-bit samples. In the main loop, I2S buffer is checked and sampled are stored into **pChunkBuff** buffer.
In this mode, EVKIT I2S Mic is initialized to operate at 16KHz 32-bit samples. In the main loop, I2S buffer is checked and samples are stored in **pChunkBuff** buffer.

### CODEC Mode

In this mode, the left channel (tip of the J5 3.5mm audio jack) of the line-in of MAX9867 audio CODEC (is used as the audio input source.

To enable using CODEC as the audio input source, make sure the `PROJ_CFLAGS` line is uncommented. This mode can only be enabled for the Feather board.

```make
# If enabled, it captures audio from line input of MAX9867 audio codec instead of the on-board mic.
# Note that SEND_MIC_OUT_SDCARD should be disabled in this mode
PROJ_CFLAGS+=-DENABLE_CODEC_MIC
```

### Offline Mode

if **ENABLE_MIC_PROCESSING** is not defined, a header file containing the 16-bit samples (e.g. **kws_five.h**) should be included in the project to be used as the input . To create a header file from a wav file, use included utilities to record a wav file and convert it to header file.

```bash
# record 3sec of 16-bit 16KHz sampled wav file
$ python VoiceRecorder.py -d 3 -o voicefile.wav
# record 1sec of 16-bit 16KHz sampled wav file
$ python VoiceRecorder.py -d 1 -o voicefile.wav
# convert to header
$ python RealtimeAudio.py -i voicefile.wav -o voicefile.h
```
Expand All @@ -360,8 +374,7 @@ The LED

- stays *green* when it is listening
- blinks *green* if a keyword is detected
- blinks *red* if an unknown keyword is detected
- blinks *yellow* if detection confidence is low
- blinks *yellow* if detection confidence is low or unknown keyword
- stays *red* if there is an error in the SD card interface

A utility (`bin2wav.py`) is provided in the `/Utility` folder to convert these files into wave (.wav) format to listen to.
Expand All @@ -371,7 +384,7 @@ To convert individual files:
$ python bin2wav.py -i <sound snippet file>
```

To convert all the files in current directory and all subdirectories:
To convert all the files in the current directory and all subdirectories:

```bash
$ python bin2wav.py -a
Expand All @@ -383,21 +396,21 @@ To convert all the files in a directory and all its subdirectories:
$ python bin2wav.py -a -d <folder name>
```

When option `-a` is used, each file is converted to wave file once and subsequent execution of the command skips all the files that have previously converted to wave files.
When option `-a` is used, each file is converted to wav file once and subsequent execution of the command skips all the files that have previously been converted to wave files.

***Note 1: When `SEND_MIC_OUT_SDCARD` is selected, the Wake Up Timer (WUT) is disabled.***

***Note 2: When `SEND_MIC_OUT_SDCARD` is selected, the `ENABLE_TFT` is disabled regardless of make options.***

### KWS20 Demo Firmware Structure

Following figure shows the processing in KWS20 Demo firmware:
The following figure shows the processing in KWS20 Demo firmware:

![](Resources/KWS_Demo_flowchart.png)

Collected samples from mic/file are 18/16 bit signed and are converted to 8 bit signed to feed into CNN. If Microphone mode, a high pass filter is used to filter out the DC level in captured samples. Scaled samples are stored in **pPreambleCircBuffer** circular buffer in chunks of 128 samples (bytes).
Collected samples from mic/file are 18/16 bit signed and are converted to 8-bit signed to feed into CNN. If Microphone mode, a high pass filter is used to filter out the DC level in captured samples. Scaled samples are stored in **pPreambleCircBuffer** circular buffer in chunks of 128 samples (bytes).

Following parameters in the firmware can be tuned:
The following parameters in the firmware can be tuned:

```c
#define SAMPLE_SCALE_FACTOR 4 // multiplies 16-bit samples by this scale factor before converting to 8-bit
Expand All @@ -408,17 +421,16 @@ Following parameters in the firmware can be tuned:
#define INFERENCE_THRESHOLD 49 // min probability (0-100) to accept an inference
```

When the average absolute values of samples during last 128 number of samples goes above a threshold, the beginning of a word is marked.
When the average absolute values of samples during the last 128 samples go above a threshold, the beginning of a word is marked.

The end of a word is signaled when the **SILENCE_COUNTER_THRESHOLD** back to back chunks of samples with average absolute threshold lower than **THRESHOLD_LOW** is observed.
The end of a word is signaled when the **SILENCE_COUNTER_THRESHOLD** back-to-back chunks of samples with an average absolute threshold lower than **THRESHOLD_LOW** is observed.

The CNN requires 1sec worth of samples (128*128) to start processing. This window starts at **PREAMBLE_SIZE** samples prior to the beginning of the word, and ends after 16384 samples. If the end of a word is determined earlier, the pAI85Buffer sample buffer is padded with zeros.

The CNN related API functions are in **cnn.c**. They are used to load weights and data, start CNN, wait for CNN to complete processing and unload the result.
The CNN-related API functions are in **cnn.c**. They are used to load weights and data, start CNN, wait for CNN to complete processing, and unload the result.

If a new network is developed and synthesized, the new weight file and related API functions are needed to be ported from automatically generated kws20 example project. Furthermore, if the input layer or organization of 128x128 sample sets in the trained network is changed, **AddTranspose()** function should be changed to reflect the new sample data arrangement in CNN memory.

### References

https://github.com/MaximIntegratedAI/MaximAI_Documentation

Binary file removed Examples/MAX78000/CNN/kws20_demo/README.pdf
Binary file not shown.
Loading