BLE Audio Classifier

This application uses TensorFlow Lite for Microcontrollers to run a machine learning model that classifies keywords in audio recorded from a microphone. Detections are visualized using the LEDs on the board, and the classification results are written to the VCOM serial port. Additionally, the classification results are transmitted to a connected Bluetooth Low Energy (BLE) client.

NOTE: Currently, this application only supports the BRD2601 embedded platform.

Pac-Man Demo

A webpage demo has been created to work with this example application (a live demo is available online). The webpage connects to the development board via Bluetooth Low Energy (BLE). Once connected, the application uses machine learning to detect the following keywords:

  • Left

  • Right

  • Up

  • Down

  • Stop

  • Go

When a keyword is detected, the result is sent to the webpage via BLE which then moves the Pac-Man on the webpage accordingly.

The source code for the webpage can be found in the MLTK repository at: cpp/shared/apps/ble_audio_classifier/web/pacman.

Behavior

Upon startup, the application begins advertising the following service via Bluetooth Low Energy (BLE):

  • Device Name: MLTK KWS

  • Service UUID: c20ffe90-4ed4-46b9-8f6c-ec143fce3e4e

The service has one characteristic with the Read and Notify properties. The content of this characteristic is a string with the format:

<detect-class-id>,<confidence>

Where:

  • detect-class-id - the class ID of the detected keyword

  • confidence - the probability score of the detected keyword as a uint8 (i.e. 255 = highest confidence that the correct keyword was detected)

The BLE client is notified whenever a keyword is detected.

NOTE: Keyword detection is only active when a BLE client is connected.
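
As an illustration of the characteristic contents, the sketch below shows how firmware could build and send such a value. This is a minimal sketch, not the application's actual source: send_ble_notification() is a hypothetical placeholder for the Gecko SDK Bluetooth notification call.

#include <cstdint>
#include <cstdio>

// Hypothetical helper that pushes a characteristic value to the connected BLE client;
// the real application uses the Gecko SDK Bluetooth stack for this.
void send_ble_notification(const char* value, int length);

// Build the "<detect-class-id>,<confidence>" string and notify the client.
void notify_keyword_detection(int detected_class_id, uint8_t confidence)
{
    char buffer[16];
    // Example: class 3 detected with confidence 212/255 produces the string "3,212"
    const int length = snprintf(buffer, sizeof(buffer), "%d,%d", detected_class_id, confidence);
    send_ble_notification(buffer, length);
}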

Additionally, the application uses two LEDs to show detection and activity, and it prints detection results and debug log output to the VCOM serial port. In the application configuration file, ble_audio_classifier_config.h, the user can select which LED to use for activity and which to use for detection. By default, the detection LED is green (led1) and the activity LED is red (led0).
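
For orientation, the LED selection in that header looks conceptually like the snippet below. The macro and instance names are assumptions made for illustration; check the actual ble_audio_classifier_config.h in the project for the exact definitions.

// Illustrative sketch only; the actual macro names and values in
// ble_audio_classifier_config.h may differ.

// LED used to indicate a keyword detection (green/led1 by default)
#define DETECTION_LED  sl_led_led1   // hypothetical mapping to the board's green LED

// LED used to indicate audio activity without a clear detection (red/led0 by default)
#define ACTIVITY_LED   sl_led_led0   // hypothetical mapping to the board's red LED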

At a regular interval, the application performs an inference, and the results are processed to find the average score for each class in the current window. If the top score is higher than a detection threshold, a detection is triggered and the detection LED (green) lights up for about 750 ms.

Once the detection LED turns off, the application goes back to responding to the input data. If the change in model output is greater than a configurable sensitivity threshold, the activity LED (red) blinks for about 500 ms.

The activity LED indicates that audio has been detected on the input and the model output is changing, but no clear classification was made.
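
The sketch below approximates the averaging-and-threshold logic just described. It is a simplified illustration rather than the application's actual source; the window length, threshold value, and names used here are assumptions for the example.

#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Simplified sketch of the detection logic described above; not the application's actual source.
// Each inference produces one score per class. Scores inside the current averaging window are
// accumulated, and the per-class averages are compared against a detection threshold.
struct DetectionResult { int class_id; uint8_t confidence; bool detected; };

static constexpr uint8_t DETECTION_THRESHOLD = 185;  // assumed value, for illustration only
static constexpr size_t  WINDOW_LENGTH       = 10;   // number of inference results averaged

static std::deque<std::vector<uint8_t>> score_history;  // most recent model outputs

DetectionResult process_inference(const std::vector<uint8_t>& class_scores)
{
    // Keep only the results that fall inside the current averaging window
    score_history.push_back(class_scores);
    if (score_history.size() > WINDOW_LENGTH)
        score_history.pop_front();

    // Average each class's score across the window and track the best class
    DetectionResult result{0, 0, false};
    for (size_t c = 0; c < class_scores.size(); ++c)
    {
        uint32_t sum = 0;
        for (const auto& scores : score_history)
            sum += scores[c];
        const uint8_t average = static_cast<uint8_t>(sum / score_history.size());
        if (average > result.confidence)
        {
            result.class_id   = static_cast<int>(c);
            result.confidence = average;
        }
    }

    // A detection is triggered only when the top averaged score exceeds the threshold
    result.detected = result.confidence >= DETECTION_THRESHOLD;
    return result;
}

Averaging over a short window smooths out single spurious inference results, so a keyword must score consistently across several consecutive inferences before a detection is reported.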

In audio classification, it is common to have classes that map to silence or unknown sounds, and these results usually need to be ignored. The audio classifier application filters them out based on the label text: by default, any label that starts with an underscore is ignored when processing results. This behavior can be disabled in the application configuration file.
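
A minimal sketch of this label filtering, assuming the class labels are available as a list of strings (in the application they come from the model parameters):

#include <cstddef>
#include <string>
#include <vector>

// Return the indices of the classes that should be considered when processing results.
// Labels that begin with an underscore (e.g. "_silence_", "_unknown_") are skipped,
// mirroring the default filtering described above.
std::vector<int> usable_class_indices(const std::vector<std::string>& labels,
                                      bool ignore_underscore_labels = true)
{
    std::vector<int> indices;
    for (size_t i = 0; i < labels.size(); ++i)
    {
        if (ignore_underscore_labels && !labels[i].empty() && labels[i][0] == '_')
            continue;  // e.g. "_unknown_" is ignored by default
        indices.push_back(static_cast<int>(i));
    }
    return indices;
}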

Updating the model

The default model used in this application is called keyword_spotting_pacman.tflite and is able to classify audio into 6 different classes labeled “left”, “right”, “up”, “down”, “stop”, “go”. The source for the model can be found here: https://github.com/siliconlabs/mltk/blob/master/mltk/models/siliconlabs/keyword_spotting_pacman.py

The application is designed to work with an audio classification model created using the Silicon Labs Machine Learning Toolkit (MLTK). Use the MLTK to train a new audio classification model, then replace the model in this example with the newly trained one.

via Simplicity Studio

To replace the default model, rename your .tflite file to 1_<your model name>.tflite and copy it into the config/tflite folder of the Simplicity Studio project. (Simplicity Studio sorts the models alphabetically in ascending order; adding 1_ forces the model to come first.) After a new .tflite file is added to the project, Simplicity Studio automatically uses the flatbuffer converter tool to convert the .tflite file into a C file, which is added to the project.

Refer to the online documentation for more details.

via CMake

The model can also be updated when building this application from Visual Studio Code or the CMake Command Line.

To update the model, create/modify the file: <mltk repo root>/user_options.cmake and add:

mltk_set(GECKO_SDK_ENABLE_BLUETOOTH ON)
mltk_set(BLE_AUDIO_CLASSIFIER_MODEL <model name or path>)

where <model name or path> is the file path to your model’s .tflite file or the MLTK model name.

With this variable set, the specified model is built into the ble_audio_classifier application when the application is built.

Build, Run, Debug

See the online documentation for how to build and run this application:

Simplicity Studio

If using Simplicity Studio, select the MLTK - BLE Audio Classifier Project.

Visual Studio Code

If using Visual Studio Code, select the mltk_ble_audio_classifier CMake target.

To build this app, create/modify the file: <mltk repo root>/user_options.cmake and add:

mltk_set(GECKO_SDK_ENABLE_BLUETOOTH ON)

Command-line

If using the command line, select the mltk_ble_audio_classifier CMake target.

To build this app, create/modify the file: <mltk repo root>/user_options.cmake and add:

mltk_set(GECKO_SDK_ENABLE_BLUETOOTH ON)

Model Parameters

In order for the audio classification to work correctly, the same audio feature generator configuration parameters must be used for inference as were used when training the model. When the MLTK is used to train an audio classification model, these parameters are embedded in the metadata section of the .tflite file, and they are extracted from the .tflite file at runtime.
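
As an illustration of how these parameters travel with the model, the sketch below reads two feature-generator settings from a key/value map assumed to have been parsed out of the .tflite metadata. The get_model_parameters() helper and the parameter names are illustrative assumptions, not the MLTK's actual API.

#include <cstdint>
#include <map>
#include <string>

// Hypothetical: assume the metadata section of the .tflite file has already been parsed
// into a simple key/value map. The real MLTK runtime provides its own accessor API.
std::map<std::string, int32_t> get_model_parameters();

struct AudioFeatureConfig { int32_t sample_rate_hz; int32_t sample_length_ms; };

// Read the feature-generator settings that were embedded at training time so that
// inference uses exactly the same audio front-end configuration as training.
AudioFeatureConfig load_audio_feature_config()
{
    const auto params = get_model_parameters();
    AudioFeatureConfig config{};
    config.sample_rate_hz   = params.at("fe.sample_rate_hz");    // parameter names are illustrative
    config.sample_length_ms = params.at("fe.sample_length_ms");
    return config;
}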

Additional Reading