Reference Datasets

The MLTK ships with the datasets used by its reference models.

The source code for these datasets is available on GitHub: https://github.com/siliconlabs/mltk/tree/master/mltk/datasets.

Audio Datasets

mltk.datasets.audio.speech_commands.speech_commands_v2

Google Speech Commands v2

mltk.datasets.audio.direction_commands

Direction Commands

mltk.datasets.audio.on_off

On/Off

mltk.datasets.audio.yes_no

Yes/No

mltk.datasets.audio.ten_digits

Ten Digits

mltk.datasets.audio.mlcommons.ml_commons_keywords

ML Commons Keywords

mltk.datasets.audio.mlcommons.ml_commons_voice

ML Commons Voice Subset

mltk.datasets.audio.background_noise.ambient

Generic Background Noise

mltk.datasets.audio.background_noise.brd2601

BRD2601 Background Noise

mltk.datasets.audio.background_noise.esc50

ESC-50: Dataset for Environmental Sound Classification

mltk.datasets.audio.mit_ir_survey

MIT Impulse Response Survey

Image Datasets

mltk.datasets.image.rock_paper_scissors_v1

Rock, Paper, Scissors v1

mltk.datasets.image.rock_paper_scissors_v2

Rock, Paper, Scissors v2

mltk.datasets.image.mnist

MNIST

mltk.datasets.image.cifar10

CIFAR-10

mltk.datasets.image.fashion_mnist

Fashion-MNIST

Accelerometer Datasets

mltk.datasets.accelerometer.tflm_magic_wand

TensorFlow Lite Micro Magic Wand
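
The module paths listed above all start with mltk.datasets, followed by a category (audio, image, or accelerometer). As a minimal sketch, the snippet below groups a hand-copied subset of those paths by category and checks whether each module can be located in the current Python environment. The names DATASET_MODULES, group_by_category, and is_available are hypothetical helpers written for this illustration; they are not part of the MLTK API, and the snippet makes no assumption about what each dataset module exposes.

```python
import importlib.util
from collections import defaultdict

# A hand-copied subset of the module paths listed above (illustration only).
DATASET_MODULES = [
    "mltk.datasets.audio.yes_no",
    "mltk.datasets.audio.background_noise.esc50",
    "mltk.datasets.image.mnist",
    "mltk.datasets.accelerometer.tflm_magic_wand",
]


def group_by_category(module_paths):
    """Group dotted paths by the component that follows 'mltk.datasets'."""
    groups = defaultdict(list)
    for path in module_paths:
        parts = path.split(".")
        # parts[0] == "mltk", parts[1] == "datasets", parts[2] is the category
        groups[parts[2]].append(path)
    return dict(groups)


def is_available(module_path):
    """Return True if the module can be located in this environment.

    Note: find_spec imports the parent packages of a dotted name, so with
    the MLTK installed this triggers the package's own import-time code.
    """
    try:
        return importlib.util.find_spec(module_path) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. 'mltk') is not installed.
        return False


grouped = group_by_category(DATASET_MODULES)
for category, paths in grouped.items():
    for path in paths:
        status = "found" if is_available(path) else "not found"
        print(f"{category}: {path} ({status})")
```

If the MLTK is not installed, every entry is reported as "not found"; the grouping step works either way because it only inspects the path strings.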