Reference Datasets

The MLTK comes with datasets that are used by the reference models.

The source code for these datasets can be found on GitHub: https://github.com/siliconlabs/mltk/tree/master/mltk/datasets.

Audio Datasets

Google Speech Commands v2

https://www.tensorflow.org/datasets/catalog/speech_commands

This is a set of one-second .wav audio files, each containing a single spoken English word. These words are from a small set of commands, and are spoken by a variety of different speakers. The audio files are organized into folders based on the word they contain, and this data set is designed to help train simple machine learning models. This dataset is covered in more detail at https://arxiv.org/abs/1804.03209.

It’s licensed under the Creative Commons BY 4.0 license. See the LICENSE file in this folder for full details. Its original location was at http://download.tensorflow.org/data/speech_commands_v0.02.tar.gz.

History

Version 0.01 of the data set was released on August 3rd, 2017, and contained 64,727 audio files.

This is version 0.02 of the data set, containing 105,829 audio files, released on April 11th, 2018.

Collection

The audio files were collected using crowdsourcing; see aiyprojects.withgoogle.com/open_speech_recording for some of the open-source audio collection code we used (and please consider contributing to enlarge this data set). The goal was to gather examples of people speaking single-word commands, rather than conversational sentences, so they were prompted for individual words over the course of a five-minute session. Twenty core command words were recorded, with most speakers saying each of them five times. The core words are “Yes”, “No”, “Up”, “Down”, “Left”, “Right”, “On”, “Off”, “Stop”, “Go”, “Zero”, “One”, “Two”, “Three”, “Four”, “Five”, “Six”, “Seven”, “Eight”, and “Nine”. To help distinguish unrecognized words, there are also ten auxiliary words, which most speakers only said once. These include “Bed”, “Bird”, “Cat”, “Dog”, “Happy”, “House”, “Marvin”, “Sheila”, “Tree”, and “Wow”.
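A training script typically treats a subset of the core words as its target labels and maps everything else to an “unknown” class; the folder names in the archive are the lowercase forms of the words. A minimal sketch of such a label set (the choice of ten target words and the '_unknown_' label are common conventions, not part of the dataset itself):

CORE_WORDS = [
    'yes', 'no', 'up', 'down', 'left', 'right', 'on', 'off', 'stop', 'go',
    'zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine',
]
AUXILIARY_WORDS = [
    'bed', 'bird', 'cat', 'dog', 'happy', 'house', 'marvin', 'sheila', 'tree', 'wow',
]

# A common keyword-spotting setup: ten target words, everything else 'unknown'
TARGET_WORDS = ['yes', 'no', 'up', 'down', 'left', 'right', 'on', 'off', 'stop', 'go']

def label_for(word):
    return word if word in TARGET_WORDS else '_unknown_'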

Organization

The files are organized into folders, with each directory name labelling the word that is spoken in all the contained audio files. No details were kept of any of the participants' age, gender, or location, and random ids were assigned to each individual. These ids are stable, though, and encoded in each file name as the first part before the underscore. If a participant contributed multiple utterances of the same word, these are distinguished by the number at the end of the file name. For example, the file path happy/3cfc6b3a_nohash_2.wav indicates that the word spoken was “happy”, the speaker’s id was “3cfc6b3a”, and this is the third utterance of that word by this speaker in the data set. The ‘nohash’ section is to ensure that all the utterances by a single speaker are sorted into the same training partition, to keep very similar repetitions from giving unrealistically optimistic evaluation scores.
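This naming convention makes a deterministic, speaker-stable split straightforward: hash the speaker id (everything before '_nohash_') and bucket files by the result, so repeated utterances never straddle partitions. A minimal Python sketch of this scheme, following the approach used in the TensorFlow speech commands tutorial (the 10%/10% split percentages are illustrative):

import hashlib
import os
import re

MAX_NUM_WAVS_PER_CLASS = 2**27 - 1  # bounds the hash range

def which_set(filename, validation_percentage=10.0, testing_percentage=10.0):
    # Strip the '_nohash_<n>' suffix so every utterance from the same
    # speaker hashes to the same value and lands in the same partition
    base_name = os.path.basename(filename)
    hash_name = re.sub(r'_nohash_.*$', '', base_name)
    hash_value = int(hashlib.sha1(hash_name.encode('utf-8')).hexdigest(), 16)
    percentage_hash = (hash_value % (MAX_NUM_WAVS_PER_CLASS + 1)) * (100.0 / MAX_NUM_WAVS_PER_CLASS)
    if percentage_hash < validation_percentage:
        return 'validation'
    elif percentage_hash < validation_percentage + testing_percentage:
        return 'testing'
    return 'training'

print(which_set('happy/3cfc6b3a_nohash_2.wav'))  # same result for _0, _1, _2, ...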

Processing

The original audio files were collected in uncontrolled locations by people around the world. We requested that they do the recording in a closed room for privacy reasons, but didn’t stipulate any quality requirements. This was by design, since we wanted examples of the sort of speech data that we’re likely to encounter in consumer and robotics applications, where we don’t have much control over the recording equipment or environment. The data was captured in a variety of formats, for example Ogg Vorbis encoding for the web app, and then converted to a 16-bit little-endian PCM-encoded WAVE file at a 16,000 Hz sample rate. The audio was then trimmed to a one-second length to align most utterances, using the extract_loudest_section tool. The audio files were then screened for silence or incorrect words, and arranged into folders by label.
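Since every processed file should be mono, 16-bit PCM at 16 kHz, and at most one second long, a quick check with Python's standard wave module can flag conversion problems. A minimal sketch (the file path is illustrative):

import wave

with wave.open('happy/3cfc6b3a_nohash_2.wav', 'rb') as wav:
    assert wav.getnchannels() == 1      # mono
    assert wav.getsampwidth() == 2      # 16-bit samples
    assert wav.getframerate() == 16000  # 16 kHz sample rate
    assert wav.getnframes() <= 16000    # at most one second of audio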

Background Noise

To help train networks to cope with noisy environments, it can be helpful to mix in realistic background audio. The _background_noise_ folder contains a set of longer audio clips that are either recordings or mathematical simulations of noise. For more details, see the _background_noise_/README.md.
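A common way to use these clips is to mix a random slice of background audio into each training utterance at a modest volume. A minimal NumPy sketch (the volume factor and the stand-in arrays are illustrative, not part of the dataset):

import numpy as np

def mix_background(speech, noise, noise_volume=0.1):
    # Both arrays are float32 audio in [-1, 1]; `noise` must be at least
    # as long as `speech` so a full-length slice can be taken
    offset = np.random.randint(0, len(noise) - len(speech) + 1)
    noise_slice = noise[offset:offset + len(speech)]
    # Clip to keep the mix within the valid audio range
    return np.clip(speech + noise_volume * noise_slice, -1.0, 1.0)

speech = np.zeros(16000, dtype=np.float32)                   # one second at 16 kHz (stand-in)
noise = np.random.uniform(-1, 1, 160000).astype(np.float32)  # stand-in background clip
augmented = mix_background(speech, noise, noise_volume=0.1)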

Citations

If you use the Speech Commands dataset in your work, please cite it as:

@article{speechcommandsv2,
  author        = {{Warden}, P.},
  title         = "{Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition}",
  journal       = {ArXiv e-prints},
  archivePrefix = "arXiv",
  eprint        = {1804.03209},
  primaryClass  = "cs.CL",
  keywords      = {Computer Science - Computation and Language, Computer Science - Human-Computer Interaction},
  year          = 2018,
  month         = apr,
  url           = {https://arxiv.org/abs/1804.03209},
}

Credits

Massive thanks are due to everyone who donated recordings to this data set; I’m very grateful. I also couldn’t have put this together without the help and support of Billy Rutledge, Rajat Monga, Raziel Alvarez, Brad Krueger, Barbara Petit, Gursheesh Kour, and all the AIY and TensorFlow teams.

Pete Warden, petewarden@google.com

Image Datasets

Rock, Paper, Scissors v1

Contains grayscale images of the hand gestures:

  • rock

  • paper

  • scissors

Rock, Paper, Scissors v2

Contains grayscale images of the hand gestures:

  • rock

  • paper

  • scissors

  • _unknown_
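Once either archive is extracted, the images can be loaded with standard tooling. A minimal Keras sketch, assuming the files are unpacked into one sub-folder per class (the extraction path and target image size below are illustrative):

import tensorflow as tf

# 'rock_paper_scissors' is a hypothetical extraction path; the class
# sub-folders (rock/, paper/, scissors/, ...) supply the labels
dataset = tf.keras.utils.image_dataset_from_directory(
    'rock_paper_scissors',
    color_mode='grayscale',  # the images are grayscale
    image_size=(96, 96),     # illustrative target size
    batch_size=32,
)
for images, labels in dataset.take(1):
    print(images.shape, labels.shape)  # e.g. (32, 96, 96, 1) (32,)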

MNIST

This is a dataset of 60,000 28x28 grayscale images of the 10 digits, along with a test set of 10,000 images. More info can be found at the MNIST homepage: http://yann.lecun.com/exdb/mnist/

Arguments: path: path where to cache the dataset locally (relative to ~/.keras/datasets).

Returns: Tuple of NumPy arrays: (x_train, y_train), (x_test, y_test).

x_train: uint8 NumPy array of grayscale image data with shape (60000, 28, 28), containing the training data. Pixel values range from 0 to 255.

y_train: uint8 NumPy array of digit labels (integers in range 0-9) with shape (60000,) for the training data.

x_test: uint8 NumPy array of grayscale image data with shape (10000, 28, 28), containing the test data. Pixel values range from 0 to 255.

y_test: uint8 NumPy array of digit labels (integers in range 0-9) with shape (10000,) for the test data.

Example:

(x_train, y_train), (x_test, y_test) = mnist.load_data()
assert x_train.shape == (60000, 28, 28)
assert x_test.shape == (10000, 28, 28)
assert y_train.shape == (60000,)
assert y_test.shape == (10000,)

License:

Yann LeCun and Corinna Cortes hold the copyright of the MNIST dataset, which is a derivative work from the original NIST datasets. The MNIST dataset is made available under the terms of the Creative Commons Attribution-Share Alike 3.0 license.

CIFAR10

This is a dataset of 50,000 32x32 color training images and 10,000 test images, labeled over 10 categories. See more info at the CIFAR homepage: https://www.cs.toronto.edu/~kriz/cifar.html

The classes are:

  • airplane

  • automobile

  • bird

  • cat

  • deer

  • dog

  • frog

  • horse

  • ship

  • truck

Returns: Tuple of NumPy arrays: (x_train, y_train), (x_test, y_test).

x_train: uint8 NumPy array of RGB image data with shape (50000, 32, 32, 3), containing the training data. Pixel values range from 0 to 255.

y_train: uint8 NumPy array of labels (integers in range 0-9) with shape (50000, 1) for the training data.

x_test: uint8 NumPy array of RGB image data with shape (10000, 32, 32, 3), containing the test data. Pixel values range from 0 to 255.

y_test: uint8 NumPy array of labels (integers in range 0-9) with shape (10000, 1) for the test data.

Example:

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
assert x_train.shape == (50000, 32, 32, 3)
assert x_test.shape == (10000, 32, 32, 3)
assert y_train.shape == (50000, 1)
assert y_test.shape == (10000, 1)

Fashion-MNIST

This is a dataset of 60,000 28x28 grayscale images of 10 fashion categories, along with a test set of 10,000 images. This dataset can be used as a drop-in replacement for MNIST.

The classes are:

  • T-shirt/top

  • Trouser

  • Pullover

  • Dress

  • Coat

  • Sandal

  • Shirt

  • Sneaker

  • Bag

  • Ankle boot

Returns: Tuple of NumPy arrays: (x_train, y_train), (x_test, y_test).

x_train: uint8 NumPy array of grayscale image data with shape (60000, 28, 28), containing the training data.

y_train: uint8 NumPy array of labels (integers in range 0-9) with shape (60000,) for the training data.

x_test: uint8 NumPy array of grayscale image data with shape (10000, 28, 28), containing the test data.

y_test: uint8 NumPy array of labels (integers in range 0-9) with shape (10000,) for the test data.

Example:

(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
assert x_train.shape == (60000, 28, 28)
assert x_test.shape == (10000, 28, 28)
assert y_train.shape == (60000,)
assert y_test.shape == (10000,)

License:

The copyright for Fashion-MNIST is held by Zalando SE. Fashion-MNIST is licensed under the MIT license.

Accelerometer Datasets

Tensorflow-Lite Micro Magic Wand

This dataset contains accelerometer recordings of hand gestures (“wing”, “ring”, and “slope”) used by the TensorFlow-Lite Micro magic wand example:

https://github.com/tensorflow/tflite-micro/tree/main/tensorflow/lite/micro/examples/magic_wand