mltk.datasets.image.cifar10

CIFAR10

This is a dataset of 50,000 32x32 color training images and 10,000 test images, labeled over 10 categories. For more information, see the CIFAR homepage.

The classes are:

  • airplane

  • automobile

  • bird

  • cat

  • deer

  • dog

  • frog

  • horse

  • ship

  • truck
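
The integer labels returned by load_data() can be mapped back to these names. The following is a minimal sketch, assuming the label indices 0-9 follow the order of the list above (the standard CIFAR-10 ordering). `CLASS_NAMES` and `label_to_name` are hypothetical helpers for illustration, not part of the mltk API.

```python
# Class names in the order listed above; assumed to match label indices 0-9.
CLASS_NAMES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def label_to_name(label):
    """Return the class name for an integer label in the range 0-9."""
    index = int(label)
    if not 0 <= index < len(CLASS_NAMES):
        raise ValueError(f"label must be in 0-9, got {index}")
    return CLASS_NAMES[index]
```

Because the labels have shape (N, 1), a single entry such as `y_train[0]` would need to be reduced to a scalar (for example `y_train[0, 0]`) before being passed to this helper.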

Example

from mltk.datasets.image import cifar10

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
assert x_train.shape == (50000, 32, 32, 3)
assert x_test.shape == (10000, 32, 32, 3)
assert y_train.shape == (50000, 1)
assert y_test.shape == (10000, 1)

Variables

DOWNLOAD_URL

Public download URL

VERIFY_SHA1

SHA1 hash of archive file

Functions

load_data([dest_dir, dest_subdir, logger, ...])

Download the dataset, extract, load into memory, and return as a tuple of numpy arrays

load_data_directory([dest_dir, dest_subdir, ...])

Download the dataset, extract all sample images to a directory, and return the path to the directory.

DOWNLOAD_URL = 'https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz'

Public download URL

VERIFY_SHA1 = '6d958be074577803d12ecdefd02955f39262c83c16fe9348329d7fe0b5c001ce'

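Hash of the archive file, used to verify the download. Despite the variable name, the value is 64 hexadecimal characters long, which is the length of a SHA-256 digest (a SHA-1 digest has 40).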
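
How mltk itself checks the archive is not described on this page. As an illustration of the general idea, the sketch below hashes a downloaded file in chunks and compares the result with an expected hex digest, choosing the algorithm from the digest length. `file_digest_matches` is a hypothetical helper, not an mltk function.

```python
import hashlib


def file_digest_matches(path, expected_hex, chunk_size=1 << 20):
    """Hash the file at `path` and compare it against `expected_hex`.

    The algorithm is picked from the digest length:
    40 hex characters -> SHA-1, 64 hex characters -> SHA-256.
    """
    algorithms = {40: "sha1", 64: "sha256"}
    name = algorithms.get(len(expected_hex))
    if name is None:
        raise ValueError("unsupported digest length")
    hasher = hashlib.new(name)
    with open(path, "rb") as f:
        # Read in chunks so a large archive need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == expected_hex.lower()
```

Under these assumptions, a call such as `file_digest_matches(archive_path, VERIFY_SHA1)` would return True only if the downloaded archive is intact, where `archive_path` is wherever the archive was saved.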

load_data(dest_dir=None, dest_subdir='datasets/cifar10', logger=None, clean_dest_dir=False)

Download the dataset, extract, load into memory, and return as a tuple of numpy arrays

Returns:

(x_train, y_train), (x_test, y_test)

x_train: uint8 NumPy array of color (RGB) image data with shape (50000, 32, 32, 3), containing the training data. Pixel values range from 0 to 255.

y_train: uint8 NumPy array of labels (integers in range 0-9) with shape (50000, 1) for the training data.

x_test: uint8 NumPy array of color (RGB) image data with shape (10000, 32, 32, 3), containing the test data. Pixel values range from 0 to 255.

y_test: uint8 NumPy array of labels (integers in range 0-9) with shape (10000, 1) for the test data.

Return type:

Tuple of NumPy arrays

Parameters:
  • dest_dir (str) –

  • logger (Logger) –
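
The arrays described above are usually preprocessed before training. The sketch below shows one common approach: scaling pixels to [0, 1] and one-hot encoding the labels. It runs on a small synthetic stand-in with the documented dtype and layout, so it does not download anything. `preprocess` is a hypothetical helper, and this particular preprocessing is an assumption for illustration, not something mltk requires.

```python
import numpy as np


def preprocess(x, y, num_classes=10):
    """Scale uint8 images to float32 in [0, 1] and one-hot encode labels."""
    x_scaled = x.astype(np.float32) / 255.0
    # Labels arrive with shape (N, 1); flatten to (N,) before indexing.
    y_flat = y.reshape(-1).astype(np.int64)
    y_onehot = np.eye(num_classes, dtype=np.float32)[y_flat]
    return x_scaled, y_onehot


# Synthetic stand-in for (x_train, y_train): 8 random 32x32 RGB images.
rng = np.random.default_rng(0)
x_demo = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
y_demo = rng.integers(0, 10, size=(8, 1), dtype=np.uint8)

x_out, y_out = preprocess(x_demo, y_demo)
```

With real data, the same call would be `preprocess(x_train, y_train)` on the arrays returned by load_data().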

load_data_directory(dest_dir=None, dest_subdir='datasets/cifar10', logger=None, clean_dest_dir=False)

Download the dataset, extract all sample images to a directory, and return the path to the directory.

Each sample type is extracted to its corresponding subdirectory, e.g.:

~/.mltk/datasets/cifar10/airplane
~/.mltk/datasets/cifar10/automobile
…

Return type:

Path to the extraction directory

Parameters:
  • dest_dir (str) –

  • logger (Logger) –
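
Once the images are on disk, the per-class layout can be inspected with the standard library. The sketch below counts the files in each class subdirectory. It assumes one subdirectory per class directly under the returned path, as in the example above. The image file format is not stated on this page, so every regular file is counted. `count_samples_per_class` is a hypothetical helper, not part of mltk.

```python
from pathlib import Path


def count_samples_per_class(root):
    """Return {class_name: number_of_files} for each subdirectory of `root`."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            # Count only regular files directly inside the class directory.
            counts[class_dir.name] = sum(
                1 for p in class_dir.iterdir() if p.is_file()
            )
    return counts
```

Passing the path returned by load_data_directory() to this helper would give a quick check that every class directory was populated.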