mltk.core.MltkDataset

class MltkDataset

Helper class for configuring a training dataset

Refer to mltk.core.DatasetMixin for more details.

Methods

  • __init__

  • load_dataset – Load the dataset subset

  • summarize_class_counts – Generate a text summary of the given class counts dictionary

  • summarize_dataset – Return a string summary of the dataset

  • unload_dataset – Unload the dataset

load_dataset(subset, test=False, **kwargs)

Load the dataset subset

This is called automatically by the MLTK before training or evaluation.

Parameters:
  • subset (str) – The dataset subset to return: ‘training’ or ‘evaluation’

  • test (bool) – Optional. Set when invoking a training “dryrun”, e.g.: mltk train basic_example-test. If true, return only a small portion of the dataset for testing purposes

Returns:

One of the following:

  • None - In this case the function is expected to set the my_model.x, my_model.y, and/or my_model.validation_data properties

  • x - The x samples. In this case the my_model.x property will be automatically set

  • (x, y) - The x samples and y labels. In this case the my_model.x and my_model.y properties will be automatically set

  • (x, None, validation_data) - The x samples and validation_data. In this case the my_model.x and my_model.validation_data properties will be automatically set
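The (x, y) return form and the test flag can be illustrated with a minimal sketch. It makes several assumptions: ToyDataset is a hypothetical name, the data is synthetic, and the MltkDataset base class is deliberately omitted so the snippet runs without the MLTK installed. In a real model the class would presumably derive from MltkDataset.

```python
class ToyDataset:
    """Hypothetical stand-in for an MltkDataset subclass (base class omitted)."""

    def load_dataset(self, subset, test=False, **kwargs):
        # Only the two subsets named in the documentation are accepted
        if subset not in ("training", "evaluation"):
            raise ValueError(f"Unknown subset: {subset}")

        # Synthetic data (an assumption for illustration):
        # x is a list of feature vectors, y is a list of integer labels
        count = 100 if subset == "training" else 20
        x = [[float(i), float(i % 3)] for i in range(count)]
        y = [i % 2 for i in range(count)]

        if test:
            # Dry run: return only a small portion of the dataset
            x, y = x[:4], y[:4]

        # Returning (x, y) corresponds to the "(x, y)" form listed above
        return x, y


dataset = ToyDataset()
x, y = dataset.load_dataset("training", test=True)
print(len(x), len(y))
```

Returning the samples rather than assigning the model properties directly keeps the class free of any reference to the model object, which is one reason to prefer the tuple forms over returning None.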

unload_dataset()

Unload the dataset

summarize_dataset()

Return a string summary of the dataset

Return type:

str
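How load_dataset, summarize_dataset, and unload_dataset might fit together is sketched below. Everything in it is an assumption made for illustration: the class name CountingDataset, the synthetic labels, and the summary text format are hypothetical and do not reflect the MLTK's actual output. The MltkDataset base class is again omitted so the snippet runs standalone.

```python
from collections import Counter


class CountingDataset:
    """Hypothetical dataset that tracks its labels so it can summarize them."""

    def __init__(self):
        self._subset = None
        self._labels = None

    def load_dataset(self, subset, test=False, **kwargs):
        # Synthetic labels (an assumption for illustration)
        self._subset = subset
        self._labels = [0, 1, 1, 0, 1]
        x = [[float(i)] for i in range(len(self._labels))]
        return x, list(self._labels)

    def unload_dataset(self):
        # Release whatever load_dataset() kept in memory
        self._subset = None
        self._labels = None

    def summarize_dataset(self) -> str:
        # The summary format here is invented, not the MLTK's own
        if self._labels is None:
            return "Dataset not loaded"
        counts = Counter(self._labels)
        lines = [f"Subset: {self._subset}"]
        lines += [f"  class {label}: {n} samples" for label, n in sorted(counts.items())]
        return "\n".join(lines)


dataset = CountingDataset()
dataset.load_dataset("training")
print(dataset.summarize_dataset())
dataset.unload_dataset()
```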

static summarize_class_counts(class_counts)

Generate a text summary of the given class counts dictionary

Return type:

str

Parameters:

class_counts (dict) – The class counts dictionary to summarize