# Model Training


This describes how to train a Machine Learning model using the MLTK and [Google Tensorflow](https://www.tensorflow.org).

```{note}
This document focuses on the training aspect of model development.
Refer to the [tutorials](../tutorials.md) for end-to-end guides on how to develop an ML model.
```


## Quick Reference

- Command-line: [mltk train --help](../command_line/train.md)
- Python API: [train_model](mltk.core.train_model)
- Python API examples: [train_model.ipynb](../../mltk/examples/train_model.ipynb)


## Overview

The MLTK internally uses [Google Tensorflow](https://www.tensorflow.org/tutorials) to train a model.  
The basic sequence for training a model is:

1. Create a [model specification](./model_specification.md) script
2. Populate the model training and dataset parameters
3. Define the model layout using the [Keras API](https://keras.io/api)
4. Invoke model training using the [Command-Line](#command) or [Python API](#python-api)

When training completes, a [model archive](./model_archive.md) file is generated in the same directory as the model training script and contains the trained model files and logs.

__HINT:__ See [Training via SSH](./model_training_via_ssh.md) for how to quickly train your model in the cloud.

## Model Specification

All model training parameters are defined in the [Model Specification](./model_specification.md) script.
This is a standard Python script that defines a [MltkModel](mltk.core.MltkModel) instance.

### MltkModel Instance

All training parameters are configured in the [MltkModel](mltk.core.MltkModel) instance.  
For example, the following might be added to the top of `my_model_v1.py`:

```python
# Define a new MyModel class which inherits the 
# MltkModel and several mixins
# @mltk_model
class MyModel(
    MltkModel, 
    TrainMixin, 
    AudioDatasetMixin, 
    EvaluateClassifierMixin
):
    """My Model's class object"""

# Instantiate the MyModel class
my_model = MyModel()
```

Here we define our model's class object: `MyModel`.

At a minimum, this custom class must inherit the following:  
- [MltkModel](mltk.core.MltkModel)
- [TrainMixin](mltk.core.TrainMixin)
- [DatasetMixin](mltk.core.DatasetMixin) (or a child of this mixin)

Additionally, this class inherits other model "mixins" to aid model development.

After our model is instantiated, the rest of the model specification simply 
populates the various properties of `MyModel`, e.g.:  

```python
# General Settings
my_model.version = 1
my_model.description = 'My model is great!'

# Training Basic Settings
my_model.epochs = 100
my_model.batch_size = 64 
my_model.optimizer = 'adam'
...

# Dataset Settings
my_model.dataset = speech_commands_v2
my_model.class_mode = 'categorical'
my_model.classes = ['up', 'down', 'left', 'right']
...
```

```{note}
The filename of the model specification script is the name given to the model. So, in this case, the model name is `my_model_v1`.
```

### Model Layout

An important property of the `MyModel` class example from above is
[TrainMixin.build_model_function](mltk.core.TrainMixin.build_model_function). This should reference
a function that builds the actual machine learning model which is built using the [Keras API](https://keras.io/api).

For example:

```python
def my_model_builder(my_model: MyModel):
    keras_model = Sequential(name=my_model.name)

    keras_model = Sequential()
    keras_model.add(InputLayer(my_model.input_shape))
    keras_model.add(Conv2D(
        filters=8,
        kernel_size=(10, 8),
        use_bias=True,
        padding="same",
        strides=(2,2))
    )
    keras_model.add(Activation('relu'))
    keras_model.add(Flatten())
    keras_model.add(Dense(units=my_model.n_classes))

    keras_model.compile(
        loss=my_model.loss, 
        optimizer=my_model.optimizer, 
        metrics=my_model.metrics
    )
    return keras_model

# Set the model property to reference the model build function
my_model.build_model_function = my_model_builder
```

Here, we define a function that builds a [KerasModel](mltk.core.KerasModel) then
sets `my_model.build_model_function` to reference the function.

At model training time, the model building function is invoked and the built 
[KerasModel](mltk.core.KerasModel) is trained using Tensorflow.


#### Note about hardcoding model layer parameters

While not required, the `my_model` argument to the building function
should be used over hardcoded values, e.g.:

```python
# Good:
# Dynamically determine the number of dense unit based
# on the number of classes specified in the model properties
keras_model.add(Dense(units=my_model.n_classes))

# Bad:
# Hardcoding dense units
# If the number of classes changes, 
# then training will likely fail
keras_model.add(Dense(units=5))
```

See the [Model Specification](./model_specification.md) documentation for more details.


## Training Output

When training completes, a [model archive](./model_archive.md) file is generated in the same directory as the model specification script and contains the trained model files and logs.

Included in the [model archive](./model_archive.md) is a quantized, `.tflite` model file. 
This is the file that is programmed into the embedded device and executed by [Tensorflow-Lite Micro](https://github.com/tensorflow/tflite-micro).

The `.tflite` is generated by the Tensorflow-Lite [Converter](https://www.tensorflow.org/lite/convert).
The settings for the converter are defined in the model specification script using the model property: [TrainMixin.tflite_converter](mltk.core.TrainMixin.tflite_converter)


For example, the model specification script might have:

```python
my_model.tflite_converter['optimizations'] = [tf.lite.Optimize.DEFAULT]
my_model.tflite_converter['supported_ops'] = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
my_model.tflite_converter['inference_input_type'] = tf.int8
my_model.tflite_converter['inference_output_type'] = tf.int8
my_model.tflite_converter['representative_dataset'] = 'generate'
```
These settings are used at the end of training to generate the `.tflite`.  
See [Model Quantization](./model_quantization.md) for more details.


## Command

Model training from the command-line is done using the `train` operation.

For more details on the available command-line options, issue the command:

```shell
mltk train --help
```

__HINT:__ See [Training via SSH](./model_training_via_ssh.md) for how to quickly train your model in the cloud.

The following are examples of how training can be invoked from the command-line:

### Example 1: Train as a "dry run"

Before fully training a model, sometimes it is useful to do a "dry run" 
to ensure everything is working. This can be done by appending `-test` to the end 
of the model name. With this, the model is trained for 1 epoch on a subset of 
the training data, and a model archive with `-test` append to the name is generated. 

```shell
mltk train tflite_micro_speech-test
```

### Example 2: Train for 100 epochs

The model specification typically contains the number of training epochs, i.e. [TrainMixin.epochs](mltk.core.TrainMixin.epochs).  
Optionally, the `--epochs` option can be used to override the model specification.

```shell
mltk train audio_example1 --epochs 100
```

### Example 3: Resume Training

If training does not fully complete, it can be restarted by adding the `--resume` option.
This will load the weights from the last saved checkpoint and begin training at that checkpoint's epoch.
See [TrainMixin.checkpoint](mltk.core.TrainMixin.checkpoint) for more details.

```shell
mltk train image_example1 --resume
```

## Python API

Model training is accessible via [train_model](mltk.core.train_model) API.

Examples using this API may be found in [train_model.ipynb](../../mltk/examples/train_model.ipynb)