# Why is the model not returning correct results on the embedded device?

There can be many reasons why the `.tflite` model does not generate valid results when executing on an embedded device. Some of these reasons include:

## Input Data Preprocessing

One of the more common reasons is that the input data is not formatted correctly. Recall that whatever preprocessing is done to the dataset during model training _must_ also be done to the input samples on the embedded device at runtime. So, for instance, if the training dataset consists of spectrograms, then whatever algorithms were used to convert the raw audio samples into the spectrograms must also be used on the embedded device (see the [AudioFeatureGenerator](../audio/audio_feature_generator.md) for how the MLTK can aid spectrogram generation).

The MLTK also supports creating [Python Wrappers](../cpp_development/wrappers/index.md), which allow C++ source code to be shared between the model training scripts (i.e. Python) and the embedded device (i.e. C++). With this, algorithms can be developed in C++ and used to preprocess the data during model training. Later, the _exact_ same C++ algorithms can be built into the embedded firmware application. This way, the preprocessing algorithms only need to be written once and can be shared between model training and model execution.

## Input Data Type

Another common issue can occur when the `.tflite` model input has an `int8` data type, e.g.:

```python
my_model.tflite_converter['inference_input_type'] = tf.int8
my_model.tflite_converter['inference_output_type'] = tf.int8
```

but the raw data uses another data type, e.g. `uint8`. In this case, both the model training scripts _and_ the embedded device must convert the sample data to `int8`.

For example, say we're creating an image classification model and our dataset contains `uint8` images, but we want our model's input data type to be `int8`. Then our [model specification script](../guides/model_specification.md) might contain:

```python
# Tell the TF-Lite Converter to use int8 model input/output data types
my_model.tflite_converter['inference_input_type'] = tf.int8
my_model.tflite_converter['inference_output_type'] = tf.int8

...

# This is called by the ParallelImageDataGenerator() for each training sample
# It converts the data type from uint8 to int8
def convert_img_from_uint8_to_int8(params:ParallelProcessParams, x:np.ndarray) -> np.ndarray:
    # x is a float32 dtype but has a uint8 range
    x = np.clip(x, 0, 255) # The data should already be in the uint8 range, but clip it just to be sure
    x = x - 128 # Convert from uint8 to int8
    x = x.astype(np.int8)
    return x

# Define the data generator with the data conversion callback
my_model.datagen = ParallelImageDataGenerator(
    preprocessing_function=convert_img_from_uint8_to_int8,
    ...
```

With this, the model is trained with `int8` input data samples. __Additionally__, on the embedded device, we must manually convert the `uint8` data from the camera to `int8`, e.g.:

```c++
for(int i = 0; i < image_length; ++i)
{
    model_input->data.int8[i] = (int8_t)(image_data[i] - 128);
}
```

## Hint: Just use float32

You can skip all of the above by using a `float32` input data type, e.g.:

```python
my_model.tflite_converter['inference_input_type'] = tf.float32
my_model.tflite_converter['inference_output_type'] = tf.float32
```

With this, there is no need for the `convert_img_from_uint8_to_int8()` callback during training nor the `image_data[i] - 128` conversion on the embedded device.
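For example, the embedded-side input copy might then look something like the following sketch. It assumes the same hypothetical `model_input`, `image_data`, and `image_length` variables as the `int8` example above; the exact tensor access depends on your firmware.

```c++
// Sketch only: copy uint8 camera pixels into the model's float32 input tensor
for(int i = 0; i < image_length; ++i)
{
    // No -128 offset is required; the model's Quantize layer converts to int8 internally
    model_input->data.f[i] = (float)image_data[i];
}
```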
The raw `uint8` image data can be directly used during training _and_ on the embedded device. (However, on the embedded device, you'll still need to convert the image data from `uint8` to `float`, as in the sketch above.) This works because the [TfliteConverter](https://www.tensorflow.org/lite/convert) automatically adds `Quantize` and `Dequantize` layers to the `.tflite` model which internally convert the `float` input data to/from `int8`.

Using `float32` as the model input data type is useful as the conversion is automatically handled by the `.tflite` model. However, it does require additional RAM and processing cycles. It requires additional RAM because the input tensor buffer increases by 4x (i.e. `sizeof(float)` is 4 bytes vs 1 byte for `sizeof(int8_t)`). Also, additional cycles are required to convert between `int8` and `float`.
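For a rough sense of the RAM cost, consider a hypothetical model with a 96x96x1 grayscale input (the dimensions here are only an assumption for illustration):

```c++
// Sketch only: input tensor size for a hypothetical 96x96x1 grayscale model input
constexpr int kInputElements   = 96 * 96 * 1;
constexpr int kInt8InputBytes  = kInputElements * sizeof(int8_t); //  9,216 bytes
constexpr int kFloatInputBytes = kInputElements * sizeof(float);  // 36,864 bytes, 4x larger
```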