profile_model

profile_model(model, image_path=None, accelerator=None, baud=115200, port=None, use_device=False, build=False, platform=None, runtime_buffer_size=-1, test=False, post_process=True, return_estimates=False, **kwargs)[source]

Profile a model for the given accelerator

This will profile the given model in either a hardware simulator or on a physical device.

Parameters:
  • model (Union[MltkModel, TfliteModel, str]) – The model to profile as either a mltk.core.MltkModel or mltk.core.TfliteModel instance, or the path to a .tflite or .mltk.zip file

  • accelerator (str) – The name of the hardware accelerator to profile for. If omitted, the reference kernels are used

  • use_device (bool) – Profile on a locally connected embedded device. If false, profile in the simulator

  • port (str) – Serial port of the physical platform. If omitted, attempt to discover it automatically

  • build (bool) – If true, build the MLTK model into a .tflite file before profiling

  • runtime_buffer_size (int) – The size of the tensor arena. This is only used by the simulator (i.e. when use_device=False). If greater than 0, use the given size; if that size is too small, loading the model will fail. If equal to 0, try to use the size built into the model’s parameters; if that size is unavailable or too small, find the optimal size. If less than 0, automatically find the optimal tensor arena size, ignoring the size built into the model’s parameters

  • test (bool) – If true, use the “test” version of the given model

  • post_process (bool) – This allows for post-processing the profiling results (e.g. uploading to a cloud) if supported by the given MltkModel

  • return_estimates (bool) – If profiling in the simulator, also estimate additional metrics such as CPU cycles and energy. Disabling this option can reduce profiling time

  • image_path (str) –

  • baud (int) – Serial baud rate used to communicate with the physical platform

  • platform (str) – Name of the platform to profile on

Return type:

ProfilingModelResults

Returns:

The results of model profiling
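A minimal usage sketch is shown below. It assumes the mltk package is installed and that a “my_model.tflite” file exists; both are assumptions, not part of the API reference above. The import is guarded so the sketch stands on its own.

```python
# Hedged usage sketch: assumes the mltk package is installed and that
# a "my_model.tflite" file exists. Neither is guaranteed here.
try:
    from mltk.core import profile_model  # requires the mltk package
except ImportError:
    profile_model = None  # mltk not installed; sketch only

def profile_in_simulator(model_path: str):
    """Profile a .tflite model in the simulator and print a summary."""
    if profile_model is None:
        raise RuntimeError("The mltk package is required to run this sketch")
    # Profile with the reference kernels (no accelerator given), letting
    # the simulator find the optimal tensor arena size.
    results = profile_model(
        model_path,
        use_device=False,        # run in the simulator, not on hardware
        runtime_buffer_size=-1,  # auto-detect the optimal arena size
        return_estimates=True,   # also estimate CPU cycles and energy
    )
    print(results.to_string())
    return results
```

The returned ProfilingModelResults instance can then be queried through the properties documented below, or serialized with to_dict(), to_json(), or to_csv().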

ProfilingModelResults

class ProfilingModelResults[source]

Results from profiling a model for a specific accelerator

__init__(model, accelerator=None, platform=None, cpu_clock_rate=0, runtime_memory_bytes=0, layers=None, is_simulated=True, model_details=None)[source]
Parameters:
  • model (TfliteModel) –

  • accelerator (str) –

  • platform (str) –

  • cpu_clock_rate (int) –

  • runtime_memory_bytes (int) –

  • layers (List[ProfilingLayerResult]) –

property name: str

Name of the profiled model

Return type:

str

property tflite_model: TfliteModel

Associated TfliteModel

Return type:

TfliteModel

property tflite_micro_model_details: TfliteMicroModelDetails

TF-Lite Micro model-specific details

Return type:

TfliteMicroModelDetails

property accelerator: str

Name of accelerator used for profiling

Return type:

str

property platform: str

Platform the model was profiled on

Return type:

str

property is_simulated: bool

True if the simulator was used to generate the results, else False if an embedded device was used

Return type:

bool

property cpu_clock_rate: int

Clock rate in hertz

Return type:

int

property runtime_memory_bytes: int

Total SRAM in bytes required by the ML library. NOTE: This only includes the ML run-time memory; it does NOT include the memory required by the user application or external pre-processing libraries (e.g. DSP)

Return type:

int

property flatbuffer_size: int

Total size in bytes required by the ML model. This is the size of the .tflite flatbuffer file

Return type:

int

property layers: List[ProfilingLayerResult]

Profiling details of each model layer

Return type:

List[ProfilingLayerResult]

property n_layers: int

Number of layers in model

Return type:

int

property input_shape_str: str

Model input shape(s) as a string

Return type:

str

property input_dtype_str: str

Model input data type(s) as a string

Return type:

str

property output_shape_str: str

Model output shape(s) as a string

Return type:

str

property output_dtype_str: str

Model output data type(s) as a string

Return type:

str

property ops: int

The total number of ops to execute one model inference

Return type:

int

property macs: int

The total number of multiply-accumulate operations to execute one model inference

Return type:

int

property accelerator_cycles: int

The total number of accelerator cycles to execute one model inference

Return type:

int

property cpu_cycles: int

The total number of CPU cycles to execute one model inference

Return type:

int

property time: float

The total time in seconds required to execute one model inference

Return type:

float

property energy: float

The total energy in Joules required to execute one model inference

Return type:

float

property cpu_utilization: float

Ratio of the CPU used to execute the model

cpu_utilization = CPU cycles / Total clock cycles

A smaller cpu_utilization leaves more cycles for the CPU to perform other, non-ML tasks

Return type:

float
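The ratio above can be computed directly from the other summary fields: the total clock cycles elapsed during one inference is the CPU clock rate multiplied by the inference time. A small illustrative calculation (the numbers below are made up, not from a real profile):

```python
# Illustrative cpu_utilization calculation from hypothetical numbers.
cpu_cycles = 2_000_000       # CPU cycles used by one inference
cpu_clock_rate = 80_000_000  # CPU clock rate in Hz
inference_time = 0.05        # seconds per inference

# Total clock cycles elapsed during one inference:
total_cycles = cpu_clock_rate * inference_time  # 4,000,000 cycles

cpu_utilization = cpu_cycles / total_cycles
print(cpu_utilization)  # 0.5 -> half the CPU remains free for non-ML tasks
```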

property n_unsupported_layers: int

The number of layers not supported by the accelerator

Return type:

int

property unsupported_layers: List[ProfilingLayerResult]

Return layers not supported by accelerator

Return type:

List[ProfilingLayerResult]
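A sketch of the relationship between these two properties, with plain dicts standing in for ProfilingLayerResult objects (assuming, per the ProfilingLayerResult docs below, that an unsupported layer is one for which the accelerator produced an error message):

```python
# Plain dicts stand in for ProfilingLayerResult objects here; a layer is
# treated as unsupported when the accelerator produced an error message.
layers = [
    {"opcode": "conv_2d", "error_msg": None},
    {"opcode": "softmax", "error_msg": "filter too large"},
    {"opcode": "fully_connected", "error_msg": None},
]

unsupported_layers = [layer for layer in layers if layer["error_msg"]]
n_unsupported_layers = len(unsupported_layers)
print(n_unsupported_layers)  # 1
```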

stat_total(name)[source]

Return the sum of the given stat across all layers

Return type:

Union[float, int]

Parameters:

name (str) –
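A sketch of what stat_total(name) computes: the sum of one stat across all layers. Plain dicts stand in for ProfilingLayerResult objects, and the numbers are made up.

```python
# Plain dicts stand in for ProfilingLayerResult objects; values are made up.
layers = [
    {"ops": 1000, "macs": 400, "time": 0.002},
    {"ops": 2500, "macs": 900, "time": 0.004},
    {"ops": 500,  "macs": 100, "time": 0.001},
]

def stat_total(name: str):
    """Return the sum of the given stat across all layers."""
    return sum(layer[name] for layer in layers)

print(stat_total("ops"))   # 4000
print(stat_total("macs"))  # 1400
```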

get_summary(include_labels=False, format_units=False, exclude_null=True, full_summary=False)[source]

Return a summary of the profiling results as a dictionary

Return type:

dict

generate_report(output_dir, format_units, full_summary=False)[source]

Generate a profiling report in the given directory

Parameters:
  • output_dir (str) –

  • format_units (bool) –

to_dict(format_units=False, exclude_null=True, full_summary=False)[source]

Return profiling results as dictionary

Arguments

format_units: Format number values to a string with associated units, e.g. 0.0234 -> 23.4m

exclude_null: Exclude columns where all values are null (e.g. don’t include energy if no energy numbers were provided)

full_summary: Return all profiled stats. If false, only the basic stats are returned

Return type:

dict

to_json(indent=2, format_units=False, exclude_null=True, full_summary=False)[source]

Return profiling results as JSON string

JSON Format:

{
"summary": { key/value summary of profiling },
"summary_labels": { key/value of printable labels for each summary field },
"layers": [ {<model layer results>}, ... ],
"layers_labels": { key/value of printable labels for each layer field }
}

Where the “summary” member contains:

"summary": {
    "name"                : "<Name of model>",
    "accelerator"         : "<Accelerator used>",
    "input_shape"         : "<Model input shapes>",
    "input_dtype"         : "<Model input data types>",
    "output_shape"        : "<Model output shapes>",
    "output_dtype"        : "<Model output data types>",
    "tflite_size"         : <.tflite file size>,
    "runtime_memory_size" : <Estimated TFLM arena size>,
    "ops"                 : <Total # operations>,
    "macs"                : <Total # multiply-accumulate ops>,
    "accelerator_cycles"  : <Total # accelerator cycles>,
    "cpu_cycles"          : <Total estimated CPU cycles>,
    "cpu_utilization"     : <Percentage of CPU required to run an inference>,
    "cpu_clock_rate"      : <CPU clock rate hz>,
    "energy"              : <Total estimated energy in Joules>,
    "time"                : <Total estimated inference time>,
    "n_layers"            : <# of layers in model>,
    "n_unsupported_layers": <# layers unsupported by accelerator>,
    "j_per_op"            : <Joules per operation>,
    "j_per_mac"           : <Joules per multiply-accumulate>,
    "op_per_s"            : <Operations per second>,
    "mac_per_s"           : <Multiply-accumulates per second>,
    "inf_per_s"           : <Inference per second>
}

Where the “layers” member contains:

"layers": [ {
    "index"       : <layer index>,
    "opcode"      : "<kernel opcode>",
    "options"     : "<layer options>",
    "ops"         : <# operations>,
    "macs"        : <# of multiply-accumulate operations>,
    "accelerator_cycles" : <# accelerator cycles>,
    "cpu_cycles"  : <estimated CPU cycles>,
    "energy"      : <estimated energy in Joules>,
    "time"        : <estimated layer execution time>,
    "supported"   : <true/false>,
    "err_msg"     : "<error msg if not supported by accelerator>"
    },
    ...
]
Arguments

indent: Amount of indentation to use in JSON formatting

format_units: Format number values to a string with associated units, e.g. 0.0234 -> 23.4m

exclude_null: Exclude columns where all values are null (e.g. don’t include energy if no energy numbers were provided)

full_summary: Return all profiled stats. If false, only the basic stats are returned

Returns

JSON formatted string

Return type:

str
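The JSON layout above can be consumed with the standard json module. A sketch using a hand-written string that follows the documented schema (the model name, opcodes, and error message are made-up example values):

```python
import json

# Hand-written JSON following the documented schema; values are made up.
report = json.loads("""
{
  "summary": {"name": "my_model", "n_layers": 2, "n_unsupported_layers": 1},
  "summary_labels": {"name": "Name", "n_layers": "# Layers"},
  "layers": [
    {"index": 0, "opcode": "conv_2d", "supported": true, "err_msg": ""},
    {"index": 1, "opcode": "softmax", "supported": false,
     "err_msg": "kernel size exceeds accelerator limits"}
  ],
  "layers_labels": {"opcode": "OpCode"}
}
""")

# List the layers the accelerator could not run:
for layer in report["layers"]:
    if not layer["supported"]:
        print(f'Op{layer["index"]}-{layer["opcode"]}: {layer["err_msg"]}')
```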

to_csv(format_units=False, exclude_null=True, full_summary=False, include_header=True, dialect='excel')[source]

Return profiling results as CSV string

This returns a tuple of two CSV-formatted strings. The first string contains the profiling summary. The second string contains the profiling results for the individual layers.

Parameters:
  • format_units – Format number values to a string with associated units, e.g. 0.0234 -> 23.4m

  • exclude_null – Exclude columns where all values are null (e.g. don’t include energy if no energy numbers were provided)

  • full_summary – Return all profiled stats. If false, only the basic stats are returned

  • include_header – Include the header row. If false then only the results are returned with no labels in the first row.

  • dialect (Union[str, Dialect]) – CSV dialect. Default is “excel”. See https://docs.python.org/3/library/csv.html for more details.

Return type:

Tuple[str, str]

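The two CSV strings returned by to_csv() can be parsed with the standard csv module using the same “excel” dialect. A sketch with a hand-written summary string (the column names mirror the documented summary keys; the values are made up):

```python
import csv
import io

# Hand-written CSV mimicking the summary string returned by to_csv();
# the column names mirror the documented summary keys, values are made up.
summary_csv = "name,ops,macs,n_layers\r\nmy_model,4000,1400,3\r\n"

reader = csv.DictReader(io.StringIO(summary_csv), dialect="excel")
summary = next(reader)
print(summary["name"], summary["ops"])  # my_model 4000
```

Note that csv returns every field as a string; numeric columns must be converted explicitly (e.g. int(summary["ops"])).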

to_string(format_units=True, exclude_null=True, full_summary=False)[source]

Return the profiling results as a string

Arguments

format_units: Format number values to a string with associated units, e.g. 0.0234 -> 23.4m

exclude_null: Exclude columns where all values are null (e.g. don’t include energy if no energy numbers were provided)

full_summary: Return all profiled stats. If false, only the basic stats are returned

Return type:

str

ProfilingLayerResult

class ProfilingLayerResult[source]

Profiling results for an individual layer of a model

__init__(tflite_layer, ops=0, macs=0, cpu_cycles=0, accelerator_cycles=0, time=0.0, energy=0.0, error_msg=None, **kwargs)[source]
Parameters:
  • tflite_layer (TfliteLayer) –

  • ops (int) –

  • macs (int) –

  • cpu_cycles (int) –

  • accelerator_cycles (int) –

  • time (float) –

  • energy (float) –

  • error_msg (str) –

property is_accelerated: bool

Return true if this layer was executed on the accelerator

Return type:

bool

property is_unsupported: bool

Return true if this layer should have been accelerated but exceeds the limits of the accelerator

Return type:

bool

property error_msg: str

Error message generated by accelerator if layer was not supported

Return type:

str

property tflite_layer: TfliteLayer

Associated TF-Lite layer

Return type:

TfliteLayer

property index: int

Index of this layer in the model

Return type:

int

property name: str

Name of the current layer, formatted as: Op<index>-<OpCodeStr>

Return type:

str

property opcode_str: str

OpCode as a string

Return type:

str

property opcode: BuiltinOperator

OpCode

Return type:

BuiltinOperator

property macs: int

Number of Multiply-Accumulate operations required by this layer

Return type:

int

property ops: int

Number of operations required by this layer

Return type:

int

property accelerator_cycles: int

Number of accelerator clock cycles required by this layer

Return type:

int

property cpu_cycles: int

Number of CPU clock cycles required by this layer

Return type:

int

property time: float

Time in seconds required by this layer

Return type:

float

property energy: float

Energy in Joules required by this layer. The energy is relative to the “baseline” energy (i.e. the energy used while the device was idling)

Return type:

float

property options_str: str

Layer configuration options as a string

Return type:

str

property input_shape_str: str

Layer input shape(s) as a string

Return type:

str

property input_dtype_str: str

Layer input data type(s) as a string

Return type:

str

property output_shape_str: str

Layer output shape(s) as a string

Return type:

str

property output_dtype_str: str

Layer output data type(s) as a string

Return type:

str

get_summary(include_labels=False, format_units=False, excluded_columns=None, full_summary=False)[source]

Return a summary of the layer profiling results as a dictionary

Return type:

dict

Parameters:

excluded_columns (List[str]) –