profile_model¶
- profile_model(model, image_path=None, accelerator=None, baud=115200, port=None, use_device=False, build=False, platform=None, runtime_buffer_size=-1, test=False, post_process=True, return_estimates=False, **kwargs)[source]¶
Profile a model for the given accelerator
This will profile the given model in either a hardware simulator or on a physical device.
- Parameters:
model (Union[MltkModel, TfliteModel, str]) – The model to profile, given as either an mltk.core.MltkModel or mltk.core.TfliteModel instance, or as the path to a .tflite or .mltk.zip file
accelerator (str) – The name of the hardware accelerator to profile for. If omitted, the reference kernels are used
use_device (bool) – Profile on a locally connected embedded device. If omitted, profile in the simulator
port (str) – Serial port of the physical platform. If omitted, attempt to discover it automatically
build (bool) – If true, build the MLTK model into a .tflite before profiling
runtime_buffer_size – The size of the tensor arena. This is only used by the simulator (i.e. when use_device=False). If greater than 0, use the given size; if that size is too small then loading the model will fail. If equal to 0, try to use the size built into the model's parameters; if that size is not available or is too small, find the optimal size. If less than 0, automatically find the optimal tensor arena size, ignoring the size built into the model parameters
test (bool) – If a "test" model is provided
post_process (bool) – Allow post-processing of the profiling results (e.g. uploading to a cloud), if supported by the given MltkModel
return_estimates – If profiling in the simulator, also estimate additional metrics such as CPU cycles and energy. Disabling this option can reduce profiling time
image_path (str) –
baud (int) –
platform (str) –
- Return type:
ProfilingModelResults
- Returns:
The results of model profiling
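A minimal usage sketch is shown below. It assumes profile_model is importable from mltk.core (as suggested by the model types above); the model path and the 'mvp' accelerator name are placeholders.

```python
from mltk.core import profile_model

# Profile a .tflite model in the hardware simulator.
# 'my_model.tflite' and the 'mvp' accelerator name are placeholders;
# omit `accelerator` to profile with the reference kernels.
results = profile_model(
    'my_model.tflite',
    accelerator='mvp',
    return_estimates=True,  # also estimate CPU cycles and energy in the simulator
)

# `results` is a ProfilingModelResults instance (documented below)
print(f'Profiled model: {results.name}')
print(f'Estimated inference time: {results.time} s')
print(f'Estimated energy: {results.energy} J')
```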
ProfilingModelResults¶
- class ProfilingModelResults[source]¶
Results from profiling model for specific accelerator
- __init__(model, accelerator=None, platform=None, cpu_clock_rate=0, runtime_memory_bytes=0, layers=None, is_simulated=True, model_details=None)[source]¶
- Parameters:
model (TfliteModel) –
accelerator (str) –
platform (str) –
cpu_clock_rate (int) –
runtime_memory_bytes (int) –
layers (List[ProfilingLayerResult]) –
- property name: str¶
Name of the profiled model
- Return type:
str
- property tflite_model: TfliteModel¶
Associated TfliteModel
- Return type:
TfliteModel
- property tflite_micro_model_details: TfliteMicroModelDetails¶
TF-Lite Micro model-specific details
- Return type:
TfliteMicroModelDetails
- property accelerator: str¶
Name of accelerator used for profiling
- Return type:
str
- property platform: str¶
Platform model was profiled on
- Return type:
str
- property is_simulated: bool¶
True if the simulator was used to generate the results, else False if an embedded device was used
- Return type:
bool
- property cpu_clock_rate: int¶
Clock rate in hertz
- Return type:
int
- property runtime_memory_bytes: int¶
Total SRAM in bytes required by the ML library. NOTE: This only includes the ML run-time memory; it does NOT include the memory required by the user application or external pre-processing libraries (e.g. DSP)
- Return type:
int
- property flatbuffer_size: int¶
Total size in bytes required by the ML model. This is the size of the .tflite flatbuffer file
- Return type:
int
- property layers: List[ProfilingLayerResult]¶
Profiling details of each model layer
- Return type:
List
[ProfilingLayerResult
]
- property n_layers: int¶
Number of layers in model
- Return type:
int
- property input_shape_str: str¶
Model input shape(s) as a string
- Return type:
str
- property input_dtype_str: str¶
Model input data type(s) as a string
- Return type:
str
- property output_shape_str: str¶
Model output shape(s) as a string
- Return type:
str
- property output_dtype_str: str¶
Model output data type(s) as a string
- Return type:
str
- property ops: int¶
The total number of ops to execute one model inference
- Return type:
int
- property macs: int¶
The total number of multiply-accumulate operations to execute one model inference
- Return type:
int
- property accelerator_cycles: int¶
The total number of accelerator cycles to execute one model inference
- Return type:
int
- property cpu_cycles: int¶
The total number of CPU cycles to execute one model inference
- Return type:
int
- property time: float¶
The total time in seconds required to execute one model inference
- Return type:
float
- property energy: float¶
The total energy required to execute one model inference
- Return type:
float
- property cpu_utilization: float¶
Ratio of the CPU used to execute the model
cpu_utilization = CPU cycles / Total clock cycles
A smaller cpu_utilization leaves more cycles for the CPU to perform other, non-ML tasks
- Return type:
float
- property n_unsupported_layers: int¶
The number of layers not supported by the accelerator
- Return type:
int
- property unsupported_layers: List[ProfilingLayerResult]¶
The layers not supported by the accelerator
- Return type:
List
[ProfilingLayerResult
]
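As a sketch, these two properties can be used to flag models that do not fully map to the accelerator (results is the ProfilingModelResults instance from the profile_model example above):

```python
# Report any layers the accelerator could not run
if results.n_unsupported_layers > 0:
    print(f'{results.n_unsupported_layers} layer(s) not supported by {results.accelerator}:')
    for layer in results.unsupported_layers:
        # Each entry is a ProfilingLayerResult; error_msg explains why the
        # accelerator rejected the layer
        print(f'  {layer.name}: {layer.error_msg}')
```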
- stat_total(name)[source]¶
Return the total sum across all the layers for the given stat
- Return type:
Union[float, int]
- Parameters:
name (str) –
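A short sketch; it assumes the valid stat names mirror the per-layer numeric attributes (e.g. 'ops', 'macs', 'energy'), which is not confirmed by this page:

```python
# Sum the per-layer MAC counts across the whole model
# ('macs' as a stat name is an assumption, mirroring ProfilingLayerResult.macs)
total_macs = results.stat_total('macs')
print(f'Total MACs: {total_macs}')
```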
- get_summary(include_labels=False, format_units=False, exclude_null=True, full_summary=False)[source]¶
Return a summary of the profiling results as a dictionary
- Return type:
dict
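For example, a sketch that prints the summary dictionary with formatted units:

```python
summary = results.get_summary(format_units=True)
for key, value in summary.items():
    print(f'{key}: {value}')
```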
- generate_report(output_dir, format_units, full_summary=False)[source]¶
Generate a profiling report in the given directory
- Parameters:
output_dir (str) –
format_units (bool) –
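A minimal sketch; the output directory name is a placeholder, and format_units is passed explicitly since no default is documented:

```python
# Write the profiling report files into a hypothetical ./profiling_report directory
results.generate_report('profiling_report', format_units=True)
```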
- to_dict(format_units=False, exclude_null=True, full_summary=False)[source]¶
Return profiling results as dictionary
- Arguments
format_units: Format number values to a string with associated units, e.g. 0.0234 -> 23.4m
exclude_null: Exclude columns with no number values (e.g. don't include energy if no energy numbers were provided)
full_summary: Return all profiled stats. If this is false, then only the basic stats are returned
- Return type:
dict
- to_json(indent=2, format_units=False, exclude_null=True, full_summary=False)[source]¶
Return profiling results as JSON string
JSON Format:
{
    "summary": { key/value summary of profiling },
    "summary_labels": { key/value of printable labels for each summary field },
    "layers": [ {<model layer results>}, ... ],
    "layers_labels": { key/value of printable labels for each layer field }
}
Where the “summary” member contains:
"summary": {
    "name"                 : "<Name of model>",
    "accelerator"          : "<Accelerator used>",
    "input_shape"          : "<Model input shapes>",
    "input_dtype"          : "<Model input data types>",
    "output_shape"         : "<Model output shapes>",
    "output_dtype"         : "<Model output data types>",
    "tflite_size"          : <.tflite file size>,
    "runtime_memory_size"  : <Estimated TFLM arena size>,
    "ops"                  : <Total # operations>,
    "macs"                 : <Total # multiply-accumulate ops>,
    "accelerator_cycles"   : <Total # accelerator cycles>,
    "cpu_cycles"           : <Total estimated CPU cycles>,
    "cpu_utilization"      : <Percentage of CPU required to run an inference>,
    "cpu_clock_rate"       : <CPU clock rate in Hz>,
    "energy"               : <Total estimated energy in Joules>,
    "time"                 : <Total estimated inference time>,
    "n_layers"             : <# of layers in model>,
    "n_unsupported_layers" : <# layers unsupported by accelerator>,
    "j_per_op"             : <Joules per operation>,
    "j_per_mac"            : <Joules per multiply-accumulate>,
    "op_per_s"             : <Operations per second>,
    "mac_per_s"            : <Multiply-accumulates per second>,
    "inf_per_s"            : <Inferences per second>
}
Where the “layers” member contains:
"layers": [
    {
        "index"              : <layer index>,
        "opcode"             : "<kernel opcode>",
        "options"            : "<layer options>",
        "ops"                : <# operations>,
        "macs"               : <# of multiply-accumulate operations>,
        "accelerator_cycles" : <# accelerator cycles>,
        "cpu_cycles"         : <estimated CPU cycles>,
        "energy"             : <estimated energy in Joules>,
        "time"               : <estimated layer execution time>,
        "supported"          : <true/false>,
        "err_msg"            : "<error msg if not supported by accelerator>"
    },
    ...
]
- Arguments
indent: Amount of indentation to use in JSON formatting
format_units: Format number values to a string with associated units, e.g. 0.0234 -> 23.4m
exclude_null: Exclude columns with no number values (e.g. don't include energy if no energy numbers were provided)
full_summary: Return all profiled stats. If this is false, then only the basic stats are returned
- Returns
JSON formatted string
- Return type:
str
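For example, a sketch that writes the JSON described above to a file (the file name is a placeholder):

```python
with open('profiling_results.json', 'w') as fp:
    fp.write(results.to_json(indent=2, format_units=True))
```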
- to_csv(format_units=False, exclude_null=True, full_summary=False, include_header=True, dialect='excel')[source]¶
Return profiling results as CSV string
This returns a tuple of two CSV formatted strings. The first string contains the profiling summary. The second string contains the profiling results for the individual layers.
- Parameters:
format_units – Format number values to a string with associated units, e.g. 0.0234 -> 23.4m
exclude_null – Exclude columns with no number values (e.g. don't include energy if no energy numbers were provided)
full_summary – Return all profiled stats. If this is false, then only the basic stats are returned
include_header – Include the header row. If false, then only the results are returned with no labels in the first row
dialect (Union[str, Dialect]) – CSV dialect. Default is excel. See https://docs.python.org/3/library/csv.html for more details.
- Return type:
Tuple[str, str]
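A sketch of saving both CSV strings to disk (the file names are placeholders):

```python
summary_csv, layers_csv = results.to_csv(format_units=True)

with open('profiling_summary.csv', 'w', newline='') as fp:
    fp.write(summary_csv)
with open('profiling_layers.csv', 'w', newline='') as fp:
    fp.write(layers_csv)
```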
- to_string(format_units=True, exclude_null=True, full_summary=False)[source]¶
Return the profiling results as a string
- Arguments
format_units: Format number values to a string with associated units, e.g. 0.0234 -> 23.4m
exclude_null: Exclude columns with no number values (e.g. don't include energy if no energy numbers were provided)
full_summary: Return all profiled stats. If this is false, then only the basic stats are returned
- Return type:
str
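For example, to print the full human-readable report:

```python
print(results.to_string(format_units=True, full_summary=True))
```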
ProfilingLayerResult¶
- class ProfilingLayerResult[source]¶
Profiling results for an individual layer of a model
- __init__(tflite_layer, ops=0, macs=0, cpu_cycles=0, accelerator_cycles=0, time=0.0, energy=0.0, error_msg=None, **kwargs)[source]¶
- Parameters:
tflite_layer (TfliteLayer) –
ops (int) –
macs (int) –
cpu_cycles (int) –
accelerator_cycles (int) –
time (float) –
energy (float) –
error_msg (str) –
- property is_accelerated: bool¶
Return true if this layer was executed on the accelerator
- Return type:
bool
- property is_unsupported: bool¶
Return true if this layer should have been accelerated but exceeds the limits of the accelerator
- Return type:
bool
- property error_msg: str¶
Error message generated by accelerator if layer was not supported
- Return type:
str
- property tflite_layer: TfliteLayer¶
Associated TF-Lite layer
- Return type:
TfliteLayer
- property index: int¶
Index of this layer in the model
- Return type:
int
- property name: str¶
Name of the current layer, formatted as: Op<index>-<OpCodeStr>
- Return type:
str
- property opcode_str: str¶
OpCode as a string
- Return type:
str
- property opcode: BuiltinOperator¶
OpCode
- Return type:
BuiltinOperator
- property macs: int¶
Number of Multiply-Accumulate operations required by this layer
- Return type:
int
- property ops: int¶
Number of operations required by this layer
- Return type:
int
- property accelerator_cycles: int¶
Number of accelerator clock cycles required by this layer
- Return type:
int
- property cpu_cycles: int¶
Number of CPU clock cycles required by this layer
- Return type:
int
- property time: float¶
Time in seconds required by this layer
- Return type:
float
- property energy: float¶
Energy in Joules required by this layer. The energy is relative to the ‘baseline’ energy (i.e. the energy used while the device was idling)
- Return type:
float
- property options_str: str¶
Layer configuration options as a string
- Return type:
str
- property input_shape_str: str¶
Layer input shape(s) as a string
- Return type:
str
- property input_dtype_str: str¶
Layer input data type(s) as a string
- Return type:
str
- property output_shape_str: str¶
Layer output shape(s) as a string
- Return type:
str
- property output_dtype_str: str¶
Layer output data type(s) as a string
- Return type:
str
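Putting the per-layer results together, a sketch that walks ProfilingModelResults.layers using only the properties documented above (results comes from the profile_model example at the top of this page):

```python
for layer in results.layers:
    status = 'accelerated' if layer.is_accelerated else 'CPU'
    print(
        f'{layer.index:3d} {layer.opcode_str:<20s} '
        f'ops={layer.ops} macs={layer.macs} time={layer.time:.6f}s ({status})'
    )
    if layer.is_unsupported:
        print(f'    unsupported by accelerator: {layer.error_msg}')
```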