5.3. Neuron Adapter API Reference
Typedefs
- typedef struct NeuronModel NeuronModel
-
NeuronModel is an opaque type that contains a description of the mathematical operations that constitute the model.
- typedef struct NeuronCompilation NeuronCompilation
-
NeuronCompilation is an opaque type that can be used to compile a machine learning model.
- typedef struct NeuronExecution NeuronExecution
-
NeuronExecution is an opaque type that can be used to apply a machine learning model to a set of inputs.
- typedef struct NeuronDevice NeuronDevice
-
NeuronDevice is an opaque type that represents a device.
This type is used to query basic properties and supported operations of the corresponding device, and control which device(s) a model is to be run on.
Available since 4.1.0
- typedef struct NeuronMemory NeuronMemory
-
This type is used to represent shared memory, memory mapped files, and similar memories.
It is the application’s responsibility to ensure that there are no uses of the memory after calling NeuronMemory_free. This includes the execution which references this memory because of a call to NeuronExecution_setInputFromMemory or NeuronExecution_setOutputFromMemory.
Available since 4.1.0
- typedef struct NeuronEvent NeuronEvent
-
NeuronEvent is an opaque type that represents an event that will be signaled once an execution completes.
Available since 5.0.0
- typedef struct NeuronOperandType NeuronOperandType
-
NeuronOperandType describes the type of an operand. This structure is used to describe both scalars and tensors.
- typedef struct NeuronSymmPerChannelQuantParams NeuronSymmPerChannelQuantParams
-
Parameters for NEURON_TENSOR_QUANT8_SYMM_PER_CHANNEL operand.
-
Enums
- enum NeuronAdapterResultCode
-
Result codes.
Values:
- enumerator NEURON_NO_ERROR
-
- enumerator NEURON_OUT_OF_MEMORY
-
- enumerator NEURON_INCOMPLETE
-
- enumerator NEURON_UNEXPECTED_NULL
-
- enumerator NEURON_BAD_DATA
-
- enumerator NEURON_OP_FAILED
-
- enumerator NEURON_UNMAPPABLE
-
- enumerator NEURON_BAD_STATE
-
- enumerator NEURON_BAD_VERSION
-
- enumerator NEURON_OUTPUT_INSUFFICIENT_SIZE
-
- enumerator NEURON_UNAVAILABLE_DEVICE
-
- enumerator NEURON_MISSED_DEADLINE_TRANSIENT
-
- enumerator NEURON_MISSED_DEADLINE_PERSISTENT
-
- enumerator NEURON_RESOURCE_EXHAUSTED_TRANSIENT
-
- enumerator NEURON_RESOURCE_EXHAUSTED_PERSISTENT
-
- enumerator NEURON_DEAD_OBJECT
-
- enum [anonymous]
-
Operand values with size in bytes that are smaller or equal to this will be immediately copied into the model.
Values:
- enumerator NEURON_MAX_SIZE_OF_IMMEDIATELY_COPIED_VALUES
-
- enum [anonymous]
-
Size of the cache token, in bytes, required from the application.
Values:
- enumerator NEURON_BYTE_SIZE_OF_CACHE_TOKEN
-
- enum [anonymous]
-
Operand types. The type of operands that can be added to a model.
Some notes on quantized tensors
NEURON_TENSOR_QUANT8_ASYMM
Attached to this tensor are two numbers that can be used to convert the 8 bit integer to the real value and vice versa. These two numbers are:
-
scale: a 32 bit floating point value greater than zero.
-
zeroPoint: a 32 bit integer, in range [0, 255].
The formula is: real_value = (integer_value - zeroPoint) * scale.
NEURON_TENSOR_QUANT16_SYMM
Attached to this tensor is a number representing real value scale that is used to convert the 16 bit number to a real value in the following way: realValue = integerValue * scale. scale is a 32 bit floating point with value greater than zero.
NEURON_TENSOR_QUANT8_SYMM_PER_CHANNEL
This tensor is associated with additional fields that can be used to convert the 8 bit signed integer to the real value and vice versa. These fields are the per-channel parameters described by NeuronSymmPerChannelQuantParams: the index of the channel dimension (channelDim) and an array of positive 32 bit floating point scales, one per channel.
The size of the scales array must be equal to dimensions[channelDim]. NeuronModel_setOperandSymmPerChannelQuantParams must be used to set the parameters for an Operand of this type. The channel dimension of this tensor must not be unknown (dimensions[channelDim] != 0). The formula is: realValue[…, C, …] = integerValue[…, C, …] * scales[C] where C is an index in the Channel dimension.
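The per-channel formula can be sketched in plain C. This is an illustrative helper, not part of the Neuron API; it assumes a 2-D tensor laid out row-major as [rows][channels] with channelDim = 1:

```c
#include <stddef.h>

/* Illustrative per-channel dequantization for a [rows][channels] tensor with
 * channelDim = 1: realValue[r][c] = integerValue[r][c] * scales[c]. */
static void dequant_per_channel(const signed char *q, const float *scales,
                                size_t rows, size_t channels, float *out) {
    for (size_t r = 0; r < rows; ++r)
        for (size_t c = 0; c < channels; ++c)
            out[r * channels + c] = (float)q[r * channels + c] * scales[c];
}
```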
NEURON_TENSOR_QUANT16_ASYMM
Attached to this tensor are two numbers that can be used to convert the 16 bit integer to the real value and vice versa. These two numbers are:
-
scale: a 32 bit floating point value greater than zero.
-
zeroPoint: a 32 bit integer, in range [0, 65535].
The formula is: real_value = (integer_value - zeroPoint) * scale.
NEURON_TENSOR_QUANT8_SYMM
Attached to this tensor is a number representing real value scale that is used to convert the 8 bit number to a real value in the following way: realValue = integerValue * scale. scale is a 32 bit floating point with value greater than zero.
NEURON_TENSOR_QUANT8_ASYMM_SIGNED
Attached to this tensor are two numbers that can be used to convert the 8 bit integer to the real value and vice versa. These two numbers are:
-
scale: a 32 bit floating point value greater than zero.
-
zeroPoint: a 32 bit integer, in range [-128, 127].
The formula is: real_value = (integer_value - zeroPoint) * scale.
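The asymmetric quantization formulas above translate directly into code. The helper names below are illustrative, not Neuron API functions; the quantizer rounds to nearest and clamps to the operand's integer range (e.g. [0, 255] for NEURON_TENSOR_QUANT8_ASYMM):

```c
/* real_value = (integer_value - zeroPoint) * scale */
static float dequantize_asymm(int integer_value, int zero_point, float scale) {
    return (float)(integer_value - zero_point) * scale;
}

/* Inverse mapping: round to nearest, then clamp to [qmin, qmax]. */
static int quantize_asymm(float real_value, int zero_point, float scale,
                          int qmin, int qmax) {
    float scaled = real_value / scale;
    int q = (int)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f) + zero_point;
    if (q < qmin) q = qmin;
    if (q > qmax) q = qmax;
    return q;
}
```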
Values:
- enumerator NEURON_FLOAT32
-
A 32 bit floating point scalar value.
- enumerator NEURON_INT32
-
A signed 32 bit integer scalar value.
- enumerator NEURON_UINT32
-
An unsigned 32 bit integer scalar value.
- enumerator NEURON_TENSOR_FLOAT32
-
A tensor of 32 bit floating point values.
- enumerator NEURON_TENSOR_INT32
-
A tensor of 32 bit integer values.
- enumerator NEURON_TENSOR_QUANT8_ASYMM
-
A tensor of 8 bit integers that represent real numbers.
- enumerator NEURON_BOOL
-
An 8 bit boolean scalar value.
- enumerator NEURON_TENSOR_QUANT16_SYMM
-
A tensor of 16 bit signed integers that represent real numbers.
- enumerator NEURON_TENSOR_FLOAT16
-
A tensor of IEEE 754 16 bit floating point values.
- enumerator NEURON_TENSOR_BOOL8
-
A tensor of 8 bit boolean values.
- enumerator NEURON_FLOAT16
-
An IEEE 754 16 bit floating point scalar value.
- enumerator NEURON_TENSOR_QUANT8_SYMM_PER_CHANNEL
-
A tensor of 8 bit signed integers that represent real numbers.
- enumerator NEURON_TENSOR_QUANT16_ASYMM
-
A tensor of 16 bit unsigned integers that represent real numbers.
- enumerator NEURON_TENSOR_QUANT8_SYMM
-
A tensor of 8 bit signed integers that represent real numbers.
- enumerator NEURON_TENSOR_QUANT8_ASYMM_SIGNED
-
A tensor of 8 bit signed integers that represent real numbers.
- enumerator NEURON_MODEL
-
A reference to a model.
- enumerator NEURON_EXT_TENSOR_UINT32
-
Extended data type: a tensor of 32 bit unsigned integer values.
- enum NeuronOperationType
-
Operation Types
Supported operations are listed with available versions. See Neuron_getVersion for querying version number.
Attempting to compile models with operations marked as not available will get a compilation failure.
Refer to the operation support status of each hardware platform. Attempting to compile models with operations supported by this library but not supported by the underlying hardware platform will get a compilation failure too.
Compatible NNAPI levels are also listed.
Values:
- enumerator NEURON_ADD
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_AVERAGE_POOL_2D
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_CONCATENATION
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_CONV_2D
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_DEPTHWISE_CONV_2D
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_DEPTH_TO_SPACE
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_DEQUANTIZE
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_EMBEDDING_LOOKUP
-
Not available.
- enumerator NEURON_FLOOR
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_FULLY_CONNECTED
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_HASHTABLE_LOOKUP
-
Not available.
- enumerator NEURON_L2_NORMALIZATION
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_L2_POOL_2D
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_LOCAL_RESPONSE_NORMALIZATION
-
Not available.
- enumerator NEURON_LOGISTIC
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_LSH_PROJECTION
-
Not available.
- enumerator NEURON_LSTM
-
Not available.
- enumerator NEURON_MAX_POOL_2D
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_MUL
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_RELU
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_RELU1
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_RELU6
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_RESHAPE
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_RESIZE_BILINEAR
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_RNN
-
Not available.
- enumerator NEURON_SOFTMAX
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_SPACE_TO_DEPTH
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_SVDF
-
Not available.
- enumerator NEURON_TANH
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_BATCH_TO_SPACE_ND
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_DIV
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_MEAN
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_PAD
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_SPACE_TO_BATCH_ND
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_SQUEEZE
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_STRIDED_SLICE
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_SUB
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_TRANSPOSE
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_ABS
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_ARGMAX
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_ARGMIN
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_AXIS_ALIGNED_BBOX_TRANSFORM
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_BIDIRECTIONAL_SEQUENCE_LSTM
-
Not available.
- enumerator NEURON_BIDIRECTIONAL_SEQUENCE_RNN
-
Not available.
- enumerator NEURON_BOX_WITH_NMS_LIMIT
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_CAST
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_CHANNEL_SHUFFLE
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_DETECTION_POSTPROCESSING
-
Not available.
- enumerator NEURON_EQUAL
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_EXP
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_EXPAND_DIMS
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_GATHER
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_GENERATE_PROPOSALS
-
Not available.
- enumerator NEURON_GREATER
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_GREATER_EQUAL
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_GROUPED_CONV_2D
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_HEATMAP_MAX_KEYPOINT
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_INSTANCE_NORMALIZATION
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_LESS
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_LESS_EQUAL
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_LOG
-
Not available.
- enumerator NEURON_LOGICAL_AND
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_LOGICAL_NOT
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_LOGICAL_OR
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_LOG_SOFTMAX
-
Not available.
- enumerator NEURON_MAXIMUM
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_MINIMUM
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_NEG
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_NOT_EQUAL
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_PAD_V2
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_POW
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_PRELU
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_QUANTIZE
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_QUANTIZED_16BIT_LSTM
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_RANDOM_MULTINOMIAL
-
Not available.
- enumerator NEURON_REDUCE_ALL
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_REDUCE_ANY
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_REDUCE_MAX
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_REDUCE_MIN
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_REDUCE_PROD
-
Not available.
- enumerator NEURON_REDUCE_SUM
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_ROI_ALIGN
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_ROI_POOLING
-
Not available.
- enumerator NEURON_RSQRT
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_SELECT
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_SIN
-
Not available.
- enumerator NEURON_SLICE
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_SPLIT
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_SQRT
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_TILE
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_TOPK_V2
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_TRANSPOSE_CONV_2D
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_UNIDIRECTIONAL_SEQUENCE_LSTM
-
Not available.
- enumerator NEURON_UNIDIRECTIONAL_SEQUENCE_RNN
-
Not available.
- enumerator NEURON_RESIZE_NEAREST_NEIGHBOR
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_QUANTIZED_LSTM
-
Not available.
- enumerator NEURON_IF
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_WHILE
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_ELU
-
Not available.
- enumerator NEURON_HARD_SWISH
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_FILL
-
Available since 4.1.0. NNAPI level 30.
- enumerator NEURON_RANK
-
Not available.
- enumerator NEURON_BATCH_MATMUL
-
Available since 5.1.2. NNAPI FL6.
- enumerator NEURON_NUMBER_OF_OPERATIONS
-
- enum NeuronAdapterFuseCode
-
Fused activation function types.
Values:
- enumerator NEURON_FUSED_NONE
-
- enumerator NEURON_FUSED_RELU
-
- enumerator NEURON_FUSED_RELU1
-
- enumerator NEURON_FUSED_RELU6
-
- enum NeuronAdapterPaddingCode
-
Implicit padding algorithms.
Values:
- enumerator NEURON_PADDING_SAME
-
SAME padding. Padding on both ends is the “same”: padding_to_beginning = total_padding / 2; padding_to_end = (total_padding + 1) / 2. That is, for an even amount of total padding, the padding at both ends is exactly the same; for an odd amount, the padding at the end is larger than the padding at the beginning by 1.
total_padding is a function of input size, stride, and filter size. It can be computed as follows: out_size = (input_size + stride - 1) / stride; needed_input = (out_size - 1) * stride + filter_size; total_padding = max(0, needed_input - input_size). The computation is the same for the horizontal and vertical directions.
- enumerator NEURON_PADDING_VALID
-
VALID padding. No padding. When the input size is not evenly divisible by the filter size, the input at the end that could not fill the whole filter tile will simply be ignored.
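The SAME-padding arithmetic above is plain integer math and can be sketched directly (this follows the formulas in the text; it is not a Neuron API call):

```c
/* Computes SAME padding for one spatial direction, per the formulas above. */
typedef struct { int begin; int end; } Padding;

static Padding same_padding(int input_size, int stride, int filter_size) {
    int out_size = (input_size + stride - 1) / stride;
    int needed_input = (out_size - 1) * stride + filter_size;
    int total_padding = needed_input - input_size;
    if (total_padding < 0) total_padding = 0;
    /* Odd totals put the extra padding at the end. */
    Padding p = { total_padding / 2, (total_padding + 1) / 2 };
    return p;
}
```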
- enum NeuronAdapterPreferenceCode
-
Execution preferences.
Values:
- enumerator NEURON_PREFER_LOW_POWER
-
- enumerator NEURON_PREFER_FAST_SINGLE_ANSWER
-
- enumerator NEURON_PREFER_SUSTAINED_SPEED
-
- enumerator NEURON_PREFER_TURBO_BOOST
-
- enum NeuronAdapterPriorityCode
-
Relative execution priority.
Values:
- enumerator NEURON_PRIORITY_LOW
-
- enumerator NEURON_PRIORITY_MEDIUM
-
- enumerator NEURON_PRIORITY_HIGH
-
- enumerator NEURON_PRIORITY_DEFAULT
-
- enum OptimizationCode
-
Compiler optimization hint.
Values:
- enumerator NEURON_OPTIMIZATION_NORMAL
-
Normal optimization. Available since 4.3.1
- enumerator NEURON_OPTIMIZATION_LOW_LATENCY
-
Reduce latency by utilizing as many APU cores as possible. Available since 4.3.1
- enumerator NEURON_OPTIMIZATION_DEEP_FUSION
-
Reduce DRAM access as much as possible. Available since 4.4.0
- enumerator NEURON_OPTIMIZATION_BATCH_PROCESSING
-
Reduce latency by using as many APU cores as possible in batch-dimension. (For models with batch > 1) Available since 4.4.0
- enumerator NEURON_OPTIMIZATION_DEFAULT
-
Default optimization setting. Available since 4.3.1
- enum CacheFlushCode
-
CPU cache flush hint.
Values:
- enumerator NEURON_CACHE_FLUSH_ENABLE_ALL
-
Sync input buffer and invalidate output buffer. Available since 5.0.1
- enumerator NEURON_CACHE_FLUSH_DISABLE_SYNC_INPUT
-
Disable sync input buffer. Available since 5.0.1
- enumerator NEURON_CACHE_FLUSH_DISABLE_INVALIDATE_OUTPUT
-
Disable invalidate output buffer. Available since 5.0.1
- enumerator NEURON_CACHE_FLUSH_DEFAULT
-
Default cache flush setting. Available since 5.0.1
-
Functions
- int Neuron_getVersion(NeuronRuntimeVersion *version)
-
Get the version of the Neuron runtime library.
- Parameters:
-
version – the version of Neuron runtime library.
- Returns:
-
NEURON_NO_ERROR
- int Neuron_getL1MemorySizeKb(uint32_t *sizeKb)
-
Get the size of L1 memory in APU.
Available since 4.3.0
- Parameters:
-
sizeKb – L1 memory size in KB
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronMemory_createFromFd(size_t size, int protect, int fd, size_t offset, NeuronMemory **memory)
-
Creates a shared memory object from a file descriptor.
For an ion descriptor, the application should create the ion memory and descriptor first, and then use them in this function.
Available since 4.1.0. Only supports ion fd.
- Parameters:
-
-
size – The requested size in bytes. Must not be larger than the file size.
-
protect – The desired memory protection for the mapping. It is either PROT_NONE or the bitwise OR of one or more of the following flags: PROT_READ, PROT_WRITE.
-
fd – The requested file descriptor. The file descriptor has to be mmap-able.
-
offset – The offset to the beginning of the file of the area to map.
-
memory – The memory object to be created. Set to NULL if unsuccessful.
- int NeuronMemory_createFromAHardwareBuffer()
-
Not supported on non-Android platforms.
- Returns:
-
NEURON_BAD_STATE
- void NeuronMemory_free(NeuronMemory *memory)
-
Delete a memory object.
For ion memory, this function cleans up the internal resource associated with this memory. Applications should clean up the allocated ion memory after this function.
Available since 4.1.0
- int NeuronModel_create(NeuronModel **model)
-
Create an empty NeuronModel. The model should be constructed with calls to NeuronModel_addOperation and NeuronModel_addOperand.
Available since 4.1.0
- Parameters:
-
model – The NeuronModel to be created. Set to NULL if unsuccessful.
- Returns:
-
NEURON_NO_ERROR if successful.
- void NeuronModel_free(NeuronModel *model)
-
Destroy a model. The model need not have been finished by a call to NeuronModel_finish.
Available since 4.1.0
- Parameters:
-
model – The model to be destroyed.
- int NeuronModel_finish(NeuronModel *model)
-
Indicate that we have finished modifying a model. Required before calling NeuronCompilation_compile.
Available since 4.1.0
- Parameters:
-
model – The model to be finished.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronModel_addOperand(NeuronModel *model, const NeuronOperandType *type)
-
Add an operand to a model. The order in which the operands are added is important. The first one added to a model will have the index value 0, the second 1, etc. These indexes are used as operand identifiers in NeuronModel_addOperation.
Available since 4.1.0
- Parameters:
-
-
model – The model to be modified.
-
type – The NeuronOperandType that describes the shape of the operand. Neither the NeuronOperandType nor the dimensions it points to need to outlive the call to NeuronModel_addOperand.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronModel_setOperandValue(NeuronModel *model, int32_t index, const void *buffer, size_t length)
-
Sets an operand to a constant value. Values of length smaller or equal to NEURON_MAX_SIZE_OF_IMMEDIATELY_COPIED_VALUES are immediately copied into the model. For values of length greater than NEURON_MAX_SIZE_OF_IMMEDIATELY_COPIED_VALUES, a pointer to the buffer is stored within the model. The application must not change the content of this region until all executions using this model have completed. As the data may be copied during processing, modifying the data after this call yields undefined results.
Attempting to modify a model once NeuronModel_finish has been called will return an error.
A special note on buffer lifetime when the length is greater than NEURON_MAX_SIZE_OF_IMMEDIATELY_COPIED_VALUES: the provided buffer must outlive the compilation of this model, i.e. the user must keep the buffer unchanged until NeuronCompilation_finish of this model returns. This is an internal optimization compared to NNAPI. In NNAPI, the NN runtime copies the buffer to a shared memory between the NN runtime and the NNAPI HIDL service during ANNModel_finish, and it is copied again into the compiled result during ANNCompilation_finish. In Neuron Adapter, there is only one copy, made during NeuronCompilation_finish, so the buffer must be kept alive until NeuronCompilation_finish has returned.
Available since 4.1.0
- Parameters:
-
-
model – The model to be modified.
-
index – The index of the model operand we’re setting.
-
buffer – A pointer to the data to use.
-
length – The size in bytes of the data value.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronModel_setOperandValueFromModel(NeuronModel *model, int32_t index, const NeuronModel *value)
-
Sets an operand to a value that is a reference to another NeuronModel.
The referenced model must already have been finished by a call to NeuronModel_finish.
The NeuronModel_relaxComputationFloat32toFloat16 setting of referenced models is overridden by that setting of the main model of a compilation.
The referenced model must outlive the model referring to it.
Attempting to modify a model once NeuronModel_finish has been called will return an error.
Available since 4.1.0
- Parameters:
-
-
model – The model to be modified.
-
index – The index of the model operand we’re setting.
-
value – The model to be referenced.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronModel_setOperandSymmPerChannelQuantParams(NeuronModel *model, int32_t index, const NeuronSymmPerChannelQuantParams *channelQuant)
-
Sets an operand’s per channel quantization parameters. Sets parameters required by a tensor of type NEURON_TENSOR_QUANT8_SYMM_PER_CHANNEL. This function must be called for every tensor of type NEURON_TENSOR_QUANT8_SYMM_PER_CHANNEL before calling NeuronModel_finish.
Available since 4.1.0
- Parameters:
-
-
model – The model to be modified.
-
index – The index of the model operand we’re setting.
-
channelQuant – The per channel quantization parameters for the operand. No memory in this struct needs to outlive the call to this function.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronModel_addOperation(NeuronModel *model, NeuronOperationType type, uint32_t inputCount, const uint32_t *inputs, uint32_t outputCount, const uint32_t *outputs)
-
Add an operation to a model. The operands specified by inputs and outputs must have been previously added by calls to NeuronModel_addOperand.
Available since 4.1.0
- Parameters:
-
-
model – The model to be modified.
-
type – The NeuronOperationType of the operation.
-
inputCount – The number of entries in the inputs array.
-
inputs – An array of indexes identifying each operand.
-
outputCount – The number of entries in the outputs array.
-
outputs – An array of indexes identifying each operand.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronModel_addOperationExtension(NeuronModel *model, const char *name, const char *vendor, const NeuronDevice *device, uint32_t inputCount, const uint32_t *inputs, uint32_t outputCount, const uint32_t *outputs)
-
Add an operation extension to a model. The operands specified by inputs and outputs must have been previously added by calls to NeuronModel_addOperand. User needs to specify the operation extension name and the desired device which will execute the operation extension.
Available since 4.1.0
- Parameters:
-
-
model – The model to be modified.
-
name – The name of the operation extension.
-
vendor – The name of the vendor which will implement the operation extension.
-
device – The device which will execute the operation extension.
-
inputCount – The number of entries in the inputs array.
-
inputs – An array of indexes identifying each operand.
-
outputCount – The number of entries in the outputs array.
-
outputs – An array of indexes identifying each operand.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronModel_identifyInputsAndOutputs(NeuronModel *model, uint32_t inputCount, const uint32_t *inputs, uint32_t outputCount, const uint32_t *outputs)
-
Specifies which operands will be the model’s inputs and outputs. An operand cannot be used for both input and output. Doing so will return an error.
The operands specified by inputs and outputs must have been previously added by calls to NeuronModel_addOperand.
Attempting to modify a model once NeuronModel_finish has been called will return an error.
Available since 4.1.0
- Parameters:
-
-
model – The model to be modified.
-
inputCount – The number of entries in the inputs array.
-
inputs – An array of indexes identifying the input operands.
-
outputCount – The number of entries in the outputs array.
-
outputs – An array of indexes identifying the output operands.
- Returns:
-
NEURON_NO_ERROR if successful.
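Putting the model-construction calls above together, building a tiny single-ADD model might look like the sketch below. This is not runnable off-device: the header name is an assumption, the field names of NeuronOperandType are assumed to mirror NNAPI's ANeuralNetworksOperandType (type, dimensionCount, dimensions), and the NEURON_ADD operand order (two input tensors plus a fuse-code scalar) follows the NNAPI convention, which is also an assumption here. Error handling is abbreviated:

```c
#include "NeuronAdapter.h"  /* assumed header name */

/* Sketch: y = relu(a + b) on 1x4 float tensors. */
int build_add_model(NeuronModel **out_model) {
    NeuronModel *model = NULL;
    if (NeuronModel_create(&model) != NEURON_NO_ERROR) return -1;

    uint32_t dims[2] = {1, 4};
    NeuronOperandType tensor = { .type = NEURON_TENSOR_FLOAT32,
                                 .dimensionCount = 2, .dimensions = dims };
    NeuronOperandType scalar = { .type = NEURON_INT32 };

    /* Operand indexes are assigned in the order operands are added. */
    NeuronModel_addOperand(model, &tensor);   /* 0: input a */
    NeuronModel_addOperand(model, &tensor);   /* 1: input b */
    NeuronModel_addOperand(model, &scalar);   /* 2: fuse code */
    NeuronModel_addOperand(model, &tensor);   /* 3: output y */

    /* Small constants are copied into the model immediately. */
    int32_t fuse = NEURON_FUSED_RELU;
    NeuronModel_setOperandValue(model, 2, &fuse, sizeof(fuse));

    uint32_t op_inputs[3] = {0, 1, 2}, op_outputs[1] = {3};
    NeuronModel_addOperation(model, NEURON_ADD, 3, op_inputs, 1, op_outputs);

    uint32_t model_inputs[2] = {0, 1};
    NeuronModel_identifyInputsAndOutputs(model, 2, model_inputs, 1, op_outputs);

    if (NeuronModel_finish(model) != NEURON_NO_ERROR) {
        NeuronModel_free(model);
        return -1;
    }
    *out_model = model;
    return 0;
}
```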
- int NeuronModel_getSupportedOperations(NeuronModel *model, bool *supported, uint32_t operationCount)
-
Gets the supported operations in a model. This function must be called after calling NeuronModel_finish.
Available since 4.1.0
- Parameters:
-
-
model – The model to be queried.
-
supported – The boolean array to be filled. True means supported. The size of the boolean array must be at least as large as the number of operations in the model. The order of elements in the supported array matches the order in which the corresponding operations were added to the model.
-
operationCount – number of operations in the model
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronModel_getSupportedOperationsForDevices(const NeuronModel *model, const NeuronDevice *const *devices, uint32_t numDevices, bool *supportedOps)
-
Get the supported operations for a specified set of devices. If multiple devices are selected, the supported operation list is a union of supported operations of all selected devices.
Available since 4.1.0
- Parameters:
-
-
model – The model to be queried.
-
devices – Selected devices
-
numDevices – Number of selected devices
-
supportedOps – The boolean array to be filled. True means supported. The size of the boolean array must be at least as large as the number of operations in the model. The order of elements in the supportedOps array matches the order in which the corresponding operations were added to the model.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronModel_relaxComputationFloat32toFloat16(NeuronModel *model, bool allow)
-
Specifies whether NEURON_TENSOR_FLOAT32 is allowed to be calculated with range and/or precision as low as that of the IEEE 754 16-bit floating-point format. By default, NEURON_TENSOR_FLOAT32 must be calculated using at least the range and precision of the IEEE 754 32-bit floating-point format.
Available since 4.1.0
- Parameters:
-
-
model – The model to be modified.
-
allow – ‘true’ indicates NEURON_TENSOR_FLOAT32 may be calculated with range and/or precision as low as that of the IEEE 754 16-bit floating point format. ‘false’ indicates NEURON_TENSOR_FLOAT32 must be calculated using at least the range and precision of the IEEE 754 32-bit floating point format.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronModel_suppressInputConversion(NeuronModel *model, bool suppress)
-
Hint the compiler to suppress input data conversion; the user must convert the input data into the platform-expected format before inference.
Available since 4.2.0
- Parameters:
-
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronModel_suppressOutputConversion(NeuronModel *model, bool suppress)
-
Hint the compiler to suppress output data conversion; the user must convert the output data from the platform-generated format after inference.
Available since 4.2.0
- Parameters:
-
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronModel_restoreFromCompiledNetwork(NeuronModel **model, NeuronCompilation **compilation, const void *buffer, const size_t size)
-
Restore the compiled network using a user-provided buffer.
The restored NeuronCompilation can be used to create an execution instance. The restored NeuronModel cannot be recompiled.
Available since 4.3.0
- Parameters:
-
-
model – Restored model.
-
compilation – Restored compilation
-
buffer – User provided buffer to restore the compiled network.
-
size – Size of the user provided buffer in bytes.
- Returns:
-
NEURON_NO_ERROR if the compiled network is restored successfully. NEURON_BAD_DATA if it fails to load the compiled network; this could be either a version mismatch or corrupted data.
- int NeuronCompilation_create(NeuronModel *model, NeuronCompilation **compilation)
-
Create a NeuronCompilation to compile the given model.
This function only creates the object. Compilation is only performed once NeuronCompilation_finish is invoked. NeuronCompilation_finish should be called once all desired properties have been set on the compilation. NeuronCompilation_free should be called once the compilation is no longer needed. The provided model must outlive the compilation. The model must already have been finished by a call to NeuronModel_finish.
Available since 4.1.0
- Parameters:
-
- Returns:
-
NEURON_NO_ERROR if successful
- void NeuronCompilation_free(NeuronCompilation *compilation)
-
Destroy a compilation.
Available since 4.1.0
- Parameters:
-
compilation – The compilation to be destroyed.
- int NeuronCompilation_finish(NeuronCompilation *compilation)
-
Indicate that we have finished modifying a compilation; the compilation is performed when NeuronCompilation_finish is invoked. Required before calling NeuronExecution_create. This function must only be called once for a given compilation.
Available since 4.1.0
- Parameters:
-
compilation – The compilation to be finished.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronCompilation_setCaching(NeuronCompilation *compilation, const char *cacheDir, const uint8_t *token)
-
Provides optional caching information for faster re-compilation.
Available since 4.1.0
- Parameters:
-
-
compilation – The compilation to be cached.
-
cacheDir – The cache directory for storing and retrieving caching data. The user should choose a directory local to the application, and is responsible for managing the cache entries.
-
token – The token provided by the user to specify a model must be of length NEURON_BYTE_SIZE_OF_CACHE_TOKEN. The user should ensure that the token is unique to a model within the application. Neuron cannot detect token collisions; a collision will result in a failed execution or in a successful execution that produces incorrect output values.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronCompilation_setL1MemorySizeKb(NeuronCompilation *compilation, uint32_t sizeKb)
-
Hint the compiler with the size of L1 memory; this value should not be larger than the real platform’s setting. The user can get the platform’s L1 memory size in KB by calling Neuron_getL1MemorySizeKb.
Available since 4.3.0
- Parameters:
-
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronCompilation_createForDevices(NeuronModel *model, const NeuronDevice *const *devices, uint32_t numDevices, NeuronCompilation **compilation)
-
Create a NeuronCompilation to compile the given model for a specified set of devices. The user must handle all compilation and execution failures from the specified set of devices. This is in contrast to a use of NeuronCompilation_create, where Neuron will attempt to recover from such failures.
Available since 4.1.0
- Parameters:
-
-
model – The NeuronModel to be compiled.
-
devices – The set of devices. Must not contain duplicates.
-
numDevices – The number of devices in the set.
-
compilation – The newly created object or NULL if unsuccessful.
- Returns:
-
NEURON_NO_ERROR if successful, NEURON_BAD_DATA if the model is invalid.
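Pinning a compilation to one device typically combines Neuron_getDeviceCount and Neuron_getDevice (documented later in this section) with NeuronCompilation_createForDevices. The sketch below uses hypothetical stubs so it compiles without the SDK; only the call sequence and error handling are the point.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for NeuronAdapter.h, for illustration only. */
typedef struct NeuronModel NeuronModel;
typedef struct NeuronDevice NeuronDevice;
typedef struct NeuronCompilation NeuronCompilation;
enum { NEURON_NO_ERROR = 0 };

/* Hypothetical stubs. */
static int Neuron_getDeviceCount(uint32_t *n) {
    *n = 1; return NEURON_NO_ERROR;
}
static int Neuron_getDevice(uint32_t i, NeuronDevice **d) {
    (void)i; *d = (NeuronDevice *)1; return NEURON_NO_ERROR;
}
static int NeuronCompilation_createForDevices(NeuronModel *m,
        const NeuronDevice *const *devs, uint32_t n,
        NeuronCompilation **c) {
    (void)m; (void)devs; (void)n;
    *c = (NeuronCompilation *)1; return NEURON_NO_ERROR;
}

/* Compile on the first available device. With createForDevices the
 * caller owns all failure handling; Neuron will not fall back to other
 * devices as it would for NeuronCompilation_create. */
int compile_on_first_device(NeuronModel *model, NeuronCompilation **out) {
    uint32_t count = 0;
    int err = Neuron_getDeviceCount(&count);
    if (err != NEURON_NO_ERROR) return err;
    if (count == 0) return -1; /* no device available */

    NeuronDevice *device = NULL;
    err = Neuron_getDevice(0, &device);
    if (err != NEURON_NO_ERROR) return err;

    const NeuronDevice *const devices[] = { device };
    return NeuronCompilation_createForDevices(model, devices, 1, out);
}
```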
- int NeuronCompilation_createForDebug(NeuronModel *model, NeuronCompilation **compilation)
-
Create a NeuronCompilation that can divide one graph into several subgraphs and use that information for debugging.
This should only be used for debugging; there are no guarantees on performance or thread safety.
Available since 5.0.0
- Parameters:
-
-
model – The NeuronModel to be compiled.
-
compilation – The newly created object or NULL if unsuccessful.
- Returns:
-
NEURON_NO_ERROR if successful, NEURON_BAD_DATA if the model is invalid.
- int NeuronCompilation_setPreference(NeuronCompilation *compilation, int32_t preference)
-
Sets the execution preference associated with this compilation.
The default value of preference is NEURON_PREFER_SINGLE_FAST_ANSWER.
Available since 4.1.0
- Parameters:
-
-
compilation – The compilation to be modified.
-
preference – Either NEURON_PREFER_LOW_POWER, NEURON_PREFER_SINGLE_FAST_ANSWER, or NEURON_PREFER_SUSTAINED_SPEED.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronCompilation_setPriority(NeuronCompilation *compilation, int priority)
-
Sets the execution priority associated with this compilation.
Execution priorities are relative to other executions created by the same application (specifically, the same uid) for the same device. Priorities of executions from one application will not affect executions from another application.
Higher priority executions may use more compute resources than lower priority executions, and may preempt or starve lower priority executions.
Available since 4.1.0
- Parameters:
-
-
compilation – The compilation to be modified.
-
priority – The relative priority of the execution with respect to other executions created by the application.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronCompilation_getInputPaddedDimensions(NeuronCompilation *compilation, int32_t index, uint32_t *dimensions)
-
Get the padded dimensional information of the specified input operand of the compilation. This function must be called after NeuronCompilation_finish. If NeuronModel_suppressInputConversion was not applied to the compiled model, the returned dimensions are the padded dimensions produced by NeuronCompilation_finish to satisfy the optimization requirements of the underlying hardware accelerators. If NeuronModel_suppressInputConversion was applied, the returned dimensions are the same as the original dimensions given by the user.
Available since 4.2.0
- Parameters:
-
-
compilation – The compilation to be queried.
-
index – The index of the input operand we are querying. It is an index into the lists passed to NeuronModel_identifyInputsAndOutputs. It is not the index associated with NeuronModel_addOperand.
-
dimensions – The dimension array to be filled. The size of the array must be exactly as large as the rank of the input operand to be queried in the model.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronCompilation_getOutputPaddedDimensions(NeuronCompilation *compilation, int32_t index, uint32_t *dimensions)
-
Get the padded dimensional information of the specified output operand of the compilation. This function must be called after NeuronCompilation_finish. If NeuronModel_suppressOutputConversion was not applied to the compiled model, the returned dimensions are the padded dimensions produced by NeuronCompilation_finish to satisfy the optimization requirements of the underlying hardware accelerators. If NeuronModel_suppressOutputConversion was applied, the returned dimensions are the same as the original dimensions given by the user.
Available since 4.2.0
- Parameters:
-
-
compilation – The compilation to be queried.
-
index – The index of the output operand we are querying. It is an index into the lists passed to NeuronModel_identifyInputsAndOutputs. It is not the index associated with NeuronModel_addOperand.
-
dimensions – The dimension array to be filled. The size of the array must be exactly as large as the rank of the output operand to be queried in the model.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronCompilation_getInputPaddedSize(NeuronCompilation *compilation, int32_t index, size_t *size)
-
Get the expected buffer size (bytes) of the specified input operand of the compilation. If NeuronModel_suppressInputConversion was not applied to the compiled model, the returned size is the padded size produced by NeuronCompilation_finish to satisfy the optimization requirements of the underlying hardware accelerators. If NeuronModel_suppressInputConversion was applied, the returned size is the same as the original size given by the user.
Available since 4.2.0
- Parameters:
-
-
compilation – The compilation to be queried.
-
index – The index of the input operand we are querying. It is an index into the lists passed to NeuronModel_identifyInputsAndOutputs. It is not the index associated with NeuronModel_addOperand.
-
size – the expected buffer size in bytes.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronCompilation_getOutputPaddedSize(NeuronCompilation *compilation, int32_t index, size_t *size)
-
Get the expected buffer size (bytes) of the specified output operand of the compilation. If NeuronModel_suppressOutputConversion was not applied to the compiled model, the returned size is the padded size produced by NeuronCompilation_finish to satisfy the optimization requirements of the underlying hardware accelerators. If NeuronModel_suppressOutputConversion was applied, the returned size is the same as the original size given by the user.
Available since 4.2.0
- Parameters:
-
-
compilation – The compilation to be queried.
-
index – The index of the output operand we are querying. It is an index into the lists passed to NeuronModel_identifyInputsAndOutputs. It is not the index associated with NeuronModel_addOperand.
-
size – the expected buffer size in bytes.
- Returns:
-
NEURON_NO_ERROR if successful.
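The padded-size query exists so that buffers can be sized to the accelerator's requirement rather than to the logical tensor size. A minimal sketch, with a hypothetical stub reporting a padded size so the example compiles without the SDK:

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for NeuronAdapter.h, for illustration only. */
typedef struct NeuronCompilation NeuronCompilation;
enum { NEURON_NO_ERROR = 0 };

/* Hypothetical stub: reports a padded size (64 bytes here). */
static int NeuronCompilation_getInputPaddedSize(NeuronCompilation *c,
                                                int32_t index, size_t *size) {
    (void)c; (void)index;
    *size = 64;
    return NEURON_NO_ERROR;
}

/* Allocate an input buffer using the compiler-reported padded size, so
 * the buffer satisfies the accelerator's padding requirement even when
 * it exceeds the logical tensor size. */
void *alloc_input_buffer(NeuronCompilation *compilation, int32_t index,
                         size_t *out_size) {
    size_t padded = 0;
    if (NeuronCompilation_getInputPaddedSize(compilation, index, &padded)
            != NEURON_NO_ERROR)
        return NULL;
    *out_size = padded;
    return malloc(padded);
}
```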
- int NeuronCompilation_getCompiledNetworkSize(NeuronCompilation *compilation, size_t *size)
-
Get the compiled network size of the compilation.
This must be called after NeuronCompilation_finish and before NeuronExecution_create. It is not allowed to call this with a compilation restored from cache.
Available since 4.3.0
- Parameters:
-
-
compilation – The compilation to be queried.
-
size – The compiled network size in bytes.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronCompilation_storeCompiledNetwork(NeuronCompilation *compilation, void *buffer, const size_t size)
-
Store the compiled network.
Users have to allocate the buffer with the specified size before calling this function.
This must be called after NeuronCompilation_finish and before NeuronExecution_create. It is not allowed to call this with a compilation restored from cache.
Available since 4.3.0
- Parameters:
-
-
compilation – The compilation to be queried.
-
buffer – User allocated buffer to store the compiled network.
-
size – Size of the user allocated buffer in bytes.
- Returns:
-
NEURON_NO_ERROR if compiled network is successfully copied to the user allocated buffer.
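The two calls above pair naturally: query the size, allocate, then store. A sketch of that sequence, using hypothetical stubs (the size value 16 is arbitrary) so it compiles without the SDK:

```c
#include <stdlib.h>

/* Stand-ins for NeuronAdapter.h, for illustration only. */
typedef struct NeuronCompilation NeuronCompilation;
enum { NEURON_NO_ERROR = 0 };

static int NeuronCompilation_getCompiledNetworkSize(NeuronCompilation *c,
                                                    size_t *size) {
    (void)c; *size = 16; return NEURON_NO_ERROR; /* stub */
}
static int NeuronCompilation_storeCompiledNetwork(NeuronCompilation *c,
                                                  void *buf,
                                                  const size_t size) {
    (void)c; (void)buf; (void)size; return NEURON_NO_ERROR; /* stub */
}

/* Query the compiled network size, allocate a matching buffer, and copy
 * the compiled network into it. Both calls must happen after
 * NeuronCompilation_finish and before NeuronExecution_create, and
 * neither may be used on a compilation restored from cache. */
int save_compiled_network(NeuronCompilation *compilation,
                          void **out, size_t *out_size) {
    size_t size = 0;
    int err = NeuronCompilation_getCompiledNetworkSize(compilation, &size);
    if (err != NEURON_NO_ERROR) return err;

    void *buffer = malloc(size);
    if (buffer == NULL) return -1;

    err = NeuronCompilation_storeCompiledNetwork(compilation, buffer, size);
    if (err != NEURON_NO_ERROR) { free(buffer); return err; }
    *out = buffer;
    *out_size = size;
    return NEURON_NO_ERROR;
}
```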
- int NeuronCompilation_setOptimizationHint(NeuronCompilation *compilation, uint32_t optimizationCode)
-
Hint the compiler to apply the optimization strategy according to the user-specified parameters.
Available since 4.3.0
- Parameters:
-
-
compilation – The compilation to be modified.
-
optimizationCode – The user-specified optimization strategy code.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronCompilation_setOptimizationString(NeuronCompilation *compilation, const char *optimizationString)
-
Hint the compiler to apply the optimization strategy according to the user-specified arguments in a null-terminated string.
Available since 4.6.0
- Parameters:
-
-
compilation – The compilation to be modified.
-
optimizationString – A null-terminated string of user-specified optimization arguments.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronCompilation_setTrimIOAlignment(NeuronCompilation *compilation, bool enable)
-
Hint the compiler to trim the model IO alignment.
Available since 4.4.8
- Parameters:
-
-
compilation – The compilation to be modified.
-
enable – Whether to trim the model IO alignment.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronCompilation_setSWDilatedConv(NeuronCompilation *compilation, bool enable)
-
Hint the compiler to use software dilated convolution.
Available since 4.4.8
- Parameters:
-
-
compilation – The compilation to be modified.
-
enable – Whether to use software dilated convolution.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronExecution_create(NeuronCompilation *compilation, NeuronExecution **execution)
-
Create a new execution instance by calling the NeuronExecution_create function. The provided compilation must outlive the execution.
Available since 4.1.0
- Parameters:
-
-
compilation – The NeuronCompilation to be evaluated.
-
execution – The newly created object or NULL if unsuccessful.
- Returns:
-
NEURON_NO_ERROR if successful
- void NeuronExecution_free(NeuronExecution *execution)
-
Destroy an execution.
Available since 4.1.0
- Parameters:
-
execution – The execution to be destroyed.
- int NeuronExecution_setInput(NeuronExecution *execution, int32_t index, const NeuronOperandType *type, const void *buffer, size_t length)
-
Associate a user buffer with an input of the model of the NeuronExecution. The provided buffer must outlive the execution.
Available since 4.1.0
- Parameters:
-
-
execution – The execution to be modified.
-
index – The index of the input argument we are setting. It is an index into the lists passed to NeuronModel_identifyInputsAndOutputs. It is not the index associated with NeuronModel_addOperand.
-
type – The NeuronOperandType of the operand. Currently NeuronAdapter only takes NULL.
-
buffer – The buffer containing the data.
-
length – The length in bytes of the buffer.
- Returns:
-
NEURON_NO_ERROR if successful, NEURON_BAD_DATA if the name is not recognized or the buffer is too small for the input.
- int NeuronExecution_setOutput(NeuronExecution *execution, int32_t index, const NeuronOperandType *type, void *buffer, size_t length)
-
Associate a user buffer with an output of the model of the NeuronExecution. The provided buffer must outlive the execution.
Available since 4.1.0
- Parameters:
-
-
execution – The execution to be modified.
-
index – The index of the output argument we are setting. It is an index into the lists passed to NeuronModel_identifyInputsAndOutputs. It is not the index associated with NeuronModel_addOperand.
-
type – The NeuronOperandType of the operand. Currently NeuronAdapter only takes NULL.
-
buffer – The buffer where the data is to be written.
-
length – The length in bytes of the buffer.
- Returns:
-
NEURON_NO_ERROR if successful, NEURON_BAD_DATA if the name is not recognized or the buffer is too small for the output.
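A common pattern is to bind one input and one output buffer and run a synchronous inference. The stubs below are stand-ins so the sketch compiles without the SDK; note that type is passed as NULL, which is all NeuronAdapter currently accepts, and that the buffers must outlive the execution.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for NeuronAdapter.h, for illustration only. */
typedef struct NeuronExecution NeuronExecution;
typedef struct NeuronOperandType NeuronOperandType;
enum { NEURON_NO_ERROR = 0 };

/* Hypothetical stubs. */
static int NeuronExecution_setInput(NeuronExecution *e, int32_t i,
        const NeuronOperandType *t, const void *buf, size_t len) {
    (void)e; (void)i; (void)t; (void)buf; (void)len;
    return NEURON_NO_ERROR;
}
static int NeuronExecution_setOutput(NeuronExecution *e, int32_t i,
        const NeuronOperandType *t, void *buf, size_t len) {
    (void)e; (void)i; (void)t; (void)buf; (void)len;
    return NEURON_NO_ERROR;
}
static int NeuronExecution_compute(NeuronExecution *e) {
    (void)e; return NEURON_NO_ERROR;
}

/* Bind input 0 and output 0 (indices into the lists passed to
 * NeuronModel_identifyInputsAndOutputs), then run synchronously. */
int run_inference(NeuronExecution *execution,
                  const void *in, size_t in_len,
                  void *out, size_t out_len) {
    int err = NeuronExecution_setInput(execution, 0, NULL, in, in_len);
    if (err != NEURON_NO_ERROR) return err;
    err = NeuronExecution_setOutput(execution, 0, NULL, out, out_len);
    if (err != NEURON_NO_ERROR) return err;
    return NeuronExecution_compute(execution);
}
```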
- int NeuronExecution_setInputFromMemory(NeuronExecution *execution, uint32_t index, const NeuronOperandType *type, const NeuronMemory *memory, size_t offset, size_t length)
-
Associate part of a memory object with an input of the model of the NeuronExecution.
The provided memory must outlive the execution and should not be changed during computation.
Available since 4.1.0
- Parameters:
-
-
execution – The execution to be modified.
-
index – The index of the input argument we are setting. It is an index into the lists passed to NeuronModel_identifyInputsAndOutputs. It is not the index associated with NeuronModel_addOperand.
-
type – The NeuronOperandType of the operand. Currently NeuronAdapter only takes NULL.
-
memory – The memory containing the data.
-
offset – This specifies the location of the data within the memory. The offset is in bytes from the start of memory.
-
length – The size in bytes of the data value.
- Returns:
-
NEURON_NO_ERROR if successful, NEURON_BAD_DATA if the name is not recognized or the buffer is too small for the input.
- int NeuronExecution_setOutputFromMemory(NeuronExecution *execution, uint32_t index, const NeuronOperandType *type, const NeuronMemory *memory, size_t offset, size_t length)
-
Associate part of a memory object with an output of the model of the NeuronExecution.
The provided memory must outlive the execution and should not be changed during computation.
Available since 4.1.0
- Parameters:
-
-
execution – The execution to be modified.
-
index – The index of the output argument we are setting. It is an index into the lists passed to NeuronModel_identifyInputsAndOutputs. It is not the index associated with NeuronModel_addOperand.
-
type – The NeuronOperandType of the operand. Currently NeuronAdapter only takes NULL.
-
memory – The memory containing the data.
-
offset – This specifies the location of the data within the memory. The offset is in bytes from the start of memory.
-
length – The size in bytes of the data value.
- Returns:
-
NEURON_NO_ERROR if successful, NEURON_BAD_DATA if the name is not recognized or the buffer is too small for the output.
- int NeuronExecution_compute(NeuronExecution *execution)
-
Schedule synchronous evaluation of the execution. Returns once the execution has completed and the outputs are ready to be consumed.
Available since 4.1.0
- Parameters:
-
execution – The execution to be scheduled and executed.
- Returns:
-
NEURON_NO_ERROR if the execution completed normally. NEURON_BAD_STATE if the inference fails. Two return codes were added in 5.0.0: NEURON_MISSED_DEADLINE_TRANSIENT if the inference times out, and NEURON_OUTPUT_INSUFFICIENT_SIZE if the given output size is not sufficient for the actual output.
- int NeuronExecution_startComputeWithDependencies(NeuronExecution *execution, const NeuronEvent *const *dependencies, uint32_t num_dependencies, uint64_t duration, NeuronEvent **event)
-
Schedule asynchronous evaluation of the execution with dependencies.
The execution will wait for all the depending events to be signaled before starting the evaluation. Once the execution has completed and the outputs are ready to be consumed, the returned event will be signaled. Depending on which devices are handling the execution, the event could be backed by a sync fence. Use NeuronEvent_wait to wait for that event.
NeuronEvent_wait must be called to recuperate the resources used by the execution.
If parts of the execution are scheduled on devices that do not support fenced execution, the function call may wait for such parts to finish before returning.
The function will return an error if any of the events in dependencies is already in a bad state. After the execution is scheduled, if any of the events in dependencies does not complete normally, the execution will fail, and NeuronEvent_wait on the returned event will return an error.
The function will return an error if any of the execution outputs has a tensor operand type that is not fully specified.
Available since 5.0.0
- Parameters:
-
-
execution – The execution to be scheduled and executed.
-
dependencies – A set of depending events. The actual evaluation will not start until all the events are signaled.
-
num_dependencies – The number of events in the dependencies set.
-
duration – Currently not used.
-
event – The event that will be signaled on completion. event is set to NULL if there’s an error.
- Returns:
-
NEURON_NO_ERROR if the evaluation is successfully scheduled.
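Chaining one execution after another's completion event can be sketched as below. The stubs are stand-ins so the example compiles without the SDK; the point is the sequence: schedule with dependencies, wait on the returned event to recuperate resources, then free it.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for NeuronAdapter.h, for illustration only. */
typedef struct NeuronExecution NeuronExecution;
typedef struct NeuronEvent NeuronEvent;
enum { NEURON_NO_ERROR = 0 };

/* Hypothetical stubs. */
static int NeuronExecution_startComputeWithDependencies(
        NeuronExecution *e, const NeuronEvent *const *deps,
        uint32_t n, uint64_t duration, NeuronEvent **event) {
    (void)e; (void)deps; (void)n; (void)duration;
    *event = (NeuronEvent *)1;
    return NEURON_NO_ERROR;
}
static int NeuronEvent_wait(NeuronEvent *event) {
    (void)event; return NEURON_NO_ERROR;
}
static void NeuronEvent_free(NeuronEvent *event) { (void)event; }

/* Schedule an execution that starts only after a previous execution's
 * event is signaled, then wait for completion. duration is currently
 * not used, so 0 is passed here. */
int run_after(NeuronExecution *execution, NeuronEvent *previous) {
    const NeuronEvent *const deps[] = { previous };
    NeuronEvent *done = NULL;
    int err = NeuronExecution_startComputeWithDependencies(
        execution, deps, 1, /*duration=*/0, &done);
    if (err != NEURON_NO_ERROR) return err;
    err = NeuronEvent_wait(done); /* recuperates execution resources */
    NeuronEvent_free(done);
    return err;
}
```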
- int NeuronExecution_setLoopTimeout(NeuronExecution *execution, uint64_t duration)
-
Set the maximum duration of WHILE loops in the specified execution.
Available since 5.0.0
- Parameters:
-
-
execution – The execution to be modified.
-
duration – The maximum duration in nanoseconds that a WHILE loop may run.
- Returns:
-
NEURON_NO_ERROR if successful.
- uint64_t Neuron_getDefaultLoopTimeout()
-
Get the default timeout value for WHILE loops.
Available since 5.0.0
- Returns:
-
The default timeout value in nanoseconds.
- uint64_t Neuron_getMaximumLoopTimeout()
-
Get the maximum timeout value for WHILE loops.
Available since 5.0.0
- Returns:
-
The maximum timeout value in nanoseconds.
- int NeuronExecution_setBoostHint(NeuronExecution *execution, uint8_t boostValue)
-
Sets the execution boost hint associated with this execution. Required before calling NeuronExecution_compute.
Execution boost is the hint for the device frequency, ranging from 0 (lowest) to 100 (highest). For a compilation with preference set to NEURON_PREFER_SUSTAINED_SPEED, the scheduler guarantees that the executing boost value will equal the boost value hint.
On the other hand, for a compilation with preference set to NEURON_PREFER_LOW_POWER, the scheduler tries to save power by configuring the executing boost value to a value not higher than the boost value hint.
Available since 4.1.0
- Parameters:
-
-
execution – The execution to be modified.
-
boostValue – The hint for the device frequency, ranging from 0 (lowest) to 100 (highest).
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronExecution_setCacheFlushHint(NeuronExecution *execution, uint8_t flushHint)
-
Sets the execution CPU cache flush hint associated with this execution. Required before calling NeuronExecution_setInputFromMemory and NeuronExecution_setOutputFromMemory.
The default value is NEURON_CACHE_FLUSH_ENABLE_ALL.
Available since 5.0.1
- Parameters:
-
-
execution – The execution to be modified.
-
flushHint – Either NEURON_CACHE_FLUSH_ENABLE_ALL or the bitwise OR of one or more of the following flags: NEURON_CACHE_FLUSH_DISABLE_SYNC_INPUT, NEURON_CACHE_FLUSH_DISABLE_INVALIDATE_OUTPUT.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronExecution_getOutputOperandRank(NeuronExecution *execution, int32_t index, uint32_t *rank)
-
Get the dimensional information of the specified output operand of the model of the latest computation evaluated on NeuronExecution.
This function may only be invoked when the execution is in the completed state.
Available since 5.0.0
- Parameters:
-
-
execution – The execution to be queried.
-
index – The index of the output argument we are querying. It is an index into the lists passed to NeuronModel_identifyInputsAndOutputs.
-
rank – The rank of the output operand.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronExecution_getOutputOperandDimensions(NeuronExecution *execution, int32_t index, uint32_t *dimensions)
-
Get the dimensional information of the specified output operand of the model of the latest computation evaluated on NeuronExecution. The target output operand cannot be a scalar.
This function may only be invoked when the execution is in the completed state.
Available since 5.0.0
- Parameters:
-
-
execution – The execution to be queried.
-
index – The index of the output argument we are querying. It is an index into the lists passed to NeuronModel_identifyInputsAndOutputs.
-
dimensions – The dimension array to be filled. The size of the array must be exactly as large as the rank of the output operand to be queried in the model.
- Returns:
-
NEURON_NO_ERROR if successful.
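Since the dimensions array must be sized exactly as large as the operand's rank, the rank query and the dimensions query are naturally used together after a completed execution. A sketch, with hypothetical stubs reporting a 2x3 output so the example compiles without the SDK:

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for NeuronAdapter.h, for illustration only. */
typedef struct NeuronExecution NeuronExecution;
enum { NEURON_NO_ERROR = 0 };

/* Hypothetical stubs reporting a rank-2, 2x3 output operand. */
static int NeuronExecution_getOutputOperandRank(NeuronExecution *e,
        int32_t i, uint32_t *rank) {
    (void)e; (void)i; *rank = 2; return NEURON_NO_ERROR;
}
static int NeuronExecution_getOutputOperandDimensions(NeuronExecution *e,
        int32_t i, uint32_t *dims) {
    (void)e; (void)i; dims[0] = 2; dims[1] = 3; return NEURON_NO_ERROR;
}

/* Query the rank first, allocate a dimensions array of exactly that
 * size, then fill it. Only valid once the execution has completed. */
uint32_t *query_output_dims(NeuronExecution *execution, int32_t index,
                            uint32_t *out_rank) {
    uint32_t rank = 0;
    if (NeuronExecution_getOutputOperandRank(execution, index, &rank)
            != NEURON_NO_ERROR)
        return NULL;
    uint32_t *dims = malloc(rank * sizeof *dims);
    if (dims == NULL) return NULL;
    if (NeuronExecution_getOutputOperandDimensions(execution, index, dims)
            != NEURON_NO_ERROR) {
        free(dims);
        return NULL;
    }
    *out_rank = rank;
    return dims;
}
```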
- int NeuronDebug_setReportPath(NeuronModel *model, const char *path)
-
Set the report path for Debug Plus.
This should only be used for debugging; the execution should be created from a NeuronCompilation_createForDebug compilation.
Available since 5.0.0
- Parameters:
-
-
model – The model to be modified.
-
path – The file path for the debug report.
- Returns:
-
NEURON_NO_ERROR if successful, NEURON_BAD_DATA if the path is empty.
- int Neuron_getDeviceCount(uint32_t *numDevices)
-
Get the number of available devices.
Available since 4.1.0
- Parameters:
-
numDevices – The number of devices returned.
- Returns:
-
NEURON_NO_ERROR if successful.
- int Neuron_getDevice(uint32_t devIndex, NeuronDevice **device)
-
Get the representation of the specified device.
Available since 4.1.0
- Parameters:
-
-
devIndex – The index of the specified device. Must be less than the number of available devices.
-
device – The representation of the specified device. The same representation will always be returned for the specified device.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronDevice_getName(const NeuronDevice *device, const char **name)
-
Get the name of the specified device.
Available since 4.1.0
- Parameters:
-
-
device – The device to be queried.
-
name – The returned name of the specified device.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronDevice_getDescription(const NeuronDevice *device, const char **description)
-
Get the description of the specified device.
Available since 5.0.0
- Parameters:
-
-
device – The device to be queried.
-
description – The returned description of the specified device.
- Returns:
-
NEURON_NO_ERROR if successful.
- void NeuronEvent_free(NeuronEvent *event)
-
Destroy the event.
- int NeuronEvent_wait(NeuronEvent *event)
-
Waits until the execution completes.
More than one thread can wait on an event. When the execution completes, all threads will be released.
See NeuronExecution for information on multithreaded usage.
Available since 5.0.0
- Parameters:
-
event – The event that will be signaled on completion.
- Returns:
-
NEURON_NO_ERROR if the execution completed normally. NEURON_UNMAPPABLE if the execution input or output memory cannot be properly mapped.
- int NeuronEvent_createFromSyncFenceFd(int sync_fence_fd, NeuronEvent **event)
-
Create a NeuronEvent from a sync_fence file descriptor.
The newly created NeuronEvent does not take ownership of the provided sync_fence_fd; instead, it will dup the provided sync_fence_fd and own the duplicate.
Available since 5.0.0
- Parameters:
-
-
sync_fence_fd – The sync_fence file descriptor.
-
event – The newly created object or NULL if unsuccessful.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronEvent_getSyncFenceFd(const NeuronEvent *event, int *sync_fence_fd)
-
Get sync_fence file descriptor from the event.
If the NeuronEvent is not backed by a sync fence, the sync_fence_fd will be set to -1, and NEURON_BAD_DATA will be returned.
See NeuronEvent_createFromSyncFenceFd and NeuronExecution_startComputeWithDependencies to see how to create an event backed by a sync fence.
The user takes ownership of the returned fd, and must close the returned file descriptor when it is no longer needed.
Available since 5.0.0
- Parameters:
-
-
event – The event to be queried.
-
sync_fence_fd – The returned sync_fence file descriptor, or -1 if the event is not backed by a sync fence.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronDevice_getExtensionSupport(const char *extensionName, bool *isExtensionSupported)
-
Queries whether an extension is supported by the driver implementation of the specified device.
Available since 5.0.0
- Parameters:
-
-
extensionName – The extension name.
-
isExtensionSupported – Set to true if the extension is supported, false otherwise.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronModel_getExtensionOperandType(NeuronModel *model, const char *extensionName, uint16_t operandCodeWithinExtension, int32_t *type)
-
Creates an operand type from an extension name and an extension operand code.
See NeuronModel for information on multithreaded usage.
Available since 5.0.0
- Parameters:
-
-
model – The model to contain the operand.
-
extensionName – The extension name.
-
operandCodeWithinExtension – The extension operand code.
-
type – The operand type.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronModel_getExtensionOperationType(NeuronModel *model, const char *extensionName, uint16_t operationCodeWithinExtension, int32_t *type)
-
Creates an operation type from an extension name and an extension operation code.
See NeuronModel for information on multithreaded usage.
Available since 5.0.0
- Parameters:
-
-
model – The model to contain the operation.
-
extensionName – The extension name.
-
operationCodeWithinExtension – The extension operation code.
-
type – The operation type.
- Returns:
-
NEURON_NO_ERROR if successful.
- int NeuronModel_setOperandExtensionData(NeuronModel *model, int32_t index, const void *data, size_t length)
-
Sets extension operand parameters.
Available since 5.0.0
- Parameters:
-
-
model – The model to be modified.
-
index – The index of the model operand we’re setting.
-
data – A pointer to the extension operand data. The data does not have to outlive the call to this function.
-
length – The size in bytes of the data value.
- Returns:
-
NEURON_NO_ERROR if successful.
- struct NeuronOperandType
-
#include <NeuronAdapter.h>
NeuronOperandType describes the type of an operand. This structure is used to describe both scalars and tensors.
Public Members
- int32_t type
-
The data type, e.g. NEURON_INT8.
- uint32_t dimensionCount
-
The number of dimensions. It should be 0 for scalars.
- const uint32_t *dimensions
-
The dimensions of the tensor. It should be nullptr for scalars.
- float scale
-
These two fields are only used for quantized tensors. They should be zero for scalars and non-fixed point tensors. The dequantized value of each entry is (value - zeroPoint) * scale.
- int32_t zeroPoint
-
Only used with scale for quantized tensors.
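Filling in a NeuronOperandType for a quantized tensor can be sketched as below. The struct here is a local mirror of the documented members so the sketch compiles on its own; the operand-type code, its value, and the tensor shape are hypothetical, and real code uses the definitions from NeuronAdapter.h.

```c
#include <stdint.h>

/* Local mirror of the documented NeuronOperandType members; use the
 * definition from NeuronAdapter.h in real code. */
typedef struct {
    int32_t type;               /* data type code from the header */
    uint32_t dimensionCount;    /* 0 for scalars */
    const uint32_t *dimensions; /* NULL for scalars */
    float scale;                /* quantized tensors only */
    int32_t zeroPoint;          /* quantized tensors only */
} NeuronOperandType;

/* Hypothetical type code and value, for illustration only. */
#define NEURON_TENSOR_QUANT8_ASYMM 5

/* A quantized 1x224x224x3 tensor; the dequantized value of each entry
 * is (value - zeroPoint) * scale. */
static const uint32_t kInputDims[] = { 1, 224, 224, 3 };
static const NeuronOperandType kQuantInput = {
    .type = NEURON_TENSOR_QUANT8_ASYMM,
    .dimensionCount = 4,
    .dimensions = kInputDims,
    .scale = 0.0078125f, /* 1/128 */
    .zeroPoint = 128,
};
```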
- struct NeuronSymmPerChannelQuantParams
-
#include <NeuronAdapter.h>
Parameters for NEURON_TENSOR_QUANT8_SYMM_PER_CHANNEL operand.
Public Members
- uint32_t channelDim
-
The index of the channel dimension.
- uint32_t scaleCount
-
The size of the scale array. Should be equal to dimensions[channelDim] of the operand.
- const float *scales
-
The array of scaling values for each channel. Each value must be greater than zero.
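Per-channel quantization parameters tie together as follows: scaleCount must match the operand's size along channelDim, and every scale must be positive. A self-contained sketch (the struct is a local mirror of the documented members, and the filter shape and scale values are hypothetical):

```c
#include <stdint.h>

/* Local mirror of the documented struct members; use NeuronAdapter.h
 * in real code. */
typedef struct {
    uint32_t channelDim;  /* index of the channel dimension */
    uint32_t scaleCount;  /* must equal dimensions[channelDim] */
    const float *scales;  /* one positive scale per channel */
} NeuronSymmPerChannelQuantParams;

/* Per-channel scales for a hypothetical conv filter of shape
 * [8, 3, 3, 3], quantized symmetrically along dimension 0 (the
 * output-channel dimension), so scaleCount is dimensions[0] = 8. */
static const float kChannelScales[8] = {
    0.012f, 0.010f, 0.015f, 0.011f, 0.009f, 0.014f, 0.013f, 0.010f
};
static const NeuronSymmPerChannelQuantParams kFilterQuant = {
    .channelDim = 0,
    .scaleCount = 8,
    .scales = kChannelScales,
};
```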
- struct NeuronRuntimeVersion
-
#include <NeuronAdapter.h>
The structure representing the Neuron version.
Public Members
- uint8_t major
-
major version
- uint8_t minor
-
minor version
- uint8_t patch
-
patch version