RemoteTritonTileProcessor¶
- MacroModule¶
genre: ML_Inference_Providers
author: Jan-Martin Kuhnigk
package: FMEstable/ReleaseMeVis
definition: RemoteTritonTileProcessorModule.def
see also: ApplyTileProcessor, ApplyTileProcessorPageWise
keywords: RedLeaf, remote, triton, deeplearning, deep, machine, learning, inference, provider, grpc, client, serv, classif, regress, nvidia, tensor, infer, predict
Purpose¶
Connects to a remote NVidia Triton Inference Server to process voxels tile by tile as fed by ApplyTileProcessorPageWise.
Usage¶
Assuming you have a Triton Inference Server running, enter its URL into Server URL (e.g. http://localhost:8000 or grpc://localhost:8001; https/grpcs are also supported), and enter the Model ID you want to use.
You can check whether a connection is possible and the model was found by pressing the Check Connection button; if successful, this will list all versions available for the current model and all outputs available for the currently selected model version.
Then press Update to set up the processor.
Connect it to an ApplyTileProcessorPageWise and verify output tile size, padding, and dimension mapping; see the ApplyTileProcessorPageWise help for more details.
See the example network for how to apply these modules to process images.
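For scripted setups, the same steps can be performed via MeVisLab Python scripting. The following is a minimal sketch; the module instance name RemoteTritonTileProcessor and the internal name update of the Update trigger are assumptions, while the other field names are documented in the Parameter Fields section below.
# Sketch only: configure the module from a network script context (ctx).
# "RemoteTritonTileProcessor" (instance name) and "update" (trigger field)
# are assumed names; inServerUrl, inModelId, hasValidOutput, statusCode and
# outUsedModelVersion are taken from the field reference below.
ctx.field("RemoteTritonTileProcessor.inServerUrl").value = "grpc://localhost:8001"
ctx.field("RemoteTritonTileProcessor.inModelId").value = "fme.MyModel.sagittal"
ctx.field("RemoteTritonTileProcessor.update").touch()
if ctx.field("RemoteTritonTileProcessor.hasValidOutput").value:
    print(ctx.field("RemoteTritonTileProcessor.outUsedModelVersion").value)
else:
    print(ctx.field("RemoteTritonTileProcessor.statusCode").value)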
Details¶
Model Config / Meta Data Support¶
In some situations, it may be desirable to store and serve additional parameters with the model. This is supported to some degree through the Triton model configuration: the Triton server expects a config.pb(txt) file to be provided for each served model (ID).
Non-Versioned Parameters
You may provide arbitrary key/value meta information through the following format:
parameters: {
  key: "some custom parameter"
  value: {
    string_value: "some custom string value"
  }
}
parameters: {
  key: "some custom complex parameter provided via JSON"
  value: {
    string_value: "{ 'some string key': 'some string value', 'some int key': 42 }"
  }
}
Notice that the lower example provides a JSON-compatible string (requiring single quotes (') instead of double quotes (")) to encode more complex data structures.
The key/value pair will then be available in the 'parameter info' at the module output, which you can inspect in multiple ways, e.g. through the output inspector, with the ParameterInfoInspector, or programmatically via ctx.field( "outCppTileProcessor" ).object().getParameterInfo(), which returns a Python dictionary. If a JSON-compatible string is provided (as in the lower example), the JSON will be parsed and provided as a dictionary.
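As a sketch, assuming the keys from the (hypothetical) config.pbtxt example above appear under the same names in the parameter info dictionary:
# Read the parameter info from the module output (see above).
info = ctx.field("outCppTileProcessor").object().getParameterInfo()
# The plain parameter arrives as a string value:
print(info["some custom parameter"])  # "some custom string value"
# The JSON-compatible string has been parsed into a nested dictionary:
json_param = info["some custom complex parameter provided via JSON"]
print(json_param["some int key"])  # 42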
Versioned Parameters / Inference Tile Properties
As Triton only allows one config for all versions, a simple way to define version-specific parameters is also supported through versioned parameters. You can define a versioned parameter by using certain ‘magic’ key postfixes:
- Use the postfix _for_version_N, where N is the integer version you want to address, to define a version-specific property.
- Use the postfix _default to indicate the default value for a version-specific property for those versions that do not specify special values.
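For illustration, a hypothetical versioned parameter (key and values invented here) that provides a special value for model version 3 and a default for all other versions could be defined like this:
parameters: {
  key: "postprocessing_params_default"
  value: {
    string_value: "{ 'threshold': 0.5 }"
  }
}
parameters: {
  key: "postprocessing_params_for_version_3"
  value: {
    string_value: "{ 'threshold': 0.75 }"
  }
}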
A special case of such version-specific properties stored as JSON-like strings are the inference_tile_properties, which can be used to propose those TileProcessorProperties
required for page-wise processing with ProcessTiles
or ApplyTileProcessorPageWise
(and also to some degree non-page-wise processing with ApplyTileProcessor
).
Note that any version-specific property value is always expected to be stored as a JSON-compatible string. A parser problem will result in an error for the special case of inference_tile_properties, or in a parameter info value "error": "<error information>" for any other version-specific property.
This is a working example; here, model version 2 uses an output tile size of 46 x 46 (x, y), while all other versions fall back to the default tile size of 44 x 44:
parameters: {
  key: "inference_tile_properties_default"
  value: {
    string_value: "{ 'inputs': { 'input': { 'dataType': 'float32', 'dimensions': 'X, Y, CHANNEL1, BATCH', 'externalDimensionForChannel1': 'C', 'externalDimensionForChannel2': 'U', 'fillMode': 'Reflect', 'fillValue': 0.0, 'padding': [8, 8, 0, 0, 0, 0] } }, 'outputs': { 'scores': { 'dataType': 'float32', 'referenceInput': 'input', 'stride': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0], 'tileSize': [44, 44, 2, 1, 1, 1], 'tileSizeMinimum': [2, 2, 2, 1, 1, 1], 'tileSizeOffset': [2, 2, 0, 1, 0, 0] } } }"
  }
}
parameters: {
  key: "inference_tile_properties_for_version_2"
  value: {
    string_value: "{ 'inputs': { 'input': { 'dataType': 'float32', 'dimensions': 'X, Y, CHANNEL1, BATCH', 'externalDimensionForChannel1': 'C', 'externalDimensionForChannel2': 'U', 'fillMode': 'Reflect', 'fillValue': 0.0, 'padding': [8, 8, 0, 0, 0, 0] } }, 'outputs': { 'scores': { 'dataType': 'float32', 'referenceInput': 'input', 'stride': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0], 'tileSize': [46, 46, 2, 1, 1, 1], 'tileSizeMinimum': [2, 2, 2, 1, 1, 1], 'tileSizeOffset': [2, 2, 0, 1, 0, 0] } } }"
  }
}
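The strings above are JSON-like but use single quotes. Purely as an illustration of what such a string encodes (and not necessarily how the module parses it internally), it can be read in Python as a literal:
import ast

# Parse a single-quoted, JSON-like property string into a dictionary.
props = ast.literal_eval(
    "{ 'outputs': { 'scores': { 'tileSize': [44, 44, 2, 1, 1, 1] } } }"
)
print(props["outputs"]["scores"]["tileSize"])  # [44, 44, 2, 1, 1, 1]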
Tips¶
- You can probe which models are available on the server (added as combo box items and thus enabling model ID auto-completion) by pressing the tool button on the right side of the Model ID field.
- If supported by your server and client setup, use GRPC for better performance.
Output Fields¶
outCppTileProcessor¶
- name: outCppTileProcessor, type: PythonTileProcessorWrapper(MLBase), deprecated name: outCppTileClassifier¶
Provides a C++ TileProcessor wrapping the Python object communicating with the Triton backend. Usually to be connected to ProcessTiles or ApplyTileProcessor.
Parameter Fields¶
Field Index¶
[] : Trigger
Clear : Trigger
Custom Version : Integer
Default Timeout [ms] : Integer
doNotClearOnFailedUpdate : Bool
Expanded URL : String
Has Valid Output : Bool
Model ID : String
On Input Change Behavior : Enum
outAvailableModels : String
Prediction Timeout [ms] : Integer
Server URL : String
Status Code : Enum
Status Message : String
Update : Trigger
Used Model Version : Integer
Version : Enum
Visible Fields¶
Server URL¶
- name: inServerUrl, type: String, default: $(FME_TRITONSERVER_URL), deprecated name: inServer¶
URL the Triton inference server is to be contacted on (e.g. grpc://localhost:8500 or http://localhost:8501). The prefix (grpc://, grpcs:// or http://) determines the protocol to be used. You can use MDL variables here, e.g. '$(MY_TRITON_SERVER_URL)', for example defined in the mevislab.prefs file.
Expanded URL¶
- name: outExpandedServerUrl, type: String, persistent: no¶
Expanded version of the Server URL.
Prediction Timeout [ms]¶
- name: inPredictionTimeout_ms, type: Integer, default: 2000, minimum: -1, deprecated name: inTimeout_ms¶
Maximum reaction time in milliseconds for the server before a timeout occurs.
Default Timeout [ms]¶
- name: inDefaultTimeout_ms, type: Integer, default: 200, minimum: -1¶
Default server connection timeout (in milliseconds), used for connection attempts, not for predictions.
Model ID¶
- name: inModelId, type: String, deprecated name: inModelName¶
Model ID to load on the server, including potential namespaces/hierarchies separated by '.' (e.g. fme.MyModel.sagittal).
Version¶
- name: inModelVersionSelectionMode, type: Enum, default: Latest¶
Version selection mode.
Values:
Title | Name | Description |
---|---|---|
Latest | Latest | Choose latest model (highest version number) available |
Custom | Custom | Select custom model version |
Custom Version¶
- name: inCustomModelVersion, type: Integer, default: 1, minimum: 1¶
Custom model version (integer > 0).
Used Model Version¶
- name: outUsedModelVersion, type: Integer, persistent: no¶
Indicates the model version actually used.
On Input Change Behavior¶
- name: onInputChangeBehavior, type: Enum, default: Clear, deprecated names: shouldAutoUpdate, shouldUpdateAutomatically¶
Declares how the module should react if a value of an input field changes.
Values:
Title | Name | Deprecated Name |
---|---|---|
Update | Update | TRUE |
Clear | Clear | FALSE |
[]¶
- name: updateDone, type: Trigger, persistent: no¶
Notifies that an update was performed (check the status interface fields to identify success or failure).
Has Valid Output¶
- name: hasValidOutput, type: Bool, persistent: no¶
Indicates validity of output field values (success of computation).
Status Code¶
- name: statusCode, type: Enum, persistent: no¶
Reflects the module's status (successful or failed computations) as one of some predefined enumeration values.
Values:
Title | Name |
---|---|
Ok | Ok |
Invalid input object | Invalid input object |
Invalid input parameter | Invalid input parameter |
Internal error | Internal error |