RemoteTritonTileProcessor¶
Purpose¶
Connects to a remote NVIDIA Triton Inference Server to process voxels tile by tile, as fed e.g. by ProcessTiles or ApplyTileProcessorPageWise.
Usage¶
Assuming you have a Triton Inference Server running, enter its URL into Server URL (e.g. http://localhost:8000 or grpc://localhost:8001; https and grpcs are also supported), and enter the Model ID you want to use.
You can also check if the connection is possible and the model was found by pressing the Check Connection button, which, if successful, will list all versions available for the current model and all outputs available for the currently selected model version.
Then press Update to set up the processor.
Connect to an ApplyTileProcessorPageWise and verify output tile size, padding and dimension mapping. See ApplyTileProcessorPageWise help for more details.
See the example network on how to apply these modules to process images.
Details¶
Model Config / Meta Data Support¶
In some situations, it may be desired to store and serve additional parameters with the model. This is supported to some degree through the Triton model configuration.
The Triton server supports/expects a config.pbtxt file for each served model (ID).
Non-Versioned Parameters
You may provide arbitrary key/value meta information through the following format:
parameters: {
  key: "some custom parameter"
  value: {
    string_value: "some custom string value"
  }
}
parameters: {
  key: "some custom complex parameter provided via JSON"
  value: {
    string_value: "{ 'some string key': 'some string value', 'some int key': 42 }"
  }
}
Note that the second example provides a JSON-compatible string (using single quotes (') instead of double quotes (")) to encode more complex data structures.
The key/value pair will then be available in the ‘parameter info’ at the module output, which you can inspect in multiple ways, e.g. through the output inspector, with the ParameterInfoInspector, or programmatically via ctx.field( "outCppTileProcessor" ).object().getParameterInfo() as a Python dictionary. If a JSON-compatible string is provided (as in the second example), the JSON is parsed and provided as a dictionary.
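As a rough illustration of how such a single-quoted string relates to a Python dictionary, the following sketch swaps the quotes and parses the result with the standard json module. The helper name parse_parameter_string is hypothetical, and the plain quote replacement is an assumption made for this sketch; the parser actually used by the module may behave differently.

```python
import json


def parse_parameter_string(value):
    """Hypothetical helper: parse a single-quoted JSON-like string.

    Assumption: replacing every single quote with a double quote yields
    valid JSON. This fails if a value itself contains an apostrophe.
    """
    return json.loads(value.replace("'", '"'))


raw = "{ 'some string key': 'some string value', 'some int key': 42 }"
info = parse_parameter_string(raw)
print(info["some int key"])
```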
Versioned Parameters / Inference Tile Properties
As Triton only allows one config for all versions, a simple way to define version-specific parameters is also supported through versioned parameters. You can define a versioned parameter by using certain ‘magic’ key postfixes:
- Use the postfix _for_version_N, where N is the integer version you want to address, to define a version-specific property.
- Use the postfix _default to indicate the default value of a version-specific property for those versions that do not specify their own value.
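The lookup order implied by these two postfixes can be sketched as follows. This is only an illustration under the assumption that a version-specific key takes precedence over the default; the function name resolve_versioned_parameter and the parameter name threshold are hypothetical and not part of the module.

```python
def resolve_versioned_parameter(parameters, name, version):
    """Hypothetical lookup: prefer the version-specific key, else the default.

    Returns None if neither key is present.
    """
    specific_key = f"{name}_for_version_{version}"
    if specific_key in parameters:
        return parameters[specific_key]
    return parameters.get(f"{name}_default")


# Illustrative values, not taken from a real model configuration.
parameters = {
    "threshold_default": "{ 'value': 0.5 }",
    "threshold_for_version_2": "{ 'value': 0.7 }",
}

print(resolve_versioned_parameter(parameters, "threshold", 1))
print(resolve_versioned_parameter(parameters, "threshold", 2))
```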
A special case of such version-specific properties stored as JSON-like strings are the inference_tile_properties, which can be used to propose those TileProcessorProperties required for page-wise processing with ProcessTiles or ApplyTileProcessorPageWise (and also to some degree non-page-wise processing with ApplyTileProcessor).
Note that version-specific property values are always expected to be stored as JSON-compatible strings. A parser problem will result in an error for the special case of inference_tile_properties, or in a parameter info value "error": "<error information>" for any other version-specific property.
This is a working example:
parameters: {
  key: "inference_tile_properties_default"
  value: {
    string_value: "{ 'inputs': { 'input': { 'dataType': 'float32', 'dimensions': 'X, Y, CHANNEL1, BATCH', 'externalDimensionForChannel1': 'C', 'externalDimensionForChannel2': 'U', 'fillMode': 'Reflect', 'fillValue': 0.0, 'padding': [8, 8, 0, 0, 0, 0] } }, 'outputs': { 'scores': { 'dataType': 'float32', 'referenceInput': 'input', 'stride': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0], 'tileSize': [44, 44, 2, 1, 1, 1], 'tileSizeMinimum': [2, 2, 2, 1, 1, 1], 'tileSizeOffset': [2, 2, 0, 1, 0, 0] } } }"
  }
}
parameters: {
  key: "inference_tile_properties_for_version_2"
  value: {
    string_value: "{ 'inputs': { 'input': { 'dataType': 'float32', 'dimensions': 'X, Y, CHANNEL1, BATCH', 'externalDimensionForChannel1': 'C', 'externalDimensionForChannel2': 'U', 'fillMode': 'Reflect', 'fillValue': 0.0, 'padding': [8, 8, 0, 0, 0, 0] } }, 'outputs': { 'scores': { 'dataType': 'float32', 'referenceInput': 'input', 'stride': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0], 'tileSize': [46, 46, 2, 1, 1, 1], 'tileSizeMinimum': [2, 2, 2, 1, 1, 1], 'tileSizeOffset': [2, 2, 0, 1, 0, 0] } } }"
  }
}
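Writing such long single-quoted strings by hand is error-prone, so it can help to generate the entry from a Python dictionary. The following is a minimal sketch under the assumption that no string in the dictionary contains a quote character; make_parameter_entry is a hypothetical helper, and the shortened properties dictionary is for illustration only.

```python
import json


def make_parameter_entry(key, properties):
    """Hypothetical helper: render one 'parameters' entry for the model config.

    Assumption: no string in 'properties' contains a quote character, so
    swapping double quotes for single quotes is safe.
    """
    encoded = json.dumps(properties).replace('"', "'")
    return (
        "parameters: {\n"
        f'  key: "{key}"\n'
        "  value: {\n"
        f'    string_value: "{encoded}"\n'
        "  }\n"
        "}"
    )


# Shortened, illustrative properties; a real entry needs the full set of keys.
properties = {
    "outputs": {
        "scores": {"dataType": "float32", "tileSize": [44, 44, 2, 1, 1, 1]}
    }
}
entry = make_parameter_entry("inference_tile_properties_default", properties)
print(entry)
```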
Tips¶
- You can probe which models are available on the server (added as combo box items and thus enabling model ID auto complete) by pressing the tool button on the right side of the Model ID field.
- If supported by your server and client setup, use gRPC for better performance.
- When requesting multiple pages at the same time (e.g. by using an ImageCache at the output), inference requests will be issued in parallel to increase performance (depending on your MeVisLab preferences setup for the number of ML processing threads).
Windows¶
Default Panel¶
Output Fields¶
outCppTileProcessor¶
- name: outCppTileProcessor, type: PythonTileProcessorWrapper(MLBase), deprecated name: outCppTileClassifier¶
 Provides a C++ TileProcessor wrapping the Python object communicating with the Triton backend. Usually to be connected to ProcessTiles or ApplyTileProcessor.
Parameter Fields¶
Visible Fields¶
Server URL¶
- name: inServerUrl, type: String, default: $(FME_TRITONSERVER_URL), deprecated name: inServer¶
 URL at which the Triton inference server is to be contacted (e.g. grpc://localhost:8500 or http://localhost:8501). The prefix (grpc://, grpcs:// or http://) determines the protocol to be used. You can use MDL variables here, e.g. ‘$(MY_TRITON_SERVER_URL)’, defined for example in mevislab.prefs.
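To illustrate how a URL prefix can select the protocol, here is a small sketch using Python's standard urllib.parse. The helper split_server_url and the mapping table are assumptions for illustration only and do not reflect the module's actual implementation.

```python
from urllib.parse import urlsplit

# Assumption: these prefixes map to protocol and TLS usage as shown.
SCHEMES = {
    "http": ("http", False),
    "https": ("http", True),
    "grpc": ("grpc", False),
    "grpcs": ("grpc", True),
}


def split_server_url(url):
    """Hypothetical helper: return (protocol, use_tls, 'host:port')."""
    parts = urlsplit(url)
    if parts.scheme not in SCHEMES:
        raise ValueError(f"Unsupported URL prefix: {parts.scheme!r}")
    protocol, use_tls = SCHEMES[parts.scheme]
    return protocol, use_tls, parts.netloc


print(split_server_url("grpc://localhost:8001"))
```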
Expanded URL¶
- name: outExpandedServerUrl, type: String, persistent: no¶
 Expanded version of the Server URL.
Default Timeout [ms]¶
- name: inDefaultTimeout_ms, type: Integer, default: 200, minimum: -1¶
 Default server connection timeout (in milliseconds), used for connection attempts, not for predictions.
Prediction Timeout [ms]¶
- name: inPredictionTimeout_ms, type: Integer, default: 30000, minimum: -1, deprecated name: inTimeout_ms¶
 Maximum reaction time in milliseconds for the server before a timeout occurs.
Max. Tries per Batch¶
- name: inNumPredictionTriesPerBatch, type: Integer, default: 3¶
 Maximum number of tries per batch until an error is reported (and processing is likely aborted). Trying more than once can be useful if your connection is flaky and occasional timeouts occur, or if the server’s resources are occasionally exhausted, e.g. due to multi-threading pushing many requests at the same time. Note that an unavailable server should normally be detected on update already (not only when tiles are requested). Still, increasing this number prolongs the time until you notice that your server has gone down during processing.
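The general idea of a bounded number of tries per batch can be sketched as below. This is not the module's code: predict_with_retries and flaky_predict are hypothetical, and the sketch assumes that a failed attempt surfaces as a TimeoutError.

```python
def predict_with_retries(predict, batch, max_tries=3):
    """Hypothetical retry loop: call predict(batch) up to max_tries times.

    Assumption: a failed attempt raises TimeoutError. Once all tries are
    used up, a RuntimeError chained to the last timeout is raised.
    """
    last_error = None
    for _ in range(max_tries):
        try:
            return predict(batch)
        except TimeoutError as error:
            last_error = error
    raise RuntimeError(f"Prediction failed after {max_tries} tries") from last_error


attempts = []


def flaky_predict(batch):
    """Simulated server call that times out on the first two attempts."""
    attempts.append(batch)
    if len(attempts) < 3:
        raise TimeoutError("simulated timeout")
    return [value * 2 for value in batch]


result = predict_with_retries(flaky_predict, [1, 2, 3])
print(result, len(attempts))
```

The trade-off described above is visible here: with a server that never answers, every additional try adds one full timeout before the error is reported.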
Model ID¶
- name: inModelId, type: String, deprecated name: inModelName¶
 Model ID to load on the server, including potential namespaces/hierarchies separated by ‘.’ (e.g. fme.MyModel.sagittal).
Server Info¶
- name: outServerInfo, type: String, persistent: no¶
 Information about the server as provided by itself.
Version¶
- name: inModelVersionSelectionMode, type: Enum, default: Latest¶
 Version selection mode
Values:
| Title | Name | Description |
|---|---|---|
| Latest | Latest | Choose latest model (highest version number) available |
| Custom | Custom | Select custom model version |
Custom Version¶
- name: inCustomModelVersion, type: Integer, default: 1, minimum: 1¶
 Custom model version (integer > 0).
Used Model Version¶
- name: outUsedModelVersion, type: Integer, persistent: no¶
 Indicates the model version actually used.
Disable batch-level multithreading¶
- name: inDisableBatchLevelMultithreading, type: Bool, default: FALSE¶
 Prevent attempting to process multiple batches in parallel, e.g. to save network bandwidth or server memory/compute resources.
If unset, multi-batch inference operations (e.g. using ProcessTiles) may use up to the number of threads specified in the MeVisLab preferences as “Maximum Threads Used for Image Processing”, which can speed up inference substantially, especially for smaller batches. Otherwise, all batches will be processed sequentially.
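The difference between the two modes can be pictured with a thread pool, as in the following sketch. It is an illustration only: process_batches, its parameters, and the stand-in predict function are hypothetical, and the module's actual scheduling is not documented here.

```python
from concurrent.futures import ThreadPoolExecutor


def process_batches(predict, batches, max_threads, disable_multithreading=False):
    """Hypothetical dispatcher: run predict on each batch, keeping input order."""
    if disable_multithreading or max_threads <= 1:
        # Sequential mode: one request at a time.
        return [predict(batch) for batch in batches]
    with ThreadPoolExecutor(max_workers=max_threads) as executor:
        # executor.map yields results in the order of the inputs.
        return list(executor.map(predict, batches))


def predict(batch):
    """Stand-in for a remote inference call."""
    return sum(batch)


batches = [[1, 2], [3, 4], [5, 6]]
parallel = process_batches(predict, batches, max_threads=4)
sequential = process_batches(predict, batches, max_threads=4, disable_multithreading=True)
print(parallel == sequential)
```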
Update¶
- name: update, type: Trigger¶
 Initiates update of all output field values.
Clear¶
- name: clear, type: Trigger¶
 Clears all output field values to a clean initial state.
On Input Change Behavior¶
- name: onInputChangeBehavior, type: Enum, default: Clear, deprecated name: shouldAutoUpdate,shouldUpdateAutomatically¶
 Declares how the module should react if a value of an input field changes.
Values:
| Title | Name | Deprecated Name |
|---|---|---|
| Update | Update | TRUE |
| Clear | Clear | FALSE |
[]¶
- name: updateDone, type: Trigger, persistent: no¶
 Notifies that an update was performed (Check status interface fields to identify success or failure).
Has Valid Output¶
- name: hasValidOutput, type: Bool, persistent: no¶
 Indicates validity of output field values (success of computation).
Status Code¶
- name: statusCode, type: Enum, persistent: no¶
 Reflects module’s status (successful or failed computations) as one of some predefined enumeration values.
Values:
| Title | Name |
|---|---|
| Ok | Ok |
| Invalid input object | Invalid input object |
| Invalid input parameter | Invalid input parameter |
| Internal error | Internal error |
Status Message¶
- name: statusMessage, type: String, persistent: no¶
 Gives additional, detailed information about status code as human-readable message.