CategoricalEncoding¶
MLModule
author Felix Thielke
package FMEstable/ReleaseMeVis
dll MLCategoricalEncoding
definition MLCategoricalEncoding.def
see also IntervalMap, Softmax
keywords one-hot, multi-hot, bitmask, label, vector, categorical
Purpose¶
Converts categorical labels back and forth between one-/multi-hot encodings, integer labels, and bitmasks.
Usage¶
Connect an image that represents segmentation masks in one of the following formats to convert it into one of the other formats:
* Integer labels in the range 0..n-1 for n structures (label 0 often represents background, but not necessarily)
* OneHot representation with n channels per voxel, one of which contains a 1 to mark the segmented structure, the others being 0
* OneHotWithoutBG, a variant of OneHot for the common case that label 0 is the background and is not part of the channels
* MultiHot representation with n possibly overlapping channels (any number of the n channels, from none to all, may be set to 1; usually also without a background channel)
* Bitmask representing the channels with bits
Details¶
Integer labels always start at zero. Vectorial representations always have length Number Of Channels Or Bits. Bitmasks use one bit per channel / label, starting with label 0 at the lowest bit. CategoricalEncoding automatically tries to choose the smallest unsigned integer type that can represent the output bitmask.
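The bit layout described above can be sketched per voxel in plain Python. This is only an illustration of the documented semantics, not the module's implementation: the function names are hypothetical, and the set of candidate widths (8, 16, 32, 64 bits) is an assumption, since the text only states that the smallest unsigned integer is chosen.

```python
def multi_hot_to_bitmask(vector):
    """Pack a multi-hot vector into an integer, entry 0 at the lowest bit."""
    mask = 0
    for index, entry in enumerate(vector):
        if entry:
            mask |= 1 << index
    return mask


def bitmask_to_multi_hot(mask, number_of_bits):
    """Unpack the lowest number_of_bits bits into a multi-hot vector."""
    return [(mask >> index) & 1 for index in range(number_of_bits)]


def smallest_unsigned_width(number_of_bits):
    """Smallest width in bits that holds number_of_bits bits.

    Assumption: the candidate unsigned types are 8, 16, 32 and 64 bits wide.
    """
    for width in (8, 16, 32, 64):
        if number_of_bits <= width:
            return width
    raise ValueError("more than 64 bits requested")
```

With these helpers, `multi_hot_to_bitmask([1, 1, 0, 1])` gives 11 and `bitmask_to_multi_hot(5, 4)` gives `[1, 0, 1, 0]`, matching the examples listed for the Mode field.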
Parameter Fields¶
Visible Fields¶
Mode¶
name: mode, type: Enum, default: OneHotToInteger
Source and target categorical encodings
Values:
Title | Name | Description |
---|---|---|
One-Hot Vector to Integer Label | OneHotToInteger | Convert vectors like (0, 0, 1, 0) to integers (here 2) |
Integer Label to One-Hot Vector | IntegerToOneHot | Convert integers n to vectors with the n’th entry set (example: 1 -> (0, 1, 0)) |
One-Hot Vector without Background to Integer Label | OneHotWithoutBGToInteger | Convert vectors to integers, adding a background label where all entries are 0 (examples: (0, 1) -> 2; (0, 0) -> 0) |
Integer Label to One-Hot Vector without Background | IntegerToOneHotWithoutBG | Convert integers n to vectors with the n-1’th entry set (example: 1 -> (1, 0, 0)) |
Multi-Hot Vector to Bitmask | MultiHotToBitmask | Convert vectorial representation with potentially multiple categories to bitwise representation (example: (1, 1, 0, 1) -> 0b00001011 = 11 decimal) |
Bitmask to Multi-Hot Vector | BitmaskToMultiHot | Convert bitwise representation of potentially multiple categories into vectorial (example: 5 = 0b0101 -> (1, 0, 1, 0)) |
Integer Label to Bitmask | IntegerToBitmask | Convert a single integer category to a bitmask, mapping only non-zero integers to set bits (for example, 2 -> 2^1 = 2 = 0b0010, but 0 -> 0). This is semantically similar to IntegerToOneHotWithoutBG, just using bits instead of channels. |
Bitmask to Integer Label | BitmaskToInteger | Convert a bitmask to the single integer category of its highest set bit (example: 0b0101 -> 3). This mode inverts the IntegerToBitmask mapping. A bitmask may, however, have multiple bits set, which cannot be represented with a single integer label. In those cases, as the example shows, the highest bit / label takes precedence. |
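The single-label modes in the table can be written out per voxel as a minimal Python sketch. The function names are hypothetical and the code only mirrors the examples given in the table; how the module treats invalid input (for example a one-hot vector with no entry or several entries set) is not stated here, so the sketch makes no claim about it.

```python
def integer_to_one_hot(label, n):
    """Label k sets entry k (zero-based) of an n-entry vector."""
    return [1 if i == label else 0 for i in range(n)]


def one_hot_to_integer(vector):
    """Index of the entry that is set; assumes exactly one entry is 1."""
    return vector.index(1)


def integer_to_one_hot_without_bg(label, n):
    """Label k > 0 sets entry k-1; label 0 (background) sets nothing."""
    return [1 if i == label - 1 else 0 for i in range(n)]


def one_hot_without_bg_to_integer(vector):
    """All-zero vectors map to background label 0."""
    return vector.index(1) + 1 if 1 in vector else 0


def integer_to_bitmask(label):
    """Label k > 0 sets bit k-1; label 0 maps to an empty mask."""
    return 0 if label == 0 else 1 << (label - 1)


def bitmask_to_integer(mask):
    """Label of the highest set bit; an empty mask maps to 0."""
    return mask.bit_length()
```

For instance, `integer_to_bitmask(2)` yields `0b0010` and `bitmask_to_integer(0b0101)` yields 3, because the highest set bit takes precedence.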
Vector Dimension¶
name: vectorDimension, type: Enum, default: c
Dimension along which one-hot or multi-hot vectors are encoded
Number Of Channels Or Bits¶
name: numberOfChannelsOrBits, type: Integer, default: 2, minimum: 1, maximum: 4.29497e+09, deprecated name: numberOfChannels
Number of channels used in the vectorial or bitmask representation.
In the IntegerToOneHot / OneHotToInteger modes, this equals the number of labels n, with integer labels ranging from 0 to n-1. In modes involving OneHotWithoutBG, this excludes the background, so it equals the number of non-zero (foreground) labels. (The rest of the modes do not need clarification, since there is no “background” involved anyhow.)
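As a small illustration of the rule above, the following Python sketch derives the field value from the total number of integer labels. The helper name is hypothetical, and it deliberately covers only the four one-hot modes for which the relation is spelled out.

```python
def number_of_channels_or_bits(mode, number_of_labels):
    """Field value for a label set 0..number_of_labels-1 (label 0 included)."""
    if mode in ("OneHotToInteger", "IntegerToOneHot"):
        # One channel per label, background included.
        return number_of_labels
    if mode in ("OneHotWithoutBGToInteger", "IntegerToOneHotWithoutBG"):
        # Background label 0 gets no channel of its own.
        return number_of_labels - 1
    raise ValueError("relation not spelled out for mode " + mode)
```

A segmentation with background plus three structures (labels 0 to 3) would therefore use 4 channels with IntegerToOneHot and 3 channels with IntegerToOneHotWithoutBG.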
Infer from static max¶
name: numberOfChannelsOrBitsFromMax, type: Bool, default: FALSE
If input0 has a reliable static max, infer the number of labels from it.
Caution: If the static max varies between images (e.g., because the maximum label does not appear in all images), the categorical vector size may change at runtime if this flag is set.
If the source image is a bitmask, but the static max is not (2^n-1) for an integer n, an error will be posted and the output will become invalid. (The IntegerToBitmask mode will set an appropriate static max, although it is larger than necessary, which is not only convenient for back conversion but also matches the MultiHotToBitmask static max.)
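The check on bitmask inputs can be sketched in Python as follows. The function names are hypothetical, raising `ValueError` merely stands in for the error the module posts, and rejecting a static max of 0 is an assumption based on the minimum of 1 for Number Of Channels Or Bits.

```python
def bits_from_bitmask_static_max(static_max):
    """Return n if static_max == 2**n - 1 with n >= 1, otherwise raise."""
    n = static_max.bit_length()
    if static_max < 1 or static_max != (1 << n) - 1:
        raise ValueError("static max %d is not of the form 2^n - 1" % static_max)
    return n


def bitmask_static_max(number_of_bits):
    """Largest value a bitmask with number_of_bits bits can take."""
    return (1 << number_of_bits) - 1
```

A static max of 15 would thus give 4 bits, while a static max of 5 is rejected. Setting the output static max to `bitmask_static_max(n)` is what makes the back conversion with this flag work, even when fewer bits are actually used.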