Functional Description of DirectDicomImport

DirectDicomImport is a MeVisLab module whose primary purpose is the import of DICOM and non-DICOM (image) objects typically located in a file system. These are grouped, sorted, split, and composed under constraints required by the following processing steps.

This document describes the module's functionality; the usage and parameter documentation is located here.

1. Used Terms

The following terms are needed to describe DirectDicomImport:

  • Voxel

    An up to 6-dimensional pixel in an ML image.

  • Frame

    Usually a two-dimensional DICOM image.

  • Image

    A DICOM or non-DICOM object with a two- or higher-dimensional extent (including temporal or higher extents).

  • Volume

    A number of DICOM or non-DICOM images or objects belonging together, usually managed in a MultiFileVolume data structure.

  • MultiFileVolume

    A data structure which describes the inner structure of a Volume. It is usually created by an import process of DirectDicomImport and typically stores a number of file references and properties, as well as their spatial and temporal organization. It is a lightweight data structure which does not store the imported data files themselves, but references them at their locations in the file system or their URLs. So volumes can be transferred easily between application components or even be stored or cached without duplicating the original data. MultiFileVolumes and -Lists are supported by several MeVisLab modules.

2. General Notes

It is crucial to understand that images and DICOM objects must fulfill some requirements before they can be used for medical image or other data processing in MeVisLab. This includes, for example, that distances between neighboring voxels should be the same within the entire image, because otherwise measurements on such images would be incorrect or at least very difficult to understand.

Also, frames displayed as one image heap or sequence should be from the same patient, the same DICOM study, and in most cases also from the same DICOM series. When retrieving DICOM objects from the file system or a clinical PACS, however, these requirements are not guaranteed. Files may be missing, duplicated, in the wrong order, from different studies or cases, or corrupted due to failures during data transmission.

Even if a set of DICOM objects or files is perfectly ordered and fully correct, it may not be directly usable in MeVisLab. An example is a fan of two-dimensional images acquired with an ultrasound device. Such images are not equidistant and have different orientations. Displaying them as a simple spatially ordered heap of images would give the user a false impression.

The role of DirectDicomImport is to reduce the risk of incorrectly mixed or composed images as much as possible before feeding them into subsequent processing steps. It also provides many settings which allow the specification of tolerances and DICOM tags which are then checked and applied during the import process. That way users or applications control the precision and strictness of the import and composition process such that they can rely on the imported volumes according to their own definitions.

If the data does not fulfill the requirements completely or partially, it shall either be refused by DirectDicomImport, or at least be annotated with warnings, errors or hints which describe the limitations.

3. The Import and Processing Pipeline

DirectDicomImport performs the following processing steps:

  1. Collection and filtering of file sets to import

    1. Distinguishing between DICOM and non-DICOM files
    2. Loading and caching all DICOM files into DCMTree::Tree data structures
    3. Decomposition of enhanced multi-frame files to single frames
  2. Grouping, Splitting and Sorting

    1. Grouping and Splitting
    2. Sorting

    Putting objects together which may belong to each other and separating those ones not belonging together.

  3. Composing

    1. Composing image and non-image volumes from DICOM files
    2. Composing images from non-DICOM files
    3. Calculating ML Image Properties
  4. Providing Results

    The most important output of DirectDicomImport is a list of MultiFileVolumes which can be stored or fed into other modules, as well as the (overlay) image of a currently selected volume.

3.1. Collection and Filtering File Sets

File sets are created from different locations:

Each file set, as well as the explicit file list, is handled as one import step. Files from different file sets are never composed into the same MultiFileVolume.

It can be configured whether directories are scanned recursively (the default, see DirectDicomImport.dplScanRecursively) or not.

3.1.a. Distinguishing between DICOM and non-DICOM Objects

In MeVisLab, a number of different objects need to be imported. In medical contexts these are usually DICOM files; however, additional image information in normal or converted image file formats (such as tiff, png, analyze, .mlimage, histology image formats etc.) is also needed. For this purpose DirectDicomImport has a number of backends supporting DICOM and non-DICOM files.

To distinguish them, two suffix lists are provided:

This skips unnecessary tests for DICOM content on files declared as non-DICOM, which would otherwise slow down the import process significantly if many such files appear. Since DICOM files do not have a specific file suffix, all files not explicitly declared as non-DICOM must be checked for correct content.
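
The following Python sketch illustrates the idea of deciding by suffix first and testing content only when necessary. It is not the module's implementation: the suffix set and the function names are hypothetical, and the content test relies only on the DICOM Part 10 file layout (a 128-byte preamble followed by the four bytes DICM), which may be stricter or looser than what DirectDicomImport actually checks.

```python
import os

# Hypothetical suffix list for this sketch, not the module's real parameter values.
NON_DICOM_SUFFIXES = {".tif", ".tiff", ".png", ".mlimage"}


def read_header(path):
    """Read the first 132 bytes (128-byte preamble plus 4-byte prefix)."""
    with open(path, "rb") as f:
        return f.read(132)


def has_dicom_prefix(header):
    """Return True if the bytes at offset 128 are the Part 10 prefix 'DICM'."""
    return len(header) >= 132 and header[128:132] == b"DICM"


def classify(path, header_reader=read_header):
    """Return 'non-dicom', 'dicom' or 'unknown' for a file path."""
    suffix = os.path.splitext(path)[1].lower()
    if suffix in NON_DICOM_SUFFIXES:
        # Decided by suffix alone; the file is never opened.
        return "non-dicom"
    # No suffix decision possible, so the content must be inspected.
    return "dicom" if has_dicom_prefix(header_reader(path)) else "unknown"
```

Passing the reader as a parameter keeps the sketch easy to try without real files; in practice `classify(path)` would read from disk.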

3.1.b. Loading and Caching


Before files are fed into the load and import process, an optional filtering pipeline can be applied. This can be used, for example, to label files and/or to assign dedicated tolerances or DPL Configurations (see below). Such filters can be connected on a network level to the DirectDicomImport.inputFilterPlugin connector.

After filtering, DICOM and non-DICOM files are handled differently:

  • DICOM: These objects are loaded as DICOM tags organized in tree-like data structures. If multiple files with the same SOP Instance UID (0008,0018) value are found, only one of them is imported; all others are rejected with warnings, because identical values indicate identical contents, file copies, or incorrectly created files.

    DICOM files are then loaded on the file system level with the third-party library dcmtk. To establish independence of this specific library, all loaded tags are converted to DCMTree::Tag or DCMTree::Tree objects. They are created by the MeVisLab-owned library DicomTree with its dcmtk-backend DicomTree_OFFIS. DicomTree objects are used by most DICOM modules on the C++ programming level, which in this way are independent of dcmtk.

  • Non-DICOM: Other image formats imported by DirectDicomImport are implemented in the form of several loaders (ImageLoad, MLImageFormatLoad, itkImageFileReader, HistoLoad, etc.). They are not parts of the DirectDicomImport module itself, but are backends which are used on demand.
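
The handling of duplicate SOP Instance UIDs described for DICOM objects can be sketched as follows. This is a minimal illustration only: the function name and the input layout are hypothetical, and keeping the first file encountered is an assumption, since the text above only states that one of the duplicates is imported.

```python
def reject_duplicates(files):
    """Keep one file per SOP Instance UID and warn about the others.

    `files` is an iterable of (path, sop_instance_uid) pairs.
    Returns (accepted_paths, warnings).
    """
    first_path_for_uid = {}
    accepted, warnings = [], []
    for path, uid in files:
        if uid in first_path_for_uid:
            # Same UID means same content, a file copy, or a badly created file.
            warnings.append(
                f"{path}: rejected, SOP Instance UID {uid} "
                f"already imported from {first_path_for_uid[uid]}"
            )
        else:
            first_path_for_uid[uid] = path
            accepted.append(path)
    return accepted, warnings
```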


Since large sets of files may be imported, loaded DICOM objects are stored as DCMTree::Tags and DCMTree::Trees in a tag cache and a tree cache, respectively. This significantly reduces memory usage, prevents unnecessary reloading of files, and speeds up the import process if configured appropriately.

Cache sizes are defined application-wide (DirectDicomImport.maxTreeCachedMBs, DirectDicomImport.maxTagCachedMBs). Developers can configure them to optimize the trade-off between memory usage and import speed.
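
To illustrate the trade-off behind such a size limit, the following sketch shows a cache bounded by a byte budget. It is not the module's cache: the class name is hypothetical, and the least-recently-used eviction policy is an assumption, since the eviction strategy of the tag and tree caches is not described here.

```python
from collections import OrderedDict


class ByteBudgetCache:
    """Cache that evicts least recently used entries above a byte budget."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # key -> (value, size_in_bytes)

    def get(self, key):
        """Return the cached value or None, marking the entry as recently used."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        self._entries.move_to_end(key)
        return entry[0]

    def put(self, key, value, size):
        """Store a value of the given size, evicting old entries if needed."""
        if key in self._entries:
            self.used_bytes -= self._entries.pop(key)[1]
        if size > self.max_bytes:
            return  # larger than the whole budget, never cached
        self._entries[key] = (value, size)
        self.used_bytes += size
        while self.used_bytes > self.max_bytes:
            # Oldest entry first; the new entry fits, so it is never evicted here.
            _, (_, evicted_size) = self._entries.popitem(last=False)
            self.used_bytes -= evicted_size
```

A larger budget avoids reloading files at the cost of memory; a smaller one does the opposite.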

3.1.c. Decomposition of Enhanced Multi-frame Files

DICOM allows so-called enhanced multi-frame files containing a number of frames which belong to different volumes. Their frames may not be ordered in the same way as they should be displayed. Also, frames belonging to the same displayed volume can be distributed over multiple files. In order to sort and display these frames correctly, they are extracted from the enhanced multi-frame files and fed into the import process as if they were single frames. This option is enabled and applied by default (DirectDicomImport.decomposeMultiFrameFiles).

Although this can be disabled by the user/application (which can speed up the import process significantly), this should only be done by experienced users and only if the content of the imported files is guaranteed to be in the expected order.

Other “classic” multi-frame files have a defined order of frames and they are never split nor composed; they are imported “en-bloc” as one file per MultiFileVolume.

3.2. Grouping, Splitting, and Sorting

Grouping and Splitting

After loading, caching and possibly decomposing enhanced multi-frame files, objects which belong together must be put together, and objects which do not must be put into different MultiFileVolumes.

For this purpose the meaning of “belonging together” must be defined. This includes a number of assumptions which are listed here, without being exhaustive:

  • DICOM objects or image (frames)

    • shall be from the same modality, manufacturer, station, protocol, patient, the same study, acquired on the same device, etc.
    • in case of image frames
  • Non-DICOM images

    Non-DICOM objects are not grouped or split, however, the composition step described below only puts images together which have comparable voxel types and image extents.


DICOM Objects

Especially images must be ordered correctly before being displayed or processed. This includes spatial as well as temporal ordering of frames.

Spatial ordering is usually performed using Image Position Patient (0020,0032) tags from DICOM meta data. In a few cases where DICOM images do not have a specific location in space (such as secondary captures), less reliable information such as instance numbers is used instead, if available.

The Grouping, Splitting, and Sorting process is performed by using the (MeVisLab-owned) Dicom Processor Library (DPL).

So-called “DPL Configurations” can be defined which describe all the grouping, splitting, and sorting requirements in a formal list (DirectDicomImport.dplConfigString0 - 3). Additionally, a set of limits and tolerances is defined either as part of the DPL configurations or as explicit parameters (see DICOM Import Options -> Settings, e.g. DirectDicomImport.relativeDistanceTolerance).

Based on these settings, all frames are grouped, split, ordered and put into sets which are fed into the following compose step.
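
The principle of grouping by tags and then sorting spatially can be sketched in a few lines of Python. This does not reproduce the DPL or its configuration syntax: the frame dictionaries, the function names and the chosen grouping tags are hypothetical. Sorting by the projection of Image Position Patient onto the slice normal is a common technique assumed here for illustration, with instance numbers as the less reliable fallback mentioned above.

```python
from collections import defaultdict


def cross(a, b):
    """Cross product of two 3D vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def slice_position(frame):
    """Project the frame position onto the normal of its image plane."""
    orientation = frame["ImageOrientationPatient"]
    normal = cross(orientation[:3], orientation[3:])
    return sum(p * n for p, n in zip(frame["ImagePositionPatient"], normal))


def sort_group(frames):
    """Sort spatially if every frame has a position, else by instance number."""
    has_geometry = all(
        "ImagePositionPatient" in f and "ImageOrientationPatient" in f
        for f in frames
    )
    if has_geometry:
        return sorted(frames, key=slice_position)
    return sorted(frames, key=lambda f: f.get("InstanceNumber", 0))


def group_and_sort(frames, group_tags=("PatientID", "StudyInstanceUID",
                                       "SeriesInstanceUID")):
    """Group frames whose grouping tags all match, then sort each group."""
    groups = defaultdict(list)
    for frame in frames:
        key = tuple(frame.get(tag) for tag in group_tags)
        groups[key].append(frame)
    return {key: sort_group(members) for key, members in groups.items()}
```

A real configuration would also apply tolerances, for example on slice distances, before accepting a group as one volume.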


In order to guarantee consistent display of image data in 3D, reverse frame scans may also be stored in reverse order. In the MultiFileVolume, the World Matrix (see under Calculating ML Image Properties) then defines a Voxel-to-World mapping with a left-handed homogeneous matrix.

Non-DICOM images

Non-DICOM objects are taken in the order in which they arrive from the loading process.

3.3. Composing

File sets which have been created or have passed the grouping, splitting and sorting process have a spatial and perhaps also a temporal ordering and extent. Each grouped and not split set of files is converted to a MultiFileVolume in which the files are stored by reference to prevent duplication. If image files are composed as a group, they are considered as a larger image consisting of its referenced files. In this case an ImageProperties data structure is also calculated and stored in the MultiFileVolume to form one large image volume.

a) Composing DICOM objects

After grouping, splitting and sorting, also the DICOM objects are considered “belonging to each other” and fulfilling all defined requirements. To prevent information loss, all DICOM tags are stored together in a MultiFileVolume as a so-called DCMTree::Tree. This also eliminates the need to reopen all source files when the DICOM information is needed, stored or shall be modified. This, however, can be quite a large amount of data which needs to be reduced as much as possible to be manageable:

  • Pixel data tags are not stored in the DCMTree::Tree since the pixel data is already available as image data.
  • If multiple frames are composed, usually there are only minor differences between their tags. Since DCMTree::Tree has a tree-like structure, all tags which are identical in all frames are stored on the top level (root) of the tree. Frame-specific tags are stored in a private tag with creator MeVis StructuredMF data and an entry with the frame-specific tags of each frame. That way a so-called “Structured DICOM Tree” is composed, stored and also referenced in the MultiFileVolume.
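
The separation into shared and frame-specific tags can be illustrated with plain dictionaries. This sketch does not use the DCMTree API and does not reproduce the layout of the private tag; the function name and the result keys `common` and `per_frame` are hypothetical.

```python
PIXEL_DATA = "(7FE0,0010)"  # Pixel Data tag, left out of the stored tree


def build_structured_tree(frame_tags):
    """Split per-frame tag dicts into shared tags and frame-specific tags.

    `frame_tags` is a list with one {tag: value} dict per frame.
    """
    if not frame_tags:
        return {"common": {}, "per_frame": []}
    first, others = frame_tags[0], frame_tags[1:]
    # A tag is shared only if every frame has it with an identical value.
    common = {
        tag: value
        for tag, value in first.items()
        if tag != PIXEL_DATA
        and all(tag in tags and tags[tag] == value for tags in others)
    }
    # Everything else stays with its frame, again without the pixel data.
    per_frame = [
        {tag: value for tag, value in tags.items()
         if tag not in common and tag != PIXEL_DATA}
        for tags in frame_tags
    ]
    return {"common": common, "per_frame": per_frame}
```

For a series of several hundred slices, most tags end up in the shared part once, and only a few entries such as positions remain per frame.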

b) Non-DICOM Images

Imported image files which have not been recognized as DICOM objects, usually have no meta data allowing a more sophisticated composition. Thus there are only two ways to compose them (DirectDicomImport.composeOtherFilesMode):

  • each image is provided as one MultiFileVolume, which is the default.
  • in heap mode, images with the same two- or three-dimensional extents are composed (in the order of their appearance in the file list) into three-dimensional or temporally extended MultiFileVolumes, respectively.
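
The heap mode can be sketched as follows. The function name and the returned dictionary layout are hypothetical, and the sketch assumes that all images with equal extents are collected into one volume even if they are not adjacent in the file list; the module may handle that case differently.

```python
def compose_heaps(images):
    """Stack images with equal extents, keeping file-list order.

    `images` is a list of (file_name, extent) pairs, where extent is
    (x, y) for 2D images or (x, y, z) for 3D images.
    """
    heaps = {}  # dicts keep insertion order, so first appearance decides order
    for name, extent in images:
        heaps.setdefault(tuple(extent), []).append(name)
    volumes = []
    for extent, names in heaps.items():
        volumes.append({
            "files": names,
            "source_extent": extent,
            # 2D images are stacked along Z, 3D images along the time axis.
            "stacked_along": "z" if len(extent) == 2 else "t",
        })
    return volumes
```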

c) Calculating ML Image Properties

Properties of images of the MeVis Image Processing Library (ML) contain:

  • Type of the Voxel Data

    The data type of voxels is calculated from the value representation of the tag containing the pixel data (OB (Other Byte String), OW (Other Word String), OF (Other Float String), OD (Other Double String)), the number of valid bits (Bits Stored (0028,0101) tag) and the Pixel Representation (0028,0103) tag (defining the sign).

    In special cases (such as deformation field or spectroscopy data), pixel types of 3D float vectors or complex floats are used.

    Single bit pixels of segmentation data are converted to 8 bit unsigned integer pixels with value 0 or 1.


    Rescale Intercept (0028,1052), Rescale Slope (0028,1053), Dose Grid Scaling (3004,000E), Photometric Interpretation (0028,0004), and any look-up table values are not applied to pixels or the type of Voxel Data, since their correct display depends on application, display, color model and calculation properties.

    To prevent data from being unintentionally displayed incorrectly (or incompletely), MultiFileVolumes with such tags are marked with an issue. It points out that dedicated modules exist for this purpose, e.g. ApplyDicomPixelModifiers or DicomRescale. They also allow further settings such as the selection of appropriate color channel modifications or pixel types for the output voxels.

  • Image Extents

    • X, Y

      Slice extents (Y- and X- dimension) are derived from the Rows (0028,0010) and Columns (0028,0011) tags, i.e. from the extent of the pixel matrix of the corresponding frame.

    • Z

      The Z-extent results from the number of composed frames in the MultiFileVolume. Note that this does not necessarily match the Number Of Frames (0028,0008) tag, which denotes the number of frames in a file.

    • C

      The extent of the color (C-) dimension is usually 1 if the number of Samples Per Pixel (0028,0002) is 1 or only one color or intensity component per pixel exists. Images with three samples per pixel are converted to three color channels whose content is translated from the Photometric Interpretation (0028,0004) tag. As denoted above, color channel conversions or LUT specifications do not influence the C- dimension extent.

    • T

      The extent in the temporal (T-) dimension is derived from the Number Of Temporal Positions (0020,0105) tag if available.

      A T-dimension extent larger than one may also appear if the composition process detects multiple series or images in the same case, with matching X-, Y-, Z-, and C-extents and if no separating feature exists. This behavior, however, can be controlled with DirectDicomImport.decomposeTo3DVolumes.

    • U

      MultiFileVolumes with U-Dimension extents are not created by DirectDicomImport.

  • World Matrix

    The World Matrix of the ML Image Properties is a 4x4 homogeneous matrix created from scaling (voxel size), image orientation (rotation and shearing) and translation (image position):

    • Voxel Size

      The X- and Y- extent of a voxel is derived from the Pixel Spacing (0028,0030) tag if available. If that one does not exist, a number of other tags are evaluated, which, however, do not necessarily describe the real physical extent of a voxel in the object. In those cases the MultiFileVolume is always marked with an issue that the voxel size may not represent physical extents. This is especially important if sizes of objects shall be measured on images.

      The spatial (Z-) extent of a voxel is usually derived from the distance between the values of the Image Position Patient (0020,0032) tags of the first two neighboring image slices, provided that more than one slice exists. If only one slice exists, it is derived from the Spacing Between Slices (0018,0088) tag, if available. If even that does not exist, the Slice Thickness (0018,0050) is used.

      If any component of the voxel size cannot be determined, 1 is used instead, and the MultiFileVolume is marked with an issue describing the problem.

    • Image Orientation

      The orientation vectors of each image frame are determined from the Image Orientation Patient (0020,0037) tag of the first image frame if available. They determine the direction of the X- and Y- axis of the coordinate system in which the image slices reside.

      The Z-axis is determined from the difference vector between the values of the Image Position Patient (0020,0032) tags of the first two neighboring image slices if more than one exists. This is especially important for frames which, for example, are acquired with gantry tilts. These produce sheared heaps of frames which can be represented in this way.

      All three orientation vectors are normalized before using them in the World Matrix.

      If no vector for the X- or Y- orientation can be determined, it is set to (1,0,0) or (0,1,0), respectively. If no Z-vector can be determined, it is calculated from the cross product of the X- and Y- orientations.


      In order to display frame sequences in 3D, scans in reverse order can define left-handed coordinate systems by the X-, Y- and Z-orientation vectors. Applications should take this into consideration to prevent incorrectly displayed data.

    • Translation

      The translation part of the homogeneous matrix is taken from the Image Position Patient (0020,0032) tag of the first frame of the MultiFileVolume.

    Notes regarding the World Matrix:

    • Validity of Geometric Measurements

      If geometric measurements on the images are to be performed, and if they shall correspond to physical sizes and orientations in the real world objects, some more care has to be taken.

      First, the imported image should be checked for a SOP Class which actually contains pixel size and orientation information corresponding to real-world sizes. Most image SOP Classes do not contain such data (such as projection images, secondary captures, some ultrasound images, x-ray mammography etc.). Only a few may deliver information usable for normal measurements (such as (Enhanced) CT Image, (Enhanced) MR Image Storage). Therefore MultiFileVolumes imported with DirectDicomImport should be checked for such SOP Classes and for the absence of relevant warnings and errors before performing measurements on their images.

    • Half-Voxel-Shift

      The World Matrix describes the transformation of voxel data into world space, which often corresponds to the coordinate system in which a patient is scanned.

      According to the DICOM standard, the “Image Position (0020,0032) specifies the x, y, and z coordinates of the upper left hand corner of the image; it is the center of the first voxel transmitted”. When displaying the first transmitted voxel with an extent of one unit, this would place the origin of the displayed voxel area at the position of minus half a unit. This is also how many tools such as Slicer display their data.

      However, historically the origin of the displayed voxel area in world space in MeVisLab has coordinates (0,0,0). This results in a difference of half a voxel extent between coordinates in MeVisLab and other tools. This needs to be taken into consideration when exchanging coordinates between MeVisLab and other tools.

      This half-voxel-shift is integrated in the calculation of the World Matrix and therefore is automatically included when transforming between voxel and world space in all MeVisLab tools.

  • Image Property Extensions

    If images are composed by DirectDicomImport, additional information belonging to each MultiFileVolume is stored in a list of so-called Image Property Extensions. It can be retrieved in DirectDicomImport.tagDump as string dump when a specific MultiFileVolume is selected.

    Depending on the type of the MultiFileVolumes and the (backend) loaders used, different extensions may be created and appended to the image properties of a MultiFileVolume. Here only the most typical ones are described:

    • DicomTreeImagePropertyExtension

      The DCMTree::Tree created from DICOM files and referenced in a MultiFileVolume, is added as such an extension. See Composing for details.

    • StringImagePropertyExtension

      This extension type is created by some loader backends in order to store non-DICOM meta information in them.
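
To make the construction of the World Matrix described under Calculating ML Image Properties more concrete, the following sketch assembles a 4x4 voxel-to-world matrix from the tags of the first one or two slices. It is an illustration under stated assumptions, not the module's code: the tag dictionaries and function names are hypothetical, Image Position Patient is assumed to be present, and the half-voxel shift is modeled by assuming that voxel coordinate (0.5, 0.5, 0.5) maps to the DICOM position of the first voxel center.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def cross(a, b):
    """Cross product of two 3D vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def world_matrix(first, second=None):
    """Build a 4x4 voxel-to-world matrix (list of rows) from slice tags."""
    iop = first.get("ImageOrientationPatient")
    x_dir = normalize(iop[:3]) if iop else [1.0, 0.0, 0.0]
    y_dir = normalize(iop[3:]) if iop else [0.0, 1.0, 0.0]

    # Pixel Spacing is (row spacing, column spacing), i.e. (Y size, X size).
    spacing = first.get("PixelSpacing")
    size_x, size_y = (spacing[1], spacing[0]) if spacing else (1.0, 1.0)

    origin = first["ImagePositionPatient"]
    diff = None
    if second is not None:
        diff = [b - a for a, b in zip(origin, second["ImagePositionPatient"])]
    if diff is not None and any(diff):
        # Z axis and Z size from the first two slice positions (allows shear).
        size_z = math.sqrt(sum(c * c for c in diff))
        z_dir = normalize(diff)
    else:
        # Single slice: fall back to spacing tags, then to 1.
        size_z = (first.get("SpacingBetweenSlices")
                  or first.get("SliceThickness") or 1.0)
        z_dir = cross(x_dir, y_dir)

    columns = [[c * size_x for c in x_dir],
               [c * size_y for c in y_dir],
               [c * size_z for c in z_dir]]
    # Assumed half-voxel shift: move the origin from voxel center to corner.
    translation = [origin[i] - 0.5 * sum(col[i] for col in columns)
                   for i in range(3)]
    rows = [[columns[0][i], columns[1][i], columns[2][i], translation[i]]
            for i in range(3)]
    rows.append([0.0, 0.0, 0.0, 1.0])
    return rows


def voxel_to_world(matrix, voxel):
    """Apply the matrix to a voxel coordinate (x, y, z)."""
    v = list(voxel) + [1.0]
    return [sum(matrix[r][c] * v[c] for c in range(4)) for r in range(3)]
```

With this convention, a tool that treats integer voxel coordinates as voxel centers differs from the sketch by half a voxel extent along each axis, which is the difference to keep in mind when exchanging coordinates.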

3.4. Providing Results

Independent of whether DICOM or non-DICOM objects have been composed to MultiFileVolumes, all of them are managed as a MultiFileVolumeList. Invalid, filtered, or suppressed objects do not appear in it any more.

DirectDicomImport has the following outputs:

Issues related to any MultiFileVolume are stored together with it and can be retrieved by subsequent modules when needed.

Issues related to the whole import process, for example file i/o errors, are provided as the import log.

Also the flags DirectDicomImport.hadWarnings and DirectDicomImport.hadErrors show whether any problem occurred during the entire import process.