MLDataCompressors¶
Purpose¶
The MLDataCompressors are extensions that ML modules can use to compress and decompress data. They are loaded when MeVisLab starts and can be requested through the MLDataCompressorFactory class. They are typically used to compress and decompress image data saved and loaded by the MLImageFormat
classes.
See also the general documentation about the MLImageFormat classes and their documentation in the MeVisLab SDK.
Notes:
- When storing images with the MLImageFileFormat, only pages are compressed and each page is compressed on its own.
- Since page tables (about 32 bytes/page), image information, and properties are not compressed, the overall compression ratio can suffer if the image has many pages or if large DICOM trees are stored with it.
- Data compressors have a setup time for each page. For some compressors (especially LZMA) it is even much higher than the compression time itself. Reducing the number of pages can therefore sometimes reduce compression times.
- Compressors often benefit from pages which are not too small. Reducing the number of pages or increasing the page extent can improve compression ratios (often the case with BZip2, for example).
- As a guideline, page sizes of 256 KB up to 1 MB are usually well suited for compression (on CT data this is a typical slice size). For LZMA compression they could be increased to reduce compression times; however, do this with care to avoid the negative side effects of overly large pages.
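The effect of page size on the stored size can be illustrated with a small sketch. It is not MeVisLab code: it uses Python's zlib module as a stand-in compressor, a synthetic byte pattern instead of real image data, and a hypothetical stored_size helper that charges the roughly 32 bytes of uncompressed page table per page mentioned in the notes.

```python
import zlib

# "about 32 bytes/page" of uncompressed page table, taken from the notes.
PAGE_TABLE_BYTES_PER_PAGE = 32


def stored_size(data: bytes, page_bytes: int) -> int:
    """Total size when data is split into pages that are compressed one by one."""
    total = 0
    for start in range(0, len(data), page_bytes):
        page = data[start:start + page_bytes]
        # Each page is compressed on its own and pays its own table entry.
        total += len(zlib.compress(page)) + PAGE_TABLE_BYTES_PER_PAGE
    return total


# Synthetic 1 MiB "image": short constant runs in a repeating pattern.
data = bytes((i // 16) % 251 for i in range(1 << 20))

for page_bytes in (1024, 16 * 1024, 256 * 1024):
    size = stored_size(data, page_bytes)
    print(f"{page_bytes:>7} bytes/page -> {size} bytes stored")
```

On this synthetic input the 1 KB pages give the largest total, because every page pays the page table entry and the compressor's own stream overhead, and no page can reuse patterns seen in another page. Real image data will behave differently in detail.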
Example¶
See the example of the MLImageFormatSave
module and its Compression parameter.
Current Compressors¶
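Currently implemented data compressors are:
- BZip2DataCompressor
- LZ4DataCompressor
- LZFDataCompressor
- LZMADataCompressor
- ZLibDataCompressor
- CTCompress2DataCompressor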
Some of these compressors (BZip2, ZLib, as well as LZMA, LZF, and LZ4, which use this flag internally) benefit from an enabled Optimize Data flag in the MLImageFormatSave
module if voxels have more than 8 bits, as is typical for CT or MR data from clinical examinations.
Some other compressors benefit from an enabled DiffCode Data flag in the MLImageFormatSave
module which prepares image data by calculating absolute differences between neighbour voxels before compressing them.
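The idea behind such a difference coding step can be sketched as follows. This is a generic illustration, not the MeVisLab implementation: the helper names are hypothetical, zlib stands in for the data compressor, the input is a synthetic 16 bit ramp, and the differences are taken modulo 2^16 (an assumption made here so that the step is exactly reversible).

```python
import struct
import zlib


def delta_encode(values, bits=16):
    """Replace each value by its difference to the previous one (modulo 2**bits)."""
    mask = (1 << bits) - 1
    prev = 0
    out = []
    for v in values:
        out.append((v - prev) & mask)
        prev = v
    return out


def delta_decode(deltas, bits=16):
    """Undo delta_encode by accumulating the differences (modulo 2**bits)."""
    mask = (1 << bits) - 1
    prev = 0
    out = []
    for d in deltas:
        prev = (prev + d) & mask
        out.append(prev)
    return out


def pack_u16(values):
    """Serialize values as little-endian unsigned 16 bit integers."""
    return struct.pack("<%dH" % len(values), *values)


# Synthetic smooth signal: neighbouring "voxels" differ by a constant step.
voxels = [1000 + 3 * i for i in range(4096)]

raw_size = len(zlib.compress(pack_u16(voxels)))
coded_size = len(zlib.compress(pack_u16(delta_encode(voxels))))
print(f"compressed without difference coding: {raw_size} bytes")
print(f"compressed with difference coding:    {coded_size} bytes")
```

On smooth data the differences are small and repetitive, which a general purpose compressor handles much better than the slowly changing original values. On noisy data the differences can be less regular than the values themselves, which matches the remark that the flag does not always help.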
BZip2DataCompressor¶
The BZip2DataCompressor usually achieves high compression ratios for the typical medical image data used in MeVisLab. In some cases it can be significantly slower (10-50 times) than the other compressors. It is therefore recommended for files where high compression ratios are needed and saving time is not very important.
The compression ratios of integer and non-scalar voxel images are often, but not always, improved by activating Optimize Data; in a few cases they become (normally insignificantly) worse. On float and double typed voxels this flag is not effective and is therefore ignored.
Note: Compression ratios (and also compression times) often improve with this compressor if page extents are increased, so avoid pages that are too small.
LZ4DataCompressor¶
The LZ4DataCompressor is significantly faster than all other compressors while still achieving fair compression ratios. It is recommended when high saving performance is required and the compression ratio is less important. Compression ratios are 10-30% lower than with most other compressors and comparable to those of the LZFDataCompressor, whereas its compression speed is even higher. Like the LZFDataCompressor, it is much faster than the Tiff-Lzw compression of the ImageSave module, and its compression is comparable or often better.
This compressor is faster than most disk and network IO while still reaching a good compression rate, so it reduces both file size and saving time: the time spent compressing is less than the time gained by writing less data. Only on SSDs and RAM disks can file IO be faster than this compressor.
Note that this compressor became available with MeVisLab 2.7; images stored with it therefore cannot be decompressed with older MeVisLab versions. If you need reasonable backward compatibility, use the LZFDataCompressor, which is nearly equivalent and has been available since MeVisLab 2.1.
LZFDataCompressor¶
The LZFDataCompressor was for a long time the fastest of all compressors while still achieving fair compression ratios. Now, however, the LZ4DataCompressor might be an alternative if maximum compression speed is required. The LZFDataCompressor is recommended when high saving and compression performance is required and the compression ratio is less important, especially if the LZ4DataCompressor cannot be used because backward compatibility with MeVisLab versions before 2.7 is needed. Compression ratios are 10-30% lower than with the other compressors, whereas its compression speed is about 2-5 times higher than that of the ZLibDataCompressor. Note that this compressor is also much faster than the Tiff-Lzw compression of the ImageSave module, and its compression (especially with the Optimize Data flag enabled) is comparable or often better. Since the DiffCode Data flag improves the compression ratio on many, but not all, data sets, it can be enabled as well; in a few cases, however, it reduces compression ratios.
If file IO on a system is slow (e.g. if the disk is slow or saving is via a network) and a good compression rate is reachable, then saving time might even be reduced by enabling compression. This is because compression time is less than the time gained by reducing the amount of stored data.
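This trade-off can be made concrete with a simple estimate. The sketch below is only a back-of-the-envelope model: the function name is hypothetical, all throughput numbers and the compression ratio are made-up illustration values rather than measurements, and it assumes that compression and writing happen one after the other without overlap.

```python
def save_seconds(size_mb, io_mb_per_s, ratio=1.0, compress_mb_per_s=None):
    """Estimated time to save an image: compress it (optional), then write it.

    ratio is uncompressed size / compressed size. compress_mb_per_s=None
    means the data is written uncompressed.
    """
    compress_time = 0.0 if compress_mb_per_s is None else size_mb / compress_mb_per_s
    write_time = (size_mb / ratio) / io_mb_per_s
    return compress_time + write_time


IMAGE_MB = 512  # hypothetical image size

# Hypothetical IO throughputs in MB/s for a slow and a fast storage target.
for label, io_speed in (("network share", 50), ("fast SSD", 2000)):
    plain = save_seconds(IMAGE_MB, io_speed)
    packed = save_seconds(IMAGE_MB, io_speed, ratio=2.0, compress_mb_per_s=400)
    print(f"{label}: uncompressed {plain:.2f} s, compressed {packed:.2f} s")
```

With these assumed numbers, compression shortens the save on the slow network share and lengthens it on the fast SSD, because the time spent compressing is only recovered when writing is the bottleneck.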
Note that this compressor became available with MeVisLab 2.1, thus images stored with this compressor cannot be decompressed with older MeVisLab versions.
LZMADataCompressor¶
On a number of tested data sets the LZMADataCompressor mostly had the highest compression ratios; however, its data compression is very slow in comparison to the other compressors (about 5-40 times slower than the BZip2DataCompressor). It is therefore recommended only where compression times are of no importance or where sizes have to be reduced as much as possible. Decompression is usually fast.
This compressor has one Level argument, which can be set from 0 to 9. It specifies the amount of memory to be used by the compressor and should be increased above the default value of 5 only with care to avoid out-of-memory problems. Higher values can, however, improve compression ratios, and lower values can reduce them. Level has no significant influence on compression times.
Notes:
- The setup time of this compressor is quite high; compressing images with few pages is therefore usually much faster than compressing images of the same size with many pages. So avoid pages that are too small.
- This compressor became available with MeVisLab 2.1, thus images stored with this compressor cannot be decompressed with older MeVisLab versions.
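How the memory given to an LZMA compressor can influence the compression ratio can be sketched with Python's lzma module. The sketch assumes that the memory controlled by Level corresponds to the LZMA dictionary size, which this page does not state explicitly; the dict_size values, the helper name, and the synthetic input are chosen only for illustration.

```python
import lzma
import random


def xz_size(data: bytes, dict_size: int) -> int:
    """Compressed size of data using LZMA2 with the given dictionary size."""
    filters = [{"id": lzma.FILTER_LZMA2, "dict_size": dict_size}]
    packed = lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)
    assert lzma.decompress(packed) == data  # lossless round trip
    return len(packed)


# 256 KiB of pseudo-random bytes, stored twice: the second copy can only be
# matched against the first if the dictionary reaches back 256 KiB.
block_len = 256 * 1024
block = random.Random(0).getrandbits(8 * block_len).to_bytes(block_len, "little")
data = block + block

small = xz_size(data, 64 * 1024)    # dictionary shorter than the repeat distance
large = xz_size(data, 1024 * 1024)  # dictionary covers the whole input
print(f"64 KiB dictionary: {small} bytes")
print(f"1 MiB dictionary:  {large} bytes")
```

The larger dictionary lets the compressor find the repeated block and store it only once, at the price of more memory during compression. Whether real image data contains such long-range repetition depends on the data set.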
ZLibDataCompressor¶
The ZLibDataCompressor is faster than the BZip2DataCompressor and the LZMADataCompressor and sometimes even reaches the same or better compression ratios; however, it is slower than the LZFDataCompressor and the LZ4DataCompressor. It is recommended for a good balance between compression ratio and compression time.
The compression of 16 bit medical integer images is normally (not always) improved by activating the flags Optimize Data and DiffCode Data.
CTCompress2DataCompressor¶
The CTCompress2DataCompressor is an experimental data compressor designed for integer CT and MR data. It normally reaches a good balance between high compression ratios and high compression speed for CT and MR data; however, image data with floating point or non-scalar voxels is not compressed.
Compression Ratios and Times¶
Note that this is only a guideline. Compression ratios and times vary between images and data sets.
Typical Compression Speed (from fast to slow) [1]:
No Compression [3] > LZ4DataCompressor [4] ≈ LZFDataCompressor > CTCompress2DataCompressor [2] > ZLibDataCompressor > BZip2DataCompressor > LZMADataCompressor
Typical Compression Ratios (from high to low) [1]:
LZMADataCompressor > BZip2DataCompressor > CTCompress2DataCompressor [2] > ZLibDataCompressor > LZFDataCompressor ≈ LZ4DataCompressor > No Compression
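Because these orderings vary with the data, it can be worth measuring on your own images. The sketch below shows one way to take such measurements outside MeVisLab. It is an approximation only: it uses Python's zlib, bz2, and lzma modules as stand-ins for the ZLib, BZip2, and LZMA compressors, with their default settings, on a synthetic 16 bit pattern. LZ4, LZF, and CTCompress2 are not covered because they are not part of the Python standard library.

```python
import bz2
import lzma
import time
import zlib

# Stand-ins from the Python standard library, not the MeVisLab compressors.
COMPRESSORS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def measure(data: bytes) -> dict:
    """Return {name: (compression_ratio, seconds)} for each stand-in compressor."""
    results = {}
    for name, (compress, decompress) in COMPRESSORS.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        assert decompress(packed) == data  # lossless round trip
        results[name] = (len(data) / len(packed), elapsed)
    return results


# Synthetic, strongly repetitive 16 bit values; replace with real image bytes.
sample = b"".join(((i * i) % 4096).to_bytes(2, "little") for i in range(200_000))

results = measure(sample)
# List the fastest compressor first.
for name, (ratio, seconds) in sorted(results.items(), key=lambda kv: kv[1][1]):
    print(f"{name:5s} ratio {ratio:7.2f}  time {seconds * 1000:8.1f} ms")
```

Timings from a single run are noisy, so repeat the measurement several times and use representative data before drawing conclusions.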
Footnotes
[1] | The data sets used for these measurements are mostly MeVisLab example data sets, but also include some other medical image data sets. |
[2] | Only on integer CT/MR data. |
[3] | No compression may be slower than using fast compressors in use cases where file IO or network transmissions are slower than compressing the data. In such cases especially the LZ4 compressor might be beneficial. |
[4] | The LZ4 compressor can be faster than LZF; however, this is usually difficult to measure and requires very fast file IO, for example on an SSD or RAM disk. |