Chapter 9. Unicode Support

Table of Contents

9.1. Unicode Support

Chapter Objectives

This chapter describes features and limitations of the ML with regard to international character handling.

9.1. Unicode Support

Unicode support is required to handle international characters in character strings such as parameters or file names. The ML uses UTF-8 as its Unicode encoding and provides functions for managing and recoding UTF-8 strings. Consequently, any string the ML handles (field names in particular) may contain UTF-8 encoded Unicode characters.

Important

Any string obtained from the ML, for example from module fields, may contain UTF-8 encoded Unicode characters.

Such strings (e.g., file names) must be handled with I/O functions that are both Unicode-capable and platform-independent.

See Section 2.6.2, “MLUtilities”, and Chapter 10, File System Support, for more information on helper functions for platform-independent handling of Unicode strings.
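To illustrate why file names need dedicated I/O helpers, the following is a minimal sketch of a portable open function for UTF-8 file names. It assumes that the wide-character C runtime on Windows expects UTF-16 file names and that other platforms accept UTF-8 byte strings directly. The names utf8ToUtf16Sketch and openUtf8FileSketch are hypothetical and are not part of the ML; the helpers the ML actually provides are described in the sections referenced above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cwchar>
#include <string>

// Hypothetical helper (not an ML function): converts UTF-8 to UTF-16.
// Returns false if the input is not well-formed UTF-8.
bool utf8ToUtf16Sketch(const std::string& in, std::u16string& out)
{
  static const char32_t minCodePoint[4] = {0x0, 0x80, 0x800, 0x10000};
  out.clear();
  std::size_t i = 0;
  while (i < in.size()) {
    const unsigned char lead = static_cast<unsigned char>(in[i]);
    char32_t cp = 0;
    std::size_t extra = 0;  // number of continuation bytes that follow
    if (lead < 0x80)                { cp = lead; }
    else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
    else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
    else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
    else { return false; }                          // invalid lead byte
    if (i + extra >= in.size()) { return false; }   // truncated sequence
    for (std::size_t k = 1; k <= extra; ++k) {
      const unsigned char b = static_cast<unsigned char>(in[i + k]);
      if ((b & 0xC0) != 0x80) { return false; }     // not a continuation byte
      cp = (cp << 6) | (b & 0x3F);
    }
    i += extra + 1;
    // Reject overlong forms, surrogate code points, and values above U+10FFFF.
    if (cp < minCodePoint[extra] || cp > 0x10FFFF ||
        (cp >= 0xD800 && cp <= 0xDFFF)) { return false; }
    if (cp < 0x10000) {
      out.push_back(static_cast<char16_t>(cp));
    } else {
      cp -= 0x10000;  // encode as a surrogate pair
      out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
  }
  assert(i == in.size());
  return true;
}

// Hypothetical helper (not an ML function): opens a file given a UTF-8 name.
// Returns nullptr if the name is malformed or the file cannot be opened.
std::FILE* openUtf8FileSketch(const std::string& utf8Name, const std::string& mode)
{
#ifdef _WIN32
  std::u16string name16, mode16;
  if (!utf8ToUtf16Sketch(utf8Name, name16) || !utf8ToUtf16Sketch(mode, mode16)) {
    return nullptr;
  }
  // wchar_t is 16 bits wide on Windows, so code units are copied one to one.
  const std::wstring wideName(name16.begin(), name16.end());
  const std::wstring wideMode(mode16.begin(), mode16.end());
  return _wfopen(wideName.c_str(), wideMode.c_str());
#else
  // Assumption: the platform's file API accepts UTF-8 byte strings.
  return std::fopen(utf8Name.c_str(), mode.c_str());
#endif
}
```

Callers then pass UTF-8 names on every platform, for example openUtf8FileSketch(name, "rb"), and close the returned handle with std::fclose(). Keeping the platform switch inside a single function is the design point; the conversion itself would normally be delegated to the library's own routines.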

Note

When you receive Unicode strings from other sources (e.g., user interfaces, other libraries, or string files), these strings might use a different encoding. See mlUnicode.h for information on how to convert these strings to UTF-8.

The following functions are available:

  1. MLuint16* MLConvertUTF8ToUTF16(const char* input)

    Converts the given input char string (UTF-8, terminated by 0) to UTF-16. Returns a newly allocated string that must be freed with MLFree(), or NULL on error.

  2. MLuint32* MLConvertUTF8ToUTF32(const char* input)

    Converts the given input char string (UTF-8, terminated by 0) to UTF-32. Returns a newly allocated string that must be freed with MLFree(), or NULL on error.

  3. char* MLConvertUTF16ToUTF8(const MLuint16* input)

    Converts the given input wide string (UTF-16, terminated by 0) to UTF-8. Returns a newly allocated char string that must be freed with MLFree(), or NULL on error.

  4. char* MLConvertUTF8ToLatin1(const char* input)

    Converts the given UTF-8 encoded string into a Latin1 string, replacing every non-Latin1 character with '?'. The returned string must be freed with MLFree(). Returns NULL on error.

  5. char* MLConvertUTF16ToLatin1(const MLuint16* input)

    Converts the given UTF-16 encoded wide string into a Latin1 string, replacing every non-Latin1 character with '?'. The returned string must be freed with MLFree(). Returns NULL on error.

  6. char* MLConvertLatin1ToUTF8(const char* input)

    Converts the given Latin1 encoded string into a UTF-8 string. The returned string must be freed with MLFree(). Returns NULL on error.

  7. MLuint16* MLConvertLatin1ToUTF16(const char* input)

    Converts the given Latin1 encoded string into a UTF-16 wide string. The returned string must be freed with MLFree(). Returns NULL on error.
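All functions listed above share the same ownership contract: the result is newly allocated, must be released by the caller, and is NULL on error. The following sketch illustrates that contract with stand-in functions so that it compiles without the ML. sketchLatin1ToUTF8 and sketchUTF8ToLatin1 are hypothetical and are not the ML implementation; they allocate with malloc() and are released with free(), whereas real code would call MLConvertLatin1ToUTF8(), MLConvertUTF8ToLatin1(), and MLFree(). Returning NULL for malformed UTF-8 is an assumption of this sketch, which also does not reject overlong encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Stand-in for MLConvertLatin1ToUTF8(); NOT the ML implementation.
char* sketchLatin1ToUTF8(const char* input)
{
  if (!input) { return nullptr; }
  const std::size_t len = std::strlen(input);
  // A Latin1 byte becomes at most two UTF-8 bytes; one more byte for the terminator.
  char* out = static_cast<char*>(std::malloc(2 * len + 1));
  if (!out) { return nullptr; }
  std::size_t o = 0;
  for (std::size_t i = 0; i < len; ++i) {
    const unsigned char c = static_cast<unsigned char>(input[i]);
    if (c < 0x80) {
      out[o++] = static_cast<char>(c);                  // ASCII is copied unchanged
    } else {
      out[o++] = static_cast<char>(0xC0 | (c >> 6));    // lead byte
      out[o++] = static_cast<char>(0x80 | (c & 0x3F));  // continuation byte
    }
  }
  assert(o <= 2 * len);
  out[o] = '\0';
  return out;
}

// Stand-in for MLConvertUTF8ToLatin1(); NOT the ML implementation.
// Code points above U+00FF become '?'; malformed input yields nullptr.
char* sketchUTF8ToLatin1(const char* input)
{
  if (!input) { return nullptr; }
  const std::size_t len = std::strlen(input);
  char* out = static_cast<char*>(std::malloc(len + 1));  // never longer than the input
  if (!out) { return nullptr; }
  std::size_t o = 0;
  std::size_t i = 0;
  while (i < len) {
    const unsigned char lead = static_cast<unsigned char>(input[i]);
    unsigned long cp = 0;
    std::size_t extra = 0;
    if (lead < 0x80)                { cp = lead; }
    else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
    else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
    else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
    else { std::free(out); return nullptr; }
    for (std::size_t k = 1; k <= extra; ++k) {
      // The 0 terminator fails this test, so the loop never reads past the end.
      const unsigned char b = static_cast<unsigned char>(input[i + k]);
      if ((b & 0xC0) != 0x80) { std::free(out); return nullptr; }
      cp = (cp << 6) | (b & 0x3F);
    }
    i += extra + 1;
    out[o++] = (cp <= 0xFF) ? static_cast<char>(cp) : '?';
  }
  out[o] = '\0';
  return out;
}

// Owning wrapper: with the ML, the deleter would call MLFree() instead of free().
struct FreeDeleter { void operator()(void* p) const { std::free(p); } };
using OwnedCString = std::unique_ptr<char, FreeDeleter>;

// Converts to a std::string and releases the temporary buffer automatically.
std::string latin1ToUtf8String(const char* latin1)
{
  OwnedCString utf8(sketchLatin1ToUTF8(latin1));
  return utf8 ? std::string(utf8.get()) : std::string();
}
```

Wrapping the returned pointer in an owning type directly at the call site means the buffer is released on every path, including early returns, and the NULL check reduces to a boolean test on the wrapper.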