Unicode Support

Wide Character Type

FOX 1.2 adds a new platform-independent type, FXwchar, to represent Unicode characters. This is a 32-bit unsigned integer, large enough to store any Unicode code point. Note that FXwchar is not necessarily the same type as your platform's native wchar_t, so you should not assume that sequences of FXwchar values can be passed to functions that expect wchar_t arguments.
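
To illustrate why the two types cannot be mixed freely, here is a minimal sketch of an explicit conversion from FXwchar values to a native wide string. The typedef is a stand-in for the FOX definition, and toNative() and checkToNative() are hypothetical helpers written for this example; they are not part of FOX. The sketch assumes that a 16-bit wchar_t holds UTF-16 code units.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Stand-in for FOX's FXwchar, assumed here to be a 32-bit unsigned integer.
typedef std::uint32_t FXwchar;

// Hypothetical helper: copy n FXwchar code points into a native wide string.
// Where wchar_t is only 16 bits wide, code points above U+FFFF are written
// as UTF-16 surrogate pairs.
std::wstring toNative(const FXwchar* src, std::size_t n) {
  std::wstring out;
  for (std::size_t i = 0; i < n; ++i) {
    FXwchar c = src[i];
    // Replace surrogates and out-of-range values with U+FFFD.
    if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) c = 0xFFFD;
    if (sizeof(wchar_t) >= 4 || c <= 0xFFFF) {
      out.push_back(static_cast<wchar_t>(c));
    } else {
      c -= 0x10000;
      out.push_back(static_cast<wchar_t>(0xD800 + (c >> 10)));
      out.push_back(static_cast<wchar_t>(0xDC00 + (c & 0x3FF)));
    }
  }
  return out;
}

// Usage: U+1F600 lies outside the Basic Multilingual Plane, so it takes
// two wchar_t units on platforms with a 16-bit wchar_t and one elsewhere.
void checkToNative() {
  const FXwchar text[] = { 0x48, 0x1F600 };
  std::wstring w = toNative(text, 2);
  assert(w.size() == (sizeof(wchar_t) >= 4 ? 2u : 3u));
}
```

The length of the result depends on the platform, which is exactly the reason a plain cast from FXwchar* to wchar_t* is unsafe.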

Wide String Type

The new FXWString class is a variant of FXString that stores FXwchar values internally.

Codecs

FXTextCodec is an abstract base class that defines the protocol for a number of converters. Every concrete subclass of FXTextCodec must implement the following four member functions:

  /**
   * Convert a sequence of wide characters from Unicode to the specified
   * 8-bit encoding. Reads at most n wide characters from src and writes
   * at most m bytes into dest. Returns the number of characters actually
   * written into dest.
   *
   * On exit, the src and dest pointers are updated to point to the next
   * available character (or byte) for reading (writing).
   */
  virtual unsigned long fromUnicode(FXuchar*& dest,unsigned long m,const FXwchar*& src,unsigned long n) = 0;

  /**
   * Convert a sequence of bytes in some 8-bit encoding to a sequence
   * of wide characters (Unicode). Reads at most n bytes from src and
   * writes at most m characters into dest. Returns the number of characters
   * actually read from src.
   *
   * On exit, the src and dest pointers are updated to point to the next
   * available byte (or character) for reading (writing).
   */
  virtual unsigned long toUnicode(FXwchar*& dest,unsigned long m,const FXuchar*& src,unsigned long n) = 0;

  /**
   * Return the IANA mime name for this codec; this is used for example
   * as "text/utf-8" in drag and drop protocols.
   */
  virtual const FXchar* mimeName() const = 0;
  
  /**
   * Return the IANA Management Information Base enum (MIBenum) value for the character set.
   */
  virtual FXint mibEnum() const = 0;
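
As a rough sketch of what a concrete subclass might look like, the following implements the four member functions for ISO-8859-1. The typedefs and the FXTextCodec declaration are stand-ins so that the example compiles on its own; SketchLatin1Codec is a hypothetical class, not the codec shipped with FOX. Replacing unrepresentable characters with '?' is an assumption made for this example.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the FOX typedefs and base class, declared here only so the
// sketch is self-contained; real code would include the FOX headers instead.
typedef std::uint32_t FXwchar;
typedef unsigned char FXuchar;
typedef char          FXchar;
typedef int           FXint;

class FXTextCodec {
public:
  virtual ~FXTextCodec() {}
  virtual unsigned long fromUnicode(FXuchar*& dest, unsigned long m, const FXwchar*& src, unsigned long n) = 0;
  virtual unsigned long toUnicode(FXwchar*& dest, unsigned long m, const FXuchar*& src, unsigned long n) = 0;
  virtual const FXchar* mimeName() const = 0;
  virtual FXint mibEnum() const = 0;
};

// Hypothetical codec for ISO-8859-1, whose 256 byte values map one-to-one
// onto the code points U+0000 through U+00FF.
class SketchLatin1Codec : public FXTextCodec {
public:
  // Writes one byte per wide character; returns the number written to dest.
  virtual unsigned long fromUnicode(FXuchar*& dest, unsigned long m, const FXwchar*& src, unsigned long n) {
    unsigned long count = (n < m) ? n : m;
    for (unsigned long i = 0; i < count; ++i) {
      FXwchar c = *src++;
      // Assumption: characters outside Latin-1 are replaced by '?'.
      *dest++ = (c <= 0xFF) ? static_cast<FXuchar>(c) : static_cast<FXuchar>('?');
    }
    return count;
  }

  // Widens each byte to a code point; returns the number of bytes read.
  virtual unsigned long toUnicode(FXwchar*& dest, unsigned long m, const FXuchar*& src, unsigned long n) {
    unsigned long count = (n < m) ? n : m;
    for (unsigned long i = 0; i < count; ++i) *dest++ = *src++;
    return count;
  }

  virtual const FXchar* mimeName() const { return "ISO-8859-1"; }
  virtual FXint mibEnum() const { return 4; }  // IANA MIBenum for ISO-8859-1
};

// Usage: convert in two passes through a destination window of two
// characters, relying on the updated src and dest pointers to resume.
void checkChunkedConversion() {
  SketchLatin1Codec codec;
  const FXuchar bytes[] = { 'c', 'a', 'f', 0xE9 };
  FXwchar wide[4];
  const FXuchar* src = bytes;
  FXwchar* dest = wide;
  unsigned long nread = codec.toUnicode(dest, 2, src, 4);
  nread += codec.toUnicode(dest, 2, src, 4 - nread);
  assert(nread == 4 && wide[3] == 0xE9);
}
```

Because both pointers are passed by reference, a caller can process a long input through a small buffer without tracking offsets itself.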

As of this writing, two codecs have been implemented: FXUTF8Codec, for converting to and from UTF-8 encoding, and FXLatin1Codec, for converting to and from ISO-8859-1 (Latin1) encoding.
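
For readers unfamiliar with UTF-8, the following sketch shows the standard encoding of a single code point into one to four bytes, which is the kind of work a UTF-8 codec performs for each character. It is not taken from the FOX sources; the typedefs are stand-ins, and encodeUtf8() and checkEncodeUtf8() are hypothetical helpers written for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::uint32_t FXwchar;  // stand-in for the FOX typedef
typedef unsigned char FXuchar;  // stand-in for the FOX typedef

// Hypothetical helper: encode one code point as UTF-8 into buf, which must
// have room for at least 4 bytes. Returns the number of bytes written, or 0
// if c is a surrogate or lies beyond U+10FFFF.
std::size_t encodeUtf8(FXwchar c, FXuchar* buf) {
  if (c < 0x80) {
    buf[0] = static_cast<FXuchar>(c);
    return 1;
  }
  if (c < 0x800) {
    buf[0] = static_cast<FXuchar>(0xC0 | (c >> 6));
    buf[1] = static_cast<FXuchar>(0x80 | (c & 0x3F));
    return 2;
  }
  if (c >= 0xD800 && c <= 0xDFFF) return 0;
  if (c < 0x10000) {
    buf[0] = static_cast<FXuchar>(0xE0 | (c >> 12));
    buf[1] = static_cast<FXuchar>(0x80 | ((c >> 6) & 0x3F));
    buf[2] = static_cast<FXuchar>(0x80 | (c & 0x3F));
    return 3;
  }
  if (c <= 0x10FFFF) {
    buf[0] = static_cast<FXuchar>(0xF0 | (c >> 18));
    buf[1] = static_cast<FXuchar>(0x80 | ((c >> 12) & 0x3F));
    buf[2] = static_cast<FXuchar>(0x80 | ((c >> 6) & 0x3F));
    buf[3] = static_cast<FXuchar>(0x80 | (c & 0x3F));
    return 4;
  }
  return 0;
}

// Usage: U+00E9 (e with acute accent) encodes as the two bytes C3 A9.
void checkEncodeUtf8() {
  FXuchar buf[4];
  assert(encodeUtf8(0xE9, buf) == 2);
  assert(buf[0] == 0xC3 && buf[1] == 0xA9);
}
```

Unlike Latin1, the number of bytes per character varies here, which is why the codec interface takes separate limits for the source and destination buffers.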