ISO/IEC JTC1/SC22/WG21 N1628


Extensions for the Programming Language C++ to Support New Character Data Types

ISO/IEC JTC1 SC22 WG21 N1628
Date: July 16 2004

Lawrence Crowl


Problem

Many users of C++ need to manipulate Unicode character strings.
Unfortunately, there is no C++ standard means to do so.

Solution

The ISO C committee has addressed this issue extensively.  We should adopt their work, with the changes necessary for effective use within C++.  See ISO/IEC TR 19769 "New character types in C" as described in draft report ISO/IEC JTC1 SC22 WG14 N1040 "Extensions for the programming language C to support new character data types" at http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1040.pdf.

Summary of WG14 N1040

The document WG14 N1040 provides motivation, macros for reporting ISO 10646 encoding, new typedefs for the 16-bit and 32-bit character types, character and string literals, mixed string concatenation, and four library functions.
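
As a minimal sketch of the literal forms and mixed concatenation summarized above, assuming a compiler that already provides char16_t, char32_t, and the u and U literal prefixes as we read them in N1040.  The names length16, greeting16, and letter32 are hypothetical and used only for illustration.

```cpp
#include <cstddef>

// Hypothetical helper: counts 16-bit code units before the terminating zero.
std::size_t length16(const char16_t* s) {
    std::size_t n = 0;
    while (s[n] != 0) {
        ++n;
    }
    return n;
}

// Mixed string concatenation: a u"" literal followed by an unprefixed
// literal is assumed to yield a single array of char16_t.
const char16_t* const greeting16 = u"abc" "def";

// A 32-bit character literal.
const char32_t letter32 = U'x';
```

Under these assumptions, length16(greeting16) counts the code units of both adjacent literals together.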

Changes for C++

The document WG14 N1040 can be adopted with few changes, as follows:

   Section 3 "The new typedefs"

      Define char16_t to be a typedef to a distinct new type, named
      _Char16_t, that has the same size and representation as
      uint_least16_t.

      [N1040 defined char16_t as a typedef to the type of
      uint_least16_t, which makes overloading on char16_t impossible.]

      Define char32_t to be a typedef to a distinct new type, named
      _Char32_t, that has the same size and representation as
      uint_least32_t.

      [N1040 defined char32_t as a typedef to the type of
      uint_least32_t, which makes overloading on char32_t impossible.]
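
      The effect of the distinct types can be sketched as follows,
      assuming a compiler that already treats char16_t and char32_t
      as distinct types in the way proposed here.  The function name
      describe is hypothetical.  If char16_t were merely a typedef to
      the type of uint_least16_t, the first two definitions would
      declare the same function and the program would be ill-formed.

```cpp
#include <cstdint>
#include <string>

// Overloads on the integer types and on the character types.
// These can coexist only if the character types are distinct
// from uint_least16_t and uint_least32_t.
std::string describe(std::uint_least16_t) { return "16-bit integer"; }
std::string describe(char16_t)            { return "16-bit character"; }

std::string describe(std::uint_least32_t) { return "32-bit integer"; }
std::string describe(char32_t)            { return "32-bit character"; }
```

      A call such as describe(u'A') would then select the character
      overload, while describe(std::uint_least16_t(65)) would select
      the integer overload.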

   New section 6.5 "The standard template and typedefs"

      The standard library will define ...16 and ...32 typedefs, in
      analogy to the w... typedefs, for

         filebuf, streambuf, streampos, streamoff,
         ios, istream, ostream,
         fstream, ifstream, ofstream,
         stringstream, istringstream, ostringstream,
         string
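
      For the string entry alone, the pattern might look like the
      following sketch.  The namespace sketch and the names string16
      and string32 are assumptions made for illustration, formed by
      applying the ...16 and ...32 pattern to string; they are not
      names fixed by this paper.  The sketch also assumes a library
      that supplies character traits for char16_t and char32_t.

```cpp
#include <string>

namespace sketch {

// Hypothetical typedefs, by analogy with
//   typedef std::basic_string<wchar_t> wstring;
typedef std::basic_string<char16_t> string16;
typedef std::basic_string<char32_t> string32;

}  // namespace sketch
```

      With these typedefs, sketch::string16 s = u"abc"; would hold
      three 16-bit code units.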