Issue 0090: Multibyte characters in formats question

This issue has been automatically converted from the original issue lists and some formatting may not have been preserved.

Authors: Clive Feather, WG14
Date: 1993-12-03
Submitted against: C90
Status: Closed
Converted from: dr.htm, dr_090.html

Item 27 - multibyte characters in formats

Consider a locale where the characters '\xE' and '\xF' start and end an alternate shift state (i.e., the latter reverts to the initial shift state), and where multibyte characters whose first byte is greater than or equal to 0x80 are two bytes long. The multibyte characters and the alternate shift state characters are all distinct from the basic execution character set (subclause 5.2.1). What is the output generated by the following fprintf calls?

        fprintf (stdout, "Test A: (%d)\n", 42);
        fprintf (stdout, "Test B: (\xE%d\xF)\n", 42);
        fprintf (stdout, "Test C: (\xE%\xF" "d)\n", 42);
        fprintf (stdout, "Test D: (\xCC%d)\n", 42);
        fprintf (stdout, "Test E: (\xE\xCC%d\xF)\n", 42);
        fprintf (stdout, "Test F: (\xE\xCC%\xF" "d)\n", 42);

Comment from WG14 on 1997-09-23:

Response

The first call contains no locale-specific characters and must produce the obvious output, the line “Test A: (42)”. The remainder of this response addresses the subsequent calls.

The hypothetical locale is defined such that “the multibyte characters and the alternate shift state characters are all distinct from the basic execution character set.” Thus, in each of the subsequent calls, the % byte in the string literal either lies within the alternate shift state or is the second byte of a two-byte multibyte character, so it is not the same character as the % that introduces a conversion specification (subclauses 7.9.6.1 and 7.9.6.2).

The C Standard says, “The format is composed of zero or more directives: ordinary multibyte characters (not %), which are copied unchanged to the output stream, ...” Therefore, the output generated by the example fprintf calls is the format argument copied unchanged to the output stream. Note that the third argument in each call to fprintf is not needed.
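
The reasoning above can be illustrated with a small sketch. It is not part of the C Standard or of the committee response; it only models the hypothetical locale from the question, under the assumption that every byte read in the alternate shift state, and both bytes of a two-byte character, are distinct from the basic execution character set. The function name count_directive_percents is invented for this illustration.

```c
#include <stddef.h>

/*
 * Hypothetical helper: count the '%' bytes in fmt that are the basic
 * execution character set '%' under the locale assumed in this issue:
 *   - 0x0E enters the alternate shift state, 0x0F returns to the initial one;
 *   - a byte >= 0x80 is the first byte of a two-byte multibyte character;
 *   - nothing read in the alternate shift state is a basic character.
 */
static int count_directive_percents(const char *fmt)
{
    const unsigned char *p = (const unsigned char *)fmt;
    int shifted = 0;  /* nonzero while in the alternate shift state */
    int count = 0;

    while (*p != '\0') {
        if (*p == 0x0E) {
            shifted = 1;
            p++;
        } else if (*p == 0x0F) {
            shifted = 0;
            p++;
        } else if (*p >= 0x80) {
            /* Two-byte character: skip the lead byte and, if present,
               the trail byte (which may itself have the value of '%'). */
            p++;
            if (*p != '\0')
                p++;
        } else {
            if (*p == '%' && !shifted)
                count++;
            p++;
        }
    }
    return count;
}
```

Under these assumptions the helper returns 1 for the format of Test A and 0 for the formats of Tests B through F, which matches the response: only the first call contains a conversion specification.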