1. Introduction
To prevent mojibake, std::print may use a native Unicode API when writing to
a terminal, bypassing the stream buffer. During the review of [P2093] "Formatted output", Tim Song suggested that synchronizing std::print with the
underlying stream may be beneficial for gradual adoption. This paper presents
motivating examples, observes that this problem doesn’t normally happen in
practice, and proposes a minor update to the wording to provide a synchronization
guarantee.
2. Revision history
Changes since R0:
- Added another motivating example.
- Split Discussion into multiple sections.
- Added the wording.
3. Motivating examples
Consider the following example:
printf("first\n");
std::print("second\n");
This will produce the expected output:
first
second
because stdout is at least line buffered by default.
However, in theory this may reorder the output:
printf("first");
std::print("second");
because of buffering in printf but not std::print. Testing on Windows 10
with MSVC 19.28 and {fmt}'s implementation of print ([FMT]) showed that the
order is preserved in this case as well. This suggests that printf is
completely unbuffered by default on this system. This is also confirmed in [MS-CRT]:
The stdout and stderr functions are flushed whenever they are full or, if you are writing to a character device, after each library call.
On other systems the order is preserved too because the output goes through the stream buffer in both cases.
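The difference between the two examples above can be reproduced with a small model that needs no real terminal. This is only a sketch under stated assumptions, not how any C library or {fmt} is implemented: `Terminal`, `LineBufferedStream`, `native_write`, and `simulate` are hypothetical names, and the stream is assumed to be line buffered.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in for a terminal: records text in arrival order.
struct Terminal {
  std::string received;
};

// Hypothetical model of a line-buffered C stream attached to a terminal.
class LineBufferedStream {
 public:
  explicit LineBufferedStream(Terminal& t) : terminal_(t) {}

  // Models printf: everything up to the last newline is passed on at once,
  // the remainder stays in the buffer.
  void write(std::string_view s) {
    buffer_ += s;
    auto pos = buffer_.rfind('\n');
    if (pos != std::string::npos) {
      terminal_.received.append(buffer_, 0, pos + 1);
      buffer_.erase(0, pos + 1);
    }
  }

  // Models fflush: any pending text reaches the terminal.
  void flush() {
    terminal_.received += buffer_;
    buffer_.clear();
  }

 private:
  Terminal& terminal_;
  std::string buffer_;
};

// Models a native API call that bypasses the stream buffer entirely.
inline void native_write(Terminal& t, std::string_view s) { t.received += s; }

// Sends `first` through the buffered stream and `second` through the native
// path, then flushes (as happens at program exit) and returns what the
// terminal received.
inline std::string simulate(std::string_view first, std::string_view second) {
  Terminal term;
  LineBufferedStream out(term);
  out.write(first);
  native_write(term, second);
  out.flush();
  return term.received;
}
```

In this model `simulate("first\n", "second\n")` keeps the order, while `simulate("first", "second")` yields `secondfirst`, which is the theoretical reordering described above.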
Consider another example that involves iostreams:
#include <iostream>  // also provides std::print(std::ostream&, ...) via <ostream>

struct A {
  int a;
  int b;

  friend std::ostream& operator<<(std::ostream& os, const A& a) {
    std::print(os, "{{a={}, b={}}}", a.a, a.b);
    return os;
  }
};

int main() {
  A a = {2, 4};
  std::cout << "A is " << a << '\n';
}
We updated the implementation of print for ostream in {fmt} to use the
native Unicode API and verified that there is no reordering in this example
either on the same test platform.
4. Proposal
Although the issue appears to be mostly theoretical, it might still be
beneficial to clarify in the standard that synchronization is desired.
It is possible to guarantee the desired output ordering by flushing the buffer
before writing to a terminal in std::print.
This will incur additional cost, but only for the terminal case and when
transcoding is needed. Platforms that don’t buffer the output, like the one we
tested, should be able to avoid a call to flush.
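A minimal sketch of this flush-then-write ordering for a C stream follows. It is an illustration under stated assumptions, not the implementation in {fmt} or any standard library: `write_synchronized`, `NativeWrite`, the `is_unicode_terminal` flag, and the `context` parameter are hypothetical names, and the callback stands in for whatever native Unicode API a platform provides. Only `std::fflush` and `std::fwrite` are real library calls.

```cpp
#include <cstdio>
#include <string_view>

// Hypothetical callback standing in for a platform's native Unicode API.
// Returns true on success.
using NativeWrite = bool (*)(std::string_view text, void* context);

// Writes `out` either through the native path (terminal case) or through the
// stream. In the terminal case the stream is flushed first, so output that
// earlier printf calls left in the buffer reaches the terminal before `out`.
inline bool write_synchronized(std::FILE* stream, std::string_view out,
                               bool is_unicode_terminal,
                               NativeWrite native_write, void* context) {
  if (is_unicode_terminal) {
    // A platform known not to buffer terminal output could skip this call.
    if (std::fflush(stream) != 0) return false;
    return native_write(out, context);
  }
  // Non-terminal case: the text goes through the stream buffer unchanged,
  // so ordering with printf is preserved without an extra flush.
  return std::fwrite(out.data(), 1, out.size(), stream) == out.size();
}
```

The flush is confined to the branch that bypasses the stream, which matches the observation that the extra cost is paid only in the terminal case.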
Neither {fmt} ([FMT]) nor Rust ([RUST-STDIO]) makes any attempt to provide such
synchronization in its implementation of print. However, in practice this
synchronization appears to be a no-op on tested platforms.
5. Wording
Modify subsection "Print functions [print.fun]":
void vprint_unicode(FILE* stream, string_view fmt, format_args args);
...
Effects: The function initializes an automatic variable via
string out = vformat(fmt, args);
If stream refers to a terminal capable of displaying Unicode, writes out to
the terminal using the native Unicode API; if out contains invalid code units,
the behavior is undefined and implementations are encouraged to diagnose it.
Otherwise writes out to stream unchanged.
If the terminal output is buffered by default, the function flushes stream's buffer before writing out.
Modify subsection "Print [ostream.formatted.print]":
void vprint_unicode(ostream& os, string_view fmt, format_args args);
void vprint_nonunicode(ostream& os, string_view fmt, format_args args);
Effects: Behaves as a formatted output function
([ostream.formatted.reqmts])
of os, except that:
- failure to generate output is reported as specified below, and
- any exception thrown by the call to vformat is propagated without regard to the value of os.exceptions() and without turning on ios_base::badbit in the error state of os.
After constructing a sentry object, the
function initializes an automatic variable via
string out = vformat(os.getloc(), fmt, args);
If the function is vprint_unicode and os is a stream that refers to a
terminal capable of displaying Unicode, which is determined in an
implementation-defined manner, writes out to the terminal using the
native Unicode API; if out contains invalid code units, the behavior
is undefined and implementations are encouraged to diagnose it.
If the terminal output is buffered by default, the function flushes os's buffer before writing out.
Otherwise (if os is not such a stream or the function is vprint_nonunicode), inserts the character sequence [out.begin(), out.end()) into os.
If writing to the terminal or inserting into os fails, calls os.setstate(ios_base::badbit) (which may throw ios_base::failure).
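The terminal branch of the ostream wording can be sketched with a stream buffer that holds characters until it is flushed. This is an assumption-based model, not an implementation of vprint_unicode: `FakeTerminal`, `BufferingTerminalBuf`, and `print_to_terminal` are hypothetical names, formatting is assumed to have happened already, and appending to a string stands in for the native Unicode API.

```cpp
#include <ostream>
#include <streambuf>
#include <string>
#include <string_view>

// Hypothetical stand-in for a terminal: records text in arrival order.
struct FakeTerminal {
  std::string received;
};

// Stream buffer that keeps characters pending until sync() is called,
// which is what os.flush() triggers.
class BufferingTerminalBuf : public std::streambuf {
 public:
  explicit BufferingTerminalBuf(FakeTerminal& t) : terminal_(t) {}

 protected:
  // No put area is set, so every inserted character arrives here.
  int_type overflow(int_type ch) override {
    if (!traits_type::eq_int_type(ch, traits_type::eof()))
      pending_ += traits_type::to_char_type(ch);
    return traits_type::not_eof(ch);
  }

  int sync() override {
    terminal_.received += pending_;
    pending_.clear();
    return 0;
  }

 private:
  FakeTerminal& terminal_;
  std::string pending_;
};

// Sketch of the terminal branch: construct a sentry, flush os's buffer,
// then write `out` directly to the terminal, bypassing the stream buffer.
inline void print_to_terminal(std::ostream& os, FakeTerminal& terminal,
                              std::string_view out) {
  std::ostream::sentry guard(os);
  if (!guard) return;
  os.flush();                 // earlier insertions reach the terminal first
  terminal.received += out;   // stands in for the native Unicode API call
}
```

With this model, inserting "A is " into the stream and then calling `print_to_terminal` with "{a=2, b=4}" leaves the terminal with the text in program order. Removing the `os.flush()` call would let the directly written text overtake the pending characters.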