Document Number: N2301
See Also: N2290
Submitter: Martin Sebor
Submission Date: September 29, 2018
Subject: NaN printf Formatting Underspecified

Summary

Function f below uses the common idiom of first calling snprintf to determine the size of the output, then allocating a buffer big enough for that output, and finally calling sprintf to format the argument into the buffer. In addition, the function attempts to parse the output using the sscanf function, and to verify that the extracted value matches the initial input.

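	#include <assert.h>
	#include <math.h>
	#include <stdio.h>

	void f (const char *n_char_sequence)
	{
	  double x = nan (n_char_sequence);
	  /* Determine the size of the output without writing anything.  */
	  int n0 = snprintf (0, 0, "%f", x);
	  if (n0 > 0)
	  {
	    char d[n0 + 1];
	    int n1 = sprintf (d, "%f", x);
	    assert (n0 == n1);

	    /* Parse the output back; c is assigned only if input is left over.  */
	    double y;
	    char c;
	    int n2 = sscanf (d, "%lf%c", &y, &c);
	    assert (isnan (y) && n2 == 1);
	  }
	}

Is the function safe and can both assertions be expected to pass?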

The specification of fprintf says the following about the effect of the f conversion specifier with a NaN argument.

A double argument representing a NaN is converted in one of the styles [-]nan or [-]nan(n-char-sequence) — which style, and the meaning of any n-char-sequence, is implementation-defined.

Suppose an implementation defines the n-char-sequence as a random string up to INT_MAX in length. On such an implementation the function above is not safe because the second call may produce a longer string than the first, so the call to sprintf can overflow the buffer. (Replacing the call with snprintf prevents the overflow but it might truncate the output instead.)
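The snprintf alternative mentioned in the parenthetical can be sketched as follows. This is only an illustration, not part of the proposal: the helper name format_checked and its return convention are hypothetical. It formats into a caller-supplied buffer and reports truncation rather than overflowing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats x with "%f" into buf of the given size.
   Returns 0 if the whole output fit, 1 if it was truncated, and -1 if
   snprintf reported an error.  */
int format_checked (char *buf, size_t size, double x)
{
  assert (buf != NULL && size > 0);

  int n = snprintf (buf, size, "%f", x);
  if (n < 0)
    return -1;

  /* snprintf returns the length the full output would have had, so a
     value of at least size means the output was cut short.  */
  return (size_t) n >= size;
}
```

A caller that receives 1 knows the buffer holds only a prefix of the output, which for a NaN could be an incomplete nan(n-char-sequence) string that no longer parses as intended.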

Allowing such a perverse implementation was surely not the intent of the standard, but the fact that there is no constraint preventing it would come as a surprise to most programmers and might be especially concerning to security analysts.

At a minimum, the authors think the n-char-sequence should be required to be the same for NaNs with the same representation. Preferably, there should also be some small upper bound on its length. Ideally, however, for the sake of portability, the default output for NaN would be "nan" and the "nan(n-char-sequence)" form would need to be explicitly requested, for example by the # flag.

Furthermore, assuming the %f directive does produce the same result for the same representation of a NaN, as should be expected of any sane fprintf implementation, the only way sscanf can read back the full NaN string, including both parentheses, is if the n-char-sequence itself doesn't contain embedded closing parentheses. This constraint is missing from the specification.
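The effect of an embedded closing parenthesis can be illustrated with strtod, whose subject sequence the floating conversions of sscanf are specified to match. The sketch below is an assumption-laden illustration: the helper name reads_back_as_nan is hypothetical, and the behavior described in the note relies on an implementation that accepts the nan(n-char-sequence) form.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if strtod converts the whole of s to a
   NaN, and 0 if it stops early or produces something else.  */
int reads_back_as_nan (const char *s)
{
  assert (s != NULL);

  char *end;
  double y = strtod (s, &end);

  /* end points at the first character strtod did not consume.  */
  return isnan (y) && end != s && *end == '\0';
}
```

With such an implementation, a string like "nan(abc)" is consumed in full, whereas "nan(a)b)" is consumed only up to the first closing parenthesis, leaving "b)" unread. In function f that leftover input would be assigned to c and make n2 equal to 2.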

Proposed Resolution

To resolve the problems discussed above, we propose to change §7.21.6.1 The fprintf function as indicated below. Since no implementations are known to produce output that includes the n-char-sequence, we believe the risks of specifying the effect of the # flag and of recommending a limit on the amount of output are minimal and acceptable given the potential severity of the problem, even if it's only theoretical.

A double argument representing a NaN is converted in one of the following styles: The meaning of any n-char-sequence is implementation-defined. The F conversion specifier produces INF, INFINITY, or NAN instead of inf, infinity, or nan, respectively.277) The produced output shall be the same for NaN values with the same representation.

Furthermore, add a new paragraph to the Recommended Practice subsection as follows.

–?– For floating conversions of a NaN argument and with the # flag specified, the n-char-sequence should be no longer than the number of bits in the representation of the operand's significand. In addition, the produced output should convert to the same NaN representation as the argument by the corresponding strtod, strtof, or strtold function, respectively.
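If an implementation follows this recommended practice, a program can compute a fixed upper bound on the length of NaN output. The sketch below is illustrative only: the function names are hypothetical, it assumes FLT_RADIX is 2 so that DBL_MANT_DIG counts bits, and it assumes no field width or precision is given in the directive.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical bound, in characters and excluding the terminating null,
   on "[-]nan(n-char-sequence)" when the n-char-sequence has at most
   mant_bits characters: sign, "nan", "(", the sequence, and ")".  */
int nan_output_max (int mant_bits)
{
  assert (mant_bits > 0);
  return 1 + 3 + 1 + mant_bits + 1;
}

/* The same bound for double, assuming FLT_RADIX == 2.  */
int dbl_nan_output_max (void)
{
  return nan_output_max (DBL_MANT_DIG);
}
```

For a double with a 53-bit significand the bound is 59 characters, so a 60-byte buffer would suffice under these assumptions.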

Change §7.29.2.1 The fwprintf function as indicated below.

A double argument representing a NaN is converted in one of the following styles: The meaning of any n-wchar-sequence is implementation-defined. The F conversion specifier produces INF, INFINITY, or NAN instead of inf, infinity, or nan, respectively.332) The produced output shall be the same for NaN values with the same representation.

Furthermore, add a new paragraph to the Recommended Practice subsection as follows.

–?– For floating conversions of a NaN argument and with the # flag specified, the n-wchar-sequence should be no longer than the number of bits in the representation of the operand's significand. In addition, the produced output should convert to the same NaN representation as the argument by the corresponding wcstod, wcstof, or wcstold function, respectively.

Finally, add a new bullet to §J.3.12 Library functions referencing the new implementation-defined behavior as follows.

– The output for %p conversion in the fprintf or fwprintf function (7.21.6.1, 7.29.2.1).
– The style of the output of the conversion of the a, e, f, or g format specifier for a NaN argument when the # flag is specified (7.21.6.1, 7.29.2.1).

Note: The authors recommend considering N2290 in conjunction with this paper.