Defect Report #036

Submission Date: 10 Dec 92
Submittor: WG14
Source: X3J11/91-040 (Fred Tydeman)
Question 1
May a floating-point constant be represented with more precision than implied by its type? Consider the following code fragment:
    float f;
    double d;
    long double ld;
    ld = ld + 0.1;        /* add a long double and a double */
    ld = ld + 1.0 / 10.0; /* expression with "same" value */
    d = f + 0.1f;         /* "+" is allowed to be double precision */

In the above example, the decimal number 0.1, when converted to binary, is a non-terminating repeating binary number; so the more bits used to represent the number, the closer it will be to its true value. Hence, if doubles are 64 bits and long doubles are 80 bits, the long double will be more accurate. So in essence, may 0.1 (a double) be represented with more precision, e.g. as 0.1L (a long double)?
Parts of the C Standard that may help answer the question follow.
Subclause 5.1.2.3 Program execution, page 7, line 36:

In the abstract machine, all expressions are evaluated as specified by the semantics.

I believe that this is the "as if" rule that applies to this case.
Subclause 5.1.2.3 Program execution, page 8, lines 44-45:

Alternatively, an operation involving only ints or floats may be executed using double-precision operations if neither range nor precision is lost thereby.

Clearly, d = f + 0.1f may be done using a double-precision add. But may 0.1f be represented as the double 0.1?
Subclause 6.1.3.1 Floating constants, page 26, lines 32-35:

If the scaled value is in the range of representable values (for its type) the result is either the nearest representable value, or the larger or smaller representable value immediately adjacent to the nearest representable value, chosen in an implementation-defined manner.

I believe that the above does not require that the result be the nearest representable value (for its type).
Subclause 6.2.1.5 Usual arithmetic conversions, page 35, lines 38-39:

The values of floating operands and of the results of floating expressions may be represented in greater precision and range than that required by the type; the types are not changed thereby.

I believe that a floating constant is a floating operand, so is allowed greater precision. Clearly, the expression 1.0 / 10.0 is allowed greater precision than just double, so it would make sense to allow an equivalent constant (0.1) to have greater precision.
Subclause 6.4 Constant expressions, page 55, lines 14-16:

If a floating expression is evaluated in the translation environment, the arithmetic precision and range shall be at least as great as if the expression were being evaluated in the execution environment.

Response
The Committee concurs with all the arguments presented: a floating constant may be represented with more precision than implied by its type.
