Document: WG14 N1352


New macros for <float.h>


Submitter: Fred Tydeman (USA)
Submission Date: 2009-02-06
Previous versions of paper: N1303, N1317
Related WG14 documents: N1151, N1171
Related WG21 documents: N2798
Subject: New macros for <float.h>

Existing practice: Many implementations have macros (with various spellings) for the minimum positive subnormal numbers. C99 has DECIMAL_DIG, with a meaning similar to that of LDBL_MAXDIG10.

The committee asked that the minimum symbols be changed to denote the smallest positive number (the smallest subnormal if subnormals are supported, otherwise the smallest normal) and be more like C++. In making the requested change, the author also gave the macros more meaningful names.

Changes to C1x

Add new bullets to 5.2.4.2.2 Characteristics of floating types <float.h>

[bullet near DECIMAL_DIG] The number of base 10 digits required to ensure that floating-point numbers with p radix b digits which differ by only one unit in the last place (ulp) are always differentiated,

 p log10 b             if b is a power of 10
 ceil(1 + p log10 b)   otherwise

[Note to editor: WG14 paper N1290 on printed page 9 has the correct symbols/fonts for the above two math expressions; it is also very similar to the existing math expressions for DECIMAL_DIG in C99.]

FLT_MAXDIG10    6
DBL_MAXDIG10   10
LDBL_MAXDIG10  10
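
As an illustration only (not part of the proposed wording), the second expression can be evaluated with integer arithmetic when b is a power of 2. The helper name maxdig10_pow2 and the approximation 30103/100000 for log10 2 are assumptions of this sketch. It relies on the fact that p log10 b is never an integer in that case, so ceil(1 + x) equals 2 + floor(x).

```c
#include <assert.h>

/* Hypothetical helper (not a <float.h> facility): decimal digits needed
   for a format with p significand digits in radix 2^k, that is,
   ceil(1 + p * k * log10(2)). */
int maxdig10_pow2(int p, int k)
{
    assert(p > 0 && k > 0);
    long bits = (long)p * k;   /* equivalent number of radix-2 digits */
    /* 30103/100000 approximates log10(2); adequate for the small
       precisions of common formats, not for arbitrarily large p. */
    return (int)(2 + bits * 30103L / 100000L);
}
```

Under these assumptions, maxdig10_pow2(6, 4) gives 9 for six radix-16 digits, and maxdig10_pow2(24, 1) and maxdig10_pow2(53, 1) give 9 and 17 for the IEC 60559 single and double formats.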

[bullet after FLT_MIN] minimum positive floating-point number. If subnormal numbers are supported [footnote], their value is the minimum positive subnormalized (also known as denormalized) floating-point number, otherwise the minimum positive normalized floating-point number, of the respective types.

FLT_TRUE_MIN        1E-37
DBL_TRUE_MIN        1E-37
LDBL_TRUE_MIN       1E-37

[footnote]: Support means that subnormal numbers are not flushed to zero, either when used as operands or when produced by an arithmetic operation.

[bullet after FLT_MIN] The presence or absence of subnormalization (variable number of exponent bits) is characterized by the values:

-1  Indeterminate (cannot be determined if type allows subnormalized values)
 0  Absent (type does not allow subnormalized values)
+1  Present (type does allow subnormalized values)

for each respective type via:

FLT_HAS_SUBNORM
DBL_HAS_SUBNORM
LDBL_HAS_SUBNORM

[paragraph 13, example 1] Add

FLT_MAXDIG10   9
after FLT_RADIX

[paragraph 14, example 2] Remove "normalized" from just before IEC 60559.

Add

FLT_MAXDIG10    9
DBL_MAXDIG10   17
after DECIMAL_DIG

Add

FLT_TRUE_MIN        1.40129846E-45F // decimal constant
FLT_TRUE_MIN        0X1P-149F // hex constant
FLT_HAS_SUBNORM      +1
DBL_TRUE_MIN        4.9406564584124654E-324 // decimal constant
DBL_TRUE_MIN        0X1P-1074 // hex constant
DBL_HAS_SUBNORM      +1
after FLT_MIN and DBL_MIN.
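
As an illustration only, the following sketch checks that the decimal and hexadecimal spellings above denote the same value. It assumes an IEC 60559 implementation that converts decimal floating constants with correct rounding; the function names are hypothetical.

```c
#include <assert.h>

/* Returns 1 if each decimal constant converts to the same value as its
   hexadecimal counterpart, 0 otherwise. */
int true_min_constants_agree(void)
{
    int flt_ok = (1.40129846E-45F == 0X1P-149F);
    int dbl_ok = (4.9406564584124654E-324 == 0X1P-1074);
    return flt_ok && dbl_ok;
}

/* Debug-build check: aborts if the two spellings disagree. */
void check_true_min_constants(void)
{
    assert(true_min_constants_agree());
}
```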

Words for Rationale:

[add to 5.2.4.2.2 section] Applications that need to check, at translation time, whether subnormal floating-point numbers are supported can test whether *_HAS_SUBNORM has the value +1, where * is FLT, DBL, or LDBL.
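
A minimal sketch of such a translation-time check, assuming a <float.h> that provides the macros proposed here. The names APP_FLT_SUBNORMALS_KNOWN and app_flt_smallest are hypothetical, and the fallback to FLT_MIN is one possible policy, not a requirement.

```c
#include <assert.h>
#include <float.h>

/* 1 only when float subnormals are known to be supported; the values
   0 (absent) and -1 (indeterminate) are both treated as "not known". */
#if defined(FLT_HAS_SUBNORM) && FLT_HAS_SUBNORM == 1
#define APP_FLT_SUBNORMALS_KNOWN 1
#else
#define APP_FLT_SUBNORMALS_KNOWN 0
#endif

/* Smallest positive float this application is willing to rely on. */
float app_flt_smallest(void)
{
#if APP_FLT_SUBNORMALS_KNOWN && defined(FLT_TRUE_MIN)
    float smallest = FLT_TRUE_MIN;
#else
    float smallest = FLT_MIN;
#endif
    assert(smallest > 0.0f);
    return smallest;
}
```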

The values of the smallest subnormal floating-point numbers (if supported) are typically, but not always, FLT_MIN*FLT_EPSILON, DBL_MIN*DBL_EPSILON, LDBL_MIN*LDBL_EPSILON.
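
The relation above can be tested at run time. The following is a sketch only, assuming a <float.h> that defines the proposed *_TRUE_MIN macros; the function names are hypothetical, and a result of 0 is legitimate on implementations where the relation does not hold.

```c
#include <assert.h>
#include <float.h>

/* Each function returns 1 if *_MIN * *_EPSILON equals *_TRUE_MIN,
   0 if it does not, and -1 if *_TRUE_MIN is not defined. */
int flt_true_min_is_product(void)
{
#ifdef FLT_TRUE_MIN
    /* volatile forces the product to be rounded to the type's format */
    volatile float product = FLT_MIN * FLT_EPSILON;
    return product == FLT_TRUE_MIN;
#else
    return -1;
#endif
}

int dbl_true_min_is_product(void)
{
#ifdef DBL_TRUE_MIN
    volatile double product = DBL_MIN * DBL_EPSILON;
    return product == DBL_TRUE_MIN;
#else
    return -1;
#endif
}

/* Debug-build check for code that depends on the typical relation. */
void check_true_min_relation(void)
{
    assert(flt_true_min_is_product() != 0);
    assert(dbl_true_min_is_product() != 0);
}
```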