From willemw@komp Thu Sep 4 16:14:02 1997
Received: from komp.ace.nl (komp.ace.nl [193.78.104.90]) by dkuug.dk (8.6.12/8.6.12) with SMTP id QAA09227 for ; Thu, 4 Sep 1997 16:14:02 +0200
Received: by komp.ace.nl with SMTP id AA27741 (1.14/2.17); Thu, 4 Sep 97 16:12:03 +0200 (MET)
To: sc22wg11@dkuug.dk
Subject: WG11 N440 (3 of 7): LIA-1 binding for C:
Date: Thu, 04 Sep 97 16:12:02 N
Message-Id: <27739.873382322@komp>
From: Willem Wakker

* Document Number: WG14 N749/J11 97-112

C9X Revision Proposal
=====================
* Title: LIA-1 Binding:
Author: Fred J. Tydeman
Author Affiliation: Tydeman Consulting
Postal Address: 3711 Del Robles Dr., Austin, Texas, USA, 78727
E-mail Address: tydeman@tybor.com
Telephone Number: +1 (512) 255-8696
Fax Number: +1 (512) 255-8696
Sponsor: WG14
Date: 1997-08-25
Proposal Category:
   __ Editorial change/non-normative contribution
   __ Correction
   Y_ New feature
   __ Addition to obsolescent feature list
   __ Addition to Future Directions
   __ Other (please specify) ______________________________
Area of Standard Affected:
   __ Environment
   __ Language
   __ Preprocessor
   Y_ Library
      Y_ Macro/typedef/tag name
      Y_ Function
      Y_ Header
   __ Other (please specify) ______________________________
Prior Art: None known.
Target Audience: Programmers writing programs that perform a
   significant amount of numeric processing.___________________
Related Documents (if any):
   WG14/N753 (LIA-1 Binding: Rationale),
   WG14/N752 (LIA-1 Binding: Optional parts annex),
   WG14/N751 (LIA-1 Binding: LIA-1 + IEC-559 annex),
   WG14/N750 (LIA-1 Binding: LIA-1 annex),
   WG14/N748 (LIA-1 Binding: Adding 'pole' from LIA-2),
   WG14/N747 (IEC 559 Binding: Signaling NaNs),
   WG14/N693 (Type-Generic Math Functions),
   WG14/N528 (C Binding of LIA-1),
   WG14/N487 (LIA-1),
   WG14/N486 (LIA Overview),
   WG14/N463 (Impact of adding LIA-1)
Proposal Attached: _Y Yes __ No, but what's your interest?

Abstract: These changes are the fundamental changes to C to allow
support of ISO 10967-1 (LIA-1).
They are being added in their own header

Proposal:

Note: The '*' characters in the lefthand column are not part of the
proposal (they are useful for emacs M-x outline mode)

In the following, bold text, italic text, code sample are the
conventions used to indicate text different from normal.

* 7 Library

-- Add a new library section (here called 7.x)

** 7.x Notification and additional limits, mathematics, and general utilities.

The header declares several macros and functions to support Language
Independent Arithmetic. These are additional limits (characteristics
of the integer and floating-point types), general integer utilities,
mathematical functions, notification and access to the integer
environment.

The integer environment refers collectively to any integer status
flags and control modes supported by the implementation[footnote]. An
integer exception status flag is a system variable whose value is set
as a side effect of the arithmetic to provide auxiliary information.
An integer control mode is a system variable whose value may be set by
the user to affect the subsequent behavior of the arithmetic.

[footnote]. This header is designed to support the notification
indicators (here called exception status flags) required by LIA-1,
and other similar integer state information. Also it is designed to
facilitate code portability among all systems.

*** 7.x.1 Limits

Several macros are declared to provide additional information beyond
and about the characteristics of the integer and floating-point types.
All integral values in the header shall be constant expressions
suitable for use in #if preprocessing directives; all floating values
shall be constant expressions. All floating-point related macros have
separate names for all three floating-point types.
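
The #if requirement above can be illustrated with a minimal sketch
that picks a description at translation time from INT_OUT_OF_BOUNDS
(the macro described in 7.x.1.1). The proposed header is not assumed
to exist anywhere, so the fallback definition of 0 and the helper name
int_overflow_policy are assumptions made only so that the fragment
compiles with an ordinary compiler.

```c
/* Assumption: the proposed header is not available, so supply a
   stand-in value (0 = undefined behavior) when the macro is absent. */
#ifndef INT_OUT_OF_BOUNDS
#define INT_OUT_OF_BOUNDS 0
#endif

/* Hypothetical helper: names the treatment of out-of-bounds int
   results. The choice is made entirely by the preprocessor, which is
   possible only because the macro is usable in #if directives. */
static const char *int_overflow_policy(void)
{
#if INT_OUT_OF_BOUNDS == 2
    return "notification";
#elif INT_OUT_OF_BOUNDS == 1
    return "wrap";
#else
    return "undefined behavior";
#endif
}
```

With the stand-in value in effect, int_overflow_policy() yields
"undefined behavior"; an implementation of the proposal would select
one of the other branches without any run-time test.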
**** 7.x.1.1 Integer limits:

The treatment of out-of-bounds results:

        0       undefined behavior
        1       wrap (similar to unsigned)
        2       notification

for the signed integer types int, long and long long is characterized
by:

        INT_OUT_OF_BOUNDS
        LONG_OUT_OF_BOUNDS
        LLONG_OUT_OF_BOUNDS

respectively.

**** 7.x.1.2 Floating-point limits:

The level of support for subnormalized numbers is characterized by the
values:

        -1      indeterminable
         0      not supported
         1      fully supported
         2      treated as zero

for the floating types float, double, long double

        FLT_SUBNORMAL
        DBL_SUBNORMAL
        LDBL_SUBNORMAL

respectively. All other negative values for *_SUBNORMAL characterize
implementation-defined behavior.

The values given in the following list shall be replaced by
implementation-defined expressions:

-- boolean value (0 or 1) to indicate if the corresponding type
   conforms to IEC 559.

        FLT_IEC_559
        DBL_IEC_559
        LDBL_IEC_559

-- boolean value (0 or 1) to indicate if IEC 559 tininess is detected
   "before rounding" (1) or "after rounding" (0).

        TINYNESS_BEFORE

-- boolean value (0 or 1) to indicate if IEC 559 loss-of-accuracy is
   detected as a denormalization loss (1) or as an inexact result (0).

        HAS_DENORM_LOSS

-- boolean value (0 or 1) to indicate if LIA-1 strict 1-ulp accuracy
   and a common rounding rule for +, -, *, and / is used.

        LIA_STRICT

-- boolean value (0 or 1) to indicate if underflows are silent (do not
   produce a notification)

        SILENT_UNDERFLOW

-- boolean value (0 or 1) to indicate if comparisons may overflow or
   underflow like subtraction

        COMPARISON_VIA_SUBTRACT

-- boolean value (0 or 1) to indicate if negate may fail because
   floating-point values are not sign symmetric.

        NEGATE_MAY_FAIL

The values given in the following list shall be replaced by
implementation-defined expressions that shall be equal or lesser in
magnitude (absolute value) to those shown, with the same sign:

-- minimum positive floating-point number, b**(emin-p) if
   subnormalized numbers are supported, else b**(emin-1).

        FLT_TRUE_MIN    1E-37
        DBL_TRUE_MIN    1E-37
        LDBL_TRUE_MIN   1E-37

The values given in the following list shall be replaced by
implementation-defined expressions that shall be equal or lesser in
magnitude (absolute value) to those shown, with the same sign:

-- maximum rounding error in terms of Units in Last Place (ULPs),

        FLT_RND_ERR     1.5
        DBL_RND_ERR     1.5
        LDBL_RND_ERR    1.5

Example 2 in is supplemented with these:

        FLT_IEC_559     1
        FLT_SUBNORMAL   1
        FLT_TRUE_MIN    1.40129846E-45
        FLT_RND_ERR     0.5

        DBL_IEC_559     1
        DBL_SUBNORMAL   1
        DBL_TRUE_MIN    4.94065646E-324
        DBL_RND_ERR     0.5

*** 7.x.2 Mathematics:

Several functions are declared to provide additional capability beyond .
Most synopses specify a function which takes one or more double
arguments and returns a double value; for each such function, there
are functions with the same name but with f and l suffixes which are
corresponding functions with float and long double arguments and
return values.

**** 7.x.2.1 Exponential and logarithmic functions

***** 7.x.2.1.1 The fracrep function

Synopsis

        #include
        double fracrep(double x);

Description

The fracrep function extracts the fraction of the model representation
of x, as a signed normalized fraction in the format of x.

Returns

The fracrep function returns the value y, such that y is a double with
magnitude in the interval [1/FLT_RADIX, 1) or zero, and x equals y
times FLT_RADIX raised to the power (logb(x)+1.0). The value returned
for zero is 0.0.

***** 7.x.2.1.2 The ulp function

Synopsis

        #include
        double ulp(double x);

Description

The ulp function computes the value of a Unit in the Last Place of x.
A domain error occurs if the argument is zero. A range error may occur
if subnormals are not supported.

Returns

The ulp function returns the value FLT_RADIX raised to the power
(logb(x)+1-p). p is the precision of the floating type and is one of
*_MANT_DIG.

**** 7.x.2.2 Sign function

***** 7.x.2.2.1 The fsgn function

Synopsis

        #include
        double fsgn(double x);

Description

The fsgn function computes the sign of a floating-point number x.
Positive floating-point numbers have a sign of +1.0, negative
floating-point numbers have a sign of -1.0, and zero has a sign of 0.0.

Returns

The fsgn function returns the sign.

**** 7.x.2.3 Manipulation functions:

***** 7.x.2.3.1 The fsucc function

Synopsis

        #include
        double fsucc(double x);

Description

The fsucc function determines the next representable value, in the
type of the function, after x in the direction of +infinity. A range
error occurs if x is the largest positive finite number.

Returns

The fsucc function returns the smallest representable value, of the
same type, greater than x.

***** 7.x.2.3.2 The fpred function

Synopsis

        #include
        double fpred(double x);

Description

The fpred function determines the next representable value, in the
type of the function, after x in the direction of -infinity. A range
error occurs if x is the negative finite number of largest magnitude.

Returns

The fpred function returns the largest representable value, of the
same type, less than x.

***** 7.x.2.3.3 The truncto function

Synopsis

        #include
        double truncto(double x, int n);

Description

The truncto function truncates (rounds toward zero) x to n digits of
precision.

Returns

The truncto function returns the value for normal numbers:

        sign(x) * floor(|x|/(FLT_RADIX**(expon(x)-n))) * FLT_RADIX**(expon(x)-n)

and for subnormal numbers:

        sign(x) * floor(|x|/(FLT_RADIX**(emin-n))) * FLT_RADIX**(emin-n).

If n is less than 1, returns 0. If n is greater than precision of x,
returns x.

***** 7.x.2.3.4 The roundto function

Synopsis

        #include
        double roundto(double x, int n);

Description

The roundto function rounds (rounds to biased nearest with ties going
away from zero) x to n digits of precision. A range error may occur.

Returns

The roundto function returns the value for normal numbers:

        sign(x) * floor(|x|/(FLT_RADIX**(expon(x)-n))+0.5) * FLT_RADIX**(expon(x)-n)

and for subnormal numbers:

        sign(x) * floor(|x|/(FLT_RADIX**(emin-n))+0.5) * FLT_RADIX**(emin-n).

If n is less than 1, returns 0. If n is greater than precision of x,
returns x.

**** 7.x.2.4 Conversion macros:

The following subclauses provide macros that convert from
floating-point type to integral type using round to nearest rounding.
The round to nearest can be biased (ties round away from zero) or
unbiased (such as IEC 559 round to nearest even). In the synopses in
this subclause, real-floating indicates that the argument must be an
expression of real floating type.

***** 7.x.2.4.1 The icvt macro

Synopsis

        #include
        int icvt(real-floating x);

Description

The icvt macro rounds its argument to the nearest integral value. If
the rounded value is outside the range of int, the numeric result is
unspecified. A range error may occur if the magnitude of x is too
large.

Returns

The icvt macro returns the rounded integral value.

***** 7.x.2.4.2 The lcvt macro

Synopsis

        #include
        long lcvt(real-floating x);

Description

The lcvt macro is equivalent to the icvt macro, except that the
returned value has type long.

***** 7.x.2.4.3 The llcvt macro

Synopsis

        #include
        long long llcvt(real-floating x);

Description

The llcvt macro is equivalent to the icvt macro, except that the
returned value has type long long.

***** 7.x.2.4.4 The uicvt macro

Synopsis

        #include
        unsigned int uicvt(real-floating x);

Description

The uicvt macro rounds its argument to the nearest integral value. If
the rounded value is outside the range of unsigned int, the rounded
value is wrapped modulo (UINT_MAX+1).

Returns

The uicvt macro returns the rounded integral value.

***** 7.x.2.4.5 The ulcvt macro

Synopsis

        #include
        unsigned long ulcvt(real-floating x);

Description

The ulcvt macro is equivalent to the uicvt macro, except that the
returned value has type unsigned long.

***** 7.x.2.4.6 The ullcvt macro

Synopsis

        #include
        unsigned long long ullcvt(real-floating x);

Description

The ullcvt macro is equivalent to the uicvt macro, except that the
returned value has type unsigned long long.

*** 7.x.3 General utilities

Several functions are declared to provide additional capability beyond .

**** 7.x.3.1 The sgn function

Synopsis

        #include
        int sgn(int j);

Description

The sgn function computes the sign of an integer j. Positive integers
have a sign of +1, negative integers have a sign of -1, and zero has a
sign of 0.

Returns

The sgn function returns the sign.

**** 7.x.3.2 The lsgn function

Synopsis

        #include
        long int lsgn(long int j);

Description

The lsgn function is similar to the sgn function, except that the
argument and returned value each have type long int.

**** 7.x.3.3 The llsgn function

Synopsis

        #include
        long long int llsgn(long long int j);

Description

The llsgn function is similar to the sgn function, except that the
argument and returned value each have type long long int.

**** 7.x.3.4 The modulo function

Synopsis

        #include
        int modulo(int numer, int denom);

Description

The modulo function computes the modulus, that is,
numer-(floor(numer/denom)*denom). If denom is zero, the behavior is
undefined.

Returns

The modulo function returns the modulus.

**** 7.x.3.5 The lmodulo function

Synopsis

        #include
        long int lmodulo(long int numer, long int denom);

Description

The lmodulo function is similar to the modulo function, except that
the arguments and returned value each have type long int.

**** 7.x.3.6 The llmodulo function

Synopsis

        #include
        long long int llmodulo(long long int numer, long long int denom);

Description

The llmodulo function is similar to the modulo function, except that
the arguments and returned value each have type long long int.

*** 7.x.4 Notification

Each macro

        INT_OVERFLOW
        INT_DIVBYZERO
        INT_INVALID

is defined if and only if the implementation supports the exception by
means of the functions in 7.x.4.2.
The defined macros expand to integral constant expressions whose
values are distinct powers of 2.

**** 7.x.4.1 The LIA_NOTIFY pragma and macro

***** 7.x.4.1.1 The LIA_NOTIFY pragma

Synopsis

        #include
        #pragma STDC LIA_NOTIFY { UNDEF | IGNORE | FLAGS | TRAP }

Description

The LIA_NOTIFY pragma provides a means to inform the implementation
which notification mechanism is to be used[footnote]. The pragma can
occur either outside external declarations or preceding all explicit
declarations and statements inside a compound statement. When outside
external declarations, the pragma takes effect from its occurrence
until another LIA_NOTIFY pragma is encountered, or until the end of
the translation unit. When inside a compound statement, the pragma
takes effect from its occurrence until another LIA_NOTIFY pragma is
encountered (within a nested compound statement), or until the end of
the compound statement; at the end of a compound statement the state
for the pragma is restored to its condition just before the compound
statement. The effect of this pragma in any other context is
undefined. If part of a program tests flags or runs under non-default
mode settings, but was translated with the state for the LIA_NOTIFY
pragma UNDEF, then the behavior of that program is undefined.

UNDEF means that the program wishes notifications to cause undefined
behavior. This matches C89. This mode does not conform to LIA-1.

IGNORE means that the program wishes notifications to be ignored. This
also causes the final check on the notification indicators at program
termination to be suppressed. This allows the optimizations mentioned
in the subsection on "FENV_ACCESS off" to be done. This mode does not
conform to LIA-1.

FLAGS means that the program wishes notifications to cause a status
flag to be set.

TRAP means that the program wishes notifications to cause a trap, that
is, signal SIGFPE.

Until is included, the default state for the pragma shall be UNDEF.
Once is included, the default state for the pragma is
implementation-defined and shall be one of FLAGS or TRAP (or DYNAMIC,
if supported).

[footnote]Notification is the process by which a program is informed
that an arithmetic operation cannot be performed.

***** 7.x.4.1.2 The LIA_NOTIFY macro

The macro LIA_NOTIFY has one of these values (with corresponding
meaning):

        0       Undefined, like C89/C95; not LIA-1 compliant
        1       Notifications are ignored; not LIA-1 compliant
        2       All notifications will set flags
        3       All notifications will trap
        4       Program switches between flags and traps at runtime

to indicate the current way notifications are being handled. That is,
the macro tracks the state of the LIA_NOTIFY pragma.

**** 7.x.4.2 Indicators or Exception flags

The following functions provide access to the integer exception flags.
They support the basic abstraction of flags that are either set or
clear. The int input argument for the functions represents a subset of
integer exceptions, and can be constructed by bitwise ORs of the
exception macros, for example INT_DIVBYZERO | INT_INVALID. For other
argument values the behavior of these functions is undefined.

***** 7.x.4.2.1 The ieclearexcept function

Synopsis

        #include
        void ieclearexcept(int excepts);

Description

The ieclearexcept function clears the supported exceptions represented
by its argument.

***** 7.x.4.2.2 The ieraiseexcept function

Synopsis

        #include
        void ieraiseexcept(int excepts);

Description

The ieraiseexcept function raises the supported exceptions represented
by its argument. The order in which these exceptions are raised is
unspecified.

***** 7.x.4.2.3 The ietestexcept function

Synopsis

        #include
        int ietestexcept(int excepts);

Description

The ietestexcept function determines which of a specified subset of
the exception flags are currently set. The excepts argument specifies
the exception flags to be queried.[footnote]

[footnote]. This mechanism allows testing several exceptions with just
one function call.

Returns

The ietestexcept function returns the value of the bitwise OR of the
exception macros corresponding to the currently set exceptions
included in excepts.

* -- Add to 7.? Type-generic math : after all occurrences of

** -- Add to 7.?.1 Type-generic macros

*** -- Add to the list of real (but not complex) functions that starts atan2, exp2:

        fracrep
        ulp
        fsgn
        fsucc
        fpred
        truncto
        roundto

* -- Add to Annex F IEC 559 Floating-Point Arithmetic:

** -- Add new subclause F.10 :

*** F.10.1 Exponential and logarithmic functions

**** F.10.1.1 The fracrep function

fracrep(-0.0) returns -0.0
fracrep(+/-INFINITY) returns +/-INFINITY

**** F.10.1.2 The ulp function

ulp(-0.0) returns a NaN and raises the invalid exception.
ulp(+/-INFINITY) returns a NaN and raises the invalid exception.
ulp(1.0) returns DBL_EPSILON.

*** F.10.2 Sign function

**** F.10.2.1 The fsgn function

fsgn(-0.0) returns -0.0
fsgn(+INFINITY) returns +1.0
fsgn(-INFINITY) returns -1.0
fsgn(NaN) returns the same NaN.

*** F.10.3 Manipulation functions

**** F.10.3.1 The fsucc function

fsucc(-0.0) returns +DBL_TRUE_MIN
fsucc(+INFINITY) returns +INFINITY
fsucc(+DBL_MAX) returns +INFINITY and raises overflow.

**** F.10.3.2 The fpred function

fpred(-0.0) returns -DBL_TRUE_MIN
fpred(-INFINITY) returns -INFINITY
fpred(-DBL_MAX) returns -INFINITY and raises overflow.

**** F.10.3.3 The truncto function

truncto(-0.0, n) returns -0.0 for any n.
truncto(+/-INFINITY, n) returns +/-INFINITY for any n.
truncto(NaN, n) returns the same NaN for any n.

**** F.10.3.4 The roundto function

roundto(-0.0, n) returns -0.0 for any n.
roundto(+/-INFINITY, n) returns +/-INFINITY for any n.
roundto(NaN, n) returns the same NaN for any n.

*** F.10.4 Conversion macros

**** F.10.4.1 The icvt macro

icvt(+/-INFINITY) returns an unspecified value and raises INT_INVALID.
icvt(NaN) returns an unspecified value and raises INT_INVALID.

**** F.10.4.2 The lcvt macro

lcvt(+/-INFINITY) returns an unspecified value and raises INT_INVALID.
lcvt(NaN) returns an unspecified value and raises INT_INVALID.

**** F.10.4.3 The llcvt macro

llcvt(+/-INFINITY) returns an unspecified value and raises INT_INVALID.
llcvt(NaN) returns an unspecified value and raises INT_INVALID.

**** F.10.4.4 The uicvt macro

uicvt(+/-INFINITY) returns an unspecified value and raises INT_INVALID.
uicvt(NaN) returns an unspecified value and raises INT_INVALID.

**** F.10.4.5 The ulcvt macro

ulcvt(+/-INFINITY) returns an unspecified value and raises INT_INVALID.
ulcvt(NaN) returns an unspecified value and raises INT_INVALID.

**** F.10.4.6 The ullcvt macro

ullcvt(+/-INFINITY) returns an unspecified value and raises INT_INVALID.
ullcvt(NaN) returns an unspecified value and raises INT_INVALID.

- ---
Fred J. Tydeman      +1 (512) 255-8696  Tydeman Consulting
3711 Del Robles      tydeman@tybor.com  Programming, testing, numerics
Austin, Texas 78727                     Voting member of X3J11 (ANSI "C")
USA                  Sample C9X+FPCE tests: ftp://jump.net/pub/tybor/