Proposal to Add Decimal Floating Point Support to C++

Document: ISO/IEC JTC1 SC22 WG21 N3407=12-0097
Date: 2012-09-14
Project: Programming Language C++
Addresses: Library and Evolution Working Groups
Reply To:
Dietmar Kühl,
Bloomberg L.P.
39–45 Finsbury Square
London, EC2A 1PQ, United Kingdom

This document proposes to add decimal floating point support to the C++ standard. The current version doesn’t spell out the details but instead refers to the Decimal TR (ISO/IEC TR 24733) as a basis and describes changes to be applied to this interface to bring the proposal up to date with C++ 2011 enhancements.


C++ provides built-in data types for the processing of numerical values: float, double, and long double. The constraints for these types imply that a floating point representation is used, i.e., the values are represented using a fixed size as (-1)^sign * significand * base^exponent. The standard doesn’t mandate the base to be used and typically base 2 is chosen primarily because it yields the fastest computations.

In many areas, especially in finance, exact values need to be processed and the inputs are commonly decimal. Unfortunately, decimal values cannot, in general, be represented accurately using binary floating points even when the decimal values only use a few digits. Instead, the values become an approximation. As long as the values are carefully processed the original decimal value can be restored from a binary floating point (assuming reasonable restrictions on the number of decimal digits). However, computations and certain conversions introduce subtle errors (e.g., double to float and back to double, even if float is big enough to restore the original decimal value). As a result, the processing of exact decimal values using binary floating points is very error prone.
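The two effects mentioned above can be observed with a few lines of C++. The following is a minimal sketch, assuming that double and float use the IEEE 754 binary64 and binary32 formats; the function names are chosen for illustration only.

```cpp
#include <cassert>

// Returns true if 0.1 + 0.2 compares equal to 0.3 in binary double arithmetic.
// With IEEE 754 binary64 none of the three literals is represented exactly
// and the comparison fails.
bool binary_sum_is_exact() {
    volatile double a = 0.1, b = 0.2;  // volatile: keep the addition at run time
    double sum = a + b;
    return sum == 0.3;
}

// Returns true if the value survives a double -> float -> double round trip.
bool survives_float_round_trip(double value) {
    float narrowed = static_cast<float>(value);
    return static_cast<double>(narrowed) == value;
}
```

A value like 0.5, which is a power of two, survives the round trip, while 0.1 does not, although a float holds enough digits to print 0.1 again.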

The use of decimal floating points avoids many of the problems caused by binary floating points. In particular, computations which need to accurately process decimal numbers can use decimal floating points. Decimal floating points provide a useful and sufficient compromise for these domains. Since they use a fixed size representation, computations which are normally exact can introduce inaccuracies when the number of necessary digits becomes too big, but for actual applications this is rarely a problem. Also, decimal floating points cannot represent the result of all operations exactly. For example, the result of a division by a prime other than 2 or 5 will, in general, be rounded. In the contexts where exact results are needed, such operations aren’t needed.
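To illustrate the trade-off, the following sketch uses a toy decimal value consisting of an integer coefficient and a base 10 exponent. ToyDecimal, rescale, add, and divide_truncating are hypothetical names; this is not the interface of the Decimal TR, and overflow and rounding modes are ignored.

```cpp
#include <cassert>
#include <cstdint>

// Toy value: coeff * 10^exp. An illustration only, not the TR's decimal64.
struct ToyDecimal {
    std::int64_t coeff;
    int exp;
};

// Multiplies the coefficient by 10 until the exponent drops to target_exp.
// Assumes target_exp <= d.exp and that the coefficient does not overflow.
ToyDecimal rescale(ToyDecimal d, int target_exp) {
    while (d.exp > target_exp) {
        d.coeff *= 10;
        --d.exp;
    }
    return d;
}

// Addition is exact: both operands are brought to the smaller exponent.
ToyDecimal add(ToyDecimal a, ToyDecimal b) {
    const int exp = a.exp < b.exp ? a.exp : b.exp;
    return ToyDecimal{rescale(a, exp).coeff + rescale(b, exp).coeff, exp};
}

// Division keeps extra_digits additional digits and truncates the rest,
// so dividing by 3 cannot be exact.
ToyDecimal divide_truncating(ToyDecimal a, std::int64_t divisor, int extra_digits) {
    const ToyDecimal scaled = rescale(a, a.exp - extra_digits);
    return ToyDecimal{scaled.coeff / divisor, scaled.exp};
}
```

With this representation 0.1 + 0.2 yields exactly 3 * 10^-1, while 1 / 3 with four extra digits yields 3333 * 10^-4, i.e., a rounded result.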

The need for support of exact decimal computations is recognized in many communities and supported in several systems, although different alternatives for the support are chosen. Below is a list of example programming languages with decimal support:

  1. The C committee is working on a Decimal TR as TR 24732. The decimal support in C uses built-in types _Decimal32, _Decimal64, and _Decimal128.
  2. Java provides decimal arithmetic by java.math.BigDecimal, an arbitrary sized integer with an integer scale for the decimal places.
  3. Python provides decimal.Decimal which is a decimal floating point representation. The number of significant decimal digits can be set through the arithmetic context.
  4. .Net provides System.Decimal which is a 128 bit decimal floating point. The details of this representation are slightly different from the 128 bit decimal floating point in IEEE 754–2008. System.Decimal is accessible in C# as decimal.
  5. SQL provides a fixed point decimal representation where the number of digits and the number of fractional digits can be chosen for each context.
  6. Ruby provides BigDecimal, an arbitrary sized integer with an integer scale for the decimal places.

Since C++ is used in many places where accurate decimal arithmetic is required it seems reasonable to add similar support to the standard C++ library.


This document proposes to add the interfaces described by the Decimal TR, augmented to take advantage of C++ 2011 features as outlined below, as a mandatory part of the next revision of C++.

The Decimal TR was issued in 2009 and, thus, in 2014 a statement needs to be made whether it is to be affirmed, revised, or withdrawn. Since the next revision of C++ is scheduled to be released in 2016 it is also proposed that the Decimal TR is revised to reflect the changes outlined below for the 2014 systematic review. Assuming decimal floating point support is added to C++ 2016 the technical report can be withdrawn for the 2019 systematic review.

Potential Implications

Adding anything to the standard C++ library isn’t free and any component may depend on specific infrastructure to be present to be implementable. This section discusses the involved costs and requirements.

Hardware Support

The support for decimal floating point numbers described does not require specific hardware support. There are several software implementations of decimal floating points (Intel, IBM, HP) with suitable performance. It isn’t expected that decimal floating points are used for heavy number crunching because in these contexts the corresponding results will not be exact decimal values in the first place. Thus, the performance expectations for decimal floating points are different from those for floating points used for number crunching.

When processing decimal values using binary floating points it is necessary to convert fractional decimal values to the closest fractional binary values. These conversions are relatively expensive and are avoided when the processing is done with decimal floating points. Since the operations on binary floating points in general yield inaccurate decimal values the hardware support for binary floating point isn’t of much help when trying to process decimal values. Thus, performance comparisons between decimal floating points and binary floating points are misguided because they address different problems.

That said, dedicated hardware can improve the performance of operations using decimal floating points and the specification is written such that potentially available hardware support can be used for an implementation. For example, IBM provides a library which detects the presence of hardware support and only uses a software implementation for decimal floating points where no hardware support is available (see the section on DFPAL). The decimal support in gcc is based on libdfp, which also chooses between a hardware and a software implementation depending on availability of the hardware but which implements the interface of TR 24732.

Cost of Specification

The operations on decimal floating points are relatively complex. To yield predictable results for portable programs it is necessary to specify the details of rounding, retained precision, dealing with boundary conditions, etc. However, all of these details are already addressed by IEEE 754–2008. The specification in the C++ standard will have exactly the same semantics by referencing IEEE 754–2008 for the semantics. What needs to be specified are the interfaces to access the various features of IEEE 754–2008 in a natural way from C++.

The Decimal TR already spells out most aspects of a C++ binding. With the added C++ 2011 features it is possible to create a better user experience. There are some design areas open with respect to adding C++ 2011 support (see the section on Changes to the Decimal TR below). Thus, the overall cost of specification should be acceptable.

Cost of Implementation

The implementation of decimal floating point support is certainly not trivial. However, it is also not as complex as, e.g., the implementation of the special math functions. Several independent implementations are available for different platforms, including open source versions. The libraries mentioned below are all using a C interface which can be used to implement the C++ support.

  1. Intel’s library is distributed as source.
  2. IBM’s implementation is distributed as source with multiple open source projects (gcc and ICU).
  3. HP provides support for decimal floating points with their C and C++ compilers.

Implementing the interfaces specified by the Decimal TR in terms of the C implementations is relatively straightforward. It also seems reasonable that a native C++ implementation can be provided with a reasonable amount of work.

Cost of Testing

IBM provides a set of language independent test cases for the decimal floating point semantics on the General Decimal Arithmetic page. These can be processed by a C++ program to yield a reasonable basis for testing. A comprehensive test suite for the decimal floating point semantics is probably more involved but such test suites can be shared with other languages also requiring decimal floating point support, e.g., C, ECMAScript, etc. Testing the various C++ interfaces, i.e., the language specific parts which can’t be shared, shouldn’t be more involved than for other C++ libraries.

Cost of Support

The semantics of decimal floating points are very similar in spirit to the semantics of binary floating points. The primary difference is that the base is decimal rather than binary. Another major difference is that decimal floating points are not normalized, i.e., individual decimal values may have multiple representations (a group of different representations for the same value is referred to as a cohort by IEEE 754–2008). This freedom can be used to keep track of the precision of values and needs to be maintained during rounding. However, the overall complexity of the decimal floating point semantics is on a similar level as that of binary floating points. They are not dramatically more complex as is the case, e.g., with the special math functions. Staff capable of providing support for use of binary floating points will be able to also provide support for decimal floating points. To some extent, using and providing support for decimal floating points is easier than for binary floating points because all issues relating to base conversions disappear.
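The notion of a cohort can be sketched with a toy representation made of an integer coefficient and a base 10 exponent. ToyDecimal, same_value, and format are hypothetical names used for illustration; the sketch only handles non-negative values with a non-positive exponent and ignores overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

struct ToyDecimal {
    std::int64_t coeff;  // value = coeff * 10^exp
    int exp;
};

// Compares numeric values by bringing both operands to the smaller exponent.
// Members of the same cohort compare equal.
bool same_value(ToyDecimal a, ToyDecimal b) {
    while (a.exp > b.exp) { a.coeff *= 10; --a.exp; }
    while (b.exp > a.exp) { b.coeff *= 10; --b.exp; }
    return a.coeff == b.coeff;
}

// Formats a non-negative value with exp <= 0 and keeps trailing zeros,
// i.e., the precision carried by the representation.
std::string format(ToyDecimal d) {
    std::string digits = std::to_string(d.coeff);
    const std::size_t frac = static_cast<std::size_t>(-d.exp);
    if (frac == 0) {
        return digits;
    }
    if (digits.size() <= frac) {
        // Pad with leading zeros so that one digit precedes the decimal point.
        digits = std::string(frac - digits.size() + 1, '0') + digits;
    }
    return digits.substr(0, digits.size() - frac) + "." +
           digits.substr(digits.size() - frac);
}
```

Here 15 * 10^-1 and 150 * 10^-2 are members of the same cohort: they compare equal as values, but they are formatted as 1.5 and 1.50, respectively.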

Changes to the Decimal TR

The Decimal TR was targeting C++ 2003 and, thus, didn’t use any of the new C++ 2011 features. Several of the new features help in creating a better user experience and the specification in the Decimal TR needs to be updated to take these into account. This section describes the changes proposed to the Decimal TR.

Standard Layout Types

C++ 2003 didn’t have any concept of standard-layout types and it was impossible to make declared default constructors trivial to take advantage of POD types. In C++ 2011 the restrictions on types which can be treated specially are relaxed and standard-layout types are defined which support types with private non-static data members. Standard-layout types are, e.g., needed when communicating with other languages. Thus, all decimal types will be required to be standard-layout types.

Defaulted Default Constructors

To make a decimal type a POD type it needs to be a standard-layout type and a trivial class. Since there are several non-trivial constructors in each of the decimal types it is necessary to declare the default constructor. To keep the class trivial the default constructors need to be defaulted on the first declaration. The corresponding declarations will be changed to become

decimal32() = default;
decimal64() = default;
decimal128() = default;

The Decimal TR couldn’t make the decimal types trivial because there was no way for C++ 2003 to make a user-declared default constructor trivial. Instead, the Decimal TR defined the default constructor to initialize the decimal floating point with a zero value. For decimal floating points there is a large cohort of zero values and whichever zero is chosen is unlikely to be the right one in practice. Thus, using an explicitly defaulted default constructor is a semantic change possibly resulting in non-initialized decimal floating points but the advantages of this change seem to outweigh the disadvantages.
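The effect of defaulting the default constructor on its first declaration can be checked with the C++ 2011 type traits. The classes below are hypothetical stand-ins with an assumed 32 bit representation; they are not the types of the Decimal TR.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Stand-in with a defaulted default constructor next to a non-trivial
// constructor: the class stays trivial and standard-layout.
class toy_decimal32 {
public:
    toy_decimal32() = default;
    explicit toy_decimal32(std::uint32_t bits) : bits_(bits) {}
    std::uint32_t bits() const { return bits_; }
private:
    std::uint32_t bits_;
};

// Same shape, but with a user-provided default constructor which zero
// initializes, as the C++ 2003 based Decimal TR had to do.
class zeroing_decimal32 {
public:
    zeroing_decimal32() : bits_(0) {}
    std::uint32_t bits() const { return bits_; }
private:
    std::uint32_t bits_;
};

static_assert(std::is_trivial<toy_decimal32>::value, "defaulted: trivial");
static_assert(std::is_standard_layout<toy_decimal32>::value, "private members are fine");
static_assert(!std::is_trivial<zeroing_decimal32>::value, "user-provided: not trivial");
```

Note that a default-initialized toy_decimal32 object has an indeterminate value, which is the semantic change discussed above.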

Explicit Conversion Operators

The decimal floating-point types all have a conversion operator to long long obtaining the value truncated towards zero. This conversion yields an unspecified result when the integral part cannot be represented by long long or if the decimal type represents one of the special values. Note that the Decimal TR used the type long long although it was introduced only with C++ 2011. This was done to avoid compatibility issues between an implementation based on the TR and an implementation augmented to use new C++ 2011 features.

Although the conversion is sometimes useful it shouldn’t be implicit, i.e., these conversion operators will be made explicit:

explicit operator long long() const;

The behavior of these conversion operators will remain unchanged. Making the conversion explicit introduces an inconsistency with the existing floating point types float, double, and long double: These can be converted implicitly to integer types. Since the implicit conversions from floating point types to integers frequently introduce surprises it seems to be reasonable to make the conversion explicit for newly introduced types.
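The difference between an implicit and an explicit conversion operator can be shown with a small stand-in class. toy_decimal and whole_units are hypothetical names; the class simply stores a count of hundredths and is not the representation used by the Decimal TR.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in storing a value scaled by 100, i.e., with two decimal places.
class toy_decimal {
public:
    explicit toy_decimal(long long hundredths) : hundredths_(hundredths) {}
    // Truncates towards zero, like the conversion described above.
    explicit operator long long() const { return hundredths_ / 100; }
private:
    long long hundredths_;
};

static_assert(!std::is_convertible<toy_decimal, long long>::value,
              "no implicit conversion");
static_assert(std::is_constructible<long long, toy_decimal>::value,
              "explicit conversion is available");

long long whole_units(toy_decimal d) {
    // long long n = d;  // would not compile: the conversion is explicit
    return static_cast<long long>(d);
}
```

The value -2.50 converts to -2 and the value 1.99 converts to 1, but only where the conversion is requested with a cast.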

Section 4.2 (Conversions) of the Decimal TR describes how decimal floating-point types can be converted to basic floating types using a cast in C. Since implicit conversion between decimal floating-point types and basic floating types can easily create problems, corresponding conversions are not available in the Decimal TR. With the possibility of disabling implicit conversions, corresponding explicit conversions should be added:

explicit operator float() const;

Returns: If std::numeric_limits<float>::is_iec559 == true, returns the result of the conversion of *this to float, performed as in IEEE 754–2008. Otherwise, the returned value is implementation-defined.

explicit operator double() const;

Returns: If std::numeric_limits<double>::is_iec559 == true, returns the result of the conversion of *this to double, performed as in IEEE 754–2008. Otherwise, the returned value is implementation-defined.

explicit operator long double() const;

Returns: If std::numeric_limits<long double>::is_iec559 == true, returns the result of the conversion of *this to long double, performed as in IEEE 754–2008. Otherwise, the returned value is implementation-defined.

Whether the various decimal*_to_*() conversion functions used by the current Decimal TR are retained needs to be decided. In some contexts it may be preferable to use named functions. For example, the conversion operators are not necessarily suitable to be used as function objects. On the other hand, it is easy to create corresponding function objects using the explicit conversions.

Make the Decimal Types final

Although the decimal floating-point types are described as a library feature, some restrictions are imposed on them to allow implementing these types as built-in types. In particular, Section 2 (Conventions) states that the result of deriving from the decimal floating-point types is undefined. Instead of making this behavior undefined, all of the decimal floating-point types should be made final to prevent deriving:

class decimal32 final { ... };
class decimal64 final { ... };
class decimal128 final { ... };

Use of other operations capable of detecting if the type is implemented as a class or is a built-in type will remain undefined.
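The effect of final on a class type can be sketched as follows. toy_decimal64 is a hypothetical stand-in, and the std::is_final trait used for the check is assumed to be available, which requires C++ 2014 or later.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for a decimal type declared final.
class toy_decimal64 final {
public:
    toy_decimal64() = default;
};

// class derived : public toy_decimal64 {};  // ill-formed: the base class is final

static_assert(std::is_final<toy_decimal64>::value, "deriving is rejected by the compiler");
```

An attempt to derive from the type is diagnosed at compile time rather than resulting in undefined behavior.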

Exception Specifications

Many of the operations on decimal floating-point types have wide contracts and, thus, cannot throw any exception. Where appropriate the corresponding operations should be declared to be noexcept(true).

Note that the Decimal TR refers to “raising floating-point exceptions”. This doesn’t necessarily throw a C++ exception but may just set up an indication that a specific condition occurred. However, an implementation may choose to implement a mode of operation where C++ exceptions are thrown as a result of raising certain floating-point exceptions. Thus, the use of noexcept(true) probably won’t apply to many operations.
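The distinction between raising a floating-point exception and throwing a C++ exception can be observed with the existing binary floating point types and the floating point environment from <cfenv>. The sketch below assumes that double is an IEEE 754 type with infinities and that the implementation supports FE_OVERFLOW; overflowing_double is a hypothetical name.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <limits>

// Doubles the largest finite double. On the assumed platform the overflow
// raises a floating-point exception, i.e., it sets a status flag, and yields
// infinity. No C++ exception is thrown, so the function can be noexcept.
double overflowing_double(bool& overflow_flag_set) noexcept {
    std::feclearexcept(FE_ALL_EXCEPT);
    // volatile: force the multiplication to happen at run time, before the
    // status flags are inspected.
    volatile double big = std::numeric_limits<double>::max();
    volatile double result = big * 2.0;
    overflow_flag_set = std::fetestexcept(FE_OVERFLOW) != 0;
    return result;
}
```

Whether the status flag is reliably observable depends on the compiler’s support for accessing the floating point environment; the returned infinity does not.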

Constant Expressions

C++ 2011 added the ability to create constexpr functions. It may be desirable to turn certain operations into constexpr and it should be explicitly permitted to do so. In general, the operations shouldn’t be mandated to be constexpr because the semantics of many operations depend on run-time settings, e.g., because they use the rounding mode. On the other hand, the use of constexpr operations is especially desirable, e.g., because constant initialization is performed prior to any dynamic initialization (3.6.2, [basic.start.init], paragraph 2), thereby avoiding any issues relating to the order of initialization.

To really support the use of constant expressions for decimal floating-point types it is necessary to restrict the semantics of the operations. In particular, the operations need to be independent of the floating point environment (however, it seems ISO/IEC 60559 requires a way to specify the rounding mode to be used when computing constants). The conditional availability of the floating point environment would raise the requirement that functions can be overloaded on constexpr arguments, for example (in C++ 2011 it is not possible to overload these two functions):

constexpr decimal64 operator+ (constexpr decimal64 d1,
                               constexpr decimal64 d2);
decimal64           operator+ (decimal64 d1,
                               decimal64 d2);

The first function would be used if the arguments d1 and d2 are constant expressions, otherwise the other function would be used. The implementations of the version using constant expressions wouldn’t raise any floating point exception and wouldn’t depend on the dynamically specified floating point context. The implementations of both functions would do similar operations but possibly in vastly different ways. For example, the constant expression version would use a software implementation while the other version could be implemented to take advantage of hardware support for decimal floating points. However, corresponding support isn’t available in C++ 2011.

Not having constexpr support yields a viable library. If it is controversial to add constexpr it is probably safest to allow the use of constexpr but not to mandate it.
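What a constexpr operation which is independent of the floating point environment could look like is sketched below for a toy type made of an integer coefficient and a base 10 exponent. toy_decimal64 and rescale are hypothetical; the addition is exact as long as the coefficient doesn’t overflow, so neither a rounding mode nor status flags are involved. The functions use a single return statement to stay within the C++ 2011 constexpr rules.

```cpp
#include <cassert>

// Toy literal type: value = coeff * 10^exp.
struct toy_decimal64 {
    long long coeff;
    int exp;
    constexpr toy_decimal64(long long c, int e) : coeff(c), exp(e) {}
};

// Rescales d to target_exp, assuming target_exp <= d.exp.
constexpr toy_decimal64 rescale(toy_decimal64 d, int target_exp) {
    return d.exp <= target_exp
        ? d
        : rescale(toy_decimal64(d.coeff * 10, d.exp - 1), target_exp);
}

// Exact addition: both operands are brought to the smaller exponent.
constexpr toy_decimal64 operator+(toy_decimal64 a, toy_decimal64 b) {
    return toy_decimal64(
        rescale(a, a.exp < b.exp ? a.exp : b.exp).coeff +
        rescale(b, a.exp < b.exp ? a.exp : b.exp).coeff,
        a.exp < b.exp ? a.exp : b.exp);
}

// Constant initialization: computed by the compiler, before any dynamic
// initialization takes place.
constexpr toy_decimal64 total = toy_decimal64(15, -1) + toy_decimal64(25, -2);
static_assert(total.coeff == 175 && total.exp == -2, "1.5 + 0.25 == 1.75");
```

Operations which need to round, e.g., division, could not be written this way without fixing the rounding mode used for constant evaluation.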

Literal Suffixes

Section 4.1 (Literals) of the Decimal TR mentions that C uses literal suffixes for easy creation of decimal floating-point types. With the ability to define user-defined literals a similar mechanism can be provided in C++. That is, the following operators should be added:

template <char... C> constexpr decimal32 operator "" DF();
template <char... C> constexpr decimal64 operator "" DD();
template <char... C> constexpr decimal128 operator "" DL();
template <char... C> constexpr decimal32 operator "" df();
template <char... C> constexpr decimal64 operator "" dd();
template <char... C> constexpr decimal128 operator "" dl();

It may be desirable to mandate that these operators are constexpr. However, the implementations aren’t necessarily trivial. If mandated use of constexpr is controversial the support should only be allowed and not mandated.
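A raw literal operator template of this form can be sketched for a toy type as follows. The suffix _dd carries a leading underscore because suffixes without one are reserved for the standard library; toy_decimal64 and parse_decimal are hypothetical names. The sketch assumes C++ 2014 relaxed constexpr and only handles digits and a decimal point, i.e., no exponent part, no hexadecimal literals, and no overflow checks.

```cpp
#include <cassert>
#include <cstddef>

// Toy literal type: value = coeff * 10^exp.
struct toy_decimal64 {
    long long coeff;
    int exp;
};

// Parses the characters of a literal such as 1.50 into coefficient and
// exponent. Every digit after the decimal point lowers the exponent by one,
// so trailing zeros are preserved.
constexpr toy_decimal64 parse_decimal(const char* s, std::size_t n) {
    long long coeff = 0;
    int exp = 0;
    bool seen_point = false;
    for (std::size_t i = 0; i < n; ++i) {
        if (s[i] == '.') {
            seen_point = true;
        } else {
            coeff = coeff * 10 + (s[i] - '0');
            if (seen_point) {
                --exp;
            }
        }
    }
    return toy_decimal64{coeff, exp};
}

// Raw literal operator template: receives the literal's spelling as a
// character pack, so no binary floating point value is ever created.
template <char... C>
constexpr toy_decimal64 operator""_dd() {
    constexpr char digits[] = {C..., '\0'};
    return parse_decimal(digits, sizeof...(C));
}

static_assert((1.50_dd).coeff == 150 && (1.50_dd).exp == -2,
              "the trailing zero is kept");
```

Using the character pack rather than a long double parameter is what allows the literal 1.50 to be distinguished from 1.5.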

Decimal Formatting

Decimal floating points support a feature not available for binary floating points: They can represent the precision of the original number, i.e., they can keep track of trailing zeros after the decimal point (unless the number of digits would exceed the number of digits available in the decimal floating point). To support a choice of formatting the number using its own precision the C Decimal TR uses the %a and %A format specifiers which use the optionally present precision to restrict the formatted number to a maximum number of digits.

Table 88 in [facet.num.put.virtuals] paragraph 5 already specifies that the format specifiers %a and %A are used for floating point conversions when the floatfield is set to std::ios_base::fixed | std::ios_base::scientific (this is used to format binary floating point numbers using a hexadecimal format). The Decimal TR additionally expands the table for length modifiers to support the modifiers H, D, and DD for decimal32, decimal64, and decimal128, respectively. Thus, using std::ios_base::fixed | std::ios_base::scientific results in formatting decimal floating points taking their own precision into account when being formatted.

Unfortunately, this paragraph specifies that the precision (str.precision()) is only specified for floating point types if floatfield != (std::ios_base::fixed | std::ios_base::scientific). However, it is desirable to optionally impose an upper bound on the precision used. One way to address this problem is to change the corresponding paragraph to become

For conversion from a floating type, if floatfield != (ios_base::fixed | ios_base::scientific) or if a decimal floating-point type is formatted and 0 < str.precision(), str.precision() is specified in the conversion specification. Otherwise, no precision is specified.

With this change the currently set precision would be taken into account when formatting decimal floating points. When str.precision() == 0 and floatfield is set to std::ios_base::fixed | std::ios_base::scientific no precision would be specified with the %a or %A specifiers when formatting a decimal floating point, i.e., its own precision is used. Note that using a non-zero precision with the %a or %A format specifier affects all digits, not just the fractional digits (this is consistent with the way the %g and %G format specifiers work).
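The proposed rule can be expressed as a small predicate over the stream state. precision_is_specified is a hypothetical helper written for illustration; it is not part of the standard library or of the Decimal TR, and the is_decimal parameter stands in for the type of the value being formatted.

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Returns true if str.precision() would be specified in the conversion
// specification under the proposed wording.
bool precision_is_specified(const std::ios_base& str, bool is_decimal) {
    const std::ios_base::fmtflags own_format =
        std::ios_base::fixed | std::ios_base::scientific;
    const bool uses_own_format =
        (str.flags() & std::ios_base::floatfield) == own_format;
    return !uses_own_format || (is_decimal && 0 < str.precision());
}
```

For binary floating points the behavior is unchanged. For decimal floating points formatted with fixed | scientific the precision is only passed on if it is non-zero.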

When setting floatfield to std::ios_base::fixed | std::ios_base::scientific it would be desirable that the default precision used is the decimal floating point’s own precision. This would imply that str.precision() == 0 and it seems unlikely that the default for str.precision() is changed. An alternative approach could be to use a new attribute on std::ios_base, e.g., decimal_precision(), which is used with formatting of decimal floating points and whose initial value is 0. To avoid any additional memory overhead this attribute could be accessed using the std::ios_base::iword().
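How such an attribute could be stored using std::ios_base::xalloc() and std::ios_base::iword() is sketched below. The free functions are hypothetical and only approximate what a decimal_precision() member might do; the sketch ignores iword() allocation failures.

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Reserves one iword() index for the whole program on first use.
int decimal_precision_index() {
    static const int index = std::ios_base::xalloc();
    return index;
}

// Reads the per-stream decimal precision. An iword() slot which was never
// written yields 0, matching the proposed initial value.
long decimal_precision(std::ios_base& str) {
    return str.iword(decimal_precision_index());
}

// Sets the per-stream decimal precision and returns the previous value,
// mirroring the interface of std::ios_base::precision().
long decimal_precision(std::ios_base& str, long digits) {
    long& slot = str.iword(decimal_precision_index());
    const long previous = slot;
    slot = digits;
    return previous;
}
```

Streams which never format decimal floating points don’t allocate the slot, so they pay nothing for the attribute.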

Independent of how the precision is set, it may also be worth adding an alias for std::hexfloat which gives the operation a name meaningful in the context of decimal floating points. For example, std::decimal::ownprecision may be added which also sets the floatfield to std::ios_base::fixed | std::ios_base::scientific.
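Such a manipulator could be written like the existing floatfield manipulators. In the sketch below it is placed in a hypothetical namespace toy rather than in std::decimal, and matches_hexfloat is a helper added only to compare the resulting stream state with that of std::hexfloat.

```cpp
#include <cassert>
#include <ios>
#include <sstream>

namespace toy {
// Sets floatfield to fixed | scientific, the same state std::hexfloat uses.
inline std::ios_base& ownprecision(std::ios_base& str) {
    str.setf(std::ios_base::fixed | std::ios_base::scientific,
             std::ios_base::floatfield);
    return str;
}
}  // namespace toy

// Returns true if toy::ownprecision and std::hexfloat leave a stream with
// the same floatfield setting.
bool matches_hexfloat() {
    std::ostringstream a;
    std::ostringstream b;
    a << toy::ownprecision;
    b << std::hexfloat;
    return (a.flags() & std::ios_base::floatfield) ==
           (b.flags() & std::ios_base::floatfield);
}
```

Since both names select the same state, binary floating points written after toy::ownprecision would still be formatted in hexadecimal.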