Document number: P0586R1
Date: 2018-08-17
Project: Programming Language C++
Reply-to: Federico Kircheis
Audience: Library Evolution Working Group

Safe integral comparisons

II. Motivation

Comparing integrals of different types may be a more complex task than expected. Most of the time we expect that a simple

	if(a < b){
		// ...
	} else {
		// ...
	}

should work in all cases, but if a and b are of different types, things are more complicated.

If a is a signed type and b an unsigned one, then, supposing that no integral promotion takes place, a is converted to the unsigned type. If a holds a negative value, the result may be unexpected: the expression a < b can evaluate to false, even though a negative number is always less than a non-negative one. The reason for this behavior is that unsigned types use modular arithmetic, whereas most of the time when we mix signed and unsigned types, for example when working with containers, we want ordinary integer arithmetic.

Converting integrals between different types can also be challenging. For simplicity, most of the time we assume that values are in range, and write

	a = static_cast<decltype(a)>(b);

If we want to write a safe conversion, we need to check if b has a value between std::numeric_limits<decltype(a)>::min() and std::numeric_limits<decltype(a)>::max(). We also need to pay attention that no implicit conversion (for example between unsigned and signed types) invalidates our comparison.
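To make the required checks concrete, here is a minimal sketch for one fixed pair of types, int and unsigned int. The helper names fits_in_int, fits_in_unsigned and to_int_or are hypothetical and exist only for this illustration; a generic version would have to repeat this reasoning for every combination of types.

```cpp
#include <limits>

// Hypothetical helpers, written for one fixed pair of types only.

// True if `b` can be stored in an int without changing its value.
// `b` is never negative, so only the upper bound needs a check; max() is
// cast to unsigned int so that both operands of `<=` have the same type
// and no implicit sign conversion takes place.
constexpr bool fits_in_int(unsigned int b) noexcept {
    return b <= static_cast<unsigned int>(std::numeric_limits<int>::max());
}

// True if `a` can be stored in an unsigned int without changing its value.
// Every non-negative int is representable as unsigned int, so only the
// lower bound needs a check.
constexpr bool fits_in_unsigned(int a) noexcept {
    return a >= 0;
}

// Checked variant of `a = static_cast<int>(b)`: yields `fallback` when
// `b` is out of range, instead of a value that no longer equals `b`.
constexpr int to_int_or(unsigned int b, int fallback) noexcept {
    return fits_in_int(b) ? static_cast<int>(b) : fallback;
}
```

Note that each direction needs a different check, and that the cast on max() is exactly the kind of detail that is easy to forget.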

Comparing and converting numbers, even of different numeric types, should be a trivial task. Unfortunately it is not, and because of implicit conversions we may write, without noticing it, unsafe code.
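The comparison pitfall can be reproduced with a few lines. The following is a minimal sketch; the function names builtin_less and manual_less are hypothetical and only serve to contrast the built-in operator with a hand-written check.

```cpp
// Built-in comparison: `a` is implicitly converted to unsigned int first,
// so a negative value wraps around to a large positive one.
// Most compilers can warn about the sign mismatch on this line.
constexpr bool builtin_less(int a, unsigned int b) noexcept {
    return a < b;
}

// Hand-written check: negative values are handled before the conversion,
// so the cast is only applied to values that it preserves.
constexpr bool manual_less(int a, unsigned int b) noexcept {
    return a < 0 || static_cast<unsigned int>(a) < b;
}

// -1 is mathematically less than 1, yet the built-in operator disagrees.
static_assert(!builtin_less(-1, 1u), "built-in comparison wraps around");
static_assert(manual_less(-1, 1u), "manual check compares by value");
```

For non-negative values of a, both functions agree; they differ only when a is negative.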

Most compilers are able to provide diagnostics and generate warnings when comparing values of different types, or when doing a narrowing conversion.

Developers are tempted to assume that values will mostly be in range and write a simple, but possibly wrong, cast in order to silence the warning, or not to turn on the corresponding compiler warning at all.

III. Proposal

This paper proposes to add a set of constexpr and noexcept functions for converting and comparing integrals of different signedness, except for bool and character types.

IV. Examples

Examples without current proposal

Comparing an unsigned int with an int:

	int a = ...
	unsigned int b = ...
	// add static_cast to avoid compiler warnings since we are doing a "safe" comparison
	if(a < 0 || static_cast<unsigned int>(a) < b){
		// do X
	} else {
		// do Y
	}

Comparing an int32_t with a uint16_t:

	int32_t a = ...
	uint16_t b = ...
	// add static_cast to avoid compiler warnings since we are doing a "safe" comparison
	if(a < static_cast<int32_t>(b)){
		// do X
	} else {
		// do Y
	}

Comparing an int with an intptr_t:

	int a = ...
	intptr_t b = ...
	if(???){ // no idea how to do it in one readable line without some assumption about int and intptr_t
		// do X
	} else {
		// do Y
	}

Example with current proposal

Comparing one integral type A with another integral type B (neither of them bool or a character type):

	A a = ...
	B b = ...
	// no need for any cast since std::cmp_less is taking care of everything
	if(std::cmp_less(a,b)){
		// do X
	} else {
		// do Y
	}

V. Possible implementation

A possible implementation can be found on GitHub. The only dependencies are the std::numeric_limits class template from the limits header, some traits from the type_traits header, and a standard-conforming C++11 compiler.
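As a rough illustration of what such an implementation might look like, the following C++11 sketch uses the same case distinction as the proposed wording: same signedness, signed versus unsigned, and unsigned versus signed. The namespace sketch and the helper template cmp are assumptions made for this example and are not taken from the linked implementation. For brevity the sketch derives the remaining ordering functions from cmp_less instead of spelling each one out, and it does not reject bool or character types.

```cpp
#include <limits>
#include <type_traits>

namespace sketch {  // hypothetical namespace, not part of the proposal

// Primary template: T and U have the same signedness, so the built-in
// operators already compare by value.
template <typename T, typename U,
          bool TSigned = std::is_signed<T>::value,
          bool USigned = std::is_signed<U>::value>
struct cmp {
    static constexpr bool equal(T t, U u) noexcept { return t == u; }
    static constexpr bool less(T t, U u) noexcept { return t < u; }
};

// T signed, U unsigned: a negative t is never equal to u and always less.
template <typename T, typename U>
struct cmp<T, U, true, false> {
    using UT = typename std::make_unsigned<T>::type;
    static constexpr bool equal(T t, U u) noexcept {
        return t >= 0 && static_cast<UT>(t) == u;
    }
    static constexpr bool less(T t, U u) noexcept {
        return t < 0 || static_cast<UT>(t) < u;
    }
};

// T unsigned, U signed: a negative u is never equal to t and never greater.
template <typename T, typename U>
struct cmp<T, U, false, true> {
    using UU = typename std::make_unsigned<U>::type;
    static constexpr bool equal(T t, U u) noexcept {
        return u >= 0 && t == static_cast<UU>(u);
    }
    static constexpr bool less(T t, U u) noexcept {
        return u >= 0 && t < static_cast<UU>(u);
    }
};

template <typename T, typename U>
constexpr bool cmp_equal(T t, U u) noexcept { return cmp<T, U>::equal(t, u); }
template <typename T, typename U>
constexpr bool cmp_not_equal(T t, U u) noexcept { return !cmp_equal(t, u); }
template <typename T, typename U>
constexpr bool cmp_less(T t, U u) noexcept { return cmp<T, U>::less(t, u); }
template <typename T, typename U>
constexpr bool cmp_greater(T t, U u) noexcept { return cmp_less(u, t); }
template <typename T, typename U>
constexpr bool cmp_less_equal(T t, U u) noexcept { return !cmp_less(u, t); }
template <typename T, typename U>
constexpr bool cmp_greater_equal(T t, U u) noexcept { return !cmp_less(t, u); }

// t is in range of R if it lies between the limits of R, compared by value.
template <typename R, typename T>
constexpr bool in_range(T t) noexcept {
    return cmp_greater_equal(t, std::numeric_limits<R>::min()) &&
           cmp_less_equal(t, std::numeric_limits<R>::max());
}

}  // namespace sketch

static_assert(sketch::cmp_less(-1, 1u), "-1 is less than 1u by value");
static_assert(sketch::in_range<unsigned char>(255), "255 fits");
static_assert(!sketch::in_range<unsigned char>(256), "256 does not fit");
```

Because every function is a single return statement, the sketch stays within the constexpr rules of C++11. Selecting the case through template specialization, rather than a run-time branch, means that a mixed-sign built-in comparison is never instantiated.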

VI. Effects on Existing Code

Since the proposed functions are not currently defined in any standard header, no existing code changes meaning.

VII. Design Decisions

This proposal addresses how to compare numerical values of different types (i.e. standard integer types and extended integer types) in a safe and simple way. It makes little sense to compare true, false, 'a' and other characters to numbers, since they represent different logical entities. The encoding of characters is also not specified, so the comparison 'a' == 97, although valid, might yield different results depending on the locale, compiler or platform.

Providing an overload for char might not reduce confusion, for example:

	char c = -1;
	cmp_less(c, 0); // true if char is signed, false if char is unsigned.

If the user has to choose between signed char or unsigned char, the behaviour will always be consistent. Using char for storing a number is a valid use case (the language permits it), but the types signed char and unsigned char should be preferred since those are standard integer types and have the same size.
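The dependence on the signedness of char can already be observed with the built-in comparison. A minimal sketch follows; the function name is_negative_after_minus_one is hypothetical and used only for this illustration.

```cpp
#include <limits>

// Stores -1 in a variable of type C and reports whether the stored value
// compares as negative. For an unsigned C the value wraps around to the
// maximum of C and is therefore not negative.
template <typename C>
constexpr bool is_negative_after_minus_one() noexcept {
    return static_cast<C>(-1) < 0;
}

// The explicitly signed and unsigned types give the same answer everywhere.
static_assert(is_negative_after_minus_one<signed char>(), "always true");
static_assert(!is_negative_after_minus_one<unsigned char>(), "always false");

// For plain char the answer follows the implementation's choice.
static_assert(is_negative_after_minus_one<char>() ==
                  std::numeric_limits<char>::is_signed,
              "depends on the signedness of char");
```

The first two assertions hold on every implementation, while the value on the left-hand side of the third differs between platforms.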

I would also recommend not to provide overloads for bool and the character types because it is easier to add them later if needed, whereas removing them might be more difficult since it would be a breaking change.

If the LEWG would like to include char, I think it would be better to provide an overload for every character type for consistency.

VIII. Further Considerations

I have heard rumors that the current operator< et al. might someday be deprecated, and perhaps changed to behave like the functions in this proposal.
I would like to add some considerations:

In 2016, Robert Ramey made a much larger proposal (see p0228r0) regarding safe integer types. He also used functions similar to those proposed in this paper to implement his classes and operators, so an alternative implementation can be found in his GitHub repository. This proposal addresses a smaller problem, namely comparing integral values, and is therefore much smaller.
The functions provided can also be used for creating safe integer types.

Another work, by Herb Sutter (see p0515r3), is about a new comparison operator (<=>). In its current state operator<=> will not compare different integral types, but as far as I have understood, a previous revision of that proposal stated that operator<=> should compare different integral types without modulo behaviour, which would make part of this proposal obsolete.

X. Proposed Wording

This section presents the wording changes for P0586R1. Any differences in semantics are unintentional. n4659 has been used as reference.

During the meeting at Rapperswil the committee expressed the idea of using the function names of the spaceship operator (is_eq, is_neq, is_lt, ..., see p0515) for this proposal, and giving the functions of the spaceship operator more verbose names. The functions used by the spaceship operator should not appear often, since they are used behind the scenes, whereas the functions in this proposal need to be called explicitly; such a change would therefore provide short and concise names that improve readability. I did not rename the functions of this proposal to the function names of the spaceship operator, in order to avoid confusion.

X.a Proposed Wording with long names

In 23.2.1 Header <utility> synopsis, add declarations:

	// 23.2.10, safe integral comparisons
	template <typename R, typename T>
	constexpr bool in_range(const T t) noexcept;

	template <typename T, typename U>
	constexpr bool cmp_equal(const T t, const U u) noexcept;
	template <typename T, typename U>
	constexpr bool cmp_not_equal(const T t, const U u) noexcept;

	template <typename T, typename U>
	constexpr bool cmp_less(const T t, const U u) noexcept;
	template <typename T, typename U>
	constexpr bool cmp_greater(const T t, const U u) noexcept;
	template <typename T, typename U>
	constexpr bool cmp_less_equal(const T t, const U u) noexcept;
	template <typename T, typename U>
	constexpr bool cmp_greater_equal(const T t, const U u) noexcept;

Add a new Section 23.2.10, safe integral comparisons, with the following content:

  1. For each of the following functions, if either of `T` or `U` is not
  a standard integer type or an extended integer type, as specified in 6.9.1,
  the call is ill-formed.
  [Note: std::byte, char, char16_t, char32_t, wchar_t, and bool are not
         comparable with these functions. --end note]

  template <typename T, typename U>
  constexpr bool cmp_equal(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t == u`. Otherwise, if `t` or `u` is negative, returns
      `false`. Otherwise, if `T` is a signed type, constructs from `t`
      a value `tu` of the corresponding unsigned type and returns
      `tu == u`.  Otherwise, if `U` is a signed type, constructs from
      `u` a value `uu` of the corresponding unsigned type and returns
      `t == uu`.


  template <typename T, typename U>
  constexpr bool cmp_not_equal(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t != u`. Otherwise, if `t` or `u` is negative, returns
      `true`. Otherwise, if `T` is a signed type, constructs from `t`
      a value `tu` of the corresponding unsigned type and returns
      `tu != u`.  Otherwise, if `U` is a signed type, constructs from
      `u` a value `uu` of the corresponding unsigned type and returns
      `t != uu`.


  template <typename T, typename U>
  constexpr bool cmp_less(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t < u`. Otherwise, if `t` is negative, returns `true`.
      Otherwise, if `u` is negative, returns `false`.  Otherwise, if `T`
      is a signed type, constructs from `t` a value `tu` of the
      corresponding unsigned type and returns `tu < u`.  Otherwise,
      if `U` is a signed type, constructs from `u` a value `uu` of the
      corresponding unsigned type and returns `t < uu`.


  template <typename T, typename U>
  constexpr bool cmp_greater(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t > u`. Otherwise, if `t` is negative, returns `false`.
      Otherwise, if `u` is negative, returns `true`.  Otherwise, if `T`
      is a signed type, constructs from `t` a value `tu` of the
      corresponding unsigned type and returns `tu > u`.  Otherwise,
      if `U` is a signed type, constructs from `u` a value `uu` of the
      corresponding unsigned type and returns `t > uu`.


  template <typename T, typename U>
  constexpr bool cmp_less_equal(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t <= u`. Otherwise, if `t` is negative, returns `true`.
      Otherwise, if `u` is negative, returns `false`.  Otherwise, if `T`
      is a signed type, constructs from `t` a value `tu` of the
      corresponding unsigned type and returns `tu <= u`.  Otherwise,
      if `U` is a signed type, constructs from `u` a value `uu` of the
      corresponding unsigned type and returns `t <= uu`.


  template <typename T, typename U>
  constexpr bool cmp_greater_equal(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t >= u`. Otherwise, if `t` is negative, returns `false`.
      Otherwise, if `u` is negative, returns `true`.  Otherwise, if `T`
      is a signed type, constructs from `t` a value `tu` of the
      corresponding unsigned type and returns `tu >= u`.  Otherwise,
      if `U` is a signed type, constructs from `u` a value `uu` of the
      corresponding unsigned type and returns `t >= uu`.


  template <typename R, typename T>
  constexpr bool in_range(const T t) noexcept;

    Returns: The same value as
      `cmp_greater_equal(t, std::numeric_limits<R>::min()) &&
       cmp_less_equal(t, std::numeric_limits<R>::max())`

				
In case the LEWG would like to include char in the argument set, replace paragraph 1 with:
  1. For each of the following functions, if either of `T` or `U` is not
  a standard integer type or extended integer type, as defined in
  6.9.1, and not char, the call is ill-formed.  If the implementation
  defines `char` to be a signed type, its corresponding unsigned type,
  in the following, is `unsigned char`.
  [Note: std::byte, char16_t, char32_t, wchar_t, and bool are not
         comparable using these functions. --end note]
				

X.b Proposed Wording with short names

In 23.2.1 Header <utility> synopsis, add declarations:

	// 23.2.10, safe integral comparisons
	template <typename R, typename T>
	constexpr bool in_range(const T t) noexcept;

	template <typename T, typename U>
	constexpr bool is_eq(const T t, const U u) noexcept;
	template <typename T, typename U>
	constexpr bool is_neq(const T t, const U u) noexcept;

	template <typename T, typename U>
	constexpr bool is_lt(const T t, const U u) noexcept;
	template <typename T, typename U>
	constexpr bool is_gt(const T t, const U u) noexcept;
	template <typename T, typename U>
	constexpr bool is_lteq(const T t, const U u) noexcept;
	template <typename T, typename U>
	constexpr bool is_gteq(const T t, const U u) noexcept;

Add a new Section 23.2.10, safe integral comparisons, with the following content:

  1. For each of the following functions, if either of `T` or `U` is not
  a standard integer type or an extended integer type, as specified in 6.9.1,
  the call is ill-formed.
  [Note: std::byte, char, char16_t, char32_t, wchar_t, and bool are not
         comparable with these functions. --end note]

  template <typename T, typename U>
  constexpr bool is_eq(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t == u`. Otherwise, if `t` or `u` is negative, returns
      `false`. Otherwise, if `T` is a signed type, constructs from `t`
      a value `tu` of the corresponding unsigned type and returns
      `tu == u`.  Otherwise, if `U` is a signed type, constructs from
      `u` a value `uu` of the corresponding unsigned type and returns
      `t == uu`.


  template <typename T, typename U>
  constexpr bool is_neq(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t != u`. Otherwise, if `t` or `u` is negative, returns
      `true`. Otherwise, if `T` is a signed type, constructs from `t`
      a value `tu` of the corresponding unsigned type and returns
      `tu != u`.  Otherwise, if `U` is a signed type, constructs from
      `u` a value `uu` of the corresponding unsigned type and returns
      `t != uu`.


  template <typename T, typename U>
  constexpr bool is_lt(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t < u`. Otherwise, if `t` is negative, returns `true`.
      Otherwise, if `u` is negative, returns `false`.  Otherwise, if `T`
      is a signed type, constructs from `t` a value `tu` of the
      corresponding unsigned type and returns `tu < u`.  Otherwise,
      if `U` is a signed type, constructs from `u` a value `uu` of the
      corresponding unsigned type and returns `t < uu`.


  template <typename T, typename U>
  constexpr bool is_gt(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t > u`. Otherwise, if `t` is negative, returns `false`.
      Otherwise, if `u` is negative, returns `true`.  Otherwise, if `T`
      is a signed type, constructs from `t` a value `tu` of the
      corresponding unsigned type and returns `tu > u`.  Otherwise,
      if `U` is a signed type, constructs from `u` a value `uu` of the
      corresponding unsigned type and returns `t > uu`.


  template <typename T, typename U>
  constexpr bool is_lteq(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t <= u`. Otherwise, if `t` is negative, returns `true`.
      Otherwise, if `u` is negative, returns `false`.  Otherwise, if `T`
      is a signed type, constructs from `t` a value `tu` of the
      corresponding unsigned type and returns `tu <= u`.  Otherwise,
      if `U` is a signed type, constructs from `u` a value `uu` of the
      corresponding unsigned type and returns `t <= uu`.


  template <typename T, typename U>
  constexpr bool is_gteq(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t >= u`. Otherwise, if `t` is negative, returns `false`.
      Otherwise, if `u` is negative, returns `true`.  Otherwise, if `T`
      is a signed type, constructs from `t` a value `tu` of the
      corresponding unsigned type and returns `tu >= u`.  Otherwise,
      if `U` is a signed type, constructs from `u` a value `uu` of the
      corresponding unsigned type and returns `t >= uu`.


  template <typename R, typename T>
  constexpr bool in_range(const T t) noexcept;

    Returns: The same value as
      `is_gteq(t, std::numeric_limits<R>::min()) &&
       is_lteq(t, std::numeric_limits<R>::max())`

				
In case the LEWG would like to include char in the argument set, replace paragraph 1 with:
  1. For each of the following functions, if either of `T` or `U` is not
  a standard integer type or extended integer type, as defined in
  6.9.1, and not char, the call is ill-formed.  If the implementation
  defines `char` to be a signed type, its corresponding unsigned type,
  in the following, is `unsigned char`.
  [Note: std::byte, char16_t, char32_t, wchar_t, and bool are not
         comparable using these functions. --end note]