N1535=03-0118, Random Number Generators Issues List

Document number:   N1535=03-0118
Date:   November 14, 2003
Project:   Programming Language C++
Reference:   ISO/IEC IS 14882:1998(E)
Reply to:   Pete Becker
  Dinkumware, Ltd.
  petebecker@acm.org


Contents

1 Confusing Text in Description of v.min()
2 Confusing and Incorrect Text in Description of v.max()
3 Table "Number Generator Requirements" Unnecessary
4 Should a variate_generator Holding a Reference Be Assignable?
5 Normal Distribution Incorrectly Specified
6 Should Random Number Initializers Take Iterators by Reference or by Value?
7 Are Global Operators Overspecified?
8 Should the Template Arguments Be Restricted to Built-in Types?
9 Do Engines Need Type Arguments?
10 Unclear Complexity Requirements for variate_generator
11 xor_combine Over-generalized?
12 xor_combine::result_type Incorrectly Specified
13 subtract_with_carry's IntType Overspecified
14 subtract_with_carry_01::seed(unsigned) Missing Constraint
15 subtract_with_carry_01::seed(unsigned) Produces Bad Values
16 subtract_with_carry_01::seed(unsigned) Argument Type Too Small
17 subtract_with_carry::seed(In&, In) Required Sequence Length Too Long
18 linear_congruential -- Giving Meaning to a Modulus of 0
19 linear_congruential::seed(IntType) -- Modify Seed Value When c == 0?
20 linear_congruential -- Should the Template Arguments Be Unsigned?
21 linear_congruential::linear_congruential(In&, In) -- Garbled Requires Clause
22 bernoulli_distribution Isn't Really a Template
23 Streaming Underspecified


Active Issues


1. Confusing Text in Description of v.min()

Section:   5.1.1 [tr.rand.req], Table 5.2
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: In "Uniform Random Number Requirements" the text says that v.min() "Returns ... l where l is ...". This is the letter ell, which is too easily confused with the numeral one. Can we change it to something less confusing, like "lim"?

Jens Maurer says: Sure.

Proposed Resolution:

Change the first sentence of the description of v.min() in 5.1.1 [tr.rand.req], Table 5.2 (Uniform random number generator requirements) from:

Returns some l where l is less than or equal to all values potentially returned by operator().

to:

Returns a value that is less than or equal to all values potentially returned by operator().

2. Confusing and Incorrect Text in Description of v.max()

Section:   5.1.1 [tr.rand.req], Table 5.2
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: In "Uniform Random Number Requirements" the text says that v.max() "returns l where l is less than or equal to all values...". Should this be "greater than or equal to"? And similarly, should "strictly less than" be "strictly greater than."?

Jens Maurer says: Yes, cut&paste goof from v.min().

Pete Becker says: And, as in the previous issue, l is unclear.

Proposed Resolution:

Change the first sentence of the description of v.max() in 5.1.1 [tr.rand.req], Table 5.2 (Uniform random number generator requirements) from:

If std::numeric_limits<T>::is_integer, returns l where l is less than or equal to all values potentially returned by operator(), otherwise, returns l where l is strictly less than all values potentially returned by operator().

to:

If std::numeric_limits<T>::is_integer, returns a value that is greater than or equal to all values potentially returned by operator(), otherwise, returns a value that is strictly greater than all values potentially returned by operator().

3. Table "Number Generator Requirements" Unnecessary

Section:   5.1.1 [tr.rand.req], Table 5.1
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: The table "Number Generator Requirements" has only one entry: X::result_type. While it's true that random number generators and random distributions have this member, it doesn't seem like a useful basis for classification -- there's nothing in the proposal that depends on knowing that some type satisfies this requirement. I think the specification of X::result_type should be in "Uniform Random Number Generator Requirements" and in "Random Distribution Requirements."

Proposed Resolution:

Copy the description of X::result_type from 5.1.1 [tr.rand.req], Table 5.1 (Number generator requirements) to 5.1.1 [tr.rand.req], Table 5.2 (Uniform random number generator requirements) and to 5.1.1 [tr.rand.req], Table 5.4 (Random distribution requirements) and remove 5.1.1 [tr.rand.req], Table 5.1 (Number generator requirements).


4. Should a variate_generator Holding a Reference Be Assignable?

Section:   5.1.3 [tr.rand.var]
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: The third paragraph says, in part:

Specializations of variate_generator satisfy the requirements of CopyConstructible. They also satisfy the requirements of Assignable unless the template parameter Engine is of the form U&.

I haven't thought this through carefully, but this looks like an implementation artifact. Is there a reason that variate_generators whose engine type is a reference should not be assignable?

Jens Maurer says: Not that I remember, except the implementation hiccup. As you point out, that can be easily worked around. I need to think a little more on that.
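
For reference, here is a minimal sketch (not from the proposal; the names are placeholders) of the likely implementation artifact: a class with a reference member has no usable implicitly-defined copy assignment operator, which would explain why the Assignable requirement was dropped when Engine is of the form U&.

    template <class _Engine, class _Distribution>
        class variate_gen_sketch
        {
        public:
            variate_gen_sketch(_Engine e, _Distribution d)
                : _Eng(e), _Dist(d) {}
            // the implicit copy constructor works even when _Engine is U&,
            // but the implicit copy assignment operator is ill-formed for
            // a class with a reference member
        private:
            _Engine _Eng;
            _Distribution _Dist;
        };

The easy workaround alluded to above -- an assumption here, not something spelled out in the discussion -- is to hold a pointer internally when Engine is a reference type, which restores assignability.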

Proposed Resolution:

Change the first two sentences of the third paragraph of 5.1.3 [tr.rand.var] from:

Specializations of variate_generator satisfy the requirements of CopyConstructible. They also satisfy the requirements of Assignable unless the template parameter Engine is of the form U&.

to:

Specializations of variate_generator satisfy the requirements of CopyConstructible and Assignable.

5. Normal Distribution Incorrectly Specified

Section:   5.1.7.8 [tr.rand.dist.norm]
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: For normal_distribution, the paper says that the probability density function is 1/sqrt(2*pi*sigma) * exp(- (x - mean)^2 / (2 * sigma^2)). The references I've seen have a different initial factor, using 1/(sqrt(2*pi) * sigma). That is, sigma is outside the square root. Should we change the description in the paper to match?

Marc Paterno says: The correct probability density function is as Pete gives it, and the paper should be changed to match.

Proposed Resolution:

Change the first paragraph of 5.1.7.8 [tr.rand.dist.norm] from:

A normal_distribution random distribution produces random numbers x distributed with probability density function (1/sqrt(2*pi*sigma)) * e^(-(x-mean)^2/(2*sigma^2)), where mean and sigma are the parameters of the distribution.

to:

A normal_distribution random distribution produces random numbers x distributed with probability density function (1/(sqrt(2*pi)*sigma)) * e^(-(x-mean)^2/(2*sigma^2)), where mean and sigma are the parameters of the distribution.

6. Should Random Number Initializers Take Iterators by Reference or by Value?

Section:   5.1.4.1 [tr.rand.eng.lcong]
  5.1.4.2 [tr.rand.eng.mers]
  5.1.4.3 [tr.rand.eng.sub]
  5.1.4.4 [tr.rand.eng.sub1]
  5.1.4.5 [tr.rand.eng.disc]
  5.1.4.6 [tr.rand.eng.xor]
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: The random number generators can be initialized and seeded by passing two iterators that specify a range of values. The first iterator is passed by reference and not by value. The reason for doing this is things like xor_combine, which holds two generators, and needs to initialize both generators from a single sequence:

template <class _Engine1, class _Engine2> // simplified
    class xor_combine
    {
    public:
        template <class _InIt>
        xor_combine(_InIt& _First, _InIt _Last)
        	: _Eng1(_First, _Last), _Eng2(_First, _Last) {}
    private:
        _Engine1 _Eng1;
        _Engine2 _Eng2;
    };

With this, _Eng1 is initialized with the values at the head of the sequence, and _Eng2 is initialized with the values that follow the ones that _Eng1 consumed.

That looks like a good idea, but there's a problem. It means that the following code is ill-formed:

unsigned long init[] = { 1, 2, 3 };
minstd_rand rng0(init, init + 3);	// illegal, init not a modifiable lvalue
minstd_rand rng1;
rng1.seed(init, init + 3);		// illegal, init not a modifiable lvalue

It seems to me that this is what most users will try first (I did), and it's surprising that it fails. It also seems like it would be the most common form of initialization if it worked.

This can be made to work by passing the first iterator by value, and changing seed to return an iterator in the usual way: it returns the iterator that points to the element just past the last one it used. With just this change, the constructor for xor_combine looks like this:

template <class _InIt>
xor_combine(_InIt _First, _InIt _Last)
	: _Eng1(), _Eng2()
	{
	_InIt _Next = _Eng1.seed(_First, _Last);
	_Eng2.seed(_Next, _Last);
	}

An obvious drawback with this approach is that each contained generator gets default initialized, then re-seeded. If that's a big enough problem (seeding some generators might be computationally expensive), we could add an additional constructor to each generator. The constructor is called with a value of some designated type, and doesn't do any more initialization than is necessary to make the object destructible:

class rng_no_init_t {} rng_no_init;

template <class _InIt>
xor_combine(_InIt _First, _InIt _Last)
	: _Eng1(rng_no_init), _Eng2(rng_no_init)
	{
	_InIt _Next = _Eng1.seed(_First, _Last);
	_Eng2.seed(_Next, _Last);
	}

Of course, that constructor would only be used in this sort of situation, where re-seeding will occur before any actual use of the generator.

Jens Maurer says: I don't know how to put it in standardese, but the idea is that the unsuspecting user should use the one-argument integer constructor, i.e. minstd_rand rng0(42), to likely get reasonable seeding. All base engines provide a constructor like this. The (it1,it2) constructors were meant for "expert" users that wanted fine-grained control over seeding. It appears unlikely that those "expert" users would use a fixed C-like array of seeds and could not be bothered to introduce an additional local variable for the lvalue it1.

Let's investigate what [the suggested change] means:

rng_no_init_t has to be accessible by the compound engine's constructor definition (e.g. xor_combine) and the declaration of the constructor in the base engine (e.g. _Eng1). Together with the prospect of the end user writing additional compound or base engines that should combine freely with the library-provided ones, that means that rng_no_init_t cannot be an implementation-defined or class-private type. It has to be publicly specified.

In order to allow free combination of engines, the "pseudo-random number engine" requirements must require, for each engine, a constructor taking an rng_no_init_t. The operator(), together with << and >> streaming operators, would then be specified to have undefined behavior until the next successful call to seed(), if the object was constructed using rng_no_init_t.

While this can certainly be done, it leaves the odd feeling that we would deliberately introduce a constructor that starts the lifetime of the object, but leaves the object semantically unusable (internal state uninitialized). This is not what I thought user-defined constructors were meant for. There is the potential for misuse by the user, if he constructs an object with the public rng_no_init_t constructor without calling a seed() function afterwards. Calls to operator() would likely return apparently "random" numbers, because on most platforms, accessing uninitialized variables means reading "random" garbage. However, none of the assertions of the engine's specification would be met. It is possible that this misuse is not detected except by careful code review. On the other hand, the misuse in your first example is caught by the compiler. Problems caught by the compiler are cheapest to fix, compared to runtime problems.

I'm not particularly fond of the current (it1,it2) interface, either, but, given the constraints of constructors, it's the only interface I could come up with (see discussion in section III.E of N1452).

I would appreciate hearing other people's opinion on this, in particular the perceived "ugliness" of the long[] initialization in the example above.

Pete Becker says: With regard to "expert" users, it's not so much that they could not be bothered as that they will find it surprising. It's a range, and we don't require an lvalue as the first iterator to a range anywhere else.

It may be that this is intended for expert users, but STL-conscious folks will see iterators and use them.

My suggestion uses a fairly common idiom, called "two-phase initialization." I agree it's better to avoid it, but I don't see any other way to avoid the non-intuitive semantics of requiring an lvalue for an iterator.

With regard to the "ugliness" of the initialization, here's the way to do it currently:

    unsigned long init[] = { 1, 2, 3 };
    unsigned long *lval_init = init;
    minstd_rand rng(lval_init, init + 3);

Here's what I'd like to be able to do:

    unsigned long init[] = { 1, 2, 3 };
    minstd_rand rng(init, init + 3);

Peter Dimov says: This is, to me, an interesting issue WRT the move/forwarding proposal.

Part of the problem is that It& will not bind to rvalues, as in the example below:

    vector<unsigned long> v;
    // read v
    minstd_rand rng(v.begin(), v.end());

This can be fixed by making the first argument It && when, or if, the rvalue reference is accepted.

The second half of the problem is that in

    unsigned long init[] = { 1, 2, 3 };
    minstd_rand rng(init, init + 3);

init is an lvalue of type unsigned long [3], and init + 3 is an rvalue of type unsigned long *; even an rvalue reference-based interface will fail deduction.

This can be fixed by, for example, adding a constructor that takes T (&)[N].
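
A hypothetical sketch of such a constructor (the engine name, the placeholder seed body, and the member names are illustrative, not from the proposal):

    #include <cstddef>

    class some_engine
        {
        public:
            template <class _InIt>
            void seed(_InIt& first, _InIt last)
                { while (first != last) ++first; }  // placeholder: consume seeds

            // the suggested form: N is deduced, and the array itself
            // never decays to an rvalue pointer
            template <class T, std::size_t N>
            explicit some_engine(T (&seeds)[N])
                {
                T* first = seeds;
                seed(first, seeds + N);
                }
        };

With it, the array from Pete's example could be passed directly, as in some_engine rng(init).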

Pete Becker says: Or by writing the code correctly <g>:

    minstd_rand rng(&init[0], init + 3);

In any case, my concern is to be able to write code that looks like the analogous use of sequences in other parts of the library.

Bjarne Stroustrup says: and that's an important concern.


7. Are Global Operators Overspecified?

Section:   5.1.1 [tr.rand.req]
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: Table 5.3 (Pseudo-random number engine requirements) says that for every engine, each of the following must be valid:

    x == y
    x != y
    os << x
    is >> x

In addition, though, each of the four engines is specified as having a global operator for each of these operations, taking the engine type as appropriate.

Technically that means that I can't, for example, implement subtract_with_carry and subtract_with_carry_01 using a single template (suitably parametrized) as a base class, and supply the appropriate global operators for that base template. Some smart-aleck conformance test writer <g> will check for the precise signature, find that it's not there, and report a conformance failure.

So, do we want just the operational requirement that x == y must work, or do we want the more specific requirement that for each engine type there must be a corresponding operator==?

Marc Paterno says: I think the ability to test for equality (or inequality) is important, but that the way it is implemented should be up to the implementer. So I'm happy with the operational requirement that x == y must work, but see no need to require any specific signature.

Peter Dimov says: If

    template<class T> bool operator==(T const &, T const &) { return false; }

is in scope,

    x == y

will go to that template instead of the base class version.

One possible use for introducing such an operator is to use x == y in cases where decltype(x) does not have an operator==. This technique is very rare and fragile, though.

Pete Becker says: I wouldn't have put it so delicately. I have no sympathy for someone who deliberately breaks inheritance relationships with that sort of $"#*$*&. Templates are a delicate tool, not a sledgehammer.

Peter Dimov says: I'm not sure whether I expressed myself well. I'll try again with code.

    #include <iostream>

    namespace hidden
    {
    template<class T> bool operator==(T const &, T const &)
        { return false; }
    }

    template<class X> bool compare(X const & x, X const & y)
    {
    using namespace hidden;
    return x == y;
    }

    struct V {}; // no operator==

    int main()
    {
    V v1, v2;
    // v1 == v2; // error

    std::cout << compare(1, 1) << std::endl;
    std::cout << compare(v1, v2) << std::endl;
    }

So the "evil" template operator== is delicately applied in compare (being selectively made visible only at the scope of the function) to build a comparison that yields false when there is no operator== defined for the type.

I'll leave it to others to judge whether this technique is worth supporting.

Pete Becker says: Doesn't change my reaction. It's still a sledgehammer. There are things that library designers ought to take into account and things that users should know are risky. Writing templates as if the language did not have inheritance is risky. Anyone who does this will have to pick up the pieces of the code that they broke.


8. Should the Template Arguments Be Restricted to Built-in Types?

Section:  
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: Generators and distributions are all parametrized on arithmetic types, named RealType, IntType, and UIntType. The proposal describes these types in terms of the operations that can be applied to them and the functions that can be called on them. So, for example,

A template parameter named IntType shall denote a type that represents an integer number. This type shall meet the requirements for a numeric type (26.1 [lib.numeric.requirements]), the binary operators +, -, *, /, % shall be applicable to it, and a conversion from int shall exist. [Footnote: The built-in types int and long meet these requirements.]

The idea, I assume, is to support user-defined types that look like integer types. Although it doesn't say so, the requirement seems to be that the algorithm be implemented with the user-defined type, since some of the parameters for the algorithm (specified as template arguments or as arguments to the ctor) are IntType, and there is no conversion required from IntType to any of the built-in types. This is a fairly severe constraint on implementors. Is it worth it?

It's easy to go astray here. For example the Boost implementation of uniform_int uses std::numeric_limits<IntType>, and it uses <, ++, <=, +=, *=, ==, >. It also applies a static_cast to convert the result_type of the distribution (typedefed to IntType) into the result_type of the underlying random number generator. These operations cannot be expected to work under the constraints given above.

I'm not picking on the Boost implementation here: our implementation does the same sorts of things. The point is that specifying what an integer type ought to look like is hard. If we don't get it right there will be techniques that work just fine on built-in types but can't be used because user-defined types aren't required to support them.

Another example comes from the floating point types. Some of the distributions can be implemented with techniques that count the number of high-order 0 bits in the binary fraction that represents a value between 0.0 and 1.0. That can be fast and cheap if I know that the type being used is the platform's builtin floating point type, because I can look at the bits directly instead of using floating point math.

Of course, these optimization techniques can be applied through partial specialization, so that the optimized versions are "only" used for builtin types, and sub-optimal versions are available for user-defined types. That means that library vendors will have to supply code that's difficult to test and will never (rarely?) be used.
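
For illustration, a sketch of the dispatch described above (all names are hypothetical; the generic version restricts itself to portable arithmetic, while the specialization for the built-in double is free to examine the representation directly):

    template <class RealType>
        struct canonical
        {
        // portable arithmetic only: divide by 2^32
        static RealType make(unsigned long x)
            { return RealType(x) / (RealType(65536) * RealType(65536)); }
        };

    template <>
        struct canonical<double>
        {
        // knows the representation: could count leading zero bits or
        // manipulate the mantissa directly instead
        static double make(unsigned long x)
            { return x * (1.0 / 4294967296.0); }
        };

As the issue observes, the generic version would rarely be exercised in practice, which is what makes it hard to test.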

How important is this? Would it be sufficient to say that RealType means one of float, double, long double (which is, in essence, what we say for complex), IntType is either int or long, and UIntType is either unsigned int or unsigned long?


9. Do Engines Need Type Arguments?

Section:  
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: Issue number 8 gives reasons for restricting the type arguments to builtin types. If we do this, it turns out that the type arguments for engines are pretty much redundant. I'm not clear yet on whether they're needed for distributions, so I'm only talking here about engines.

The implementation can always determine an appropriate type from the other arguments to the engine templates. For example, mersenne_twister takes an argument that gives the number of bits in each word. The implementor is in the best position to decide what type to use to represent that number of bits. A user trying to write portable code won't ever specify any type other than unsigned long for a 32-bit mersenne twister, because portable code can't assume that any other type can hold a 32 bit value. The standard library implementor can do better, avoiding gratuitously large types.

Part of the issue here is an assumption in the proposal that these templates should be implemented using the user-specified type internally. For builtin types, the type used internally isn't detectable provided the algorithm is implemented correctly. And there are potentially large performance gains from, for example, implementing a linear congruential generator that returns 32-bit values using a 64-bit type internally. The as-if rule means that the user-specified type is essentially irrelevant to the implementation, and any good implementation will make at least as good a choice here as the user can, and usually better.

The other place where the user-specified type appears is in the return type of operator(). So, for example, linear_congruential requires the user to supply a type that's large enough to hold values in the range from 0 to m-1. But the implementation can determine an appropriate type based on the value of m, so what's gained by requiring the user to specify this type?
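
For illustration only, a C++03-style sketch of deriving the result type from the value of m rather than asking the user for it (all names here are hypothetical):

    // choose between two candidate types on a compile-time condition
    template <bool _Small> struct _Select_uint
        { typedef unsigned short type; };
    template <> struct _Select_uint<false>
        { typedef unsigned long type; };

    // pick a type just wide enough for values in [0, m-1]
    template <unsigned long m>
        struct _Lc_result
        {
        typedef typename _Select_uint<(m - 1) <= 65535UL>::type type;
        };

(When m is 0, meaning max()+1 as discussed in issue 18, the wraparound of m - 1 conveniently selects the wider type.)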


10. Unclear Complexity Requirements for variate_generator

Section:   5.1.3 [tr.rand.var]
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: The specification for variate_generator says

Specializations of variate_generator satisfy the requirements of CopyConstructible. They also satisfy the requirements of Assignable unless the template parameter Engine is of the form U&. The complexity of all functions specified in this section is constant. No function described in this section except the constructor throws an exception.

Taken literally, this isn't implementable. operator() calls the underlying distribution's operator(), whose complexity isn't directly specified. The distribution's operator() makes an amortized constant number of calls to the generator's operator(), whose complexity is, again, amortized constant. So the complexity of variate_generator::operator() ought to also be amortized constant.

variate_generator also has a constructor that takes an engine and a distribution by value, and uses their respective copy constructors to create internal copies. There are no complexity constraints on those copy constructors, but given that the default constructor for an engine has complexity O(size of state), it seems likely that an engine's copy constructor would also have complexity O(size of state). This means that variate_generator's complexity is at best O(size of engine's state), not constant.

I suspect that what was intended was that these functions would not introduce any additional complexity, that is, their complexity is the "larger" of the complexities of the functions that they call.


11. xor_combine Over-generalized?

Section:   5.1.4.6 [tr.rand.eng.xor]
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: For an xor_combine engine, is there ever a case where both s1 and s2 would be non-zero? Seems like this would produce non-random values, because the low bits (up to the smaller of the two shift values) would all be 0.

Jens Maurer says: Yes, indeed. Good spotting. A case of over-generalization on my part.

Pete Becker says: If at least one has to be 0, then we only need one shift value, and the definition might look more like this:

    template <class _Engine1, class _Engine2, int _Shift = 0> ...

with the output being (_Eng1() ^ (_Eng2() << _Shift)).

Jens Maurer says: That requires that Eng1 is the engine associated with the 0 shift, so it enforces a specific ordering of arguments which might not match the user-desired initialization ordering with the dreadful seed(it1,it2) interface. But that appears as a minor drawback compared to the simplified interface.

See also: the related argument in issue 12 below.


12. xor_combine::result_type Incorrectly Specified

Section:   5.1.4.6 [tr.rand.eng.xor]
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: xor_combine has a member

 typedef typename base_type::result_type result_type;

However, it has no type named base_type, only base1_type and base2_type. So, what should result_type be?

Jens Maurer says: base1_type, by "random" choice.

(That's another argument why you may want to have the freedom of reordering the base engine types in the interface: With the current interface, it's easy to get base2_type, just reorder. When we have only one shift count, reordering changes the shifting.)

Pete Becker says: Could be. I think there are a couple of other reasonable result types, though: the larger of the two types might be appropriate, or the actual type of the computed result. Neither of these depends on the order of the engine arguments.


13. subtract_with_carry's IntType Overspecified

Section:   5.1.4.3 [tr.rand.eng.sub]
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: The IntType for subtract_with_carry "shall denote a signed integral type large enough to store values up to m - 1." The implementation subtracts two values of that type, and if the result is < 0 it adds m back, which makes the result non-negative. In fact, this also works for unsigned types, with just a small change in the implementation: instead of testing whether the result is < 0, you test whether it's greater than or equal to m. This works because unsigned arithmetic wraps, and it makes the template a bit easier to use.
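
A minimal sketch of one generation step under that change, assuming an unsigned IntType and assuming m is small enough that a wrapped difference always lands at or above m (names are illustrative):

    template <class IntType>
        IntType swc_step(IntType x_s, IntType x_r, IntType carry, IntType m)
        {
        IntType res = x_s - x_r - carry;   // may wrap around
        if (res >= m)                      // replaces the res < 0 test
            res += m;                      // wraps back into [0, m)
        return res;
        }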

I suggest that we loosen the constraint to allow signed and unsigned types. Thus the constraint would read "shall denote an integral type large enough to store values up to m - 1."

Proposed Resolution:

Change:

The template parameter IntType shall denote a signed integral type large enough to store values up to m-1.

to:

The template parameter IntType shall denote an integral type large enough to store values up to m-1.

14. subtract_with_carry_01::seed(unsigned) Missing Constraint

Section:   5.1.4.4 [tr.rand.eng.sub1]
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: The specification for subtract_with_carry::seed(IntVal) has a Requires clause which requires that the argument be greater than 0. This member function needs the same constraint.

Proposed Resolution:

Add:

Requires: value > 0

to the description of subtract_with_carry_01::seed(unsigned) in 5.1.4.4 [tr.rand.eng.sub1].


15. subtract_with_carry_01::seed(unsigned) Produces Bad Values

Section:   5.1.4.4 [tr.rand.eng.sub1]
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: subtract_with_carry_01::seed(unsigned int) uses a linear congruential generator to produce initial values for the fictitious previously generated values. These values are generated as (y(i)*2^-w) mod 1. The linear congruential generator produces values in the range [0, 2147483564), which are at most 31 bits long. If the template argument w is greater than 31 the initial values generated by seed will all be rather small, and the first values produced by the generator will also be rather small. The Boost implementation avoids this problem by combining values from the linear congruential generator to produce longer values when w is larger than 32. Should we require something more like that?
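
A sketch of the Boost-style workaround described above; the arithmetic is illustrative rather than exact, and lcg() is a hypothetical stand-in for the seeding generator:

    #include <cmath>

    extern unsigned long lcg();   // hypothetical: values in [0, 2147483564)

    double initial_value(int w)
        {
        double x = 0.0;
        double factor = std::ldexp(1.0, -w);    // 2^-w
        for (int bits = 0; bits < w; bits += 32)
            {                                   // one draw per word of the fraction
            x += lcg() * factor;
            factor = std::ldexp(factor, 32);
            }
        return x - std::floor(x);               // reduce mod 1
        }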


16. subtract_with_carry_01::seed(unsigned) Argument Type Too Small

Section:   5.1.4.4 [tr.rand.eng.sub1]
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: subtract_with_carry_01::seed(unsigned) has a default argument value of 19780503, which is too large to fit in a 16-bit unsigned int. Should this argument be unsigned long, to ensure that it's large enough for the default?

Proposed Resolution:

Change the signatures of this constructor and member function:

    subtract_with_carry_01(unsigned int value);
    void seed(unsigned int value = 19780503);

to:

    subtract_with_carry_01(unsigned long value);
    void seed(unsigned long value = 19780503);

17. subtract_with_carry::seed(In&, In) Required Sequence Length Too Long

Section:   5.1.4.3 [tr.rand.eng.sub]
  5.1.4.4 [tr.rand.eng.sub1]
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: For both subtract_with_carry::seed(In& first, In last) and subtract_with_carry_01::seed(In& first, In last) the proposal says: "With n=w/32+1 (rounded downward) and given the values z(0) ... z(n*r-1)." The idea is to use n unsigned long values to generate each of the initial values for the generator, so n should be the number of 32-bit words needed to provide w bits. Looks like it should be "n=(w+31)/32". As currently written, when w is 32, the function consumes two 32-bit values for each value that it generates. One is sufficient.

Proposed Resolution:

Change

With n=w/32+1 (rounded downward) and given the values z(0) ... z(n*r-1)

to

With n=(w+31)/32 (rounded downward) and given the values z(0) ... z(n*r-1)

in the description of subtract_with_carry::seed(In& first, In last) in 5.1.4.3 [tr.rand.eng.sub] and in the description of subtract_with_carry_01::seed(In& first, In last) in 5.1.4.4 [tr.rand.eng.sub1].


18. linear_congruential -- Giving Meaning to a Modulus of 0

Section:   5.1.4.1 [tr.rand.eng.lcong]
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: Some linear congruential generators using an integral type _Ty also use a modulus that's equal to numeric_limits<_Ty>::max() + 1 (e.g. 65536 for a 16-bit unsigned int). There's no way to write this value as a constant of the type _Ty, though. Writing it as a larger type doesn't work, because the linear_congruential template expects an argument of type _Ty, so you typically end up with a value that looks like 0.

On the other hand, the current text says that the effect of specifying a modulus of 0 for linear_congruential is implementation defined. I decided to use 0 to mean max()+1, as did the Boost implementation. (Internally, the implementation of mersenne_twister needs a generator with a modulus like this). Seems to me this is a reasonable choice, and one that users ought to be able to rely on. Is there some other meaning that might reasonably be ascribed to it, or should we say that a modulus of 0 means numeric_limits<_Ty>::max() + 1 (suitably type-cast)?
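
A sketch of that convention, assuming an unsigned _Ty (the function name is illustrative):

    template <class _Ty>
        _Ty lc_next(_Ty a, _Ty c, _Ty m, _Ty x)
        {
        if (m == 0)                // m == 0 means numeric_limits<_Ty>::max() + 1
            return a * x + c;      // unsigned wraparound supplies the modulus
        return (a * x + c) % m;
        }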


19. linear_congruential::seed(IntType) -- Modify Seed Value When c == 0?

Section:   5.1.4.1 [tr.rand.eng.lcong]
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: When c == 0 you get a generator with a slight quirk: if you seed it with 0 you get 0's forever; if you seed it with a non-0 value you never get 0. The first path, of course, should be avoided. The proposal does this by imposing a requirement on seed(IntType x0), requiring that c > 0 || (x0 % m) > 0. The Boost implementation uses asserts to check this condition. The only reservation I have about this is that it can only be checked at runtime, when the only suitable action is, probably, to abort. An alternative would be to force a non-0 seed in that case (perhaps 1, for no particularly good reason). I think the second alternative is marginally better, and I suggest we change this requirement to impose a particular seed value when a user passes 0 to a generator with c == 0.

Proposed Resolution:

In the text describing void seed(IntType x0 = 1) in 5.1.4.1 [tr.rand.eng.lcong] remove the Requires clause and change the Effects clause from:

Sets the state x(i) of the engine to x0 mod m.

to:

Sets the state x(i) of the engine to (x0 == 0 && c == 0 ? 1 : x0) mod m.

20. linear_congruential -- Should the Template Arguments Be Unsigned?

Section:   5.1.4.1 [tr.rand.eng.lcong]
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: This template takes three numeric arguments, a, c, and m, whose type is IntType. IntType is an integral type, possibly signed. These arguments specify the details of the recurrence relation for the generator:

    x(i + 1) := (a * x(i) + c) mod m

Every discussion that I've seen of this algorithm uses unsigned values. Further, in C and C++ there is no true modulus operator. The result of the % operator is implementation-specific when either of its operands is negative, so implementing mod when the values involved can be negative requires a test and possible adjustment:

    IntType res = (a * x + c) % m;
    if (res < 0)
        res += m;

If the three template arguments can't be negative the recurrence relation can be implemented directly:

    x = (a * x + c) % m;

This makes the generator faster.

Proposed Resolution:

In clause 5.1.4.1 [tr.rand.eng.lcong] replace every occurrence of IntType with UIntType and change the first sentence after the definition of the template from:

The template parameter IntType shall denote an integral type large enough to store values up to (m-1).

to:

The template parameter UIntType shall denote an unsigned integral type large enough to store values up to (m-1).

21. linear_congruential::linear_congruential(In&, In) -- Garbled Requires Clause

Section:   5.1.4.1 [tr.rand.eng.lcong]
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: The Requires clause for the member template template <class In> linear_congruential(In& first, In last) got garbled in the translation to .pdf format.

Proposed Resolution:

Change the Requires clause for the member template template <class In> linear_congruential(In& first, In last) in 5.1.4.1 [tr.rand.eng.lcong] from:

Requires: c > 0 — *first ¿ 0—

to:

Requires: c > 0 || *first > 0

22. bernoulli_distribution Isn't Really a Template

Section:   5.1.7.2 [tr.rand.dist.bern]
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: The text says that bernoulli_distribution is a template, parametrized on a type that is required to be a real type. Its operator() returns a bool, with the probability of returning true determined by the argument passed to the object's constructor. The only place where the type parameter is used is as the type of the argument to the constructor. What is the benefit from making this type user-selectable instead of, say, double?

Marc Paterno says: I don't see a good reason to make bernoulli_distribution a template, and would be content to do as Pete implies, and make the type parameter a double.

Proposed Resolution:

In 5.1.7.2 [tr.rand.dist.bern], remove template <class RealType = double> from the declaration of bernoulli_distribution, change the declaration of the constructor from:

    explicit bernoulli_distribution(const RealType& p = RealType(0.5));

to:

    explicit bernoulli_distribution(double p = 0.5);

and change the header for the subclause describing the constructor from:

    bernoulli_distribution(const RealType& p = RealType(0.5))

to:

    bernoulli_distribution(double p = 0.5)

23. Streaming Underspecified

Section:   5.1.1 [tr.rand.req], Table 5.3
Submitted by:   Pete Becker
Date:   October 16, 2003


Pete Becker says: The text in Table 5.3, "Pseudo-random Number Engine Requirements" says: os << x specifies particular settings for os.fmtflags and the fill character. We don't do that for, e.g., complex<whatever>, so this is a departure from the "usual" interface. We may well need to be more specific all around, but I think that ought to be done in the context of an overall review of inserters for compound types, and not ad hoc.

Jens Maurer says: That goes with the sentiment of the review that the requirements are not specific enough to ensure portability of engine state across platforms. An attempt to extend the requirements to be sufficiently paranoid would all too easily end up incomplete or wrong.

Pete Becker says: I don't think it's practical to require state IO to be portable across platforms. There are two problems: first, it imposes constraints on how the algorithm is implemented; and second, it requires conversion capabilities that are not required anywhere else in the standard. I think "mere" binary IO is sufficient.

The biggest constraint that I see is that being able to write the state out from one program and read it into another requires that the state be represented in exactly the same way on all platforms. Hence the change in the implementation of subtract_with_carry from Boost 1.29 to Boost 1.30. subtract_with_carry generates a new value based on two earlier ones. Simplified, x(i) := x(i - s) - x(i - r) [where s < r]. The obvious implementation is to keep the last r values, and on each call, compute a new value and replace the old x[r] (which is no longer needed) with the new one, so that it will be available when it's needed later on. A somewhat faster implementation (used in 1.29) precomputes r values, and steps through the array returning values whenever requested. When the array index steps off the end, new values are precomputed and the index gets reset to the beginning of the array. This simplifies the index computations (i-s has to wrap around to the top of the array), and perhaps allows unrolling of loops. Trouble is, if you do this you can't write out the state of the generator as specified in the proposal, because you've thrown it away. (Yes, you can hang on to the old data after you precompute new values, if you're willing to double the size of state data -- that's a high price to pay, especially for the mersenne twister, whose most common instantiation uses 624 32-bit values.) So 1.30 removed this optimization from subtract_with_carry in order, apparently, to satisfy the IO requirements.

The conversion problem is more subtle. In order to write out a floating point value and read in exactly the same value you have to do some fairly sophisticated (i.e. big and slow) semi-numerical work. Stream inserters and extractors for floating point values are not currently required to support this degree of accuracy, and as far as I know, none do. Further, this accuracy is incompatible with fixed width formats. The general scheme is to write out as many digits as are needed to distinguish the current value from the two nearest values, so the width of the text representation of the values differs (you could compute every value to the same maximum number of digits, but those extra digits require more precision than the floating point type holds, so these extra digits require multiple-precision integer math, which is rather slow). Beyond that, the number of digits in the text representation depends on the precision of the floating point type. In order to write values that can be read on any platform, it seems to me that all implementations would have to write enough digits to represent values of the largest type available anywhere. That's not a good recipe for portability. <g>

It's not completely clear what the purpose of the inserters and extractors is. The paper talks about engines and distributions that store their state and recover it. It certainly makes sense that a computation might need to be interrupted and resumed later. That doesn't require text representations, nor exact reproducibility across platforms.

Would it make sense to instead require binary write and read functions, subject to all the usual constraints on binary IO? That would mean that data could only be read by a program compiled with the same compiler as the one that wrote it. This also eliminates the constraints on the state (the descriptions of the states are still important in defining what the algorithms do, but the "as-if" rule would then permit alternate implementations), while providing what seems to be the most commonly used operation, namely interrupting a computation and resuming it later.

Marc Paterno says: What is the reason for saving and restoring the state in a portable fashion?

An important requirement is given in III.A. "Overview on Requirements": "has a tight enough specification to get reliable cross-platform results". In our view this is essential for engines, but not for distributions. So we confine our comments to discussing the portable saving and storing of state for engines.

One of the reasons for saving state is the checkpointing of long-running calculations. For this, saving binary state is sufficient, and portability is not an issue.

Another important reason for saving state is large-scale parallelization of calculations. A calculation run on a farm (of possibly dissimilar hardware) will need not only the same checkpointing ability, but the ability to communicate the state from one node to another. This doesn't require human readability (textual representation), but it does require the ability to recover the complete state of an engine from some external representation in a portable fashion.

A third reason for saving the state is to reproduce a calculation on another platform. Distributed computing applications using random numbers for cryptography may need to generate the same sequence of numbers, and may need to share the state of the engine, again portably.

A fourth reason for saving the state is to allow for portable reproducibility of results. In order to verify, explore and extend simulation results, researchers require the ability to reproduce, from a given checkpoint, the stream of random numbers the original simulation relied upon.

We believe that this requirement does not demand that the state be represented in exactly the same way on all platforms. It only implies that an implementation-independent logical state exists. The manner in which the physical state realizes this logical state is left to the implementor.

In light of these requirements, we have reviewed the proposal, and have come to the following conclusion. To completely specify the behavior of an engine means answering the following 5 points:

1) to have the logical state of each engine specified (this seems to be done in the current proposal, perhaps with inconsistent notation);
2) to have a prescription for how to move from the current logical state to the successor logical state (each engine in the current proposal does this adequately);
3) to have a prescription to extract the random number to deliver from the engine's logical state (each engine in the current proposal does this adequately);
4) to have a prescription to map a sequence of unsigned longs to the logical state of an engine (this is implied -- by the seed function -- but insufficiently specified in the current proposal);
5) to have a prescription to map the current logical state into a sequence of unsigned longs (this requirement is not clearly specified by the current proposal).

We believe capabilities (4) and (5) will enable portable saving and restoring of state for engines. And we believe that portable saving and restoring of state is sufficiently important to be worth the effort.

Pete points out two specific cases in which he is concerned about the cost implied by requirement (5). We would classify this as a quality of implementation issue. That is, it is up to the implementer to decide whether the extra size of the physical state is worth the benefit of speed gained by caching. The importance of portable saving and restoring is sufficiently high that it is worth incurring one or the other of these costs.

Pete Becker says: Now that you've explained why saving on one platform and restoring on another is important, I think it's reasonable to trade some speed or space for that capability. Of course, that doesn't solve the problem of how to specify it. <g>

Part of the problem with specifying this is the requirement that generators can be instantiated on arbitrary user-defined types. I think that's a bad idea. But rather than argue about that, I'll just ignore the requirement for now.

Most of the generators store their state in integral types. Those are pretty easy to deal with. For reproducibility across platforms, the state has to be described in terms of values that will fit in 32 bits, since that's the largest type that every C++ implementation is required to have. To save and restore across diverse platforms, it's the same thing: if you go beyond 32 bits you're not portable. So the requirement here is straightforward: write and read a 32-bit integral type (not necessarily long or unsigned long, which can be larger than 32 bits). That, in turn, probably means that linear_congruential, for example, shouldn't be instantiated with a type larger than 32 bits in portable code. (That, in turn, means that I can do the computation with a 64-bit int, and not have to worry about overflows.)

The generators that use floating point types (instantiations of subtract_with_carry_01) specify the number of bits of precision that result from applying the seed. Assuming a binary floating point representation, that precision will be maintained as new values are generated. I haven't read the C99 specification carefully, but it looks to me like it requires 10 decimal digits in the mantissa of a double and a long double, so this is the upper limit on the precision of portable floating point representations. This, in turn, means that portable instantiations of subtract_with_carry_01 will use less than 32 bits of precision, so one 32-bit unsigned value can hold one value from the state.

So, should the basic principle be that states are streamable as a generator-determined number of implementation-defined unsigned integral values (at least 32 bits wide)?

Marc Paterno says: I think that Pete's explanation of the problem and his suggestion are both good, and I agree with the basic principle he proposes. To make sure I'm agreeing with what I think I'm agreeing with:

Engine states should be streamable as a generator-determined number of unsigned long values, as long as the implementation doesn't count on more than 32 bits in the unsigned long values.

Pete Becker says: Yes, that's what I'm thinking.

Let me add this to the mix: we currently have a template member function named seed that takes two iterators that point to a range of 32-bit unsigned values. When saving its state, the generator should stream out a set of values that, if passed as a range to seed, would recreate the current state. Then the stream extractor simply does this (pardon my informality):

    basic_istream& operator>>(basic_istream& istr, engine& eng)
    {
    istream_iterator<unsigned long> begin(istr);
    istream_iterator<unsigned long> end;
    eng.seed(begin, end);
    return istr;
    }

That way all the constructors, all the seed functions, and the stream extractor go through a single function, seed(InIt&, InIt). Now all we have to do is properly describe the behavior of that function.
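
With the same informality, the matching inserter would simply write out the unsigned long values that seed(InIt&, InIt) consumes (state_values() is a hypothetical accessor, not part of the proposal):

    basic_ostream& operator<<(basic_ostream& ostr, const engine& eng)
    {
    vector<unsigned long> state = eng.state_values();   // hypothetical accessor
    copy(state.begin(), state.end(),
        ostream_iterator<unsigned long>(ostr, " "));
    return ostr;
    }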

Pete Becker says: In fact, since algorithms for distributions aren't specified by the standard, it isn't possible to portably save and restore the state of a distribution. I agree that portability doesn't seem to be necessary -- reset, in conjunction with saving and restoring the state of the engine, should suffice. Which raises the question of whether saving and restoring the state of a distribution ought to be supported at all (the proposal does provide for this).

Marc Paterno says: Previously, we presented the reasons we believe that portable saving and restoration of an engine's logical state is important. We do not believe that *portable* saving and restoring of a distribution's state has the same importance, and we also believe that the cost of such a requirement would be too high.

However, we believe there are good reasons to keep the ability to save and restore the state of a distribution, *without* requiring the saved state be portable. Two of the four reasons we raised for the portable saving and restoration of an engine's logical state are closely related to reasons for requiring the saving and restoring of state for distributions:

One of the reasons for saving state is the checkpointing of long-running calculations. For this, saving binary state is sufficient, and portability is not an issue.

A second reason for saving the state is to allow for reproducibility of results. In order to verify, explore and extend simulation results, researchers require the ability to reproduce, from a given checkpoint, the stream of random numbers the original simulation relied upon. Even though portable reproducibility will not be guaranteed, the feature is still valuable. Vendors who provide a library that allows portability across different hardware types may obtain an advantage here.

So, we would like to retain the requirement that the user be able to save and restore the state of a distribution, without imposing the additional requirement that the saved state be portable.

Pete Becker says: I agree. I realized it after I posted my other message.

Proposed Resolution:

Need words to describe streaming of generators