Document number:  N1535=03-0118
Date:             November 14, 2003
Project:          Programming Language C++
Reference:        ISO/IEC IS 14882:1998(E)
Reply to:         Pete Becker
                  Dinkumware, Ltd.
                  petebecker@acm.org
1 Confusing Text in Description of v.min()
2 Confusing and Incorrect Text in Description of v.max()
3 Table "Number Generator Requirements" Unnecessary
4 Should a variate_generator Holding a Reference Be Assignable?
5 Normal Distribution Incorrectly Specified
6 Should Random Number Initializers Take Iterators by Reference or by Value?
7 Are Global Operators Overspecified?
8 Should the Template Arguments Be Restricted to Built-in Types?
9 Do Engines Need Type Arguments?
10 Unclear Complexity Requirements for variate_generator
11 xor_combine Over-generalized?
12 xor_combine::result_type Incorrectly Specified
13 subtract_with_carry's IntType Overspecified
14 subtract_with_carry_01::seed(unsigned) Missing Constraint
15 subtract_with_carry_01::seed(unsigned) Produces Bad Values
16 subtract_with_carry_01::seed(unsigned) Argument Type Too Small
17 subtract_with_carry::seed(In&, In) Required Sequence Length Too Long
18 linear_congruential -- Giving Meaning to a Modulus of 0
19 linear_congruential::seed(IntType) -- Modify Seed Value When c == 0?
20 linear_congruential -- Should the Template Arguments Be Unsigned?
21 linear_congruential::linear_congruential(In&, In) -- Garbled Requires Clause
22 bernoulli_distribution Isn't Really a Template
23 Streaming Underspecified
1 Confusing Text in Description of v.min()

Section: 5.1.1 [tr.rand.req], Table 5.2
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: In "Uniform Random Number Requirements"
the text says that v.min()
"Returns ... l where l is ...". This is the letter ell,
which is too easily confused with the numeral one. Can we change it
to something less confusing, like "lim"?
Jens Maurer says: Sure.
Proposed Resolution:
Change the first sentence of the description of v.min() in 5.1.1 [tr.rand.req], Table 5.2 (Uniform random number generator requirements) from:

    Returns some l where l is less than or equal to all values potentially returned by operator().

to:

    Returns a value that is less than or equal to all values potentially returned by operator().
2 Confusing and Incorrect Text in Description of v.max()

Section: 5.1.1 [tr.rand.req], Table 5.2
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: In "Uniform Random Number Requirements"
the text says that v.max()
"returns l
where l
is less than or equal to
all values...". Should this be "greater than or equal to"? And similarly,
should "strictly less than" be "strictly greater than."?
Jens Maurer says: Yes, cut&paste goof from v.min()
.
Pete Becker says: And, as in the previous issue, l
is unclear.
Proposed Resolution:
Change the first sentence of the description of v.max() in 5.1.1 [tr.rand.req], Table 5.2 (Uniform random number generator requirements) from:

    If std::numeric_limits<T>::is_integer, returns l where l is less than or equal to all values potentially returned by operator(), otherwise, returns l where l is strictly less than all values potentially returned by operator().

to:

    If std::numeric_limits<T>::is_integer, returns a value that is greater than or equal to all values potentially returned by operator(), otherwise, returns a value that is strictly greater than all values potentially returned by operator().
3 Table "Number Generator Requirements" Unnecessary

Section: 5.1.1 [tr.rand.req], Table 5.1
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: The table "Number Generator Requirements" has only one entry: X::result_type. While it's true that random number generators and random distributions have this member, it doesn't seem like a useful basis for classification -- there's nothing in the proposal that depends on knowing that some type satisfies this requirement. I think the specification of X::result_type should be in "Uniform Random Number Generator Requirements" and in "Random Distribution Requirements."
Proposed Resolution:
Copy the description of X::result_type
from 5.1.1 [tr.rand.req], Table 5.1
(Number generator requirements) to 5.1.1 [tr.rand.req], Table 5.2 (Uniform random number
generator requirements) and to 5.1.1 [tr.rand.req], Table 5.4 (Random distribution requirements)
and remove 5.1.1 [tr.rand.req], Table 5.1 (Number generator requirements).
4 Should a variate_generator Holding a Reference Be Assignable?

Section: 5.1.3 [tr.rand.var]
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: The third paragraph says, in part:
    Specializations of variate_generator satisfy the requirements of CopyConstructible. They also satisfy the requirements of Assignable unless the template parameter Engine is of the form U&.

I haven't thought this through carefully, but this looks like an implementation artifact. Is there a reason that variate_generators whose engine type is a reference should not be copied?
Jens Maurer says: Not that I remember, except the implementation hiccup. As you point out, that can be easily worked around. I need to think a little more on that.
Proposed Resolution:
Change the first two sentences of the third paragraph of 5.1.3 [tr.rand.var] from:

    Specializations of variate_generator satisfy the requirements of CopyConstructible. They also satisfy the requirements of Assignable unless the template parameter Engine is of the form U&.

to:

    Specializations of variate_generator satisfy the requirements of CopyConstructible and Assignable.
5 Normal Distribution Incorrectly Specified

Section: 5.1.7.8 [tr.rand.dist.norm]
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: For normal_distribution
, the paper says that
the probability density function is
1/sqrt(2*pi*sigma) * exp(- (x - mean)^2 / (2 * sigma^2))
.
The references I've seen have a different initial factor, using
1/(sqrt(2*pi) * sigma)
. That is, sigma
is outside the
square root. Should we change the description in the paper to match?
Marc Paterno says: The correct probability density function is as Pete gives it, and the paper should be changed to match.
Proposed Resolution:
Change the first paragraph of 5.1.7.8 [tr.rand.dist.norm] from:

    A normal_distribution random distribution produces random numbers x distributed with probability density function (1/sqrt(2*pi*sigma)) e^{-(x-mean)^2/(2*sigma^2)}, where mean and sigma are the parameters of the distribution.

to:

    A normal_distribution random distribution produces random numbers x distributed with probability density function (1/(sqrt(2*pi)*sigma)) e^{-(x-mean)^2/(2*sigma^2)}, where mean and sigma are the parameters of the distribution.
6 Should Random Number Initializers Take Iterators by Reference or by Value?

Section: 5.1.4.1 [tr.rand.eng.lcong]
         5.1.4.2 [tr.rand.eng.mers]
         5.1.4.3 [tr.rand.eng.sub]
         5.1.4.4 [tr.rand.eng.sub1]
         5.1.4.5 [tr.rand.eng.disc]
         5.1.4.6 [tr.rand.eng.xor]
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: The random number generators can be initialized and seeded
by passing two iterators that specify a range of values. The first
iterator is passed by reference and not by value. The reason
for doing this is things like xor_combine
, which holds two generators, and needs
to initialize both generators from a single sequence:
    template <class _Engine1, class _Engine2>   // simplified
    class xor_combine
    {
    public:
        template <class _InIt>
        xor_combine(_InIt& _First, _InIt _Last)
            : _Eng1(_First, _Last), _Eng2(_First, _Last)
        {}
    private:
        _Engine1 _Eng1;
        _Engine2 _Eng2;
    };
With this, _Eng1
is initialized with the values at the head of the sequence,
and _Eng2
is initialized with the values that follow the ones that
_Eng1
consumed.
That looks like a good idea, but there's a problem. It means that the following code is ill-formed:
    unsigned long init[] = { 1, 2, 3 };
    minstd_rand rng0(init, init + 3);   // illegal, init not a modifiable lvalue
    minstd_rand rng1;
    rng1.seed(init, init + 3);          // illegal, init not a modifiable lvalue
It seems to me that this is what most users will try first (I did), and it's surprising that it fails. It also seems like it would be the most common form of initialization if it worked.
This can be made to work by passing the first iterator by value, and changing
seed
to return an iterator in the usual way: it returns the iterator
that points to the element just past the last one it used. With just this change,
the constructor for xor_combine
looks like this:
    template <class _InIt>
    xor_combine(_InIt _First, _InIt _Last)
        : _Eng1(), _Eng2()
    {
        _InIt _Next = _Eng1.seed(_First, _Last);
        _Eng2.seed(_Next, _Last);
    }
An obvious drawback with this approach is that each contained generator gets default initialized, then re-seeded. If that's a big enough problem (seeding some generators might be computationally expensive), we could add an additional constructor to each generator. The constructor is called with a value of some designated type, and doesn't do any more initialization than is necessary to make the object destructible:
    class rng_no_init_t {} rng_no_init;

    template <class _InIt>
    xor_combine(_InIt _First, _InIt _Last)
        : _Eng1(rng_no_init), _Eng2(rng_no_init)
    {
        _InIt _Next = _Eng1.seed(_First, _Last);
        _Eng2.seed(_Next, _Last);
    }
Of course, that constructor would only be used in this sort of situation, where re-seeding will occur before any actual use of the generator.
Jens Maurer says: I don't know how to put it in standardese,
but the idea is that the unsuspecting user should use the one-argument
integer constructor, i.e. minstd_rand rng0(42)
, to likely
get reasonable seeding. All base engines provide a constructor like this.
The (it1,it2)
constructors were meant for "expert" users that
wanted fine-grained control over seeding. It appears unlikely that those
"expert" users would use a fixed C-like array of seeds and could not be
bothered to introduce an additional local variable for the lvalue it1
.
Let's investigate what [the suggested change] means:
rng_no_init_t
has to be accessible by the
compound engine's constructor definition (e.g. xor_combine
)
and the declaration of the constructor in the base engine
(e.g. _Eng1
). Together with the prospect of the end user
writing additional compound or base engines that should
combine freely with the library-provided ones, that means that
rng_no_init_t
cannot be an implementation-defined or class-private type. It has to be publicly specified.
In order to allow free combination of engines, the
"pseudo-random number engine" requirements must require,
for each engine, a constructor taking an rng_no_init_t
.
The operator()
, together with << and >> streaming operators,
would then be specified to have undefined behavior until
the next successful call to seed()
, if the object was
constructed using rng_no_init_t
.
While this can certainly be done, it leaves the odd
feeling that we would deliberately introduce a constructor
that starts the lifetime of the object, but leaves the
object semantically unusable (internal state uninitialized).
This is not what I thought user-defined constructors were
meant for. There is the potential for misuse by the
user, if he constructs an object with the public rng_no_init_t
constructor without calling a seed()
function afterwards.
Calls to operator()
would likely return apparently
"random" numbers, because on most platforms, accessing
uninitialized variables means reading "random" garbage.
However, none of the assertions of the engine's specification
would be met. It is possible that this misuse is not detected except by
careful code review. On the other hand, the misuse in your
first example is caught by the compiler. Problems caught
by the compiler are cheapest to fix, compared to runtime
problems.
I'm not particularly fond of the current (it1,it2)
interface, either, but, given the constraints of
constructors, it's the only interface I could come up
with (see discussion in section III.E of N1452).
I would appreciate hearing other people's opinion on
this, in particular the perceived "ugliness" of the
long[]
initialization in the example above.
Pete Becker says: With regard to "expert" users, it's not so much a matter of could not be bothered, as it is find it surprising. It's a range, and we don't require an lvalue as the first iterator to a range anywhere else.
It may be that this is intended for expert users, but STL-conscious folks will see iterators and use them.
My suggestion uses a fairly common idiom, called "two-phase initialization." I agree it's better to avoid it, but I don't see any other way to avoid the non-intuitive semantics of requiring an lvalue for an iterator.
With regard to the "ugliness" of the initialization, here's the way to do it currently:
    unsigned long init[] = { 1, 2, 3 };
    unsigned long *lval_init = init;
    minstd_rand rng(lval_init, init + 3);
Here's what I'd like to be able to do:
    unsigned long init[] = { 1, 2, 3 };
    minstd_rand rng(init, init + 3);
Peter Dimov says: This is, to me, an interesting issue WRT the move/forwarding proposal.
Part of the problem is that It&
will not bind to rvalues, as in the example
below:
    vector<unsigned long> v;
    // read v
    minstd_rand rng(v.begin(), v.end());
This can be fixed by making the first argument It &&
when, or if, the rvalue
reference is accepted.
The second half of the problem is that in
    unsigned long init[] = { 1, 2, 3 };
    minstd_rand rng(init, init + 3);
init
is an lvalue of type unsigned long [3]
, and init + 3
is an rvalue of type unsigned long *
; even an rvalue reference-based interface will fail
deduction.
This can be fixed by, for example, adding a constructor that takes T (&)[N]
.
Pete Becker says: Or by writing the code correctly <g>:
minstd_rand rng(&init[0], init + 3);
In any case, my concern is to be able to write code that looks like the analogous use of sequences in other parts of the library.
Bjarne Stroustrup says: and that's an important concern.
7 Are Global Operators Overspecified?

Section: 5.1.1 [tr.rand.req]
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: Table 5.3 (Pseudo-random number engine requirements) says that for every engine, each of the following must be valid:
    x == y
    x != y
    os << x
    is >> x
In addition, though, each of the four engines is specified as having a global operator for each of these operations, taking the engine type as appropriate.
Technically that means that I can't, for example, implement
subtract_with_carry
and subtract_with_carry_01
using a
single template (suitably parametrized) as a base class, and supply the appropriate
global operators for that base template. Some smart-aleck conformance test
writer <g> will check for the precise signature, find that it's not there,
and report a conformance failure.
So, do we want just the operational requirement that x == y
must work, or do we want the more specific requirement that for each engine
type there must be a corresponding operator==
?
Marc Paterno says: I think the ability to test for equality
(or inequality) is important, but that the way it is implemented should be
up to the implementer. So I'm happy with the operational requirement that
x == y
must work, but see no need to require any specific signature.
Peter Dimov says: If
template<class T> bool operator==(T const &, T const &) { return false; }
is in scope,
x == y
will go to that template instead of the base class version.
One possible use for introducing such an operator is to use x == y
in cases
where decltype(x)
does not have an operator==
. This technique
is very rare and fragile, though.
Pete Becker says: I wouldn't have put it so delicately. I have no sympathy for someone who deliberately breaks inheritance relationships with that sort of $"#*$*&. Templates are a delicate tool, not a sledgehammer.
Peter Dimov says: I'm not sure whether I expressed myself well. I'll try again with code.
    namespace hidden {
        template<class T> bool operator==(T const &, T const &) { return false; }
    }

    template<class X> bool compare(X const & x, X const & y)
    {
        using namespace hidden;
        return x == y;
    }

    struct V {};    // no operator==

    #include <iostream>

    int main()
    {
        V v1, v2;
        // v1 == v2;    // error
        std::cout << compare(1, 1) << std::endl;
        std::cout << compare(v1, v2) << std::endl;
    }
So the "evil" template operator==
is delicately
applied in compare
(being selectively made visible only at the
scope of the function) to build a comparison that yields false when there is
no operator==
defined for the type.
I'll leave it to others to judge whether this technique is worth supporting.
Pete Becker says: Doesn't change my reaction. It's still a sledgehammer. There are things that library designers ought to take into account and things that users should know are risky. Writing templates as if the language did not have inheritance is risky. Anyone who does this will have to pick up the pieces of the code that they broke.
8 Should the Template Arguments Be Restricted to Built-in Types?

Section:
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: Generators and distributions are all parametrized
on arithmetic types, named RealType
, IntType
,
and UIntType
. The proposal describes these types in terms
of the operations that can be applied to them and the functions that can be called
on them. So, for example,
A template parameter named IntType
shall denote a
type that represents an integer number. This type shall
meet the requirements for a numeric type (26.1
[lib.numeric.requirements]), the binary operators +, -, *, /, %
shall be applicable to it, and a conversion from int shall
exist. [Footnote: The built-in types int and long meet these requirements.]
The idea, I assume, is to support user-defined types that look like integer types.
Although it doesn't say so, the requirement seems to be that the algorithm be
implemented with the user-defined type, since some of the parameters for the
algorithm (specified as template arguments or as arguments to the ctor) are
IntType
, and there is no conversion required from IntType
to any of the built-in types. This is a fairly severe constraint on implementors.
Is it worth it?
It's easy to go astray here. For example, the Boost implementation of uniform_int uses std::numeric_limits<IntType>, and it uses <, ++, <=, +=, *=, ==, and >. It also applies a static_cast to convert the result_type of the distribution (typedef'ed to IntType) into the result_type of the underlying random number generator. These operations cannot be expected to work under the constraints given above.
I'm not picking on the Boost implementation here: our implementation does the same sorts of things. The point is that specifying what an integer type ought to look like is hard. If we don't get it right there will be techniques that work just fine on built-in types but can't be used because user-defined types aren't required to support them.
Another example comes from the floating point types. Some of the distributions can be implemented with techniques that count the number of high-order 0 bits in the binary fraction that represents a value between 0.0 and 1.0. That can be fast and cheap if I know that the type being used is the platform's builtin floating point type, because I can look at the bits directly instead of using floating point math.
Of course, these optimization techniques can be applied through partial specialization, so that the optimized versions are "only" used for builtin types, and sub-optimal versions are available for user-defined types. That means that library vendors will have to supply code that's difficult to test and will never (rarely?) be used.
How important is this? Would it be sufficient to say that RealType
means one of float
, double
, long double
(which is, in essence, what we say for complex
),
IntType
is either int
or long
,
and UIntType
is either unsigned int
or unsigned long
?
9 Do Engines Need Type Arguments?

Section:
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: Issue number 8 gives reasons for restricting the type arguments to builtin types. If we do this, it turns out that the type arguments for engines are pretty much redundant. I'm not clear yet on whether they're needed for distributions, so I'm only talking here about engines.
The implementation can always determine an appropriate type from the other arguments
to the engine templates. For example, mersenne_twister
takes an argument
that gives the number of bits in each word. The implementor is in the best position to decide
what type to use to represent that number of bits. A user trying to write portable
code won't ever specify any type other than unsigned long
for a 32-bit
mersenne twister, because portable code can't assume that any other type can hold a 32 bit
value. The standard library implementor can do better, avoiding gratuitously large types.
Part of the issue here is an assumption in the proposal that these templates should be implemented using the user-specified type internally. For builtin types, the type used internally isn't detectable provided the algorithm is implemented correctly. And there are potentially large performance gains from, for example, implementing a linear congruential generator that returns 32-bit values using a 64-bit type internally. The as-if rule means that the user-specified type is essentially irrelevant to the implementation, and any good implementation will make at least as good a choice here as the user can, and usually better.
The other place where the user-specified type appears is in the return type of
operator()
. So, for example, linear_congruential
requires
the user to supply a type that's large enough to hold values in the range from 0 to m-1
.
But the implementation can determine an appropriate type based on the value of m
, so
what's gained by requiring the user to specify this type?
10 Unclear Complexity Requirements for variate_generator

Section: 5.1.3 [tr.rand.var]
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: The specification for variate_generator
says
    Specializations of variate_generator satisfy the requirements of CopyConstructible. They also satisfy the requirements of Assignable unless the template parameter Engine is of the form U&. The complexity of all functions specified in this section is constant. No function described in this section except the constructor throws an exception.
Taken literally, this isn't implementable. operator()
calls
the underlying distribution's operator()
, whose complexity isn't
directly specified. The distribution's operator()
makes an amortized constant
number of calls to the generator's operator()
, whose complexity is,
again, amortized constant. So the complexity of variate_generator::operator()
ought to also be amortized constant.
variate_generator
also has a constructor that takes an engine and a
distribution by value, and uses their respective copy constructors to
create internal copies. There are no complexity constraints on those
copy constructors, but given that the default constructor for an engine
has complexity O(size of state), it seems likely that an engine's copy
constructor would also have complexity O(size of state). This means that
variate_generator
's complexity is at best O(size of engine's state),
not constant.
I suspect that what was intended was that these functions would not introduce any additional complexity, that is, their complexity is the "larger" of the complexities of the functions that they call.
11 xor_combine Over-generalized?

Section: 5.1.4.6 [tr.rand.eng.xor]
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: For an xor_combine
engine, is there
ever a case where both s1
and s2
would be non-zero?
Seems like this would produce non-random values, because the low bits
(up to the smaller of the two shift values) would all be 0.
Jens Maurer says: Yes, indeed. Good spotting. A case of over-generalization on my part.
Pete Becker says: If at least one has to be 0, then we only need one shift value, and the definition might look more like this:
template <class _Engine1, class _Engine2, int _Shift = 0> ...
with the output being (_Eng1() ^ (_Eng2() << _Shift))
.
Jens Maurer says: That requires that Eng1
is the engine
associated with the 0 shift, so it enforces a specific ordering of arguments
which might not match the user-desired initialization ordering with the dreadful
seed(it1,it2)
interface. But that appears as a minor drawback
compared to the simplified interface.
See also: another argument
12 xor_combine::result_type Incorrectly Specified

Section: 5.1.4.6 [tr.rand.eng.xor]
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: xor_combine
has a member
typedef typename base_type::result_type result_type;
However, it has no type named base_type
, only base1_type
and base2_type
. So, what should result_type be?
Jens Maurer says: base1_type
, by "random" choice.
(That's another argument why you may
want to have the freedom of reordering the base engine types in the interface:
With the current interface, it's easy to get base2_type
, just reorder.
When we have only one shift count, reordering changes the
shifting.)
Pete Becker says: Could be. I think there are a couple of other reasonable result types, though: the larger of the two types might be appropriate, or the actual type of the computed result. Neither of these depends on the order of the engine arguments.
13 subtract_with_carry's IntType Overspecified

Section: 5.1.4.3 [tr.rand.eng.sub]
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: The IntType for subtract_with_carry
"shall denote a signed integral type large enough
to store values up to m - 1
." The implementation subtracts
two values of that type, and if the result is < 0
it adds back
the m
, which makes the result non-negative. In fact, this also
works for unsigned types, with just a small change in the implementation:
instead of testing whether the result is < 0
you test whether it's
< 0
or greater than or equal to m
. This works because
unsigned arithmetic wraps, and it makes the template a bit easier to use.
I suggest that we loosen the constraint to allow signed and unsigned types. Thus the
constraint would read "shall denote an integral type large enough to store values
up to m - 1
."
Proposed Resolution:
Change:
The template parameterIntType
shall denote a signed integral type large enough to store values up tom-1
.
to:
The template parameterIntType
shall denote an integral type large enough to store values up tom-1
.
14 subtract_with_carry_01::seed(unsigned) Missing Constraint

Section: 5.1.4.4 [tr.rand.eng.sub1]
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: The specification for
subtract_with_carry::seed(IntVal)
has a Requires
clause which requires that the argument be greater than 0. This member
function needs the same constraint.
Proposed Resolution:
Add:
Requires: value > 0
to the description of subtract_with_carry_01::seed(unsigned)
in 5.1.4.4 [tr.rand.eng.sub1].
15 subtract_with_carry_01::seed(unsigned) Produces Bad Values

Section: 5.1.4.4 [tr.rand.eng.sub1]
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: subtract_with_carry_01::seed(unsigned int)
uses a linear congruential generator to produce initial values for the fictitious
previously generated values. These values are generated as (y(i)*2^-w) mod 1
.
The linear congruential generator produces values in the range [0, 2147483564)
,
which are at most 31 bits long. If the template argument w
is greater
than 31 the initial values generated by seed will all be rather small, and the first
values produced by the generator will also be rather small. The Boost implementation avoids
this problem by combining values from the linear congruential generator to
produce longer values when w
is larger than 32. Should we require something
more like that?
16 subtract_with_carry_01::seed(unsigned) Argument Type Too Small

Section: 5.1.4.4 [tr.rand.eng.sub1]
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: subtract_with_carry_01::seed(unsigned)
has a default argument value of 19780503, which is too large to fit in a
16-bit unsigned int. Should this argument be unsigned long, to ensure that
it's large enough for the default?
Proposed Resolution:
Change the signatures of this constructor and member function:
    subtract_with_carry_01(unsigned int value);
    void seed(unsigned int value = 19780503);
to:
    subtract_with_carry_01(unsigned long value);
    void seed(unsigned long value = 19780503);
17 subtract_with_carry::seed(In&, In) Required Sequence Length Too Long

Section: 5.1.4.3 [tr.rand.eng.sub]
         5.1.4.4 [tr.rand.eng.sub1]
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: For both subtract_with_carry::seed(In& first, In last)
and subtract_with_carry_01::seed(In& first, In last)
the proposal says: "With n=w/32+1
(rounded downward) and given the
values z(0) ... z(n*r-1)
." The idea is to use n
unsigned long
values to generate each of the initial values for the
generator, so n
should be the number of 32-bit words needed to
provide w
bits. Looks like it should be "n=(w+31)/32
".
As currently written, when w
is 32, the function consumes two 32-bit values for each
value that it generates. One is sufficient.
Proposed Resolution:
Change:

    With n=w/32+1 (rounded downward) and given the values z(0) ... z(n*r-1)

to:

    With n=(w+31)/32 (rounded downward) and given the values z(0) ... z(n*r-1)

in the description of subtract_with_carry::seed(In& first, In last) in 5.1.4.3 [tr.rand.eng.sub] and in the description of subtract_with_carry_01::seed(In& first, In last) in 5.1.4.4 [tr.rand.eng.sub1].
18 linear_congruential -- Giving Meaning to a Modulus of 0

Section: 5.1.4.1 [tr.rand.eng.lcong]
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: Some linear congruential generators using an
integral type _Ty
also use a modulus that's equal to
numeric_limits<_Ty>::max() + 1 (e.g. 65536 for a
(e.g. 65536 for a
16-bit unsigned int). There's no way to write this value as a constant
of the type _Ty
, though. Writing it as a larger type
doesn't work, because the linear_congruential template expects an
argument of type _Ty
, so you typically end up with a value
that looks like 0.
On the other hand, the current text says that the effect
of specifying a modulus of 0 for linear_congruential
is
implementation defined. I decided to use 0 to mean max()+1
,
as did the Boost implementation. (Internally, the implementation of
mersenne_twister
needs a generator with a modulus like
this). Seems to me this is a reasonable choice, and one that
users ought to be able to rely on. Is there some other meaning that might
reasonably be ascribed to it, or should we say that a modulus of 0 means
numeric_limits<_Ty>::max() + 1
(suitably type-cast)?
19 linear_congruential::seed(IntType) -- Modify Seed Value When c == 0?

Section: 5.1.4.1 [tr.rand.eng.lcong]
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: When c == 0 you get a generator with a slight quirk:
if you seed it with 0 you get 0's forever; if you seed it with a non-0 value
you never get 0. The first path, of course, should be avoided. The proposal
does this by imposing a requirement on seed(IntType x0)
,
requiring that c > 0 || (x0 % m) > 0
.
The boost implementation uses asserts to check this condition.
The only reservation I have about this is that it can only be checked at runtime,
when the only suitable action is, probably, to abort. An alternative would be to
force a non-0 seed in that case (perhaps 1, for no particularly good reason). I think
the second alternative is marginally better, and I suggest we change this requirement
to impose a particular seed value when a user passes 0 to a generator with c == 0
.
Proposed Resolution:
In the text describing void seed(IntType x0 = 1) in 5.1.4.1 [tr.rand.eng.lcong] remove the Requires clause and change the Effects clause from:

    Sets the state x(i) of the engine to x0 mod m.

to:

    Sets the state x(i) of the engine to (x0 == 0 && c == 0 ? 1 : x0) mod m.
20 linear_congruential -- Should the Template Arguments Be Unsigned?

Section: 5.1.4.1 [tr.rand.eng.lcong]
Submitted by: Pete Becker
Date: October 16, 2003
Pete Becker says: This template takes three numeric arguments, a, c, and m, whose type is IntType. IntType is an integral type, possibly signed. These arguments specify the details of the recurrence relation for the generator:
x(i + 1) := (a * x(i) + c) mod m
Every discussion that I've seen of this algorithm uses unsigned values. Further, in C and C++ there is no true modulus operator: the result of the % operator is implementation-defined when either of its operands is negative, so implementing mod when the values involved can be negative requires a test and possible adjustment:
IntType res = (a * x + c) % m; if (res < 0) res += m;
If the three template arguments can't be negative the recurrence relation can be implemented directly:
x = (a * x + c) % m;
This makes the generator faster.
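The two variants above can be put side by side (function names are mine): the signed version must test for a negative remainder and correct it, while the unsigned version's % already yields a value in [0, m).

```cpp
#include <cassert>

// Signed operands: the sign of % is implementation-defined in C++98
// (truncating in later standards), so the mathematical residue needs an
// explicit adjustment.
long next_signed(long a, long x, long c, long m)
{
    long res = (a * x + c) % m;
    if (res < 0)
        res += m;   // correct a negative remainder back into [0, m)
    return res;
}

// Unsigned operands: % already gives the residue in [0, m).
unsigned long next_unsigned(unsigned long a, unsigned long x,
                            unsigned long c, unsigned long m)
{
    return (a * x + c) % m;
}
```

Both return the same residue for equal non-negative inputs; only the signed path pays for the branch.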
Proposed Resolution:
In clause 5.1.4.1 [tr.rand.eng.lcong] replace every occurrence of IntType with UIntType and change the first sentence after the definition of the template from:
The template parameter IntType shall denote an integral type large enough to store values up to (m-1).
to:
The template parameter UIntType shall denote an unsigned integral type large enough to store values up to (m-1).
linear_congruential::linear_congruential(In&, In) -- Garbled Requires Clause
Section: | 5.1.4.1 [tr.rand.eng.lcong] |
Submitted by: | Pete Becker |
Date: | October 16, 2003 |
Pete Becker says: The Requires clause for the member template template <class In> linear_congruential(In& first, In last) got garbled in the translation to .pdf format.
Proposed Resolution:
Change the Requires clause for the member template template <class In> linear_congruential(In& first, In last) in 5.1.4.1 [tr.rand.eng.lcong] from:
Requires: c > 0 — *first ¿ 0—
to:
Requires: c > 0 || *first > 0
bernoulli_distribution Isn't Really a Template
Section: | 5.1.7.2 [tr.rand.dist.bern] |
Submitted by: | Pete Becker |
Date: | October 16, 2003 |
Pete Becker says: The text says that bernoulli_distribution is a template, parametrized on a type that is required to be a real type. Its operator() returns a bool, with the probability of returning true determined by the argument passed to the object's constructor. The only place where the type parameter is used is as the type of the argument to the constructor. What is the benefit from making this type user-selectable instead of, say, double?
Marc Paterno says: I don't see a good reason to make bernoulli_distribution a template, and would be content to do as Pete implies, and make the type parameter a double.
Proposed Resolution:
In 5.1.7.2 [tr.rand.dist.bern], remove template <class RealType = double> from the declaration of bernoulli_distribution, change the declaration of the constructor from:
explicit bernoulli_distribution(const RealType& p = RealType(0.5));
to:
explicit bernoulli_distribution(double p = 0.5);
and change the header for the subclause describing the constructor from:
bernoulli_distribution(const RealType& p = RealType(0.5))
to:
bernoulli_distribution(double p = 0.5)
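The de-templatized class can be sketched as follows. To keep the example self-contained, operator() takes a uniform [0,1) variate directly rather than drawing one from an engine; that simplification is mine, not the proposal's.

```cpp
#include <cassert>

// Sketch of bernoulli_distribution with the template parameter replaced
// by double, per the proposed resolution.
class bernoulli_distribution
{
public:
    explicit bernoulli_distribution(double p = 0.5) : p_(p) {}
    double p() const { return p_; }

    // Returns true with probability p when uniform01 is uniform on [0,1).
    bool operator()(double uniform01) const { return uniform01 < p_; }

private:
    double p_;
};
```

The parameter type no longer varies, but nothing about the distribution's behavior changes: the only use of RealType was the constructor argument.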
Streaming Underspecified
Section: | 5.1.1 [tr.rand.req], Table 5.3 |
Submitted by: | Pete Becker |
Date: | October 16, 2003 |
Pete Becker says: The text in Table 5.3, "Pseudo-random Number Engine Requirements" says: os << x specifies particular settings for os.fmtflags and the fill character. We don't do that for, e.g., complex<whatever>, so this is a departure from the "usual" interface. We may well need to be more specific all around, but I think that ought to be done in the context of an overall review of inserters for compound types, and not ad hoc.
Jens Maurer says: That goes with the sentiment of the review that the requirements are not specific enough to ensure portability of engine state across platforms. But an attempt to extend the requirements to be sufficiently paranoid would too easily end up incomplete or wrong.
Pete Becker says: I don't think it's practical to require state IO to be portable across platforms. There are two problems: first, it imposes constraints on how the algorithm is implemented; and second, it requires conversion capabilities that are not required anywhere else in the standard. I think "mere" binary IO is sufficient.
The biggest constraint that I see is that being able to write the state out from one program and read it into another requires that the state be represented in exactly the same way on all platforms. Hence the change in the implementation of subtract_with_carry from Boost 1.29 to Boost 1.30.
subtract_with_carry generates a new value based on two earlier ones. Simplified, x(i) := x(i - s) - x(i - r) [where s < r]. The obvious implementation is to keep the last r values, and on each call, compute a new value and replace the old x(i - r) (which is no longer needed) with the new one, so that it will be available when it's needed later on. A somewhat faster implementation (used in 1.29) precomputes r values, and steps through the array returning values whenever requested. When the array index steps off the end, new values are precomputed and the index gets reset to the beginning of the array. This simplifies the index computations (i - s has to wrap around to the top of the array), and perhaps allows unrolling of loops. Trouble is, if you do this you can't write out the state of the generator as specified in the proposal, because you've thrown it away. (Yes, you can hang on to the old data after you precompute new values, if you're willing to double the size of the state data -- that's a high price to pay, especially for the mersenne twister, whose most common instantiation uses 624 32-bit values.) So 1.30 removed this optimization from subtract_with_carry in order, apparently, to satisfy the IO requirements.
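The "obvious" implementation described above can be sketched with a circular buffer (this is a toy, not the proposal's algorithm: carry handling is omitted, and the lags R and S and the mod-100 reduction are illustrative only):

```cpp
#include <cassert>

// Keep the last R values in a circular buffer and overwrite the oldest
// slot with each new value, so the complete logical state is always
// present and can be streamed out directly.
const int R = 5;   // long lag r
const int S = 2;   // short lag s, with S < R

struct toy_swc
{
    unsigned long buf[R];  // holds x(n-R) .. x(n-1)
    int i;                 // index of the oldest value, x(n-R)

    unsigned long next()
    {
        int is = (i + R - S) % R;                    // position of x(n-S)
        unsigned long x = (buf[is] - buf[i]) % 100;  // x(n) = x(n-S) - x(n-R)
        buf[i] = x;          // overwrite x(n-R), which is no longer needed
        i = (i + 1) % R;
        return x;
    }
};
```

The faster Boost 1.29 scheme precomputes a whole block of values instead, which is exactly why its physical state no longer matches the logical state the proposal asks it to write out.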
The conversion problem is more subtle. In order to write out a floating point value and read in exactly the same value you have to do some fairly sophisticated (i.e. big and slow) semi-numerical work. Stream inserters and extractors for floating point values are not currently required to support this degree of accuracy, and as far as I know, none do. Further, this accuracy is incompatible with fixed width formats. The general scheme is to write out as many digits as are needed to distinguish the current value from the two nearest values, so the width of the text representation of the values differs (you could compute every value to the same maximum number of digits, but those extra digits require more precision than the floating point type holds, so these extra digits require multiple-precision integer math, which is rather slow). Beyond that, the number of digits in the text representation depends on the precision of the floating point type. In order to write values that can be read on any platform, it seems to me that all implementations would have to write enough digits to represent values of the largest type available anywhere. That's not a good recipe for portability. <g>
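The precision dance described above can be sketched as follows (function names are mine; C++98 has no max_digits10, so digits10 + 2 is used as the usual rule of thumb). Whether the round trip is actually exact depends on the quality of the library's conversions, which is precisely the concern being raised.

```cpp
#include <cassert>
#include <limits>
#include <sstream>
#include <string>

// Write a double with enough digits that a correctly rounded conversion
// can read back exactly the same value.  digits10 + 2 is 17 for IEEE
// doubles; nothing in the standard guarantees the round trip.
std::string write_double(double d)
{
    std::ostringstream os;
    os.precision(std::numeric_limits<double>::digits10 + 2);
    os << d;
    return os.str();
}

double read_double(const std::string& s)
{
    std::istringstream is(s);
    double d;
    is >> d;
    return d;
}
```

Note that the number of digits written this way varies with the platform's floating point format, which is the portability problem discussed in the paragraph above.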
It's not completely clear what the purpose of the inserters and extractors is. The paper talks about engines and distributions that store their state and recover it. It certainly makes sense that a computation might need to be interrupted and resumed later. That doesn't require text representations, nor exact reproducibility across platforms.
Would it make sense to instead require binary write and read functions, subject to all the usual constraints on binary IO? That would mean that data could only be read by a program compiled with the same compiler as the one that wrote it. This also eliminates the constraints on the state (the descriptions of the states are still important in defining what the algorithms do, but the "as-if" rule would then permit alternate implementations), while providing what seems to be the most commonly used operation, namely interrupting a computation and resuming it later.
Marc Paterno says: What is the reason for saving and restoring the state in a portable fashion?
An important requirement is given in III.A. "Overview on Requirements": "has a tight enough specification to get reliable cross-platform results". In our view this is essential for engines, but not for distributions. So we confine our comments to discussing the portable saving and storing of state for engines.
One of the reasons for saving state is the checkpointing of long-running calculations. For this, saving binary state is sufficient, and portability is not an issue.
Another important reason for saving state is large-scale parallelization of calculations. A calculation run on a farm (of possibly dissimilar hardware) will need not only the same checkpointing ability, but the ability to communicate the state from one node to another. This doesn't require human readability (textual representation), but it does require the ability to recover the complete state of an engine from some external representation in a portable fashion.
A third reason for saving the state is to reproduce a calculation on another platform. Distributed computing applications using random numbers for cryptography may need to generate the same sequence of numbers, and may need to share the state of the engine, again portably.
A fourth reason for saving the state is to allow for portable reproducibility of results. In order to verify, explore and extend simulation results, researchers require the ability to reproduce, from a given checkpoint, the stream of random numbers the original simulation relied upon.
We believe that this requirement does not demand that the state be represented in exactly the same way on all platforms. It only implies that an implementation-independent logical state exists. The manner in which the physical state realizes this logical state is left to the implementor.
In light of these requirements, we have reviewed the proposal, and have come to the following conclusion. To completely specify the behavior of an engine means answering the following 5 points:
1) to have the logical state of each engine specified (this seems to be done in the current proposal, perhaps with inconsistent notation);
2) to have a prescription for how to move from the current logical state to the successor logical state (each engine in the current proposal does this adequately);
3) to have a prescription to extract the random number to deliver from the engine's logical state (each engine in the current proposal does this adequately);
4) to have a prescription to map a sequence of unsigned longs to the logical state of an engine (this is implied -- by the seed function -- but insufficiently specified in the current proposal);
5) to have a prescription to map the current logical state into a sequence of unsigned longs (this requirement is not clearly specified by the current proposal).
We believe capabilities (4) and (5) will enable portable saving and restoring of state for engines. And we believe that portable saving and restoring of state is sufficiently important to be worth the effort.
Pete points out two specific cases in which he is concerned with cost implied by requirement (5). We would classify this as a quality of implementation issue. That is, it is up to the implementer to decide whether the extra size of the physical state is worth the benefit of speed gained by caching. The importance of the portable saving and restoring is sufficiently high that it is worth incurring one or the other of these costs.
Pete Becker says: Now that you've explained why saving on one platform and restoring on another is important, I think it's reasonable to trade some speed or space for that capability. Of course, that doesn't solve the problem of how to specify it. <g>
Part of the problem with specifying this is the requirement that generators can be instantiated on arbitrary user-defined types. I think that's a bad idea. But rather than argue about that, I'll just ignore the requirement for now.
Most of the generators store their state in integral types. Those are pretty easy to deal with. For reproducibility across platforms, the state has to be described in terms of values that will fit in 32 bits, since that's the largest type that every C++ implementation is required to have. To save and restore across diverse platforms, it's the same thing: if you go beyond 32 bits you're not portable. So the requirement here is straightforward: write and read a 32-bit integral type (not necessarily long or unsigned long, which can be larger than 32 bits). That, in turn, probably means that linear_congruential, for example, shouldn't be instantiated with a type larger than 32 bits in portable code. (That, in turn, means that I can do the computation with a 64-bit int, and not have to worry about overflows.)
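That parenthetical point can be sketched directly (the function name is mine; unsigned long long, guaranteed at least 64 bits in C99, is assumed available): if a, x, c, and m all fit in 32 bits, then a*x + c < 2^64, so the widened computation can never overflow.

```cpp
#include <cassert>

// One LCG step with the intermediate product computed in a 64-bit type.
// Because a and x each fit in 32 bits, a*x + c fits in 64 bits, so no
// overflow is possible before the reduction mod m.
unsigned long lcg_step(unsigned long a, unsigned long x,
                       unsigned long c, unsigned long m)
{
    unsigned long long wide = (unsigned long long)a * x + c;
    return (unsigned long)(wide % m);
}
```

With the minstd parameters (a = 16807, m = 2^31 - 1) the 46-bit product would overflow a 32-bit type on almost every iteration, but is trivially safe in 64 bits.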
The generators that use floating point types (instantiations of subtract_with_carry_01) specify the number of bits of precision that result from applying the seed. Assuming a binary floating point representation, that precision will be maintained as new values are generated. I haven't read the C99 specification carefully, but it looks to me like it requires 10 decimal digits in the mantissa of a double and a long double, so this is the upper limit on the precision of portable floating point representations. This, in turn, means that portable instantiations of subtract_with_carry_01 will use less than 32 bits of precision, so one 32-bit unsigned value can hold one value from the state.
So, should the basic principle be that states are streamable as a generator-determined number of implementation-defined unsigned integral values (at least 32 bits wide)?
Marc Paterno says: I think that Pete's explanation of the problem and his suggestion are both good, and I agree with the basic principle he proposes. To make sure I'm agreeing with what I think I'm agreeing with:
Engine states should be streamable as a generator-determined number of unsigned long values, as long as the implementation doesn't count on more than 32 bits in the unsigned long values.
Pete Becker says: Yes, that's what I'm thinking.
Let me add this to the mix: we currently have a template member function named seed that takes two iterators that point to a range of 32-bit unsigned values. When saving its state, the generator should stream out a set of values that, if passed as a range to seed, would recreate the current state. Then the stream extractor simply does this (pardon my informality):
basic_istream& operator>>(basic_istream& istr, engine& eng)
{
    istream_iterator<unsigned long> begin(istr);
    istream_iterator<unsigned long> end;
    eng.seed(begin, end);
    return istr;
}
That way all the constructors, all the seed functions, and the stream extractor go through a single function, seed(InIt&, InIt).
Now all we have to do is properly describe the behavior of that function.
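A round-trip sketch of this scheme, with the inserter as the counterpart to the extractor above (the "engine" here is a stand-in whose logical state is just a vector of values, not one of the proposal's engines):

```cpp
#include <cassert>
#include <iterator>
#include <ostream>
#include <sstream>
#include <vector>

// Stand-in engine: its whole logical state is a sequence of unsigned
// values, and seed(first, last) reconstructs that state from a range.
struct engine
{
    std::vector<unsigned long> state;

    template <class In>
    void seed(In& first, In last) { state.assign(first, last); }
};

// Inserter: write out values that, fed back to seed, recreate the state.
std::ostream& operator<<(std::ostream& os, const engine& eng)
{
    for (std::vector<unsigned long>::size_type k = 0;
         k != eng.state.size(); ++k)
        os << eng.state[k] << ' ';
    return os;
}

// Extractor: exactly the pattern sketched above -- everything funnels
// through the single seed(InIt&, InIt) function.
std::istream& operator>>(std::istream& istr, engine& eng)
{
    std::istream_iterator<unsigned long> begin(istr), end;
    eng.seed(begin, end);
    return istr;
}
```

With a real engine the extracted range would have a fixed, generator-determined length rather than running to end of stream, but the single-funnel structure is the same.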
Pete Becker says: In fact, since algorithms for distributions aren't specified by the standard, it isn't possible to portably save and restore the state of a distribution. I agree that portability doesn't seem to be necessary -- reset, in conjunction with saving and restoring the state of the engine, should suffice. Which raises the question of whether saving and restoring the state of a distribution ought to be supported at all (the proposal does provide for this).
Marc Paterno says: Previously, we presented the reasons we believe that portable saving and restoration of an engine's logical state is important. We do not believe that *portable* saving and restoring of a distribution's state has the same importance, and we also believe that the cost of such a requirement would be too high.
However, we believe there are good reasons to keep the ability to save and restore the state of a distribution, *without* requiring the saved state be portable. Two of the four reasons we raised for the portable saving and restoration of an engine's logical state are closely related to reasons for requiring the saving and restoring of state for distributions:
One of the reasons for saving state is the checkpointing of long-running calculations. For this, saving binary state is sufficient, and portability is not an issue.
A second reason for saving the state is to allow for reproducibility of results. In order to verify, explore and extend simulation results, researchers require the ability to reproduce, from a given checkpoint, the stream of random numbers the original simulation relied upon. Even though portable reproducibility will not be guaranteed, the feature is still valuable. Vendors who provide a library that allows portability across different hardware types may obtain an advantage here.
So, we would like to retain the requirement that the user be able to save and restore the state of a distribution, but that we not impose the additional requirement that the saved state be portable.
Pete Becker says: I agree. I realized it after I posted my other message.
Proposed Resolution:
Need words to describe streaming of generators