Doc. no. N1480 = 03-0063
Date: 24 Apr 2003
Project: Programming Language C++
Reply to: Matt Austern <austern@apple.com>

C++ Standard Library Active Issues List (Revision 26)

Reference ISO/IEC IS 14882:1998(E)

The purpose of this document is to record the status of issues which have come before the Library Working Group (LWG) of the ANSI (J16) and ISO (WG21) C++ Standards Committee. Issues represent potential defects in the ISO/IEC IS 14882:1998(E) document. Issues are not to be used to request new features or other extensions.

This document contains only library issues which are actively being considered by the Library Working Group, that is, issues which have a status of New, Open, Ready, or Review. See Library Defect Reports List for issues considered defects and Library Closed Issues List for issues considered closed.

The issues in these lists are not necessarily formal ISO Defect Reports (DR's). While some issues will eventually be elevated to official Defect Report status, other issues will be disposed of in other ways. See Issue Status.

This document is in an experimental format designed for both viewing via a world-wide web browser and hard-copy printing. It is available as an HTML file for browsing or PDF file for printing.

Prior to Revision 14, library issues lists existed in two slightly different versions: a Committee Version and a Public Version. Beginning with Revision 14, the two versions were combined into a single version.

This document includes [bracketed italicized notes] as a reminder to the LWG of current progress on issues. Such notes are strictly unofficial and should be read with caution as they may be incomplete or incorrect. Be aware that LWG support for a particular resolution can quickly change if new viewpoints or killer examples are presented in subsequent discussions.

For the most current official version of this document see http://www.dkuug.dk/jtc1/sc22/wg21. Requests for further information about this document should include the document number above, reference ISO/IEC 14882:1998(E), and be submitted to Information Technology Industry Council (ITI), 1250 Eye Street NW, Washington, DC 20005.

Public information as to how to obtain a copy of the C++ Standard, join the standards committee, submit an issue, or comment on an issue can be found in the C++ FAQ at http://www.research.att.com/~austern/csc/faq.html. Public discussion of C++ Standard related issues occurs on news:comp.std.c++.

For committee members, files available on the committee's private web site include the HTML version of the Standard itself. HTML hyperlinks from this issues list to those files will only work for committee members who have downloaded them into the same disk directory as the issues list files.

Revision History

Issue Status

New - The issue has not yet been reviewed by the LWG. Any Proposed Resolution is purely a suggestion from the issue submitter, and should not be construed as the view of LWG.

Open - The LWG has discussed the issue but is not yet ready to move the issue forward. There are several possible reasons for open status:

A Proposed Resolution for an open issue is still not to be construed as the view of LWG. Comments on the current state of discussions are often given at the end of open issues in an italic font. Such comments are for information only and should not be given undue importance.

Dup - The LWG has reached consensus that the issue is a duplicate of another issue, and will not be further dealt with. A Rationale identifies the duplicated issue's issue number.

NAD - The LWG has reached consensus that the issue is not a defect in the Standard, and the issue is ready to forward to the full committee as a proposed record of response. A Rationale discusses the LWG's reasoning.

Review - Exact wording of a Proposed Resolution is now available for review on an issue for which the LWG previously reached informal consensus.

Ready - The LWG has reached consensus that the issue is a defect in the Standard, the Proposed Resolution is correct, and the issue is ready to forward to the full committee for further action as a Defect Report (DR).

DR - (Defect Report) - The full J16 committee has voted to forward the issue to the Project Editor to be processed as a Potential Defect Report. The Project Editor reviews the issue, and then forwards it to the WG21 Convenor, who returns it to the full committee for final disposition. This issues list accords the status of DR to all these Defect Reports regardless of where they are in that process.

TC - (Technical Corrigenda) - The full WG21 committee has voted to accept the Defect Report's Proposed Resolution as a Technical Corrigendum. Action on this issue is thus complete and no further action is possible under ISO rules.

WP - (Working Paper) - The proposed resolution has not been accepted as a Technical Corrigendum, but the full WG21 committee has voted to apply the Defect Report's Proposed Resolution to the working paper.

RR - (Record of Response) - The full WG21 committee has determined that this issue is not a defect in the Standard. Action on this issue is thus complete and no further action is possible under ISO rules.

Future - In addition to the regular status, the LWG believes that this issue should be revisited at the next revision of the standard. It is usually paired with NAD.

Issues are always given the status of New when they first appear on the issues list. They may progress to Open or Review while the LWG is actively working on them. When the LWG has reached consensus on the disposition of an issue, the status will then change to Dup, NAD, or Ready as appropriate. Once the full J16 committee votes to forward Ready issues to the Project Editor, they are given the status of Defect Report (DR). These in turn may become the basis for Technical Corrigenda (TC), or are closed without action other than a Record of Response (RR). The intent of this LWG process is that only issues which are truly defects in the Standard move to the formal ISO DR status.

Active Issues


23. Num_get overflow result

Section: 22.2.2.1.2 [lib.facet.num.get.virtuals]  Status: Open  Submitter: Nathan Myers  Date: 6 Aug 1998

The current description of numeric input does not account for the possibility of overflow. This is an implicit result of changing the description to rely on the definition of scanf() (which fails to report overflow), and conflicts with the documented behavior of traditional and current implementations.

Users expect, when reading a character sequence that results in a value unrepresentable in the specified type, to have an error reported. The standard as written does not permit this.

Further comments from Dietmar:

I don't feel comfortable with the proposed resolution to issue 23: It kind of simplifies the issue too much. Here is what is going on:

Currently, the behavior of numeric overflow is rather counterintuitive and hard to trace, so I will describe it briefly:

Further discussion from Redmond:

The basic problem is that we've defined our behavior, including our error-reporting behavior, in terms of C90. However, C90's method of reporting overflow in scanf is not technically an "input error". The strto_* functions are more precise.

There was general consensus that failbit should be set upon overflow. We considered three options based on this:

  1. Set failbit upon conversion error (including overflow), and don't store any value.
  2. Set failbit upon conversion error, and also set errno to indicate the precise nature of the error.
  3. Set failbit upon conversion error. If the error was due to overflow, store +-numeric_limits<T>::max() as an overflow indication.

Straw poll: (1) 5; (2) 0; (3) 8.
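
The situation under discussion can be observed with a small helper. This is only a sketch: try_extract_int is a hypothetical name, and what happens on overflow (whether failbit is set, and what is left in the target variable) is precisely what this issue is trying to pin down, so the result depends on the library implementation in use.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: extracts an int from text and reports whether the
// stream is still in a non-failed state afterwards.
inline bool try_extract_int(const std::string& text, int& out)
{
    std::istringstream in(text);
    in >> out;             // formatted extraction goes through num_get<>::get
    return !in.fail();     // false if failbit or badbit was set
}
```

On an implementation that follows the consensus that failbit should be set upon overflow, try_extract_int("99999999999999999999", v) returns false. The value left in v is where options 1 and 3 above differ.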

Further discussion from Santa Cruz:

There was some discussion of what the intent of our error reporting mechanism was. There was general agreement on the following principles:

The crux of the disagreement was that some people, but not all, believed that the design was also based on a fourth principle: whenever conversion fails and failbit is set, nothing is to be extracted and the value of the variable being extracted into is guaranteed to be unchanged.

Some people believe that upon overflow, an implementation should "extract" a special value that allows the user to tell that it was overflow instead of some other kind of error. Straw poll: 1 person believed the standard should require that, 2 thought it should forbid it, and 6 thought the standard should allow but not require it.

PJP will provide wording.

Proposed resolution:


44. Iostreams use operator== on int_type values

Section: 27 [lib.input.output]  Status: Open  Submitter: Nathan Myers  Date: 6 Aug 1998

Many of the specifications for iostreams specify that character values or their int_type equivalents are compared using operators == or !=, though in other places traits::eq() or traits::eq_int_type is specified to be used throughout. This is an inconsistency; we should change uses of == and != to use the traits members instead.
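
As an illustration of the difference, the comparisons below go through the traits members instead of the built-in operators. This is a sketch only; is_eof and same_char are hypothetical helper names, not wording proposed for clause 27.

```cpp
#include <cassert>
#include <string>   // std::char_traits

// Tests for end-of-file through the traits class rather than with ==,
// so that a user-supplied traits type controls the comparison.
template <class Traits>
bool is_eof(typename Traits::int_type c)
{
    return Traits::eq_int_type(c, Traits::eof());
}

// Compares two characters through the traits class rather than with ==.
template <class Traits>
bool same_char(typename Traits::char_type a, typename Traits::char_type b)
{
    return Traits::eq(a, b);
}
```

With std::char_traits<char> the results match == and !=. The distinction only becomes visible with a traits class that defines eq or eq_int_type differently, for example a case-insensitive one.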

Proposed resolution:

[Kona: Nathan to supply proposed wording]

[ Tokyo: the LWG reaffirmed that this is a defect, and requires careful review of clause 27 as the changes are context sensitive. ]


92. Incomplete Algorithm Requirements

Section: 25 [lib.algorithms]  Status: Review  Submitter: Nico Josuttis  Date: 29 Sep 1998

The standard does not state how often a function object is copied or called, or the order of calls inside an algorithm. This may lead to surprising/buggy behavior. Consider the following example:

class Nth {    // function object that returns true for the nth element 
  private: 
    int nth;     // element to return true for 
    int count;   // element counter 
  public: 
    Nth (int n) : nth(n), count(0) { 
    } 
    bool operator() (int) { 
        return ++count == nth; 
    } 
}; 
.... 
// remove third element 
    list<int>::iterator pos; 
    pos = remove_if(coll.begin(),coll.end(),  // range 
                    Nth(3));                  // remove criterion 
    coll.erase(pos,coll.end()); 

This call in fact removes the 3rd AND the 6th element. This happens because the usual implementation of the algorithm copies the function object internally:

template <class ForwIter, class Predicate> 
ForwIter std::remove_if(ForwIter beg, ForwIter end, Predicate op) 
{ 
    beg = find_if(beg, end, op); 
    if (beg == end) { 
        return beg; 
    } 
    else { 
        ForwIter next = beg; 
        return remove_copy_if(++next, end, beg, op); 
    } 
} 

The algorithm uses find_if() to find the first element that should be removed. However, it then uses a copy of the passed function object to process the remaining elements (if any). Here, a fresh copy of Nth is used again, and it also removes the sixth element. This behavior compromises the advantage of function objects being able to have state. It could be avoided at no cost (just implement the algorithm directly instead of calling find_if()).

Proposed resolution:

Add a new paragraph following 25 [lib.algorithms] paragraph 8:

[Note: Unless otherwise specified, algorithms that take function objects as arguments are permitted to copy those function objects freely. Programmers for whom object identity is important should consider using a wrapper class that points to a noncopied implementation object, or some equivalent solution.]

[Dublin: Pete Becker felt that this may not be a defect, but rather something that programmers need to be educated about. There was discussion of adding wording to the effect that the number and order of calls to function objects, including predicates, not affect the behavior of the function object.]

[Pre-Kona: Nico comments: It seems the problem is that we don't have a clear statement of "predicate" in the standard. People including me seemed to think "a function returning a Boolean value and being able to be called by an STL algorithm or be used as sorting criterion or ... is a predicate". But a predicate has more requirements: It should never change its behavior due to a call or being copied. IMHO we have to state this in the standard. If you like, see section 8.1.4 of my library book for a detailed discussion.]

[Kona: Nico will provide wording to the effect that "unless otherwise specified, the number of copies of and calls to function objects by algorithms is unspecified".  Consider placing in 25 [lib.algorithms] after paragraph 9.]

[Santa Cruz: The standard doesn't currently guarantee that function objects won't be copied, and what isn't forbidden is allowed. It is believed (especially since implementations that were written in concert with the standard do make copies of function objects) that this was intentional. Thus, no normative change is needed. What we should put in is a non-normative note suggesting to programmers that if they want to guarantee the lack of copying they should use something like the ref wrapper.]
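
The wrapper approach mentioned in the notes above might look roughly as follows. This is a sketch, not proposed wording: NthState, NthRef, and remove_nth are hypothetical names, and the example assumes, as common implementations do, that remove_if applies the predicate to the elements in order.

```cpp
#include <algorithm>
#include <cassert>
#include <list>

// Hypothetical state object. The algorithm never copies it.
struct NthState {
    int nth;     // element to return true for
    int count;   // element counter shared by all wrapper copies
    explicit NthState(int n) : nth(n), count(0) {}
};

// Cheap-to-copy wrapper: every copy points to the same NthState.
class NthRef {
    NthState* state_;
public:
    explicit NthRef(NthState& s) : state_(&s) {}
    bool operator()(int) const { return ++state_->count == state_->nth; }
};

// Removes the n-th element (counting from 1) from coll, if there is one.
inline void remove_nth(std::list<int>& coll, int n)
{
    NthState state(n);
    coll.erase(std::remove_if(coll.begin(), coll.end(), NthRef(state)),
               coll.end());
}
```

Because the counter lives outside the function object, copies made inside the algorithm keep counting from the same place, and only the third element is removed in the example from the issue.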

[Oxford: Matt provided wording.]


96. Vector<bool> is not a container

Section: 23.2.5 [lib.vector.bool]  Status: Open  Submitter: AFNOR  Date: 7 Oct 1998

vector<bool> is not a container as its reference and pointer types are not references and pointers.
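
The mismatch can be checked mechanically. The sketch below uses static_assert and <type_traits>, which postdate the 1998 standard this list refers to, purely as a convenient way to state the facts. write_through_proxy is a hypothetical name.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// For an ordinary element type, reference really is T&.
static_assert(std::is_same<std::vector<int>::reference, int&>::value,
              "vector<int>::reference is int&");

// For bool, reference is a proxy class, not bool&.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "vector<bool>::reference is not bool&");

// The proxy still refers back to the stored bit, so assignment through it
// changes the element.
inline bool write_through_proxy()
{
    std::vector<bool> v(1, false);
    std::vector<bool>::reference r = v[0];
    r = true;
    return v[0];
}
```

Generic code written against the Container requirements, such as code that binds T& to c[0] or takes &c[0] as a T*, compiles for vector<int> but not for vector<bool>.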

Also it forces everyone to have a space optimization instead of a speed one.

See also: 99-0008 == N1185 Vector<bool> is Nonconforming, Forces Optimization Choice.

Proposed resolution:

[In Santa Cruz the LWG felt that this was Not A Defect.]

[In Dublin many present felt that failure to meet Container requirements was a defect. There was disagreement as to whether or not the optimization requirements constituted a defect.]

[The LWG looked at the following resolutions in some detail:
     * Not A Defect.
     * Add a note explaining that vector<bool> does not meet Container requirements.
     * Remove vector<bool>.
     * Add a new category of container requirements which vector<bool> would meet.
     * Rename vector<bool>.

No alternative had strong, widespread support and every alternative had at least one "over my dead body" response.

There was also mention of a transition scheme something like (1) add vector_bool and deprecate vector<bool> in the next standard. (2) Remove vector<bool> in the following standard.]

[Modifying container requirements to permit returning proxies (thus allowing container requirements conforming vector<bool>) was also discussed.]

[It was also noted that there is a partial but ugly workaround in that vector<bool> may be further specialized with a custom allocator.]

[Kona: Herb Sutter presented his paper J16/99-0035==WG21/N1211, vector<bool>: More Problems, Better Solutions. Much discussion of a two step approach: a) deprecate, b) provide replacement under a new name. LWG straw vote on that: 1-favor, 11-could live with, 2-over my dead body. This resolution was mentioned in the LWG report to the full committee, where several additional committee members indicated over-my-dead-body positions.]

[Tokyo: Not discussed by the full LWG; no one claimed new insights and so time was more productively spent on other issues. In private discussions it was asserted that requirements for any solution include 1) Increasing the full committee's understanding of the problem, and 2) providing compiler vendors, authors, teachers, and of course users with specific suggestions as to how to apply the eventual solution.]


98. Input iterator requirements are badly written

Section: 24.1.1 [lib.input.iterators]  Status: Review  Submitter: AFNOR  Date: 7 Oct 1998

Table 72 in 24.1.1 [lib.input.iterators] specifies semantics for *r++ of:

   { T tmp = *r; ++r; return tmp; }

There are two problems with this. First, the return type is specified to be "T", as opposed to something like "convertible to T". This is too specific: we want to allow *r++ to return an lvalue.

Second, writing the semantics in terms of code misleadingly suggests that the effects of *r++ should precisely replicate the behavior of this code, including side effects. (Does this mean that *r++ should invoke the copy constructor exactly as many times as the sample code above would?) See issue 334 for a similar problem.

Proposed resolution:

In Table 72 in 24.1.1 [lib.input.iterators], change the return type for *r++ from T to "convertible to T".

Rationale:

This issue has two parts: the return type, and the number of times the copy constructor is invoked.

The LWG believes that the first part is a real issue. It's inappropriate for the return type to be specified so much more precisely for *r++ than it is for *r. In particular, if r is of (say) type int*, then *r++ isn't int, but int&.
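
The int* case can be stated as a compile-time check. The sketch below uses decltype and static_assert, which postdate the 1998 standard, only to make the point concrete. take_next is a hypothetical helper that relies on nothing more than "convertible to T".

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// For a built-in pointer, *r++ is an lvalue of type int, so decltype
// yields int& rather than int.
static_assert(std::is_same<decltype(*std::declval<int*&>()++), int&>::value,
              "*r++ on int* has type int&");

// Reads the current element and advances the iterator. It only requires
// that *r++ be convertible to T.
template <class T, class InputIt>
T take_next(InputIt& r)
{
    T value = *r++;
    return value;
}
```

For example, with int a[] = {7, 8}; int* p = a; the call take_next<int>(p) yields 7 and leaves p pointing at a[1].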

The LWG does not believe that the number of times the copy constructor is invoked is a real issue. This can vary in any case, because of language rules on copy constructor elision. That's too much to read into these semantics clauses.


120. Can an implementor add specializations?

Section: 17.4.3.1 [lib.reserved.names]  Status: Review  Submitter: Judy Ward  Date: 15 Dec 1998

The original issue asked whether a library implementor could specialize standard library templates for built-in types. (This was an issue because users are permitted to explicitly instantiate standard library templates.)

Specializations are no longer a problem, because of the resolution to core issue 259. Under the proposed resolution, it will be legal for a translation unit to contain both a specialization and an explicit instantiation of the same template, provided that the specialization comes first. In such a case, the explicit instantiation will be ignored. Further discussion of library issue 120 assumes that the core 259 resolution will be adopted.

However, as noted in lib-7047, one piece of this issue still remains: what happens if a standard library implementor explicitly instantiates a standard library template? It's illegal for a program to contain two different explicit instantiations of the same template for the same type in two different translation units (ODR violation), and the core working group doesn't believe it is practical to relax that restriction.

The issue, then, is: are users allowed to explicitly instantiate standard library templates for non-user defined types? The status quo answer is 'yes'. Changing it to 'no' would give library implementors more freedom.

This is an issue because, for performance reasons, library implementors often need to explicitly instantiate standard library templates (for example, std::basic_string<char>). Does giving users freedom to explicitly instantiate standard library templates for non-user defined types make it impossible or painfully difficult for library implementors to do this?

John Spicer suggests, in lib-8957, that library implementors have a mechanism they can use for explicit instantiations that doesn't prevent users from performing their own explicit instantiations: put each explicit instantiation in its own object file. (Different solutions might be necessary for Unix DSOs or MS-Windows DLLs.) On some platforms, library implementors might not need to do anything special: the "undefined behavior" that results from having two different explicit instantiations might be harmless.

Proposed resolution:

Append to 17.4.3.1 [lib.reserved.names] paragraph 1:

A program may explicitly instantiate any templates in the standard library only if the declaration depends on a user-defined name of external linkage and the instantiation meets the standard library requirements for the original template.
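
As an illustration of the distinction this wording draws, consider the sketch below. app::Widget is a hypothetical user-defined type, and the comments describe how the proposed wording would apply, not the status quo.

```cpp
#include <cassert>
#include <vector>

namespace app {
    // Hypothetical user-defined type with external linkage.
    struct Widget {
        int id;
    };
}

// This explicit instantiation depends on the user-defined name app::Widget,
// so the proposed wording would permit it.
template class std::vector<app::Widget>;

// By contrast, the proposed wording would not permit a program to write
//     template class std::vector<int>;
// because that declaration does not depend on any user-defined name, and
// the library implementor may already have instantiated it.
```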

Rationale:

The LWG considered another possible resolution:

In light of the resolution to core issue 259, no normative changes in the library clauses are necessary. Add the following non-normative note to the end of 17.4.3.1 [lib.reserved.names] paragraph 1:

[Note: A program may explicitly instantiate standard library templates, even when an explicit instantiation does not depend on a user-defined name. --end note]

The LWG rejected this because it was believed that it would make it unnecessarily difficult for library implementors to write high-quality implementations. A program may not include an explicit instantiation of the same template, for the same template arguments, in two different translation units. If users are allowed to provide explicit instantiations of Standard Library templates for built-in types, then library implementors aren't, at least not without nonportable tricks.

The most serious problem is a class template that has writeable static member variables. Unfortunately, such class templates are important and, in existing Standard Library implementations, are often explicitly specialized by library implementors: locale facets, which have a writeable static member variable id. If a user's explicit instantiation collided with the implementation's explicit instantiation, iostream initialization could cause locales to be constructed in an inconsistent state.

One proposed implementation technique was for Standard Library implementors to provide explicit instantiations in separate object files, so that they would not be picked up by the linker when the user also provides an explicit instantiation. However, this technique only applies for Standard Library implementations that are packaged as static archives. Most Standard Library implementations nowadays are packaged as dynamic libraries, so this technique would not apply.

The Committee is now considering standardization of dynamic linking. If there are such changes in the future, it may be appropriate to revisit this issue later.


167. Improper use of traits_type::length()

Section: 27.6.2.5.4 [lib.ostream.inserters.character]  Status: Review  Submitter: Dietmar Kühl  Date: 20 Jul 1999

Paragraph 4 states that the length is determined using traits::length(s). Unfortunately, this function is not defined for example if the character type is wchar_t and the type of s is char const*. Similar problems exist if the character type is char and the type of s is either signed char const* or unsigned char const*.

Proposed resolution:

Change 27.6.2.5.4 [lib.ostream.inserters.character] paragraph 4 from:

Effects: Behaves like an formatted inserter (as described in lib.ostream.formatted.reqmts) of out. After a sentry object is constructed it inserts characters. The number of characters starting at s to be inserted is traits::length(s). Padding is determined as described in lib.facet.num.put.virtuals. The traits::length(s) characters starting at s are widened using out.widen (lib.basic.ios.members). The widened characters and any required padding are inserted into out. Calls width(0).

to:

Effects: Behaves like an formatted inserter (as described in lib.ostream.formatted.reqmts) of out. After a sentry object is constructed it inserts n characters starting at s, where n is:

Padding is determined as described in lib.facet.num.put.virtuals. The n characters starting at s are widened using out.widen (lib.basic.ios.members). The widened characters and any required padding are inserted into out. Calls width(0).

[Santa Cruz: Matt supplied new wording]

Rationale:

We have five separate cases. In two of them we can use the user-supplied traits class without any fuss. In the other three we try to use something as close to that user-supplied class as possible. In two cases we've got a traits class that's appropriate for char and what we've got is a const signed char* or a const unsigned char*; that's close enough so we can just use a reinterpret cast, and continue to use the user-supplied traits class. Finally, there's one case where we just have to give up: where we've got a traits class for some arbitrary charT type, and we somehow have to deal with a const char*. There's nothing better to do but fall back to char_traits<char>.
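
The three strategies described in the rationale can be sketched as free functions. The function names are hypothetical and this is not the proposed wording; it only shows which length computation each case relies on.

```cpp
#include <cassert>
#include <cstddef>
#include <string>   // std::char_traits

// The string has the stream's own character type: use the user-supplied
// traits class directly.
template <class traits>
std::size_t length_same_type(const typename traits::char_type* s)
{
    return traits::length(s);
}

// char-based traits, but the string is const signed char* or
// const unsigned char*: reinterpret the pointer and keep using traits.
template <class traits>
std::size_t length_signed_unsigned(const unsigned char* s)
{
    return traits::length(reinterpret_cast<const char*>(s));
}

template <class traits>
std::size_t length_signed_unsigned(const signed char* s)
{
    return traits::length(reinterpret_cast<const char*>(s));
}

// Arbitrary charT traits, but the string is const char*: the traits class
// cannot measure it, so fall back to char_traits<char>.
inline std::size_t length_narrow_fallback(const char* s)
{
    return std::char_traits<char>::length(s);
}
```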


197. max_size() underspecified

Section: 20.1.5 [lib.allocator.requirements], 23.1 [lib.container.requirements]  Status: Open  Submitter: Andy Sawyer  Date: 21 Oct 1999

Must the value returned by max_size() be unchanged from call to call?

Must the value returned from max_size() be meaningful?

Possible meanings identified in lib-6827:

1) The largest container the implementation can support given "best case" conditions - i.e. assume the run-time platform is "configured to the max", and no overhead from the program itself. This may possibly be determined at the point the library is written, but certainly no later than compile time.

2) The largest container the program could create, given "best case" conditions - i.e. same platform assumptions as (1), but take into account any overhead for executing the program itself. (or, roughly "storage=storage-sizeof(program)"). This does NOT include any resource allocated by the program. This may (or may not) be determinable at compile time.

3) The largest container the current execution of the program could create, given knowledge of the actual run-time platform, but again, not taking into account any currently allocated resource. This is probably best determined at program start-up.

4) The largest container the current execution of the program could create at the point max_size() is called (or more correctly at the point max_size() returns :-), given its current environment (i.e. taking into account the actual currently available resources). This, obviously, has to be determined dynamically each time max_size() is called.
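
One observable difference between these meanings is whether the value can change while the program runs. The check below is a sketch with a hypothetical name, max_size_is_stable. Under meanings (1) to (3) it returns true, while under meaning (4) it need not. Nothing in the current wording guarantees either outcome.

```cpp
#include <cassert>
#include <vector>

// Returns true if max_size() reports the same value before and after the
// container has allocated storage, and that value is at least size().
inline bool max_size_is_stable()
{
    std::vector<int> v;
    const std::vector<int>::size_type before = v.max_size();
    v.resize(1000);   // make the program hold some resources
    const std::vector<int>::size_type after = v.max_size();
    return before == after && after >= v.size();
}
```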

Proposed resolution:

Change 20.1.5 [lib.allocator.requirements] table 32 max_size() wording from:

      the largest value that can meaningfully be passed to X::allocate
to:
      the value of the largest constant expression (5.19 [expr.const]) that could ever meaningfully be passed to X::allocate

Change 23.1 [lib.container.requirements] table 65 max_size() wording from:

      size() of the largest possible container.
to:
      the value of the largest constant expression (5.19 [expr.const]) that could ever meaningfully be returned by X::size().

[Kona: The LWG informally discussed this and asked Andy Sawyer to submit an issue.]

[Tokyo: The LWG believes (1) above is the intended meaning.]

[Post-Tokyo: Beman Dawes supplied the above resolution at the request of the LWG. 21.3.3 [lib.string.capacity] was not changed because it references max_size() in 23.1. The term "compile-time" was avoided because it is not defined anywhere in the standard (even though it is used in several places in the library clauses).]

[Copenhagen: Exactly what max_size means is still unclear. It may have a different meaning as a container member function than as an allocator member function. For the latter, it is probably best thought of as an architectural limit. Nathan will provide new wording.]


201. Numeric limits terminology wrong

Section: 18.2.1 [lib.limits]  Status: Open  Submitter: Stephen Cleary  Date: 21 Dec 1999

In some places in this section, the terms "fundamental types" and "scalar types" are used when the term "arithmetic types" is intended. The current usage is incorrect because void is a fundamental type and pointers are scalar types, neither of which should have specializations of numeric_limits.
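
The intended split is visible through is_specialized. In the sketch below, has_numeric_limits is a hypothetical helper name. It reports true for arithmetic types and false for a pointer type, even though pointers are scalar types.

```cpp
#include <cassert>
#include <limits>

// Reports whether numeric_limits carries meaningful information for T.
template <class T>
bool has_numeric_limits()
{
    return std::numeric_limits<T>::is_specialized;
}
```

For example, has_numeric_limits<int>() and has_numeric_limits<double>() are true, while has_numeric_limits<int*>() is false.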

Proposed resolution:

Change 18.2 [lib.support.limits] para 1 from:

The headers <limits>, <climits>, and <cfloat> supply characteristics of implementation-dependent fundamental types (3.9.1).

to:

The headers <limits>, <climits>, and <cfloat> supply characteristics of implementation-dependent arithmetic types (3.9.1).

Change 18.2.1 [lib.limits] para 1 from:

The numeric_limits component provides a C++ program with information about various properties of the implementation's representation of the fundamental types.

to:

The numeric_limits component provides a C++ program with information about various properties of the implementation's representation of the arithmetic types.

Change 18.2.1 [lib.limits] para 2 from:

Specializations shall be provided for each fundamental type. . .

to:

Specializations shall be provided for each arithmetic type. . .

Change 18.2.1 [lib.limits] para 4 from:

Non-fundamental standard types. . .

to:

Non-arithmetic standard types. . .

Change 18.2.1.1 [lib.numeric.limits] para 1 from:

The member is_specialized makes it possible to distinguish between fundamental types, which have specializations, and non-scalar types, which do not.

to:

The member is_specialized makes it possible to distinguish between arithmetic types, which have specializations, and non-arithmetic types, which do not.

[post-Toronto: The opinion of the LWG is that the wording in the standard, as well as the wording of the proposed resolution, is flawed. The term "arithmetic types" is well defined in C and C++, and it is not clear that the term is being used correctly. It is also not clear that the term "implementation dependent" has any useful meaning in this context. The biggest problem is that numeric_limits seems to be intended both for built-in types and for user-defined types, and the standard doesn't make it clear how numeric_limits applies to each of those cases. A wholesale review of numeric_limits is needed. A paper would be welcome.]


226. User supplied specializations or overloads of namespace std function templates

Section: 17.4.3.1 [lib.reserved.names]  Status: Review  Submitter: Dave Abrahams  Date: 01 Apr 2000

The issues are: 

1. How can a 3rd party library implementor (lib1) write a version of a standard algorithm which is specialized to work with his own class template? 

2. How can another library implementor (lib2) write a generic algorithm which will take advantage of the specialized algorithm in lib1?

This appears to be the only viable answer under current language rules:

namespace lib1
{
    // arbitrary-precision numbers using T as a basic unit
    template <class T>
    class big_int { //...
    };
    
    // defining this in namespace std is illegal (it would be an
    // overload), so we hope users will rely on Koenig lookup
    template <class T>
    void swap(big_int<T>&, big_int<T>&);
}
#include <algorithm>
namespace lib2
{
    template <class T>
    void generic_sort(T* start, T* end)
    {
            ...
        // using-declaration required so we can work on built-in types
        using std::swap;
        // use Koenig lookup to find specialized algorithm if available
        swap(*x, *y);
    }
}

This answer has some drawbacks. First of all, it makes writing lib2 difficult and somewhat slippery. The implementor needs to remember to write the using-declaration, or generic_sort will fail to compile when T is a built-in type. The second drawback is that the use of this style in lib2 effectively "reserves" names in any namespace which defines types which may eventually be used with lib2. This may seem innocuous at first when applied to names like swap, but consider more ambiguous names like unique_copy() instead. It is easy to imagine the user wanting to define these names differently in his own namespace. A definition with semantics incompatible with the standard library could cause serious problems (see issue 225).
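
A complete, compilable version of the idiom is sketched below under a few assumptions: big_int is reduced to a single value member, swap_calls is a hypothetical counter added only so the example can observe which swap was selected, and exchange_ends is a hypothetical stand-in for generic_sort.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

namespace lib1 {
    template <class T>
    class big_int {
    public:
        explicit big_int(T v = T()) : value(v) {}
        T value;
    };

    // Illustrative only: counts how often lib1::swap is called.
    inline int& swap_calls() { static int n = 0; return n; }

    // Found by Koenig lookup when the arguments are big_int<T>.
    template <class T>
    void swap(big_int<T>& a, big_int<T>& b)
    {
        ++swap_calls();
        T tmp = a.value;
        a.value = b.value;
        b.value = tmp;
    }
}

namespace lib2 {
    // Swaps the first and last elements of [first, last).
    template <class T>
    void exchange_ends(T* first, T* last)
    {
        if (first == last) return;
        using std::swap;            // needed so built-in types still work
        swap(*first, *(last - 1));  // unqualified call: Koenig lookup applies
    }
}
```

Called on an array of lib1::big_int<int>, exchange_ends selects lib1::swap. Called on an array of int, it falls back to std::swap through the using-declaration.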

Why, you may ask, can't we just partially specialize std::swap()? It's because the language doesn't allow for partial specialization of function templates. If you write:

namespace std
{
    template <class T>
    void swap(lib1::big_int<T>&, lib1::big_int<T>&);
}

You have just overloaded std::swap, which is illegal under the current language rules. On the other hand, the following full specialization is legal:

namespace std
{
    template <>
    void swap(lib1::other_type&, lib1::other_type&);
}

This issue reflects concerns raised by the "Namespace issue with specialized swap" thread on comp.lang.c++.moderated. A similar set of concerns was earlier raised on the boost.org mailing list and the ACCU-general mailing list. Also see library reflector message c++std-lib-7354.

J. C. van Winkel points out (in c++std-lib-9565) another unexpected fact: it's impossible to output a container of std::pair's using copy and an ostream_iterator, as long as both pair-members are built-in or std:: types. That's because a user-defined operator<< for (for example) std::pair<const std::string, int> will not be found: lookup for operator<< will be performed only in namespace std. Opinions differed on whether or not this was a defect, and, if so, whether the defect is that something is wrong with user-defined functionality and std, or whether it's that the standard library does not provide an operator<< for std::pair<>.

Proposed resolution:

Adopt the wording proposed in Howard Hinnant's paper N1439=03-0021, "Proposed Resolution To LWG issues 225, 226, 229".

[Tokyo: Summary, "There is no conforming way to extend std::swap for user defined templates."  The LWG agrees that there is a problem.  Would like more information before proceeding. This may be a core issue. Core issue 229 has been opened to discuss the core aspects of this problem. It was also noted that submissions regarding this issue have been received from several sources, but too late to be integrated into the issues list. ]

[Post-Tokyo: A paper with several proposed resolutions, J16/00-0029==WG21/N1252, "Shades of namespace std functions" by Alan Griffiths, is in the Post-Tokyo mailing. It should be considered a part of this issue.]

[Toronto: Dave Abrahams and Peter Dimov have proposed a resolution that involves core changes: it would add partial specialization of function templates. The Core Working Group is reluctant to add partial specialization of function templates: it is viewed as a large change, and the CWG believes that the proposal presented leaves some syntactic issues unanswered; if the CWG does add partial specialization of function templates, it wishes to develop its own proposal. The LWG continues to believe that there is a serious problem: there is no good way for users to force the library to use user specializations of generic standard library functions, and in certain cases (e.g. transcendental functions called by valarray and complex) this is important. Koenig lookup isn't adequate, since names within the library must be qualified with std (see issue 225), specialization doesn't work (we don't have partial specialization of function templates), and users aren't permitted to add overloads within namespace std. ]

[Copenhagen: Discussed at length, with no consensus. Relevant papers in the pre-Copenhagen mailing: N1289, N1295, N1296. Discussion focused on four options. (1) Relax restrictions on overloads within namespace std. (2) Mandate that the standard library use unqualified calls for swap and possibly other functions. (3) Introduce helper class templates for swap and possibly other functions. (4) Introduce partial specialization of function templates. Every option had both support and opposition. Straw poll (first number is support, second is strongly opposed): (1) 6, 4; (2) 6, 7; (3) 3, 8; (4) 4, 4.]

[Redmond: Discussed, again no consensus. Herb presented an argument that a user who is defining a type T with an associated swap should not be expected to put that swap in namespace std, either by overloading or by partial specialization. The argument is that swap is part of T's interface, and thus should go in the same namespace as T and only in that namespace. If we accept this argument, the consequence is that standard library functions should use unqualified call of swap. (And which other functions? Any?) A small group (Nathan, Howard, Jeremy, Dave, Matt, Walter, Marc) will try to put together a proposal before the next meeting.]

[Curaçao: An LWG-subgroup spent an afternoon working on issues 225, 226, and 229. Their conclusion was that the issues should be separated into an LWG portion (Howard's paper, N1387=02-0045), and an EWG portion (Dave will write a proposal). The LWG and EWG had (separate) discussions of this plan the next day. The proposed resolution is the one proposed by Howard.]

[Santa Cruz: the LWG agreed with the general direction of Howard's paper, N1387. (Roughly: Koenig lookup is disabled unless we say otherwise; this issue is about when we do say otherwise.) However, there were concerns about wording. Howard will provide new wording. Bill and Jeremy will review it.]

[Oxford: Howard proposed the new wording.]

Rationale:

Informally: introduce a Swappable concept, and specify that the value types of the iterators passed to certain standard algorithms (such as iter_swap, swap_ranges, reverse, rotate, and sort) conform to that concept. The Swappable concept will make it clear that these algorithms use unqualified lookup for the calls to swap. Also, in 26.3.3.3 [lib.valarray.transcend] paragraph 1, state that the valarray transcendentals use unqualified lookup.
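As an informal illustration of what "unqualified lookup for the calls to swap" means, the sketch below shows generic code that finds a user's swap through argument-dependent lookup and falls back to std::swap for built-in types. It is not the wording of N1439; lib1::big_int is modelled loosely on the type named earlier in this issue, and lib2::exchange_values, the swaps counter, and check_exchange_values are hypothetical names added for the example.

```cpp
#include <algorithm>   // std::swap in C++98
#include <cassert>
#include <utility>     // std::swap in C++11 and later

namespace lib1 {
    // Hypothetical user type whose swap lives in the type's own namespace.
    template <class T>
    struct big_int {
        T value;
        int swaps;     // counts calls to lib1::swap, for illustration only
    };

    template <class T>
    void swap(big_int<T>& a, big_int<T>& b)
    {
        T tmp = a.value;
        a.value = b.value;
        b.value = tmp;
        ++a.swaps;
        ++b.swaps;
    }
}

namespace lib2 {
    // Hypothetical generic code written in the unqualified-call style.
    template <class T>
    void exchange_values(T& a, T& b)
    {
        using std::swap;   // makes std::swap available for built-in types
        swap(a, b);        // unqualified: ADL can find lib1::swap
    }
}

// Usage example.
void check_exchange_values()
{
    lib1::big_int<long> a = { 1L, 0 }, b = { 2L, 0 };
    lib2::exchange_values(a, b);
    assert(a.value == 2 && b.value == 1);
    assert(a.swaps == 1 && b.swaps == 1);   // lib1::swap was selected

    int x = 3, y = 4;
    lib2::exchange_values(x, y);            // falls back to std::swap
    assert(x == 4 && y == 3);
}
```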


233. Insertion hints in associative containers

Section: 23.1.2 [lib.associative.reqmts]  Status: Open  Submitter: Andrew Koenig  Date: 30 Apr 2000

If mm is a multimap and p is an iterator into the multimap, then mm.insert(p, x) inserts x into mm with p as a hint as to where it should go. Table 69 claims that the execution time is amortized constant if the insert winds up taking place adjacent to p, but does not say when, if ever, this is guaranteed to happen. All it says is that p is a hint as to where to insert.

The question is whether there is any guarantee about the relationship between p and the insertion point, and, if so, what it is.

I believe the present state is that there is no guarantee: The user can supply p, and the implementation is allowed to disregard it entirely.
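For reference, a hinted insertion looks like the sketch below. The names mm_type, insert_with_hint, and check_insert_with_hint are hypothetical. The example only checks what holds regardless of how the hint is treated (the element is inserted and the container stays sorted by key); it makes no claim about where the new element lands among equivalent keys, since that is the open question.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

typedef std::multimap<int, std::string> mm_type;   // hypothetical typedef

// Inserts (key, label), passing the first element whose key is not less
// than key as the hint. Returns how many elements now have that key.
mm_type::size_type insert_with_hint(mm_type& mm, int key,
                                    const std::string& label)
{
    mm_type::iterator p = mm.lower_bound(key);     // the hint
    mm.insert(p, mm_type::value_type(key, label));
    return mm.count(key);
}

// Usage example.
void check_insert_with_hint()
{
    mm_type mm;
    mm.insert(mm_type::value_type(1, "a"));
    mm.insert(mm_type::value_type(2, "b"));
    mm.insert(mm_type::value_type(2, "c"));

    assert(insert_with_hint(mm, 2, "x") == 3);     // three elements with key 2
    assert(mm.size() == 4);
}
```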

Additional comments from Nathan:
The vote [in Redmond] was on whether to elaborately specify the use of the hint, or to require behavior only if the value could be inserted adjacent to the hint. I would like to ensure that we have a chance to vote for a deterministic treatment: "before, if possible, otherwise after, otherwise anywhere appropriate", as an alternative to the proposed "before or after, if possible, otherwise [...]".

Proposed resolution:

In table 69 "Associative Container Requirements" in 23.1.2 [lib.associative.reqmts], in the row for a.insert(p, t), change

iterator p is a hint pointing to where the insert should start to search.

to

insertion adjacent to iterator p is preferred if more than one insertion point is valid.

and change

logarithmic in general, but amortized constant if t is inserted right after p.

to

logarithmic in general, but amortized constant if t is inserted adjacent to iterator p.

[Toronto: there was general agreement that this is a real defect: when inserting an element x into a multiset that already contains several copies of x, there is no way to know whether the hint will be used. The proposed resolution was that the new element should always be inserted as close to the hint as possible. So, for example, if there is a subsequence of equivalent values, then providing a.begin() as the hint means that the new element should be inserted before the subsequence even if a.begin() is far away. JC van Winkel supplied precise wording for this proposed resolution, and also for an alternative resolution in which hints are only used when they are adjacent to the insertion point.]

[Copenhagen: the LWG agreed to the original proposed resolution, in which an insertion hint would be used even when it is far from the insertion point. This was contingent on seeing a reference implementation showing that it is possible to implement this requirement without loss of efficiency. John Potter provided such a reference implementation.]

[Redmond: The LWG was reluctant to adopt the proposal that emerged from Copenhagen: it seemed excessively complicated, and went beyond fixing the defect that we identified in Toronto. PJP provided the new wording described in this issue. Nathan agrees that we shouldn't adopt the more detailed semantics, and notes: "we know that you can do it efficiently enough with a red-black tree, but there are other (perhaps better) balanced tree techniques that might differ enough to make the detailed semantics hard to satisfy."]

[Curaçao: Nathan should give us the alternative wording he suggests so the LWG can decide between the two options.]


247. vector, deque::insert complexity

Section: 23.2.4.3 [lib.vector.modifiers]  Status: Open  Submitter: Lisa Lippincott  Date: 06 June 2000

Paragraph 2 of 23.2.4.3 [lib.vector.modifiers] describes the complexity of vector::insert:

Complexity: If first and last are forward iterators, bidirectional iterators, or random access iterators, the complexity is linear in the number of elements in the range [first, last) plus the distance to the end of the vector. If they are input iterators, the complexity is proportional to the number of elements in the range [first, last) times the distance to the end of the vector.

First, this fails to address the non-iterator forms of insert.

Second, the complexity for input iterators misses an edge case -- it requires that an arbitrary number of elements can be added at the end of a vector in constant time.

At the risk of strengthening the requirement, I suggest simply

Complexity: The complexity is linear in the number of elements inserted plus the distance to the end of the vector.

For input iterators, one may achieve this complexity by first inserting at the end of the vector, and then using rotate.
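A minimal sketch of that technique follows; insert_via_rotate and rotate_insert_demo are hypothetical names, and the position is passed as an index rather than an iterator so that reallocation during push_back cannot invalidate it. The elements are appended one at a time and then rotated into place, which costs time linear in the number of inserted elements plus the distance to the end (with push_back amortized constant).

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <sstream>
#include <vector>

// Inserts [first, last) before index pos of v using a single pass over the
// input range, so it also works for input iterators.
template <class T, class InputIterator>
void insert_via_rotate(std::vector<T>& v,
                       typename std::vector<T>::size_type pos,
                       InputIterator first, InputIterator last)
{
    assert(pos <= v.size());
    const typename std::vector<T>::size_type old_size = v.size();

    for (; first != last; ++first)      // step 1: append at the end
        v.push_back(*first);

    // step 2: move the appended block in front of the old tail
    std::rotate(v.begin() + pos, v.begin() + old_size, v.end());
}

// Usage example: the inserted values come from a stream, a true
// input-iterator range.
std::vector<int> rotate_insert_demo()
{
    int init[] = { 1, 2, 5, 6 };
    std::vector<int> v(init, init + 4);
    std::istringstream in("3 4");
    insert_via_rotate(v, 2, std::istream_iterator<int>(in),
                      std::istream_iterator<int>());
    return v;                           // 1 2 3 4 5 6
}
```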

I looked to see if deque had a similar problem, and was surprised to find that deque places no requirement on the complexity of inserting multiple elements (23.2.1.3 [lib.deque.modifiers], paragraph 3):

Complexity: In the worst case, inserting a single element into a deque takes time linear in the minimum of the distance from the insertion point to the beginning of the deque and the distance from the insertion point to the end of the deque. Inserting a single element either at the beginning or end of a deque always takes constant time and causes a single call to the copy constructor of T.

I suggest:

Complexity: The complexity is linear in the number of elements inserted plus the shorter of the distances to the beginning and end of the deque. Inserting a single element at either the beginning or the end of a deque causes a single call to the copy constructor of T.

Proposed resolution:

[Toronto: It's agreed that there is a defect in complexity of multi-element insert for vector and deque. For vector, the complexity should probably be something along the lines of c1 * N + c2 * distance(i, end()). However, there is some concern about whether it is reasonable to amortize away the copies that we get from a reallocation whenever we exceed the vector's capacity. For deque, the situation is somewhat less clear. Deque is notoriously complicated, and we may not want to impose complexity requirements that would imply any implementation technique more complicated than a while loop whose body is a single-element insert.]


253. valarray helper functions are almost entirely useless

Section: 26.3.2.1 [lib.valarray.cons], 26.3.2.2 [lib.valarray.assign]  Status: Review  Submitter: Robert Klarer  Date: 31 Jul 2000

This discussion is adapted from message c++std-lib-7056 posted November 11, 1999. I don't think that anyone can reasonably claim that the problem described below is NAD.

These valarray constructors can never be called:

   template <class T>
         valarray<T>::valarray(const slice_array<T> &);
   template <class T>
         valarray<T>::valarray(const gslice_array<T> &);
   template <class T>
         valarray<T>::valarray(const mask_array<T> &);
   template <class T>
         valarray<T>::valarray(const indirect_array<T> &);

Similarly, these valarray assignment operators cannot be called:

     template <class T>
     valarray<T> valarray<T>::operator=(const slice_array<T> &);
     template <class T>
     valarray<T> valarray<T>::operator=(const gslice_array<T> &);
     template <class T>
     valarray<T> valarray<T>::operator=(const mask_array<T> &);
     template <class T>
     valarray<T> valarray<T>::operator=(const indirect_array<T> &);

Please consider the following example:

   #include <valarray>
   using namespace std;

   int main()
   {
       valarray<double> va1(12);
       valarray<double> va2(va1[slice(1,4,3)]); // line 1
   }

Since the valarray va1 is non-const, the result of the sub-expression va1[slice(1,4,3)] at line 1 is an rvalue of type const std::slice_array<double>. This slice_array rvalue is then used to construct va2. The constructor that is used to construct va2 is declared like this:

     template <class T>
     valarray<T>::valarray(const slice_array<T> &);

Notice the constructor's const reference parameter. When the constructor is called, a slice_array must be bound to this reference. The rules for binding an rvalue to a const reference are in 8.5.3, paragraph 5 (see also 13.3.3.1.4). Specifically, paragraph 5 indicates that a second slice_array rvalue is constructed (in this case copy-constructed) from the first one; it is this second rvalue that is bound to the reference parameter. Paragraph 5 also requires that the constructor that is used for this purpose be callable, regardless of whether the second rvalue is elided. The copy-constructor in this case is not callable, however, because it is private. Therefore, the compiler should report an error.

Since slice_arrays are always rvalues, the valarray constructor that has a parameter of type const slice_array<T> & can never be called. The same reasoning applies to the three other constructors and the four assignment operators that are listed at the beginning of this post. Furthermore, since these functions cannot be called, the valarray helper classes are almost entirely useless.

Proposed resolution:

slice_array:

gslice_array:

mask_array:

indirect_array:

[Proposed resolution was modified in Santa Cruz: explicitly make copy constructor and copy assignment operators public, instead of removing them.]

Rationale:

Keeping the valarray constructors private is untenable. Merely making valarray a friend of the helper classes isn't good enough, because access to the copy constructor is checked in the user's environment.

Making the assignment operator public is not strictly necessary to solve this problem. A majority of the LWG (straw poll: 13-4) believed we should make the assignment operators public, in addition to the copy constructors, for reasons of symmetry and user expectation.


258. Missing allocator requirement

Section: 20.1.5 [lib.allocator.requirements]  Status: Open  Submitter: Matt Austern  Date: 22 Aug 2000

From lib-7752:

I've been assuming (and probably everyone else has been assuming) that allocator instances have a particular property, and I don't think that property can be deduced from anything in Table 32.

I think we have to assume that allocator type conversion is a homomorphism. That is, if x1 and x2 are of type X, where X::value_type is T, and if type Y is X::template rebind<U>::other, then Y(x1) == Y(x2) if and only if x1 == x2.

Further discussion: Howard Hinnant writes, in lib-7757:

I think I can prove that this is not provable by Table 32. And I agree it needs to be true except for the "and only if". If x1 != x2, I see no reason why it can't be true that Y(x1) == Y(x2). Admittedly I can't think of a practical instance where this would happen, or be valuable. But I also don't see a need to add that extra restriction. I think we only need:

if (x1 == x2) then Y(x1) == Y(x2)

If we decide that == on allocators is transitive, then I think I can prove the above. But I don't think == is necessarily transitive on allocators. That is:

Given x1 == x2 and x2 == x3, this does not mean x1 == x3.

Example:

x1 can deallocate pointers from: x1, x2, x3
x2 can deallocate pointers from: x1, x2, x4
x3 can deallocate pointers from: x1, x3
x4 can deallocate pointers from: x2, x4

x1 == x2, and x2 == x4, but x1 != x4
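The example can be checked mechanically with a small model. This sketch assumes the reading of == suggested by the example, namely that two allocators compare equal when each can deallocate storage obtained from the other; the table can_deallocate and the functions alloc_equal and check_not_transitive are hypothetical and do not model real allocator objects.

```cpp
#include <cassert>

// can_deallocate[i][j] is true when x(i+1) can deallocate pointers
// obtained from x(j+1), exactly as listed in the example.
const bool can_deallocate[4][4] = {
    /* x1 */ { true,  true,  true,  false },
    /* x2 */ { true,  true,  false, true  },
    /* x3 */ { true,  false, true,  false },
    /* x4 */ { false, true,  false, true  }
};

// Assumed meaning of xi == xj: each can deallocate the other's storage.
// i and j are 1-based, matching the names x1..x4.
bool alloc_equal(int i, int j)
{
    return can_deallocate[i - 1][j - 1] && can_deallocate[j - 1][i - 1];
}

// Usage example: equality holds pairwise but is not transitive.
void check_not_transitive()
{
    assert(alloc_equal(1, 2));
    assert(alloc_equal(2, 4));
    assert(!alloc_equal(1, 4));
}
```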

Proposed resolution:

[Toronto: LWG members offered multiple opinions. One opinion is that it should not be required that x1 == x2 implies Y(x1) == Y(x2), and that it should not even be required that X(x1) == x1. Another opinion is that the second line from the bottom in table 32 already implies the desired property. This issue should be considered in light of other issues related to allocator instances.]


280. Comparison of reverse_iterator to const reverse_iterator

Section: 24.4.1 [lib.reverse.iterators]  Status: Open  Submitter: Steve Cleary  Date: 27 Nov 2000

This came from an email from Steve Cleary to Fergus in reference to issue 179. The library working group briefly discussed this in Toronto and believed it should be a separate issue. There were also some reservations about whether this was a worthwhile problem to fix.

Steve said: "Fixing reverse_iterator. std::reverse_iterator can (and should) be changed to preserve these additional requirements." He also said in email that it can be done without breaking user's code: "If you take a look at my suggested solution, reverse_iterator doesn't have to take two parameters; there is no danger of breaking existing code, except someone taking the address of one of the reverse_iterator global operator functions, and I have to doubt if anyone has ever done that. . . But, just in case they have, you can leave the old global functions in as well -- they won't interfere with the two-template-argument functions. With that, I don't see how any user code could break."

Proposed resolution:

Section: 24.4.1.1 [lib.reverse.iterator] add/change the following declarations:

  A) Add a templated assignment operator, after the same manner
        as the templated copy constructor, i.e.:

  template < class U >
  reverse_iterator < Iterator >& operator=(const reverse_iterator< U >& u);

  B) Make all global functions (except the operator+) have
  two template parameters instead of one, that is, for
  operator ==, !=, <, >, <=, >=, - replace:

       template < class Iterator >
       typename reverse_iterator< Iterator >::difference_type operator-(
                 const reverse_iterator< Iterator >& x,
                 const reverse_iterator< Iterator >& y);

  with:

      template < class Iterator1, class Iterator2 >
      typename reverse_iterator < Iterator1 >::difference_type operator-(
                 const reverse_iterator < Iterator1 > & x,
                 const reverse_iterator < Iterator2 > & y);

Also make the addition/changes for these signatures in 24.4.1.3 [lib.reverse.iter.ops].
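To show what the two-template-parameter form in (B) makes possible, the sketch below defines equivalent helpers outside namespace std. The namespace sketch and the functions equal_positions, difference, and check_mixed_reverse_iterators are hypothetical; they are written in terms of base() and assume, as is the case for std::vector, that the underlying iterator and const_iterator can be compared and subtracted.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

namespace sketch {
    // Mixed comparison: the two reverse_iterators may wrap different
    // (but mutually comparable) iterator types.
    template <class Iterator1, class Iterator2>
    bool equal_positions(const std::reverse_iterator<Iterator1>& x,
                         const std::reverse_iterator<Iterator2>& y)
    {
        return x.base() == y.base();
    }

    // Mixed difference, following the usual definition of x - y for
    // reverse iterators: y.base() - x.base().
    template <class Iterator1, class Iterator2>
    typename std::reverse_iterator<Iterator1>::difference_type
    difference(const std::reverse_iterator<Iterator1>& x,
               const std::reverse_iterator<Iterator2>& y)
    {
        return y.base() - x.base();
    }
}

// Usage example: reverse_iterator<iterator> against
// reverse_iterator<const_iterator>.
void check_mixed_reverse_iterators()
{
    std::vector<int> v(5, 0);
    const std::vector<int>& cv = v;

    assert(sketch::equal_positions(v.rbegin(), cv.rbegin()));
    assert(sketch::difference(cv.rend(), v.rbegin()) == 5);
}
```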

[ Copenhagen: The LWG is concerned that the proposed resolution introduces new overloads. Experience shows that introducing overloads is always risky, and that it would be inappropriate to make this change without implementation experience. It may be desirable to provide this feature in a different way. ]


283. std::replace() requirement incorrect/insufficient

Section: 25.2.4 [lib.alg.replace]  Status: Review  Submitter: Martin Sebor  Date: 15 Dec 2000

(revision of the further discussion) There are a number of problems with the requires clauses for the algorithms in 25.1 and 25.2. The requires clause of each algorithm should describe the necessary and sufficient requirements on the inputs to the algorithm such that the algorithm compiles and runs properly. Many of the requires clauses fail to do this. Here is a summary of the kinds of mistakes:

  1. Use of EqualityComparable, which only puts requirements on a single type, when in fact an equality operator is required between two different types, typically either T and the iterator's value type or between the value types of two different iterators.
  2. Use of Assignable for T when in fact what was needed is Assignable for the value_type of the iterator, and convertibility from T to the value_type of the iterator. Or for output iterators, the requirement should be that T is writable to the iterator (output iterators do not have value types).
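Both kinds of mistake show up in a call such as the one sketched below, where T differs from the iterator's value type. The function names replace_red_with_blue and check_replace_mixed_types are hypothetical. Here T is deduced as const char*, while the value type is std::string; what the algorithm actually needs is that *first == old_value and *first = new_value are valid, not that T itself has any particular equality or assignment semantics.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Replaces every "red" with "blue". T is deduced as const char*, so the
// comparison is std::string == const char* and the assignment is
// std::string = const char*.
std::vector<std::string> replace_red_with_blue(std::vector<std::string> colours)
{
    const char* old_value = "red";
    const char* new_value = "blue";
    std::replace(colours.begin(), colours.end(), old_value, new_value);
    return colours;
}

// Usage example.
void check_replace_mixed_types()
{
    std::vector<std::string> in;
    in.push_back("red");
    in.push_back("green");
    in.push_back("red");

    std::vector<std::string> out = replace_red_with_blue(in);
    assert(out.size() == 3);
    assert(out[0] == "blue" && out[1] == "green" && out[2] == "blue");
}
```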

Here is the list of algorithms that contain mistakes:

Also, in the requirements for EqualityComparable, the requirement that the operator be defined for const objects is lacking.

Proposed resolution:

20.1.1 Change p1 from

In Table 28, T is a type to be supplied by a C++ program instantiating a template, a, b, and c are values of type T.

to

In Table 28, T is a type to be supplied by a C++ program instantiating a template, a, b, and c are values of type const T.

25 Between p8 and p9

Add the following sentence:

When the description of an algorithm gives an expression such as *first == value for a condition, it is required that the expression evaluate to either true or false in boolean contexts.

25.1.2 Change p1 by deleting the requires clause.

25.1.6 Change p1 by deleting the requires clause.

25.1.9

Change p4 from

-4- Requires: Type T is EqualityComparable (20.1.1), type Size is convertible to integral type (4.7.12.3).

to

-4- Requires: The type Size is convertible to integral type (4.7.12.3).

25.2.4 Change p1 from

-1- Requires: Type T is Assignable (23.1 ) (and, for replace(), EqualityComparable (20.1.1 )).

to

-1- Requires: The expression *first = new_value must be valid.

and change p4 from

-4- Requires: Type T is Assignable (23.1) (and, for replace_copy(), EqualityComparable (20.1.1)). The ranges [first, last) and [result, result + (last - first)) shall not overlap.

to

-4- Requires: The results of the expressions *first and new_value must be writable to the result output iterator. The ranges [first, last) and [result, result + (last - first)) shall not overlap.

25.2.5 Change p1 from

-1- Requires: Type T is Assignable (23.1). The type Size is convertible to an integral type (4.7.12.3).

to

-1- Requires: The expression value must be writable to the output iterator. The type Size is convertible to an integral type (4.7.12.3).

25.2.7 Change p1 from

-1- Requires: Type T is EqualityComparable (20.1.1).

to

-1- Requires: The value type of the iterator must be Assignable (23.1).

Rationale:

The general idea of the proposed solution is to remove the faulty requires clauses and let the returns and effects clauses speak for themselves. That is, the returns clauses contain expressions that must be valid, and therefore already imply the correct requirements. In addition, a sentence is added at the beginning of chapter 25 saying that expressions given as conditions must evaluate to true or false in a boolean context. An alternative would be to say that the type of these condition expressions must be literally bool, but that would be imposing a greater restriction than what the standard currently says (which is convertible to bool).


290. Requirements to for_each and its function object

Section: 25.1.1 [lib.alg.foreach]  Status: Open  Submitter: Angelika Langer  Date: 03 Jan 2001

The specification of the for_each algorithm does not have a "Requires" section, which means that there are no restrictions imposed on the function object whatsoever. In essence it means that I can provide any function object with arbitrary side effects and I can still expect a predictable result. In particular I can expect that the function object is applied exactly last - first times, which is promised in the "Complexity" section.

I don't see how any implementation can give such a guarantee without imposing requirements on the function object.

Just as an example: consider a function object that removes elements from the input sequence. In that case, what does the complexity guarantee (applies f exactly last - first times) mean?
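For a well-behaved function object the complexity guarantee is easy to observe, as in the sketch below (call_counter, count_calls, and check_call_count are hypothetical names). The function object has side effects but leaves the sequence alone, so it is applied exactly last - first times. A function object that erased elements from the underlying container would invalidate the iterators the algorithm is still using, and no such count could be promised; that case is deliberately not shown as running code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Function object with side effects that do not touch the sequence.
struct call_counter {
    int  calls;
    long sum;
    call_counter() : calls(0), sum(0) {}
    void operator()(int x) { ++calls; sum += x; }
};

// for_each returns (a copy of) the function object after the last call.
call_counter count_calls(const std::vector<int>& v)
{
    return std::for_each(v.begin(), v.end(), call_counter());
}

// Usage example.
void check_call_count()
{
    int init[] = { 1, 2, 3, 4 };
    std::vector<int> v(init, init + 4);

    call_counter c = count_calls(v);
    assert(c.calls == 4);      // exactly last - first applications
    assert(c.sum == 10);
}
```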

One can argue that this is obviously a nonsensical application and a theoretical case, which unfortunately it isn't. I have seen programmers shooting themselves in the foot this way, and they did not understand that there are restrictions even if the description of the algorithm does not say so.

Proposed resolution:

Add a "Requires" section to section 25.1.1 similar to those proposed for transform and the numeric algorithms (see issue 242):

-2- Requires: In the range [first, last], f shall not invalidate iterators or subranges.

[Copenhagen: The LWG agrees that a function object passed to an algorithm should not invalidate iterators in the range that the algorithm is operating on. The LWG believes that this should be a blanket statement in Clause 25, not just a special requirement for for_each. ]


291. Underspecification of set algorithms

Section: 25.3.5 [lib.alg.set.operations]  Status: Review  Submitter: Matt Austern  Date: 03 Jan 2001

The standard library contains four algorithms that compute set operations on sorted ranges: set_union, set_intersection, set_difference, and set_symmetric_difference. Each of these algorithms takes two sorted ranges as inputs, and writes the output of the appropriate set operation to an output range. The elements in the output range are sorted.

The ordinary mathematical definitions are generalized so that they apply to ranges containing multiple copies of a given element. Two elements are considered to be "the same" if, according to an ordering relation provided by the user, neither one is less than the other. So, for example, if one input range contains five copies of an element and another contains three, the output range of set_union will contain five copies, the output range of set_intersection will contain three, the output range of set_difference will contain two, and the output range of set_symmetric_difference will contain two.
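The counts in this example can be reproduced with a short program. In the sketch below, set_op_sizes, sizes_for_copies, and check_five_and_three are hypothetical names; the two input ranges hold m and n copies of the same value and the default operator< is used.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

typedef std::vector<int>::size_type size_type;

// Output sizes of the four set algorithms (hypothetical helper struct).
struct set_op_sizes {
    size_type in_union;
    size_type in_intersection;
    size_type in_difference;
    size_type in_symmetric_difference;
};

// Runs all four algorithms on m copies and n copies of the same element.
set_op_sizes sizes_for_copies(size_type m, size_type n)
{
    const std::vector<int> a(m, 7), b(n, 7);
    std::vector<int> u, i, d, s;

    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::back_inserter(u));
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(i));
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(d));
    std::set_symmetric_difference(a.begin(), a.end(), b.begin(), b.end(),
                                  std::back_inserter(s));

    set_op_sizes r = { u.size(), i.size(), d.size(), s.size() };
    return r;
}

// Usage example: five copies in one range, three in the other.
void check_five_and_three()
{
    set_op_sizes r = sizes_for_copies(5, 3);
    assert(r.in_union == 5);
    assert(r.in_intersection == 3);
    assert(r.in_difference == 2);
    assert(r.in_symmetric_difference == 2);
}
```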

Because two elements can be "the same" for the purposes of these set algorithms, without being identical in other respects (consider, for example, strings under case-insensitive comparison), this raises a number of unanswered questions:

The standard should either answer these questions, or explicitly say that the answers are unspecified. I prefer the former option, since, as far as I know, all existing implementations behave the same way.

Proposed resolution:

Add the following to the end of 25.3.5.2 [lib.set.union] paragraph 5:

If [first1, last1) contains m elements that are equivalent to each other and [first2, last2) contains n elements that are equivalent to them, then max(m, n) of these elements will be copied to the output range: all m of these elements from [first1, last1), and the last max(n-m, 0) of them from [first2, last2), in that order.

Add the following to the end of 25.3.5.3 [lib.set.intersection] paragraph 5:

If [first1, last1) contains m elements that are equivalent to each other and [first2, last2) contains n elements that are equivalent to them, the first min(m, n) of those elements from [first1, last1) are copied to the output range.

Add a new paragraph, Notes, after 25.3.5.4 [lib.set.difference] paragraph 4:

If [first1, last1) contains m elements that are equivalent to each other and [first2, last2) contains n elements that are equivalent to them, the last max(m-n, 0) elements from [first1, last1) are copied to the output range.

Add a new paragraph, Notes, after 25.3.5.5 [lib.set.symmetric.difference] paragraph 4:

If [first1, last1) contains m elements that are equivalent to each other and [first2, last2) contains n elements that are equivalent to them, then |m - n| of those elements will be copied to the output range: the last m - n of these elements from [first1, last1) if m > n, and the last n - m of these elements from [first2, last2) if m < n.
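Which of the equivalent elements end up in the output can be observed by tagging each element with its origin. The sketch below assumes an implementation that behaves as the proposed wording describes; tagged, key_less, tagged_intersection, tagged_union, and check_tagged_set_operations are hypothetical names, and the comparison looks only at the key so that elements with different tags are equivalent.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <utility>
#include <vector>

typedef std::pair<int, char> tagged;    // key, plus a tag naming the source

// Orders by key only, so elements with equal keys are equivalent.
struct key_less {
    bool operator()(const tagged& a, const tagged& b) const
    {
        return a.first < b.first;
    }
};

std::vector<tagged> tagged_intersection(const std::vector<tagged>& r1,
                                        const std::vector<tagged>& r2)
{
    std::vector<tagged> out;
    std::set_intersection(r1.begin(), r1.end(), r2.begin(), r2.end(),
                          std::back_inserter(out), key_less());
    return out;
}

std::vector<tagged> tagged_union(const std::vector<tagged>& r1,
                                 const std::vector<tagged>& r2)
{
    std::vector<tagged> out;
    std::set_union(r1.begin(), r1.end(), r2.begin(), r2.end(),
                   std::back_inserter(out), key_less());
    return out;
}

// Usage example: m = 1 element in the first range, n = 2 in the second.
void check_tagged_set_operations()
{
    std::vector<tagged> r1, r2;
    r1.push_back(tagged(1, 'a'));
    r2.push_back(tagged(1, 'x'));
    r2.push_back(tagged(1, 'y'));

    // Intersection: the first min(m, n) elements of the first range.
    std::vector<tagged> i = tagged_intersection(r1, r2);
    assert(i.size() == 1 && i[0].second == 'a');

    // Union: all m from the first range, then the last n - m of the second.
    std::vector<tagged> u = tagged_union(r1, r2);
    assert(u.size() == 2 && u[0].second == 'a' && u[1].second == 'y');
}
```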

[Santa Cruz: it's believed that this language is clearer than what's in the Standard. However, it's also believed that the Standard may already make these guarantees (although not quite in these words). Bill and Howard will check and see whether they think that some or all of these changes may be redundant. If so, we may close this issue as NAD.]

Rationale:

For simple cases, these descriptions are equivalent to what's already in the Standard. For more complicated cases, they describe the behavior of existing implementations.


294. User defined macros and standard headers

Section: 17.4.3.1.1 [lib.macro.names]  Status: Open  Submitter: James Kanze  Date: 11 Jan 2001

Paragraph 2 of 17.4.3.1.1 [lib.macro.names] reads: "A translation unit that includes a header shall not contain any macros that define names declared in that header." As I read this, it would mean that the following program is legal:

  #define npos 3.14
  #include <sstream>

since npos is not defined in <sstream>. It is, however, defined in <string>, and it is hard to imagine an implementation in which <sstream> didn't include <string>.

I think that this phrase was probably formulated before it was decided that a standard header may freely include other standard headers. The phrase would be perfectly appropriate for C, for example. In light of 17.4.4.1 [lib.res.on.headers] paragraph 1, however, it isn't stringent enough.

Proposed resolution:

In paragraph 2 of 17.4.3.1.1 [lib.macro.names], change "A translation unit that includes a header shall not contain any macros that define names declared in that header." to "A translation unit that includes a header shall not contain any macros that define names declared in any standard header."

[Copenhagen: the general idea is clearly correct, but there is concern about making sure that the two paragraphs in 17.4.3.1.1 [lib.macro.names] remain consistent. Nathan will provide new wording.]


299. Incorrect return types for iterator dereference

Section: 24.1.4 [lib.bidirectional.iterators], 24.1.5 [lib.random.access.iterators]  Status: Open  Submitter: John Potter  Date: 22 Jan 2001

In section 24.1.4 [lib.bidirectional.iterators], Table 75 gives the return type of *r-- as convertible to T. This is not consistent with Table 74 which gives the return type of *r++ as T&. *r++ = t is valid while *r-- = t is invalid.

In section 24.1.5 [lib.random.access.iterators], Table 76 gives the return type of a[n] as convertible to T. This is not consistent with the semantics of *(a + n) which returns T& by Table 74. *(a + n) = t is valid while a[n] = t is invalid.
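For an ordinary mutable iterator such as std::vector<int>::iterator, all four expressions are accepted and write through to the container, which is the behaviour users tend to expect. The sketch below only illustrates those expressions; write_through_iterators and check_write_through are hypothetical names, and nothing here shows what an iterator with a by-value operator[] would do.

```cpp
#include <cassert>
#include <vector>

// Fills a four-element vector using *r--, a[n], and *(a + n).
std::vector<int> write_through_iterators()
{
    std::vector<int> v(4, 0);

    std::vector<int>::iterator r = v.end() - 1;
    *r-- = 40;          // writes v[3]; r then refers to v[2]
    *r-- = 30;          // writes v[2]; r then refers to v[1]

    std::vector<int>::iterator a = v.begin();
    a[1] = 20;          // same effect as *(a + 1) = 20
    *(a + 0) = 10;

    return v;           // 10 20 30 40
}

// Usage example.
void check_write_through()
{
    std::vector<int> v = write_through_iterators();
    assert(v.size() == 4);
    assert(v[0] == 10 && v[1] == 20 && v[2] == 30 && v[3] == 40);
}
```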

Discussion from the Copenhagen meeting: the first part is uncontroversial. The second part, operator[] for Random Access Iterators, requires more thought. There are reasonable arguments on both sides. Return by value from operator[] enables some potentially useful iterators, e.g. a random access "iota iterator" (a.k.a "counting iterator" or "int iterator"). There isn't any obvious way to do this with return-by-reference, since the reference would be to a temporary. On the other hand, reverse_iterator takes an arbitrary Random Access Iterator as template argument, and its operator[] returns by reference. If we decided that the return type in Table 76 was correct, we would have to change reverse_iterator. This change would probably affect user code.

History: the contradiction between reverse_iterator and the Random Access Iterator requirements has been present from an early stage. In both the STL proposal adopted by the committee (N0527==94-0140) and the STL technical report (HPL-95-11 (R.1), by Stepanov and Lee), the Random Access Iterator requirements say that operator[]'s return value is "convertible to T". In N0527 reverse_iterator's operator[] returns by value, but in HPL-95-11 (R.1), and in the STL implementation that HP released to the public, reverse_iterator's operator[] returns by reference. In 1995, the standard was amended to reflect the contents of HPL-95-11 (R.1). The original intent for operator[] is unclear.

In the long term it may be desirable to add more fine-grained iterator requirements, so that access method and traversal strategy can be decoupled. (See "Improved Iterator Categories and Requirements", N1297 = 01-0011, by Jeremy Siek.) Any decisions about issue 299 should keep this possibility in mind.

Proposed resolution:

In section 24.1.4 [lib.bidirectional.iterators], change the return type in table 75 from "convertible to T" to T&.

In section 24.1.5 [lib.random.access.iterators], change the return type in table 76 from "convertible to T" to T&.

[Curaçao: Jeremy volunteered to work on this issue.]

Rationale:

This proposed resolution is a compromise between John Potter's resolution, which requires T& as the return type of a[n], and the current wording, which requires convertible to T. The compromise is to keep the convertible to T for the return type of the expression a[n], but to also add a[n] = t as a valid expression. This compromise "saves" the common case uses of random access iterators, while at the same time allowing iterators such as counting iterator and caching file iterators to remain random access iterators (iterators where the lifetime of the object returned by operator*() is tied to the lifetime of the iterator).

Note there is one kind of mutable random access iterator that will no longer meet the new requirements. Currently, iterators that return an r-value from operator[] meet the requirements for a mutable random access iterator, even though the expression a[n] = t will only modify a temporary that goes away. With this proposed resolution, a[n] = t will be required to have the same operational semantics as *(a + n) = t.
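The situation described in the preceding note can be sketched as follows. Cell and cell_iterator are hypothetical names invented for this illustration, and the iterator is intentionally incomplete. The sketch assumes an element type of class type, for which assignment to an r-value is well-formed.

```cpp
#include <cstddef>

struct Cell {
    int v;
    explicit Cell(int x = 0) : v(x) {}
};

// Hypothetical iterator over an array of Cell whose operator[] returns an r-value.
class cell_iterator {
public:
    explicit cell_iterator(Cell* p) : p_(p) {}
    Cell& operator*() const { return *p_; }
    cell_iterator operator+(std::ptrdiff_t n) const { return cell_iterator(p_ + n); }
    Cell operator[](std::ptrdiff_t n) const { return p_[n]; }  // a copy, not a reference
private:
    Cell* p_;
};

// Value stored in data[1] after assigning 42 through it[1].
int after_subscript_assignment() {
    Cell data[2];
    cell_iterator it(data);
    it[1] = Cell(42);        // well-formed, but assigns to a temporary copy
    return data[1].v;
}

// Value stored in data[1] after assigning 42 through *(it + 1).
int after_dereference_assignment() {
    Cell data[2];
    cell_iterator it(data);
    *(it + 1) = Cell(42);    // writes through to the underlying element
    return data[1].v;
}
```

Under the proposed resolution the two functions would be required to agree, so an iterator like cell_iterator would no longer qualify as a mutable random access iterator.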

Note also that reverse_iterator (24.4.1 [lib.reverse.iterators]) may be broken. However, this proposed resolution doesn't break it any worse than it already is. This is a separate issue. (Issue 386.)


309. Does sentry catch exceptions?

Section: 27.6 [lib.iostream.format]  Status: Review  Submitter: Martin Sebor  Date: 19 Mar 2001

The descriptions of the constructors of basic_istream<>::sentry (27.6.1.1.2 [lib.istream::sentry]) and basic_ostream<>::sentry (27.6.2.3 [lib.ostream::sentry]) do not explain what the functions do in case an exception is thrown while they execute. Some current implementations allow all exceptions to propagate, others catch them and set ios_base::badbit instead, still others catch some but let others propagate.

The text also mentions that the functions may call setstate(failbit) (without actually saying on what object, but presumably the stream argument is meant). That may have been fine for basic_istream<>::sentry prior to issue 195, since the function performs an input operation which may fail. However, issue 195 amends 27.6.1.1.2 [lib.istream::sentry], p2 to clarify that the function should actually call setstate(failbit | eofbit), so the sentence in p3 is redundant or even somewhat contradictory.

The same sentence that appears in 27.6.2.3 [lib.ostream::sentry], p3 doesn't seem to be very meaningful for basic_ostream<>::sentry which performs no input. It is actually rather misleading since it would appear to guide library implementers to calling setstate(failbit) when os.tie()->flush(), the only called function, throws an exception (typically, it's badbit that's set in response to such an event).

Proposed resolution:

Remove the last sentence of 27.6.1.1.2 [lib.istream::sentry] p5 (but not the footnote, which should be moved to the preceding sentence).

Remove the last sentence of 27.6.2.3 [lib.ostream::sentry] p3 (but not the footnote, which should be moved to the preceding sentence).

Rationale:

The LWG feels that no clarification of EH policy is necessary: the standard is precise about which operations sentry's constructor performs, and about which of those operations can throw. However, the sentence at the end should be removed because it's redundant.


342. seek and eofbit

Section: 27.6.1.3 [lib.istream.unformatted]  Status: Open  Submitter: Howard Hinnant  Date: 09 Oct 2001

I think we have a defect.

According to lwg issue 60 which is now a dr, the description of seekg in 27.6.1.3 [lib.istream.unformatted] paragraph 38 now looks like:

Behaves as an unformatted input function (as described in 27.6.1.3, paragraph 1), except that it does not count the number of characters extracted and does not affect the value returned by subsequent calls to gcount(). After constructing a sentry object, if fail() != true, executes rdbuf()->pubseekpos(pos).

And according to lwg issue 243 which is also now a dr, 27.6.1.3, paragraph 1 looks like:

Each unformatted input function begins execution by constructing an object of class sentry with the default argument noskipws (second) argument true. If the sentry object returns true, when converted to a value of type bool, the function endeavors to obtain the requested input. Otherwise, if the sentry constructor exits by throwing an exception or if the sentry object returns false, when converted to a value of type bool, the function returns without attempting to obtain any input. In either case the number of extracted characters is set to 0; unformatted input functions taking a character array of non-zero size as an argument shall also store a null character (using charT()) in the first location of the array. If an exception is thrown during input then ios::badbit is turned on in *this's error state. If (exception()&badbit)!= 0 then the exception is rethrown. It also counts the number of characters extracted. If no exception has been thrown it ends by storing the count in a member object and returning the value specified. In any event the sentry object is destroyed before leaving the unformatted input function.

And finally 27.6.1.1.2/5 says this about sentry:

If, after any preparation is completed, is.good() is true, ok_ != false otherwise, ok_ == false.

So although the seekg paragraph says that the operation proceeds if !fail(), the behavior of unformatted functions says the operation proceeds only if good(). The two statements are contradictory when only eofbit is set. I don't think the current text is clear which condition should be respected.
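The state in question is easy to reach. The following sketch uses two hypothetical helper functions (the names are ours, not the Standard's). The first reads a final line that has no trailing newline, which leaves the stream with eofbit set and failbit clear. The second then attempts the seek; whether that seek is performed is exactly what the quoted paragraphs leave open, so no particular result is claimed for it.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads one line and reports whether the stream is left with eofbit set
// but neither failbit nor badbit.
bool left_in_eof_only_state(std::istream& in) {
    std::string line;
    std::getline(in, line);      // hits end of input after extracting characters
    return in.eof() && !in.fail();
}

// Attempts to seek back to the beginning and reports the stream state afterwards.
// With only eofbit set, fail() is false but good() is also false, so the two
// quoted paragraphs disagree on whether pubseekpos() is reached.
bool seek_to_beginning(std::istream& in) {
    in.seekg(0, std::ios_base::beg);
    return in.good();
}
```

A user who needs a definite outcome today can call clear() before seeking, at the cost of discarding the eofbit explicitly.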

Further discussion from Redmond:

PJP: It doesn't seem quite right to say that seekg is "unformatted". That makes specific claims about sentry that aren't quite appropriate for seeking, which has less fragile failure modes than actual input. If we do really mean that it's unformatted input, it should behave the same way as other unformatted input. On the other hand, "principle of least surprise" is that seeking from EOF ought to be OK.

Dietmar: nothing should depend on eofbit. Eofbit should only be examined by the user to determine why something failed.

[Taken from c++std-lib-8873, c++std-lib-8874, c++std-lib-8876]

Proposed resolution:

[Santa Cruz: On the one hand, it would clearly be silly to seek to a non-EOF position without resetting eofbit. On the other hand, having seek clear eofbit explicitly would set a major precedent: there is currently no place where any of the flags are reset without the user explicitly asking for them to be. This is the tip of a general problem, that the various flags are stickier than many users might expect. Bill, Gaby, and Howard will discuss this issue and propose a resolution.]


347. locale::category and bitmask requirements

Section: 22.1.1.1.1 [lib.locale.category]  Status: Review  Submitter: P.J. Plauger, Nathan Myers  Date: 23 Oct 2001

In 22.1.1.1.1 [lib.locale.category] paragraph 1, the category members are described as bitmask elements. In fact, the bitmask requirements in 17.3.2.1.2 [lib.bitmask.types] don't seem quite right: none and all are bitmask constants, not bitmask elements.

In particular, the requirements for none interact poorly with the requirement that the LC_* constants from the C library must be recognizable as C++ locale category constants. LC_* values should not be mixed with these values to make category values.

We have two options for the proposed resolution. Informally: option 1 removes the requirement that LC_* values be recognized as category arguments. Option 2 changes the category type so that this requirement is implementable, by allowing none to be some value such as 0x1000 instead of 0.

Nathan writes: "I believe my proposed resolution [Option 2] merely re-expresses the status quo more clearly, without introducing any changes beyond resolving the DR."

Proposed resolution:

Replace the first two paragraphs of 22.1.1.1 [lib.locale.types] with:

    typedef int category;

Valid category values include the locale member bitmask elements collate, ctype, monetary, numeric, time, and messages, each of which represents a single locale category. In addition, locale member bitmask constant none is defined as zero and represents no category. And locale member bitmask constant all is defined such that the expression

    (collate | ctype | monetary | numeric | time | messages | all) == all

is true, and represents the union of all categories. Further the expression (X | Y), where X and Y each represent a single category, represents the union of the two categories.

locale member functions expecting a category argument require one of the category values defined above, or the union of two or more such values. Such a category argument identifies a set of locale categories. Each locale category, in turn, identifies a set of locale facets, including at least those shown in Table 51:

[Curaçao: need input from locale experts.]
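The relationships stated in the proposed wording can be checked against an existing implementation with a short sketch. The two function names below are hypothetical. The first evaluates the expression from the proposed text; the second passes a union of two categories to the existing locale constructor that takes a category argument.

```cpp
#include <locale>

// Evaluates the expression from the proposed wording:
// (collate | ctype | monetary | numeric | time | messages | all) == all
bool all_is_union_of_categories() {
    const std::locale::category each =
        std::locale::collate | std::locale::ctype | std::locale::monetary |
        std::locale::numeric | std::locale::time | std::locale::messages;
    return (each | std::locale::all) == std::locale::all;
}

// Builds a locale that takes its numeric and ctype categories from the
// classic "C" locale and everything else from base.
std::locale with_classic_numeric_and_ctype(const std::locale& base) {
    return std::locale(base, std::locale::classic(),
                       std::locale::numeric | std::locale::ctype);
}
```

Nothing in this sketch mixes LC_* values with the locale member constants, which is the usage Option 1 no longer requires implementations to support.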

Rationale:

The LWG considered, and rejected, an alternate proposal (described as "Option 2" in the discussion). The main reason for rejecting it was that library implementors were concerned about implementation difficulty, given that getting a C++ library to work smoothly with a separately written C library is already a delicate business. Some library implementers were also concerned about the issue of adding extra locale categories.

Option 2:
Replace the first paragraph of 22.1.1.1 [lib.locale.types] with:

Valid category values include the enumerated values. In addition, the result of applying commutative operators | and & to any two valid values is valid, and results in the setwise union and intersection, respectively, of the argument categories. The values all and none are defined such that for any valid value cat, the expressions (cat | all == all), (cat & all == cat), (cat | none == cat) and (cat & none == none) are true. For non-equal values cat1 and cat2 of the remaining enumerated values, (cat1 & cat2 == none) is true. For any valid categories cat1 and cat2, the result of (cat1 & ~cat2) is valid, and equals the setwise union of those categories found in cat1 but not found in cat2. [Footnote: it is not required that all equal the setwise union of the other enumerated values; implementations may add extra categories.]


352. missing fpos requirements

Section: 21.1.2 [lib.char.traits.typedefs]  Status: Review  Submitter: Martin Sebor  Date: 2 Dec 2001

(1) There are no requirements on the stateT template parameter of fpos listed in 27.4.3. The interface appears to require that the type be at least Assignable and CopyConstructible (27.4.3.1, p1), and I think also DefaultConstructible (to implement the operations in Table 88).

21.1.2, p3, however, only requires that char_traits<charT>::state_type meet the requirements of CopyConstructible types.

(2) Additionally, the stateT template argument has no corresponding typedef in fpos which might make it difficult to use in generic code.

Proposed resolution:

Modify 21.1.2, p4 from

Requires: state_type shall meet the requirements of CopyConstructible types (20.1.3).

to

Requires: state_type shall meet the requirements of Assignable (23.1, p4), CopyConstructible (20.1.3), and DefaultConstructible (20.1.4) types.

Rationale:

The LWG feels this is two issues, as indicated above. The first is a defect---std::basic_fstream is unimplementable without these additional requirements---and the proposed resolution fixes it. The second is questionable; who would use that typedef? The class template fpos is used only in a very few places, all of which know the state type already. Unless motivation is provided, the second should be considered NAD.
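The operations that motivate the three requirements can be seen in a small sketch. The type shift_state and the function round_trip_state are hypothetical and exist only for this illustration; the comments describe what typical implementations of fpos do, which is an assumption rather than something the Standard spells out.

```cpp
#include <ios>

// Hypothetical conversion-state type standing in for char_traits<charT>::state_type.
struct shift_state {
    int mode;
    shift_state() : mode(0) {}
};

template <class StateT>
StateT round_trip_state(const StateT& s) {
    std::fpos<StateT> pos;      // typical implementations hold a StateT member,
                                // which must be default-constructed here
    pos.state(s);               // stores s: copy construction and/or assignment
    StateT out = pos.state();   // copies the stored state back out
    return out;
}
```

A state type that lacked a default constructor or an assignment operator would be expected to fail to compile in the first or second statement, respectively, on such implementations.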


355. Operational semantics for a.back()

Section: 23.1.1 [lib.sequence.reqmts]  Status: Review  Submitter: Yaroslav Mironov  Date: 23 Jan 2002

Table 68 "Optional Sequence Operations" in 23.1.1/12 specifies operational semantics for "a.back()" as "*--a.end()", which may be ill-formed [because calling operator-- on a temporary (the return) of a built-in type is ill-formed], provided a.end() returns a simple pointer rvalue (this is almost always the case for std::vector::end(), for example). Thus, the specification is not only incorrect, it demonstrates a dangerous construct: "--a.end()" may successfully compile and run as intended, but after changing the type of the container or the mode of compilation it may produce a compile-time error.

Proposed resolution:

Change the specification in table 68 "Optional Sequence Operations" in 23.1.1/12 for "a.back()" from

*--a.end()

to

{ iterator tmp = a.end(); --tmp; *tmp; }

and the specification for "a.pop_back()" from

a.erase(--a.end())

to

{ iterator tmp = a.end(); --tmp; a.erase(tmp); }

[Curaçao: LWG changed PR from "{ X::iterator tmp = a.end(); return *--tmp; }" to "*a.rbegin()", and from "{ X::iterator tmp = a.end(); a.erase(--tmp); }" to "a.erase(rbegin())".]

[There is a second possible defect; table 68 "Optional Sequence Operations" in the "Operational Semantics" column uses operations present only in the "Reversible Container" requirements, yet there is no stated dependency between these separate requirements tables. Ask in Santa Cruz if the LWG would like a new issue opened.]

[Santa Cruz: the proposed resolution is even worse than what's in the current standard: erase is undefined for reverse iterator. If we're going to make the change, we need to define a temporary and use operator--. Additionally, we don't know how prevalent this is: do we need to make this change in more than one place? Martin has volunteered to review the standard and see if this problem occurs elsewhere.]

[Oxford: Matt provided new wording to address the concerns raised in Santa Cruz. It does not appear that this problem appears anywhere else in clauses 23 or 24.]
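The proposed operational semantics can be written as ordinary function templates. The names back_of, pop_back_of, and last_after_pop are hypothetical; the sketch only demonstrates that the named-temporary form is well-formed whether the iterator is a class type or a plain pointer.

```cpp
#include <vector>

// a.back() expressed as in the proposed resolution:
// { iterator tmp = a.end(); --tmp; *tmp; }
template <class Seq>
typename Seq::reference back_of(Seq& a) {
    typename Seq::iterator tmp = a.end();  // a named lvalue, so --tmp is always valid
    --tmp;
    return *tmp;
}

// a.pop_back() expressed as in the proposed resolution:
// { iterator tmp = a.end(); --tmp; a.erase(tmp); }
template <class Seq>
void pop_back_of(Seq& a) {
    typename Seq::iterator tmp = a.end();
    --tmp;
    a.erase(tmp);
}

// Small usage helper: removes the last element of a copy and returns the new last one.
int last_after_pop(std::vector<int> v) {
    pop_back_of(v);
    return back_of(v);
}
```

As with a.back() itself, both templates assume a non-empty sequence.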


356. Meaning of ctype_base::mask enumerators

Section: 22.2.1 [lib.category.ctype]  Status: Open  Submitter: Matt Austern  Date: 23 Jan 2002

What should the following program print?

  #include <algorithm>
  #include <locale>
  #include <iostream>

  class my_ctype : public std::ctype<char>
  {
    typedef std::ctype<char> base;
  public:
    my_ctype(std::size_t refs = 0) : base(my_table, false, refs)
    {
      std::copy(base::classic_table(), base::classic_table() + base::table_size,
                my_table);
      my_table[(unsigned char) '_'] = (base::mask) (base::print | base::space);
    }
  private:
    mask my_table[base::table_size];
  };

  int main()
  {
    my_ctype ct;
    std::cout << "isspace: " << ct.is(std::ctype_base::space, '_') << "    "
              << "isalpha: " << ct.is(std::ctype_base::alpha, '_') << std::endl;
  }

The goal is to create a facet where '_' is treated as whitespace.

On gcc 3.0, this program prints "isspace: 1 isalpha: 0". On Microsoft C++ it prints "isspace: 1 isalpha: 1".

I believe that both implementations are legal, and the standard does not give enough guidance for users to be able to use std::ctype's protected interface portably.

The above program assumes that ctype_base::mask enumerators like space and print are disjoint, and that the way to say that a character is both a space and a printing character is to or those two enumerators together. This is suggested by the "exposition only" values in 22.2.1 [lib.category.ctype], but it is nowhere specified in normative text. An alternative interpretation is that the more specific categories subsume the less specific. The above program gives the results it does on the Microsoft compiler because, on that compiler, print has all the bits set for each specific printing character class.

From the point of view of std::ctype's public interface, there's no important difference between these two techniques. From the point of view of the protected interface, there is. If I'm defining a facet that inherits from std::ctype<char>, I'm the one who defines the value that table()['a'] returns. I need to know what combination of mask values I should use. This isn't so very esoteric: it's exactly why std::ctype has a protected interface. If we care about users being able to write their own ctype facets, we have to give them a portable way to do it.

Related reflector messages: lib-9224, lib-9226, lib-9229, lib-9270, lib-9272, lib-9273, lib-9274, lib-9277, lib-9279.

Issue 339 is related, but not identical. The proposed resolution of issue 339 says that ctype_base::mask must be a bitmask type. It does not say that the ctype_base::mask elements are bitmask elements, so it doesn't directly affect this issue.

More comments from Benjamin Kosnik, who believes that C99 compatibility essentially requires what we're calling option 1 below:

I think the C99 standard is clear, that isspace -> !isalpha.
--------

#include <algorithm>
#include <locale>
#include <iostream>

class my_ctype : public std::ctype<char>
{
private:
  typedef std::ctype<char> base;
  mask my_table[base::table_size];

public:
  my_ctype(std::size_t refs = 0) : base(my_table, false, refs)
  {
    std::copy(base::classic_table(), base::classic_table() + base::table_size,
              my_table);
    mask both = base::print | base::space;
    my_table[static_cast<mask>('_')] = both;
  }
};

int main()
{
  using namespace std;
  my_ctype ct;
  cout << "isspace: " << ct.is(ctype_base::space, '_') << endl;
  cout << "isprint: " << ct.is(ctype_base::print, '_') << endl;

  // ISO C99, isalpha iff upper | lower set, and !space.
  // 7.5, p 193
  // -> looks like g++ behavior is correct.
  // 356 -> bitmask elements are required for ctype_base
  // 339 -> bitmask type required for mask
  cout << "isalpha: " << ct.is(ctype_base::alpha, '_') << endl;
}

Proposed resolution:

Informally, we have three choices:

  1. Require that the enumerators are disjoint (except for alnum and graph)
  2. Require that the enumerators are not disjoint, and specify which of them subsume which others. (e.g. mandate that lower includes alpha and print)
  3. Explicitly leave this unspecified, with the result that the above program is not portable.

Either of the first two options is just as good from the standpoint of portability. Either one will require some implementations to change.

[ More discussion is needed. Nobody likes option 3. Options 1 and 2 are both controversial, 2 perhaps less so. Benjamin thinks that option 1 is required for C99 compatibility. ]
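Which of the first two options a given implementation currently resembles can be probed with a short sketch. The helper names are hypothetical. The only overlap the sketch relies on is that alnum shares bits with alpha and digit, which follows from the definition of alnum as alpha | digit; the print/alpha result is implementation-specific and is precisely the point at issue.

```cpp
#include <locale>

// True if the two masks have at least one bit in common.
bool masks_overlap(std::ctype_base::mask a, std::ctype_base::mask b) {
    return (a & b) != 0;
}

// On an implementation in the style of option 1 (disjoint enumerators) this is
// expected to be false; on one in the style of option 2 it is expected to be true.
bool print_overlaps_alpha() {
    return masks_overlap(std::ctype_base::print, std::ctype_base::alpha);
}
```

A facet author could use such a probe only as a diagnostic; it does not make a user-defined table portable.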


359. num_put<>::do_put (..., bool) undocumented

Section: 22.2.2.2.1 [lib.facet.num.put.members]  Status: Review  Submitter: Martin Sebor  Date: 12 Mar 2002

22.2.2.2.1, p1:

    iter_type put (iter_type out, ios_base& str, char_type fill,
                   bool val) const;
    ...

    1   Returns: do_put (out, str, fill, val).
    

AFAICS, the behavior of do_put (..., bool) is not documented anywhere, however, 22.2.2.2.2, p23:

iter_type put (iter_type out, ios_base& str, char_type fill,
               bool val) const;
Effects: If (str.flags() & ios_base::boolalpha) == 0 then do out = do_put(out, str, fill, (int)val) Otherwise do
             string_type s =
                 val ? use_facet<ctype<charT> >(loc).truename()
                     : use_facet<ctype<charT> >(loc).falsename();
and then insert the characters of s into out. out.

This means that the bool overload of do_put() will never be called, which contradicts the first paragraph. Perhaps the declaration should read do_put(), and not put()?

Note also that there is no Returns clause for this function, which should probably be corrected, just as should the second occurrence of "out." in the text.

I think the least invasive change to fix it would be something like the following:

Proposed resolution:

In 22.2.2.2.2 [lib.facet.num.put.virtuals], just above paragraph 1, remove the bool overload.

In 22.2.2.2.2 [lib.facet.num.put.virtuals], p23, make the following changes

Replace put() with do_put() in the declaration of the member function.
Change the Effects clause to a Returns clause (to avoid the requirement to call do_put(..., int) from do_put (..., bool)) like so:
23 Returns: If (str.flags() & ios_base::boolalpha) == 0 then do_put (out, str, fill, (long)val) Otherwise the function obtains a string s as if by
             string_type s =
                val ? use_facet<ctype<charT> >(loc).truename()
                    : use_facet<ctype<charT> >(loc).falsename();
and then inserts each character c of s into out via *out++ = c and returns out.

Rationale:

This fixes a couple of obvious typos, and also fixes what appears to be a requirement of gratuitous inefficiency.
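For reference, the user-visible behavior that the corrected wording is meant to describe can be exercised through the public put() member. The function format_bool is a hypothetical helper; the sketch assumes the classic "C" locale, in which the textual forms are "true" and "false".

```cpp
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Formats val through num_put<char>::put(), with or without boolalpha.
std::string format_bool(bool val, bool alpha) {
    std::ostringstream os;
    os.imbue(std::locale::classic());
    if (alpha)
        os.setf(std::ios_base::boolalpha);

    // The default output iterator of num_put<char> is ostreambuf_iterator<char>.
    typedef std::num_put<char> facet_type;
    const facet_type& np = std::use_facet<facet_type>(os.getloc());

    // put() forwards to the virtual do_put(..., bool) overload.
    np.put(std::ostreambuf_iterator<char>(os), os, os.fill(), val);
    return os.str();
}
```

Without boolalpha the value is formatted as an integer; with boolalpha the characters of the name are written one at a time through the iterator.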


362. bind1st/bind2nd type safety

Section: 20.3.6.2 [lib.bind.1st]  Status: Open  Submitter: Andrew Demkin  Date: 26 Apr 2002

The definition of bind1st() (20.3.6.2 [lib.bind.1st]) can result in the construction of an unsafe binding between incompatible pointer types. For example, given a function whose first parameter type is 'pointer to T', it's possible without error to bind an argument of type 'pointer to U' when U does not derive from T:

   struct T {};
   struct U {};

   void foo(T*, int);

   U u;

   int* p;
   int* q;

   for_each(p, q, bind1st(ptr_fun(foo), &u));    // unsafe binding

The definition of bind1st() includes a functional-style conversion to map its argument to the expected argument type of the bound function (see below):

  typename Operation::first_argument_type(x)

A functional-style conversion (5.2.3 [expr.type.conv]) is defined to be semantically equivalent to an explicit cast expression (5.4 [expr.cast]), which may (according to 5.4, paragraph 5) be interpreted as a reinterpret_cast, thus masking the error.

The problem and proposed change also apply to 20.3.6.4 [lib.bind.2nd].

Proposed resolution:

The simplest and most localized change to prevent such errors is to require bind1st() use a static_cast expression rather than the functional-style conversion; that is, have bind1st() return:

   binder1st<Operation>( op,
     static_cast<typename Operation::first_argument_type>(x)).
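The difference between the two conversions can be isolated from bind1st() itself. In the sketch below the typedef first_argument_type stands in for Operation::first_argument_type, and the structs and function names are hypothetical.

```cpp
struct T { int t; };
struct U { int u; };        // unrelated to T
struct D : T { int d; };    // derived from T

typedef T* first_argument_type;   // stand-in for Operation::first_argument_type

// The conversion bind1st() currently specifies. For an unrelated pointer
// type it is accepted and behaves like a reinterpret_cast.
T* convert_functional(U* x) {
    return first_argument_type(x);
}

// The conversion the proposed resolution asks for. It accepts D* (a standard
// derived-to-base conversion); instantiating it with U* is ill-formed, so a
// call such as convert_static(&u) would be rejected at compile time.
template <class From>
T* convert_static(From* x) {
    return static_cast<first_argument_type>(x);
}
```

The static_cast form therefore keeps the legitimate derived-to-base binding while turning the unsafe binding from the example above into a diagnosable error.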

A more aggressive solution is to change the semantics of functional-style conversions to not permit a reinterpret_cast. For contexts that require the semantics of reinterpret_cast, the language may want to require the use of an explicit cast expression such as '(T) x' or 'reinterpret_cast<T>(x)' and limit the behavior of the functional notation to match statically-checked and standard conversions (as defined by 5.2.9 and 4.10, etc.). Although changing the semantics of functional-style conversions may seem drastic and does have language-wide ramifications, it has the benefit of better unifying the conversion rules for user defined types and built-in types, which can be especially important for generic template programming.

[Santa Cruz: it's clear that a function-style cast is wrong. Maybe a static cast would be better, or maybe no cast at all. Jeremy will check with the original author of this part of the Standard and will see what the original intent was.]


365. Lack of const-qualification in clause 27

Section: 27 [lib.input.output]  Status: Review  Submitter: Walter Brown, Marc Paterno  Date: 10 May 2002

Some stream and streambuf member functions are declared non-const, even though they appear only to report information rather than to change an object's logical state. They should be declared const. See document N1360 for details and rationale.

The list of member functions under discussion: in_avail, showmanyc, tellg, tellp, is_open.

Related issue: 73

Proposed resolution:

In 27.8.1.5, 27.8.1.7, 27.8.1.8, 27.8.1.10, 27.8.1.11, and 27.8.1.13

Replace

  bool is_open();

with

  bool is_open() const;

Rationale:

Of the changes proposed in N1360, the only one that is safe is changing the filestreams' is_open to const. The LWG believed that this was NAD the first time it considered this issue (issue 73), but now thinks otherwise. The corresponding streambuf member function, after all, is already const.

The other proposed changes are less safe, because some streambuf functions that appear merely to report a value do actually perform mutating operations. It's not even clear that they should be considered "logically const", because streambuf has two interfaces, a public one and a protected one. These functions may, and often do, change the state as exposed by the protected interface, even if the state exposed by the public interface is unchanged.

Note that implementers can make this change in a binary compatible way by providing both overloads; this would be a conforming extension.
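Until is_open() is const on the file streams, code holding a const reference has to go through the stream buffer. A minimal sketch, with a hypothetical function name, relying only on the fact that rdbuf() and basic_filebuf::is_open() are both already const:

```cpp
#include <fstream>

// Reports whether the stream's file is open, given only a const reference.
bool is_open_via_filebuf(const std::ifstream& in) {
    // rdbuf() is a const member returning basic_filebuf<char>*, and
    // basic_filebuf::is_open() is itself const.
    return in.rdbuf()->is_open();
}
```

With the proposed change the body could simply call in.is_open().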


366. Excessive const-qualification

Section: 27 [lib.input.output]  Status: Open  Submitter: Walter Brown, Marc Paterno  Date: 10 May 2002

The following member functions are declared const, yet return non-const pointers. We believe they should be changed, because they allow code that may surprise the user. See document N1360 for details and rationale.

[Santa Cruz: the real issue is that we've got const member functions that return pointers to non-const, and N1360 proposes replacing them by overloaded pairs. There isn't a consensus about whether this is a real issue, since we've never said what our constness policy is for iostreams. N1360 relies on a distinction between physical constness and logical constness; that distinction, or those terms, does not appear in the standard.]
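The kind of code at issue can be sketched briefly. The function name is hypothetical. It receives the stream by const reference and nevertheless consumes a character, because the const member rdbuf() hands back a pointer to a non-const stream buffer.

```cpp
#include <istream>
#include <sstream>

// Takes the stream by const reference, yet advances its input sequence.
int consume_through_const(const std::istream& in) {
    // basic_ios::rdbuf() is const but returns basic_streambuf<char>*,
    // so the non-const member sbumpc() may be called on the result.
    return in.rdbuf()->sbumpc();
}
```

Under the overloaded pairs proposed by N1360, the const overload would return a pointer to const and the call to sbumpc() would no longer compile.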

Proposed resolution:

In 27.4.4 and 27.4.4.2

Replace

  basic_ostream<charT,traits>* tie() const;

with

  basic_ostream<charT,traits>* tie();
  const basic_ostream<charT,traits>* tie() const;

and replace

  basic_streambuf<charT,traits>* rdbuf() const;

with

  basic_streambuf<charT,traits>* rdbuf();
  const basic_streambuf<charT,traits>* rdbuf() const;

In 27.5.2 and 27.5.2.3.1

Replace

  char_type* eback() const;

with

  char_type* eback();
  const char_type* eback() const;

Replace

  char_type* gptr() const;

with

  char_type* gptr();
  const char_type* gptr() const;

Replace

  char_type* egptr() const;

with

  char_type* egptr();
  const char_type* egptr() const;

In 27.5.2 and 27.5.2.3.2

Replace

  char_type* pbase() const;

with

  char_type* pbase();
  const char_type* pbase() const;

Replace

  char_type* pptr() const;

with

  char_type* pptr();
  const char_type* pptr() const;

Replace

  char_type* epptr() const;

with

  char_type* epptr();
  const char_type* epptr() const;

In 27.7.2, 27.7.2.2, 27.7.3, 27.7.3.2, 27.7.4, and 27.7.6

Replace

  basic_stringbuf<charT,traits,Allocator>* rdbuf() const;

with

  basic_stringbuf<charT,traits,Allocator>* rdbuf();
  const basic_stringbuf<charT,traits,Allocator>* rdbuf() const;

In 27.8.1.5, 27.8.1.7, 27.8.1.8, 27.8.1.10, 27.8.1.11, and 27.8.1.13

Replace

  basic_filebuf<charT,traits>* rdbuf() const;

with

  basic_filebuf<charT,traits>* rdbuf();
  const basic_filebuf<charT,traits>* rdbuf() const;

368. basic_string::replace has two "Throws" paragraphs

Section: 21.3.5.6 [lib.string::replace]  Status: Open  Submitter: Beman Dawes  Date: 3 Jun 2002

21.3.5.6 [lib.string::replace] basic_string::replace, second signature, given in paragraph 1, has two "Throws" paragraphs (3 and 5).

In addition, the second "Throws" paragraph (5) includes specification (beginning with "Otherwise, the function replaces ...") that should be part of the "Effects" paragraph.

Proposed resolution:

[This is a typo that escalated. It's clear that what's in the Standard is wrong. It's less clear what the fix ought to be. Someone who understands string replace well needs to work on this.]


369. io stream objects and static ctors

Section: 27.3 [lib.iostream.objects]  Status: Open  Submitter: Ruslan Abdikeev  Date: 8 Jul 2002

Is it safe to use standard iostream objects from constructors of static objects? Are standard iostream objects constructed and are their associations established at that time?

Surprisingly enough, the Standard does NOT require that.

27.3/2 [lib.iostream.objects] guarantees that standard iostream objects are constructed and their associations are established before the body of main() begins execution. It also refers to ios_base::Init class as the panacea for constructors of static objects.

However, there's nothing in 27.3 [lib.iostream.objects], in 27.4.2 [lib.ios.base], and in 27.4.2.1.6 [lib.ios::Init], that would require implementations to allow access to standard iostream objects from constructors of static objects.

Details:

Core text refers to some magic object ios_base::Init, which will be discussed below:

"The [standard iostream] objects are constructed, and their associations are established at some time prior to or during first time an object of class basic_ios<charT,traits>::Init is constructed, and in any case before the body of main begins execution." (27.3/2 [lib.iostream.objects])

The first non-normative footnote encourages implementations to initialize standard iostream objects earlier than required.

However, the second non-normative footnote makes an explicit and unsupported claim:

"Constructors and destructors for static objects can access these [standard iostream] objects to read input from stdin or write output to stdout or stderr." (27.3/2 footnote 265 [lib.iostream.objects])

The only bit of magic is related to that ios_base::Init class. AFAIK, the rationale behind ios_base::Init was to bring an instance of this class to each translation unit which #included <iostream> or related header. Such an inclusion would support the claim of footnote quoted above, because in order to use some standard iostream object it is necessary to #include <iostream>.

However, while the Standard explicitly describes ios_base::Init as an appropriate class for doing the trick, I failed to find a mention of an _instance_ of ios_base::Init in the Standard.

Proposed resolution:

At the end of header <iostream> synopsis in 27.3 [lib.iostream.objects]

       namespace std
       {
          ... extern istream cin; ...

add the following lines

          namespace
          {
             ios_base::Init <some_implementation_defined_name>;
          }
        }

[Santa Cruz: The LWG is leaning toward NAD. There isn't any normative wording saying that the Init scheme will be used, but that is probably intentional. Implementers use dirty tricks for iostream initialization, and doing it portably is somewhere between difficult and impossible. Too much constraint in this area is dangerous, and if we are to make any changes it would probably be more appropriate for them to be nonnormative. Martin will try to come up with clearer wording that expresses this intent.]
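Independently of how the issue is resolved, a user can rely on the normative description of ios_base::Init by constructing an instance explicitly before touching the standard streams. The class startup_probe, the object the_probe, and the accessor function below are hypothetical names used only for this sketch.

```cpp
#include <iostream>

class startup_probe {
public:
    startup_probe() : cout_usable_(false) {
        // 27.4.2.1.6: constructing an Init object constructs and initializes
        // the standard stream objects if that has not already happened.
        std::ios_base::Init ensure_streams;
        // From here on std::cout and friends may be used.
        cout_usable_ = std::cout.good();
    }
    bool cout_usable() const { return cout_usable_; }
private:
    bool cout_usable_;
};

// A static object whose constructor runs before main().
static startup_probe the_probe;

bool cout_was_usable_during_static_init() {
    return the_probe.cout_usable();
}
```

This does not require the header to contain an Init instance, which is the part the Standard leaves unstated.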


371. Stability of multiset and multimap member functions

Section: 23.1 [lib.container.requirements]  Status: Open  Submitter: Frank Compagner  Date: 20 Jul 2002

The requirements for multiset and multimap containers (23.1 [lib.container.requirements], 23.1.2 [lib.associative.reqmts], 23.3.2 [lib.multimap] and 23.3.4 [lib.multiset]) make no mention of the stability of the required (mutating) member functions. It appears the standard allows these functions to reorder equivalent elements of the container at will, yet the pervasive red-black tree implementation appears to provide stable behaviour.

This is of most concern when considering the behaviour of erase(). A stability requirement would guarantee the correct working of the following 'idiom' that removes elements based on a certain predicate function.

  multimap<int, int> m;
  multimap<int, int>::iterator i = m.begin();
  while (i != m.end()) {
      if (pred(i))
          m.erase (i++);
      else
          ++i;
  }

Although clause 23.1.2/8 guarantees that i remains a valid iterator throughout this loop, absence of the stability requirement could potentially result in elements being skipped. This would make this code incorrect, and, furthermore, means that there is no way of erasing these elements without iterating first over the entire container, and second over the elements to be erased. This would be unfortunate, and have a negative impact on both performance and code simplicity.

If the stability requirement is intended, it should be made explicit (probably through an extra paragraph in clause 23.1.2).

If it turns out stability cannot be guaranteed, I'd argue that a remark or footnote is called for (also somewhere in clause 23.1.2) to warn against relying on stable behaviour (as demonstrated by the code above). If most implementations will display stable behaviour, any problems emerging on an implementation without stable behaviour will be hard to track down by users. This would also make the need for an erase_if() member function that much greater.

This issue is somewhat related to LWG issue 130.

[Santa Cruz: More people need to look at this. Much user code may assume stability. On the other hand, it seems drastic to add a new requirement now.]
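A self-contained version of the idiom, for experimenting with a particular implementation, might look as follows. The names erase_matching and is_odd_value are hypothetical, and unlike the fragment above the predicate here receives the element rather than the iterator. The sketch assumes, as the submitter does, that erasing one element does not reorder the remaining equivalent elements.

```cpp
#include <cstddef>
#include <map>

// Example predicate: selects elements whose mapped value is odd.
bool is_odd_value(const std::multimap<int, int>::value_type& kv) {
    return kv.second % 2 != 0;
}

// Erases every element for which pred returns true and returns how many
// were erased. The iterator is advanced before erase() invalidates it.
template <class MultiMap, class Pred>
std::size_t erase_matching(MultiMap& m, Pred pred) {
    std::size_t erased = 0;
    typename MultiMap::iterator i = m.begin();
    while (i != m.end()) {
        if (pred(*i)) {
            m.erase(i++);
            ++erased;
        } else {
            ++i;
        }
    }
    return erased;
}
```

If an implementation were free to reorder equivalent elements during erase(), the single pass above could visit some elements twice and others not at all.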

Proposed resolution:


376. basic_streambuf semantics

Section: 27.7.1.3 [lib.stringbuf.virtuals]  Status: Open  Submitter: Ray Lischner  Date: 14 Aug 2002

In Section 27.7.1.3 [lib.stringbuf.virtuals], Table 90, the implication is that the four conditions should be mutually exclusive, but they are not. The first two cases, as written, are subcases of the third. I think it would be clearer if the conditions were rewritten as follows:

(which & (ios_base::in|ios_base::out)) == ios_base::in

(which & (ios_base::in|ios_base::out)) == ios_base::out

(which & (ios_base::in|ios_base::out)) == (ios_base::in|ios_base::out) and way == either ios_base::beg or ios_base::end

Otherwise

As written, it is unclear what should be the result if cases 1 & 2 are true, but case 3 is false, e.g.,

seekoff(0, ios_base::cur, ios_base::in | ios_base::out)
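The rewritten conditions can be checked for mutual exclusivity by expressing them as a chain of tests. The following is only a sketch of the suggested table rows; the function name seekoff_case is hypothetical, and the return values 1 to 4 simply number the rows in the order given above.

```cpp
#include <ios>

// Returns which of the four rewritten table rows applies (1 to 4).
// Row 4 is the "Otherwise" row.
inline int seekoff_case(std::ios_base::seekdir way,
                        std::ios_base::openmode which) {
    const std::ios_base::openmode both =
        std::ios_base::in | std::ios_base::out;
    const std::ios_base::openmode selected = which & both;

    if (selected == std::ios_base::in)
        return 1;
    if (selected == std::ios_base::out)
        return 2;
    if (selected == both &&
        (way == std::ios_base::beg || way == std::ios_base::end))
        return 3;
    return 4;
}
```

Under this reading, the problematic call with ios_base::cur and both in and out set falls unambiguously into the "Otherwise" row.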

[Santa Cruz: The ambiguity seems real. We need to do a survey of implementations before we decide on a solution.]

Proposed resolution:


378. locale immutability and locale::operator=()

Section: 22.1.1 [lib.locale]  Status: Open  Submitter: Martin Sebor  Date: 6 Sep 2002

I think there is a problem with 22.1.1, p6 which says that

    -6- An instance of locale is immutable; once a facet reference
        is obtained from it, that reference remains usable as long
        as the locale value itself exists.

and 22.1.1.2, p4:

    const locale& operator=(const locale& other) throw();

    -4- Effects: Creates a copy of other, replacing the current value.

How can a reference to a facet obtained from a locale object remain valid after an assignment that clearly must replace all the facets in the locale object? Imagine a program such as this

    std::locale loc ("de_DE");
    const std::ctype<char> &r0 = std::use_facet<std::ctype<char> >(loc);
    loc = std::locale ("en_US");
    const std::ctype<char> &r1 = std::use_facet<std::ctype<char> >(loc);

Is r0 really supposed to be preserved and destroyed only when loc goes out of scope?
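Until the wording is settled, a program can avoid depending on the answer by holding a second locale object with the original value for as long as the facet reference is in use. This is a sketch under that assumption; the function name upper_after_reassignment is hypothetical, and the classic locale is used instead of named locales so that the example does not depend on which locales are installed.

```cpp
#include <locale>

// Obtains a facet reference, reassigns the original locale object,
// and then uses the reference. The copy named keep holds the same
// locale value, so the reference stays tied to a value that still exists.
inline char upper_after_reassignment(char c) {
    std::locale loc = std::locale::classic();
    const std::locale keep(loc);
    const std::ctype<char>& r0 = std::use_facet<std::ctype<char> >(keep);

    loc = std::locale();   // loc now holds a copy of the global locale

    return r0.toupper(c);  // does not rely on loc's former value
}
```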

Proposed resolution:

Suggest to replace 22.1.1 [lib.locale], p6 with

    -6- Unless assigned a new value, locale objects are immutable;
        once a facet reference is obtained from it, that reference
        remains usable as long as the locale object itself exists
        or until the locale object is assigned the value of another,
        distinct locale object.

[Santa Cruz: Dietmar agrees with this general direction, but is uncomfortable about the proposed wording. He and Martin will try to come up with better wording.]


379. nonsensical ctype::do_widen() requirement

Section: 22.2.1.1.2 [lib.locale.ctype.virtuals]  Status: Review  Submitter: Martin Sebor  Date: 6 Sep 2002

The last sentence in 22.2.1.1.2, p11 below doesn't seem to make sense.

  charT do_widen (char c) const;

  -11- Effects: Applies the simplest reasonable transformation from
       a char value or sequence of char values to the corresponding
       charT value or values. The only characters for which unique
       transformations are required are those in the basic source
       character set (2.2). For any named ctype category with a
       ctype<charT> facet ctw and valid ctype_base::mask value
       M (is(M, c) || !ctw.is(M, do_widen(c))) is true.

Shouldn't the last sentence instead read

       For any named ctype category with a ctype<char> facet ctc
       and valid ctype_base::mask value M
       (ctc.is(M, c) || !is(M, do_widen(c))) is true.

I.e., if the narrow character c is not a member of a class of characters then neither is the widened form of c. (To paraphrase footnote 224.)
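The corrected condition can be spot-checked from outside the facet by comparing the narrow and wide classifications of a few characters. This is an illustrative sketch only: the function name, the choice of the classic locale, and the sample characters are assumptions, and the check goes through the public widen() rather than do_widen().

```cpp
#include <cstddef>
#include <locale>

// Checks (ctc.is(M, c) || !ctw.is(M, ctw.widen(c))) for a handful of
// masks and characters from the basic source character set.
inline bool widen_preserves_classification() {
    const std::locale& loc = std::locale::classic();
    const std::ctype<char>& ctc = std::use_facet<std::ctype<char> >(loc);
    const std::ctype<wchar_t>& ctw = std::use_facet<std::ctype<wchar_t> >(loc);

    const std::ctype_base::mask masks[] = {
        std::ctype_base::space, std::ctype_base::upper,
        std::ctype_base::lower, std::ctype_base::alpha,
        std::ctype_base::digit, std::ctype_base::punct
    };
    const char sample[] = "azAZ09 \t+-_";

    for (std::size_t m = 0; m != sizeof masks / sizeof masks[0]; ++m) {
        for (std::size_t i = 0; i != sizeof sample - 1; ++i) {
            const char c = sample[i];
            if (!(ctc.is(masks[m], c) || !ctw.is(masks[m], ctw.widen(c))))
                return false;   // widened form gained a class c lacks
        }
    }
    return true;
}
```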

Proposed resolution:

Replace the last sentence of 22.2.1.1.2 [lib.locale.ctype.virtuals], p11 with the following text:

       For any named ctype category with a ctype<char> facet ctc
       and valid ctype_base::mask value M
       (ctc.is(M, c) || !is(M, do_widen(c))) is true.

Rationale:

The LWG believes this is just a typo, and that this is the correct fix.


382. codecvt do_in/out result

Section: 22.2.1.5 [lib.locale.codecvt]  Status: Review  Submitter: Martin Sebor  Date: 30 Aug 2002

It seems that the descriptions of codecvt do_in() and do_out() leave sufficient room for interpretation so that two implementations of codecvt may not work correctly with the same filebuf. Specifically, the following seems less than adequately specified:

  1. the conditions under which the functions terminate
  2. precisely when the functions return ok
  3. precisely when the functions return partial
  4. the full set of conditions when the functions return error
  1. 22.2.1.5.2 [lib.locale.codecvt.virtuals], p2 says this about the effects of the function: ...Stops if it encounters a character it cannot convert... This assumes that there *is* a character to convert. What happens when there is a sequence that doesn't form a valid source character, such as an unassigned or invalid UNICODE character, or a sequence that cannot possibly form a character (e.g., the sequence "\xc0\xff" in UTF-8)?
  2. Table 53 says that the function returns codecvt_base::ok to indicate that the function(s) "completed the conversion." Suppose that the source sequence is "\xc0\x80" in UTF-8, with from pointing to '\xc0' and (from_end==from + 1). It is not clear whether the return value should be ok or partial (see below).
  3. Table 53 says that the function returns codecvt_base::partial if "not all source characters converted." With the from pointers set up the same way as above, it is not clear whether the return value should be partial or ok (see above).
  4. Table 53, in the row describing the meaning of error mistakenly refers to a "from_type" character, without the symbol from_type having been defined. Most likely, the word "source" character is intended, although that is not sufficient. The functions may also fail when they encounter an invalid source sequence that cannot possibly form a valid source character (e.g., as explained in bullet 1 above).

Finally, the conditions described at the end of 22.2.1.5.2 [lib.locale.codecvt.virtuals], p4 don't seem to be possible:

"A return value of partial, if (from_next == from_end), indicates that either the destination sequence has not absorbed all the available destination elements, or that additional source elements are needed before another destination element can be produced."

If the value is partial, it's not clear to me that (from_next ==from_end) could ever hold if there isn't enough room in the destination buffer. In order for (from_next==from_end) to hold, all characters in that range must have been successfully converted (according to 22.2.1.5.2 [lib.locale.codecvt.virtuals], p2) and since there are no further source characters to convert, no more room in the destination buffer can be needed.

It's also not clear to me that (from_next==from_end) could ever hold if additional source elements are needed to produce another destination character (not element as incorrectly stated in the text). partial is returned if "not all source characters have been converted" according to Table 53, which also implies that (from_next==from) does NOT hold.

Could it be that the intended qualifying condition was actually (from_next != from_end), i.e., that the sentence was supposed to read

"A return value of partial, if (from_next != from_end),..."

which would make perfect sense, since, as far as I understand it, partial can only occur if (from_next != from_end)?
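The relationship between partial and the from_next pointer can be observed by calling in() with a destination range that is too small. The sketch below assumes the classic locale's codecvt<wchar_t, char, mbstate_t> facet and a plain ASCII source; the names in_summary and convert_abc are hypothetical. Which result code an implementation returns in the tight case is exactly what the issue says is underspecified, so the code records it rather than assuming it.

```cpp
#include <cstddef>
#include <cwchar>
#include <locale>

// Summary of one call to codecvt::in().
struct in_summary {
    std::codecvt_base::result code;
    std::ptrdiff_t consumed;    // from_next - from
    std::ptrdiff_t remaining;   // from_end - from_next
    std::ptrdiff_t produced;    // to_next - to
};

// Converts the three source elements "abc" into a destination range
// holding at most room elements (capped at 8).
inline in_summary convert_abc(std::size_t room) {
    typedef std::codecvt<wchar_t, char, std::mbstate_t> cvt_type;
    const cvt_type& cvt = std::use_facet<cvt_type>(std::locale::classic());

    const char src[] = "abc";
    const char* const from = src;
    const char* const from_end = src + 3;
    const char* from_next = from;

    wchar_t dst[8];
    if (room > 8)
        room = 8;
    wchar_t* to_next = dst;

    std::mbstate_t state = std::mbstate_t();
    in_summary s;
    s.code = cvt.in(state, from, from_end, from_next,
                    dst, dst + room, to_next);
    s.consumed = from_next - from;
    s.remaining = from_end - from_next;
    s.produced = to_next - dst;
    return s;
}
```

Calling convert_abc(1) leaves source elements unconsumed, which is the situation the suggested (from_next != from_end) condition describes.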

Proposed resolution:

To address these issues, I propose that paragraphs 2, 3, and 4 be rewritten as follows. The proposal incorporates the accepted resolution of lwg issue 19.

-2- Effects: Converts characters in the range of source elements
    [from, from_end), placing the results in sequential positions
    starting at destination to. Converts no more than (from_end - from)
    source elements, and stores no more than (to_limit - to)
    destination elements.

    Stops if it encounters a sequence of source elements it cannot
    convert to a valid destination character. It always leaves the
    from_next and to_next pointers pointing one beyond the last
    element successfully converted.

    [Note: If returns noconv, internT and externT are the same type
    and the converted sequence is identical to the input sequence
    [from, from_next). to_next is set equal to to, the value of
    state is unchanged, and there are no changes to the values in
    [to, to_limit). --end note]

-3- Notes: Its operations on state are unspecified.
    [Note: This argument can be used, for example, to maintain shift
    state, to specify conversion options (such as count only), or to
    identify a cache of seek offsets. --end note]

-4- Returns: An enumeration value, as summarized in Table 53:

    Table 53 -- do_in/do_out result values

     Value      Meaning
    +---------+----------------------------------------------------+
    | ok      | successfully completed the conversion of all       |
    |         | complete characters in the source range            |
    +---------+----------------------------------------------------+
    | partial | the characters in the source range would, after    |
    |         | conversion, require space greater than that        |
    |         | available in the destination range                 |
    +---------+----------------------------------------------------+
    | error   | encountered either a sequence of elements in the   |
    |         | source range forming a valid source character that |
    |         | could not be converted to a destination character, |
    |         | or a sequence of elements in the source range that |
    |         | could not possibly form a valid source character   |
    +---------+----------------------------------------------------+
    | noconv  | internT and externT are the same type, and input   |
    |         | sequence is identical to converted sequence        |
    +---------+----------------------------------------------------+

    A return value of partial, i.e., if (from_next != from_end),
    indicates that either the destination sequence has not absorbed
    all the available destination elements, or that additional
    source elements are needed before another destination character
    can be produced.

[Santa Cruz: The LWG agrees that this is an important issue and that this general direction is probably correct. Dietmar, Howard, PJP, and Matt will review this wording.]


384. equal_range has unimplementable runtime complexity

Section: 25.3.3.3 [lib.equal.range]  Status: Open  Submitter: Hans Bos  Date: 18 Oct 2002

Section 25.3.3.3 [lib.equal.range] states that at most 2 * log(last - first) + 1 comparisons are allowed for equal_range.

It is not possible to implement equal_range with these constraints.

In a range of one element as in:

    int x = 1;
    equal_range(&x, &x + 1, 1)

it is easy to see that at least 2 comparison operations are needed.

For this case at most 2 * log(1) + 1 = 1 comparison is allowed.
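The count for the one-element case can be measured with a comparison object that records how often it is called. This is a sketch for illustration; counting_less and comparisons_for_single_element are hypothetical names, and the exact count an implementation reports may exceed two.

```cpp
#include <algorithm>
#include <utility>

// Comparison object that counts its invocations through a pointer,
// so that copies made by the algorithm share one counter.
struct counting_less {
    int* count;
    explicit counting_less(int* c) : count(c) {}
    bool operator()(int a, int b) const {
        ++*count;
        return a < b;
    }
};

// Runs equal_range on a range of one element and returns the number
// of comparisons performed.
inline int comparisons_for_single_element() {
    int x = 1;
    int count = 0;
    std::pair<int*, int*> r =
        std::equal_range(&x, &x + 1, 1, counting_less(&count));
    (void)r;   // only the comparison count is of interest here
    return count;
}
```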

I have checked a few libraries and they all use the same (nonconforming) algorithm for equal_range that has a complexity of

     2* log(distance(first, last)) + 2.

I guess this is the algorithm that the standard assumes for equal_range.

It is easy to see that 2 * log(distance) + 2 comparisons are enough since equal range can be implemented with lower_bound and upper_bound (both log(distance) + 1).

I think it is better to require something like 2log(distance) + O(1) (or even logarithmic as multiset::equal_range). Then an implementation has more room to optimize for certain cases (e.g. have log(distance) characteristics when at most one match is found in the range but 2log(distance) + 4 for the worst case).

[Santa Cruz: The issue is real, but of greater scope than just equal_range: it affects all of the binary search algorithms. What is the complexity supposed to be for ranges of 0 or 1 elements? What base are we using for the logarithm? Are these bounds supposed to be exact, or asymptotic? (If the latter, of course, then none of the other questions matter.)]

Proposed resolution:


385. Does call by value imply the CopyConstructible requirement?

Section: 17 [lib.library]  Status: Open  Submitter: Matt Austern  Date: 23 Oct 2002

Many function templates have parameters that are passed by value; a typical example is find_if's pred parameter in 25.1.2 [lib.alg.find]. Are the corresponding template parameters (Predicate in this case) implicitly required to be CopyConstructible, or does that need to be spelled out explicitly?

This isn't quite as silly a question as it might seem to be at first sight. If you call find_if in such a way that template argument deduction applies, then of course you'll get call by value and you need to provide a copy constructor. If you explicitly provide the template arguments, however, you can force call by reference by writing something like find_if<my_iterator, my_predicate&>. The question is whether implementations are required to accept this, or whether this is ill-formed because my_predicate& is not CopyConstructible.
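The explicit-argument call looks as follows in practice. This is a sketch only: counting_is_negative and first_negative_index are hypothetical names, and whether an implementation accepts the call, or whether the predicate's state is still visible afterwards, is exactly what this issue leaves open. The code therefore relies only on the returned iterator.

```cpp
#include <algorithm>

// Stateful predicate: counts how many elements it has examined.
struct counting_is_negative {
    int calls;
    counting_is_negative() : calls(0) {}
    bool operator()(int v) {
        ++calls;
        return v < 0;
    }
};

// Calls find_if with explicitly specified template arguments so that
// the predicate parameter has reference type. Returns the index of
// the first negative element, or -1 if there is none.
inline int first_negative_index(const int* first, const int* last) {
    counting_is_negative pred;
    const int* hit =
        std::find_if<const int*, counting_is_negative&>(first, last, pred);
    return hit == last ? -1 : static_cast<int>(hit - first);
}
```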

The scope of this problem, if it is a problem, is unknown. Function object arguments to generic algorithms in clauses 25 [lib.algorithms] and 26 [lib.numerics] are obvious examples. A review of the whole library is necessary.

Proposed resolution:

[ This is really two issues. First, predicates are typically passed by value but we don't say they must be Copy Constructible. They should be. Second: is specialization allowed to transform value arguments into references? References aren't copy constructible, so this should not be allowed. ]


386. Reverse iterator's operator[] has impossible return type

Section: 24.4.1.3.11 [lib.reverse.iter.opindex]  Status: New  Submitter: Matt Austern  Date: 23 Oct 2002

In 24.4.1.3.11 [lib.reverse.iter.opindex], reverse_iterator<>::operator[] is specified as having a return type of reverse_iterator::reference, which is the same as iterator_traits<Iterator>::reference. (Where Iterator is the underlying iterator type.)

The trouble is that Iterator's own operator[] doesn't necessarily have a return type of iterator_traits<Iterator>::reference. Its return type is merely required to be convertible to Iterator's value type. The return type specified for reverse_iterator's operator[] would thus appear to be impossible.

Related issue: 299. Jeremy will work on this.

Proposed resolution:

[ Comments from Dave Abrahams: IMO we should resolve 386 by just saying that the return type of reverse_iterator's operator[] is unspecified, allowing the random access iterator requirements to impose an appropriate return type. If we accept 299's proposed resolution (and I think we should), the return type will be readable and writable, which is about as good as we can do. ]


387. std::complex over-encapsulated

Section: 26.2 [lib.complex.numbers]  Status: Review  Submitter: Gabriel Dos Reis  Date: 8 Nov 2002

The absence of explicit description of std::complex<T> layout makes it impossible to reuse existing software developed in traditional languages like Fortran or C with unambiguous and commonly accepted layout assumptions. There ought to be a way for practitioners to predict with confidence the layout of std::complex<T> whenever T is a numerical datatype. The absence of ways to access individual parts of a std::complex<T> object as lvalues unduly promotes severe pessimizations. For example, the only way to change, independently, the real and imaginary parts is to write something like

complex<T> z;
// ...
// set the real part to r
z = complex<T>(r, z.imag());
// ...
// set the imaginary part to i
z = complex<T>(z.real(), i);

At this point, it seems appropriate to recall that a complex number is, in effect, just a pair of numbers with no particular invariant to maintain. Existing practice in numerical computations has it that a complex number datatype is usually represented by Cartesian coordinates. Therefore the over-encapsulation put in the specification of std::complex<> is not justified.
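The workaround shown above can be wrapped in small helpers, which makes the cost visible: each update constructs and assigns a whole complex value. This is a sketch using only the existing interface; the names set_real and set_imag are hypothetical.

```cpp
#include <complex>

// Replaces the real part of z, rebuilding the whole value because
// no lvalue access to the parts is available.
template <class T>
void set_real(std::complex<T>& z, const T& r) {
    z = std::complex<T>(r, z.imag());
}

// Replaces the imaginary part of z in the same way.
template <class T>
void set_imag(std::complex<T>& z, const T& i) {
    z = std::complex<T>(z.real(), i);
}
```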

Proposed resolution:

Add the following requirements to 26.2 [lib.complex.numbers] as 26.2/4:

If z is an lvalue expression of type cv std::complex<T> then

Moreover, if a is an expression of pointer type cv complex<T>* and the expression a[i] is well-defined for an integer expression i then:

In the header synopsis in 26.2.1 [lib.complex.synopsis], replace

  template<class T> T real(const complex<T>&);
  template<class T> T imag(const complex<T>&);

with

  template<class T> const T& real(const complex<T>&);
  template<class T>       T& real(      complex<T>&);
  template<class T> const T& imag(const complex<T>&);
  template<class T>       T& imag(      complex<T>&);

In 26.2.7 [lib.complex.value.ops] paragraph 1, change

  template<class T> T real(const complex<T>&);

to

  template<class T> const T& real(const complex<T>&);
  template<class T>       T& real(      complex<T>&);

and change the Returns clause to "Returns: The real part of x."

In 26.2.7 [lib.complex.value.ops] paragraph 2, change

  template<class T> T imag(const complex<T>&);

to

  template<class T> const T& imag(const complex<T>&);
  template<class T>       T& imag(      complex<T>&);

and change the Returns clause to "Returns: The imaginary part of x."

Rationale:

The LWG believes that C99 compatibility would be enough justification for this change even without other considerations. All existing implementations already have the layout proposed here.


389. Const overload of valarray::operator[] returns by value

Section: 26.3.2 [lib.template.valarray]  Status: Review  Submitter: Gabriel Dos Reis  Date: 8 Nov 2002

Consider the following program:

    #include <iostream>
    #include <ostream>
    #include <vector>
    #include <valarray>
    #include <algorithm>
    #include <iterator>

    template<typename Array>
    void print(const Array& a)
    {
        using namespace std;
        typedef typename Array::value_type T;
        copy(&a[0], &a[0] + a.size(),
             ostream_iterator<T>(std::cout, " "));
    }

    template<typename T, unsigned N>
    unsigned size(T(&)[N]) { return N; }

    int main()
    {
        double array[] = { 0.89, 9.3, 7, 6.23 };
        std::vector<double> v(array, array + size(array));
        std::valarray<double> w(array, size(array));
        print(v); // #1
        std::cout << std::endl;
        print(w); // #2
        std::cout << std::endl;
    }

While the call numbered #1 succeeds, the call numbered #2 fails because the const version of the member function valarray<T>::operator[](size_t) returns a value instead of a const-reference. That seems to be so for no apparent reason and with no benefit. Not only does that defeat users' expectations, it also hinders the integration of existing software (written in either C or Fortran) within programs written in C++. There is no reason why subscripting an expression of type valarray<T> that is const-qualified should not return a const T&.
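Generic code that must work whether or not the const subscript returns a reference can index element by element instead of taking addresses. This is a sketch of such a workaround, not a proposed resolution; the function name format_elements is hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <valarray>
#include <vector>

// Formats every element followed by a space. Only a[i] and a.size()
// are used, so the return type of the const operator[] does not matter.
template <typename Array>
std::string format_elements(const Array& a) {
    std::ostringstream out;
    for (std::size_t i = 0; i != a.size(); ++i)
        out << a[i] << ' ';
    return out.str();
}
```

The price is that the contiguous range [&a[0], &a[0] + a.size()) can no longer be handed to algorithms or to code written in other languages, which is the integration problem the issue describes.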

Proposed resolution:

In the class synopsis in 26.3.2 [lib.template.valarray], and in 26.3.2.3 [lib.valarray.access] just above paragraph 1, change

  T operator[](size_t) const;

to

  const T& operator[](size_t) const;

Rationale:

Return by value seems to serve no purpose. valarray was explicitly designed to have a specified layout so that it could easily be integrated with libraries in other languages, and return by value defeats that purpose. It is believed that this change will have no impact on allowable optimizations.


391. non-member functions specified as const

Section: 22.1.3.2 [lib.conversions]  Status: Review  Submitter: James Kanze  Date: 10 Dec 2002

The specifications of toupper and tolower both specify the functions as const, although they are not member functions, and are not specified as const in the header file synopsis in section 22.1 [lib.locales].

Proposed resolution:

In 22.1.3.2 [lib.conversions], remove const from the function declarations of std::toupper and std::tolower.

Rationale:

Fixes an obvious typo.


394. behavior of formatted output on failure

Section: 27.6.2.5 [lib.ostream.formatted]  Status: Review  Submitter: Martin Sebor  Date: 27 Dec 2002

There is a contradiction in Formatted output about what bit is supposed to be set if the formatting fails. One sentence says it's badbit and another that it's failbit.

27.6.2.5.1, p1 says in the Common Requirements on Formatted output functions:

     ... If the generation fails, then the formatted output function
     does setstate(ios::failbit), which might throw an exception.

27.6.2.5.2, p1 goes on to say this about Arithmetic Inserters:

... The formatting conversion occurs as if it performed the following code fragment:

     bool failed =
         use_facet<num_put<charT,ostreambuf_iterator<charT,traits>
         > >
         (getloc()).put(*this, *this, fill(), val). failed();

     ... If failed is true then does setstate(badbit) ...

The original intent of the text, according to Jerry Schwarz (see c++std-lib-10500), is captured in the following paragraph:

In general "badbit" should mean that the stream is unusable because of some underlying failure, such as disk full or socket closure; "failbit" should mean that the requested formatting wasn't possible because of some inconsistency such as negative widths. So typically if you clear badbit and try to output something else you'll fail again, but if you clear failbit and try to output something else you'll succeed.

In the case of the arithmetic inserters, since num_put cannot report failure by any means other than exceptions (in response to which the stream must set badbit, which prevents the kind of recoverable error reporting mentioned above), the only other detectable failure is if the iterator returned from num_put returns true from failed().

Since that can only happen (at least with the required iostream specializations) under such conditions as the underlying failure referred to above (e.g., disk full), setting badbit would seem to be the appropriate response (indeed, it is required in 27.6.2.5.2, p1). It follows that failbit can never be directly set by the arithmetic inserters (it can only be set by the sentry object under some unspecified conditions).

The situation is different for other formatted output functions which can fail as a result of the streambuf functions failing (they may do so by means other than exceptions), and which are then required to set failbit.

The contradiction, then, is that ostream::operator<<(int) will set badbit if the disk is full, while operator<<(ostream&, char) will set failbit under the same conditions. To make the behavior consistent, the Common requirements sections for the Formatted output functions should be changed as proposed below.
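The two cases can be compared with a stream buffer that rejects every character, standing in for a full disk. This is a sketch under that assumption; full_buf and the two function names are hypothetical. Because the bit set by the character inserter is the point in dispute, the second function reports only fail(), which is true when either failbit or badbit is set.

```cpp
#include <ostream>
#include <streambuf>

// Buffer with no put area whose overflow() always reports failure.
class full_buf : public std::streambuf {
protected:
    virtual int_type overflow(int_type) {
        return traits_type::eof();
    }
};

// Arithmetic inserter: 27.6.2.5.2, p1 requires badbit when the
// iterator returned by num_put reports failed().
inline bool int_insert_reports_bad() {
    full_buf buf;
    std::ostream os(&buf);
    os << 42;
    return os.bad();
}

// Character inserter: some error bit is set, but which one is the
// subject of this issue, so only fail() is examined.
inline bool char_insert_reports_failure() {
    full_buf buf;
    std::ostream os(&buf);
    os << 'x';
    return os.fail();
}
```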

Proposed resolution:

In paragraph one of section 27.6.2.5 [lib.ostream.formatted], delete the sentence beginning with "If the generation fails".

Rationale:

There isn't any contradiction here. put returns a streambuf iterator. failed() is a member function of the streambuf iterator. If it's set then that's a streambuf error, not a conversion error.

The real problem isn't that there's a contradiction, but that the "If the generation fails" part makes little sense. "Generation" isn't clearly defined. It's not clear what it means for generation to fail, or even whether it can fail. The intention is probably that generation meant formatting, as opposed to character insertion, and that this sentence was intended as analogous to character parsing.

A more precise definition would be that we set failbit if the facet reports failure. However, the mechanism for the facet reporting failure is that it sets failbit! Saying that we set failbit if the facet sets failbit would be silly, so the best thing to say is nothing.


395. inconsistencies in the definitions of rand() and random_shuffle()

Section: 26.5 [lib.c.math]  Status: Review  Submitter: James Kanze  Date: 3 Jan 2003

In 26.5 [lib.c.math], the C++ standard refers to the C standard for the definition of rand(); in the C standard, it is written that "The implementation shall behave as if no library function calls the rand function."

In 25.2.11 [lib.alg.random.shuffle], there is no specification as to how the two parameter version of the function generates its random value. I believe that all current implementations in fact call rand() (in contradiction with the requirement above); if an implementation does not call rand(), there is the question of how whatever random generator it does use is seeded. Something is missing.

Proposed resolution:

In [lib.c.math], add a paragraph specifying that the C definition of rand shall be modified to say that "Unless otherwise specified, the implementation shall behave as if no library function calls the rand function."

In [lib.alg.random.shuffle], add a sentence to the effect that "In the two argument form of the function, the underlying source of random numbers is implementation defined. [Note: in particular, an implementation is permitted to use rand.]"

Rationale:

The original proposed resolution proposed requiring the two-argument form of random_shuffle to use rand. We don't want to do that, because some existing implementations already use something else: gcc uses lrand48, for example. Using rand presents a problem if the number of elements in the sequence is greater than RAND_MAX.


396. what are characters zero and one

Section: 23.3.5.1 [lib.bitset.cons]  Status: Review  Submitter: Martin Sebor  Date: 5 Jan 2003

23.3.5.1, p6 [lib.bitset.cons] talks about a generic character having the value of 0 or 1 but there is no definition of what that means for charT other than char and wchar_t. And even for those two types, the values 0 and 1 are not actually what is intended -- the values '0' and '1' are. This, along with the converse problem in the description of to_string() in 23.3.5.2, p33, looks like a defect remotely related to DR 303.

http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/lwg-defects.html#303

23.3.5.1:
  -6-  An element of the constructed string has value zero if the
       corresponding character in str, beginning at position pos,
       is 0. Otherwise, the element has the value one.
    
23.3.5.2:
  -33-  Effects: Constructs a string object of the appropriate
        type and initializes it to a string of length N characters.
        Each character is determined by the value of its
        corresponding bit position in *this. Character position N
        - 1 corresponds to bit position zero. Subsequent decreasing
        character positions correspond to increasing bit positions.
        Bit value zero becomes the character 0, bit value one becomes
        the character 1.
    

Also note the typo in 23.3.5.1, p6: the object under construction is a bitset, not a string.
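For char strings the intended behavior, with the characters '0' and '1', can be seen in a short round trip. This is a sketch for the char case only; round_trip and value_of are hypothetical names, and the bitset width of 4 is arbitrary. The member template to_string is called with all three template arguments spelled out, as the current declaration requires.

```cpp
#include <bitset>
#include <memory>
#include <string>

// Constructs a bitset<4> from a string of '0' and '1' characters and
// converts it back to a string.
inline std::string round_trip(const std::string& digits) {
    std::bitset<4> b(digits);
    return b.to_string<char, std::char_traits<char>,
                       std::allocator<char> >();
}

// Numeric value of the same bitset: character position N - 1 (the
// last character) corresponds to bit position zero.
inline unsigned long value_of(const std::string& digits) {
    return std::bitset<4>(digits).to_ulong();
}
```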

Proposed resolution:

Change the constructor's function declaration immediately before 23.3.5.1 [lib.bitset.cons] p3 to:

    template <class charT, class traits, class Allocator>
    explicit
    bitset(const basic_string<charT, traits, Allocator>& str,
           typename basic_string<charT, traits, Allocator>::size_type pos = 0,
           typename basic_string<charT, traits, Allocator>::size_type n =
             basic_string<charT, traits, Allocator>::npos,
           charT zero = charT('0'));

Change the first two sentences of 23.3.5.1 [lib.bitset.cons] p6 to: "An element of the constructed string has value 0 if the corresponding character in str, beginning at position pos, is zero. Otherwise, the element has the value 1."

Change the declaration of the to_string member function immediately before 23.3.5.2 [lib.bitset.members] p33 to:

    template <class charT, class traits, class Allocator>
    basic_string<charT, traits, Allocator> to_string(charT zero = charT('0')) const;

Change the last sentence of 23.3.5.2 [lib.bitset.members] p33 to: "Bit value 0 becomes the character zero, bit value 1 becomes the character zero + 1."

Change 23.3.5.3 [lib.bitset.operators] p8 to:

Returns:

  os << x.template to_string<charT,traits,allocator<charT> >(
      use_facet<ctype<charT> >(os.getloc()).widen('0'))

Rationale:

There is a real problem here: we need the character values of '0' and '1', and we have no way to get them since strings don't have imbued locales. In principle the "right" solution would be to provide an extra object, either a ctype facet or a full locale, which would be used to widen '0' and '1'. However, there was some discomfort about using such a heavyweight mechanism. The proposed resolution allows those users who care about this issue to get it right.

Note that we only need one extra argument, because the character codes for digits are guaranteed to be contiguous.

We fix the inserter to use the new argument. Note that we already fixed the analogous problem with the extractor in issue 303.


397. ostream::sentry dtor throws exceptions

Section: 27.6.2.3 [lib.ostream::sentry]  Status: Open  Submitter: Martin Sebor  Date: 5 Jan 2003

17.4.4.8, p3 prohibits library dtors from throwing exceptions.

27.6.2.3, p4 says this about the ostream::sentry dtor:

    -4- If ((os.flags() & ios_base::unitbuf) && !uncaught_exception())
        is true, calls os.flush().
    

27.6.2.6, p7 that describes ostream::flush() says:

    -7- If rdbuf() is not a null pointer, calls rdbuf()->pubsync().
        If that function returns -1 calls setstate(badbit) (which
        may throw ios_base::failure (27.4.4.3)).
    

That seems like a defect, since both pubsync() and setstate() can throw an exception.
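The path from the sentry destructor to setstate() can be exercised with a buffer whose sync() always fails. This is a sketch with hypothetical names (unsyncable_buf, unitbuf_output_sets_badbit). The exception mask is deliberately left at its default so that setstate() records badbit without throwing; enabling exceptions for badbit is what would turn this into the throwing-destructor case the issue is about.

```cpp
#include <ostream>
#include <streambuf>

// Buffer that accepts every character but reports failure from sync().
class unsyncable_buf : public std::streambuf {
protected:
    virtual int_type overflow(int_type c) {
        return traits_type::not_eof(c);
    }
    virtual int sync() {
        return -1;
    }
};

// With unitbuf set, the flush performed when the sentry is destroyed
// calls pubsync(), which fails here and leads to setstate(badbit).
inline bool unitbuf_output_sets_badbit() {
    unsyncable_buf buf;
    std::ostream os(&buf);
    os.setf(std::ios_base::unitbuf);
    os << 'x';
    return os.bad();
}
```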

Proposed resolution:

[ The contradiction is real. Clause 17 says destructors may never throw exceptions, and clause 27 specifies a destructor that does throw. In principle we might change either one. We're leaning toward changing clause 17: putting in an "unless otherwise specified" clause, and then putting in a footnote saying the sentry destructor is the only one that can throw. ]


398. effects of end-of-file on unformatted input functions

Section: 27.6.2.3 [lib.ostream::sentry]  Status: Open  Submitter: Martin Sebor  Date: 5 Jan 2003

While reviewing unformatted input member functions of istream for their behavior when they encounter end-of-file during input I found that the requirements vary, sometimes unexpectedly, and in more than one case even contradict established practice (GNU libstdc++ 3.2, IBM VAC++ 6.0, STLport 4.5, SunPro 5.3, HP aCC 5.38, Rogue Wave libstd 3.1, and Classic Iostreams).

The following unformatted input member functions set eofbit if they encounter an end-of-file (this is the expected behavior, and also the behavior of all major implementations):

    basic_istream<charT, traits>&
    get (char_type*, streamsize, char_type);
    

Also sets failbit if it fails to extract any characters.

    basic_istream<charT, traits>&
    get (char_type*, streamsize);
    

Also sets failbit if it fails to extract any characters.

    basic_istream<charT, traits>&
    getline (char_type*, streamsize, char_type);
    

Also sets failbit if it fails to extract any characters.

    basic_istream<charT, traits>&
    getline (char_type*, streamsize);
    

Also sets failbit if it fails to extract any characters.

    basic_istream<charT, traits>&
    ignore (int, int_type);
    

    basic_istream<charT, traits>&
    read (char_type*, streamsize);
    

Also sets failbit if it encounters end-of-file.

    streamsize readsome (char_type*, streamsize);
    

The following unformatted input member functions set failbit but not eofbit if they encounter an end-of-file (I find this odd since the functions make it impossible to distinguish a general failure from a failure due to end-of-file; the requirement is also in conflict with all major implementations, which set both eofbit and failbit):

    int_type get();
    

    basic_istream<charT, traits>&
    get (char_type&);
    

These functions only set failbit if they extract no characters, otherwise they don't set any bits, even on failure (I find this inconsistency quite unexpected; the requirement is also in conflict with all major implementations, which set eofbit whenever they encounter end-of-file):

    basic_istream<charT, traits>&
    get (basic_streambuf<charT, traits>&, char_type);
    

    basic_istream<charT, traits>&
    get (basic_streambuf<charT, traits>&);
    

This function sets no bits (all implementations except for STLport and Classic Iostreams set eofbit when they encounter end-of-file):

    int_type peek ();
    

Proposed resolution:

Informally, what we want is a global statement of intent saying that eofbit gets set if we trip across EOF, and then we can take away the specific wording for individual functions. A full review is necessary. The wording currently in the standard is a mishmash, and changing it on an individual basis wouldn't make things better. Dietmar will do this work.
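
The differences described above can be observed on a given implementation with a small probe. The following is only a sketch, assuming a C++11 or later compiler; StreamState and the two helper functions are hypothetical names introduced for illustration. Which bits get() sets at end-of-file depends on the implementation under test, which is the point of the issue.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Snapshot of a stream's state after one unformatted input call.
// (Hypothetical helper type, not part of the standard library.)
struct StreamState {
    bool eof;               // eofbit is set
    bool fail;              // failbit or badbit is set
    std::streamsize count;  // value of gcount() after the call
};

// Calls the single-character get() once and reports the resulting state.
inline StreamState state_after_get(std::istream& in) {
    in.get();
    return StreamState{in.eof(), in.fail(), in.gcount()};
}

// Calls read() for n characters and reports the resulting state.
inline StreamState state_after_read(std::istream& in, std::streamsize n) {
    std::string buffer(static_cast<std::size_t>(n), '\0');
    in.read(&buffer[0], n);
    return StreamState{in.eof(), in.fail(), in.gcount()};
}
```

For example, calling state_after_read on a std::istringstream holding "ab" with n equal to 4 is expected to report both eof and fail with a count of 2, since read() is specified to set failbit and eofbit at end-of-file. Calling state_after_get on an empty stream shows whether the implementation sets eofbit in addition to failbit.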


400. redundant type cast in lib.allocator.members

Section: 20.4.1.1 [lib.allocator.members]  Status: Ready  Submitter: Markus Mauhart  Date: 27 Feb 2003

20.4.1.1 [lib.allocator.members] allocator members, contains the following 3 lines:

  12 Returns: new((void *) p) T( val)
     void destroy(pointer p);
  13 Returns: ((T*) p)->~T()

The type cast "(T*) p" in the last line is redundant because we know that std::allocator<T>::pointer is a typedef for T*.

Proposed resolution:

Replace "((T*) p)" with "p".

Rationale:

Just a typo, this is really editorial.
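
The claim that the cast is redundant can be checked mechanically. The sketch below assumes a C++11 or later compiler and uses std::allocator_traits, which is not part of the 1998 standard this issue refers to; Counted and destroy_without_cast are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <type_traits>

// For the default allocator, the pointer type is plain T*.
static_assert(
    std::is_same<std::allocator_traits<std::allocator<int>>::pointer,
                 int*>::value,
    "std::allocator<int> uses int* as its pointer type");

// Hypothetical type that counts destructor calls.
struct Counted {
    static int destroyed;
    ~Counted() { ++destroyed; }
};
int Counted::destroyed = 0;

// Destroys *p without any cast, in the spirit of the proposed "p->~T()".
template <class T>
void destroy_without_cast(T* p) {
    p->~T();
}
```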


401. incorrect type casts in table 32 in lib.allocator.requirements

Section: 20.1.5 [lib.allocator.requirements]  Status: New  Submitter: Markus Mauhart  Date: 27 Feb 2003

I think that in par2 of 20.1.5 [lib.allocator.requirements] the last two lines of table 32 contain two incorrect type casts. The lines are ...

  a.construct(p,t)   Effect: new((void*)p) T(t)
  a.destroy(p)       Effect: ((T*)p)->~T()

... with the prerequisites coming from the preceding two paragraphs, especially from table 31:

  alloc<T>             a     ;// an allocator for T
  alloc<T>::pointer    p     ;// random access iterator
                              // (may be different from T*)
  alloc<T>::reference  r = *p;// T&
  T const&             t     ;

For these two type casts ("(void*)p" and "(T*)p") to be well-formed, every alloc<T>::pointer would have to be convertible to T* and to void*. That would implicitly introduce extra requirements on alloc<T>::pointer, in addition to the only current requirement (being a random access iterator).

Proposed resolution:

"(void*)p" should be replaced with "(void*)&*p", and "((T*)p)->" should be replaced with "(*p)." or with "(&*p)->".

Note: Actually I would prefer to replace "((T*)p)->dtor_name" with "p->dtor_name", but as far as I can see this is not possible because of an omission in 13.5.6 [over.ref] (for which I have filed another DR on 29.11.2002).
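
To illustrate why the casts matter, consider a pointer-like type that offers dereferencing but no conversion to T* or void*. This is a minimal sketch, not wording from the standard: fancy_ptr, Tracked, construct_via_deref and destroy_via_deref are hypothetical names, and the pointer type is reduced to the operations the example needs rather than a full random access iterator.

```cpp
#include <cassert>
#include <new>

// Hypothetical pointer-like type with no conversion to T* or void*.
template <class T>
class fancy_ptr {
public:
    explicit fancy_ptr(T* raw) : raw_(raw) {}
    T& operator*() const { return *raw_; }
    T* operator->() const { return raw_; }
private:
    T* raw_;
};

// Hypothetical element type that records destructor calls.
struct Tracked {
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) {}
    ~Tracked() { ++destroyed; }
    int value;
    static int destroyed;
};
int Tracked::destroyed = 0;

// construct following the proposed wording: (void*)&*p instead of (void*)p.
template <class T>
void construct_via_deref(fancy_ptr<T> p, const T& t) {
    // static_cast<void*>(p) would not compile: fancy_ptr has no such conversion.
    new (static_cast<void*>(&*p)) T(t);
}

// destroy following the proposed wording: (*p). instead of ((T*)p)->.
template <class T>
void destroy_via_deref(fancy_ptr<T> p) {
    (*p).~T();
}
```

With the casts as currently written in table 32, neither function would compile for fancy_ptr; with the proposed forms, only operator* is required of the pointer type.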


402. wrong new expression in [some_]allocator::construct

Section: 20.1.5 [lib.allocator.requirements], 20.4.1.1 [lib.allocator.members]  Status: New  Submitter: Markus Mauhart  Date: 27 Feb 2003

This applies to the new expression that is contained in both par12 of 20.4.1.1 [lib.allocator.members] and in par2 (table 32) of 20.1.5 [lib.allocator.requirements]. I think this new expression is wrong, involving unintended side effects.

20.4.1.1 [lib.allocator.members] contains the following 3 lines:

  11 Returns: the largest value N for which the call allocate(N,0) might succeed.
     void construct(pointer p, const_reference val);
  12 Returns: new((void *) p) T( val)

20.1.5 [lib.allocator.requirements] in table 32 has the following line:

  a.construct(p,t)   Effect: new((void*)p) T(t)

... with the prerequisites coming from the preceding two paragraphs, especially from table 31:

  alloc<T>             a     ;// an allocator for T
  alloc<T>::pointer    p     ;// random access iterator
                              // (may be different from T*)
  alloc<T>::reference  r = *p;// T&
  T const&             t     ;

Because the expression uses "new" rather than "::new", any existing "T::operator new" function will hide the global placement new function. When T declares an operator new but none with an adequate signature, every_alloc<T>::construct(..) is ill-formed, and most std::container<T,every_alloc<T>> use it; a workaround would be to add placement new and delete functions with adequate signature and semantics to class T, but class T might come from another party. Maybe even worse is the case when T has placement new and delete functions with an adequate signature but with "unknown" semantics: I don't like to speculate about it, but whoever implements any_container<T,any_alloc> and wants to use construct(..) probably must think about it.

Proposed resolution:

Therefore I think that "new" should be replaced with "::new" in both cases.
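
The hiding effect can be reproduced with a class that declares only the ordinary form of operator new. The following is a sketch under that assumption; Pooled and construct_global are hypothetical names, and the unqualified form is left in a comment because it does not compile for such a class.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class with a class-specific allocation function.
// Declaring any Pooled::operator new hides the global placement form
// for unqualified new-expressions that create a Pooled.
struct Pooled {
    explicit Pooled(int v) : value(v) {}
    int value;

    static void* operator new(std::size_t size) { return ::operator new(size); }
    static void operator delete(void* p) { ::operator delete(p); }
};

// Constructs a copy of t in the storage at p using the global placement new.
template <class T>
T* construct_global(void* p, const T& t) {
    // new (p) T(t);        // ill-formed for Pooled: lookup finds only
    //                      // Pooled::operator new(std::size_t)
    return ::new (p) T(t);  // always selects the global placement form
}
```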


403. basic_string::swap should not throw exceptions

Section: 21.3.5.8 [lib.string::swap]  Status: New  Submitter: Beman Dawes  Date: 25 Mar 2003

std::basic_string, 21.3 [lib.basic.string] paragraph 2 says that basic_string "conforms to the requirements of a Sequence, as specified in (23.1.1)." The sequence requirements specified in (23.1.1) do not include any prohibition on swap members throwing exceptions.

Section 23.1 [lib.container.requirements] paragraph 10 does limit conditions under which exceptions may be thrown, but applies only to "all container types defined in this clause" and so excludes basic_string::swap because it is defined elsewhere.

Eric Niebler points out that 21.3 [lib.basic.string] paragraph 5 explicitly permits basic_string::swap to invalidate iterators, which is disallowed by 23.1 [lib.container.requirements] paragraph 10. Thus the standard would be contradictory if it were read or extended to read as having basic_string meet 23.1 [lib.container.requirements] paragraph 10 requirements.

Yet several LWG members have expressed the belief that the original intent was that basic_string::swap should not throw exceptions as specified by 23.1 [lib.container.requirements] paragraph 10, and that the standard is unclear on this issue. The complexity of basic_string::swap is specified as "constant time", indicating the intent was to avoid copying (which could cause a bad_alloc or other exception). An important use of swap is to ensure that exceptions are not thrown in exception-safe code.

Note: There remains a long-standing concern over whether or not it is possible to reasonably meet the 23.1 [lib.container.requirements] paragraph 10 swap requirements when allocators are unequal. The specification of basic_string::swap exception requirements is in no way intended to address, prejudice, or otherwise impact that concern.

Proposed resolution:

In 21.3.5.8 [lib.string::swap], add a throws clause:

Throws: Shall not throw exceptions.
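
The use of swap in exception-safe code that motivates this issue can be sketched as follows. Name and set_name are hypothetical names, and the example assumes a C++11 or later compiler; the strong guarantee it aims for holds only if basic_string::swap does not throw, which is exactly what the proposed Throws clause would require.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical record with two string fields that must change together.
struct Name {
    std::string first;
    std::string last;
};

// Replaces both fields or neither. Everything that may throw runs on
// temporaries; the final swaps are the commit step and rely on
// basic_string::swap not throwing.
inline void set_name(Name& n, const std::string& first,
                     const std::string& last) {
    if (first.empty() || last.empty())
        throw std::invalid_argument("empty name");
    std::string new_first(first);  // may throw bad_alloc
    std::string new_last(last);    // may throw bad_alloc
    n.first.swap(new_first);       // commit: assumed not to throw
    n.last.swap(new_last);         // commit: assumed not to throw
}
```

If the second swap were allowed to throw, the object could be left with a new first name and an old last name, and the caller would have no way to roll back.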


404. May a replacement allocation function be declared inline?

Section: 17.4.3.4 [lib.replacement.functions], 18.4.1 [lib.new.delete]  Status: New  Submitter: Matt Austern  Date: 24 Apr 2003

The eight basic dynamic memory allocation functions (single-object and array versions of ::operator new and ::operator delete, in the ordinary and nothrow forms) are replaceable. A C++ program may provide an alternative definition for any of them, which will be used in preference to the implementation's definition.

Three different parts of the standard mention requirements on replacement functions: 17.4.3.4 [lib.replacement.functions], 18.4.1.1 [lib.new.delete.single] and 18.4.1.2 [lib.new.delete.array], and 3.7.3 [basic.stc.dynamic].

None of these three places says whether a replacement function may be declared inline. 18.4.1.1 [lib.new.delete.single] paragraph 2 specifies a signature for the replacement function, but that's not enough: the inline specifier is not part of a function's signature. One might also reason from 7.1.2 [dcl.fct.spec] paragraph 2, which requires that "an inline function shall be defined in every translation unit in which it is used," but this may not be quite specific enough either. We should either explicitly allow or explicitly forbid inline replacement memory allocation functions.

Proposed resolution:

Add a new sentence to the end of 17.4.3.4 [lib.replacement.functions] paragraph 3: "The program's definitions shall not be specified as inline."

Rationale:

The fact that inline isn't mentioned appears to have been nothing more than an oversight. Existing implementations do not permit inline functions as replacement memory allocation functions. Providing this functionality would be difficult in some cases, and is believed to be of limited value.
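
For reference, a replacement that satisfies the proposed wording looks like an ordinary non-inline definition at namespace scope. This is a minimal sketch, assuming a C++11 or later compiler: g_allocations is a hypothetical counter added for illustration, and the sketch does not loop over the installed new_handler as a complete replacement would.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter of calls to the replacement allocation function.
static std::size_t g_allocations = 0;

// Replacement for the single-object ::operator new. It is an ordinary
// non-inline function, as the proposed resolution requires.
void* operator new(std::size_t size) {
    ++g_allocations;
    if (size == 0) size = 1;  // malloc(0) may return a null pointer
    if (void* p = std::malloc(size)) return p;
    throw std::bad_alloc();   // simplified: the new_handler is not consulted
}

// Matching replacement for the single-object ::operator delete.
void operator delete(void* p) noexcept {
    std::free(p);
}
```

A direct call such as ::operator new(16) goes through the replacement and increments the counter. Marking either definition inline is what the proposed sentence would prohibit.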

----- End of document -----