Doc. no.   WG21/N1975=06-0045
Date:        2006-04-04
Project:     Programming Language C++
Reply to:   Beman Dawes <bdawes@acm.org>

Filesystem Library Proposal for TR2 (Revision 3)

Table of Contents

Introduction
Motivation and Scope
Impact on the Standard
Important Design Decisions
Proposed Text for TR2
    Introductory chapter
    Diagnostics library chapter
    Filesystem library chapter
        Definitions
        Requirements
            Requirements on programs
            Requirements on implementations
        Header <filesystem> synopsis
        Path traits
        Class template basic_path
            Pathname formats
            Pathname grammar
            Filename conversion
            Requirements
            basic_path constructors
            basic_path assignments
            basic_path modifiers
            basic_path operators
            basic_path observers
            basic_path iterators
            basic_path non-member functions
            basic_path inserter and extractor
        Class template basic_filesystem_error
            basic_filesystem_error constructors
            basic_filesystem_error observers
        Class template basic_directory_entry
            basic_directory_entry constructors
            basic_directory_entry modifiers
            basic_directory_entry observers
            basic_directory_entry comparisons
        Class template basic_directory_iterator
            basic_directory_iterator constructors
        Class template basic_recursive_directory_iterator
        Class file_status
        Non-member operational functions
            Status functions
            Predicate functions
            Attribute functions
            Other operations functions
            Convenience functions
        Additions to header <cerrno>
        Additions to header <fstream>
Suggestions for <fstream> implementations
Path decomposition table
Issues
Acknowledgements
References
Revision History

Introduction

This paper proposes addition of a filesystem library component to the C++ Standard Library Technical Report 2. The proposal is based on the Boost Filesystem Library (see www.boost.org/libs/filesystem).

The library provides portable facilities to query and manipulate paths, files, and directories. The Boost version of the library is widely used. It would be a pure addition to the C++ standard, leaving in place existing standard library functionality in the relatively few areas where there is overlap.

Users say they prefer the Boost Filesystem Library interface to native operating system or POSIX APIs, even in code without portability requirements, because the design follows modern C++ practice.

The proposed text includes an example of a program using the library.

Motivation and Scope

Why is this important?

The motivation for the library is the desire to perform safe, portable, script-like filesystem operations from within C++ programs. Because the C++ Standard Library currently contains no facilities for such filesystem tasks as directory iteration or directory creation, programmers currently must rely on operating system specific interfaces, making it difficult to write portable programs.

The intent is not to compete with Python, Perl, or shell scripting languages, but rather to provide file system operations where C++ is already the language of choice. The design encourages, but does not require, safe and portable usage.

What kinds of problems does it address, and what kinds of programmers is it intended to support?

The library addresses everyday needs, for both application programs and libraries. It is useful across every application domain that uses files. It is intended to be useful to all levels of programmers, from rank beginners to seasoned experts.

Is it based on existing practice?

Yes, very much so. The proposal is based on the Boost Filesystem Library, which has been in use since 2002 and by now is in very wide use.

Note, however, that until recently all the Boost experience was with a narrow-character only version of the library. The internationalized version as described in this proposal is just starting to be used, and will not be fully released until Boost release 1.34.

The underlying mechanisms have been in use for decades on the world's most widespread operating systems, such as POSIX, Windows, and various mainframe operating systems. What this proposal brings to the table is an approach that is C++ Standard Library friendly and fully internationalized.

Is there a reference implementation?

Yes. The Boost Filesystem Library is freely and publicly available. The Boost library will track the TR2 proposed library as the proposal evolves.

Impact on the Standard

What does it depend on, and what depends on it?

It depends on some standard library components, such as basic_string. No other proposals depend on it.

If a revision to the Code Conversion Proposal (See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1683.html) is accepted, it may be advantageous for the Filesystem Library to use that library rather than the current code conversion facilities proposed below.

Is it a pure extension, or does it require changes to standard components?

Most of the proposed library is a pure extension.

There are additions to header <cerrno>. Since the critical portions that might require change to C headers (always a sore point) are already mandated for POSIX compliance, and represent existing practice for many non-POSIX operating systems such as Windows, it is not expected that they will cause any problems.

There are additions to header <fstream>.  These have been carefully specified to avoid breaking existing code in common operating environments such as POSIX, Windows, and OpenVMS. See Suggestions for <fstream> implementations for techniques to avoid breaking existing code in other environments, particularly on operating systems allowing slashes in filenames.

Can it be implemented using today's compilers, or does it require language features that will only be available as part of C++0x?

It can be (and has been) implemented with today's compilers.

There is one minor function that can best be implemented by an addition to current C++ runtime libraries, although an acceptable workaround is documented.

On operating systems with built-in support for wide-character file names, such as Windows, high-quality implementation of the header <fstream> additions requires an addition to the C++ Standard Library implementation. The addition is relatively small and localized, and is already supplied by the most recent Dinkumware implementation of the Standard Library. There is a workaround that avoids modifying the standard library, but it is very much a hack and depends on a Windows feature (8.3 filename support) which some users disable, thereby disabling the workaround. The issue doesn't affect implementations on operating systems which only support narrow character file names.

Important Design Decisions

Why did you choose the specific design that you did?

Many of the specific design decisions were driven by the desire to provide a modern C++ interface that works well with the C++ Standard Library. The intent is that Standard Library users can become comfortable with the Filesystem Library in very short order.

The proposed library encourages both syntactic and semantic portability, yet does not force implementors into heroic efforts on hopeless systems. This balances the benefits to users of both code and knowledge portability with the realities faced by implementors on some operating systems.

In some cases users always need portable semantics. In some cases users always need platform specific semantics. In some cases users need to be able to choose between portable and platform specific semantics. The design evolved over a period of years to identify and meet each of those needs.

Because of the desire to support simple "script-like" usage, use cases often drove design choices. For example, users can write if (exists("foo")) rather than the lengthier if (exists(path("foo"))).

Because filesystem operations often encounter unexpected runtime errors, the library by default reports runtime errors via C++ exceptions, and ensures enough information is provided for meaningful error messages, including internationalized error messages.

What alternatives did you consider, and what are the tradeoffs?

Additional observers and modifiers for file system attributes. Attribute functions which cannot supply portable semantics are not provided, avoiding the illusion of portability in cases where it cannot in fact exist.

A larger number of operational convenience functions. Convenience functions (functions which can be portably created by composition from basic functions) were not provided unless there was widespread agreement on usefulness and need.

Compile-time or run-time options for operational functions. Numerous trial implementations were abandoned because the added complexity outweighed the benefits, and because consensus could not be reached on the feature set.

Automatic path name checking. This feature, supplied by the Boost library for several years, allowed users to specify both default and per-constructor path name checking, and thus allowed the desired degree of portability to be automatically enforced. This implicit name checking was abandoned because of user confusion and complaints of excessive nannyism.

Separate path types for regular file and directory pathnames. Pathname formats that use different syntax for regular pathnames versus directory pathnames are passing into extinction. Why prolong the agony at the cost of torturing those using modern systems? It is perhaps significant that one of the few web sites dedicated to preserving a dual pathname format operating system is named Deathrow (http://deathrow.vistech.net/).

Single path type which can at runtime accept narrow or wide character pathnames. Although certainly interesting, and possibly superior, such a design would not interoperate well with the current Standard Library's compile-time typed basic_string. A new runtime polymorphic string class would be the best place to experiment with this concept, not a path class.

What are the consequences of your choices, for users and implementors?

The design has evolved over a period of four years of actual experience by Boost users, and the most frequent causes of user complaints (such as enforced name-checking and several over-strict preconditions) were eliminated. The TR process will allow further refinement. The intent is to ensure user needs are met.

Because the Boost implementation is tested and used in a wide range of POSIX and Windows environments, many implementation concerns have already been addressed.

What decisions are left up to implementors?

Because implementations of the library are dependent on facilities of the underlying operating system, implementors are given unusual freedom to redefine semantics of the library. That being said, implementors are given strong normative encouragement to provide the TR described semantics whenever feasible.

If there are any similar libraries in use, how do their design decisions compare to yours?

There are a number of libraries which address the problem domain. Most of the C/C++ libraries have C, rather than C++ interfaces. For example, see the Apache Portable Runtime Project (http://apr.apache.org). The ACE toolkit (http://www.cs.wustl.edu/~schmidt/ACE.html) uses a C++ approach, but doesn't mesh well with the C++ Standard Library. For example, the ACE directory iterator differs greatly from Standard Library iterator requirements.

Proposed Text for Technical Report 2

Gray-shaded italic text is commentary on the proposal. It is not to be added to the TR.

Italic text is editorial guidance. It is not to be added to the TR.

Add to the introductory section of the TR:

The following standard contains provisions which, through reference in this text, constitute provisions of this Technical Report. At the time of publication, the editions indicated were valid. All standards are subject to revision, and parties to agreements based on this Technical Report are encouraged to investigate the possibility of applying the most recent editions of the standard indicated below. Members of IEC and ISO maintain registers of currently valid International Standards.

ISO/IEC 9945:2003, with the indicated corrections, is hereinafter called POSIX.

Some library behavior in this Technical Report is defined by reference to POSIX. How such behavior is actually implemented is unspecified.

[Note: This constitutes an "as if" rule for implementation of operating system dependent behavior. Presumably implementations will usually call native operating system APIs. --end note]

Implementations are encouraged, but not required, to support such behavior as it is defined by POSIX. Implementations shall document any behavior that differs from the POSIX defined behavior. Implementations that do not support exact POSIX behavior are encouraged to provide behavior as close to POSIX behavior as is reasonable given the limitations of actual operating systems. If an implementation cannot provide any reasonable behavior, the implementation shall report an error in an implementation-defined manner.

[Note: Such errors might be reported by an #error directive, a static_assert, a basic_filesystem_error exception, a special return value, or some other manner. --end note]

Footnote 1: POSIX® is a registered trademark of The IEEE.

Footnote 2: UNIX® is a registered trademark of The Open Group.

Add a new clause to the TR:


Chapter (tbs) - Diagnostics library


This clause describes components that C++ programs may use to detect and report error conditions.

Header <system_error>

namespace std
{
  namespace tr2
  {
    namespace sys
    {
      class error_code;
      class system_error;

      typedef implementation-defined system_error_type;
      typedef int errno_type; // determined by C standard

      system_error_type system_code(errno_type err);

      errno_type iso_code(system_error_type err);

      std::string& system_message(error_code err, std::string& target);
      std::wstring& system_message(error_code err, std::wstring& target);

      enum iso_t { iso };

    } // namespace sys
  } // namespace tr2
} // namespace std

Type system_error_type is the implementation-defined type used by the operating system to report error codes.

[Note: On POSIX, system_error_type is normally int. On Windows it is normally unsigned int. This type might differ if the implementation is built on an emulation layer such as Cygwin. -- end note]

system_error_type system_code(errno_type err);

Returns: A system_error_type value corresponding to err.

[Note: There is no guarantee that for a value err of type errno_type, err == iso_code( system_code(err) ). -- end note]

errno_type iso_code(system_error_type err);

Returns: An errno_type value corresponding to err.

[Note: There is no guarantee that for a value err of type system_error_type, err == system_code( iso_code(err) ). -- end note]

std::string& system_message(error_code err, std::string& target);
std::wstring& system_message(error_code err, std::wstring& target);

Effects: Appends to target an operating system specific and locale specific message corresponding to err.system().

Returns: target.

Remarks: Implementors and users are permitted to supply additional overloads in namespace std::tr2::sys.

Class error_code

namespace std
{
  namespace tr2
  {
    namespace sys
    {
      class error_code
      {
      public:
        error_code();
        error_code(system_error_type err);
        error_code(errno_type err, iso_t);

        system_error_type system() const;
        void system(system_error_type err);

        errno_type iso() const; 
        void iso(errno_type err);

        bool error() const;

        bool operator==(error_code rhs) const;
      };
    } // namespace sys
  } // namespace tr2
} // namespace std

The class error_code defines the type of objects used to identify specific errors originating from the operating system.

error_code();

Postcondition: !error() && iso()==0, and system() returns the value used by the operating system to represent not an error.

error_code(system_error_type err);

Postcondition: system()==err.

error_code(errno_type err, iso_t);

Postcondition: iso()==err.

system_error_type system() const;

Returns: If the most recent non-const function called, or the constructor if no non-const function has been called, had an err argument of type system_error_type, then return that argument. Otherwise, return system_code(iso()).

void system(system_error_type err);

Postcondition: system()==err.

errno_type iso() const;

Returns: If the most recent non-const function called, or the constructor if no non-const function has been called, had an err argument of type errno_type, then return that argument. Otherwise, return iso_code(system()).

void iso(errno_type err);

Postcondition: iso()==err.

bool error() const;

Returns: system() != x, where x is the value used by the operating system to represent not an error.

bool operator==(error_code rhs) const;

Returns: system()==rhs.system().
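The observer and modifier rules above can be sketched as a small class that remembers which of the two representations was set last. This is only an illustration under stated assumptions, not the proposed implementation: it assumes a Windows-like platform where system_error_type is unsigned int and 0 means "not an error", the system_code() and iso_code() mapping tables are hypothetical and cover only three codes, and error() is taken to mean that system() differs from the not-an-error value, consistent with the default constructor's postcondition.

```cpp
#include <cassert>
#include <cerrno>

namespace sketch {

typedef unsigned int system_error_type;  // assumption: Windows-like platform
typedef int errno_type;
enum iso_t { iso };

const system_error_type no_error = 0u;   // assumed "not an error" value

// Hypothetical mapping from operating system codes to <cerrno> values.
inline errno_type iso_code(system_error_type err)
{
  switch (err) {
    case 0u: return 0;
    case 2u:                 // file not found
    case 3u:                 // path not found
      return ENOENT;
    case 5u: return EACCES;  // access denied
    default: return EINVAL;  // hypothetical catch-all
  }
}

// Hypothetical mapping from <cerrno> values to operating system codes.
inline system_error_type system_code(errno_type err)
{
  switch (err) {
    case 0:      return no_error;
    case ENOENT: return 2u;
    case EACCES: return 5u;
    default:     return 87u;  // hypothetical catch-all
  }
}

class error_code
{
public:
  error_code() : m_sys(no_error), m_iso(0), m_iso_set_last(false) {}
  error_code(system_error_type err) : m_sys(err), m_iso(0), m_iso_set_last(false) {}
  error_code(errno_type err, iso_t) : m_sys(no_error), m_iso(err), m_iso_set_last(true) {}

  // Return the stored value if it was set last, otherwise convert the other one.
  system_error_type system() const { return m_iso_set_last ? system_code(m_iso) : m_sys; }
  void system(system_error_type err) { m_sys = err; m_iso_set_last = false; }

  errno_type iso() const { return m_iso_set_last ? m_iso : iso_code(m_sys); }
  void iso(errno_type err) { m_iso = err; m_iso_set_last = true; }

  bool error() const { return system() != no_error; }
  bool operator==(error_code rhs) const { return system() == rhs.system(); }

private:
  system_error_type m_sys;
  errno_type m_iso;
  bool m_iso_set_last;  // true if the errno-style value is authoritative
};

inline void error_code_demo()
{
  error_code ok;
  assert(!ok.error() && ok.iso() == 0);

  error_code ec(3u);                  // "path not found" in this sketch
  assert(ec.error() && ec.system() == 3u);
  assert(ec.iso() == ENOENT);

  ec.iso(ENOENT);                     // the errno-style value is now authoritative
  assert(ec.system() == 2u);          // 3 -> ENOENT -> 2: no round-trip guarantee
}

} // namespace sketch
```

The last assertion in error_code_demo() illustrates the earlier notes on system_code() and iso_code(): because several operating system codes may map to one errno value, converting back and forth need not return the original code.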

Class system_error

namespace std
{
  namespace tr2
  {
    namespace sys
    {
      class system_error : public std::runtime_error
      {
      public:
        system_error(const std::string & what_arg, error_code ec);

        error_code code() const;
        const char * what() const;
      };
    } // namespace sys
  } // namespace tr2
} // namespace std

The class system_error defines the type of objects thrown as exceptions to report errors originating from the operating system.

system_error(const std::string & what_arg, error_code ec);

Effects: Constructs an object of class system_error.

Postcondition: strcmp(runtime_error::what(), what_arg.c_str()) == 0 && code() == ec.

error_code code() const;

Returns: the ec constructor argument.

const char * what() const;

Returns: A string containing runtime_error::what() and the result of calling system_message() with a first argument of code(). The exact format is unspecified.
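One way to satisfy the system_error specification is to build the combined message once, in the constructor. The following is a minimal sketch under stated assumptions: error_code is reduced to a hypothetical int wrapper, system_message() is stood in for by std::strerror, and the "what_arg: message" format is just one choice, since the exact format is unspecified.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>

namespace sketch {

// Hypothetical stand-in for error_code: only system() and == are modeled.
struct error_code
{
  int value;
  explicit error_code(int v = 0) : value(v) {}
  int system() const { return value; }
  bool operator==(error_code rhs) const { return value == rhs.value; }
};

// Stand-in for system_message(): appends the C library's message text.
inline std::string& system_message(error_code err, std::string& target)
{
  target += std::strerror(err.system());
  return target;
}

class system_error : public std::runtime_error
{
public:
  system_error(const std::string& what_arg, error_code ec)
    : std::runtime_error(what_arg), m_ec(ec), m_what(what_arg)
  {
    m_what += ": ";                 // separator chosen for this sketch only
    system_message(m_ec, m_what);
  }

  error_code code() const { return m_ec; }

  // Combined text: runtime_error::what() plus the system message.
  const char* what() const noexcept override { return m_what.c_str(); }

private:
  error_code m_ec;
  std::string m_what;
};

inline void system_error_demo()
{
  try {
    throw system_error("open failed", error_code(ENOENT));
  }
  catch (const system_error& e) {
    // The base class still reports exactly what_arg.
    assert(std::strcmp(e.std::runtime_error::what(), "open failed") == 0);
    assert(e.code() == error_code(ENOENT));
    // The derived what() starts with what_arg and adds the system message.
    assert(std::string(e.what()).find("open failed: ") == 0);
  }
}

} // namespace sketch
```

Building the string eagerly keeps what() free of allocation, at the cost of doing the formatting even when the message is never read.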

Add a new clause to the TR:


Chapter (tbs) - Filesystem library


This clause describes components that C++ programs may use to interrogate and manipulate files (including directories), and certain of their attributes.

This clause applies only to hosted implementations (C++ Std, 1.4, Implementation compliance [intro.compliance]).

[Note: This clause applies to any hosted implementation. Specific operating systems such as OpenVMS (Footnote 3), UNIX, and Windows (Footnote 4) are mentioned only for purposes of illustration or to give guidance to implementors. No slight to other operating systems is implied or intended. --end note]

Unless otherwise specified, all components described in this clause are declared in namespace std::tr2::sys.

[Note: The sys sub-namespace prevents collisions with names already in the standard library and emphasizes reliance on the operating system dependent behavior inherent in file system operations. -- end note]

The Effects and Postconditions of functions described in this clause may not be achieved in the presence of race conditions. No diagnostic is required.

If the possibility of race conditions makes it unreliable for a program to test for a precondition before calling a function described in this clause, Requires is not specified for the condition. Instead, the condition is specified as a Throws condition.

[Note: As a design practice, preconditions are not specified when it is unreasonable for a program to detect them prior to calling the function. -- end note]

Footnote 3: OpenVMS® is a registered trademark of Hewlett-Packard Development Company.

Footnote 4: Windows® is a registered trademark of Microsoft Corporation.

Definitions

The following definitions shall apply to this clause:

File: An object that can be written to, or read from, or both. A file has certain attributes, including type. File types include regular file, symbolic link, and directory. Other types of files may be supported by the implementation.

File system: A collection of files and certain of their attributes.

Filename: The name of a file. The format is as specified by the POSIX Filename base definition.

Path: A sequence of elements which identify a location within a filesystem. The elements are the root-name, root-directory, and each successive filename. See Pathname grammar.

Pathname: A character string that represents a path.

Link: A directory entry object that associates a filename with a file. On some file systems, several directory entries can associate names with the same file.

Hard link: A link to an existing file. Some file systems support multiple hard links to a file. If the last hard link to a file is removed, the file itself is removed.

[Note: A hard link can be thought of as a shared-ownership smart pointer to a file. -- end note]

Symbolic link: A type of file with the property that when the file is encountered during pathname resolution, a string stored by the file is used to modify the pathname resolution.

[Note: A symbolic link can be thought of as a raw pointer to a file. If the file pointed to does not exist, the symbolic link is said to be a "dangling" symbolic link. -- end note]

Slash: The character '/', also known as solidus.

Dot: The character '.', also known as period.

Race condition: The condition that occurs when multiple threads, processes, or computers interleave access and modification of the same object within a file system.

Requirements

Requirements on programs

The arguments for template parameters named Path, Path1, or Path2 described in this clause shall be of type basic_path, or a class derived from basic_path, unless otherwise specified.

Requirements on implementations

Some function templates described in this clause have a template parameter named Path, Path1, or Path2. When called with a function argument s of type char* or std::string, the implementation shall treat the argument as if it were coded path(s). When called with a function argument s of type wchar_t* or std::wstring, the implementation shall treat the argument as if it were coded wpath(s). For functions with two arguments, implementations shall not supply this treatment when Path1 and Path2 are different types.

[Note: This "do-the-right-thing" rule allows users to write exists("foo"), taking advantage of class basic_path's string conversion constructor, rather than the lengthier and more error-prone exists(path("foo")). This is particularly important for the simple, script-like programs which are an important use case for the library. Calling two-argument functions with different types is a very rare usage, and may well be a coding error, so automatic conversion is not supported for such cases.

The implementation technique is unspecified. One possible implementation technique, using exists() as an example, is:

template <class Path>
  typename boost::enable_if<is_basic_path<Path>,bool>::type exists(const Path& p);
inline bool exists(const path& p) { return exists<path>(p); }
inline bool exists(const wpath& p) { return exists<wpath>(p); }

The enable_if will fail for a C string or std::basic_string argument, which will then be automatically converted to a basic_path object via the appropriate basic_path conversion constructor. -- end note]
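The technique in the note can be tried out in isolation. The sketch below is an illustration under stated assumptions: basic_path is a toy string wrapper with a single template parameter rather than the proposed class, std::enable_if (C++11) is swapped in for boost::enable_if, and the body of exists() is a placeholder that reports whether the path string is non-empty instead of querying a file system.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

namespace sketch {

// Toy path type (hypothetical): wraps a string and converts from C strings.
template <class String>
class basic_path
{
public:
  typedef String string_type;
  basic_path(const typename String::value_type* s) : m_str(s) {}
  basic_path(const String& s) : m_str(s) {}
  const String& string() const { return m_str; }
private:
  String m_str;
};

typedef basic_path<std::string>  path;
typedef basic_path<std::wstring> wpath;

// Trait: true only for the path types.
template <class T> struct is_basic_path        : std::false_type {};
template <>        struct is_basic_path<path>  : std::true_type {};
template <>        struct is_basic_path<wpath> : std::true_type {};

// Generic version: removed from overload resolution unless Path is a path type.
template <class Path>
typename std::enable_if<is_basic_path<Path>::value, bool>::type
exists(const Path& p)
{
  return !p.string().empty();  // placeholder for a real file system query
}

// Non-template overloads: these accept strings via the converting constructors.
inline bool exists(const path& p)  { return exists<path>(p); }
inline bool exists(const wpath& p) { return exists<wpath>(p); }

inline void exists_demo()
{
  assert(exists("foo"));                 // const char*   -> path
  assert(exists(std::string("foo")));    // std::string   -> path
  assert(exists(L"foo"));                // const wchar_t* -> wpath
  assert(exists(wpath(L"foo")));         // already a path type
  assert(!exists(""));                   // empty string under the placeholder rule
}

} // namespace sketch
```

For a string argument the template deduces a non-path type and drops out, leaving exactly one non-template overload viable, so the call is unambiguous.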

The two overloads are not given in the normative text because:

Implementations of functions described in this clause are permitted to call the applications program interface (API) provided by the operating system. If such an operating system API call results in an error, implementations shall report the error by throwing exception basic_filesystem_error, unless otherwise specified.

[Note: Such exceptions and the conditions that cause them to be thrown are not explicitly described in each Throws element within this clause. Because hardware failures, network failures, race conditions, and a plethora of other errors occur frequently in file system operations, users should be aware that unless otherwise specified any file system operation, no matter how apparently innocuous, may throw an exception. -- end note]

Functions commonly used in contexts where errors are not exceptional have overloads taking an additional argument of type error_code& ec. Such overloaded functions shall not throw exceptions. If an error occurs, ec shall be set to the error code reported by the operating system, otherwise ec shall be set to 0. If an overload without an argument of type error_code& ec returns void, the other overload (with an argument of type error_code& ec) returns an error_code with the value of ec.
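The paired-overload convention can be sketched with a hypothetical rename_file() function, which is not part of the proposal. The sketch assumes that error_code is a simple int wrapper, that std::runtime_error stands in for basic_filesystem_error, and that std::rename sets errno on failure (true on POSIX, not guaranteed by the C standard, hence the EIO fallback).

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <stdexcept>
#include <string>

namespace sketch {

// Hypothetical stand-in for error_code.
struct error_code
{
  int value;
  error_code() : value(0) {}
  explicit error_code(int v) : value(v) {}
  bool error() const { return value != 0; }
};

// Overload with error_code&: reports failure through ec instead of throwing
// an operating system error, and returns the value of ec because the other
// overload returns void.
inline error_code rename_file(const std::string& from, const std::string& to,
                              error_code& ec)
{
  errno = 0;
  if (std::rename(from.c_str(), to.c_str()) != 0)
    ec = error_code(errno != 0 ? errno : EIO);  // EIO fallback is an assumption
  else
    ec = error_code();                          // success: ec is set to 0
  return ec;
}

// Overload without error_code&: reports failure by throwing.
inline void rename_file(const std::string& from, const std::string& to)
{
  error_code ec;
  if (rename_file(from, to, ec).error())
    throw std::runtime_error("rename_file failed: " + from + " -> " + to);
}

inline void rename_demo()
{
  // Renaming a file that is assumed not to exist fails without side effects.
  error_code ec;
  error_code returned = rename_file("no_such_file_for_sketch.tmp",
                                    "other_name_for_sketch.tmp", ec);
  assert(ec.error());
  assert(returned.value == ec.value);

  bool threw = false;
  try { rename_file("no_such_file_for_sketch.tmp", "other_name_for_sketch.tmp"); }
  catch (const std::runtime_error&) { threw = true; }
  assert(threw);
}

} // namespace sketch
```

Implementing the throwing overload in terms of the error_code& overload keeps the operating system call in one place.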

Header <filesystem> synopsis

namespace std
{
  namespace tr2
  {
    namespace sys
    {
      template <class String, class Traits> class basic_path;

      template<class String, class Traits>
      void swap(basic_path<String, Traits> & lhs, basic_path<String, Traits> & rhs);

      template<class String, class Traits> bool operator<(a a, b b);
      template<class String, class Traits> bool operator==(a a, b b);
      template<class String, class Traits> bool operator!=(a a, b b);
      template<class String, class Traits> bool operator>(a a, b b);
      template<class String, class Traits> bool operator<=(a a, b b);
      template<class String, class Traits> bool operator>=(a a, b b);
      template<class String, class Traits> basic_path<String, Traits> operator/(a a, b b);

      template<class Path>
        basic_ostream<typename Path::string_type::value_type, typename Path::string_type::traits_type> &
          operator<<(basic_ostream<typename Path::string_type::value_type, typename Path::string_type::traits_type>& os, const Path & ph);

      template<class Path>
        basic_istream<typename Path::string_type::value_type, typename Path::string_type::traits_type> &
          operator>>(basic_istream<typename Path::string_type::value_type, typename Path::string_type::traits_type>& is, Path & ph);
      
      struct path_traits;
      struct wpath_traits;

      typedef basic_path<std::string, path_traits>    path;
      typedef basic_path<std::wstring, wpath_traits>  wpath;

      template<class Path> struct is_basic_path;

      template<class Path> struct slash { static const char value = '/'; };
      template<class Path> struct dot   { static const char value = '.'; };
      template<class Path> struct colon { static const char value = ':'; };

      template <class Path> class basic_filesystem_error;

      typedef basic_filesystem_error<path> filesystem_error;
      typedef basic_filesystem_error<wpath> wfilesystem_error;

      template <class Path> class basic_directory_entry;

      typedef basic_directory_entry<path> directory_entry;
      typedef basic_directory_entry<wpath> wdirectory_entry;

      template <class Path> class basic_directory_iterator;

      typedef basic_directory_iterator<path> directory_iterator;
      typedef basic_directory_iterator<wpath> wdirectory_iterator;

      template <class Path> class basic_recursive_directory_iterator;

      typedef basic_recursive_directory_iterator<path> recursive_directory_iterator;
      typedef basic_recursive_directory_iterator<wpath> wrecursive_directory_iterator;

      enum file_type { status_unknown, file_not_found, regular_file, directory_file,
                       symlink_file, block_file, character_file, fifo_file, socket_file,
                       type_unknown
                     };

      class file_status;

      struct space_info  // returned by space function
      {
        uintmax_t capacity;
        uintmax_t free;
        uintmax_t available;
      };

      //  status functions
      template <class Path> file_status status(const Path& p);
      template <class Path> file_status status(const Path& p, error_code& ec);
      template <class Path> file_status symlink_status(const Path& p);
      template <class Path> file_status symlink_status(const Path& p, error_code& ec);

      //  predicate functions
      bool status_known( file_status s ); 
      bool exists( file_status s );
      bool is_regular( file_status s ); 
      bool is_directory( file_status s );
      bool is_symlink( file_status s );
      bool is_other( file_status s );

      template <class Path> bool exists(const Path& p);
      template <class Path> bool is_directory(const Path& p);
      template <class Path> bool is_regular(const Path& p);
      template <class Path> bool is_other(const Path& p);
      template <class Path> bool is_symlink(const Path& p);
      template <class Path> bool is_empty(const Path& p);

      template <class Path1, class Path2>
        bool equivalent(const Path1& p1, const Path2& p2);

      //  attribute functions
      template <class Path> Path current_path();
      template <class Path> const Path& initial_path();
      template <class Path> uintmax_t file_size(const Path& p);
      template <class Path> space_info space(const Path& p);
      template <class Path> std::time_t last_write_time(const Path& p);
      template <class Path>
        void last_write_time(const Path& p, const std::time_t new_time);

      //  operations functions
      template <class Path> bool create_directory(const Path& dp);
      template <class Path1, class Path2>
        void create_hard_link(const Path1& old_fp, const Path2& new_fp);
      template <class Path1, class Path2>
        error_code create_hard_link(const Path1& old_fp, const Path2& new_fp, error_code& ec);
      template <class Path1, class Path2>
        void create_symlink(const Path1& old_fp, const Path2& new_fp);
      template <class Path1, class Path2>
        error_code create_symlink(const Path1& old_fp, const Path2& new_fp, error_code& ec);
      template <class Path> bool remove(const Path& p);
      template <class Path1, class Path2>
        void rename(const Path1& from_p, const Path2& to_p);
      template <class Path1, class Path2>
        void copy_file(const Path1& from_fp, const Path2& to_fp);
      template <class Path> Path system_complete(const Path& p);
      template <class Path> Path complete(const Path& p, const Path& base=initial_path<Path>());

      //  convenience functions
      template <class Path> bool create_directories(const Path& p);
      template <class Path> typename Path::string_type extension(const Path& p);
      template <class Path> typename Path::string_type basename(const Path& p);
      template <class Path>
        Path replace_extension(const Path& p, const typename Path::string_type& new_extension);

    } // namespace sys
  } // namespace tr2
} // namespace std

Path traits

This subclause defines requirements on classes representing path behavior traits, and defines two classes that satisfy those requirements for paths based on string and wstring. It also defines several additional path traits structure templates, and defines several specializations of them.

Class template basic_path defined in this clause requires additional types, values, and behavior to complete the definition of its semantics.

For purposes of exposition, Traits behaves as if it is a class with private members bool m_locked, initialized false, and std::locale m_locale, initialized

Path Behavior Traits Requirements
Expression Requirements
Traits::external_string_type A typedef which is a specialization of basic_string. The value_type is a character type used by the operating system to represent pathnames.
Traits::internal_string_type A typedef which is a specialization of basic_string. The value_type is a character type to be used by the program to represent pathnames. Required be the same type as the basic_path String template parameter.
Traits::to_external( p, is ) is, converted by the m_locale codecvt facet to external_string_type.
Traits::to_internal( p, xs ) xs, converted by the m_locale codecvt facet to internal_string_type.
Traits::imbue(loc) Effects: if m_locked, throw. Otherwise, m_locked = true; m_locale = loc;
Returns: void
Throws: basic_filesystem_error
Traits::imbue(loc, std::nothrow) Effects: if (!m_locked) m_locale = loc; bool temp(m_locked); m_locked = true;
Returns: temp
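The set-once behavior of the two imbue forms can be illustrated with a minimal sketch. The class name example_path_traits is hypothetical, std::runtime_error stands in for basic_filesystem_error, and the initial value of the stored locale (default constructed here) is an assumption, since this subclause does not state it.

```cpp
#include <locale>
#include <new>        // std::nothrow_t, std::nothrow
#include <stdexcept>

// Hypothetical traits class; only the imbue members are modeled.
class example_path_traits
{
public:
  // Throwing form: the locale may be set once; any later call throws.
  static void imbue(const std::locale& loc)
  {
    // std::runtime_error is a stand-in for basic_filesystem_error.
    if (locked()) throw std::runtime_error("path traits locale already set");
    locked() = true;
    stored_locale() = loc;
  }

  // Non-throwing form: reports whether the locale was already locked.
  static bool imbue(const std::locale& loc, const std::nothrow_t&)
  {
    if (!locked()) stored_locale() = loc;
    bool temp(locked());
    locked() = true;
    return temp;
  }

private:
  // Exposition-only state, held in function-local statics.
  static bool& locked() { static bool m_locked = false; return m_locked; }
  static std::locale& stored_locale() { static std::locale m_locale; return m_locale; }
};
```

In this sketch the first call to either form succeeds and locks the locale; a second call to the nothrow form returns true, and a second call to the throwing form throws.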

Type is_basic_path shall be a UnaryTypeTrait (TR1, 4.1). The primary template shall be derived directly or indirectly from std::tr1::false_type. Type is_basic_path shall be specialized for path, wpath, and any user-specialized basic_path types, and such specializations shall be derived directly or indirectly from std::tr1::true_type.
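A minimal sketch of the trait arrangement described above follows. It uses std::false_type and std::true_type from <type_traits> rather than the TR1 equivalents, and every name ending in _sketch is a hypothetical stand-in, not part of the proposal.

```cpp
#include <string>
#include <type_traits>

// Stand-ins for the proposal's class template and traits classes; members omitted.
template <class String, class Traits> class basic_path_sketch {};
struct path_traits_sketch {};
struct wpath_traits_sketch {};
typedef basic_path_sketch<std::string, path_traits_sketch> path_sketch;
typedef basic_path_sketch<std::wstring, wpath_traits_sketch> wpath_sketch;

// Primary template: an arbitrary type is not a path.
template <class T> struct is_basic_path_sketch : std::false_type {};

// Specializations for the two supplied path types.
template <> struct is_basic_path_sketch<path_sketch> : std::true_type {};
template <> struct is_basic_path_sketch<wpath_sketch> : std::true_type {};
```

A user-specialized path type would add one more specialization of the same form.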

Structure templates slash, dot, and colon are supplied with values of type char. If a user-specialized basic_path has a value_type which is not convertible from char, the templates slash and dot shall be specialized to provide a value whose type is convertible to basic_path::value_type.
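The following sketch shows one way the primary templates and a user specialization might look. The types code_unit and user_path are hypothetical, as are the _sketch names; only the requirement that the primary templates carry char values comes from the text above.

```cpp
// Primary templates supply values of type char.
template <class Path> struct slash_sketch { static const char value = '/'; };
template <class Path> struct dot_sketch   { static const char value = '.'; };
template <class Path> struct colon_sketch { static const char value = ':'; };

// Hypothetical character type that is not convertible from char.
struct code_unit
{
  explicit code_unit(unsigned v) : v(v) {}
  unsigned v;
};

// Hypothetical user path type built on that character type.
struct user_path { typedef code_unit value_type; };

// slash and dot are specialized so that value converts to user_path::value_type.
template <> struct slash_sketch<user_path> { static code_unit value; };
template <> struct dot_sketch<user_path>   { static code_unit value; };
code_unit slash_sketch<user_path>::value(0x2F);
code_unit dot_sketch<user_path>::value(0x2E);
```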

Class template basic_path

Class template basic_path provides a portable mechanism for representing paths in C++ programs, using a portable generic pathname grammar. When portability is not a requirement, native file system specific formats can be used. Class template basic_path is concerned only with the lexical and syntactic aspects of a path. The path does not have to exist in the operating system's file system, and may contain names which are not even valid for the current operating system.

[Note: If the library's functions trafficked only in C++ or C-style strings, they would provide only the illusion of portability since while the syntax of function calls would be portable, the semantics of the strings they operate on would not be portable. -- end note]

namespace std
{
  namespace tr2
  {
    namespace sys
    {
      template <class String, class Traits> class basic_path
      {
      public:
        typedef basic_path<String, Traits> path_type;
        typedef String string_type;
        typedef typename String::value_type value_type;
        typedef Traits traits_type;
        typedef typename Traits::external_string_type external_string_type; 

        // constructors/destructor
        basic_path();
        basic_path(const basic_path& p);
        basic_path(const string_type& s);
        basic_path(const value_type* s);
        template <class InputIterator>
          basic_path(InputIterator first, InputIterator last);

       ~basic_path();

        // assignments
        basic_path& operator=(const basic_path& p);
        basic_path& operator=(const string_type& s);
        basic_path& operator=(const value_type* s);
        template <class InputIterator>
          basic_path& assign(InputIterator first, InputIterator last);

        // modifiers
        basic_path& operator/=(const basic_path& rhs);
        basic_path& operator/=(const string_type& s);
        basic_path& operator/=(const value_type* s);
        template <class InputIterator>
          basic_path& append(InputIterator first, InputIterator last);

        void clear();
        void swap( basic_path & rhs );
        basic_path& remove_leaf();

        // observers
        const string_type string() const;
        const string_type file_string() const;
        const string_type directory_string() const;

        const external_string_type external_file_string() const;
        const external_string_type external_directory_string() const;

        string_type  root_name() const;
        string_type  root_directory() const;
        basic_path   root_path() const;
        basic_path   relative_path() const;
        string_type  leaf() const;
        basic_path   branch_path() const;

        bool empty() const;
        bool is_complete() const;
        bool has_root_name() const;
        bool has_root_directory() const;
        bool has_root_path() const;
        bool has_relative_path() const;
        bool has_leaf() const;
        bool has_branch_path() const;

        // iterators
        class iterator;
        typedef iterator const_iterator;

        iterator begin() const;
        iterator end() const;

      };

    } // namespace sys
  } // namespace tr2
} // namespace std

A basic_path object stores a possibly empty path. The internal form of the stored path is unspecified.

Functions described in this clause which access files or their attributes do so by resolving a basic_path object into a particular file in a file hierarchy. The pathname, suitably converted to the string type, format, and encoding required by the operating system, is resolved as if by the POSIX Pathname Resolution mechanism. The encoding of the resulting pathname is determined by the Traits::to_external conversion function.

[Note: There is no guarantee that the path stored in a  basic_path object is valid for a particular operating system or file system. -- end note]

Some functions in this clause return basic_path objects for paths composed partly or wholly of pathnames obtained from the operating system. Such pathnames are suitably converted from the actual format and string type supplied by the operating system. The encoding of the resulting path is determined by the Traits::to_internal conversion function.

For member functions described as returning "const string_type" or "const external_string_type", implementations are permitted to return "const string_type&" or  "const external_string_type&" respectively.

[Note: This allows implementations to avoid unnecessary copies. Return-by-value is specified as const to ensure programs won't break if moved to a return-by-reference implementation. -- end note]

Pathname formats

There are two formats for string or sequence arguments that describe a path:

All basic_path string or sequence arguments that describe a path shall accept the portable pathname format, and shall accept the native format if explicitly identified by a native format escape sequence prefix of slash slash colon.

[Note: slash slash colon was chosen as the escape sequence because a leading slash slash  is already implementation-defined by POSIX, colon is prohibited in a Windows filename, and on any system a single slash can be used when a filename beginning with a colon is desired. These factors eliminate the chance of collision with a real filename. -- end note]
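Recognition of the escape sequence is a simple prefix test. The sketch below works on std::string only; the function names are hypothetical, and removing the prefix before further processing is an assumption about how an implementation might proceed.

```cpp
#include <string>

// True if s begins with the native format escape sequence "//:".
inline bool has_native_escape(const std::string& s)
{
  return s.compare(0, 3, "//:") == 0;
}

// Returns s with the escape sequence removed, or s unchanged if it is absent.
inline std::string strip_native_escape(const std::string& s)
{
  return has_native_escape(s) ? s.substr(3) : s;
}
```

Note that "/:name" is not escaped: a single slash followed by a colon names a file in the root directory whose filename begins with a colon.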

Implementations are encouraged to implicitly recognize the native pathname format if it can be lexically identified. An implementation shall document whether or not the native pathname format is implicitly recognized.

[Example:

-- OpenVMS: "SYS1::DISK1:[JANE.TYLER.HARRY]" is treated as a native pathname with a system name, drive name, and three directory filenames, rather than a portable pathname with one filename.

-- Windows: "c:\\jane\\tyler\\harry" is treated as a native pathname with a drive letter, root-directory, and three filenames, rather than a portable pathname with one filename.

-- Counter-example 1: An operating system that allows slashes in filenames and uses dot as a directory separator. Distinguishing between portable and native format argument strings or sequences is not possible as there is no other distinguishing syntax. The implementation does not accept native format pathnames unless the native argument is present.

-- Counter-example 2: An operating system that allows slashes in filenames and uses some unusual character as a directory separator. The implementation does accept native format pathnames without the additional native argument, which only has to be used for native format arguments containing slashes in filenames.

-- end example]

[Note: This duck-rule ("if it looks like a duck, walks like a duck, and quacks like a duck, it must be a duck") eliminates format confusion as a source of programmer error and support requests. -- end note]

If both the portable and native formats are accepted, implementations shall document what characters or character sequences are used to distinguish between portable and native formats.

[Note: Windows implementations are encouraged to define colons and backslashes as the characters which distinguish native from portable formats. --end note]

Pathname grammar

The grammar for the portable pathname format is as follows:

pathname:
            root-name(opt) root-directory(opt) relative-path(opt)

root-name:
            implementation-defined

root-directory:
            slash
            root-directory slash
            implementation-defined

relative-path:
            filename
            relative-path slash
            relative-path slash filename

filename:
            name
            dot
            dot dot

slash:
            slash<Path>::value

dot:
            dot<Path>::value

The grammar is aligned with the POSIX  Filename, Pathname and Pathname Resolution definitions. Any conflict between the grammar and POSIX is unintentional. This technical report defers to POSIX.

The form of the above wording was taken from POSIX, which uses it in several places to defer to the C standard.
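As an illustration of the grammar, the sketch below splits a portable pathname into an optional root-directory and a sequence of filenames. It assumes a POSIX-like system where root-name is empty and slash is '/', and it discards the information that a trailing slash was present; the names parsed_pathname and parse_portable are hypothetical.

```cpp
#include <string>
#include <vector>

struct parsed_pathname
{
  bool has_root_directory;              // true if the pathname begins with slash
  std::vector<std::string> filenames;   // name, dot, and dot dot elements in order
};

// Splits s according to the relative-path productions of the grammar.
inline parsed_pathname parse_portable(const std::string& s)
{
  parsed_pathname result;
  result.has_root_directory = !s.empty() && s[0] == '/';
  std::string::size_type i = 0;
  while (i < s.size())
  {
    while (i < s.size() && s[i] == '/') ++i;   // skip one or more slashes
    const std::string::size_type start = i;
    while (i < s.size() && s[i] != '/') ++i;   // scan one filename
    if (i > start) result.filenames.push_back(s.substr(start, i - start));
  }
  return result;
}
```

For example, "/usr//lib/" yields a root-directory and the filenames "usr" and "lib".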

[Note: Windows implementations are encouraged to define slash slash name as a permissible root-name. POSIX permits, but does not require, implementations to do the same. Windows implementations are encouraged to define an additional root-directory element root_directory name. It is applicable only to the slash slash name form of root-name.

Windows implementations are encouraged to recognize a name followed by a colon as a native format root-name, and a backslash as a format element equivalent to slash. -- end note]

Filename conversion

When converting filenames to the native operating system format, implementations are encouraged, but not required, to convert otherwise invalid characters or character sequences to valid characters or character sequences. Such conversions are implementation-defined.

[Note: Filename conversion allows much wider portability of both programs and filenames than would otherwise be possible.

Implementations are encouraged to base conversion on existing standards or practice. Examples include the Uniform Resource Locator escape syntax of a percent sign ('%') followed by two hex digits representing the character value. On OpenVMS, which does not allow percent signs in filenames, a dollar sign ('$') followed by two hex digits is the existing practice, as is converting lowercase letters to uppercase. -- end note.]
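The escape conventions mentioned in the note can be sketched as follows. The function name, the caller-supplied set of invalid characters, and the decision to also escape the escape character itself (so that the mapping stays reversible) are assumptions of this sketch, not requirements of the proposal.

```cpp
#include <string>

// Replaces each character of name that appears in invalid_chars (or equals the
// escape character) with the escape character followed by two hex digits.
inline std::string escape_filename(const std::string& name,
                                   const std::string& invalid_chars,
                                   char escape = '%')
{
  static const char hex[] = "0123456789ABCDEF";
  std::string out;
  for (std::string::size_type i = 0; i < name.size(); ++i)
  {
    const unsigned char c = static_cast<unsigned char>(name[i]);
    if (name[i] == escape || invalid_chars.find(name[i]) != std::string::npos)
    {
      out += escape;
      out += hex[c >> 4];      // high nibble
      out += hex[c & 0x0F];    // low nibble
    }
    else out += name[i];
  }
  return out;
}
```

Passing '$' as the third argument gives the OpenVMS-style form described above; converting lowercase letters to uppercase is not modeled.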

The Boost implementation for Windows currently does not map invalid characters. Pending feedback from the LWG, Boost may settle on % hex hex as the preferred escape sequence. If so, should there be normative encouragement?

Requirements

The argument for the template parameter named String shall be a class that includes members with the same names, types, values, and semantics as class template basic_string.

The argument for the template parameter named Traits shall be a class that satisfies the requirements specified in the Path Behavior Traits Requirements table.

The argument for template parameters named InputIterator shall satisfy the requirements of an input iterator (C++ Std, 24.1.1, Input iterators [lib.input.iterators]) and shall have a value type convertible to basic_path::value_type.

Some function templates with a template parameter named InputIterator also have non-template overloads. Implementations shall only select the function template overload if the type named by InputIterator is not path_format_t.

[Note: This "do-the-right-thing" rule ensures that the overload expected by the user is selected. The implementation technique is unspecified - implementations may use enable_if or other techniques to achieve the effect. -- end note]

basic_path constructors

basic_path();

Postconditions: empty().

basic_path(const string_type& s);
basic_path(const value_type * s);
template <class InputIterator>
  basic_path(InputIterator first, InputIterator last);

Remarks: The format of string s and sequence [first,last) is described in Pathname formats.

Effects: The path elements in string s or sequence [first,last) are stored.

basic_path assignments

basic_path& operator=(const string_type& s);
basic_path& operator=(const value_type* s);
template <class InputIterator>
  basic_path& assign(InputIterator first, InputIterator last);

Remarks: The format of string s and sequence [first,last) is described in Pathname formats.

Effects: The path elements in string s or sequence [first,last) are stored.

Returns: *this

basic_path modifiers

basic_path& operator/=(const basic_path& rhs);

Effects: The path stored in rhs is appended to the stored path.

Returns: *this

basic_path& operator/=(const string_type& s);
basic_path& operator/=(const value_type* s);
template <class InputIterator>
basic_path& append(InputIterator first, InputIterator last);

Remarks: The format of string s and sequence [first,last) is described in Pathname formats.

Effects: The path elements in string s or sequence [first,last) are appended to the stored path.

Returns: *this

void clear();

Postcondition: this->empty() is true.

void swap( basic_path & rhs );

Effects: Swaps the contents of the two paths.

Throws: nothing.

Postcondition: this->string() contains the same sequence of characters that were in rhs.string(), rhs.string() contains the same sequence of characters that were in this->string().

Complexity: constant time.

basic_path& remove_leaf();

Effects: If has_branch_path() then remove the last filename from the stored path. If that leaves the stored path with one or more trailing slash elements not representing  root-directory, remove them.

Returns: *this

[Note: This function is needed to efficiently implement basic_directory_iterator. It is made public to allow additional uses. -- end note]
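A minimal sketch of the Effects, operating on the string() form of a path, is given below. It assumes a POSIX-like system with an empty root-name and an input without a trailing slash; the function name is hypothetical and the stored path of a real basic_path need not be a string.

```cpp
#include <string>

// Removes the last filename if the path has a branch path, then removes any
// trailing slashes that do not represent the root-directory.
inline std::string remove_leaf_sketch(std::string p)
{
  const std::string::size_type last_slash = p.rfind('/');
  if (last_slash == std::string::npos) return p;   // no branch path: unchanged
  p.erase(last_slash + 1);                         // drop the last filename
  while (p.size() > 1 && p[p.size() - 1] == '/')   // keep a lone root slash
    p.erase(p.size() - 1);
  return p;
}
```

Under these assumptions "/foo/bar" becomes "/foo", "/foo" becomes "/", and "foo" is left unchanged because it has no branch path.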

basic_path observers

See the Path decomposition table for examples for values returned by decomposition functions.

const string_type string() const;

Returns: The stored path, formatted according to the Pathname grammar rules.

const string_type file_string() const;

Returns: The stored path, formatted according to the operating system rules for regular file pathnames, with any Filename conversion applied.

[Note: For some operating systems, including POSIX and Windows, the native format for regular file pathnames and directory pathnames is the same, so file_string() and directory_string() return the same string. On OpenVMS, however, the expression path("/cats/jane").file_string() would return the string "[CATS]JANE" while path("/cats/jane").directory_string() would return the string "[CATS.JANE]". -- end note]

const string_type directory_string() const;

Returns: The stored path, formatted according to the operating system rules for directory pathnames, with any Filename conversion applied.

const external_string_type external_file_string() const;

Returns: The stored path, formatted according to the operating system rules for regular file pathnames, with any Filename conversion applied, and encoded by the Traits::to_external conversion function.

const external_string_type external_directory_string() const;

Returns: The stored path, formatted according to the operating system rules for directory pathnames, with any Filename conversion applied, and encoded by the Traits::to_external conversion function.

string_type root_name() const;

Returns: root-name, if the stored path includes root-name, otherwise string_type().

string_type root_directory() const;

Returns: root-directory, if the stored path includes root-directory, otherwise string_type().

If root-directory is composed of slash name, slash is excluded from the returned string.

basic_path root_path() const;

Returns: root_name() / root_directory()

basic_path relative_path() const;

Returns: A basic_path composed from the stored path, if any, beginning with the first filename after root-path. Otherwise, an empty basic_path.

string_type leaf() const;

Returns: empty() ? string_type() : *--end()

basic_path branch_path() const;

Returns: (string().empty() || begin() == --end()) ? path_type("") : br, where br is constructed as if by starting with an empty basic_path and successively applying operator/= for each element in the range [begin(), --end()).
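The Returns clauses for leaf() and branch_path() can be sketched over an explicit element sequence. The sketch assumes a POSIX-like system in which the elements of "/foo/bar" are "/", "foo", and "bar"; the typedef and function names are hypothetical, and the joining loop only approximates operator/= for this simple case.

```cpp
#include <string>
#include <vector>

typedef std::vector<std::string> path_elements;

// empty() ? string_type() : *--end()
inline std::string leaf_sketch(const path_elements& e)
{
  return e.empty() ? std::string() : e.back();
}

// Empty if there are fewer than two elements; otherwise every element
// except the last, joined as if by successive operator/=.
inline std::string branch_path_sketch(const path_elements& e)
{
  std::string br;
  if (e.size() < 2) return br;
  for (path_elements::size_type i = 0; i + 1 < e.size(); ++i)
  {
    if (!br.empty() && br[br.size() - 1] != '/') br += '/';  // separator between names
    br += e[i];
  }
  return br;
}
```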

bool empty() const;

Returns: string().empty().

bool is_complete() const;

Returns: true, if the elements of root_path() uniquely identify a directory, else false.

bool has_root_path() const;

Returns: !root_path().empty()

bool has_root_name() const;

Returns: !root_name().empty()

bool has_root_directory() const;

Returns: !root_directory().empty()

bool has_relative_path() const;

Returns: !relative_path().empty()

bool has_leaf() const;

Returns: !leaf().empty()

bool has_branch_path() const;

Returns: !branch_path().empty()

basic_path iterators

A basic_path::iterator is a constant iterator satisfying all the requirements of a bidirectional iterator (C++ Std, 24.1.4 Bidirectional iterators [lib.bidirectional.iterators]). Its value_type is string_type.

Calling any non-const member function of a basic_path object invalidates all iterators referring to elements of the object.

The forward traversal order is as follows:

The backward traversal order is the reverse of forward traversal.

iterator begin() const;

Returns: An iterator for the first present element in the traversal list above. If no elements are present, the end iterator.

iterator end() const;

Returns: The end iterator.

basic_path non-member functions

template<class String, class Traits>
void swap( basic_path<String, Traits> & lhs, basic_path<String, Traits> & rhs )

Effects: lhs.swap( rhs )

basic_path non-member operators

There are seven basic_path non-member operators (/, ==, !=, <, >, <=, >=), each with five overloads. For brevity, the specifications are given in tabular form. Each of the resulting thirty-five signatures is a template, with template parameter list template<class String, class Traits>. The format of such arguments is described in Pathname formats.

Argument type overloads

basic_path<String, Traits>& a, basic_path<String, Traits>& b
const typename basic_path<String, Traits>::string_type& a, basic_path<String, Traits>& b
const typename basic_path<String, Traits>::string_type::value_type* a, basic_path<String, Traits>& b
const basic_path<String, Traits>& a, typename basic_path<String, Traits>::string_type& b
const basic_path<String, Traits>& a, typename basic_path<String, Traits>::string_type::value_type* b

In the basic_path non-member operators table, a and b are of the types given in the Argument type overloads table. If a or b is of type const basic_path<String, Traits>&, then a' or b' respectively is a or b respectively. Otherwise a' or b' respectively represent named or unnamed temporary basic_path<String, Traits> objects constructed from a or b respectively.

basic_path non-member operators
Expression Return type Semantics
a / b basic_path<String, Traits> basic_path<String, Traits> tmp(a); return tmp /= b';
a < b bool return lexicographical_compare(a'.begin(), a'.end(), b'.begin(), b'.end());
a == b bool return !(a' < b') && !(b' < a');
a != b bool return !(a' == b');
a > b bool return b' < a';
a <= b bool return !(b' < a');
a >= b bool return !(a' < b');

[Note: Path equality and path equivalence have different semantics.

Equality is determined by basic_path's non-member operator==, which considers the two paths' lexical representations only. Paths "abc" and "ABC" are never equal.

Equivalence is determined by the equivalent() non-member function, which determines if two paths resolve to the same file system entity. Paths "abc" and "ABC" may or may not resolve to the same file, depending on the file system.

Programmers wishing to determine if two paths are "the same" must decide if "the same" means "the same representation" or "resolve to the same actual file", and choose the appropriate function accordingly. -- end note]
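The purely lexical nature of the comparison operators can be seen in a minimal sketch that compares element sequences directly. The typedef and function names are hypothetical; only the use of lexicographical_compare and the derivation of equality from less-than are taken from the table above.

```cpp
#include <algorithm>
#include <string>
#include <vector>

typedef std::vector<std::string> path_elements;

// a < b: element-by-element lexicographical comparison.
inline bool path_less(const path_elements& a, const path_elements& b)
{
  return std::lexicographical_compare(a.begin(), a.end(), b.begin(), b.end());
}

// a == b: neither path is less than the other.
inline bool path_equal(const path_elements& a, const path_elements& b)
{
  return !path_less(a, b) && !path_less(b, a);
}
```

No file system access takes place, so "abc" and "ABC" compare unequal even on a file system where both would resolve to the same file.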

basic_path inserter and extractor

template<class Path>
  basic_istream<typename Path::string_type::value_type, typename Path::string_type::traits_type>&
    operator>>(basic_istream< typename Path::string_type::value_type, typename Path::string_type::traits_type>& is,
               Path& ph );

Effects:  typename Path::string_type str;
       is >> str;
       ph = str;

Returns: is

template<class Path>
  basic_ostream<typename Path::string_type::value_type, typename Path::string_type::traits_type>&
    operator<<(basic_ostream< typename Path::string_type::value_type, typename Path::string_type::traits_type>& os,
               const Path& ph );

Effects:  os << ph.string()

Returns: os

Class template basic_filesystem_error

namespace std
{
  namespace tr2
  {
    namespace sys
    {
      template <class Path> class basic_filesystem_error : public system_error
      {
      public:
        typedef Path path_type;

        explicit basic_filesystem_error(const std::string& what_arg, error_code ec);
        basic_filesystem_error(const std::string& what_arg, const path_type& p1, error_code ec);
        basic_filesystem_error(const std::string& what_arg, const path_type& p1, const path_type& p2, error_code ec);

        const path_type& path1() const;
        const path_type& path2() const;

        const char * what() const;
      };
    } // namespace sys
  } // namespace tr2
} // namespace std

The class template basic_filesystem_error defines the type of objects thrown as exceptions to report file system errors from functions described in this clause.

basic_filesystem_error constructors

explicit basic_filesystem_error(const std::string& what_arg, error_code ec);

Postconditions:

Expression Value
runtime_error::what() what_arg.c_str()
code() ec
path1().empty() true
path2().empty() true

basic_filesystem_error(const std::string& what_arg, const path_type& p1, error_code ec);

Postconditions:

Expression Value
runtime_error::what() what_arg.c_str()
code() ec
path1() Reference to stored copy of p1
path2().empty() true

basic_filesystem_error(const std::string& what_arg, const path_type& p1, const path_type& p2, error_code ec);

Postconditions:

Expression Value
runtime_error::what() what_arg.c_str()
code() ec
path1() Reference to stored copy of p1
path2() Reference to stored copy of p2

basic_filesystem_error observers

const path_type& path1() const;

Returns: Reference to copy of p1 stored by the constructor, or, if none, an empty path.

const path_type& path2() const;

Returns: Reference to copy of p2 stored by the constructor, or, if none, an empty path.
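The constructor postconditions and the path observers can be sketched on top of std::system_error from <system_error>. This is an assumption of the sketch: the system_error and error_code of this proposal come from its own Diagnostics chapter, and std::system_error::what() does not return what_arg unchanged. The class name is hypothetical.

```cpp
#include <string>
#include <system_error>

template <class Path>
class filesystem_error_sketch : public std::system_error
{
public:
  typedef Path path_type;

  // No paths: path1() and path2() are both empty.
  filesystem_error_sketch(const std::string& what_arg, std::error_code ec)
    : std::system_error(ec, what_arg) {}

  // One path: a copy of p1 is stored; path2() is empty.
  filesystem_error_sketch(const std::string& what_arg, const path_type& p1,
                          std::error_code ec)
    : std::system_error(ec, what_arg), m_path1(p1) {}

  // Two paths: copies of p1 and p2 are stored.
  filesystem_error_sketch(const std::string& what_arg, const path_type& p1,
                          const path_type& p2, std::error_code ec)
    : std::system_error(ec, what_arg), m_path1(p1), m_path2(p2) {}

  const path_type& path1() const { return m_path1; }
  const path_type& path2() const { return m_path2; }

private:
  path_type m_path1;
  path_type m_path2;
};
```

Any type with an empty() member and a default constructor can serve as Path in this sketch; std::string is sufficient for experimentation.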

const char * what() const;

Returns: A string containing runtime_error::what() and the result of calling system_message() with a first argument of code(). The exact format is unspecified.

The implementation shall supply a specialization template<> const char * basic_filesystem_error<path>::what() const that returns a string containing runtime_error::what(), the result of calling system_message() with a first argument of code(), and if non-empty, path1().file_string() and path2().file_string(). The exact format is unspecified.

Implementations and users are permitted to provide other specializations of the what member function.

Class template basic_directory_entry

namespace std
{
  namespace tr2
  {
    namespace sys
    {
      template <class Path> class basic_directory_entry
      {
      public:
        typedef Path path_type;
        typedef typename Path::string_type string_type;

        // constructors
        basic_directory_entry();
        explicit basic_directory_entry(const path_type& p,
          file_status st=file_status(), file_status symlink_st=file_status());

        // modifiers
        void assign(const path_type& p, file_status st=file_status(), file_status symlink_st=file_status());
        void replace_leaf(const string_type& s, file_status st=file_status(), file_status symlink_st=file_status());

        // observers
        const Path& path() const;
        operator const Path&() const;

        file_status  status() const;
        file_status  status(error_code& ec) const;
        file_status  symlink_status() const;
        file_status  symlink_status(error_code& ec) const;

        // comparisons
        bool operator<(const basic_directory_entry<Path>& rhs);
        bool operator==(const basic_directory_entry<Path>& rhs);
        bool operator!=(const basic_directory_entry<Path>& rhs);
        bool operator>(const basic_directory_entry<Path>& rhs);
        bool operator<=(const basic_directory_entry<Path>& rhs);
        bool operator>=(const basic_directory_entry<Path>& rhs);

      private:
        path_type            m_path;           // for exposition only
        mutable file_status  m_status;         // for exposition only; stat()-like
        mutable file_status  m_symlink_status; // for exposition only; lstat()-like
      };

    } // namespace sys
  } // namespace tr2
} // namespace std

A basic_directory_entry object stores a basic_path object, a file_status object for non-symbolic link status, and a file_status object for symbolic link status. The file_status objects act as value caches.

[Note: Because status() on a pathname may be a very expensive operation, some operating systems provide status information as a byproduct of directory iteration. Caching such status information can result in significant time savings. Cached and non-cached results may differ in the presence of race conditions. -- end note]

Actual cold-boot timing of iteration over a directory with 15,047 entries was six seconds for non-cached status queries versus one second for cached status queries, measured on Windows XP with a 3.0 GHz processor and a moderately fast hard drive. A similar speedup is expected on Linux and BSD-derived Unix variants that provide status during directory iteration.

basic_directory_entry constructors

basic_directory_entry();

Postconditions:

Expression Value
path().empty() true
status() file_status()
symlink_status() file_status()

explicit basic_directory_entry(const path_type& p, file_status st=file_status(), file_status symlink_st=file_status());

Postconditions:

Expression Value
path() p
status() st
symlink_status() symlink_st

basic_directory_entry modifiers

void assign(const path_type& p, file_status st=file_status(), file_status symlink_st=file_status());

Postconditions:

Expression Value
path() p
status() st
symlink_status() symlink_st

void replace_leaf(const string_type& s, file_status st=file_status(), file_status symlink_st=file_status());

Postconditions:

Expression Value
path() path().branch_path() / s
status() st
symlink_status() symlink_st

basic_directory_entry observers

const Path& path() const;
operator const Path&() const;

Returns: m_path

file_status status() const;

Effects: As if,

if ( !status_known( m_status ) )
{
  if ( status_known(m_symlink_status) && !is_symlink(m_symlink_status) )
    { m_status = m_symlink_status; }
  else { m_status = status(m_path); }
}

Throws: See status function.

Returns: m_status

file_status status(error_code& ec) const;

Effects: As if,

if ( !status_known( m_status ) )
{
  if ( status_known(m_symlink_status) && !is_symlink(m_symlink_status) )
    { m_status = m_symlink_status; }
  else { m_status = status(m_path, ec); }
}
else ec = 0;

Returns: m_status

file_status symlink_status() const;

Effects: As if,

if ( !status_known( m_symlink_status ) )
{
  m_symlink_status = symlink_status(m_path);
}

Throws: See symlink_status function.

Returns: m_symlink_status

file_status symlink_status(error_code& ec) const;

Effects: As if,

if ( !status_known( m_symlink_status ) )
{
  m_symlink_status = symlink_status(m_path, ec);
}
else ec = 0;

Returns: m_symlink_status
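The caching behavior of status() can be illustrated with a minimal sketch. The status representation, the fake_file_system type that counts queries, and all _sketch names are hypothetical stand-ins; only the control flow inside status() follows the Effects above.

```cpp
#include <string>

enum file_type_sketch { status_unknown, regular_file, directory_file, symlink_file };

struct file_status_sketch
{
  file_status_sketch(file_type_sketch t = status_unknown) : type(t) {}
  file_type_sketch type;
};

inline bool status_known(file_status_sketch s) { return s.type != status_unknown; }
inline bool is_symlink(file_status_sketch s) { return s.type == symlink_file; }

// Hypothetical stand-in for the expensive operating system query; counts calls.
struct fake_file_system
{
  fake_file_system() : queries(0) {}
  file_status_sketch status(const std::string&)
  {
    ++queries;
    return file_status_sketch(regular_file);
  }
  int queries;
};

class directory_entry_sketch
{
public:
  directory_entry_sketch(fake_file_system& fs, const std::string& p,
                         file_status_sketch st = file_status_sketch(),
                         file_status_sketch symlink_st = file_status_sketch())
    : m_fs(&fs), m_path(p), m_status(st), m_symlink_status(symlink_st) {}

  file_status_sketch status() const
  {
    if (!status_known(m_status))
    {
      // A known non-symlink lstat()-like result is also the stat()-like result.
      if (status_known(m_symlink_status) && !is_symlink(m_symlink_status))
        { m_status = m_symlink_status; }
      else { m_status = m_fs->status(m_path); }   // the only real query
    }
    return m_status;
  }

private:
  fake_file_system*          m_fs;
  std::string                m_path;
  mutable file_status_sketch m_status;
  mutable file_status_sketch m_symlink_status;
};
```

Calling status() twice on the same entry issues at most one query, and an entry constructed with a known non-symlink symlink_st issues none.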

Class template basic_directory_iterator

namespace std
{
  namespace tr2
  {
    namespace sys
    {
      template <class Path>
      class basic_directory_iterator :
        public iterator<input_iterator_tag, basic_directory_entry<Path> >
      {
      public:
        typedef Path path_type;

        // constructors
        basic_directory_iterator();
        explicit basic_directory_iterator(const Path& dp);
        basic_directory_iterator(const Path& dp, error_code& ec);
        basic_directory_iterator(const basic_directory_iterator& bdi);
        basic_directory_iterator& operator=(const basic_directory_iterator& bdi);
       ~basic_directory_iterator();

        // other members as required by
        //  C++ Std, 24.1.1 Input iterators [lib.input.iterators]
      };

    } // namespace sys
  } // namespace tr2
} // namespace std

basic_directory_iterator satisfies the requirements of an input iterator (C++ Std, 24.1.1, Input iterators [lib.input.iterators]).

A basic_directory_iterator reads successive elements from the directory for which it was constructed, as if by calling POSIX readdir_r(). After a basic_directory_iterator is constructed, and every time operator++ is called, it reads and stores a value of basic_directory_entry<Path> and possibly stores associated status values. operator++ is not equality preserving; that is, i == j does not imply that ++i == ++j.

[Note: The practical consequence of not preserving equality is that directory iterators can be used only for single-pass algorithms. --end note]

If the end of the directory elements is reached, the iterator becomes equal to the end iterator value. The constructor basic_directory_iterator() with no arguments always constructs an end iterator object, which is the only legitimate iterator to be used for the end condition. The result of operator* on an end iterator is not defined. For any other iterator value a const basic_directory_entry<Path>& is returned. The result of operator-> on an end iterator is not defined. For any other iterator value a const basic_directory_entry<Path>* is returned.

Two end iterators are always equal. An end iterator is not equal to a non-end iterator.

The above wording is based on the Standard Library's istream_iterator wording. Commentary was shortened and moved into a note.

The result of calling the path() member of the basic_directory_entry object obtained by dereferencing a basic_directory_iterator is a reference to a basic_path object composed of the directory argument from which the iterator was constructed with filename of the directory entry appended as if by operator/=.

[Example: This program accepts an optional command line argument, and if that argument is a directory pathname, iterates over the contents of the directory. For each directory entry, the name is output, and if the entry is for a regular file, the size of the file is output.

#include <iostream>
#include <string>
#include <filesystem>

using namespace std::tr2::sys;
using std::cout;

int main(int argc, char* argv[])
{
  std::string p(argc <= 1 ? "." : argv[1]);

  if (is_directory(p))
  {
    for (directory_iterator itr(p); itr!=directory_iterator(); ++itr)
    {
      cout << itr->path().leaf() << ' '; // display filename only
      if (is_regular(itr->status())) cout << " [" << file_size(itr->path()) << ']';
      cout << '\n';
    }
  }
  else cout << (exists(p) ? "Found: " : "Not found: ") << p << '\n';

  return 0;
}

-- end example]

Directory iteration shall not yield directory entries for the current (dot) and parent (dot dot) directories.

The order of directory entries obtained by dereferencing successive increments of a basic_directory_iterator is unspecified.

[Note: Programs performing directory iteration may wish to test if the path obtained by dereferencing a directory iterator actually exists. It could be a symbolic link to a non-existent file. Programs recursively walking directory trees for purposes of removing and renaming entries may wish to avoid following symbolic links.

If a file is removed from or added to a directory after the construction of a basic_directory_iterator for the directory, it is unspecified whether or not subsequent incrementing of the iterator will ever result in an iterator whose value is the removed or added directory entry. See POSIX readdir_r(). --end note]

basic_directory_iterator constructors

basic_directory_iterator();

Effects: Constructs the end iterator.

explicit basic_directory_iterator(const Path& dp);

Effects: Constructs an iterator representing the first entry in the directory resolved to by dp, if any; otherwise, the end iterator.

[Note: To iterate over the current directory, write directory_iterator(".") rather than directory_iterator(""). -- end note]

basic_directory_iterator(const Path& dp, error_code& ec );

Effects: Constructs an iterator representing the first entry in the directory resolved to by dp, if any; otherwise, the end iterator. If an error occurs while establishing the results, the iterator constructed represents the end iterator and ec is set to the error code reported by the operating system; otherwise ec is set to 0.
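The non-throwing overload lets a caller tell an empty directory apart from a directory that could not be opened. A minimal sketch of that usage follows, assuming a C++17 compiler with std::filesystem standing in for the proposed std::tr2::sys; the helper name count_entries is hypothetical.

```cpp
#include <cstddef>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;  // assumption: stand-in for std::tr2::sys

// Hypothetical helper: counts the entries in dir without throwing.
// On failure it returns 0 and leaves the reason in ec.
std::size_t count_entries(const fs::path& dir, std::error_code& ec)
{
  std::size_t n = 0;
  fs::directory_iterator itr(dir, ec);  // becomes the end iterator on error
  const fs::directory_iterator end;
  while (!ec && itr != end) {
    ++n;
    itr.increment(ec);                  // non-throwing counterpart of ++itr
  }
  return ec ? 0 : n;
}
```

A caller inspects ec after the call: a zero count with ec clear means the directory is empty, a zero count with ec set means it could not be read.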

Class template basic_recursive_directory_iterator

namespace std
{
  namespace tr2
  {
    namespace sys
    {
      template <class Path>
      class basic_recursive_directory_iterator :
        public iterator<input_iterator_tag, basic_directory_entry<Path> >
      {
      public:
        typedef Path path_type;

        // constructors
        basic_recursive_directory_iterator();
        explicit basic_recursive_directory_iterator(const Path& dp);
        basic_recursive_directory_iterator(const basic_recursive_directory_iterator& brdi);
        basic_recursive_directory_iterator& operator=(const basic_recursive_directory_iterator& brdi);
       ~basic_recursive_directory_iterator();

        // observers
        int level() const;

        // modifiers
        void pop();
        void no_push();

        // other members as required by
        //  C++ Std, 24.1.1 Input iterators [lib.input.iterators]

      private:
        int m_level; // for exposition only
      };

    } // namespace sys
  } // namespace tr2
} // namespace std

The behavior of a basic_recursive_directory_iterator is the same as a basic_directory_iterator unless otherwise specified.

[Note: One of the uses of no_push() is to prevent unwanted recursion into symlinked directories. This may be necessary to prevent loops on some operating systems. --end note]
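The pruning described in the note can be sketched as follows. This assumes a C++17 compiler with std::filesystem standing in for the proposed std::tr2::sys: disable_recursion_pending() plays the role of no_push(), and the follow_directory_symlink option is passed so that the iterator would otherwise recurse into symlinked directories. The helper name count_non_symlink_entries is hypothetical.

```cpp
#include <cstddef>
#include <filesystem>

namespace fs = std::filesystem;  // assumption: stand-in for std::tr2::sys

// Hypothetical helper: counts the entries under root that are not symbolic
// links, never descending into a directory reached through a symlink.
std::size_t count_non_symlink_entries(const fs::path& root)
{
  std::size_t n = 0;
  // follow_directory_symlink makes the iterator recurse into symlinked
  // directories unless told otherwise, which is the case the note describes.
  fs::recursive_directory_iterator itr(
      root, fs::directory_options::follow_directory_symlink);
  const fs::recursive_directory_iterator end;
  for (; itr != end; ++itr) {
    if (fs::is_symlink(itr->symlink_status())) {
      itr.disable_recursion_pending();  // plays the role of no_push()
      continue;                         // the link itself is not counted
    }
    ++n;
  }
  return n;
}
```

The test uses symlink_status() rather than status() so that the link itself is examined, not the directory it resolves to.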

Class file_status

namespace std
{
  namespace tr2
  {
    namespace sys
    {
      class file_status
      {
      public:
        explicit file_status( file_type v = status_unknown );

        file_type type() const;
        void type( file_type v );
      };
    } // namespace sys
  } // namespace tr2
} // namespace std

A file_status object stores information about the status of a file. The internal form of the stored information is unspecified.

[Note: The class may be extended in the future to store additional status information. --end note]

Members

explicit file_status( file_type v = status_unknown );

Effects: Stores v.

file_type type() const;

Returns: The stored file_type.

void type( file_type v );

Effects: Stores v, replacing the previously stored value.

Non-member operational functions

Status functions

template <class Path> file_status status(const Path& p, error_code& ec);
template <class Path> file_status symlink_status(const Path& p, error_code& ec);

Returns:

For status, determine the attributes of p as if by POSIX stat(), for symlink_status determine the attributes as if by POSIX lstat().

[Note: For symbolic links, stat() continues pathname resolution using the contents of the symbolic link, lstat() does not. -- end note]

If the operating system reports an error during attribute determination:

Otherwise:

[Note: directory_file implies basic_directory_iterator on the file would succeed, and regular_file implies appropriate <fstream> operations would succeed, assuming no hardware, permission, access, or race condition errors. For regular_file, the converse is not true; lack of regular_file does not necessarily imply <fstream> operations would fail on a directory. -- end note]
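The difference between the two functions matters most for a symbolic link whose target is missing. A minimal sketch of detecting that case follows, assuming a C++17 compiler with std::filesystem standing in for the proposed std::tr2::sys (where the not-found file type is spelled file_type::not_found); the helper name is_dangling_symlink is hypothetical.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;  // assumption: stand-in for std::tr2::sys

// Hypothetical helper: true if p is a symbolic link whose target is missing.
bool is_dangling_symlink(const fs::path& p)
{
  std::error_code ec;
  const fs::file_status link = fs::symlink_status(p, ec);  // as if by lstat()
  if (ec || !fs::is_symlink(link)) return false;
  const fs::file_status target = fs::status(p, ec);        // as if by stat()
  return target.type() == fs::file_type::not_found;
}
```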

template <class Path> file_status status(const Path& p);

Effects: error_code ec;
              file_status stat(status(p, ec));

Throws: basic_filesystem_error<Path> if ec != 0

Returns: stat

template <class Path> file_status symlink_status(const Path& p);

Effects: error_code ec;
              file_status stat(symlink_status(p, ec));

Throws: basic_filesystem_error<Path> if ec != 0

Returns: stat

Predicate functions

bool status_known(file_status s);

Returns: s.type() != status_unknown

bool exists(file_status s);

Returns: status_known(s) && s.type() != file_not_found

template <class Path> bool exists(const Path& p);

Returns: exists( status(p) )

bool is_regular(file_status s);

Returns: s.type() == regular_file

template <class Path> bool is_regular(const Path& p);

Returns: is_regular( status(p) )

bool is_directory(file_status s);

Returns: s.type() == directory_file

template <class Path> bool is_directory(const Path& p);

Returns: is_directory( status(p) )

bool is_symlink(file_status s);

Returns: s.type() == symlink_file

template <class Path> bool is_symlink(const Path& p);

Returns: is_symlink( symlink_status(p) )

bool is_other(file_status s);

Returns: exists(s) && !is_regular(s) && !is_directory(s) && !is_symlink(s)

[Note: The specification of is_other() will remain unchanged even if additional is_xxx() functions are added in the future. -- end note]

template <class Path> bool is_other(const Path& p);

Returns: is_other( status(p) )
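The file_status overloads of the predicates are simple enough to restate directly in code. The following sketch is illustrative only: it lives in a hypothetical namespace sketch, stores the type in an exposition-only member, and lists a block_file enumerator purely as an assumed example of a type for which is_other() holds.

```cpp
namespace sketch {

// Subset of the file_type values used by the predicates. block_file is an
// assumption of this sketch, included so is_other() has something to report.
enum file_type { status_unknown, file_not_found, regular_file,
                 directory_file, symlink_file, block_file };

class file_status {
public:
  explicit file_status(file_type v = status_unknown) : m_type(v) {}
  file_type type() const { return m_type; }
  void type(file_type v) { m_type = v; }
private:
  file_type m_type;  // for exposition only
};

inline bool status_known(file_status s) { return s.type() != status_unknown; }
inline bool exists(file_status s) { return status_known(s) && s.type() != file_not_found; }
inline bool is_regular(file_status s) { return s.type() == regular_file; }
inline bool is_directory(file_status s) { return s.type() == directory_file; }
inline bool is_symlink(file_status s) { return s.type() == symlink_file; }

// Defined by exclusion, so it keeps its meaning if more types are added.
inline bool is_other(file_status s)
{
  return exists(s) && !is_regular(s) && !is_directory(s) && !is_symlink(s);
}

} // namespace sketch
```

Note that an unknown status is neither "exists" nor "other": every predicate except status_known() yields false for it.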

template <class Path> bool is_empty(const Path& p);

Effects: Determines file_status s, as if by status(p).

Throws: basic_filesystem_error<Path> if !exists(s) || is_other(s).

Returns: is_directory(s)
         ? basic_directory_iterator<Path>(p) == basic_directory_iterator<Path>()
         : file_size(p) == 0;
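The Throws and Returns clauses above translate almost line for line into code. A minimal sketch follows, assuming a C++17 compiler with std::filesystem standing in for the proposed std::tr2::sys; the name is_empty_sketch is hypothetical, and std::runtime_error is used in place of basic_filesystem_error.

```cpp
#include <filesystem>
#include <stdexcept>

namespace fs = std::filesystem;  // assumption: stand-in for std::tr2::sys

// Hypothetical restatement of the is_empty() clauses.
bool is_empty_sketch(const fs::path& p)
{
  const fs::file_status s = fs::status(p);
  if (!fs::exists(s) || fs::is_other(s))
    throw std::runtime_error("is_empty_sketch: not a directory or regular file");
  return fs::is_directory(s)
    ? fs::directory_iterator(p) == fs::directory_iterator()  // no entries at all
    : fs::file_size(p) == 0;                                  // zero-length file
}
```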

template <class Path1, class Path2> bool equivalent(const Path1& p1, const Path2& p2);

Requires: Path1::external_string_type and Path2::external_string_type are the same type.

Effects: Determines file_status s1 and s2, as if by status(p1) and status(p2), respectively.

Throws: basic_filesystem_error<Path1> if (!exists(s1) && !exists(s2)) || (is_other(s1) && is_other(s2)).

Returns: true, if s1 == s2 and p1 and p2 resolve to the same file system entity, else false.

Two paths are considered to resolve to the same file system entity if two candidate entities reside on the same device at the same location. This is determined as if by the values of the POSIX stat structure, obtained as if by stat() for the two paths, having equal st_dev values and equal st_ino values.

[Note: POSIX requires that "st_dev must be unique within a Local Area Network". Conservative POSIX implementations may also wish to check for equal st_size and st_mtime values. Windows implementations may use GetFileInformationByHandle() as a surrogate for stat(), and consider "same" to be equal values for dwVolumeSerialNumber, nFileIndexHigh, nFileIndexLow, nFileSizeHigh, nFileSizeLow, ftLastWriteTime.dwLowDateTime, and ftLastWriteTime.dwHighDateTime. -- end note]
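On a POSIX system the "same device, same location" test reduces to comparing two fields of the stat structure. The following is a minimal POSIX-only sketch of that comparison; the helper name same_entity is hypothetical, and unlike equivalent() it reports false rather than throwing when a path cannot be resolved.

```cpp
#include <sys/stat.h>  // POSIX only

// Hypothetical helper: true if both paths can be stat()-ed and name the
// same device and the same inode on that device.
bool same_entity(const char* p1, const char* p2)
{
  struct stat s1, s2;
  if (::stat(p1, &s1) != 0 || ::stat(p2, &s2) != 0) return false;
  return s1.st_dev == s2.st_dev && s1.st_ino == s2.st_ino;
}
```

Because stat() follows symbolic links, two different pathnames that resolve to one file compare as the same entity.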

Attribute functions

[Note: A strictly limited number of attribute functions are provided because few file system attributes are portable. Even the functions provided will be impossible to implement on some file systems. --end note]

template <class Path> const Path& initial_path();

Returns: current_path() at the time of entry to main().

[Note: These semantics turn a dangerous global variable into a safer global constant. --end note]

[Note: Full implementation requires runtime library support. Implementations which cannot provide runtime library support are encouraged to instead store the value of current_path() at the first call of initial_path(), and return this value for all subsequent calls. Programs using initial_path() are encouraged to call it immediately on entrance to main() so that they will work correctly with such partial implementations. --end note]
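The fallback suggested in the note, capturing current_path() on the first call, can be sketched with a function-local static. This assumes a C++17 compiler with std::filesystem standing in for the proposed std::tr2::sys; the name initial_path_sketch is hypothetical.

```cpp
#include <filesystem>

namespace fs = std::filesystem;  // assumption: stand-in for std::tr2::sys

// Hypothetical fallback: remembers current_path() as of the first call and
// returns that same value on every later call.
const fs::path& initial_path_sketch()
{
  static const fs::path initial = fs::current_path();
  return initial;
}
```

With such an implementation the value is only as "initial" as the first call, which is why programs are encouraged to call it immediately on entrance to main().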

template <class Path> Path current_path();

Returns: The current path, as if by POSIX getcwd().

Postcondition: current_path().is_complete()

[Note: The current path as returned by many operating systems is a dangerous global variable. It may be changed unexpectedly by third-party or system library functions, or by another thread. Although dangerous, the function is useful in dealing with other libraries. For a safer alternative, see initial_path(). The current_path() name was chosen to emphasize that the return is a complete path, not just a single directory name. -- end note]

template <class Path> uintmax_t file_size(const Path& p);

Returns: The size in bytes of the file p resolves to, determined as if by the value of the POSIX stat structure member st_size obtained as if by POSIX stat().

template <class Path> space_info space(const Path& p);

Returns: A space_info object. The value of the space_info object is determined as if by using POSIX statvfs() to obtain a POSIX struct statvfs, and then multiplying its f_blocks, f_bfree, and f_bavail members by its f_frsize member, and assigning the results to the capacity, free, and available members respectively. Any members for which the value cannot be determined shall be set to -1.
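The arithmetic in the Returns clause can be sketched directly against POSIX statvfs(). The code below is POSIX-only and illustrative: space_info_sketch and space_sketch are hypothetical names, and the struct merely mirrors the three members named above.

```cpp
#include <sys/statvfs.h>  // POSIX only
#include <cstdint>

// Hypothetical stand-in for space_info.
struct space_info_sketch {
  std::uintmax_t capacity;
  std::uintmax_t free;
  std::uintmax_t available;
};

// Each block count is scaled by the fragment size f_frsize to give bytes.
space_info_sketch space_sketch(const char* p)
{
  const std::uintmax_t unknown = static_cast<std::uintmax_t>(-1);
  space_info_sketch info = { unknown, unknown, unknown };  // -1 if undetermined
  struct statvfs vfs;
  if (::statvfs(p, &vfs) == 0) {
    const std::uintmax_t unit = vfs.f_frsize;
    info.capacity  = vfs.f_blocks * unit;
    info.free      = vfs.f_bfree  * unit;
    info.available = vfs.f_bavail * unit;
  }
  return info;
}
```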

template <class Path> std::time_t last_write_time(const Path& p);

Returns: The time of last data modification of p, determined as if by the value of the POSIX stat structure member st_mtime obtained as if by POSIX stat().

template <class Path> void last_write_time(const Path& p, const std::time_t new_time);

Effects: Sets the time of last data modification of the file resolved to by p to new_time, as if by POSIX stat() followed by POSIX utime().

[Note: The apparent postcondition last_write_time(p) == new_time is not specified since it would not hold for many file systems due to coarse time mechanism granularity. -- end note]

Other operations functions

template <class Path> bool create_directory(const Path& dp);

Effects: Attempts to create the directory dp resolves to, as if by POSIX mkdir() with a second argument of S_IRWXU|S_IRWXG|S_IRWXO.

Throws: basic_filesystem_error<Path> if Effects fails for any reason other than because the directory already exists.

Returns: True if a new directory was created, otherwise false.

Postcondition: is_directory(dp)

template <class Path1, class Path2>
  error_code create_hard_link(const Path1& to_p, const Path2& from_p, error_code& ec);

Requires: Path1::external_string_type and Path2::external_string_type are the same type.

Effects: Establishes the postcondition, as if by POSIX link().

Returns: If the postcondition cannot be established, a system error code indicating the reason for the failure, otherwise 0.

Postcondition:

[Note: Some operating systems do not support hard links or support them only for regular files. Some operating systems limit the number of links per file. Some file systems do not support hard links at all; the FAT file system used on floppy discs, memory cards and flash drives is one example. Thus hard links should be avoided if wide portability is a concern. -- end note]

template <class Path1, class Path2>
  void create_hard_link(const Path1& to_p, const Path2& from_p);

Requires: Path1::external_string_type and Path2::external_string_type are the same type.

Effects: As if error_code ec; create_hard_link( to_p, from_p, ec );

Throws: basic_filesystem_error<Path1, Path2> if ec is not zero.

template <class Path1, class Path2>
  error_code create_symlink(const Path1& to_p, const Path2& from_p, error_code& ec);

Requires: Path1::external_string_type and Path2::external_string_type are the same type.

Effects: Establishes the postcondition, as if by POSIX symlink().

Returns: If the postcondition cannot be established, a system error code indicating the reason for the failure, otherwise 0.

Postcondition: from_p resolves to a symbolic link file that contains an unspecified representation of to_p.

[Note: Some operating systems do not support symbolic links at all or support them only for regular files. Thus symbolic links should be avoided if code portability is a concern. -- end note]

template <class Path1, class Path2>
  void create_symlink(const Path1& to_p, const Path2& from_p);

Requires: Path1::external_string_type and Path2::external_string_type are the same type.

Effects: As if error_code ec; create_symlink( to_p, from_p, ec );

Throws: basic_filesystem_error<Path1, Path2> if ec is not zero.

template <class Path> bool remove(const Path& p);

Precondition: !p.empty()

Effects:  Attempts to delete the file p resolves to, as if by POSIX remove().

Returns: The value of exists(p) prior to the establishment of the postcondition.

Postcondition: !exists(p)

Throws: basic_filesystem_error<Path> if:

[Note: A symbolic link is itself removed, rather than what it resolves to being removed. -- end note]

template <class Path1, class Path2> void rename(const Path1& from_p, const Path2& to_p);

Requires: Path1::external_string_type and Path2::external_string_type are the same type.

Effects: Renames from_p to to_p, as if by POSIX rename().

Postconditions: !exists(from_p) && exists(to_p), and the contents and attributes of the file originally named from_p are otherwise unchanged.

[Note: If from_p and to_p resolve to the same file, no action is taken. Otherwise, if to_p resolves to an existing file, it is removed. A symbolic link is itself renamed, rather than the file it resolves to being renamed. -- end note]

template <class Path1, class Path2> void copy_file(const Path1& from_fp, const Path2& to_fp);

Requires: Path1::external_string_type and Path2::external_string_type are the same type.

Effects: The contents and attributes of the file from_fp resolves to are copied to the file to_fp resolves to.

Throws: basic_filesystem_error<Path> if from_fp.empty() || to_fp.empty() || !exists(from_fp) || !is_regular(from_fp) || exists(to_fp)

template <class Path> Path complete(const Path& p, const Path& base=initial_path<Path>());

Effects: Composes a complete path from p and base, using the following rules:

                      p.has_root_directory()    !p.has_root_directory()
p.has_root_name()     p                         precondition failure
!p.has_root_name()    base.root_name() / p      base / p

Returns: The composed path.

Postcondition: For the returned path, rp, rp.is_complete() is true.

Throws: If !(base.is_complete() && (p.is_complete() || !p.has_root_name()))

[Note: When portable behavior is required, use complete(). When operating system dependent behavior is required, use system_complete().

Portable behavior is useful when dealing with paths created internally within a program, particularly if the program should exhibit the same behavior on all operating systems.

Operating system dependent behavior is useful when dealing with paths supplied by user input, reported to program users, or when such behavior is expected by program users. -- end note]
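The composition rules and the Throws clause for complete() can be sketched in a few lines. This assumes a C++17 compiler with std::filesystem standing in for the proposed std::tr2::sys, so is_absolute() appears where the proposal has is_complete(); the name complete_sketch is hypothetical, and std::invalid_argument stands in for the unspecified exception.

```cpp
#include <filesystem>
#include <stdexcept>

namespace fs = std::filesystem;  // assumption: stand-in for std::tr2::sys

// Hypothetical restatement of the complete() rules.
fs::path complete_sketch(const fs::path& p, const fs::path& base)
{
  // is_absolute() plays the role of is_complete().
  if (!(base.is_absolute() && (p.is_absolute() || !p.has_root_name())))
    throw std::invalid_argument("complete_sketch: precondition failure");
  if (p.has_root_name()) return p;                          // already complete
  if (p.has_root_directory()) return base.root_name() / p;  // borrow root name only
  return base / p;                                          // fully relative
}
```

On POSIX root_name() is always empty, so the middle case simply returns p unchanged; the distinction only matters on systems with drive or volume names.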

template <class Path> Path system_complete(const Path& p);

Effects: Composes a complete path from p, using the same rules used by the operating system to resolve a path passed as the filename argument to standard library open functions.

Returns: The composed path.

Postcondition: For the returned path, rp, rp.is_complete() is true.

Throws: If p.empty().

[Note: For POSIX, system_complete(p) has the same semantics as complete(p, current_path()).

For Windows, system_complete(p) has the same semantics as complete(p, current_path()) if p.is_complete() || !p.has_root_name(), or if p and current_path() have the same root_name(). Otherwise it acts like complete(p, kinky), where kinky is the current directory for the p.root_name() drive. This will be the current directory of that drive the last time it was set, and thus may be residue left over from a prior program run by the command processor! Although these semantics are often useful, they are also very error-prone.

See complete() note for usage suggestions. -- end note]

Convenience functions

template <class Path> bool create_directories(const Path & p);

Requires: p.empty() ||
forall px: px == p || is_parent(px, p): is_directory(px) || !exists( px )

Returns: The value of !exists(p) prior to the establishment of the postcondition.

Postcondition: is_directory(p)

Throws:  basic_filesystem_error<Path> if exists(p) && !is_directory(p)
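One way to meet the create_directories() postcondition is to create the parent chain recursively and then the final element. A minimal sketch follows, assuming a C++17 compiler with std::filesystem standing in for the proposed std::tr2::sys (parent_path() appears where the proposal has branch_path()); the name create_directories_sketch is hypothetical.

```cpp
#include <filesystem>

namespace fs = std::filesystem;  // assumption: stand-in for std::tr2::sys

// Hypothetical helper: creates p and any missing parents.
// Returns true if the final directory was newly created.
bool create_directories_sketch(const fs::path& p)
{
  if (p.empty() || fs::is_directory(p)) return false;  // nothing to create
  const fs::path parent = p.parent_path();             // branch_path() in the proposal
  if (!parent.empty() && parent != p)
    create_directories_sketch(parent);                 // ensure the parent chain first
  return fs::create_directory(p);                      // then the final element
}
```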

template <class Path> typename Path::string_type extension(const Path & p);

Returns: if p.leaf() contains a dot, returns the substring of p.leaf() starting at the rightmost dot and ending at the string's end. Otherwise, returns an empty string.

[Note: The dot is included in the return value so that it is possible to distinguish between no extension and an empty extension.

Implementations are permitted but not required to define additional behavior for file systems which append additional elements to extensions, such as alternate data stream or partitioned dataset names. -- end note]

template <class Path> typename Path::string_type basename(const Path & p);

Returns: if p.leaf() contains a dot, returns the substring of p.leaf() starting at its beginning and ending at the last dot (the dot is not included). Otherwise, returns p.leaf().

template <class Path>
  Path replace_extension(const Path & p, const typename Path::string_type & new_extension);

Postcondition: basename(return_value) == basename(p) && extension(return_value) == new_extension

[Note: It follows from the semantics of extension() that new_extension should include dot to achieve reasonable results. -- end note]
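The three Returns and Postcondition clauses above amount to simple string operations on the leaf. The sketch below works on a plain std::string holding a leaf rather than on a Path, and all names in namespace sketch are hypothetical.

```cpp
#include <string>

namespace sketch {

// Substring from the rightmost dot to the end, or empty if there is no dot.
std::string leaf_extension(const std::string& leaf)
{
  const std::string::size_type dot = leaf.rfind('.');
  return dot == std::string::npos ? std::string() : leaf.substr(dot);
}

// Substring from the beginning up to, but not including, the rightmost dot.
std::string leaf_basename(const std::string& leaf)
{
  const std::string::size_type dot = leaf.rfind('.');
  return dot == std::string::npos ? leaf : leaf.substr(0, dot);
}

// Keeps the basename and appends new_extension exactly as given.
std::string replace_leaf_extension(const std::string& leaf,
                                   const std::string& new_extension)
{
  return leaf_basename(leaf) + new_extension;
}

} // namespace sketch
```

For the leaf "archive.tar.gz" the extension is ".gz" and the basename is "archive.tar". Passing "md" instead of ".md" as new_extension turns "notes.txt" into "notesmd", which is why the note recommends including the dot.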

Additions to header <cerrno>

The header <cerrno> shall include an additional symbolic constant macro for each of the values returned by the to_errno function. The macro names shall be as defined in POSIX errno.h, with the additions below.

This codifies existing practice. The required names are only a subset of those defined by POSIX, and are usually already supplied in <errno.h> (as wrapped by <cerrno>) as shipped with POSIX and Windows compilers. These implementations require no changes to their underlying C headers to conform with the above requirement.

Name          Meaning
EBADHANDLE    Bad operating system handle.
EOTHER        Other error.

Additions to header <fstream>

These additions have been carefully specified to avoid breaking existing code in common operating environments such as POSIX, Windows, and OpenVMS. See Suggestions for <fstream> implementations for techniques to avoid breaking existing code in other environments, particularly on operating systems allowing slashes in filenames.

[Note: The "do-the-right-thing" rule from Requirements on implementations does apply to header <fstream>.

The overloads below are specified as additions rather than replacements for existing functions. This preserves existing code (perhaps using a home-grown path class) that relies on an automatic conversion to const char*. -- end note]
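The practical effect of the added overloads is that a path object can be handed to a file stream directly. A minimal sketch of that usage follows. It assumes a C++17 compiler, whose file streams accept a std::filesystem::path, as a stand-in for the overloads proposed here; the helper name round_trip_line is hypothetical.

```cpp
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;  // assumption: stand-in for std::tr2::sys

// Hypothetical helper: writes text to p, then reads the first line back,
// passing the path object itself to both stream constructors.
std::string round_trip_line(const fs::path& p, const std::string& text)
{
  {
    std::ofstream out(p);  // path overload; no conversion to const char* needed
    out << text << '\n';
  }                        // out is closed here, before the file is reopened
  std::ifstream in(p);
  std::string line;
  std::getline(in, line);
  return line;
}
```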

In 27.8.1.1 Class template basic_filebuf [lib.filebuf] synopsis preceding paragraph 1, add the function:

template <class Path> basic_filebuf<charT,traits>* open(const Path& p, ios_base::openmode mode);

In 27.8.1.3 Member functions [lib.filebuf.members], add the above to the signature preceding paragraph 2, and replace the sentence:

It then opens a file, if possible, whose name is the NTBS s (“as if” by calling std::fopen(s ,modstr )).

with:

It then opens, if possible, the file that p or path(s) resolves to, “as if” by calling std::fopen() with a second argument of modstr.

In 27.8.1.5 Class template basic_ifstream [lib.ifstream] synopsis preceding paragraph 1, add the functions:

template <class Path> explicit basic_ifstream(const Path& p, ios_base::openmode mode = ios_base::in);
template <class Path> void open(const Path& p, ios_base::openmode mode = ios_base::in);

In 27.8.1.6 basic_ifstream constructors [lib.ifstream.cons] add the above constructor to the signature preceding paragraph 2, and in paragraph 2 replace

rdbuf()->open(s, mode | ios_base::in)

with

rdbuf()->open(path(s), mode | ios_base::in) or rdbuf()->open(p, mode | ios_base::in) as appropriate

In 27.8.1.7 Member functions [lib.ifstream.members] add the above open function to the signature preceding paragraph 3, and in paragraph 3 replace

rdbuf()->open(s, mode | ios_base::in)

with

rdbuf()->open(path(s), mode | ios_base::in) or rdbuf()->open(p, mode | ios_base::in) as appropriate

In 27.8.1.8 Class template basic_ofstream [lib.ofstream] synopsis preceding paragraph 1, add the functions:

template <class Path> explicit basic_ofstream(const Path& p, ios_base::openmode mode = ios_base::out);
template <class Path> void open(const Path& p, ios_base::openmode mode = ios_base::out);

In 27.8.1.9 basic_ofstream constructors [lib.ofstream.cons] add the above constructor to the signature preceding paragraph 2, and in paragraph 2 replace

rdbuf()->open(s, mode | ios_base::out)

with

rdbuf()->open(path(s), mode | ios_base::out) or rdbuf()->open(p, mode | ios_base::out) as appropriate

In 27.8.1.10 Member functions [lib.ofstream.members] add the above open function to the signature preceding paragraph 3, and in paragraph 3 replace

rdbuf()->open(s, mode | ios_base::out)

with

rdbuf()->open(path(s), mode | ios_base::out) or rdbuf()->open(p, mode | ios_base::out) as appropriate

In 27.8.1.11 Class template basic_fstream [lib.fstream] synopsis preceding paragraph 1, add the functions:

template <class Path> explicit basic_fstream(const Path& p, ios_base::openmode mode = ios_base::in|ios_base::out);
template <class Path> void open(const Path& p, ios_base::openmode mode = ios_base::in|ios_base::out);

In 27.8.1.12 basic_fstream constructors [lib.fstream.cons] add the above constructor to the signature preceding paragraph 2, and in paragraph 2 replace

rdbuf()->open(s, mode)

with

rdbuf()->open(path(s), mode) or rdbuf()->open(p, mode) as appropriate

In 27.8.1.13 Member functions [lib.fstream.members] add the above open function to the signature preceding paragraph 3, and in paragraph 3 replace

rdbuf()->open(s, mode)

with

rdbuf()->open(path(s), mode) or rdbuf()->open(p, mode) as appropriate

End of proposed text.

Path decomposition table

The table is generated by a program compiled with the Boost implementation.

Where POSIX and Windows implementations yield different results, the cell gives both values: the POSIX result, marked P:, followed by the Windows result, marked W:.

Constructor argument | Elements found by iteration | string() | file_string() | root_path().string() | root_name() | root_directory() | relative_path().string() | branch_path().string() | leaf()
"" | "" | "" | "" | "" | "" | "" | "" | "" | ""
"." | "." | "." | "." | "" | "" | "" | "." | "" | "."
".." | ".." | ".." | ".." | "" | "" | "" | ".." | "" | ".."
"foo" | "foo" | "foo" | "foo" | "" | "" | "" | "foo" | "" | "foo"
"/" | "/" | "/" | P: "/" W: "\" | "/" | "" | "/" | "" | "" | "/"
"/foo" | "/","foo" | "/foo" | P: "/foo" W: "\foo" | "/" | "" | "/" | "foo" | "/" | "foo"
"foo/" | "foo","." | "foo/" | P: "foo/" W: "foo\" | "" | "" | "" | "foo/" | "foo" | "."
"/foo/" | "/","foo","." | "/foo/" | P: "/foo/" W: "\foo\" | "/" | "" | "/" | "foo/" | "/foo" | "."
"foo/bar" | "foo","bar" | "foo/bar" | P: "foo/bar" W: "foo\bar" | "" | "" | "" | "foo/bar" | "foo" | "bar"
"/foo/bar" | "/","foo","bar" | "/foo/bar" | P: "/foo/bar" W: "\foo\bar" | "/" | "" | "/" | "foo/bar" | "/foo" | "bar"
"///foo///" | "/","foo","." | "///foo///" | P: "///foo///" W: "\foo\\\" | "/" | "" | "/" | "foo///" | "///foo" | "."
"///foo///bar" | "/","foo","bar" | "///foo///bar" | P: "///foo///bar" W: "\foo\\\bar" | "/" | "" | "/" | "foo///bar" | "///foo" | "bar"
"/." | "/","." | "/." | P: "/." W: "\." | "/" | "" | "/" | "." | "/" | "."
"./" | ".","." | "./" | P: "./" W: ".\" | "" | "" | "" | "./" | "." | "."
"/.." | "/",".." | "/.." | P: "/.." W: "\.." | "/" | "" | "/" | ".." | "/" | ".."
"../" | "..","." | "../" | P: "../" W: "..\" | "" | "" | "" | "../" | ".." | "."
"foo/." | "foo","." | "foo/." | P: "foo/." W: "foo\." | "" | "" | "" | "foo/." | "foo" | "."
"foo/.." | "foo",".." | "foo/.." | P: "foo/.." W: "foo\.." | "" | "" | "" | "foo/.." | "foo" | ".."
"foo/./" | "foo",".","." | "foo/./" | P: "foo/./" W: "foo\.\" | "" | "" | "" | "foo/./" | "foo/." | "."
"foo/./bar" | "foo",".","bar" | "foo/./bar" | P: "foo/./bar" W: "foo\.\bar" | "" | "" | "" | "foo/./bar" | "foo/." | "bar"
"foo/.." | "foo",".." | "foo/.." | P: "foo/.." W: "foo\.." | "" | "" | "" | "foo/.." | "foo" | ".."
"foo/../" | "foo","..","." | "foo/../" | P: "foo/../" W: "foo\..\" | "" | "" | "" | "foo/../" | "foo/.." | "."
"foo/../bar" | "foo","..","bar" | "foo/../bar" | P: "foo/../bar" W: "foo\..\bar" | "" | "" | "" | "foo/../bar" | "foo/.." | "bar"
"c:" | "c:" | "c:" | "c:" | P: "" W: "c:" | P: "" W: "c:" | "" | P: "c:" W: "" | "" | "c:"
"c:/" | P: "c:","." W: "c:","/" | "c:/" | P: "c:/" W: "c:\" | P: "" W: "c:/" | P: "" W: "c:" | P: "" W: "/" | P: "c:/" W: "" | "c:" | P: "." W: "/"
"c:foo" | P: "c:foo" W: "c:","foo" | "c:foo" | "c:foo" | P: "" W: "c:" | P: "" W: "c:" | "" | P: "c:foo" W: "foo" | P: "" W: "c:" | P: "c:foo" W: "foo"
"c:/foo" | P: "c:","foo" W: "c:","/","foo" | "c:/foo" | P: "c:/foo" W: "c:\foo" | P: "" W: "c:/" | P: "" W: "c:" | P: "" W: "/" | P: "c:/foo" W: "foo" | P: "c:" W: "c:/" | "foo"
"c:foo/" | P: "c:foo","." W: "c:","foo","." | "c:foo/" | P: "c:foo/" W: "c:foo\" | P: "" W: "c:" | P: "" W: "c:" | "" | P: "c:foo/" W: "foo/" | "c:foo" | "."
"c:/foo/" | P: "c:","foo","." W: "c:","/","foo","." | "c:/foo/" | P: "c:/foo/" W: "c:\foo\" | P: "" W: "c:/" | P: "" W: "c:" | P: "" W: "/" | P: "c:/foo/" W: "foo/" | "c:/foo" | "."
"c:/foo/bar" | P: "c:","foo","bar" W: "c:","/","foo","bar" | "c:/foo/bar" | P: "c:/foo/bar" W: "c:\foo\bar" | P: "" W: "c:/" | P: "" W: "c:" | P: "" W: "/" | P: "c:/foo/bar" W: "foo/bar" | "c:/foo" | "bar"
"prn:" | "prn:" | "prn:" | "prn:" | P: "" W: "prn:" | P: "" W: "prn:" | "" | P: "prn:" W: "" | "" | "prn:"
"c:\" | P: "c:\" W: "c:","/" | P: "c:\" W: "c:/" | "c:\" | P: "" W: "c:/" | P: "" W: "c:" | P: "" W: "/" | P: "c:\" W: "" | P: "" W: "c:" | P: "c:\" W: "/"
"c:foo" | P: "c:foo" W: "c:","foo" | "c:foo" | "c:foo" | P: "" W: "c:" | P: "" W: "c:" | "" | P: "c:foo" W: "foo" | P: "" W: "c:" | P: "c:foo" W: "foo"
"c:\foo" | P: "c:\foo" W: "c:","/","foo" | P: "c:\foo" W: "c:/foo" | "c:\foo" | P: "" W: "c:/" | P: "" W: "c:" | P: "" W: "/" | P: "c:\foo" W: "foo" | P: "" W: "c:/" | P: "c:\foo" W: "foo"
"c:foo\" | P: "c:foo\" W: "c:","foo","." | P: "c:foo\" W: "c:foo/" | "c:foo\" | P: "" W: "c:" | P: "" W: "c:" | "" | P: "c:foo\" W: "foo/" | P: "" W: "c:foo" | P: "c:foo\" W: "."
"c:\foo\" | P: "c:\foo\" W: "c:","/","foo","." | P: "c:\foo\" W: "c:/foo/" | "c:\foo\" | P: "" W: "c:/" | P: "" W: "c:" | P: "" W: "/" | P: "c:\foo\" W: "foo/" | P: "" W: "c:/foo" | P: "c:\foo\" W: "."
"c:\foo/" | P: "c:\foo","." W: "c:","/","foo","." | P: "c:\foo/" W: "c:/foo/" | P: "c:\foo/" W: "c:\foo\" | P: "" W: "c:/" | P: "" W: "c:" | P: "" W: "/" | P: "c:\foo/" W: "foo/" | P: "c:\foo" W: "c:/foo" | "."
"c:/foo\bar" | P: "c:","foo\bar" W: "c:","/","foo","bar" | P: "c:/foo\bar" W: "c:/foo/bar" | P: "c:/foo\bar" W: "c:\foo\bar" | P: "" W: "c:/" | P: "" W: "c:" | P: "" W: "/" | P: "c:/foo\bar" W: "foo/bar" | P: "c:" W: "c:/foo" | P: "foo\bar" W: "bar"

Suggestions for <fstream> implementations

The change in semantics to functions taking const char* arguments can break existing code, but only on operating systems where implementations don't implicitly accept native format pathnames or operating systems that allow slashes in filenames. Thus on POSIX, Windows, and OpenVMS, for example, there is no problem if the implementation follows encouraged behavior.

For most of the Filesystem Library, there is no existing code, so the issue of preserving existing code that uses slashes in filenames doesn't arise. New code simply must use basic_path constructors with path_format_t arguments of native. To preserve existing fstream code that uses slashes in filenames, an implementation may wish to provide a mechanism such as a macro to control selection of the old behavior.

Implementations are already required by the TR front-matter to provide a mechanism such as a macro to control selection of the old behavior (useful to guarantee protection of existing code) or new behavior (useful in new code, and code being ported from other systems) for headers. Because use of the rest of the Filesystem Library is independent of use of the <fstream> additions, affected implementations are encouraged to allow disabling the <fstream> additions separately from other TR features.

A rejected alternative was to supply new fstream classes in namespace sys, inheriting from the current classes, overriding the constructors and opens taking pathname arguments, and providing the additional overloads. In Lillehammer, LWG members indicated lack of support for this alternative, feeling that costs outweigh benefits.

Issues

1. Return type of certain basic_path members returning strings. [Howard Hinnant]

For member functions described as returning "const string_type" or "const external_string_type", implementations are permitted to return "const string_type&" or  "const external_string_type&" respectively.

This allows implementations to avoid unnecessary copies. Return-by-value is specified as const to ensure programs won't break if moved to a return-by-reference implementation.

For example, the Boost implementation keeps the internal representation of a pathname in the portable format, so string() returns by reference and is inlined:

const string_type & string() const { return m_path; }

Howard Hinnant comments: This may inhibit optimization if rvalue reference is accepted.  Const-qualified return types can't be moved from.  I'd rather see either the return type specified as const string_type& or string_type.

Beman Dawes comments: I can't make up my mind. Removing the const will bite users, but not very often. OTOH, excessive copying is a real concern, and if move semantics can alleviate that, I'm all for it. What does the LWG think?
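The optimization concern can be made concrete with a small experiment. The sketch below assumes a compiler that implements rvalue references (C++11 or later), which postdates this proposal; the type Tracked and both factory functions are hypothetical, and the counters record only which assignment operator is selected.

```cpp
// Hypothetical type that records which assignment operator runs.
struct Tracked {
  static int copy_assigns;
  static int move_assigns;
  Tracked() = default;
  Tracked(const Tracked&) = default;
  Tracked(Tracked&&) = default;
  Tracked& operator=(const Tracked&) { ++copy_assigns; return *this; }
  Tracked& operator=(Tracked&&) { ++move_assigns; return *this; }
};
int Tracked::copy_assigns = 0;
int Tracked::move_assigns = 0;

// Const-qualified return: the result cannot bind to Tracked&&.
const Tracked by_const_value() { return Tracked(); }

// Plain return: the result is a movable rvalue.
Tracked by_value() { return Tracked(); }
```

Assigning the result of by_const_value() to an existing Tracked object selects the copy assignment operator, while assigning the result of by_value() selects the move assignment operator. For a string-like type that difference is a full copy of the character data versus a pointer exchange.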

2. Basic_path canonize() and normalize() removed. [Beman Dawes]

The Boost implementation has basic_path functions canonize() and normalize() which return cleaned up string representations of a pathname. They have been removed from the proposal as messy to specify and implement, not hugely useful, and possible to implement by users as non-member functions without any loss of functionality or efficiency. There was also a concern the proposal was getting a bit large.

These functions can be added later as convenience functions if the LWG so desires.

3. Filename checking functions. [Beman Dawes]

Boost has a set of predicate functions that determine if a filename is valid for a particular operating or file system. These can be used as building blocks for functions that determine if an entire pathname is valid for a particular operating or file system.

Users can use these functions to ensure that pathnames are in fact portable to target operating or file systems, without having to actually test on the target systems.

These functions are not included in the proposal because of lack of time, and uncertainty as to their fit with the Standard Library. They can be added later if the LWG so desires.

Acknowledgements

This Filesystem Library is dedicated to my wife, Sonda, who provided the support necessary to see both a trial implementation and the proposal itself through to completion. She gave me the strength to continue after a difficult year of cancer treatment in the middle of it all.

Many people contributed technical comments, ideas, and suggestions to the Boost Filesystem Library. See http://www.boost.org/libs/filesystem/doc/index.htm#Acknowledgements.

Dietmar Kühl contributed the original Boost Filesystem Library directory_iterator design. Peter Dimov, Walter Landry, Rob Stewart, and Thomas Witt were particularly helpful in refining the library.

The create_directories, extension, basename, and replace_extension functions were developed by Vladimir Prus.

Howard Hinnant and John Maddock reviewed a draft of the proposal, and identified a number of mistakes or weaknesses, resulting in a more polished final document.

References

[ISO-POSIX] ISO/IEC 9945:2003, IEEE Std 1003.1-2001, and The Open Group Base Specifications, Issue 6. Also known as The Single Unix® Specification, Version 3. Available from each of the organizations involved in its creation. For example, read online or download from www.unix.org/single_unix_specification/. The ISO JTC1/SC22/WG15 - POSIX homepage is www.open-std.org/jtc1/sc22/WG15/
[Abrahams] Dave Abrahams, Error and Exception Handling, www.boost.org/more/error_handling.html

Revision History

N1841
  • Initial version, August, 2005, pre-Tremblant mailing
N1889
Revision 1
  • Missing argument name fmt added to several basic_path members.
  • is_empty() name discrepancy between synopsis and description corrected.
  • file_size() return type changed from intmax_t to uintmax_t.  Wording slightly clarified.
  • struct space_info and non-member function space() added.
  • A paragraph was added to Important design decisions mentioning the need for both portable and platform specific semantics.
N1934
Revision 2
  • Changed native path identification from constructor argument to "//:" escape prefix. Rationale: simplifies basic_path constructor interfaces, easier use for platforms needing explicit native format identification.
  • Introduced a new base class, filesystem_error, to allow users to catch a single exception type if desired, or to deal with the case where the templated type is unknown. Rename filesystem_error and wfilesystem_error accordingly.
  • Rewording basic_filesystem_error text to more closely follow the form of clause 19 of the standard.
  • Removed dual specification of certain errors in both "Requires" and "Throws" paragraphs. Since throwing an exception is well-defined behavior, the error condition does not result in undefined behavior as implied by "Requires". (Suggested by Dave Abrahams)
  • Added a non-throwing version of create_hard_link().
  • Added two create_symlink() functions.
  • Added basic_path inserter and extractor. (Suggested by Vladimir Prus)
  • Added basic_path member and non-member swap() functions.
  • Aligned basic_path operator functions with std::basic_string practice.
  • Replaced status_flags with file_type enum and file_status class to improve encapsulation and allow for future expansion of file_status.
  • Added predicate functions overloaded on file_status (Suggested by Martin Adrian). This change, coupled with the introduction of file_status, clarifies the meaning of file types and related predicate operations, and eliminates the need for user bit manipulation, which was a source of user error.
  • Predicate function specification clarified accordingly.
  • Revised and explicitly documented policy for non-throwing versions of functions to increase consistency.
  • Added basic_directory_iterator constructor non-throwing overload (Suggested by Martin Adrian).
  • Changed symlink awareness to separately name functions to cut clutter caused by addition of non-throwing overloads.
N1975
Revision 3
  • Factored non-filesystem related error handling into separate <system_error> header. This change grew out of a Boost developer's list discussion of error handling guidelines for functions likely to use operating system API calls.
  • Added basic_path::clear() in response to user request.

© Copyright Beman Dawes, 2002-2006

Revised 2006-04-04