Doc. no. P0085R3
Audience: EWG, LEWG
Date: 01-07-2025
Project: ISO JTC1/SC22/WG21: Programming Language C++, evolution group
Reply to: Jolly Chen <Jolly.Chen@cern.ch>, Axel Naumann <Axel.Naumann@cern.ch>, Michael Jonker <Michael.Jonker@cern.ch>

Oo... adding a coherent character sequence to begin octal-literals

Revision History

Since P0085R2

Since P0085R1

Since P0085R0

Proposal

Proposal to add 0o and 0O as an alternative (and preferred) sequence to introduce octal-literals and deprecate use of the old prefix 0 for non-zero octal-literals.

The syntax rule to interpret integer literals starting with a zero as octal-literals might be called a 'historical mistake'. It can be easily misunderstood by novice programmers and can lead to surprising errors.

To allow future generations (of developers if not compilers) to correct this feature, we propose to add the character sequences 0o and 0O as preferred sequences to introduce an octal-literal. The prefix 0o follows the model set by the prefix 0x to introduce a hex-literal, and (since C++14) 0b to introduce a binary-literal.

Additionally, we propose to deprecate the use of octal integer literals with the 0 prefix, apart from 0 itself; this is similar to C declaring the use of nonzero octal integer literals without the prefix 0o or 0O an obsolescent feature (see 6.11.5 in the C2y draft). This opens the door for compilers to warn about the use of the deprecated 0 prefix and eventually (on the scale of a few releases) to potentially interpret numbers with leading zeroes as decimals.

From http://en.wikipedia.org/wiki/Octal: "Newer languages have been abandoning the prefix 0, as decimal numbers are often represented with leading zeroes. The prefix q was introduced to avoid the prefix o being mistaken for a zero, while the prefix 0o was introduced to avoid starting a numerical literal with an alphabetic character (like o or q), since these might cause the literal to be confused with a variable name. The prefix 0o also follows the model set by the prefix 0x used for hexadecimal literals in the C language; it is supported by Haskell,[19] OCaml,[20] Python as of version 3.0,[21] Raku,[22] Ruby,[23] Tcl as of version 9,[24] PHP as of version 8.1,[25] Rust[26] and ECMAScript as of ECMAScript 6[27] (the prefix 0 originally stood for base 8 in JavaScript but could cause confusion,[28] therefore it has been discouraged in ECMAScript 3 and dropped in ECMAScript 5[29])."

This proposal now reflects a recent corresponding change in C (N3353) that was triggered by an earlier version of this proposal.

It was observed that changes are needed in the specification of std::format and streams to support the new octal-literal prefix for input/output. This proposal does not address that and leaves them to a future proposal.

Examples

Padding

Leading zeros are often used in an attempt to align numbers nicely. This leads to surprising results if the programmer expected the number to be interpreted as a decimal. For example:
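A minimal sketch of this pitfall, using a hypothetical table of limits whose entries were padded with zeros for alignment (the names are illustrative and not taken from any real code base):

```cpp
// Hypothetical limits, padded with leading zeros so the columns line up.
constexpr int limits[] = {
    100,
    010,  // parsed as octal: the value is eight, not ten
    001,  // parsed as octal: the value happens to be one in both bases
};

constexpr int total = limits[0] + limits[1] + limits[2];

static_assert(limits[1] == 8, "010 is an octal literal");
static_assert(total == 109, "100 + 8 + 1, not the 111 a reader might expect");
```

The code compiles without error under the current rules, which is what makes the mistake easy to miss.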

After this proposal

Following this proposal, these literals all specify the same number:
Literal   Before        After
Hex       0x2A          0x2A
Binary    0b00101010    0b00101010
Octal     052           0o52

The old octal literal 052 will remain valid but deprecated.
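The equivalence can be checked at compile time. The sketch below sticks to spellings that are accepted today; the 0o52 form appears only in a comment because it assumes a compiler that implements this proposal. The variable names are illustrative.

```cpp
constexpr int from_hex    = 0x2A;
constexpr int from_binary = 0b00101010;
constexpr int from_octal  = 052;         // remains valid, deprecated by this proposal
// constexpr int from_new_octal = 0o52;  // proposed spelling, intended to have the same value

static_assert(from_hex == 42, "hexadecimal spelling");
static_assert(from_binary == 42, "binary spelling");
static_assert(from_octal == 42, "traditional octal spelling");
```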

Effects on existing code

This proposal introduces 0o and 0O as new prefixes for octal-literals. Under the current standard, any sequence starting with 0o or 0O is illegal. Consequently, the proposed additions 0o and 0O will not break existing code.

Additionally, this proposal deprecates the existing error-prone syntax rule for non-zero integer literals starting with a zero, following the example of C (N3353). This would affect existing, traditional octal-literals (i.e. with a leading 0), for instance when defining POSIX file permissions (sometimes padded with multiple leading zeros), as seen in popular repositories (>500 stars) e.g., ROOT, Qt Creator, node-android, KDiff3.

We have received concerns that this deprecation leads to warnings for octal-literals that have the same value whether interpreted as octal or decimal. Examples of such literals are: 01, 02, 03, 04, 05, 06, 07. To keep the door open for a future proposal introducing 01 through 09 as decimal literals (typical use case: 1970y/January/01d), implementations might choose not to diagnose 01 through 07.
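One way to picture which spellings fall into this category is a small compile-time check that reads the same digits in base 8 and in base 10. This is only a sketch: the helper names parse_digits and same_in_octal_and_decimal are hypothetical, it handles plain digit strings without separators or suffixes, and it does not describe how any implementation decides what to diagnose.

```cpp
#include <string_view>

// Hypothetical helper: reads an all-digit spelling in the given base.
// Returns -1 if a character is not a valid digit in that base.
constexpr long long parse_digits(std::string_view text, int base) {
    long long value = 0;
    for (char c : text) {
        const int digit = c - '0';
        if (digit < 0 || digit >= base) return -1;
        value = value * base + digit;
    }
    return value;
}

// True if the spelling is valid octal and has the same value when read as decimal.
constexpr bool same_in_octal_and_decimal(std::string_view text) {
    const long long as_octal = parse_digits(text, 8);
    return as_octal >= 0 && as_octal == parse_digits(text, 10);
}

static_assert(same_in_octal_and_decimal("07"), "7 in both bases");
static_assert(!same_in_octal_and_decimal("010"), "8 as octal, 10 as decimal");
static_assert(!same_in_octal_and_decimal("09"), "9 is not an octal digit");
```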

Technical Specification

We propose the feature test macro name __cpp_0o_octals for this feature.
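A sketch of how code that has to build with both older and newer compilers might use the macro. It assumes the macro is spelled exactly as proposed here and that an implementation supporting the new prefix defines it; the constant name default_umask is illustrative.

```cpp
#if defined(__cpp_0o_octals)
// Compiler advertises the proposed prefix: use the new spelling.
constexpr unsigned default_umask = 0o022;
#else
// Fallback for compilers without the feature: traditional spelling.
constexpr unsigned default_umask = 022;
#endif

static_assert(default_umask == 18, "octal 22 is 2 * 8 + 2");
```

Either branch yields the same value, so the rest of the code does not need to know which spelling was used.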

To match N3353, make the following edits (relative to N5008), highlighting the insertions and removals:

5.13.2 Integer literals [lex.icon]

    octal-literal:
      0                                   (removed)
      octal-literal 'opt octal-digit      (removed)
      prefixed-octal-literal              (added)
      unprefixed-octal-literal            (added)
    
    unprefixed-octal-literal:             (added)
      0
      0 'opt octal-digit-sequence
    
    prefixed-octal-literal:               (added)
      octal-prefix octal-digit-sequence
  

Before hexadecimal-prefix insert

    octal-prefix: one of
      0o 0O

Before hexadecimal-digit-sequence insert

    octal-digit-sequence:
      octal-digit
      octal-digit-sequence 'opt octal-digit
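The productions above can be read as a small recognizer. The following is an informal sketch, not proposed wording: the names OctalKind, is_octal_digit_sequence and classify are hypothetical, and the integer-suffix is deliberately ignored.

```cpp
#include <cstddef>
#include <string_view>

enum class OctalKind { not_octal, unprefixed, prefixed };

constexpr bool is_octal_digit(char c) { return c >= '0' && c <= '7'; }

// octal-digit-sequence: starts with an octal digit; every digit separator
// must be followed by another octal digit.
constexpr bool is_octal_digit_sequence(std::string_view s) {
    if (s.empty() || !is_octal_digit(s.front())) return false;
    for (std::size_t i = 1; i < s.size(); ++i) {
        if (s[i] == '\'') {
            if (i + 1 == s.size() || !is_octal_digit(s[i + 1])) return false;
        } else if (!is_octal_digit(s[i])) {
            return false;
        }
    }
    return true;
}

constexpr OctalKind classify(std::string_view s) {
    if (s.empty() || s.front() != '0') return OctalKind::not_octal;
    if (s.size() == 1) return OctalKind::unprefixed;  // the literal 0
    if (s[1] == 'o' || s[1] == 'O') {
        // prefixed-octal-literal: octal-prefix octal-digit-sequence
        return is_octal_digit_sequence(s.substr(2)) ? OctalKind::prefixed
                                                    : OctalKind::not_octal;
    }
    // unprefixed-octal-literal: 0 'opt octal-digit-sequence
    std::string_view rest = s.substr(1);
    if (rest.front() == '\'') rest.remove_prefix(1);
    return is_octal_digit_sequence(rest) ? OctalKind::unprefixed
                                         : OctalKind::not_octal;
}

static_assert(classify("0o52") == OctalKind::prefixed, "");
static_assert(classify("052") == OctalKind::unprefixed, "");
static_assert(classify("0x2A") == OctalKind::not_octal, "");
```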
  

Under paragraph 2, modify the example as follows, adding 0o14 and replacing 0'004'000'000 with 0o0'004'000'000:

2 The hexadecimal-digits a through f and A through F have decimal values ten through fifteen. [Example 1: The number twelve can be written 12, 014, 0o14, 0XC, or 0b1100. The integer-literals 1048576, 1'048'576, 0X100000, 0x10'0000, and 0o0'004'000'000 all have the same value. --end example]

After paragraph 4, add the paragraph:

An unprefixed-octal-literal ([lex.icon], [gram.lex]) of the form

        0 'opt octal-digit-sequence 

is deprecated. [depr.oct]

31.12.8.4 Enum class perms [fs.enum.perms]

Table 149 - Enum class perms [tab:fs.enum.perms]
Name          Value (octal)  POSIX macro  Definition or notes
none          0o0                         There are no permissions set for the file.
owner_read    0o400          S_IRUSR      Read permission, owner
owner_write   0o200          S_IWUSR      Write permission, owner
owner_exec    0o100          S_IXUSR      Execute/search permission, owner
owner_all     0o700          S_IRWXU      Read, write, execute/search by owner; owner_read | owner_write | owner_exec
group_read    0o40           S_IRGRP      Read permission, group
group_write   0o20           S_IWGRP      Write permission, group
group_exec    0o10           S_IXGRP      Execute/search permission, group
group_all     0o70           S_IRWXG      Read, write, execute/search by group; group_read | group_write | group_exec
others_read   0o4            S_IROTH      Read permission, others
others_write  0o2            S_IWOTH      Write permission, others
others_exec   0o1            S_IXOTH      Execute/search permission, others
others_all    0o7            S_IRWXO      Read, write, execute/search by others; others_read | others_write | others_exec
all           0o777                       owner_all | group_all | others_all
set_uid       0o4000         S_ISUID      Set-user-ID on execution
set_gid       0o2000         S_ISGID      Set-group-ID on execution
sticky_bit    0o1000         S_ISVTX      Operating system dependent.
mask          0o7777                      all | set_uid | set_gid | sticky_bit
unknown       0xFFFF                      The permissions are not known, such as when a file_status object is created without specifying the permissions
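The change to this table is purely one of spelling; the enumerator values stay the same. A sketch that checks a few of them against the traditional spellings accepted by today's compilers (the name rw_r_r is illustrative; with the proposal, the right-hand sides could be written 0o644, 0o777 and 0o7777):

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// rw-r--r--, a typical mode for a regular file.
constexpr fs::perms rw_r_r = fs::perms::owner_read | fs::perms::owner_write |
                             fs::perms::group_read | fs::perms::others_read;

static_assert(static_cast<unsigned>(rw_r_r) == 0644, "");
static_assert(static_cast<unsigned>(fs::perms::all) == 0777, "");
static_assert(static_cast<unsigned>(fs::perms::mask) == 07777, "");
```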

Annex D (normative) Compatibility features [depr]

Add a new section

D??. Deprecated unprefixed octal literal [depr.oct]

1 An unprefixed-octal-literal ([lex.icon], [gram.lex]) of the form

    0 'opt octal-digit-sequence 
is deprecated.

[Note 1: The use of unprefixed octal literals, except the literal 0, is deprecated because they are often confused with decimal literals. --end note]

[Example 1:

      int zero = 0;                     // OK
      int more_zeroes = 000;            // deprecated
      int unprefixed_octal = 042;       // deprecated
      int prefixed_octal = 0o42;        // OK
    

--end example]

Acknowledgements

Thanks to Erich Keane and Thomas Köppe for reviewing the draft. The document style was borrowed from Doc. no. N4340.