Doc. no. | P0085R1 |
Audience: | EWG, LEWG |
Date: | 12-05-2025 |
Project: | ISO JTC1/SC22/WG21: Programming Language C++, evolution group |
Reply to: | Jolly Chen <Jolly.Chen@cern.ch>, Axel Naumann <Axel.Naumann@cern.ch>, Michael Jonker <Michael.Jonker@cern.ch> |
Proposal to add 0o and 0O as an alternative (and preferred) sequence to introduce octal-literals and deprecate the old prefix 0.
The syntax rule to interpret integer literals starting with a zero as octal-literals might be called a ’historical mistake’. It can be easily misunderstood by novice programmers and can lead to surprising errors.
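The pitfall can be reproduced with a few lines of present-day C++. The sketch below is illustrative only: the table `kAreaCodes` and the function `padded_entry` are hypothetical names, chosen to mimic a programmer who pads a value with a zero so that the columns line up.

```cpp
#include <cassert>

// Hypothetical lookup table. The last entry was padded with a leading
// zero for visual alignment, which silently turns it into an octal-literal.
constexpr int kAreaCodes[] = {
    415,
    212,
    010,  // intended as ten, but denotes eight
};

static_assert(kAreaCodes[2] == 8, "010 is an octal-literal");
static_assert(010 != 10, "a leading zero changes the value");

// Note: 08 and 09 do not compile at all, because 8 and 9 are not octal digits.

// Returns the padded table entry so the surprise is visible at run time.
constexpr int padded_entry() { return kAreaCodes[2]; }
```

Calling `padded_entry()` yields 8 rather than the 10 that the padded spelling suggests.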
To allow future generations (of developers if not compilers) to correct this feature, we propose to add the character sequence 0o and 0O as preferred sequences to introduce an octal-literal. The prefix 0o follows the model set by the prefix 0x to introduce a hex-literal, and (since C++14) 0b to introduce a binary-literal. Compilers will be able to warn about the use of the deprecated 0 prefix, and eventually (on the scale of a few releases) the old syntax can be removed.
From http://en.wikipedia.org/wiki/Octal: "Newer languages have been abandoning the prefix 0, as decimal numbers are often represented with leading zeroes. The prefix q was introduced to avoid the prefix o being mistaken for a zero, while the prefix 0o was introduced to avoid starting a numerical literal with an alphabetic character (like o or q), since these might cause the literal to be confused with a variable name. The prefix 0o also follows the model set by the prefix 0x used for hexadecimal literals in the C language; it is supported by Haskell,[11] OCaml,[12] Perl 6,[13] Python as of version 3.0,[14] Ruby,[15] Tcl as of version 9,[16] and it is intended to be supported by ECMAScript 6[17] (the prefix 0 has been discouraged in ECMAScript 3 and dropped in ECMAScript 5[18])."
The following literals all specify the same number.
Literal | Before | After |
---|---|---|
Hex | 0x2A | 0x2A |
Binary | 0b00101010 | 0b00101010 |
Octal | 052 | 0o52 |
The old octal literal 052 remains valid but is deprecated.
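The equivalence in the table can be checked with today's compilers for the three spellings that already exist. Since no current compiler accepts 0o52, the sketch below uses a hypothetical helper, `octal_digits_value`, to compute the value that the digits after a 0o prefix would denote; it assumes its argument contains only octal digits.

```cpp
#include <cassert>

// The three spellings that are valid today all denote forty-two.
static_assert(0x2A == 42, "hex");
static_assert(0b00101010 == 42, "binary (C++14)");
static_assert(052 == 42, "octal, old prefix");

// Hypothetical helper: the value that the digits following a 0o prefix
// would denote. Assumes every character is an octal digit (0 to 7).
constexpr unsigned octal_digits_value(const char* digits) {
    unsigned value = 0;
    for (; *digits != '\0'; ++digits) {
        value = value * 8 + static_cast<unsigned>(*digits - '0');
    }
    return value;
}

// Under the proposal, 0o52 would denote the same value as 052.
static_assert(octal_digits_value("52") == 052, "0o52 would equal 052");
```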
Following the example of C (N3353), this proposal introduces 0o and 0O as a new prefix for octal-literals. It deprecates the existing error-prone spelling, that is, non-zero octal integer literals written without the prefix 0o or 0O, similar to C declaring this an obsolescent feature.
Under the current standard, any sequence starting with 0o is illegal. Consequently, the proposed additions 0o and 0O will not break existing code.
To match N3353, make the following edits (relative to N5008), with insertions marked as [inserted: …] and removals marked as [removed: …]:

octal-literal:
    [removed: 0]
    [removed: octal-literal 'opt octal-digit]
    [inserted: prefixed-octal-literal]
    [inserted: unprefixed-octal-literal]

[inserted:
unprefixed-octal-literal:
    0
    0 'opt octal-digit-sequence

prefixed-octal-literal:
    octal-prefix octal-digit-sequence
]

Before hexadecimal-prefix insert

octal-prefix: one of
    0o 0O

Before hexadecimal-digit-sequence insert

octal-digit-sequence:
    octal-digit
    octal-digit-sequence 'opt octal-digit
2 [Example 1 : The number twelve can be written 12, 014, [inserted: 0o14,] 0XC, or 0b1100. The integer-literals 1048576, 1'048'576, 0X100000, 0x10'0000, and [removed: 0'004'000'000] [inserted: 0o0'004'000'000] all have the same value. — end example]
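The proposed productions can be exercised with a small recognizer. The following is a minimal sketch, not an implementation taken from any compiler: `OctalForm`, `is_octal_digit_sequence`, and `classify_octal` are hypothetical names, and integer suffixes such as `u` or `LL` are deliberately not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class OctalForm { not_octal, unprefixed, prefixed };

// octal-digit-sequence:
//     octal-digit
//     octal-digit-sequence 'opt octal-digit
// Returns true if s[pos..end) is exactly one octal-digit-sequence.
inline bool is_octal_digit_sequence(const std::string& s, std::size_t pos) {
    // A digit is required at the start and after every digit separator.
    bool expect_digit = true;
    for (; pos < s.size(); ++pos) {
        const char c = s[pos];
        if (c >= '0' && c <= '7') {
            expect_digit = false;
        } else if (c == '\'' && !expect_digit) {
            expect_digit = true;
        } else {
            return false;
        }
    }
    return !expect_digit;
}

// Classifies a spelling according to the proposed octal-literal grammar.
inline OctalForm classify_octal(const std::string& s) {
    if (s.empty() || s[0] != '0') return OctalForm::not_octal;
    if (s.size() == 1) return OctalForm::unprefixed;  // the literal 0
    if (s[1] == 'o' || s[1] == 'O') {
        // prefixed-octal-literal: octal-prefix octal-digit-sequence
        return is_octal_digit_sequence(s, 2) ? OctalForm::prefixed
                                             : OctalForm::not_octal;
    }
    // unprefixed-octal-literal: 0 'opt octal-digit-sequence
    const std::size_t rest = (s[1] == '\'') ? 2 : 1;
    return is_octal_digit_sequence(s, rest) ? OctalForm::unprefixed
                                            : OctalForm::not_octal;
}
```

With this sketch, `classify_octal("014")` reports the deprecated unprefixed form and `classify_octal("0o14")` the prefixed form, while a bare `0o` is rejected because octal-prefix must be followed by at least one digit.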
Table 149 — Enum class perms [tab:fs.enum.perms]
Name | Value (octal) | POSIX macro | Definition or notes |
---|---|---|---|
none | 0 | | There are no permissions set for the file. |
owner_read | 0o400 | S_IRUSR | Read permission, owner |
owner_write | 0o200 | S_IWUSR | Write permission, owner |
owner_exec | 0o100 | S_IXUSR | Execute/search permission, owner |
owner_all | 0o700 | S_IRWXU | Read, write, execute/search by owner; owner_read \| owner_write \| owner_exec |
group_read | 0o40 | S_IRGRP | Read permission, group |
group_write | 0o20 | S_IWGRP | Write permission, group |
group_exec | 0o10 | S_IXGRP | Execute/search permission, group |
group_all | 0o70 | S_IRWXG | Read, write, execute/search by group; group_read \| group_write \| group_exec |
others_read | 0o4 | S_IROTH | Read permission, others |
others_write | 0o2 | S_IWOTH | Write permission, others |
others_exec | 0o1 | S_IXOTH | Execute/search permission, others |
others_all | 0o7 | S_IRWXO | Read, write, execute/search by others; others_read \| others_write \| others_exec |
all | 0o777 | | owner_all \| group_all \| others_all |
set_uid | 0o4000 | S_ISUID | Set-user-ID on execution |
set_gid | 0o2000 | S_ISGID | Set-group-ID on execution |
sticky_bit | 0o1000 | S_ISVTX | Operating system dependent. |
mask | 0o7777 | | all \| set_uid \| set_gid \| sticky_bit |
unknown | 0xFFFF | | The permissions are not known, such as when a file_status object is created without specifying the permissions |
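The values in this table are unchanged by the respelling; only the notation differs. As a hedged illustration, the sketch below checks a few `std::filesystem::perms` enumerators against the old-style literals that current compilers accept. The helpers `bits` and `is_world_writable` are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: exposes the numeric value of a perms enumerator.
constexpr unsigned bits(fs::perms p) { return static_cast<unsigned>(p); }

// Old-style octal literals are used here because no current compiler
// accepts 0o400; under the proposal these would be spelled 0o400, 0o700, ...
static_assert(bits(fs::perms::owner_read) == 0400, "owner_read");
static_assert(bits(fs::perms::owner_all) == 0700, "owner_all");
static_assert(bits(fs::perms::all) == 0777, "all");
static_assert(bits(fs::perms::mask) == 07777, "mask");
static_assert(bits(fs::perms::unknown) == 0xFFFF, "unknown");

// Composite enumerators are the bitwise OR of their parts.
static_assert(fs::perms::owner_all ==
                  (fs::perms::owner_read | fs::perms::owner_write |
                   fs::perms::owner_exec),
              "owner_all is the union of the owner bits");

// Hypothetical helper: true if the "others" write bit is set.
constexpr bool is_world_writable(fs::perms p) {
    return (p & fs::perms::others_write) != fs::perms::none;
}
```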
octal-literal:
    [removed: 0]
    [removed: octal-literal 'opt octal-digit]
    [inserted: prefixed-octal-literal]
    [inserted: unprefixed-octal-literal]

[inserted:
unprefixed-octal-literal:
    0
    0 'opt octal-digit-sequence

prefixed-octal-literal:
    octal-prefix octal-digit-sequence
]

Before hexadecimal-prefix insert

octal-prefix: one of
    0o 0O

Before hexadecimal-digit-sequence insert

octal-digit-sequence:
    octal-digit
    octal-digit-sequence 'opt octal-digit
Add a new section
1 An octal literal ([lex.icon], [gram.lex]) of the form unprefixed-octal-literal is deprecated.
In 28.5.2.2 Standard format specifiers [format.string.std], we have the following example:
21 The available integer presentation types for integral types other than bool and charT are specified in Table 102. [Example 4 :
    string s0 = format("{}", 42);                       // value of s0 is "42"
    string s1 = format("{0:b} {0:d} {0:o} {0:x}", 42);  // value of s1 is "101010 42 52 2a"
    string s2 = format("{0:#x} {0:#X}", 42);            // value of s2 is "0x2a 0X2A"
    string s3 = format("{:L}", 1234);                   // value of s3 can be "1,234"
                                                        // (depending on the locale)
— end example]
The example shows that std::format returns a consistent base prefix for the alternate form # option with a hexadecimal type -- 0x for #x and 0X for #X. However, the same is not true for the # option with an octal type, where we would have:
    std::string s1 = std::format("{:#o}", 042);   // value of s1 is "042"
    std::string s2 = std::format("{:#o}", 0o42);  // value of s2 is "042"
    // The option #O does not exist
To follow the deprecation of the unprefixed-octal-literal, we would prefer a behavior change in the output of std::format with #o to use the prefix 0o. By using the search term "std::format #o language:C++" on GitHub, we found that the format specifier #o was used in only 23 files and out of those, 13 are educational examples and 5 are test suites. This could indicate that a behavior change will have minimal impact on existing code. Given the improved clarity of the new prefix (with potential security implications), we recommend this backward-incompatible change.
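To make the difference concrete, the sketch below contrasts the existing alternate-form output with the output the preferred change would produce. Both functions are hypothetical helpers rather than library APIs: `to_legacy_octal` relies on the `#` flag of `std::snprintf`, which yields the same leading-zero form, and `to_prefixed_octal` hand-rolls the proposed 0o form. How zero should be rendered is not settled by the text above, so the "0o0" result for zero is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: octal with the existing alternate form, which
// prepends a single 0 (the printf # flag behaves this way for %o).
inline std::string to_legacy_octal(unsigned value) {
    char buffer[32];
    std::snprintf(buffer, sizeof buffer, "%#o", value);
    return buffer;
}

// Hypothetical helper: octal with the proposed 0o prefix.
// Assumption: zero is rendered as "0o0".
inline std::string to_prefixed_octal(unsigned value) {
    std::string digits;
    do {
        // Prepend the least significant octal digit.
        digits.insert(digits.begin(), static_cast<char>('0' + value % 8));
        value /= 8;
    } while (value != 0);
    return "0o" + digits;
}
```

For the value 042, the first helper returns "042", which reads like a deprecated unprefixed-octal-literal, while the second returns "0o42".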
N3353 calls out this problem, but does not attempt to solve it. As backward-compatible alternatives, we propose several options:
Add the #O option, to return 0o
Add the #O option, to return 0O
Add the ##o and ##O options, to return 0o and 0O respectively
Do Nothing
Thanks to Erich Keane for reviewing the draft. The document style was borrowed from Doc. no. N4340.