ISO/IEC JTC1 SC22 WG21 N2601 = 08-0111 - 2008-04-01

Lawrence Crowl, crowl@google.com, Lawrence@Crowl.org

C++ provides good support for numbers
expressed in octal, decimal, and hexadecimal bases.
However, it does not provide support for numbers
expressed in the sexagesimal (60) base.
This lack of support is surprising given that the sexagesimal base
is the foundation of modern time and angular measures.
Within the sexagesimal base,
time is simply expressed as hours with a radix point.
The "minute" is a "minute hour"
or a 60^{th} of an hour.
The "second" is a "second minute hour",
a 60^{th} of a minute hour,
or a 3600^{th} of an hour.
Likewise, the angular degree is so divided and named.
More pragmatically,
time and angles can be counted as integral "seconds".
The higher-order components, minutes and hours,
can then be read directly from the higher-order digits.
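
The reading of hours and minutes from the higher-order digits can be sketched in present-day C++. This is a minimal illustration, not part of the proposed wording; the function names `to_sexagesimal` and `from_sexagesimal` are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Split a count of seconds into base-60 digits, most significant first.
// For a time of day the digits are hours, minutes, seconds.
// (Hypothetical helper, for illustration only.)
std::vector<unsigned> to_sexagesimal(unsigned long seconds) {
    std::vector<unsigned> digits;
    do {
        digits.insert(digits.begin(), static_cast<unsigned>(seconds % 60));
        seconds /= 60;
    } while (seconds != 0);
    return digits;
}

// Recombine base-60 digits (most significant first) into a single count.
unsigned long from_sexagesimal(const std::vector<unsigned>& digits) {
    unsigned long value = 0;
    for (unsigned d : digits) {
        assert(d < 60);  // each sexagesimal digit must lie in 0..59
        value = value * 60 + d;
    }
    return value;
}
```

For example, 45296 seconds splits into the three digits 12, 34, 56, which is the time 12:34:56.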

With sexagesimal numbers, the representation of time needs only three digits: one for the hour, one for the minute, and one for the second. Furthermore, no punctuation is necessary. This economy contrasts markedly with conventional time notation, which requires six digits and two punctuation marks. Likewise, the representation of angles needs only four digits: two for the degree, one for the minute, and one for the second. Again, no punctuation is necessary, and this economy contrasts with the seven digits and three punctuation marks of conventional angular notation.

Given the economy of sexagesimal notation, direct support for it in C++ is desirable.

The straightforward approach to supporting an additional number base is to extend the existing mechanism for indicating hexadecimal numbers. Unfortunately, the number of sexagesimal digits exceeds the number of decimal digits and lower-case letters. The obvious workaround is to have upper-case and lower-case letters represent different digits. While this workaround does meet the need for digits, it is inconsistent with the syntax for hexadecimal numbers. Additionally, the explicit base marker reduces the advantage in economy of sexagesimal representations. More importantly, mapping between digit value and the letter forms is less than obvious, entailing a significant cognitive burden. So, extending the hexadecimal approach is not useful.

This notational problem was solved by the Old Babylonian period, the first half of the second millennium BC [1] [2] [3] [4]. Rather than adopt an untested and invented number representation, we propose to represent sexagesimal numbers with a historically proven notation, Babylonian cuneiform numbers.

With cuneiform digits, the base of the number is obviously sexagesimal, and additional marking of the base is unnecessary. This lack of a base marker preserves the economy of representation described above.

We propose to use the ISO 10646 [6] [7] [8] standard to represent cuneiform numbers. ISO 10646 provides characters to represent cuneiform digits [9] [10] [11] [12] [13]. However, rather than provide one character per digit, it provides two characters per digit. The first character encodes the number of tens (10, 20, 30, 40, or 50) and the second character encodes the number of units (1, 2, 3, 4, 5, 6, 7, 8, or 9).

The number 𒌋𒐼𒐐𒐋 represents 10×60+4×60+50+6, which is 896.

As is customary in C++,
the cuneiform digits may need to be represented with universal character names.
The example above would be represented as
`\U0001230B\U0001243C\U00012410\U0001240B`

when the program text must use only the basic character set.
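
The arithmetic of the example can be checked with a small decoder written in present-day C++. This is a minimal sketch, assuming only the tens and units characters listed in this proposal; it does not handle the colon or zero characters, and the names `tens_value`, `units_value`, and `decode_sexagesimal` are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Value of a tens character (10..50), or 0 if c is not a tens character.
unsigned tens_value(char32_t c) {
    switch (c) {
        case 0x1230B: return 10;
        case 0x12310: return 20;
        case 0x1230D: return 30;
        case 0x1240F: return 40;
        case 0x12410: return 50;
        default: return 0;
    }
}

// Value of a units character (1..9), or 0 if c is not a units character.
unsigned units_value(char32_t c) {
    switch (c) {
        case 0x12079: return 1;
        case 0x1222B: return 2;
        case 0x12408: return 3;
        case 0x12409: case 0x1243C: case 0x1243E: case 0x1243F: return 4;
        case 0x1240A: return 5;
        case 0x1240B: return 6;
        case 0x1240C: case 0x12442: case 0x12443: return 7;
        case 0x1240D: case 0x12444: return 8;
        case 0x1240E: case 0x12446: return 9;
        default: return 0;
    }
}

// Decode a run of tens and units characters into an integer.
// A tens character starts a digit; a units character directly after it
// belongs to the same digit.
unsigned long decode_sexagesimal(const std::u32string& s) {
    unsigned long value = 0;
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned digit = 0;
        if (unsigned t = tens_value(s[i])) { digit += t; ++i; }
        if (i < s.size()) {
            if (unsigned u = units_value(s[i])) { digit += u; ++i; }
        }
        if (digit == 0) throw std::invalid_argument("not a tens or units character");
        value = value * 60 + digit;
    }
    return value;
}
```

Applied to the universal character names of the example, the sketch groups the four characters into the digits 14 and 56 and yields 14×60+56 = 896.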

The cuneiform digit representation included no explicit representation for zero units or zero tens. As such, a digit consisting only of tens followed by a digit consisting only of units could be interpreted as a single digit consisting of both tens and units. As was customary in some texts [5], we disambiguate this situation with a vertical colon in the position of the units.

Ancient cuneiform numbers represented a zero digit with an empty space. As this approach is not compatible with modern usage, we propose to explicitly represent a zero with the cuneiform diagonal colon, as was customary in later texts.
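
The two rules above, the vertical colon after an isolated tens character and the diagonal colon for a zero digit, can be illustrated by an encoder written in present-day C++. This is a minimal sketch under stated assumptions: it emits only one form per value (the first form listed for each value in the character rules of this proposal), and the name `encode_sexagesimal` is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Encode an integer as cuneiform sexagesimal characters.
// (Hypothetical helper, for illustration only.)
std::u32string encode_sexagesimal(unsigned long value) {
    // One character per units value 1..9 and tens value 10..50; index 0 is unused.
    static const char32_t units[] = {0, 0x12079, 0x1222B, 0x12408, 0x12409, 0x1240A,
                                     0x1240B, 0x1240C, 0x1240D, 0x1240E};
    static const char32_t tens[] = {0, 0x1230B, 0x12310, 0x1230D, 0x1240F, 0x12410};
    const char32_t vertical_colon = 0x12471;  // stands in the empty units position
    const char32_t diagonal_colon = 0x12472;  // explicit zero digit

    // Base-60 digits, most significant first.
    std::vector<unsigned> digits;
    do {
        digits.insert(digits.begin(), static_cast<unsigned>(value % 60));
        value /= 60;
    } while (value != 0);

    std::u32string out;
    for (std::size_t i = 0; i < digits.size(); ++i) {
        const unsigned t = digits[i] / 10;
        const unsigned u = digits[i] % 10;
        if (digits[i] == 0) { out += diagonal_colon; continue; }
        if (t != 0) out += tens[t];
        if (u != 0) out += units[u];
        // A tens-only digit followed by a units-only digit would otherwise
        // read as a single digit, so mark the empty units position.
        const bool next_is_units_only =
            i + 1 < digits.size() && digits[i + 1] >= 1 && digits[i + 1] <= 9;
        if (t != 0 && u == 0 && next_is_units_only) out += vertical_colon;
    }
    return out;
}
```

With this sketch, 604 (the digits 10 and 4) is encoded as ten, vertical colon, four, which keeps it distinct from 14, encoded as ten, four. Likewise 3600 (the digits 1, 0, 0) is encoded as one followed by two diagonal colons.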

Cuneiform numbers often have variant forms.
We propose to include some of those variants
when they might be important to minimizing either height or width.
However, we leave out variant forms that might be confusing.
Furthermore, we leave out the ash forms of units,
as there is no present need to standardize them.

The current Unicode standard has omitted the character for 'twenty'. Pending correction of that omission, we propose to use the "u over u u reversed over u reversed" character, which looks much like "twenty and twenty reversed".

While fully developed cuneiform numbers used a place value system, the radix point was not explicitly marked. The present proposal is for sexagesimal representation of integers, and hence needs no radix point. This view is consistent with current C++ practice, which does not permit floating-point values in bases other than decimal.

Finally, we leave out the forms for higher-order digits, as there is no present need to standardize them.

All changes apply to 2.13.1 Integer literals [lex.icon].

Edit the grammar as follows:

integer-literal:
    decimal-literal integer-suffix_{opt}
    octal-literal integer-suffix_{opt}
    hexadecimal-literal integer-suffix_{opt}
    sexagesimal-literal integer-suffix_{opt}

sexagesimal-literal:
    sexagesimal-literal_{opt} sexagesimal-tens sexagesimal-units_{opt}
    sexagesimal-upper_{opt} sexagesimal-units
    sexagesimal-literal_{opt} sexagesimal-zero

sexagesimal-upper:
    sexagesimal-literal_{opt} sexagesimal-tens sexagesimal-units
    sexagesimal-literal_{opt} sexagesimal-tens sexagesimal-colon
    sexagesimal-upper_{opt} sexagesimal-units
    sexagesimal-literal_{opt} sexagesimal-zero
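
The effect of the sexagesimal productions can be sketched as a recognizer over character classes. This is a minimal sketch, assuming the productions are read as three alternatives for sexagesimal-literal and four for sexagesimal-upper; under that reading, the only constraint beyond non-emptiness is that a sexagesimal-colon must directly follow a sexagesimal-tens and directly precede a sexagesimal-units. The names `Sign` and `is_sexagesimal_literal` are hypothetical, and the integer-suffix is not considered.

```cpp
#include <cstddef>
#include <vector>

// Character classes of the proposed grammar (hypothetical names).
enum class Sign { Tens, Units, Colon, Zero };

// Returns true if the sequence of character classes forms a
// sexagesimal-literal under the reading stated above.
bool is_sexagesimal_literal(const std::vector<Sign>& signs) {
    if (signs.empty()) return false;
    for (std::size_t i = 0; i < signs.size(); ++i) {
        if (signs[i] != Sign::Colon) continue;
        // The colon fills the empty units position of a tens-only digit...
        const bool after_tens = i > 0 && signs[i - 1] == Sign::Tens;
        // ...and is needed only when a units-only digit follows.
        const bool before_units = i + 1 < signs.size() && signs[i + 1] == Sign::Units;
        if (!after_tens || !before_units) return false;
    }
    return true;
}
```

For example, the sequence tens, colon, units is accepted, while a trailing colon or a colon after a units character is rejected.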

Add the following character rules. Note that the glyph column will render improperly (e.g. question marks or bracket-enclosed Arabic digits) if your browser does not implement the cuneiform glyphs. The glyphs appear in references [10] and [11].

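sexagesimal-units: one of

| extended character | numeric value | HTML glyph | graphical form |
|---|---|---|---|
| `\U00012079` | 1 | 𒁹 | 1 |
| `\U0001222B` | 2 | 𒈫 | 2 |
| `\U00012408` | 3 | 𒐈 | 3 |
| `\U00012409` | 4 | 𒐉 | 2/2 |
| `\U0001243C` | 4 | 𒐼 | 3/1 |
| `\U0001243E` | 4 | 𒐾 | 2/1+1 |
| `\U0001243F` | 4 | 𒐿 | 1/1+2 |
| `\U0001240A` | 5 | 𒐊 | 3/2 |
| `\U0001240B` | 6 | 𒐋 | 3/3 |
| `\U0001240C` | 7 | 𒐌 | 4/3 |
| `\U00012442` | 7 | 𒑂 | 3/3/1 |
| `\U00012443` | 7 | 𒑃 | 3/3+1 |
| `\U0001240D` | 8 | 𒐍 | 4/4 |
| `\U00012444` | 8 | 𒑄 | 3/3/2 |
| `\U0001240E` | 9 | 𒐎 | 4/4+1 |
| `\U00012446` | 9 | 𒑆 | 3/3/3 |

sexagesimal-tens: one of

| extended character | numeric value | HTML glyph | graphical form |
|---|---|---|---|
| `\U0001230B` | 10 | 𒌋 | 1 |
| `\U00012310` | 20 | 𒌐 | properly 2, but substituting 1/1+1/1 reversed |
| `\U0001230D` | 30 | 𒌍 | 3 |
| `\U0001240F` | 40 | 𒐏 | 2/2 (also ) |
| `\U00012410` | 50 | 𒐐 | 3/2 (also ) |

sexagesimal-colon:

| extended character | numeric value | HTML glyph | graphical form |
|---|---|---|---|
| `\U00012471` | 0 | 𒑱 | 1/1 vertical |

sexagesimal-zero:

| extended character | numeric value | HTML glyph | graphical form |
|---|---|---|---|
| `\U00012472` | 0 | 𒑲 | 1/1 diagonal |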

Append the following to paragraph 1.

A sexagesimal integer literal (base sixty) consists of a sequence of sexagesimal digits.

- [1] "About Cuneiform Writing...", http://www.upenn.edu/museum/Games/cuneiform.html
- [2] Duncan J. Melville, "Cuneiform numbers", http://it.stlawu.edu/~dmelvill/mesomath/Numbers.html
- [3] "Chapter 1 History of Numbers", "Section 1.5 The Babylonians", "Lesson 3 - Larger Numbers", http://mathematics.gulfcoast.edu/mgf1107ll/Chap1Sec5Lesson3.htm
- [4] "Large Cuneiform Numbers", http://www.mathematicsmagazine.com/7-2003/Cueniform_No_7_2003.htm
- [5] O. Neugebauer, "On a Special Use of the Sign "Zero" in Cuneiform Astronomical Texts", http://links.jstor.org/sici?sici=0003-0279(194112)61:4%3C213:OASUOT%3E2.0.CO;2-2
- [6] ISO/IEC JTC1/SC2/WG2, http://std.dkuug.dk/JTC1/SC2/WG2/
- [7] ISO 10646 2003 Amd 2 2006(E), http://standards.iso.org/ittf/PubliclyAvailableStandards/c041419_ISO_IEC_10646_2003_Amd_2_2006(E).zip
- [8] Unicode, http://www.unicode.org/
- [9] Unicode 5.0 Chapter 14 Archaic Scripts, http://www.unicode.org/versions/Unicode5.0.0/ch14.pdf
- [10] Unicode Cuneiform, http://www.unicode.org/charts/PDF/U12000.pdf
- [11] Unicode Cuneiform Numbers and Punctuation, http://www.unicode.org/charts/PDF/U12400.pdf
- [12] Wikipedia: Unicode Cuneiform, http://en.wikipedia.org/wiki/Unicode_cuneiform
- [13] Wikipedia: List of Cuneiform Signs, http://en.wikipedia.org/wiki/List_of_cuneiform_signs