[SG16-Unicode] Identifiers in C++

JF Bastien cxx at jfbastien.com
Fri May 10 18:43:28 CEST 2019


Hi C++ પกٱƈѻɗﻉ ḟäṅṡ 👋!

The current list of valid identifier characters is pretty silly (see
[lex.name] 5.10 Identifiers or the cppreference summary
<https://en.cppreference.com/w/cpp/language/identifiers>). Among other silly
things, it allows invisible characters such as zero-width joiner and
zero-width space (see how bad this can get <https://godbolt.org/z/sBJk1k>,
h/t Richard Kogelnig).
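
To make that concrete, here is a rough sketch of what the range check amounts
to. It only copies a handful of ranges that I believe are in the C++17
[lex.name] table, so treat it as an illustrative excerpt and not the full
list; the names CodePointRange, kAllowedExcerpt and in_allowed_excerpt are
made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// A closed range of Unicode code points, [first, last].
struct CodePointRange {
    std::uint32_t first;
    std::uint32_t last;
};

// Partial excerpt (assumption: copied from the C++17 [lex.name] table of
// allowed ranges; the real table has many more entries).
constexpr CodePointRange kAllowedExcerpt[] = {
    {0x00C0, 0x00D6},  // Latin letters with diacritics
    {0x00D8, 0x00F6},  // more Latin letters (U+00D7, the times sign, is skipped)
    {0x200B, 0x200D},  // zero-width space, zero-width non-joiner, zero-width joiner
    {0x202A, 0x202E},  // bidirectional embedding and override controls
    {0x2060, 0x206F},  // word joiner and other invisible format characters
};

// Returns true if cp falls in one of the excerpted ranges. A false result
// only means "not in this excerpt", not "disallowed by the standard".
constexpr bool in_allowed_excerpt(std::uint32_t cp) {
    for (const auto& r : kAllowedExcerpt) {
        if (cp >= r.first && cp <= r.last) {
            return true;
        }
    }
    return false;
}

// Invisible characters pass the range check just like letters do.
static_assert(in_allowed_excerpt(0x200B), "zero-width space is in the excerpt");
static_assert(in_allowed_excerpt(0x200D), "zero-width joiner is in the excerpt");
static_assert(in_allowed_excerpt(0x00E9), "e with acute accent is in the excerpt");
```

The point is that a flat list of ranges cannot tell a letter from an
invisible formatting character: both are just "in range".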

I asked where it came from, and IIUC John looked at Unicode and cobbled
together the list of valid ranges manually. That ain't great.

Is this group interested in fixing things?

There's already a standard for this (Unicode Standard Annex #31). Maybe it's
a thing we can adopt as-is or use as a starting point:

https://unicode.org/reports/tr31/


Further, the tooling group was just talking about module names. I think we
should allow any valid identifier as a module name, and look at how this
could map to file names for a tooling TR's purpose.

Thanks,

J̙̘̗̘̟͐̀̎F͚̜͈̖͉̗̘̊