Jarrad J. Waterloo <descender76 at gmail dot com>
Audience: Evolution Working Group (EWG)
Programmers, businesses and governments want C++ to be safer and simpler. This has led some C++ programmers to create new programming languages or preprocessors, which again amount to new languages. This paper discusses using static analysis to make the C++ language itself safer and simpler.
What follows is a wishlist. Most items are optional, though all of them would be of benefit. It all starts with a new repeatable module-level attribute that would preferably be applied once in the primary module interface unit and would automatically apply to it and to all module implementation units. It could also be applied to a single module implementation unit, but that would generally be less useful. However, it might aid in gradual migration.
export module some_module_name [[static_analysis("")]]; // primary module interface unit
// or
module some_module_name [[static_analysis("")]]; // module implementation unit
The static_analysis attribute could also be passed as an environment variable and/or a command line argument to compilers, so that pipelines can assert the degree of conformance to the defined static analyzer without actually changing the source.
The names of static analyzers are dotted. Unscoped names, and names that start with c., are reserved for standardization.
This proposal wants to standardize two overarching static analyzer names.
Neither is concerned with formatting or nitpicking. Both static analyzers only produce errors. They are meant for programmers, businesses and governments for whom safety takes precedence. They both represent +∞. When a new version of the standard is released and adds new sub static analyzers, then everyone’s code is broken until it is fixed. These sub static analyzers usually cover features that have been mostly replaced by some other feature. Ideally, the errors produced would not only say that the code is wrong but also provide a link to HTML page(s) maintained by the C++ teaching group and the authors of the C++ Core Guidelines, as well as to pages for compiler-specific errors. These pages should provide example(s) of what is being replaced and what replaced it. Mentioning the version of the C++ standard would also be helpful.
All modules can be used even if they don’t use the static_analysis attribute, as this allows gradual adoption.
What are the modern analyzers composed of?
These overarching static analyzers are composed of multiple static analyzers which can be used individually to allow a degree of gradual adoption.
const character type” or “const character type” and their arguments were string literals.
std::string_view must be creatable at compile time
C++ has been advocated to programmers of other programming languages who complain about memory issues. This allows us to show them what we have been saying for decades.
1985: Cfront 1.0 
C++ Core Guidelines [1:1] identifies issues that this feature helps to mitigate.
P.4: Ideally, a program should be statically type safe
P.6: What cannot be checked at compile time should be checkable at run time
P.7: Catch run-time errors early
P.8: Don’t leak any resources
P.11: Encapsulate messy constructs, rather than spreading through the code
P.12: Use supporting tools as appropriate
P.13: Use support libraries as appropriate
I.4: Make interfaces precisely and strongly typed
I.11: Never transfer ownership by a raw pointer (T*) or reference (T&)
I.12: Declare a pointer that must not be null as not_null
I.13: Do not pass an array as a single pointer
I.23: Keep the number of function arguments low
F.7: For general use, take T* or T& arguments rather than smart pointers
F.15: Prefer simple and conventional ways of passing information
F.22: Use T* or owner<T*> to designate a single object
F.23: Use a not_null<T> to indicate that “null” is not a valid value
F.25: Use a zstring or a not_null<zstring> to designate a C-style string
F.26: Use a unique_ptr<T> to transfer ownership where a pointer is needed
F.27: Use a shared_ptr<T> to share ownership
F.42: Return a T* to indicate a position (only)
F.43: Never (directly or indirectly) return a pointer or a reference to a local object
C.31: All resources acquired by a class must be released by the class’s destructor
C.32: If a class has a raw pointer (T*) or reference (T&), consider whether it might be owning
C.33: If a class has an owning pointer member, define a destructor
C.149: Use unique_ptr or shared_ptr to avoid forgetting to delete objects created using new
C.150: Use make_unique() to construct objects owned by unique_ptrs
C.151: Use make_shared() to construct objects owned by shared_ptrs
R.1: Manage resources automatically using resource handles and RAII (Resource Acquisition Is Initialization)
R.2: In interfaces, use raw pointers to denote individual objects (only)
R.3: A raw pointer (a T*) is non-owning
R.5: Prefer scoped objects, don’t heap-allocate unnecessarily
R.10: Avoid malloc() and free()
R.11: Avoid calling new and delete explicitly
R.12: Immediately give the result of an explicit resource allocation to a manager object
R.13: Perform at most one explicit resource allocation in a single expression statement
R.14: Avoid [] parameters, prefer span
R.15: Always overload matched allocation/deallocation pairs
R.20: Use unique_ptr or shared_ptr to represent ownership
R.22: Use make_shared() to make shared_ptrs
R.23: Use make_unique() to make unique_ptrs
ES.20: Always initialize an object
ES.24: Use a unique_ptr<T> to hold pointers
ES.42: Keep use of pointers simple and straightforward
ES.47: Use nullptr rather than 0 or NULL
ES.60: Avoid new and delete outside resource management functions
ES.61: Delete arrays using delete[] and non-arrays using delete
ES.65: Don’t dereference an invalid pointer
E.13: Never throw while being the direct owner of an object
CPL.1: Prefer C++ to C
Usage of smart pointers
This static analyzer causes programmers to use 2 extra characters when using smart pointers, (*). instead of ->, since the overloaded -> operator returns a pointer.
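The difference can be sketched as follows. This is only an illustration: the Widget type and the function names are hypothetical, and it assumes the analyzer rejects the arrow form on smart pointers for the reason given above.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical type used only for this illustration.
struct Widget {
    std::string name;
    int size() const { return static_cast<int>(name.size()); }
};

// Arrow form: the overloaded operator-> hands back a raw Widget*,
// which is what the analyzer described above would object to.
int size_via_arrow(const std::unique_ptr<Widget>& w) {
    return w->size();
}

// Dereference form: operator* yields a Widget&, so no raw pointer
// appears in the expression. It costs two extra characters.
int size_via_deref(const std::unique_ptr<Widget>& w) {
    return (*w).size();
}

inline void example() {
    auto w = std::make_unique<Widget>(Widget{"gear"});
    assert(size_via_arrow(w) == size_via_deref(w));
}
```

Both functions compute the same result; only the second spelling stays within references.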
the main function and environment variables
A shim module is needed in order to transform the main and env functions into more C++ friendly functions. These have been asked for for years.
A Modern C++ Signature for main
Desert Sessions: Improving hostile environment interactions
C/core cast produces an error.
reinterpret_cast produces an error.
const_cast produces an error.
C/core cast was replaced by
reinterpret_cast is needed more by library authors than by their users. For library users it usually just causes problems and questions. It is rarely used in daily C++ when coding at a higher level.
const_cast is needed more by library authors than by their users. It is a means for the programmer to lie to oneself. For library users it usually just causes problems and questions. It is rarely used in daily C++ when coding at a higher level.
C++ Core Guidelines [1:2] identifies issues that this feature helps to mitigate.
C.146: Use dynamic_cast where class hierarchy navigation is unavoidable
ES.48: Avoid casts
ES.49: If you must use a cast, use a named cast
ES.50: Don’t cast away const
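As a hedged sketch of what code might look like once the C cast, reinterpret_cast and const_cast are rejected, the example below uses only static_cast and dynamic_cast. The Shape hierarchy and function names are hypothetical, and the assumption that these two named casts remain permitted is inferred from the guidelines listed above rather than stated by the paper.

```cpp
#include <cassert>

// Hypothetical class hierarchy used only for this illustration.
struct Shape { virtual ~Shape() = default; };
struct Circle : Shape { double radius = 1.0; };
struct Square : Shape { double side = 2.0; };

// static_cast names the intended conversion explicitly,
// unlike the C style spelling (int)value.
int truncate_to_int(double value) {
    return static_cast<int>(value);
}

// dynamic_cast checks the dynamic type at run time and yields
// a null pointer when the object is not a Circle (see C.146).
bool is_circle(const Shape& shape) {
    return dynamic_cast<const Circle*>(&shape) != nullptr;
}

inline void example() {
    Circle c;
    Square s;
    assert(is_circle(c));
    assert(!is_circle(s));
    assert(truncate_to_int(3.9) == 3);
}
```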
Using the union keyword produces an error.
It was replaced by std::variant, which is safer.
C++ Core Guidelines [1:3] identifies issues that this feature helps to mitigate.
C.181: Avoid “naked” unions
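A minimal sketch of the replacement follows. The Value alias and the Describe visitor are hypothetical names chosen for illustration; the point is that std::variant tracks which alternative is active, so the wrong member cannot be read silently as with a naked union.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical alias: a value that is exactly one of three types.
using Value = std::variant<int, double, std::string>;

// Visitor with one overload per alternative. std::visit dispatches
// on the alternative that is currently active.
struct Describe {
    std::string operator()(int) const { return "int"; }
    std::string operator()(double) const { return "double"; }
    std::string operator()(const std::string&) const { return "string"; }
};

std::string describe(const Value& v) {
    return std::visit(Describe{}, v);
}

inline void example() {
    Value v{42};
    assert(describe(v) == "int");
    v = std::string("hello");  // switching alternatives is tracked
    assert(describe(v) == "string");
}
```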
Using the mutable keyword produces an error.
The programmer shall not lie to oneself. The mutable keyword violates the safety of const and is rarely used at a high level.
Using the new and delete keywords to allocate and deallocate memory produces an error.
They were replaced by std::make_unique and std::make_shared, which are safer.
C++ Core Guidelines [1:4] identifies issues that this feature helps to mitigate.
F.26: Use a unique_ptr<T> to transfer ownership where a pointer is needed[20:1]
F.27: Use a shared_ptr<T> to share ownership[21:1]
C.149: Use unique_ptr or shared_ptr to avoid forgetting to delete objects created using new[27:1]
C.150: Use make_unique() to construct objects owned by unique_ptrs[28:1]
C.151: Use make_shared() to construct objects owned by shared_ptrs[29:1]
R.11: Avoid calling new and delete explicitly[35:1]
R.20: Use unique_ptr or shared_ptr to represent ownership[40:1]
R.22: Use make_shared() to make shared_ptrs[41:1]
R.23: Use make_unique() to make unique_ptrs[42:1]
ES.60: Avoid new and delete outside resource management functions[47:1]
ES.61: Delete arrays using delete[] and non-arrays using delete[48:1]
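The guidelines above can be illustrated with a short sketch. The Node type and the factory function names are hypothetical; the example only shows that ownership can be created and shared without writing new or delete.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical type used only for this illustration.
struct Node { int value; };

// Sole ownership: the Node is destroyed when the unique_ptr goes away.
std::unique_ptr<Node> make_node(int value) {
    return std::make_unique<Node>(Node{value});
}

// Shared ownership: the buffer lives until the last shared_ptr is gone.
std::shared_ptr<std::vector<int>> make_shared_buffer(std::size_t n) {
    return std::make_shared<std::vector<int>>(n, 0);
}

inline void example() {
    auto node = make_node(7);
    assert((*node).value == 7);

    auto buffer = make_shared_buffer(3);
    auto alias = buffer;  // second owner, no manual bookkeeping
    assert(buffer.use_count() == 2);
    assert((*alias).size() == 3);
}
```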
Using the volatile keyword produces an error.
The volatile keyword has nothing to do with concurrency. Use
C++ Core Guidelines [1:5] identifies issues that this feature helps to mitigate.
CP.8: Don’t try to use volatile for synchronization
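As a hedged illustration of CP.8, the sketch below uses std::atomic where a volatile counter might once have been attempted. The HitCounter class is hypothetical, and treating std::atomic as the replacement is an assumption drawn from the guideline, not something this text spells out.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical counter that may be updated from several threads.
// std::atomic makes each increment an indivisible operation,
// which volatile does not guarantee.
class HitCounter {
    std::atomic<int> hits_{0};

public:
    void record() { hits_.fetch_add(1, std::memory_order_relaxed); }
    int total() const { return hits_.load(std::memory_order_relaxed); }
};

inline void example() {
    HitCounter counter;
    counter.record();
    counter.record();
    assert(counter.total() == 2);
}
```

The example function runs on a single thread for brevity; record() is written so that it could equally be called concurrently.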
C style variadic functions
C style variadic function produces an error.
va_list functions produce errors.
C style variadic functions have been replaced by overloading, templates and variadic template functions.
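A minimal sketch of the variadic template alternative is shown below. The concat function is a hypothetical stand-in for a printf-like routine; every argument keeps its static type, so there is no format string to get out of sync and no va_list to walk.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical type-safe replacement for a printf-like C variadic
// function. Each argument is streamed with its own operator<<.
template <typename... Args>
std::string concat(const Args&... args) {
    std::ostringstream out;
    (out << ... << args);  // C++17 fold expression over operator<<
    return out.str();
}

inline void example() {
    assert(concat("x=", 1, ", y=", 2.5) == "x=1, y=2.5");
    assert(concat() == "");  // an empty pack is also fine
}
```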
C++ Core Guidelines [1:6] identifies issues that this feature helps to mitigate.
Deprecated functionality is not modern.
C++ array variable, whether locally or in a class, produces an error.
std::array and other collections.
std::array instead of
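The following sketch illustrates the std::array replacement. The sum function is hypothetical; the point is that the size is part of the type, so the array does not decay to a pointer when passed to a function.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>

// Hypothetical helper. N is deduced from the argument, so the
// function always knows how many elements it was given.
template <std::size_t N>
int sum(const std::array<int, N>& values) {
    return std::accumulate(values.begin(), values.end(), 0);
}

inline void example() {
    std::array<int, 4> values{1, 2, 3, 4};
    assert(values.size() == 4);
    assert(sum(values) == 10);
}
```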
What may the modern analyzers be composed of in the future?
The preprocessor directive #include has been replaced with
Don’t add the static analyzer until #embed is added.
NOTE: This may be impossible to implement as preprocessing occurs before compilation.
Using the goto keyword produces an error.
Don’t add until continue to a label is added. Also, a really easy to use finite state machine library may be needed.
C++ Core Guidelines [1:7] identifies issues that this feature helps to mitigate.
ES.76: Avoid goto
C++ function pointer, whether locally or in a class, produces an error.
C++ member function pointer, whether locally or in a class, produces an error.
std::function_ref instead of C++ [member] function pointers.
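To make the contrast concrete, here is a hedged sketch of a callback written both ways. Because std::function_ref is not yet widely available, std::function is used as a stand-in; that substitution is an assumption of this example and std::function has different ownership and cost characteristics. All function names are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Legacy shape: a function pointer plus an untyped void* carrying state.
using legacy_transform = int (*)(int value, void* state);

int sum_legacy(const std::vector<int>& values,
               legacy_transform transform, void* state) {
    int total = 0;
    for (int v : values) total += transform(v, state);
    return total;
}

// Callable-wrapper shape: the state travels inside the callable itself.
// std::function stands in here for std::function_ref.
int sum_wrapped(const std::vector<int>& values,
                const std::function<int(int)>& transform) {
    int total = 0;
    for (int v : values) total += transform(v);
    return total;
}

inline void example() {
    const std::vector<int> values{1, 2, 3};
    int offset = 10;
    // Legacy call: state is passed through void* and cast back by hand.
    int legacy = sum_legacy(
        values,
        [](int v, void* s) { return v + *static_cast<int*>(s); },
        &offset);
    // Wrapped call: the lambda simply captures what it needs.
    int wrapped = sum_wrapped(values, [offset](int v) { return v + offset; });
    assert(legacy == 36);
    assert(wrapped == 36);
}
```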
std::function_ref can bind to stateful and stateless, free and member functions. It saves programmers from having to include a void* state parameter in their function pointer types, and from having to pass a void* state parameter alongside the function pointer in each function declaration where the function pointer type is used. Neither of these could be done under the "use_lvalue_references" static analyzer.
By adding static analysis to the C++ language we can make the language safer and easier to teach, because we can restrict how much of the language we use. Human readable errors and references turn the compiler into a teacher, freeing human teachers to focus on what the compiler doesn’t handle.
NO, otherwise we’ll be stuck with what we just have.
C++ compilers produce plenty of warnings.
C++ static analyzers produce plenty of warnings. However, when someone talks about creating a new language, the old language syntax becomes invalid, i.e. it produces errors. This is for programmers. Programmers and businesses rarely upgrade their code unless they are forced to. Businesses and governments want errors as well, in order to ensure code quality and to have the assurance that bad code doesn’t exist anywhere in the module. This is also important from a language standpoint because we are essentially pruning, somewhat. Keep in mind that all of these pruned features still have uses now. In the future, more constructs will be built upon these pruned features, which is why they need to remain part of the language, just not a part of everyday usage of the language.
Programmers and businesses rarely upgrade their code unless they are forced to. New programmers need training wheels and some of us older programmers like them too. Due to the proliferation of government regulations and oversight, businesses have acquired software composition analysis services and tools. These services map security errors to specific versions of modules; specifically programming artifacts such as executables and libraries. As such, businesses want to know if a module is reasonably safe.
Any arguments provided after the name of the analyzer can be forwarded onto the analyzer.
[[static_analysis("some_future_analyzer", 1, true, 0.5, "Hello World")]]
In this case, 1, true, 0.5, "Hello World" would all be forwarded to the static analyzer “some_future_analyzer”. None of the current analyzers use this functionality, so this just illustrates distant future work where we can define these analyzers in a standard fashion, but that can’t happen until we have a code DOM. As such, how these arguments are forwarded is currently compiler specific.
Actually, I love C++ and pointers.
C++ libraries use pointers but the users of those libraries don’t need them.
void* for type erasure but the users of function_ref, most of the time, won’t need it.
The fact is that pointers, unsafe casts and goto are the engine of C++ change. As such, it would be foolish to remove them, but it is also unrealistic to expect users/drivers of a vehicle to drive with nothing between them and the engine without hearing them clamor for interior finishing.
static_analysis attribute so that static analyzers can be called?
C++ reserve unscoped or names that start with c. are for future standardization?
C++ reserve the names of static analyzers in the reserved C++ static analyzer namespace?
C++ recommend these reserved static analyzers and leave it to the compiler writers to appease their users that clamor for them?
First of all, let’s consider the quotes of Bjarne Stroustrup that this question is based upon.
“being defined by an ‘industry consortium.’ I am not in favor of language subsets or dialects. I am especially not fond of subsets that cannot support the standard library so that the users of that subset must invent their own incompatible foundation libraries. I fear that a defined subset of C++ could split the user community and cause acrimony” [65:1]
Does this paper create a subset? YES. Like it or not, C++ already has a couple of subsets; some positive, some quasi.
Freestanding is a subset for low level programming. This proposal primarily focuses on high level programming, but there is nothing preventing the creation of [[static_analysis("freestanding")]] which enforces
C++ value categories have to some degree fractured the community into a clergy class that thoroughly understands their intricacies and a laity class that gleefully uses them.
Does this paper split the user community? YES and NO. It splits code into safer vs. less safe, high level vs. low level. However, this is performed at the module level, allowing the same programmer to decide what falls on either side of the fence. This would not be performed by an industry consortium but rather by the standard. Safer modules can be used by less safe modules. Less safe modules can partly be used by safer modules, such as with the standard module. This latter impact is already minimized because the standard frequently writes its library code in C++ fashion instead of a
“Are there any features you’d like to remove from C++?” 
Not really. People who ask this kind of question usually think of one of the major features such as multiple inheritance, exceptions, templates, or run-time type identification. C++ would be incomplete without those. I have reviewed their design over the years, and together with the standards committee I have improved some of their details, but none could be removed without doing damage. [66:1]
Most of the features I dislike from a language-design perspective (e.g., the declarator syntax and array decay) are part of the C subset of C++ and couldn’t be removed without doing harm to programmers working under real-world conditions. C++'s C compatibility was a key language design decision rather than a marketing gimmick. Compatibility has been difficult to achieve and maintain, but real benefits to real programmers resulted, and still result today. By now, C++ has features that allow a programmer to refrain from using the most troublesome C features. For example, standard library containers such as vector, list, map, and string can be used to avoid most tricky low-level pointer manipulation. [66:2]
The beauty of this proposal is that it both does and does not remove features from C++. Like the standard library, it allows programmers to refrain from using the most troublesome
“Within C++, there is a much smaller and cleaner language struggling to get out” 
Making things both smaller and cleaner requires removing something. When creating a new language, removing things happens extensively at the beginning but, frequently, features have to be added back in when programmers clamor for them. This paper cleans up a programmer’s use of the C++ language, meaning less C++ has to be taught immediately, thus making things simpler. As a programmer matures, features can be gradually added to their repertoire, just as they were added to ours. After all, isn’t C++ larger now than when we started programming in