Document number N1394=02-0052, for the Evolution working group

Document number N1394=02-0052 2002-Sept-10

Some Proposed Extensions to C++ Language

For Evolution Working Group

David E. Miller

1. "Finally" block

Problem(s) to be addressed

Factoring block completion code into a single place.

Easy instrumentation (retrofitting) of existing code to track function returns, as well as easier normalization of existing code.

Why this is not solved by use of simple destructors associated with the variables declared within the blocks

It often occurs that some final action depends upon the combined state of multiple variables, so the destructors associated with individual variables are to no avail.

How the problem is partially solved now

Factoring common code into a single function (including the special case of a local class’ destructor)

Repeating loose code for each possible point of exit.
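The local-class special case mentioned above might look like the following minimal sketch. The names `transfer`, `Cleanup`, and `log_lines` are hypothetical, chosen only to show a final action that depends on the combined state of two variables rather than on either one alone.

```cpp
#include <string>
#include <vector>

// Hypothetical log that the final action writes to.
std::vector<std::string> log_lines;

// Returns true on success. Every exit path triggers the same final action.
bool transfer(int requested, int available)
{
    int moved = 0;
    bool committed = false;

    // Local class holding references to both variables, so that its
    // destructor can act on their combined state when the block exits.
    struct Cleanup
    {
        const int& moved_ref;
        const bool& committed_ref;
        ~Cleanup()
        {
            if (moved_ref > 0 && !committed_ref)
                log_lines.push_back("rolled back " + std::to_string(moved_ref));
            else if (committed_ref)
                log_lines.push_back("committed " + std::to_string(moved_ref));
            else
                log_lines.push_back("nothing to do");
        }
    } cleanup{moved, committed};

    if (requested <= 0)
        return false;        // exit 1: nothing was moved
    moved = requested;
    if (moved > available)
        return false;        // exit 2: moved but never committed
    committed = true;
    return true;             // exit 3: success
}
```

Note the cost of this workaround: each variable the final action needs must be threaded into the local class by reference.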

For the case of the current try-catch block code, there are several approaches to executing the same code for multiple catch blocks and for the case in which no exception is thrown:

Factor the code into a separate function, which is called by each catch block and after the last catch block, taking into account that if a catch block does not throw, the common function must not be called twice. This can be done by using a flag existing outside the try-catch block or by executing a "goto" statement from each such catch block past the non-throw invocation of the common function.

As a variation of this approach, a local class can be declared, an instance of which can be initialized with references to the variables to be handled, with the local class’ destructor performing the needed actions.

Repeating code in each catch block, as well as after the last catch block, in a similar fashion to using a function.
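The first approach above (a common function plus a flag outside the try-catch block) could be sketched as follows. The names `process`, `common_completion`, and `trace` are hypothetical, and the `mode` parameter exists only to exercise the three paths.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record of what ran, in order.
std::vector<std::string> trace;

// Common completion code, factored into one function.
void common_completion() { trace.push_back("common"); }

// mode 0: no exception; 1: exception handled locally; 2: exception rethrown.
void process(int mode)
{
    bool completion_done = false;   // flag existing outside the try-catch block
    try
    {
        if (mode == 1) throw std::runtime_error("recoverable");
        if (mode == 2) throw std::logic_error("fatal");
        trace.push_back("body");
    }
    catch (const std::runtime_error&)
    {
        trace.push_back("handled");
        common_completion();
        completion_done = true;     // prevents the second call below
    }
    catch (const std::logic_error&)
    {
        common_completion();        // must run before leaving via throw
        throw;
    }
    if (!completion_done)
        common_completion();        // the no-exception path
}
```

Every catch block has to remember both the call and, if it does not throw, the flag. Those are the repeated obligations that a single "finally" block would remove.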

Advantages and disadvantages of each approach

An advantage of putting the code into a function is the guarantee that the multiple uses will not get out of sync in the event the code needs to be changed.

A disadvantage of putting the code into a function is the inability of the function’s code to execute directly a return statement from the point of invocation, thus requiring extra code to provide for this possibility.

For example, loose code could say, "if( condition ) return some_value;" whereas an invoked function could do no more than return a flag indicating that the caller should then return, which leads back to the loose code the function was intended to avoid.
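A minimal sketch of this disadvantage, using hypothetical names (`lookup_inline`, `should_return_early`, `lookup_factored`):

```cpp
// Loose code can return directly from the enclosing function.
int lookup_inline(int key)
{
    if (key < 0) return -1;     // direct return at the point of use
    return key * 2;
}

// A factored helper cannot return on its caller's behalf. It can only
// report, through a flag, that the caller ought to return.
bool should_return_early(int key, int& result)
{
    if (key < 0)
    {
        result = -1;
        return true;
    }
    return false;
}

int lookup_factored(int key)
{
    int result = 0;
    if (should_return_early(key, result))   // the loose test reappears here
        return result;
    return key * 2;
}
```

Both versions compute the same values, but the factored one still needs an if-return at every call site.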

Mechanical instrumentation of existing code

It is convenient for debugging and maintenance purposes, particularly when investigating code written by others who may be long gone, to be able to track exits from functions and to be able to guarantee that a single breakpoint can be placed into each function at the point of return, rather than having to spend extra time tracking down every possible return or throw statement.

Current approach

Insert an instance of a locally defined class at the start of each function, using the destructor to execute some useful code.
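As a sketch of this current approach, assuming a hypothetical function `classify` and a hypothetical counter `exits_seen` added only to make the effect observable:

```cpp
#include <iostream>
#include <stdexcept>

// Hypothetical counter of function exits, for illustration only.
int exits_seen = 0;

int classify(int n)
{
    // Locally defined class placed at the start of the function. Its
    // destructor runs on every exit, whether by return or by throw.
    struct ExitTracker
    {
        ~ExitTracker()
        {
            ++exits_seen;   // convenient location for a breakpoint
            std::cout << "returning from classify()" << std::endl;
        }
    } tracker;

    if (n < 0) throw std::invalid_argument("negative");
    if (n == 0) return 0;
    return (n % 2 == 0) ? 2 : 1;
}
```

The destructor observes the exit but has no direct access to the function's other local variables or its return value.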

Proposed approach

Add a new "finally" keyword having effect similar (not necessarily identical) to that of Java or C#.

Appended to try-catch blocks, "finally" would result in a block executed regardless of which (if any) "catch" block associated with the preceding "try" block was executed.

Exiting a try-catch block by use of a "goto" to a point after the "finally" block would bypass the "finally" block. This construction should elicit some warning message from the compiler.

Exiting a try-catch block by use of a return statement would NOT bypass the "finally" block. In the event a "finally" block executed as a result of a return statement itself executes a return statement, the value specified in a return statement executed within the "finally" block would supersede that of the original return statement. In the event of exiting multiple "finally" blocks with more than one executing a value return statement, the last return value would pertain.

As in C#, a "finally" block could be used with a "try" block lacking any "catch" block, in which case its behavior would be equivalent to a try-catch-finally block in which the sole catch block consisted of a rethrown generic exception.

try
{
}
catch(...) // This catch block could be omitted without change of effect
{
    throw ;
}
finally
{
}

Unlike C#, which permits at most one, multiple "finally" blocks would be allowed for a single "try" block.

Because "catch" blocks could be omitted, it becomes very easy to automate the process of instrumenting existing code by adding "try" to the function start and "finally" to the function end.

Example

void f() try
{
}
finally
{
cout << "returning from f()" << endl ; // convenient location for breakpoint
}

Impact upon backward compatibility

Additional keyword

The introduction of a new keyword would require that any code already using the new keyword be modified by a global search-and-replace operation to change the name to something else. However, since this would not require any structural change, the impact would be less than, for example, the change in "for" variable scope.

Impact upon compiler

Since the "finally" concept has already been implemented in Java, C#, and at least one C++ compiler (as a non-standard extension), albeit with somewhat different details than those in this proposal, one may expect the impact to be limited.

2. New keyword "shared"

Problem(s) to be addressed

Currently, there is no uniform way to specify that data are, or are not, shared among multiple threads, which often leads to code that inadvertently fails to lock, or doubly locks, data. In the former case, non-deterministic data corruption can occur; in the latter case, deadlocks can occur.

Proposed approach

Add a "shared" keyword having several characteristics:

Shared primitive data types are not usable, except for address purposes (pointer and reference).

Note how this differs from the "volatile" modifier, in which a volatile primitive can be used in an ordinary manner.

Non-shared object instances can be acted upon only by functions not having the "shared" modifier.

Note how this differs from the "volatile" modifier, in which a non-volatile object can be acted upon by a function expecting a volatile.

Shared object instances can be acted upon only by functions having the "shared" modifier.

This is similar to the current "volatile" modifier.

Usage examples

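class C
{
    MutexObject m_lock ;
    int m_int ;

    void f() shared
    {
        Locker locker1( m_lock );                        // hold the lock for the rest of this call
        C * pThisUnshared = const_cast< C * >( this );   // cast away "shared"
        pThisUnshared->f();                              // invokes the non-shared f() below
    }

    void f()
    {
        this->m_int = ... ;
    }
};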

Similarities to, and differences from, current "volatile" keyword

The "volatile" keyword does not preclude a volatile primitive type from being used.

There is no way to prevent a volatile function from being inadvertently invoked on a non-volatile object, other than by preemptively overloading each such function with a non-volatile variation, which may not be a desirable approach for obvious reasons.
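The comparison can be made concrete in standard C++ by spelling the proposed "shared" modifier as `volatile`. This is only an approximation of the usage example above, not the proposed feature. The names `Counter`, `increment`, `value`, `bump`, and `Probe` are hypothetical, and `std::mutex` and `std::lock_guard` stand in for `MutexObject` and `Locker`.

```cpp
#include <mutex>

class Counter
{
    std::mutex m_lock;
    int m_int = 0;

public:
    // "Shared" entry point, spelled volatile here: lock, then delegate.
    void increment() volatile
    {
        // Cast away volatile first, since std::mutex has no volatile members.
        Counter* self = const_cast<Counter*>(this);
        std::lock_guard<std::mutex> guard(self->m_lock);
        self->increment();      // resolves to the non-volatile overload
    }

    // Unshared entry point: assumes the caller already has exclusive access.
    void increment() { m_int = m_int + 1; }

    int value() const { return m_int; }
};

// First difference: a volatile primitive remains directly usable,
// with no lock anywhere in sight.
int bump(volatile int& n)
{
    n = n + 1;
    return n;
}

// Second difference: a volatile member function can be invoked on a
// non-volatile object without complaint.
struct Probe
{
    int calls = 0;
    void touch() volatile { const_cast<Probe*>(this)->calls += 1; }
};
```

In use, the object should be defined without `volatile` and reached through a `volatile Counter&` where it is shared. Casting away `volatile` from an object that was itself defined `volatile` and then accessing it is undefined behavior.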

Impact upon backward compatibility

Additional keyword

Impact upon compiler

Since there is already a mechanism for dealing with const and volatile, adding another variation should not involve a great addition to compiler complexity.

3. Generalization of hexadecimal number specification (0X...) to allow specifying the radix (nX..., where n is 1-16)

Problem(s) to be addressed

More flexible initialization of values that might be more readily expressed in some base other than 8, 10, or 16.

Uniform specification of based numbers, rather than the ad hoc decimal, octal, hexadecimal syntaxes presently in use.

Simple way to specify bit counts and, by implication, parity, without invoking a function.

Proposed approach

The current base-16 syntax can be trivially extended with complete backward compatibility by allowing the number preceding the ‘X’ to be extended to the set {0..16}, where any non-zero value would be construed to be the radix.

The special case of a radix of one would be construed to yield a simple tally of the 1 bits in the hexadecimal number that follows (its population count), which can be convenient for automating the specification of parity values for compile-time binary constants, among other things.

Impact upon backward compatibility

Since the proposed extension extends the syntax merely by allowing the numbers 1-16 in front of the ‘X’ of the current hexadecimal notation, no currently valid code should be broken.

Impact upon compiler

Based upon informal comments made by some compiler writers at the Spring 2002 meeting, the impact upon compilers should be extremely minor.

Examples of use

10x123 equivalent to 123

8x123 equivalent to 2x001010011, equivalent to 0x53

2x111 equivalent to 0x7

4x123 equivalent to 2x011011, equivalent to 0x1B

12x12 equivalent to 14, equivalent to 0xE

1x001010011 equivalent to 4

1x1234567 equivalent to 12 (1 + 1 + 2 + 1 + 2 + 2 + 3)

1x1234567 & 1 equivalent to 0 (creates parity bit for 0x1234567)
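The examples above can be checked with a small run-time evaluator for the proposed notation. This is a sketch under stated assumptions, not a description of how a compiler would implement the extension: the function names `digit_value`, `bit_count`, and `parse_radix_literal` are hypothetical, the literal is supplied as a string, and overflow of `unsigned long` is not detected.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>

// Value of one digit character, or -1 if it is not a hexadecimal digit.
int digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Number of 1 bits in v.
unsigned long bit_count(unsigned long v)
{
    unsigned long n = 0;
    for (; v != 0; v >>= 1) n += v & 1UL;
    return n;
}

// Evaluates text of the form <prefix>x<digits>, where <prefix> is 0..16.
unsigned long parse_radix_literal(const std::string& text)
{
    const std::string::size_type x = text.find_first_of("xX");
    if (x == std::string::npos || x == 0 || x + 1 >= text.size())
        throw std::invalid_argument("expected <prefix>x<digits>");

    unsigned long prefix = 0;
    for (std::string::size_type i = 0; i < x; ++i)
    {
        if (!std::isdigit(static_cast<unsigned char>(text[i])))
            throw std::invalid_argument("prefix must be decimal");
        prefix = prefix * 10 + static_cast<unsigned long>(text[i] - '0');
        if (prefix > 16) throw std::invalid_argument("prefix out of range");
    }

    // A prefix of 0 keeps today's meaning (hexadecimal). A prefix of 1 also
    // reads the digits as hexadecimal but yields the count of 1 bits.
    const unsigned long radix = (prefix <= 1) ? 16 : prefix;

    unsigned long value = 0;
    for (std::string::size_type i = x + 1; i < text.size(); ++i)
    {
        const int d = digit_value(text[i]);
        if (d < 0 || static_cast<unsigned long>(d) >= radix)
            throw std::invalid_argument("digit not valid for radix");
        value = value * radix + static_cast<unsigned long>(d);
    }
    return (prefix == 1) ? bit_count(value) : value;
}
```

For instance, `parse_radix_literal("4x123")` yields 27 (0x1B), and `parse_radix_literal("1x1234567") & 1` yields the parity bit 0.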