Draft proposal for adding Multimethods to C++

Document number: 03-0046/N1463

Author: Julian Smith (jules@op59.net)

Version: $Date: 2003/04/28 01:26:23 $ $Revision: 1.13 $

Contents

Multimethods as a C++ extension

Syntax

Stroustrup's Design and Evolution of C++ suggests a syntax for multimethods in C++, which this proposal approximately follows.

A function prototype is a multimethod function if one or more of its parameters are qualified with the keyword virtual. Implementations of a multimethod function have conventional parameters without the virtual qualifier.

The parameters that correspond to the virtual parameters in the virtual function prototype must be derived from the original types, and have the same modifiers (such as const).

Virtual function declarations must occur before multimethod implementation function definitions/declarations. Otherwise the multimethod implementation functions will be treated as conventional functions.

Multimethod implementation functions have the same name as the multimethod function, but with an additional trailing underscore (but see Unresolved issues).

Here is the classic multimethods example, an Overlap() function that takes references to a Shape base class. Detecting whether two shapes overlap requires different code for each combination of shapes:

struct  Shape            {...};
struct  Square   : Shape {...};
struct  Triangle : Shape {...};

bool    Overlap( virtual Shape& a, virtual Shape& b);
    
bool    Overlap_( Square& a, Triangle& b) {...}
bool    Overlap_( Triangle& a, Square& b) {...}
bool    Overlap_( Shape& a, Square& b)    {...}
bool    Overlap_( Square& b, Shape& a)    {...}

The Overlap( virtual Shape& a, virtual Shape& b) prototype is replaced by a dispatch function with prototype Overlap( Shape& a, Shape& b). Internally, this dispatch function uses C++ RTTI to choose one of the available Overlap_() functions, based on the dynamic types of its parameters.

Dispatch algorithm

It is possible that there isn't any matching multimethod implementation function for a particular combination of dynamic types. In this case, the generated dispatch function will throw an exception.

It will also throw an exception if there is no clear best multimethod implementation for a particular combination of dynamic types. A multimethod implementation is considered best only if both the following two conditions apply:

  1. All of the best multimethod implementation's parameter types match the dynamic types.
  2. Each of the best multimethod implementation's parameter types is at least as derived as the corresponding parameter type in any other matching multimethod implementation.

This is the same as the lookup rule used by C++ at compile time, apart from compile-time C++'s use of conversion operators and special-casing of built-in types.

Note that we cannot have duplicate multimethod implementations, so the second condition implies that for each other matching multimethod implementation X, the best multimethod implementation must have at least one parameter type that is more derived than the corresponding parameter type in X.

So:

Shape&  s = *new Square;
Shape&  t = *new Triangle;
Overlap( t, t); // Throws - no matching multimethod implementation
Overlap( s, t); // Calls Overlap_( Square&, Triangle&)
Overlap( t, s); // Calls Overlap_( Triangle&, Square&)
Overlap( s, s); // Throws - these multimethod implementations
                // both match, but neither is a
                // better match than the other:
                //   Overlap_( Shape& a, Square& b)
                //   Overlap_( Square& b, Shape& a)

There is also the possibility of ambiguities caused by a dynamic type multiply-inheriting from the same class more than once (a similar error can already occur at compile time if a static type multiply-inherits from the same class more than once).

Miscellaneous issues

Classes that are used in multimethods shall be polymorphic - i.e. have at least one virtual function. This is necessary to allow the dispatcher to use RTTI on the classes.

Each virtual parameter must be one of:

If a virtual parameter is a pointer, passing a NULL pointer gives undefined behaviour.

Multimethod implementations must match the use of reference/pointer parameters in the original multimethod:

bool Overlap( virtual Shape&, virtual Shape*);
...
bool Overlap_( Square& a, Triangle* b){...} // ok

bool Overlap_( Circle& a, Ellipse& b) {...}
  // not an implementation of the multimethod,
  // because second parameter should be a pointer.

One can use non-virtual parameters in a multimethod in addition to the virtual parameters. So a multimethod could look like:

int foo(
    std::string&,
    virtual Base&,
    int x,
    virtual const Base&);

Suggested implementation requirements

Dispatch speed

The dispatch should give at worst O(Log N) dispatch speed, where N is the number of different combinations of dynamic types that have been passed to the multimethod during the whole runtime of the programme.

Use of multimethods before main()

Multimethods shall not be called before main() - i.e. static initialisation code shall not call multimethods.

This allows implementations that initialise globals before main() is entered, to use their own global initialisation support to register multimethods at runtime, avoiding the need to do whole-programme analysis at build time.

Impact on the standard

This proposal's syntax is a pure extension, which doesn't conflict with any existing C++ syntax.

Implementations that do not initialise globals before main() is entered, will have to use a custom techique to ensure that multimethod implementations are registered correctly before main() is entered.

Unresolved issues

Syntax

The use of a trailing underscore for multimethod implementations is an ugly hack, but I haven't been able to come up with a good alternative yet. In particular, it allows direct calling of multimethod implementations, which will get conventional compile-time overloading all all visible multimethod implementations. For the Cmm implementation, it also allows one to define and call a multimethod implementation that takes the base parameters, which would otherwise clash with the generated dispatch function.

bool overlaps = Overlap( shape1, shape2);
// conventional static dispatch.

bool overlaps = Overlap( virtual shape1, virtual shape2);
// call the multimethod.

Requiring the user to specify the dispatch in this way is contrary to the rest of C++ though.

An possible alternative would be to put multimethod implementations inside a namespace:

bool Overlap( virtual Shape&, virtual Shape&);
...
namespace std_mm
{
  bool Overlap( Square& a, Triangle& b) {...}
}

Should compile-time multimethod overloads be allowed? Maybe it could be used to restrict the number of possible multimethod implementations that are available for consideration at runtime. This would require that multimethod implementations are registered with more than one multimethod. I suspect that this would be a little-used feature, which would add unnecessary complexity to the proposal.

Volatile

Should virtual parameters be allowed to be qualified as volatile?

Covarient returns

By analogy with virtual functions, implementations of a multimethod that returns a reference or pointer to a type T, should probably be allowed to return a reference or pointer to anything derived from T.

Dispatch speed

The basic dispatch algorithm involves much use of RTTI, and is linear in the number of multimethod implementations available. Hence it is very slow.

The specified O(LogN) dispatch speed can be easily obtained using a std::map-style dispatch cache, mapping from arrays of type_info's to function pointers.

It is possible to get constant-time dispatch if classes are assigned small unique integer identifiers, which could be stored in the vtable. This allows a dispatch array of pointers to functions to be created for each multimethod, using the small integer identifiers as indices.

To avoid whole-programme analysis at build-time, it may be best to initialise these small integers to zero, and assign non-zero values at runtime whenever a class is first involved in a multimethod call; this would also avoid wasting integer space on classes that aren't involved in multimethod calls, which in turn would reduce the sizes of the dispatch arrays.

A dispatch function for the Overlap example using these indices would look something like:

bool Overlap( Shape& a, Shape& b)
{
    typedef bool (*fnptr)( Shape& a, Shape& b);
    
    static void fnptr** cache = NULL;

    int a_index = a._vtable->small_integer;
    int b_index = b._vtable->small_integer;
        
    if (   a_index
        && b_index
        && cache
        && ((int) cache[0]) > a_index
        && cache[ a_index]
        && ((int) cache[ a_index][0]) > b_index
        && cache[ a_index][ b_index])
    {
        // fast dispatch:
        return cache[ a_index][ b_index]( a, b);
    }
    else
    {
        /* Slow dispatch.
        Ensure _vtable->small_integer's are assigned.
        Ensure cache points to array of at least
            a_index+1 items, (storing array size in
            cache[0]).
        Ensure cache[ a_index] points to array of at
            least b_index+1 items, (storing array
            size in cache[ a_index][0]).
        Fill the cache using the slow algorithm
        Make function call. */
        ...
    }
}

The code leading to the `fast dispatch' could be implemented in a very few assembler instructions (e.g. around 10 instructions on an ARM).

Cmm implements constant time dispatch using a similar scheme to this. It can't modify the vtable layout, so unique small integers are implemented by Cmm inserting extra inline virtual functions containing their own internal static ints.

I'm not sure it would be a good thing to mandate constant time dispatch speed, because it requires changes to the vtable layout that might break backwards binary compatibility. And we have little experience of how important multimethod dispatch speed is in real applications. There may also be a possibility of large space overheads for the dispatch arrays in pathological cases.

Raw pointer to multimethod implementation function

For speed critical loops, it might be useful to provide a way for the user to get a pointer to the best multimethod implementation function for a particular set of dynamic types. This enables client code that calls a multimethod on the same parameters many times in a loop, to cache the function pointer and so avoid any dispatch overhead.

Cmm implements this as an extra dispatch function with the same name as the multimethod, but with a suffix _cmm_getimpl. Using the Overlap example, if you have one collection of shapes that you know are all squares, and you want to search for an overlap with a particular shape, you would usually do:

void Fn( std::vector< Square*> squares, Shape& s)
{
    std::vector< Square*>::iterator it;
    for( it=squares.begin(); it!=squares.end() ++ it)
    {
        if ( Overlap( **it, s)) {...}
    }
}

With the generated Overlap_cmm_get_impl function, you can avoid the multimethod dispatch overhead in each iteration:

void Fn( std::vector< Square*> squares, Shape& s)
{
    std::vector< Square*>::iterator it;
    if ( squares.empty()) return;
    
    bool (*fn)( Shape&, Shape&) =
        Overlap_cmm_get_impl( s, squares[0]);
    
    for( it=squares.begin(); it!=squares.end() ++ it)
    {
        if ( fn( **it, s)) {...}
    }
}

Use with dynamic loading of code

The Cmm implementation automatically registers and unregisters multimethod implementations when code in shared libraries/DLLs is loaded/unloaded. If support for dynamic loading of code is added to the C++ standard in the future, I think it would be a good idea to require that multimethod implementations in shared libraries/DLLs are treated in this way.

Ommitted features

Here are some features that have been suggested, with reasons why I am not currently including them in the proposal.

Multimethod member functions

It has been suggested that it should be possible to make member functions into multimethods. I don't like this idea because it adds complications for no real gain in functionality. In particular, mixing virtual member functions and multimethods would be plain confusing and not do anything that couldn't be done by a simple multimethod.

Multimethod templates

In principle, it may be possible to have multimethod templates, such as:

template< class ShapeBase>
bool Overlap( virtual ShapeBase&, virtual ShapeBase&);
// Dispatches to a Overlap_< ShapeBase>() function

However, I haven't figured out how this could work, or what it could be used for.

Multimethods resources

Bjarne Stroustrup writes about multimethods in The Design and Evolution of C++ (Section 13.8, pages 297-301).

Multimethods are not a new idea; for example, the Dylan language (see http://www.functionalobjects.com/resources/index.phtml) supports them natively, as does CLOS (Common Lisp).

If one restricts oneself to Standard C++, it is possible to approximate multimethods at the expense of verbosity in source code. See Item 31 in Scott Meyers' More Effective C++, and chapter 11 of Andrei Alexandrescu's Modern C++ Design, where template techniques are used extensively. Bill Weston has a slightly different template technique that supports non-exact matches, see http://homepage.ntlworld.com/w.weston/.

Cmm, a C++ multimethods implementation in the form of a source-code translator, is available from http://www.op59.net/cmm/readme.html.

A paper about multimethods, C++ and Cmm is on the ACCU 2003 conference CD, also available online at http://www.op59.net/accu-2003-multimethods.html.

The Frost project (http://frost.flewid.de) adds support for multimethods to the i386-version of the gcc compiler.

Acknowledgements

Thanks to Bill Weston for many interesting and useful discussions about multimethods.

Thanks to Loise Goldthwaite, Kevlin Henney and Anthony Williams for comments about this proposal on the cxxpanel@yahoogroups.co.uk list.

Cmm implementation issues

Cmm (See http://www.op59.net/cmm/readme.html) is a source-code processor which implements a multimethods language extension for C++. The description of Cmm's implementation here corresponds to cmm-0.22 (31 March 2003).

Cmm takes individual compilation units containing multimethod code that have already been run through the C++ preprocessor (e.g. #include's have been expanded), and generates legal C++ compilation units, which can then be compiled and linked together conventionally.

The generated C++ code calls some support functions that are provided as a single source file called dispatch.cpp. This contains functions that manage data structures that store all known multimethod functions and their implementations, the actual runtime dispatch functions, functions to support dispatch caching and also support for the exception types thrown for ambiguous/unmatched dispatches.

Cmm has been designed to generate multimethod code that supports dynamic loading and unloading of code, which means that all information about multimethods and their implementations are stored in dynamic data strucures. This is probably not directly relevent to this proposal, because the C++ standard doesn't concern itself with dynamic loading of code. However, it gives an example of the flexibility that the multimethods model can give.

Such implementation issues are not directly part of this proposal, but Cmm demonstrates that multimethods need not break the simple C compiler/linker model.

Generated code for multimethod dispatch

In the simple Overlap multimethod example described earlier, The virtual function was:

bool Overlap( virtual Shape&, virtual Shape&);

Consider an implementation of Overlap that is specialised for a first parameter Square and second parameter Triangle. This will look like:

// user-written multimethod implementation function
bool Overlap_( Square& a, Triangle& b) {...}

In order to perform multimethod dispatch, one has to first decide which multimethod implementations match the dynamic types, and then try to find one of the multimethod implementations which can be considered a better match then all of the others.

The first step is done by calling auxiliary functions that Cmm creates for each multimethod implementation function, which take the base parameters and return true only if each of the parameters can be dynamic_cast-ed to the multimethod implementation's parameters. Because these functions takes the virtual function's base parameters, we cannot use conventional overloading to distinguish them, and so Cmm makes the function names unique using a mangling scheme which, for simplicity, will be denoted by _XYZ in the following:

// Cmm-generated match function for the function
// bool Overlap( Square& a, Triangle& b);
bool Overlap_cmm_match_XYZ( Shape& a, Shape& b)
{
    if ( !dynamic_cast< Square*  >( &a)) return false;
    if ( !dynamic_cast< Triangle*>( &b)) return false;
    return true;
} 

This separate function is generated in the same compilation unit as the multimethod implementation, which enables the dynamic_cast to work with derived types defined in anonymous namespaces. [Actually, the Overlap_cmm_match_XYZ function takes an array of two void*'s rather than a separate parameter for each virtual type, each of which is static_cast-ed to Shape* before the dynamic_cast is attempted. This is to enable generic dispatch code to be used for different virtual functions.]

The second step requires that the inheritance tree for each dynamic type is known. The dispatch code can then compare the types taken by each matching multimethod implementation, and select the multimethod implementation for which each virtual parameter is no less derived than any other matching multimethod implementation's virtual parameter. As discussed earlier, this corresponds to the way conventional overloading works at compile time.

The information about the inheritance trees is encoded in C-style strings using a mangling scheme similar to that used by C++ compilers when generating symbol names. This allows static initialisation to be used to register multimethod implementations at runtime.

[Cmm can also register multimethod implementations at build time by requiring a separate link-style invocation, but this made builds very complicated and slow, and precludes use with dynamic loading of code. The only advantages of this scheme are that dispatch time may be slightly faster, and all multimethod implementations are usable by static initialisation code.]

Finally, the generic dispatch code calls the actual multimethod implementation via a wrapper function that takes the base parameters, casts them directly to the derived types, and calls the multimethod implementation. Again, this function name is mangled:

// Cmm-generated call function for the function
// bool Overlap( Square& a, Triangle& b);
bool Overlap_cmm_call_XYZ( Shape& a, Shape& b)
{
    return Overlap(
        *static_cast< Square*  >( &a),
        *static_cast< Triangle*>( &b));
}

This function's precondition is that the derived types are correct and so the static_cast's are legal. Using this wrapper function enables the dispatch code to work in terms of generic function pointers even if multimethod implementations use derived classes in anonymous namespace.

[The function should use dynamic_cast rather than static_cast when Derived inherits virtually from Base, but this hasn't been implemented yet.]

Registering multimethods using static initialisation

For each implementation, Cmm generates a global variable whose initialisation registers the implementation with the dispatch function:

static cmm_implementation_holder
    Overlap_XYZ(
        "7Overlap2_1_5Shape1_5Shape",
           // the multimethod
        "8Overlap_2_2_5Shape_6Square2_5Shape_8Triangle",
           // the multimethod implementation
        Overlap_cmm_implmatch_XYZ,
        Overlap_cmm_implcall_XYZ);

cmm_implementation_holder is a class defined in dispatc.h/dispatch.cpp, whose constructor de-mangles the first two textual parameters to extract complete information about the inheritance tree for each virtual parameter taken by the virtual function and the implementation function. Together with the Overlap_cmm_match functions, this is sufficient to enable multimethod dispatch to be performed.

In this example, the first mangled string means: "A function called Overlap with 2 virtual parameters, the first a class containing one item in its inheritance tree, Shape, and the second also containing the same single class in its inheritance tree, Shape". The second mangled string means: "A function called Overlap_ with 2 virtual parameters, the first one being a class with 2 items in its inheritance tree, Shape followed by Square, while the second parameter's type also contains 2 items in its inheritance tree, the first one being Shape, and the second Triangle".

This use of static initialisers to register implementations allows dynamically loaded code to automatically register new implementations with the dispatch functions. Furthermore, the destructor of the cmm_implementation_holder class unregisters the implementations, so one can load/unload code at will.

The handling of implementation functions in dynamically loaded code has been succesfully tested on OpenBSD 3.2, Slackware Linux 8.1 and Cygwin/Windows, using the dlopen() and dlclose() functions.

Dispatch caches

Figuring out which of a set of implementation functions to call for a particular set of dynamic types is very slow, so some sort of caching scheme is required. Caching is performed by the code in the dispatch.cpp library. Currently this uses a std::map to map from a std::vector of std::type_info's to a function pointer. This gives a dispatch speed of O(Log N), where N is the number of different combinations of dynamic types that have been encountered (some of which may be mapped to the same function pointer). On OpenBSD 3.2 with gcc 2.95, the dispatch time for two virtual parameters is around 10-100 times slower than a conventional virtual function call.

It would probably be better to have special cache support for multimethods with one or two virtual parameters, using a std::map with key types std::type_info[1] and std::type_info[2]. No doubt templates could be used to do this with maximum obfuscation.

Dispatch functions

The actual virtual dispatch function is very simple, because it uses code in dispatch.cpp to do all the real work. This means that it is practical to generate a separate copy in all compilation units as an inline function, looking like:

inline bool Overlap( Shape& a, Shape& b)
{
    static cmm_virtualfn&   virtualfn =
        cmm_get_virtualfn( "7Overlap2_1_5Shape1_5Shape");
    typedef bool(*cmm_fntype)( Shape&, Shape&);
    
    const void*           params[] = { &a, &b};
    const std::type_info* types[]  =
        { &typeid( a), &typeid( b)};
    
    cmm_fntype cmm_fn = reinterpret_cast< cmm_fntype>(
            cmm_lookup( virtualfn, params, types));
    
    return cmm_fn( cmm_0, cmm_1);
}

The cmm_lookup function uses types as an index into the internal std::map dispatch cache. If this fails, the actual parameters params are used in the slow lookup algorithm described earlier. It returns a generic function pointer, which has to be cast into the correct type using reinterpret_cast.

Raw pointer to implementation function

Cmm provides an extra dispatch function that doesn't actually call the implementation. Instead, it returns a pointer to the best implementation function. This enables client code that calls a multimethod on the same parameters many times in a loop, to cache the function pointer and so avoid any dispatch overhead.

This extra dispatch function has the same name as the virtual function, with a suffix _cmm_getimpl. Using the Overlap example, if you have one collection of shapes that you know are all squares, and you want to search for an overlap with a particular shape, you would usually do:

void Fn( std::vector< Square*> squares, Shape& s)
{
    std::vector< Square*>::iterator it;
    for( it=squares.begin(); it!=squares.end() ++ it)
    {
        if ( Overlap( **it, s)) {...}
    }
}

With the generated Overlap_cmm_get_impl function, you can avoid the multimethod dispatch overhead in each iteration:

void Fn( std::vector< Square*> squares, Shape& s)
{
    std::vector< Square*>::iterator it;
    if ( squares.empty()) return;
    
    bool (*fn)( Shape&, Shape&) =
        Overlap_cmm_get_impl( s, squares[0]);
    
    for( it=squares.begin(); it!=squares.end() ++ it)
    {
        if ( fn( **it, s)) {...}
    }
}

Constant-time dispatch speed

It's possible to get constant-time dispatch speed if all types are assigned a unique small integer, by looking in a multi-dimensional array using the small integers as indices. Cmm has a command-line switch that makes the generated code use this technique.

This scheme is clearly potentially wasteful of memory. If a programme has a thousand classes, then each array will have up to one thousand elements. But only the arrary rows that are actually needed will be used.

In pseudo code, the dispatch for a funtion with two virtual parameters looks like:

void foo( virtual Base& a, virtual Base& b
    int a_smallint = a.get_small_integer();
    int b_smallint = b.get_small_integer();
    fn = cache[a_smallint][b_smallint];
    return fn( a, b);

In this case, cache is essentially of type fn_ptr**.

Cmm's dispatch.cpp contains an implementation of this dispatch method that allocates the array lazily so that memory is only allocated for those rows that are actually used.

Getting a unique small integer for each dynamic type is slightly tricky. In an ideal world, the compiler and linker would conspire to make space available in the vtable, which would enable very fast lookup. Cmm can't do this though, so instead it inserts an inline virtual function body into all polymorphic classes, containing a static int to enable fast access to the unique integers:

class Base
{
    // Next function inserted by Cmm:
    virtual int cmm_get_small_integer() const
    {
        static int id=0;
        if ( !id) id =
            cmm_create_small_integer( typeid( *this));
        return id;
    }
};

The function cmm_get_small_integer() is in the Cmm library dispatch.cpp along with all of the other support function. It maintains an internal map of std::type_info's to ints so that it returns the same integer if called more than once for the same type. This is required to make things work when the C++ environment doesn't implement the static int id correctly; for example, under OpenBSD 3.2, each compilation unit that contains the inline function cmm_get_small_integer() will have its own copy.

The constant time dispatch system is not robust. It adds a virtual function to all polymorphic classes, but this only works if all code is passed through Cmm. Other code, such as code in libraries that are linked into the final executable, may break at runtime because of assuming a different vtable layout. To avoid breaking code in system libraries, Cmm doesn't insert the function into polymorphic classes defined in the std namespace, but of course this means that you cannot do constant-time multimethod dispatch on base classes that are defined in std, such as std::exception.