Document number: J16/06-0046 = WG21 N1976
Date: 2006-04-20
Author: Benjamin Kosnik <bkoz@redhat.com>
Evolution Working Group, Modules and Linkage

Dynamic Shared Objects: Survey and Issues



Introduction

Hosted implementations of C++ have long had the ability to collect object files from individual translation units into a single entity. Often, these collections of object files are called libraries, and simplify software creation and maintenance by clearly separating out dependencies, and providing interfaces between components.

For the purposes of this paper, there are two main types of libraries. The first, a static library, copies used elements from the library directly into the created executable. Thus, when an executable is created by statically linking against a library, the result is a system with two copies of the library code: one in the original library and one in the newly formed executable.

The second type, a dynamic library, avoids this duplication. When an executable is created by dynamically linking against a library, the implementation performs a magic step that allows the end result to reference the library directly, without a copy.

This paper will be concerned solely with the second type of library, and with dynamic shared objects in particular. The use of this kind of library in the C++ development community is widespread, and has been in existence for over ten years. Years of use have pointed out some of the pitfalls with parts of the C++ language and current implementations: it is the goal of this paper to provide a survey of current dynamic linking capabilities and techniques, and to explicitly quantify known issues.



Terminology

The C++ standard defines three distinct linkage types: internal, external, none. (See 3.5 - Program and linkage [basic.link])

Adopt the notion of load unit from Austern indicating a binding of individual translation units together to form a single group. (Via some undefined mechanism.)

Define symbol as a definition for a specific entity in a load unit.

Adopt visibility to determine if a symbol can be used outside of the load unit. In addition, adopt the following refinements:

An entity defined within a load unit with external visibility implies that other load units are able to use the defined symbol.

An entity defined within a load unit with internal visibility implies that other load units are not able to use the defined symbol.

Adopt the notion of load set from Austern indicating the closed set of all individual load units.
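
The difference between linkage and visibility can be sketched in a few lines. The following is a minimal illustration only, assuming a GCC-compatible compiler for the visibility attribute; the names counter_internal, counter_external, counter_hidden, and bump are hypothetical and are not used elsewhere in this paper.

```cpp
// Linkage is a language-level property of a name. Visibility, as used in
// this paper, is a property of a symbol in a load unit.

// Internal linkage: never a candidate for use by another load unit.
static int counter_internal = 0;

// External linkage and, on SVR4 by default, external visibility.
int counter_external = 0;

// External linkage, but internal visibility: other translation units in
// the same load unit can use it, other load units cannot.
#if defined(__GNUC__)
__attribute__((visibility("hidden")))
#endif
int counter_hidden = 0;

int bump()
{
  // No linkage: a local object.
  int step = 1;
  counter_internal += step;
  counter_external += step;
  counter_hidden += step;
  return counter_internal + counter_external + counter_hidden;
}
```

All three namespace-scope objects behave identically inside the load unit; they differ only in whether another translation unit, or another load unit, can name them.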



High Level Survey of Current Techniques

Windows

Dynamic linking meta-picture:

.exe -> .lib -> .dll

Where .dll is the shared object, with symbol definitions. The .lib is a stubs library, which contains a list of symbols to be resolved in the .dll at runtime. The .exe is the final executable, and its dependencies are resolved against the .lib at link time.

Default visibility is internal. Annotations required both for marking a symbol with external visibility and for importing an external symbol.

Visibility control techniques include:

One: decoration via __declspec(dllexport) and __declspec(dllimport) on either a class or a member function basis, but not both. Also allowed on template specializations.

Two: by a text file containing a list of symbols to export or by ordinal.

Dynamic loading via LoadLibrary, where you can set resolution to immediate or delayed.

Other notes: On class-scope visibility decorations, all member functions, static data members, virtual functions, typeinfo, etc. are visible along with the same for any base classes. Operator new inter-position is known not to work. Versioning capabilities for a .dll include major, minor, build date.

SVR4 (Sun/Linux)

Dynamic linking meta-picture:

.exe -> .so

Where .so is the shared object, with symbol definitions. The .exe is the final executable, and its dependencies are resolved against the .so at runtime.

Default visibility is external. Annotation or other method required for marking a symbol with internal visibility.

Visibility control techniques include:

One: decoration via __attribute__((visibility(option))) where option is one of: hidden, default, internal, protected.

Two: #pragma interface/#pragma implementation

Three: extern template, -fno-implicit-templates, and inlining declarations

Four: #pragma GCC visibility push(hidden) /#pragma GCC visibility pop in combination with -fvisibility,-fvisibility-inlines-hidden

Five: a text file containing a list of symbols to export which is used as an input file to the linker, with optional minor version refinement. Symbols with C++ linkage (ie mangled) can be exported in namespace globs in an un-mangled syntax.

Version control techniques include:

One: Versioning of libraries happens exclusively using the SONAME, a simple string that is part of the object file format (ELF). There can be exactly one library for a given SONAME.

Two: Versioning capabilities for a library include major, minor, minor with refinements, aliasing and renaming. These versioning details are explicitly specified in a text file that the linker uses during library creation.

Three: Explicit version number mangling on nested namespaces, and then injecting these versioned names into the enclosing namespace via the GNU compiler extensions described by namespace associations.

Dynamic loading via dlopen/dlmopen/dlsym/dlclose, where you can set resolution to immediate or delayed.

Other notes: Operator new inter-position can be made to work. The -fvisibility and #pragma visibility options are known to not work and/or have serious flaws.
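
The two default-visibility models surveyed above are often reconciled with a single decoration macro. The following is a minimal sketch of that idiom, not a recommendation from any vendor. FOO_API and BUILDING_FOO are hypothetical names chosen for illustration, and the declaration and definition are shown in one file only so that the sketch is self-contained.

```cpp
#include <cstring>

// BUILDING_FOO is a hypothetical macro. It would normally be defined on
// the compiler command line, and only when building the library itself.
#define BUILDING_FOO 1

#if defined(_WIN32)
#  if defined(BUILDING_FOO)
#    define FOO_API __declspec(dllexport)   // building the .dll
#  else
#    define FOO_API __declspec(dllimport)   // using the .dll
#  endif
#elif defined(__GNUC__)
#  define FOO_API __attribute__((visibility("default")))
#else
#  define FOO_API
#endif

// Declaration, as it would appear in foo.h.
FOO_API const char* get_city();

// Definition, as it would appear in foo.cc.
static const char* city = "mont tremblant";

const char* get_city()
{ return city; }

// A client-side check, in the style of one.cc.
bool city_is(const char* expected)
{ return std::strcmp(get_city(), expected) == 0; }
```

The same header can then serve both the library build and its clients on either platform, at the cost of one more macro that every build has to set correctly.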



Common functionality, by example

1: Sharing load unit by dynamic linking
2: Sharing load unit by dynamic loading
3: Load unit with restricted external visibility
4: Load unit with versioned external visibility

In the following examples, the following color key is used: annotations and sources in light blue are for SVR4 (Sun/Linux), sources and annotations in light red are for Windows, and platform-independent code is in gray. Where a file is listed twice, the SVR4 (Sun/Linux) version comes first and the Windows version second.



One: Sharing load unit by dynamic linking.

The fundamental example for dynamic linking. A load unit (libfoo) that contains a function (get_city) is then shared by an executable (one.exe) at runtime by compile-time linking.

foo.h

 
extern const char* get_city();
 
extern const char* __declspec(dllimport) get_city();

foo.cc

 
static const char* city = "mont tremblant";

const char* get_city()
{ return city; }
 
static const char* city = "mont tremblant";

const char* __declspec(dllexport) get_city()
{ return city; }

one.cc

 
#include <cstring>
#include "foo.h"

int check_city_one()
{
  return std::strcmp(get_city(), "chicago");
}

int main() 
{ return check_city_one(); }

On Linux, the situation outlined above is constructed as follows:

g++ -shared -fPIC -O2 -g foo.cc -o libfoo.so
g++ -g -O2 -L. one.cc -lfoo -o one.exe




Two: Sharing load unit by dynamic loading.

The fundamental example for dynamic loading. A load unit (libfoo) that contains a function (get_city) is then shared by an executable (two.exe) by loading at runtime.

foo.h

 
extern const char* get_city();
 
extern const char* __declspec(dllimport) get_city();

foo.cc

 
static const char* city = "mont tremblant";

const char* get_city()
{ return city; }
 
static const char* city = "mont tremblant";

const char* __declspec(dllexport) get_city()
{ return city; }

two.cc

 
#include <dlfcn.h>
#include <cstring>
#include <stdexcept>
#include <string>
#include "foo.h"

const char*
mangle(const char* /* unmangled */)
{
  // GNU: mangled name of get_city().
  const char* mangled = "_Z8get_cityv";
  return mangled;
}

void
dynamic_open(void*& h)
{
  // Clear any stale error state.
  dlerror();
  void* tmp = dlopen("./libfoo.so", RTLD_LAZY);
  if (!tmp) 
    {
      // dlerror() may return NULL, and constructing a std::string
      // from NULL is undefined, so test the pointer explicitly.
      const char* error = dlerror();
      throw std::runtime_error(error ? error : "dlopen failed");
    }
  h = tmp;
}

void
get_and_execute_dynamic_symbol(void*& h)
{
  dlerror();

  // Matches the declaration of get_city in foo.h.
  typedef const char* (*function_type) (void);
  function_type fn;
  fn = reinterpret_cast<function_type>(dlsym(h, mangle("get_city")));

  // A NULL return from dlerror() means the lookup succeeded.
  const char* error = dlerror();
  if (error)
    throw std::runtime_error(error);

  fn();
}

void
dynamic_close(void*& h)
{
  if (dlclose(h) != 0)
    {
      const char* error = dlerror();
      throw std::runtime_error(error ? error : "dlclose failed");
    }
}

int main() 
{ 
  void* h;
  dynamic_open(h);
  get_and_execute_dynamic_symbol(h);
  dynamic_close(h);
  return 0;
}
 
#include <windows.h>
#include <cstring>
#include <stdexcept>
#include "foo.h"

const char*
mangle(const char* /* unmangled */)
{
  // Microsoft: decorated name of const char* get_city() on 32-bit x86.
  // 64-bit targets encode the pointer as PEBD instead of PBD.
  const char* mangled = "?get_city@@YAPBDXZ";
  return mangled;
}

void
dynamic_open(HINSTANCE& h)
{
  HINSTANCE tmp;
  tmp = LoadLibraryA("./libfoo.dll");
  if (!tmp) 
    {
      throw std::runtime_error("LoadLibrary failed");
    }
  h = tmp;
}

void
get_and_execute_dynamic_symbol(HINSTANCE& h)
{
  // Matches the declaration of get_city in foo.h.
  typedef const char* (*function_type) (void);
  function_type fn;
  fn = reinterpret_cast<function_type>(GetProcAddress(h, mangle("get_city")));

  if (!fn) 
    {
      throw std::runtime_error("GetProcAddress failed");
    }

  (fn)();
}

void
dynamic_close(HINSTANCE& h)
{
  // FreeLibrary returns zero on failure.
  if (FreeLibrary(h) == 0)
    {
      throw std::runtime_error("FreeLibrary failed");
    }
}

int main() 
{ 
  HINSTANCE h;
  dynamic_open(h);
  get_and_execute_dynamic_symbol(h);
  dynamic_close(h);
  return 0;
}

On Linux, the situation outlined above is constructed as follows:

g++ -shared -fPIC -O2 -g foo.cc -o libfoo.so
g++ -g -O2 two.cc -ldl -o two.exe
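
The hard-coded string returned by mangle() in two.cc can be checked against the declaration it is meant to name. The following is a small sketch, assuming a GCC or Clang runtime that provides abi::__cxa_demangle in <cxxabi.h>; the helper name demangle is hypothetical.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC and Clang runtimes)
#include <cstdlib>
#include <string>

// Returns the demangled form of a mangled symbol name, or an empty
// string if the name cannot be demangled.
std::string demangle(const char* mangled)
{
  int status = 0;
  char* text = abi::__cxa_demangle(mangled, 0, 0, &status);
  if (status != 0 || !text)
    return std::string();
  std::string result(text);
  std::free(text);   // The buffer is allocated with malloc.
  return result;
}
```

Under these assumptions, demangle("_Z8get_cityv") yields "get_city()". The reverse direction, producing a mangled name from a declaration, has no comparable library call, which is why the examples in this paper spell the mangled names out by hand.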




Three: Load unit with restricted external visibility.

There are at least three common techniques available for specifying visibility of individual entities in a load unit. Three will be detailed here: the use of compiler-specific pragmas, the use of vendor-specific decorations on types and declarations, and finally the use of vendor-specific link maps.

3a: Pragmas.

container.cc

 
#pragma GCC visibility push(hidden)
//#pragma GCC visibility push(default)

typedef int value_type;

class foo
{
  value_type v;

 public:
  foo();

  virtual ~foo() { }

  value_type&
  get_vector();
};

foo::foo()
{
  value_type apple;
  v = 1;
}

value_type&
foo::get_vector()
{ return v; }

void
swap_foo()
{
  value_type empty;
  foo f;
}

#pragma GCC visibility pop

On Linux, the situation outlined above is constructed as follows:

g++ -g -c container.cc

Regardless of any other compiler flags (ie -fvisibility=hidden) the defined symbols will have the visibility as noted in the pragma.



3b: Decoration.

container.cc

 
#ifdef VIS_EXTERNAL
#define VIS __attribute__ ((visibility("default")))
#else
#define VIS __attribute__ ((visibility("hidden")))
#endif

typedef int value_type;

class VIS foo
{
  value_type v;

 public:
  foo();

  virtual ~foo() { }

  value_type&
  get_vector();
};

foo::foo()
{
  value_type apple;
  v = 1;
}

value_type&
foo::get_vector()
{ return v; }


void VIS
swap_foo()
{
  value_type empty;
  foo f;
}

On Linux, the situation outlined above is constructed as follows:

g++ -g -c container.cc

Regardless of any other compiler flags (ie -fvisibility=hidden) the defined symbols will have the visibility as noted in the decoration.



3c: Lists.

container.cc

 
typedef int value_type;

class foo
{
  value_type v;

 public:
  foo();

  virtual ~foo() { }

  value_type&
  get_vector();
};

foo::foo()
{
  value_type apple;
  v = 1;
}

value_type&
foo::get_vector()
{ return v; }


void
swap_foo()
{
  value_type empty;
  foo f;
}

container.ver

 
{
  global:
        _Z8swap_foov;
  local: *;
};

On Linux, the situation outlined above is constructed as follows:

g++ -g -Wl,--version-script=container.ver -shared container.cc -o container.so

Other compiler flags (ie -fvisibility=hidden) will prevail over the visibility as noted in the link map.



Four: Load unit with versioned external visibility.

On occasion, it is possible to safely extend class declarations and other entities over time. In order to have this work, there must be a way to attach a version to symbols in a load unit with external visibility.

For instance, this class:

container.h

 
struct foo
{
  foo();

  int
  get_value();

  private:
  int v;
};

container.cc

 
#include "container.h"

foo::foo()
{ v = 1; }

int
foo::get_value()
{ return v; }

container.ver

 
VERSION_1.0
{
  global:
        _ZN3foo9get_valueEv;
  local: *;
};

On Linux, the situation outlined above is constructed as follows:

g++ -g -Wl,--version-script=container.ver -shared container.cc -o container.so

At this point, the only externally-visible symbol is foo::get_value(), and this symbol is versioned with the tag VERSION_1.0.

After a period of use, a new feature is added and as part of this, a new member function is added to struct foo, say foo::get_second_value(), and this symbol is versioned with the tag VERSION_1.1. A new version of container.so is generated, with both the member functions defined with the corresponding version tag. This allows newer code to use the new member function, but allows a graceful system response if this newer code is run in an environment without the newer container.so.

container.ver

 
VERSION_1.0
{
  global:
        _ZN3foo9get_valueEv;
  local: *;
};

VERSION_1.1
{
  global:
        _ZN3foo16get_second_valueEv;
  local: *;
} VERSION_1.0;

This is constructed in the same manner as the first container.so file.
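
For concreteness, the second release of the sources might look as follows. This is a sketch only: this paper does not say what foo::get_second_value() computes, so its body here is a hypothetical placeholder. The sketch assumes that the extension adds a non-virtual member function and no data members, so that the layout of foo is unchanged.

```cpp
// container.h, second release (sketch)
struct foo
{
  foo();

  int
  get_value();          // Present since VERSION_1.0.

  int
  get_second_value();   // New in VERSION_1.1.

  private:
  int v;                // Unchanged, so the object layout is the same.
};

// container.cc, second release (sketch)
foo::foo()
{ v = 1; }

int
foo::get_value()
{ return v; }

// Hypothetical body, for illustration only.
int
foo::get_second_value()
{ return v + 1; }
```

Code compiled against the first release keeps working with the new container.so, because nothing it depends on has changed.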



Known Problems, by example

1: Overriding global operator new
2: Order of initialization
3: Exceptions across load units
4: Vague linkage and duplicate symbol resolution



Problem One: Overriding global operator new.

The standard-defined behavior for the global scope operator new signatures allows users to provide custom definitions and override the default. What happens if there are two user-defined operator new definitions: which one is picked? This has been referred to as the operator new "inter-position" issue, but could be generalized to load-unit allocation/deallocation problems. It is important that allocation and deallocation mechanisms match across load unit boundaries.

There are other issues with memory management and multiple load units. A big question is how to keep the allocator equality requirements (ie, an instance of an allocator is equal to another allocator iff one can free the other's allocation.)

foo.cc

 
#include <cstdio>
#include <new>
#include <string>
#include <tr1/array>

std::string*
get_string()
{ return new std::string("olive street beach"); }

void
dispose_string(std::string* s)
{ delete s; }

// Fixed external storage.
typedef std::tr1::array<char, 256> array_type;
static array_type _M_array;

void* operator new(std::size_t __n) throw (std::bad_alloc)
{
  puts("operator new");
  static std::size_t __array_used;
  if (__array_used + __n > _M_array.size())
    throw std::bad_alloc();
  void* __ret = _M_array.begin() + __array_used;
  __array_used += __n;
  return __ret;
}

void operator delete(void*) throw() 
{
  // Does nothing.
  puts("operator delete");
}

foo.ver

 
VERSION_1.0
{
  global:
	_Z10get_stringv;
	_Z14dispose_stringPSs;
  local: *;
};

test.cc

 
#include <cstdio>
#include <exception>
#include <new>
#include <vector>
#include <string>

extern std::string* get_string();
extern void dispose_string(std::string*);

int main()
{
  typedef std::vector<int> vector_type;
  vector_type* v;
  try
    {
      v = new vector_type(100);
    }
  catch (const std::exception& e)
    {
      puts(e.what());
      throw;
    }
  catch (...)
    { throw; }
  delete v;

  std::string* s = get_string();
  dispose_string(s);

  return 0;
}

Construct the example as follows:

g++ -shared -fPIC -O0 -g foo.cc -o libfoo.so
g++ -g -O0 -L. test.cc -lfoo -o problem1.exe

And then run the resulting executable:

%./problem1.exe
operator new
operator new
operator delete
operator delete
operator new
operator new
operator delete
operator delete

As suspected, the operator new definitions in libfoo are used in problem1.exe as well, overriding the default definitions in the standard library. Although a trivial example, the problem is real: any load unit that redefines operator new could, when added to a new load set, change underlying allocations.

One way around this is to limit the visibility of the operator new and operator delete definitions to within the libfoo.so load unit.

g++ -shared -fPIC -O0 -g foo.cc -Wl,--version-script=foo.ver -o libfoo.so
g++ -g -O0 -L. test.cc -lfoo -o problem1.exe

And then run the resulting executable:

%./problem1.exe
operator new
operator delete

By limiting the visibility, the operator new and delete definitions can be bound to a specific load unit.
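
On the client side, the matching of allocation and deallocation across the load unit boundary can be made harder to get wrong by tying the release function to the pointer. The following is a sketch of one way to do this, assuming a compiler with std::unique_ptr (C++11); string_handle and get_string_handle are hypothetical names, and get_string and dispose_string are repeated from foo.cc so that the sketch is self-contained.

```cpp
#include <memory>
#include <string>

// As in foo.cc: allocation and deallocation both live in libfoo.
std::string*
get_string()
{ return new std::string("olive street beach"); }

void
dispose_string(std::string* s)
{ delete s; }

// Hypothetical helper: an owning pointer that remembers which function
// must release the string.
typedef std::unique_ptr<std::string, void (*)(std::string*)> string_handle;

string_handle
get_string_handle()
{ return string_handle(get_string(), &dispose_string); }
```

A client that holds a string_handle never calls delete itself, so the string is always released by the load unit that allocated it.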



Problem Two: Order of initialization.

How are global objects supposed to be initialized and finalized in scenarios with multiple load units, including load units that can be opened and closed at will? In addition, using static local objects may run into issues with initialization.

foo.cc

 
#include <cstdio>

struct A
{
   A()
   { puts("A ctor"); }

   ~A()
   { puts("A dtor"); }
};

void f()
{ static A obj; } 

test.cc

 
#include <cstdio>

extern void f();

struct B
{
   B()
   { puts("B ctor"); }

   ~B()
   { puts("B dtor"); }
};

static B foo;

int main()
{
   f();
   return 0;
}

On Linux, the situation outlined above is constructed as follows:

g++ -shared -fPIC -O2 -g foo.cc -o libfoo.so
g++ -g -O2 -L. test.cc -lfoo -o problem2.exe

And then run the resulting executable:

%./problem2.exe
B ctor
A ctor
A dtor
B dtor

This ordering is correct. However, on Windows:

B ctor
A ctor
B dtor
A dtor

... which demonstrates the issue of ordering objects across different load units.
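
For comparison, the ordering within a single load unit can be observed directly. The following sketch places both objects in one translation unit and records the order of construction; the events() log is a hypothetical helper added for illustration.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Construct-on-first-use log, so it exists before any object records to it.
std::vector<std::string>& events()
{
  static std::vector<std::string> log;
  return log;
}

struct A
{
  A()  { events().push_back("A ctor"); }
  ~A() { std::puts("A dtor"); }
};

struct B
{
  B()  { events().push_back("B ctor"); }
  ~B() { std::puts("B dtor"); }
};

// Namespace-scope object: in practice, initialized before main is entered.
static B b_object;

// Function-local static: initialized the first time f is called.
void f()
{ static A obj; }
```

After f() has been called, the log holds "B ctor" followed by "A ctor", and at program exit the destructors run in the reverse order of construction. It is this reverse-order guarantee that is lost when the two objects live in different load units.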



Problem Three: Exceptions across load units.

Compiler-generated information for virtual functions and typeinfo has vague linkage that is difficult for the programmer to control given the language facilities available in standard C++. Because of this, default visibility and the order of binding of symbols all impact the ability to throw and catch exceptions across load units.

There is an underspecification of this necessary compiler-generated magic with respect to multiple load units. Symptoms of this include the use of typeid (typeinfo) and throwing exceptions across load units, inlining template member functions defined in multiple load units (each with unique addresses), and the use of inheritance in multiple load units (do base class definitions and implicitly-generated data have to be visible across load units?)

error_handling.h

 
#include <stdexcept>

struct insert_error : public std::runtime_error
{
  insert_error(const std::string&);
};

void check_insert();

error_handling.cc

 
#include "error_handling.h"

insert_error::insert_error(const std::string& s) : std::runtime_error(s) { }

void check_insert()
{
  // Do something, assume it's wrong.
  throw insert_error("check_insert: something happened");
}

error_handling.ver

 
VERSION_1.0
{
  global:
        _ZN12insert_errorC*;
#        _ZTS12insert_error;
        _Z12check_insertv;
  local: *;
};

test.cc

 
#include "error_handling.h"
#include <iostream>
#include <typeinfo>

int main()
{
  try
    {
      check_insert();
    }
  catch (const insert_error& e)
    {
      // 1: Expect catch here.
    }
  catch (const std::exception& e)
    {
      // 2: Visibility issues may lead to catch here.
      std::cout << "caught object of type: " << typeid(e).name() << std::endl;
    }
  catch (...)
    {
      // 3: Catch all.
      throw;
    }

  return 0;
}

Construct the example as follows:

g++ -shared -fPIC -O2 -g error_handling.cc -Wl,--version-script=error_handling.ver -o libfoo.so
g++ -g -O2 -L. test.cc -lfoo -o problem3.exe

And then run the resulting executable:

%./problem3.exe
caught object of type: 12insert_error

Without typeinfo name sharing between libfoo and problem3.exe, the execution path is non-intuitive and in error. The typeinfo information is generated with vague linkage in both libfoo and problem3.exe. By allowing the export of _ZTS12insert_error from error_handling.ver, both will end up using the vague typeinfo name stored in the executable, and exception handling will work as expected.
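
The decoration technique from example 3b offers another way to express the same intent in the source rather than in the link map. The following is a sketch, assuming a GCC-compatible compiler and a library built without a link map that makes the remaining symbols local. EH_API is a hypothetical macro name, and the header and source are shown together so that the sketch is self-contained.

```cpp
#include <stdexcept>
#include <string>

// EH_API is a hypothetical macro name used for illustration.
#if defined(__GNUC__)
#define EH_API __attribute__ ((visibility("default")))
#else
#define EH_API
#endif

// Decorating the class requests external visibility for its
// compiler-generated data (typeinfo, typeinfo name, vtable) along
// with its member functions.
struct EH_API insert_error : public std::runtime_error
{
  insert_error(const std::string& s) : std::runtime_error(s) { }
};

EH_API void check_insert();

void check_insert()
{
  // Do something, assume it's wrong.
  throw insert_error("check_insert: something happened");
}
```

With the class decorated this way, the intent is that a catch clause for insert_error in another load unit matches the thrown object, as in the first handler of test.cc.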



Problem Four: Vague linkage and duplicate symbol resolution.

Template classes with member functions can have instantiations in multiple files. To prevent duplicate symbols, many compilers have implemented vague linkage semantics that coalesce multiple, equivalent symbol names across translation units into one definition. Picking one version across multiple dynamic load units is tricky. Depending on the order in which different load units are initialized, the definition that is picked for all the other load units in a given load set may change, and may end up being different than expected or planned when the individual load units were constructed. Symptomatic of this problem are multi-ABI binaries, where different compilers define the same symbol in a given load set. Also implicated are template designs that depend on macro defines to change behavior.

The end result is similar, in both cases: load units that have porous boundaries, that end up using symbols defined elsewhere in the load set to resolve symbols within the original load unit.

container.h

 
template<typename T>
  class container
  {
    T data;

    void
    do_private();
    
  public:

    container(T value = T()) : data(value) { }

    void 
    do_public() 
    { return do_private(); }

    T
    get_data() { return data; }
  };


template<typename T>
void container<T>::do_private()
{
#ifdef OLD_VERSION
  // Clear.
  data = T();
#else
  // Multiply.
  data *= 2;
#endif
}

typedef container<int> container_type;

foo.cc

 
#include "container.h"
#include <iostream>

void foo()
{
  container_type obj(4);
  obj.do_public();
  std::cout << obj.get_data() << std::endl;
}

test.cc

 
#include "container.h"

extern void foo();

int main()
{
  container_type obj(2);
  
  // Use do_public, get weak definition.
  obj.do_public();

  // Call external function on the rest.
  foo();

  return 0;
}

Construct the example as follows:

g++ -g -O2 -fPIC -shared foo.cc -o libfoo.so
g++ -DOLD_VERSION -g -O2 -L. test.cc -lfoo -o problem4.exe

And then run the resulting executable:

%./problem4.exe 
0

Both libfoo.so and problem4.exe have vague linkage for container<int>::do_private(). (Other options include no definition, leading to undefined symbols, or both having definitions, leading to duplicate symbol errors.) As a result, it is system-defined and order dependent which definition will be picked for both uses. On Linux, when problem4.exe is loaded, its symbol is used for all other uses, including the one in libfoo.so. This is probably not what was intended by the author of libfoo.so. In addition, the behavior of libfoo.so now depends on optimization options: without optimization, the un-inlined function will pick up the symbol from the executable, and with optimization that results in inlining (try -O3) the behavior of libfoo will change.
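
One way to keep the two configurations from colliding is to encode the configuration in the symbol name, in the spirit of the versioned nested namespaces listed in the SVR4 survey. The following is a sketch of container.h reworked that way, assuming a compiler with C++11 inline namespaces; the namespace names container_old and container_new are hypothetical.

```cpp
// Sketch of container.h with the configuration encoded in the name.
#ifdef OLD_VERSION
inline namespace container_old {
#else
inline namespace container_new {
#endif

template<typename T>
  class container
  {
    T data;

    void
    do_private();

  public:

    container(T value = T()) : data(value) { }

    void
    do_public()
    { do_private(); }

    T
    get_data() { return data; }
  };

template<typename T>
void container<T>::do_private()
{
#ifdef OLD_VERSION
  // Clear.
  data = T();
#else
  // Multiply.
  data *= 2;
#endif
}

}  // inline namespace

typedef container<int> container_type;
```

Client code still writes container<int>, but the two variants of do_private() now have different mangled names, so the definition in the executable no longer stands in for the one in libfoo.so.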



Impact on Existing Standard by Chapter

  • 01. Runtime/execution model needs to be modified to address load unit.

  • 02. Are new keywords needed to express the idea of controlling visibility of specific entities in a given load unit? E.g., visible or invisible, public or private, hidden or "exported"? Multiple layers of visibility (i.e. versioning)?

  • 03. ODR, scope, object lifetime, storage duration, startup/termination (in what order?), linkage.

  • 05. Operators new/delete, typeid, dynamic_cast

  • 07. Static, extern vs. load units (linkage). Namespace-scope visibility?

  • 09. Class linkage changes. What if not all members of a class are visible. Differing visibility between nested and enclosing classes, or local and enclosing classes. Will nested classes and nested namespaces have the same semantics?

  • 10. Issues with vtable, typeinfo visibility across multiple load units. Do base class vtables have to be visible in order to use a derived class in a different load unit?

  • 12. Tons of stuff, including constructors, destructors, operator new and delete.

  • 15. Detail throw/catch exceptions across multiple load units.

  • 18. Most everything.

  • 19-27. How does this impact C++ standard library?



Solution Space

Going from least to most ambitious.

  • Nothing, go with the status quo.

  • Attempt a TR, or "best practices" document with suggestions for clients and vendors.

  • Come up with standard terminology, select a common subset of what's possible and figure out syntax for expressing it portably. Suspect that many C++ features will fall by the wayside.

  • Come up with standard terminology, and figure out syntax for expressing it. Includes specification for C++-specific requirements like throwing exceptions, templates, template specializations, and vague linkage.

  • Come up with new C++ constructs, for example modules, annotation rules for namespaces, etc.



Acknowledgements

Contributors to the Modules and Linkage discussions at Mont Tremblant (2005) and Berlin (2006), not limited to: Daveed Vandevoorde, Mat Marcus, Judy Ward, PremAnand Rao, Doug Harrison, Bronek Kozicki, Eugene Gershnik, and Thomas Witt.



References

Matthew Austern. Toward standardization of dynamic libraries. Technical Report N1400=02-0058, Sep 25, 2002.

Pete Becker. Draft Proposal for Dynamic Libraries in C++. Technical Report N1428=03-0010, March 3, 2003.

Daveed Vandevoorde. Modules in C++ (Revision 2). Technical Report N1778=05-0038, January 2005.

John R. Levine. Linkers and Loaders. Morgan Kaufmann, January 15, 2000.

Ulrich Drepper. How To Write Shared Libraries. January, 2005.

Ulrich Drepper provided feedback and corrections on previous drafts of this document.

Doug Harrison. DllHelper_0.90.

Jonathan H. Lundquist. 270. Order of initialization of static data members of class templates. CWG Defects.

Mark Mitchell. 362. Order of initialization in instantiation units. CWG Defects.

Sun Microsystems. Linker and Library Guide. 2002.

Microsoft. DLLs.

Microsoft. Walkthrough: Creating and Using a Dynamic Link Library.

Apple. Overview of the C++ Runtime Environment.

ACE shared object wrappers, ie. ACE_Shared_Object.

Mat Marcus. Typeinfo comparison code easily breaks shared libs. http://gcc.gnu.org/PR23628

Ralf W. Grosse-Kunstleve, David Abrahams, Jason Merrill. Minimal GCC/Linux shared lib + EH bug example. http://mail.python.org/pipermail/python-dev/2002-May/023988.html
http://gcc.gnu.org/ml/gcc/2002-05/msg00882.html
http://sources.redhat.com/ml/libc-alpha/2002-05/msg00222.html

Doug Harrison. Order in court! http://groups.google.com/group/microsoft.public.vc.mfc/msg/438bdfa1dc683ce1?hl=en&

Doug Harrison. com_ptr_t as static object in a DLL. http://groups.google.com/group/microsoft.public.vc.mfc/msg/21cfdeb16358e755?hl=en&

Static variable in template function across compilation units. http://groups.google.com/group/comp.lang.c++.moderated/msg/50091d17e98a36a0?hl=en