C++ – GNU LD symbolic versioning and C++ binary backward compatibility

I’m reading about how to version symbols in an ELF shared library using GNU ld version scripts, and I know that different versions of the same symbol can be exported using a directive like this:

__asm__(".symver original_foo,foo@VERS_1.1");

This is useful if the semantics of a function change but the library should still export the old version, so that older applications built against the library keep working with the new release of the library.

But for a C++ library, the symbol "vtable for MyClass" is exported. If I later change the class by adding more virtual functions, how do I export the original class with its original vtable symbol alongside the new version of the vtable?

EDIT: I wrote a test case, and it seems to work when I rename all symbols of one class to the corresponding symbols of another class. It behaves the way I would like, but is it guaranteed to work, or am I just lucky? The code is as follows:

EDIT2: I changed the class names to be (hopefully) less confusing and split the definitions into two files.

EDIT3: It seems to work for clang++ as well. I will clarify the overall question I would like to ask:

Does this technique ensure binary backward compatibility of classes in C++ shared libraries on Linux, regardless of differences in virtual functions? If not, why not? (A counterexample would be nice).

libtest.h:

struct Test {
    virtual void f1();
    virtual void doNewThing();
    virtual void f2();
    virtual void doThing();
    virtual void f3();
    virtual ~Test();
};

libtest_old.h:

// This header would have been libtest.h when test0 was theoretically developed.

struct Test {
    virtual void f3();
    virtual void f1();
    virtual void doThing();
    virtual void f2();
    virtual ~Test();
};

libtest.cpp:

#include "libtest.h"
#include <cstdio>

struct OldTest {
    virtual void f3();
    virtual void f1();
    virtual void doThing();
    virtual void f2();
    virtual ~OldTest();
};

__asm__(".symver _ZN7OldTestD1Ev,_ZN4TestD1Ev@LIB_0");
__asm__(".symver _ZN7OldTestD0Ev,_ZN4TestD0Ev@LIB_0");
__asm__(".symver _ZN7OldTest7doThingEv,_ZN4Test7doThingEv@LIB_0");
__asm__(".symver _ZN7OldTestD2Ev,_ZN4TestD2Ev@LIB_0");
__asm__(".symver _ZTI7OldTest,_ZTI4Test@LIB_0");
__asm__(".symver _ZTV7OldTest,_ZTV4Test@LIB_0");
__asm__(".symver _ZN7OldTest2f1Ev,_ZN4Test2f1Ev@LIB_0");
__asm__(".symver _ZN7OldTest2f2Ev,_ZN4Test2f2Ev@LIB_0");
__asm__(".symver _ZN7OldTest2f3Ev,_ZN4Test2f3Ev@LIB_0");

void OldTest::doThing(){
    puts("OldTest doThing");
}
void OldTest::f1(){
    puts("OldTest f1");
}
void OldTest::f2(){
    puts("OldTest f2");
}
void OldTest::f3(){
    puts("OldTest f3");
}
OldTest::~OldTest(){

}

void Test::doThing(){
    puts("New Test doThing from Lib1");
}
void Test::f1(){
    puts("New f1");
}
void Test::f2(){
    puts("New f2");
}
void Test::f3(){
    puts("New f3");
}
void Test::doNewThing(){
    puts("Test doNewThing, this wasn't in LIB0!");
}
Test::~Test(){

}

libtest.map:

LIB_0 {
global:
    extern "C++" {
        Test::doThing*;
        Test::f*;
        Test::Test*;
        Test::?Test*;
        typeinfo?for?Test*;
        vtable?for?Test*
    };
local:
    extern "C++" {
        *OldTest*;
        OldTest::*;
    };
};

LIB_1 {
global:
    extern "C++" {
        Test::doThing*;
        Test::doNewThing*;
        Test::f*;
        Test::Test*;
        Test::?Test*;
        typeinfo?for?Test*;
        vtable?for?Test*
    };
} LIB_0;

Makefile:

all: libtest.so.0 test0 test1

libtest.so.0: libtest.cpp libtest.h libtest.map
    g++ -fPIC -Wl,-s -Wl,--version-script=libtest.map libtest.cpp -shared -Wl,-soname,libtest.so.0 -o libtest.so.0

test0: test0.cpp libtest.so.0
    g++ test0.cpp -o test0 ./libtest.so.0

test1: test1.cpp libtest.so.0
    g++ test1.cpp -o test1 ./libtest.so.0

test0.cpp:

#include "libtest_old.h"
#include <cstdio>

// In a real-world scenario, these symvers would not be present and this file
// would include libtest.h, which would be what libtest_old.h is now.

__asm__(".symver _ZN4TestD1Ev,_ZN4TestD1Ev@LIB_0");
__asm__(".symver _ZN4TestD0Ev,_ZN4TestD0Ev@LIB_0");
__asm__(".symver _ZN4Test7doThingEv,_ZN4Test7doThingEv@LIB_0");
__asm__(".symver _ZN4Test2f1Ev,_ZN4Test2f1Ev@LIB_0");
__asm__(".symver _ZN4Test2f2Ev,_ZN4Test2f2Ev@LIB_0");
__asm__(".symver _ZN4Test2f3Ev,_ZN4Test2f3Ev@LIB_0");
__asm__(".symver _ZN4TestD2Ev,_ZN4TestD2Ev@LIB_0");
__asm__(".symver _ZTI4Test,_ZTI4Test@LIB_0");
__asm__(".symver _ZTV4Test,_ZTV4Test@LIB_0");

struct MyClass : public Test {
    virtual void test(){
        puts("Old Test func");
    }
    virtual void doThing(){
        Test::doThing();
        puts("Override of Old Test::doThing");
    }
};

int main(void){
    MyClass* mc = new MyClass();

    mc->f1();
    mc->f2();
    mc->f3();
    mc->doThing();
    mc->test();

    delete mc;

    return 0;
}

test1.cpp:

#include "libtest.h"
#include <cstdio>

struct MyClass : public Test {
    virtual void doThing(){
        Test::doThing();
        puts("Override of New Test::doThing");
    }
    virtual void test(){
        puts("New Test func");
    }
};

int main(void){
    MyClass* mc = new MyClass();

    mc->f1();
    mc->f2();
    mc->f3();
    mc->doThing();
    mc->doNewThing();
    mc->test();

    delete mc;

    return 0;
}

Solution

The name and version of the vtable symbol are largely irrelevant to both the API and the ABI. What matters is which vtable index has which semantics.
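To make the point about slot indices concrete, here is a rough simulation. It does not touch a real compiler-generated vtable; it models one as a plain array of function pointers, using the slot order of libtest_old.h and libtest.h from the question (destructor slots left out). The names Slot, oldLayout, newLayout and oldClientCallsDoThing are made up for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hand-rolled stand-in for a vtable: a table of function pointers
// indexed by slot number. Purely illustrative, not what the ABI emits.
using Slot = std::string (*)();

std::string f1()         { return "f1"; }
std::string f2()         { return "f2"; }
std::string f3()         { return "f3"; }
std::string doThing()    { return "doThing"; }
std::string doNewThing() { return "doNewThing"; }

// Declaration order from libtest_old.h (destructor slots omitted).
const std::array<Slot, 4> oldLayout = { f3, f1, doThing, f2 };

// Declaration order from the new libtest.h (destructor slots omitted).
const std::array<Slot, 5> newLayout = { f1, doNewThing, f2, doThing, f3 };

// An old client was compiled knowing only "doThing is slot 2".
constexpr std::size_t kOldDoThingSlot = 2;

std::string oldClientCallsDoThing(const Slot* table) {
    return table[kOldDoThingSlot]();
}

void demonstrateSlotMismatch() {
    // Against the layout it was built for, the old client is fine.
    assert(oldClientCallsDoThing(oldLayout.data()) == "doThing");
    // Handed a table in the new order, the same call lands on f2.
    assert(oldClientCallsDoThing(newLayout.data()) == "f2");
}
```

In this model the symbol name never enters the picture: the old caller only ever uses the index it was compiled with, so whichever table it is handed decides what actually runs.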

You can achieve backward compatibility by using some lightweight runtime mechanisms to retrieve a specific version of a particular interface. Let’s say you have:

class MyThing: public VersionedInterface {...};  // V1
class MyThingV1: public MyThing {...};
class MyThingV2: public MyThingV1 {...};

You might have some function that creates MyThings:

VersionedInterface *createMyThing();

You then ask this VersionedInterface for the version of the interface you want (the one your code understands):

// Old code will ask for MyThing:
VersionedInterface *vi = createMyThing();
MyThing *myThing = static_cast<MyThing*>(vi->getInterface("MyThing"));

// New code may ask for MyThingV2:
VersionedInterface *vi = createMyThing();
MyThingV2 *myThing = static_cast<MyThingV2*>(vi->getInterface("MyThingV2"));
// New code may or may not get the newer interface:
if (!myThing)
{
    // We did not get the interface version we wanted.
    // We can either consciously fall back to an older version or simply fail.
    ...
}

The class VersionedInterface only provides the getInterface() function:

class VersionedInterface
{
public:
    virtual ~VersionedInterface() {}
    virtual VersionedInterface *getInterface(const char *interfaceName) = 0;    
};

The advantage of this approach is that it allows arbitrary changes (reordering functions, inserting and deleting functions, changing function prototypes) to the vtable in a clean and portable way.
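Putting the pieces above together, a minimal end-to-end sketch might look like the following. Everything beyond VersionedInterface, MyThing, MyThingV1, MyThingV2, createMyThing and getInterface is an assumption made for illustration: the member functions, the LibraryThing implementation class, the useBestAvailable helper, and the use of std::unique_ptr instead of the raw pointer shown above. The library here deliberately implements only up to MyThingV1, so a request for MyThingV2 fails and the client falls back.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <string>

class VersionedInterface {
public:
    virtual ~VersionedInterface() {}
    virtual VersionedInterface *getInterface(const char *interfaceName) = 0;
};

// Published interfaces. Once shipped, a class is never edited;
// additions go into a new derived interface. Methods are hypothetical.
class MyThing : public VersionedInterface {
public:
    virtual std::string doThing() = 0;
};
class MyThingV1 : public MyThing {
public:
    virtual std::string doNewThing() = 0;
};
class MyThingV2 : public MyThingV1 {
public:
    virtual std::string doNewerThing() = 0;
};

// Hypothetical library-side implementation that stops at V1.
class LibraryThing : public MyThingV1 {
public:
    VersionedInterface *getInterface(const char *interfaceName) override {
        if (std::strcmp(interfaceName, "MyThing") == 0 ||
            std::strcmp(interfaceName, "MyThingV1") == 0) {
            return this;
        }
        return nullptr;  // unknown or newer interface
    }
    std::string doThing() override { return "doThing"; }
    std::string doNewThing() override { return "doNewThing"; }
};

std::unique_ptr<VersionedInterface> createMyThing() {
    return std::unique_ptr<VersionedInterface>(new LibraryThing());
}

// Client code: prefer V2, otherwise fall back to the original interface.
std::string useBestAvailable(VersionedInterface &vi) {
    if (auto *v2 = static_cast<MyThingV2 *>(vi.getInterface("MyThingV2"))) {
        return v2->doNewerThing();
    }
    auto *base = static_cast<MyThing *>(vi.getInterface("MyThing"));
    return base ? base->doThing() : "unsupported";
}

void demoInterfaceLookup() {
    auto thing = createMyThing();
    assert(thing->getInterface("MyThingV2") == nullptr);
    assert(thing->getInterface("MyThingV1") != nullptr);
    assert(useBestAvailable(*thing) == "doThing");
}
```

The static_cast is only safe because the library promises that a non-null result for a given name really is an object implementing that interface; the string acts as the contract between the two sides.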

You can extend the getInterface() function to accept a numeric version, and in fact you can use it to retrieve other interfaces of an object.
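One possible shape for the numeric-version variant is sketched below. The two-argument getInterface signature, the version numbering (0 for MyThing, 1 for MyThingV1), and the names ThingImpl, kHighestVersion and highestAvailable are all assumptions for this example, not something the approach prescribes. Keeping the name parameter leaves room for asking the same object for unrelated interfaces.

```cpp
#include <cassert>
#include <cstring>
#include <string>

class VersionedInterface {
public:
    virtual ~VersionedInterface() {}
    // Assumed variant: interface name plus the numeric version requested.
    virtual VersionedInterface *getInterface(const char *name, int version) = 0;
};

class MyThing : public VersionedInterface {    // treated as version 0
public:
    virtual std::string doThing() = 0;
};
class MyThingV1 : public MyThing {             // treated as version 1
public:
    virtual std::string doNewThing() = 0;
};

// Highest version this hypothetical implementation supports.
constexpr int kHighestVersion = 1;

class ThingImpl : public MyThingV1 {
public:
    VersionedInterface *getInterface(const char *name, int version) override {
        const bool knownName = std::strcmp(name, "MyThing") == 0;
        const bool supported = version >= 0 && version <= kHighestVersion;
        if (knownName && supported) {
            return this;  // each version extends the previous one
        }
        return nullptr;
    }
    std::string doThing() override { return "doThing"; }
    std::string doNewThing() override { return "doNewThing"; }
};

// Client helper: walk down from the preferred version until one is offered.
// Returns -1 if the object does not provide the interface at all.
int highestAvailable(VersionedInterface &vi, int preferred) {
    for (int v = preferred; v >= 0; --v) {
        if (vi.getInterface("MyThing", v) != nullptr) {
            return v;
        }
    }
    return -1;
}

void demoNumericVersions() {
    ThingImpl impl;
    assert(highestAvailable(impl, 2) == 1);  // asked for 2, got 1
    assert(highestAvailable(impl, 0) == 0);
    assert(impl.getInterface("Other", 0) == nullptr);
}
```

Returning this for every supported version works here only because each version derives from the previous one; if versions were unrelated classes, each would need its own object or base subobject.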

You can later add interfaces to the object without breaking existing binary code. This is the main advantage. The price is the boilerplate needed to obtain an interface, and maintaining multiple versions of the same interface has its own costs. Consider carefully whether the effort is worth it.
