Opentelemetry API singletons (TraceProvider etc.) do not play well in shared library uses that hide symbol visibility #1520

Closed
astitcher opened this issue Jul 26, 2022 · 27 comments · Fixed by #1604
Labels
bug Something isn't working

Comments

@astitcher
Contributor

Describe your environment
We are using the opentelemetry tracing API (and SDK) inside a Linux C++ shared library (.so file) built with the standard gcc toolchain. This library provides messaging using the AMQP protocol (the library is Qpid Proton C++). We create tracing spans internal to the library to represent the lifecycle of messages sent and received by the library. The contexts associated with the spans are propagated with the messages so that distributed tracing is achieved.

The intention is that any application using our library should be able to create its own tracing spans that are naturally related to the library generated spans (by default using the active span as the parent span). This requires that the TracerProvider accessible to the library and to the application using it is the same TracerProvider.

Steps to reproduce
Our library is built with -fvisibility=hidden. This means that any symbol that should be exported from the library needs some extra annotation (__attribute__((visibility("default"))) for gcc/clang). We use this to carefully control the visible API/ABI from our library and to be sure that internal details aren't visible outside.

However, the way that the singleton inside Provider is implemented in a header file means that it is defined as a static symbol inside the inline member function Provider::GetProvider. This means that there are duplicate symbols for every time the provider.h header file is included, for example in our library and in an application that uses it.

If the symbols are all visible to the linker at link and runtime then the symbol will be effectively deduplicated and there will be only one used - to my understanding the application symbol takes precedence, but as long as there is only one it will work correctly.

However when the library symbols are hidden by default as in our library there is no way for the runtime loader to know that there should only be one version of the singleton and what happens is that the library and the application end up with different TracerProviders.

This is a serious problem for us as we ship our library with the symbols exported explicitly for very good reason and so we can't just use Provider::GetTracerProvider in the library and application and have it work correctly together.
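
To make the pattern described above concrete, here is a minimal sketch of a header-only singleton of this shape. The Example* names are hypothetical stand-ins and not the real opentelemetry-cpp classes; only the structure (a function-local static inside an inline static member function) is taken from the description.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for the API types (assumption, not the real classes).
struct ExampleTracerProvider {
  virtual ~ExampleTracerProvider() = default;
};
struct ExampleNoopTracerProvider : ExampleTracerProvider {};

class ExampleProvider {
public:
  static std::shared_ptr<ExampleTracerProvider> GetTracerProvider() {
    return GetProvider();
  }
  static void SetTracerProvider(std::shared_ptr<ExampleTracerProvider> tp) {
    GetProvider() = std::move(tp);
  }

private:
  // Defined inline in the header, so every object file that includes the
  // header carries its own copy of `provider`. The copies are merged only
  // if the symbol is visible to the linker and the runtime loader.
  static std::shared_ptr<ExampleTracerProvider> &GetProvider() {
    static std::shared_ptr<ExampleTracerProvider> provider =
        std::make_shared<ExampleNoopTracerProvider>();
    return provider;
  }
};

// Inside a single binary the setter and the getter see the same object.
inline bool ExampleSetterIsSeenByGetter() {
  auto mine = std::make_shared<ExampleNoopTracerProvider>();
  ExampleProvider::SetTracerProvider(mine);
  return ExampleProvider::GetTracerProvider() == mine;
}
```

A single-file example like this cannot reproduce the failure itself: the mismatch only appears when the same header is compiled into a shared library built with -fvisibility=hidden and into the application that loads it.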

What is the expected behavior?
The way I would expect this to work is that the singletons are not defined in header files and so duplicated in every object file that includes them, but rather the header file only declares Provider::GetProvider and it is defined in one of the opentelemetry-cpp libraries (the common lib?). I understand that this moves the singleton out of what you think of as the 'API' and into the 'SDK'. But I really think that architecturally singletons are actually implementation artifacts, not API artifacts.

Additional context
I note that this is not an issue at all if you statically link everything as in that case the final link phase takes care of the issue.

I also note that this should be an issue using DLLs on Windows, as symbols are not exported by default (AFAIR) and also need to be explicitly marked to be exported (using __declspec(dllexport)).

@astitcher astitcher added the bug Something isn't working label Jul 26, 2022
@astitcher astitcher changed the title Opentelemetry API singletons (TraceProvider etc.) do not play well in shared library uses that hide symbol visibliilty Opentelemetry API singletons (TraceProvider etc.) do not play well in shared library uses that hide symbol visibility Jul 26, 2022
@lalitb
Member

lalitb commented Jul 26, 2022

The issue has been discussed earlier for Windows - Support building DLL for windows. The header-only API design is the core issue here, and there are discussions on removing this constraint (no decision and timelines yet).

Just to add, moving Provider::GetProvider definition to SDK won't work here. The instrumentation libraries would be built against otel API (with no SDK) and are expected to work with the application built with/without otel SDK. As per the specs, the TracerProvider global state needs to be maintained in API implementation.

@owais, @marcalff @esigo @ThomsonTan - Just in case you have more to add here. I think @marcalff saw similar issue here #1409 (though this was SDK).

@owent
Member

owent commented Jul 27, 2022

If your project uses opentelemetry-cpp as an internal component, you can also build opentelemetry-cpp as a static library with -fvisibility=hidden and link it into your dynamic library; the symbols will then also not be visible outside.
If any headers of opentelemetry-cpp are exported as part of your product, I think the symbols should also be visible so that users use the same configuration and version of the symbols as when building it.

As mentioned by @lalitb, the header-only API design is the core issue with not exporting symbols, and
@meastp had an approach in #1105 and https://github.com/meastp/opentelemetry-cpp/tree/otel_support_shared_dll before. Maybe this will be helpful.

@astitcher
Contributor Author

@owent What you are suggesting with static libraries doesn't help because as @lalitb already said the issue is the header file only design for the API.

@astitcher
Contributor Author

astitcher commented Jul 27, 2022

@lalitb I looked at #1105 and indeed the issue and solution are the same - is there any movement on adding these files to a library? Would the approach there be creating a new API shared library? If that would be an otel build time decision we could work with that because we will be building otel packages ourselves to install before our own library package.
Equally I think that annotating the header files with the extra gcc __attribute__ or visual studio __declspec will also work without changing the design.
I'm happy to work on either of these if it will progress this issue, but it looked like the previous issue stalled with no progress to merging - why was that?

@lalitb
Member

lalitb commented Jul 27, 2022

Equally I think that annotating the header files with the extra gcc __attribute__ or visual studio __declspec will also work without changing the design.

@astitcher - Would it be possible for you to prototype and see if this works, which will help us make a decision on the right approach?

@marcalff
Member

marcalff commented Jul 27, 2022

Thanks @astitcher for the detailed problem description.

Also discussed during the opentelemetry-cpp meeting today.

The same issue was reported recently (#1409) and fixed (#1420)
for some singletons present in the SDK,
because the code used the general pattern of declaring singletons
in header only code, for the most part.

There are still some areas to fix in the SDK, see (#1429),
but moving singletons to *.cc files in the SDK only works
because there are libraries associated with the SDK.

The same coding pattern (header only singleton) is also used in the API,
with some extra constraints that the API should be header only.

I tried below to summarize the problem and constraints,
to have a fact based discussion about possible solutions.

All opinions are my own.

I) History

For some history about this decision, from the meeting notes:

Oct 7th, 2019:

Bogdan - can we have pure C++ header API? Ryan/Max - singleton/global would be hard.

Feb 24th, 2020:

[Evgeny] Does library provide access to globals/singletons, e.g. TraceFactory
Current requirements.md and this comment suggest no exported symbols

#38 (comment)

Multiple libraries, built separately, how do they share/access the same singleton?

[Ryan] The archives will have multiple symbols but the linker will resolve it. Bigger problems around late binding.

[Evgeny] Alternative: abandon header-only approach and put the singleton business into a *.cc file.

[Ryan] We decided early on to try to be header-only to be easier to install.

Takeaway from these early discussions:

The challenge of implementing singletons in header only code has been
identified more than 2.5 years ago already

The specific issue about shared libraries ("Bigger problems around late
binding") has been identified more than 2 years ago, and is not fully resolved
to this day.

A header only library is indeed easier to install, but note the wording
"try to", and the related fallback "Alternative: abandon header-only
approach": a header only API is a design goal, not an absolute requirement.

II) Desirable architectural qualities

II-A) Singletons MUST be singletons.

I think we all agree on this, otherwise everything will fall apart.

No matter what the final technical solution is, singletons must really be
singletons for opentelemetry-cpp to work.

II-B) Instrumenting a library MUST depend only on API header files.

When instrumenting library X, the library owner has no control,
and should not be concerned with, whether the library will be deployed:

  • in a binary using opentelemetry-cpp (SDK) or not
  • in a binary using a given SDK, exporter, processor, etc
  • in a binary using another SDK, exporter, processor, etc

The best way to achieve this is to compile the instrumented library
only against API header files, to make it independent of an SDK
implementation.

This is what hooks like the trace provider singleton are for in the first
place, isolate the instrumentation using the API from the implementation in
the SDK.

The desired goal here is for library authors to ship only one library,
instead of releasing one library with, one without, opentelemetry.
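
As an illustration of II-B, the following minimal sketch shows library code that depends only on an API-level hook and silently becomes a no-op when the application installs nothing. All Demo* names and LibrarySendMessage are hypothetical and chosen for this example; they are not opentelemetry-cpp identifiers.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical API surface (header-only part).
struct DemoTracer {
  virtual ~DemoTracer() = default;
  virtual void StartSpan(const std::string &name) = 0;
};
struct DemoNoopTracer : DemoTracer {
  void StartSpan(const std::string &) override {}
};
inline std::shared_ptr<DemoTracer> &DemoGlobalTracer() {
  static std::shared_ptr<DemoTracer> tracer = std::make_shared<DemoNoopTracer>();
  return tracer;
}

// Instrumented library code: compiled against the API hook only.
inline int LibrarySendMessage(const std::string &payload) {
  DemoGlobalTracer()->StartSpan("send");
  return static_cast<int>(payload.size());
}

// Hypothetical "SDK" side, known only to the application.
struct DemoRecordingTracer : DemoTracer {
  std::vector<std::string> spans;
  void StartSpan(const std::string &name) override { spans.push_back(name); }
};

// Application that opts in: installs a recording tracer, then calls the library.
inline std::size_t DemoRunWithSdk() {
  auto recorder = std::make_shared<DemoRecordingTracer>();
  DemoGlobalTracer() = recorder;
  LibrarySendMessage("hello");
  return recorder->spans.size();
}
```

The library function behaves identically in both deployments; only the object behind the global hook changes, which is why that hook has to be the same object for the library and the application.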

II-C) Deployment ease of use.

Assume an existing application composed of:

  • a library A,
  • a library B,
  • a library C,
  • the main application code.

The main application links against libraries A, B and C.

When instrumenting libraries with opentelemetry:

  • not all libraries have to be instrumented at once. For example, A is not
    instrumented.
  • multiple libraries, for example B and C, can be instrumented
    independently. When libraries are provided by third parties, "if" and
    "how" to instrument library B must be totally independent of library C.

To deploy the application, several choices are then possible:

a) Not using opentelemetry

Link:

  • non instrumented library A
  • instrumented library B
  • instrumented library C
  • the main application code, unchanged
    to produce an application that does not use opentelemetry-cpp.
    Some code might be instrumented, but is not in use at runtime.

In this deployment, linking the final application does not change,
as there are no additional libraries needed.

What the "header only API" property provides, is the possibility to link
the application without changes, in this case alone.

This is desirable to facilitate distributing libraries with/without
instrumentation, with minimal (in this case, zero) disruptions to the
application using the libraries and not using opentelemetry.

b) Using opentelemetry

Link:

  • non instrumented library A
  • instrumented library B
  • instrumented library C
  • the main application code, which configures the opentelemetry-cpp SDK,
  • the opentelemetry-cpp SDK libraries
    to produce an application that does use opentelemetry at runtime.

In this case, the fact that the API is header only is irrelevant,
because the application has to be modified to link with more libraries anyway.

II-D) Non intrusive architectural constraints

No matter what technical choices are made in the opentelemetry-cpp space,
adopting opentelemetry in an existing application should not invalidate
architectural and design choices made in the application.

In particular, if a given application has reasons of its own to compile code
with -fvisibility=hidden, opentelemetry should not get in the way and cause
this application decision to be changed to accommodate opentelemetry-cpp.

III) Current state

As of now, singletons like the global trace provider are:

  • considered part of the API
  • declared in header only files
  • duplicated in different compilation units during the build
  • resolved at link time IF the symbols are visible to the linker.

For applications using -fvisibility=hidden, symbols are not visible,
and singletons are in fact duplicated, causing bugs.

With the code as is, II-C-a) is satisfied at the expense of II-D)

IV) Possible solutions

IV-A) Do not support -fvisibility=hidden.

This "solution" is listed for completeness only to point out that it is not
acceptable.

For opentelemetry in general, and opentelemetry-cpp in particular, to be
successful, adopting opentelemetry should not cause major disruptions
in the application space, and changing how the entire application is built
is disruptive.

Disclosure:

In the application we (at work) instrument,
-fvisibility=hidden is used as well.

Changing this is not an option, making II-D) a critical point.

IV-B) Make header-only API and -fvisibility=hidden work together.

If feasible, this would be the best technical solution, as it
satisfies both II-C) and II-D)

Adding explicit visibility directives to the singleton globals
declared in the API, to make these symbols visible even
when the code is built with -fvisibility=hidden by default,
could be a technical solution here.

My main concern is:

  • technical feasibility at all on all platforms
  • support from various compilers

In particular, what if this is not possible for a given (old) version of a
compiler in a given platform, that is otherwise supported today ?

Is dropping support for compiler X an acceptable outcome ?

IV-C) Revisit the "header-only API" principle.

To take an example with the tracer provider singleton,
today the API has:

  • a getter function, trace::Provider::GetTracerProvider()
  • a setter function, trace::Provider::SetTracerProvider()
  • a singleton variable provider, defined inside trace::Provider::GetProvider()

As a side comment, not only the provider singleton should be unique,
but the mutex that protects it should be unique as well, to prevent races.
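
A minimal sketch of the three parts plus the lock may help here. It assumes a std::mutex for simplicity, and the Sketch* names are hypothetical; the actual opentelemetry-cpp implementation may use a different lock type and different signatures.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical provider type, used only for this sketch.
struct SketchTracerProvider {
  int id = 0;
};

class SketchProvider {
public:
  // Getter: used by instrumented libraries.
  static std::shared_ptr<SketchTracerProvider> GetTracerProvider() {
    std::lock_guard<std::mutex> guard(GetLock());
    return GetProvider();
  }
  // Setter: called by the application owner when configuring the SDK.
  static void SetTracerProvider(std::shared_ptr<SketchTracerProvider> tp) {
    std::lock_guard<std::mutex> guard(GetLock());
    GetProvider() = std::move(tp);
  }

private:
  // Both function-local statics must resolve to one object per process.
  // If the provider is shared but the lock is duplicated, two modules can
  // enter the critical section at the same time.
  static std::shared_ptr<SketchTracerProvider> &GetProvider() {
    static std::shared_ptr<SketchTracerProvider> provider =
        std::make_shared<SketchTracerProvider>();
    return provider;
  }
  static std::mutex &GetLock() {
    static std::mutex lock;
    return lock;
  }
};
```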

Looking at each part:

  • the getter function is definitely part of the API, to be used when
    instrumenting a library, per II-B)

  • the setter function should not be considered part of the API in the first
    place. An instrumented library should never call this setter.
    Logically, it belongs to the SDK, to be called by the application owner
    to configure the SDK.

  • the singleton itself is right at the boundary between the API and the SDK.
    As such, observing a different set of rules for the API/SDK boundary,
    if required for technical reasons, seems valid (to me).

The fact that the setter function is defined and implemented in the API
does not cause harm otherwise, so leaving the setter in the API is just fine.

Moving the singleton implementation into an opentelemetry-api-singletons library
will solve the technical problem here.

Consequences are that:

  • the "header-only API" principle applies to all the code except singletons,
    which must be documented to avoid confusion

  • applications linking with instrumented libraries now need an extra
    library to link

The latter point is not desirable, but this is a compromise if IV-B can not
be implemented.

In this solution, II-D) is supported at the expense of II-C-a)

V) Decision needed.

I think it is time to revisit the header-only API principle,
with regards to singletons.

If the pattern can be amended with visibility directives to make it work all
the time, great.

The first step is technical investigations, to see if solution IV-B is
feasible.

Otherwise, opentelemetry-cpp will have to provide an API library for
singletons, per IV-C.

The trade off between II-C and II-D is a major one.

Not supporting II-C has major impact on everyone not using opentelemetry.

Not supporting II-D has major impact on everyone using opentelemetry.

@lalitb
Member

lalitb commented Jul 27, 2022

Thanks for summarizing it nicely @marcalff . Appreciated :)

Agree on investigating the header-only approach with annotations (IV-B above) first. As long as it works on all supported platforms (Windows, Mac, Linux), and commonly used compilers as documented here - https://github.com/open-telemetry/opentelemetry-cpp#supported-development-platforms - this should be preferred.

  • Linux (gcc/g++): - @astitcher - Would you like to try using this extra gcc __attribute__ on Linux, and see if it works fine. Or else if someone can test it? I can test it sometime next week.
  • Windows (VS): - @owent, @meastp - As you have done some work on this earlier, do you think the header-only approach will work with __declspec for Windows, or if you can test it out?
  • Mac(clang/llvm) - Do we have this issue with Mac - If anyone wants to test it, else I can test it out sometime next week.

@astitcher
Contributor Author

@marcalff - Thanks for very neatly summarizing all of this and explaining the history - I think I'm in the camp that really considers the singletons to really be part of the SDK as they are actually implementation artifacts, but I totally get that requiring a new library to link with the API is highly undesirable at this point.
I will spend a bit of time this week to try and rough out the code using __attribute__((visibility("default"))) to see if that works.
My previous experience suggests that this syntax is good for every version of gcc and clang that matters (as in supports at least C++11). I also think that using a simple macro that either expands to that or to __declspec(dllexport) for VS should work. One thing that I think makes this work simpler is that the symbol export is unconditionally true for every use of the header file, which is unlike the common use where you want to export the symbol when actually compiling the library itself but import the symbol when using the header to define the API that the library exports.
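
The "simple macro" idea could look roughly like the sketch below. EXAMPLE_ALWAYS_EXPORT and the MacroDemo* names are hypothetical, and whether the __declspec(dllexport) branch actually yields a single shared instance on Windows is an open question at this point in the thread, so treat that branch as an untested assumption.

```cpp
#include <cassert>
#include <memory>

// Unconditional export: there is no import/export switch, because the
// header is the only place where the symbol is ever defined.
#if defined(_WIN32)
#  define EXAMPLE_ALWAYS_EXPORT __declspec(dllexport)
#elif defined(__GNUC__) || defined(__clang__)
#  define EXAMPLE_ALWAYS_EXPORT __attribute__((visibility("default")))
#else
#  define EXAMPLE_ALWAYS_EXPORT
#endif

// Hypothetical provider type, used only for this sketch.
struct MacroDemoProvider {
  int id = 0;
};

class MacroDemoRegistry {
public:
  // Marking the accessor keeps it visible even when the including project
  // is compiled with -fvisibility=hidden.
  EXAMPLE_ALWAYS_EXPORT static std::shared_ptr<MacroDemoProvider> &Get() {
    static std::shared_ptr<MacroDemoProvider> provider =
        std::make_shared<MacroDemoProvider>();
    return provider;
  }
};
```

On ELF platforms the result can be checked by inspecting the dynamic symbol table of the built shared library and confirming that the accessor and its guard variable are listed.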

@sirzooro

I will spend a bit of time this week to try and rough out the code using __attribute__((visibility("default"))) to see if that works.

Methods can stay inline; this should not break anything. Only singleton member variables need exactly one copy. Try marking them with __attribute__((weak, visibility("default"))). You will also need an extra check in methods to see whether the symbol is loaded, i.e. check that the symbol address is non-null before trying to use it. Example code:

class Singleton
{
private:
  __attribute__((weak, visibility("default")))
  static Singleton* instance;
public:
  static Singleton* get()
  {
    if (nullptr != &instance)
      return instance;
    else
      return nullptr;
  }
};

This solution also requires a non-weak symbol to be provided somewhere, so the API would also need a library with definitions of all singleton members. It would be linked with the app binary only.

@owent
Member

owent commented Jul 28, 2022

To my understanding, under the ELF ABI the linker uses the first weak symbol when there is more than one weak symbol, and uses the strong one when there are several weak symbols and one strong symbol with the same name. So __attribute__((visibility("default"))) will work on Linux, and also on macOS because it has a similar ABI.

But on Windows, with the PE ABI, every DLL and EXE has its own copy of the code and data written in headers unless they are declared as __declspec(dllimport). When a function or variable is declared __declspec(dllimport) in a header, there must be one compile unit (which generates a .obj with MSVC or a .o with GCC/Clang) that declares it as __declspec(dllexport). Because the API contains only headers, we currently have no compile unit that can hold these __declspec(dllexport) symbols.
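
For readers less familiar with the Windows convention described here, this is a minimal sketch of the usual export/import macro. EXAMPLE_BUILDING_DLL, EXAMPLE_DLL_API and ExampleSharedCounter are hypothetical names, and the header and source parts are shown in one file only so that the sketch is self-contained.

```cpp
#include <cassert>

// Normally set by the build of the DLL only; defined here because this
// single file plays the role of the DLL's own source.
#define EXAMPLE_BUILDING_DLL

#if defined(_WIN32)
#  if defined(EXAMPLE_BUILDING_DLL)
#    define EXAMPLE_DLL_API __declspec(dllexport)  // the one exporting module
#  else
#    define EXAMPLE_DLL_API __declspec(dllimport)  // every consumer
#  endif
#elif defined(__GNUC__) || defined(__clang__)
#  define EXAMPLE_DLL_API __attribute__((visibility("default")))
#else
#  define EXAMPLE_DLL_API
#endif

// --- header part: declaration only ---
EXAMPLE_DLL_API int &ExampleSharedCounter();

// --- source part: the single compile unit that owns the definition ---
int &ExampleSharedCounter() {
  static int counter = 0;
  return counter;
}
```

The definition lives in exactly one compile unit, which is the piece a header-only API does not have.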

Example 1:

Here is an example with __attribute__((visibility("default"))) on MinGW64/Windows:

test_dll.h

#pragma once

#include <iostream>

struct foo {
  __attribute__((visibility("default"))) static void print_addr() {
    __attribute__((visibility("default"))) static foo inst;
    inst.print();
    std::cout<< "print_addr address: "<< foo::print_addr<< std::endl;
  }

  void print() {
    std::cout<< "Instance address: "<< this<< std::endl;
  }
};

test_dll.cpp

#include "test_dll.h"

#include <iostream>

__attribute__((dllexport)) void call_print() {
  foo::print_addr();
}

test_main.cpp

#include "test_dll.h"

__attribute__ ((dllimport)) void call_print();

int main() {
  foo::print_addr();
  call_print();
  return 0;
}

Compile and Run

$ g++ test_dll.cpp -shared -o test_dll.dll
In file included from test_dll.cpp:1:
test_dll.h: In static member function 'static void foo::print_addr()':
test_dll.h:7:56: warning: 'visibility' attribute ignored [-Wattributes]
    7 |     __attribute__((visibility("default"))) static foo inst;
      |                                                       ^~~~

$ g++ test_main.cpp -o test_dll.exe -L. -ltest_dll
In file included from test_main.cpp:1:
test_dll.h: In static member function 'static void foo::print_addr()':
test_dll.h:7:56: warning: 'visibility' attribute ignored [-Wattributes]
    7 |     __attribute__((visibility("default"))) static foo inst;
      |                                                       ^~~~


$ ./test_dll.exe
Instance address: 0x22e66cb1410
print_addr address: 1
Instance address: 0x22e66b834a0
print_addr address: 1

$ cat /etc/os-release
NAME=MSYS2
ID=msys2
PRETTY_NAME="MSYS2"
ID_LIKE="cygwin arch"
HOME_URL="https://www.msys2.org"
BUG_REPORT_URL="https://github.com/msys2/MSYS2-packages/issues"

$ gcc -v
Using built-in specs.
COLLECT_GCC=C:\msys64\mingw64\bin\gcc.exe
COLLECT_LTO_WRAPPER=C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.1.0/lto-wrapper.exe
Target: x86_64-w64-mingw32
Configured with: ../gcc-12.1.0/configure --prefix=/mingw64 --with-local-prefix=/mingw64/local --build=x86_64-w64-mingw32 --host=x86_64-w64-mingw32 --target=x86_64-w64-mingw32 --with-native-system-header-dir=/mingw64/include --libexecdir=/mingw64/lib --enable-bootstrap --enable-checking=release --with-arch=x86-64 --with-tune=generic --enable-languages=c,lto,c++,fortran,ada,objc,obj-c++,jit --enable-shared --enable-static --enable-libatomic --enable-threads=posix --enable-graphite --enable-fully-dynamic-string --enable-libstdcxx-filesystem-ts --enable-libstdcxx-time --disable-libstdcxx-pch --enable-lto --enable-libgomp --disable-multilib --disable-rpath --disable-win32-registry --disable-nls --disable-werror --disable-symvers --with-libiconv --with-system-zlib --with-gmp=/mingw64 --with-mpfr=/mingw64 --with-mpc=/mingw64 --with-isl=/mingw64 --with-pkgversion='Rev2, Built by MSYS2 project' --with-bugurl=https://github.com/msys2/MINGW-packages/issues --with-gnu-as --with-gnu-ld --disable-libstdcxx-debug --with-boot-ldflags=-static-libstdc++ --with-stage1-ldflags=-static-libstdc++
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.1.0 (Rev2, Built by MSYS2 project)

As the result above shows, static foo inst is not a real singleton, and __attribute__((visibility("default"))) is simply ignored on Windows.

BTW: __attribute__((__dllimport__)) and __attribute__((__dllexport__)) are the GCC/Clang alternatives to MSVC's __declspec(dllimport) and __declspec(dllexport) on Windows, but I could not find any MSVC alternative to __attribute__((visibility("default"))).
If we declare static foo inst as __attribute__((weak, visibility("default"))) static foo inst;, it does not compile and shows the message below:

test_dll.h:7:61: error: weak declaration of 'inst' must be public
    7 |     __attribute__((weak, visibility("default"))) static foo inst;

Example 2:

In example 2, I changed the position of the singleton variable.

test_dll.h

#pragma once

#include <iostream>

struct foo {
  __attribute__((visibility("default"))) static void print_addr() {
    inst.print();
    std::cout<< "print_addr address: "<< foo::print_addr<< std::endl;
  }

  void print() {
    std::cout<< "Instance address: "<< this<< std::endl;
  }

  __attribute__((weak, visibility("default"))) static foo inst;
};


__attribute__((weak, visibility("default"))) foo foo::inst;

The remaining files and compile commands are the same as above; the result is:

$ ./test_dll.exe
Instance address: 0x7ff7750f7040
print_addr address: 1
Instance address: 0x7ff87b247020
print_addr address: 1

The singleton variable is also not a real singleton.

Example 3:

I also tested __declspec(selectany) of MSVC from #1525.

test_dll.h

#pragma once

#include <iostream>

struct foo {
  static void print_addr() {
    inst.print();
    std::cout<< "print_addr address: "<< foo::print_addr<< std::endl;
  }

  void print() {
    std::cout<< "Instance address: "<< this<< std::endl;
  }

  static foo inst;
};


__declspec(selectany) foo foo::inst;

test_dll.cpp

#include "test_dll.h"

#include <iostream>

__declspec(dllexport) void call_print() {
  foo::print_addr();
}

test_main.cpp

#include "test_dll.h"

__declspec(dllimport) void call_print();

int main() {
    foo::print_addr();
    call_print();
    return 0;
}

Compile and Run

cl /nologo /O2 /MD /Zi /Z7 /LD test_dll.cpp
cl /nologo /O2 /MD /Zi /Z7 test_main.cpp /link test_dll

d:\workspace\test\vcconsole\testdll>test_main.exe
Instance address: 00007FF6551F9150
print_addr address: 00007FF6551F1140
Instance address: 00007FF6551F9150
print_addr address: 00007FF6551F1140

The good news is that this flag works well on MSVC with /link test_dll, but it does not work when we link test_dll with /link test_dll.lib.

cl /nologo /O2 /MD /Zi /Z7 /LD test_dll.cpp
cl /nologo /O2 /MD /Zi /Z7 test_main.cpp /link test_dll.lib

d:\workspace\test\vcconsole\testdll>test_main.exe
Instance address: 00E08130
print_addr address: 00E0114A
Instance address: 796A8120
print_addr address: 796A1122

And CMake seems to always link using the .lib file.

Example 4:

After learning from #1525 and https://gcc.gnu.org/onlinedocs/gcc/Microsoft-Windows-Variable-Attributes.html , I also tested __attribute__((selectany)) of gcc.

test_dll.h

#pragma once

#include <iostream>

struct foo {
  __attribute__((visibility("default"))) static void print_addr() {
    inst.print();
    std::cout<< "print_addr address: "<< foo::print_addr<< std::endl;
  }

  void print() {
    std::cout<< "Instance address: "<< this<< std::endl;
  }

  __attribute__((visibility("default"), weak)) static foo inst;
};


__attribute__((visibility("default"), weak, selectany)) foo foo::inst;

The remaining files and compile commands are the same as in Example 1 above; the result is:

$ ./test_dll.exe
Instance address: 0x7ff735ac30b0
print_addr address: 1
Instance address: 0x7ff8d19c3070
print_addr address: 1

The singleton variable is still not a real singleton, or have I missed something? @marcalff

@sirzooro

I have just recalled that MSVC has a special pragma which tells the compiler to add a specified library to the linker's library list. This is a potential workaround to automatically link an extra API lib with singleton definitions on Windows and pretend that the API is still header-only.
#pragma comment(lib, "foo.lib")

@owent
Member

owent commented Jul 28, 2022

I have just recalled that MSVC has a special pragma which tells the compiler to add a specified library to the linker's library list. This is a potential workaround to automatically link an extra API lib with singleton definitions on Windows and pretend that the API is still header-only. #pragma comment(lib, "foo.lib")

This just tells MSVC to link foo.lib, but I think a header-only library does not generate any .lib file?

marcalff added a commit to marcalff/opentelemetry-cpp that referenced this issue Jul 28, 2022
Fix header only API for singletons (open-telemetry#1520)
marcalff added a commit to marcalff/opentelemetry-cpp that referenced this issue Jul 28, 2022
marcalff added a commit to marcalff/opentelemetry-cpp that referenced this issue Jul 28, 2022
@marcalff
Member

Example 4:

After learning from #1525 and https://gcc.gnu.org/onlinedocs/gcc/Microsoft-Windows-Variable-Attributes.html , I also test __attribute__((selectany)) of gcc.

Thanks @owent for the tests.

For gcc on windows, could you try this:

struct foo {
  __attribute__((selectany)) static void print_addr() {
    inst.print();
    std::cout<< "print_addr address: "<< foo::print_addr<< std::endl;
  }

  void print() {
    std::cout<< "Instance address: "<< this<< std::endl;
  }

  static foo inst;
};

__attribute__((selectany)) foo foo::inst;

My understanding is that:

  • __attribute__((visibility("default"), weak)) is for the ELF ABI, so Linux
  • __declspec(selectany) is for the windows ABI, and basically means a weak symbol

For GCC on the Windows OS, I would expect to use either:

  • __declspec(selectany), which could be understood by GCC
  • __attribute__((selectany)), also possibly accepted by GCC
    because the choice comes from the ABI, not the compiler (in my understanding)

For Clang on windows, not sure

The good news is that we seem to have a working solution so far for:

  • GCC on linux
  • MSVC on windows

Still to be tested:

  • Clang on various OS
  • MacOS

@marcalff
Member

marcalff commented Jul 29, 2022

My findings so far, about the position of the singleton variable ...

static variable inside a function or method

class Foo {
  void bar() {
    static xxx singleton = yyy;
  }
};

Here the visibility of the singleton also depends on the visibility of the class and method,
making things much more complicated.

To avoid.

Member, in class

class Foo {
  static xxx singleton = yyy;
};

This creates a LOT of trouble when adding annotations.
I don't recall the URL, but from memory the GCC doc mentioned explicitly that attributes don't work well with members defined in class.

To avoid.

Member, out of class

class Foo {
  static xxx singleton;
};

xxx Foo::singleton = yyy;

This is what I ended up using in #1525

Seems to be the best solution.

Note that depending on platforms the declaration and/or the definition needs to have special attributes.
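
A minimal sketch of the "member, out of class" layout with per-platform annotations might look as follows. The SKETCH_* macros and Sketch* types are hypothetical; the GCC/Clang branch mirrors the attributes used in Example 2 and the MSVC branch mirrors Example 3. The tests earlier in this thread suggest the GCC branch does not give a single instance on MinGW/Windows, so this is an assumption-laden outline and not the code from #1525.

```cpp
#include <cassert>

#if defined(_MSC_VER)
// MSVC: the definition is marked so the linker keeps any one copy.
#  define SKETCH_SINGLETON_DECL
#  define SKETCH_SINGLETON_DEF __declspec(selectany)
#elif defined(__GNUC__) || defined(__clang__)
// ELF-style toolchains: visible and weak, so copies can be merged.
#  define SKETCH_SINGLETON_DECL __attribute__((weak, visibility("default")))
#  define SKETCH_SINGLETON_DEF __attribute__((weak, visibility("default")))
#else
// Unknown compiler: no annotation. A header with this definition would
// then be safe to include from one translation unit only.
#  define SKETCH_SINGLETON_DECL
#  define SKETCH_SINGLETON_DEF
#endif

// Hypothetical payload type for the singleton.
struct SketchState {
  int value;
};

class SketchHolder {
public:
  static SketchState &Instance() { return singleton; }

private:
  // Declaration inside the class.
  SKETCH_SINGLETON_DECL static SketchState singleton;
};

// Definition outside the class, still in the header, explicitly
// initialized because selectany requires an initialized variable.
SKETCH_SINGLETON_DEF SketchState SketchHolder::singleton = {0};
```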

@astitcher
Contributor Author

I didn't get time this week to rough out any actual code, but I did enough messing/research to have come to a different conclusion from others here:

  • At least for Linux/Unix the symbols should not be weak; they should be strong symbols. Weak symbols don't have well-defined semantics, whereas strong symbols do exactly what we want, which is to have the application symbol interpose over the shared library symbol (in the same way that if you define malloc() in a C program it will get used by the C library instead of the built-in malloc() in the C library). This is the default for the Unix loader/linker by design, although it is a design with footguns included!
  • In my coding experiments it's not necessary to break out the static function variables into member statics; you merely have to mark the static accessor, viz:
class Provider {
public:
  static std::shared_ptr<TracerProvider> GetTracerProvider() noexcept {
    return GetProvider();
  }

  static void SetTracerProvider(std::shared_ptr<TracerProvider> tp) noexcept {
    GetProvider() = tp;
  }

private:
  __attribute__((visibility("default"))) static std::shared_ptr<TracerProvider> &GetProvider() noexcept {
    static std::shared_ptr<TracerProvider> provider(new NoopTracerProvider);
    return provider;
  }
};
  • So this is a smaller diff (although it's not obvious how much that matters).
    Hopefully I will have time on Monday to flesh out something a little more fully baked for Linux/gcc/clang.

@owent
Member

owent commented Jul 30, 2022

__attribute__((selectany))

Example 4:

After learning from #1525 and https://gcc.gnu.org/onlinedocs/gcc/Microsoft-Windows-Variable-Attributes.html , I also tested __attribute__((selectany)) with gcc.

Thanks @owent for the tests.

For gcc on windows, could you try this:

struct foo {
  __attribute__((selectany)) static void print_addr() {
    inst.print();
    std::cout<< "print_addr address: "<< foo::print_addr<< std::endl;
  }

  void print() {
    std::cout<< "Instance address: "<< this<< std::endl;
  }

  static foo inst;
};

__attribute__((selectany)) foo foo::inst;

My understanding is that:

  • __attribute__((visibility("default"), weak)) is for the ELF ABI, so Linux
  • __declspec(selectany) is for the windows ABI, and basically means a weak symbol

For GCC on the Windows OS, I would expect to use either:

  • __declspec(selectany), which could be understood by GCC
  • __attribute__((selectany)), also possibly accepted by GCC
    because the choice comes from the ABI, not the compiler (in my understanding)

For Clang on windows, not sure

The good news is that we seem to have a working solution so far for:

  • GCC on linux
  • MSVC on windows

Still to be tested:

  • Clang on various OS
  • MacOS

__attribute__((selectany)) is a variable attribute, not a function attribute. It won't compile.

owent@OWENTOU-PC2 MINGW64 /d/workspace/test/vcconsole/testdll
$ cat test_dll.h
#pragma once

#include <iostream>

struct foo {
   __attribute__((selectany)) static void print_addr() {
    inst.print();
    std::cout<< "print_addr address: "<< foo::print_addr<< std::endl;
  }

  void print() {
    std::cout<< "Instance address: "<< this<< std::endl;
  }

  static foo inst;
};


__attribute__((selectany)) foo foo::inst;


owent@OWENTOU-PC2 MINGW64 /d/workspace/test/vcconsole/testdll
$ g++ test_dll.cpp -shared -o test_dll.dll
In file included from test_dll.cpp:1:
test_dll.h:6:54: error: 'selectany' attribute applies only to initialized variables with external linkage
    6 |    __attribute__((selectany)) static void print_addr() {
      |                                                      ^

GCC accepts both __declspec(selectany) and __attribute__((selectany)), and they give the same result for Example 4 above. I tested both but posted only one. I could be wrong, but I prefer to define the static variable inside a function, because its initialization is then delayed until the first call of the function. The initialization order of static member variables and global variables across translation units is undefined, and there may be problems if they depend on each other. However, it won't compile if I add __attribute__((weak, visibility("default"))) to a static variable inside a function with GCC on Windows (MinGW).
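The lazy-initialization point can be shown with a small sketch. All names here (`InitLog`, `Tracked`, `Lazy`) are made up for the illustration. The function-local static is constructed on the first call to its accessor and not before, so it cannot be caught by initialization-order problems between globals.

```cpp
#include <string>
#include <vector>

// Records the order in which objects get constructed.
inline std::vector<std::string> &InitLog() {
  static std::vector<std::string> log;
  return log;
}

struct Tracked {
  explicit Tracked(const char *name) { InitLog().push_back(name); }
};

// Function-local static: constructed on the first call to Lazy(), exactly once.
inline Tracked &Lazy() {
  static Tracked instance("lazy");
  return instance;
}
```

Calling `Lazy()` repeatedly returns the same object, and the constructor only runs the first time.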

@owent
Member

owent commented Jul 30, 2022

Example 1

Yes

I didn't get time this week to rough out any actual code, but I did enough messing/research to have come to different conclusion to others here:

  • At least for Linux/Unix the symbols should not be weak; they should be strong symbols. Weak symbols don't have well-defined semantics, whereas strong symbols do exactly what we want, which is to have the application symbol interpose over the shared library symbol (in the same way that if you define malloc() in a C program, it will get used by the C library instead of the built-in malloc() in the C library). This is the default for the Unix loader/linker by design, although it is a design with footguns included!
  • In my coding experiments it's not necessary to break out the static function variables into member statics; you merely have to mark the static accessor, viz:
class Provider {
public:
  static std::shared_ptr<TracerProvider> GetTracerProvider() noexcept {
    return GetProvider();
  }

  static void SetTracerProvider(std::shared_ptr<TracerProvider> tp) noexcept {
    GetProvider() = tp;
  }

private:
  __attribute__((visibility("default"))) static std::shared_ptr<TracerProvider> &GetProvider() noexcept {
    static std::shared_ptr<TracerProvider> provider(new NoopTracerProvider);
    return provider;
  }
};
  • So this is a smaller diff (although it's not obvious how much that matters).
    Hopefully I will have time on Monday to flesh out something a little more fully baked for Linux/gcc/clang.

Yes, it's unnecessary to add __attribute__((visibility("default"))) to a static variable inside a function, but the problem is the same as in Example 1 above: each DLL/EXE will have its own copy of these static variables on Windows. The symbol lookup rules are different between the ELF ABI (Linux) and the PE ABI (Windows).

@astitcher
Contributor Author

It appears we have a C++ language issue, which may mean that the Windows solution doesn't "work":

Defining the static member out of line in a header file is an ODR violation and so technically is not allowed by the language. viz:

class Provider {
public:
  static std::shared_ptr<TracerProvider> GetTracerProvider() noexcept {
    return GetProvider();
  }

  static void SetTracerProvider(std::shared_ptr<TracerProvider> tp) noexcept {
    GetProvider() = tp;
  }

private:
  static std::shared_ptr<TracerProvider> provider;
  static std::shared_ptr<TracerProvider> &GetProvider() noexcept {
    return provider;
  }
};

__attribute__((visibility("default"))) std::shared_ptr<TracerProvider> Provider::provider=std::make_shared<NoopTracerProvider>();

This code works (and I assume the variant using __declspec(selectany) works for Visual Studio). However, the language doesn't actually allow this, because provider is defined in every translation unit that includes the header, which violates the language's 'One-Definition Rule'. In practice it probably works, but I think what happens when you violate the ODR is undefined.

Curiously the solution with a static inside the member function doesn't violate ODR from the language perspective (afaict).

I'm not sure if this practically matters, but it is worth knowing because the ODR rule is there precisely to avoid exactly the kind of issue we're trying to solve here.
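For comparison only: C++17 inline variables are the language-level way to define a static data member in a header without breaking the ODR. This sketch assumes a C++17 compiler, which nothing in this thread takes for granted, and `Counter` and `CounterHolder` are hypothetical names. It addresses the ODR point only; it does not change symbol visibility, so it is not a fix for the hidden-visibility problem in this issue.

```cpp
#include <memory>
#include <utility>

// Stand-in payload type for the sketch.
struct Counter {
  int value = 0;
};

class CounterHolder {
public:
  static std::shared_ptr<Counter> Get() { return instance_; }
  static void Set(std::shared_ptr<Counter> c) { instance_ = std::move(c); }

private:
  // C++17 'inline' variable: this in-class definition may appear in every
  // translation unit that includes the header without violating the ODR.
  static inline std::shared_ptr<Counter> instance_ = std::make_shared<Counter>();
};
```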

@astitcher
Contributor Author

Also worth noting that "weak" symbol references are really not what we are trying to achieve here so I don't know why people keep on using that terminology.

We are trying to ensure that we only pick out a single symbol reference from multiple of the same name (and size) potentially in different objects.

That is not what weak references are primarily for: The major point of a weak reference is to be able to override one implementation of something with a more specialized implementation, or to allow the weak implementation not to exist at all - neither of these fits what we need here.
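To make the "allowed not to exist at all" use case concrete, here is a minimal sketch of an optional hook declared as a weak reference. It assumes GCC or Clang on an ELF target; `demo_optional_hook` and `CallHookIfPresent` are hypothetical names. If no object file or library defines the hook, the reference resolves to null and the call is skipped.

```cpp
#if defined(__GNUC__) && defined(__ELF__)
// Weak reference: linking succeeds even if nothing defines this function.
extern "C" void demo_optional_hook() __attribute__((weak));
#  define DEMO_HAS_WEAK 1
#else
#  define DEMO_HAS_WEAK 0
#endif

// Calls the hook only when some module actually provides it.
inline bool CallHookIfPresent() {
#if DEMO_HAS_WEAK
  if (demo_optional_hook) {
    demo_optional_hook();
    return true;
  }
#endif
  return false;
}
```

This is a different goal from picking one shared instance out of several identical definitions, which is what the singleton needs.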

@meastp
Contributor

meastp commented Aug 15, 2022

hi all, sorry for the late reply (was away on holiday)

Just a quick note to say that I think the only way to support Windows + DLL is by building the API itself as a dll / library (see explanation by @owent for details). I have been successfully testing this approach in my fork/branch as mentioned in #1105.

I suggested a hybrid version (#1105) where the API can be compiled as either header-only or static/dynamic library, but it had a lot of maintenance overhead and was (rightly) rejected as a viable solution.

@meastp
Contributor

meastp commented Aug 29, 2022

hi all - is there a status update on this issue? personally, I think just converting the API to a "regular" library would be easier - especially to attract more contributors with a common model :)

I'm kind of biased, though, because I would ideally like to avoid maintaining a fork just to be able to compile on Windows as a DLL :)

@esigo
Member

esigo commented Aug 29, 2022

hi all - is there a status update on this issue? personally, I think just converting the API to a "regular" library would be easier - especially to attract more contributors with a common model :)

I'm kind of biased, though, because I would ideally like to avoid maintaining a fork just to be able to compile on Windows as a DLL :)

The plan is to limit the scope to Linux for now: #1525 (comment)

@meastp
Contributor

meastp commented Aug 30, 2022

@esigo oh :(

@astitcher
Contributor Author

@meastp I think that supporting Windows requires a DLL, so it will come later. It is possible to make the library inclusion 'automatic' by using a #pragma, but I'm not sure there is any real benefit, as a library will still be needed.

@meastp
Contributor

meastp commented Aug 31, 2022

@astitcher yes, I think it does require a dll. Is the plan still to split the API into a header-only library and a static/dll library for everything that needs to be in a static/dll library to work properly?

Library inclusion is not a problem, I wouldn't worry about it at this stage. A working DLL is more important ;)

@marcalff
Member

There are two separate issues to address here.

For Windows, the opentelemetry-cpp code does not support building a Windows DLL (as far as I understand), so this needs to be resolved first, independently of uniqueness issues for singletons. I did not investigate first-hand, but from discussions with peers there are specific concerns about memory allocation and deallocation to be addressed.

For all platforms, singletons must be singletons, so that the opentelemetry-cpp shared library / DLL for the SDK can be used by many components, not just one.

Work is in progress to resolve the second part, for all platforms that can currently be built as shared libraries (so, no Windows yet).

Once the first part is resolved (Windows DLL), this will be re-evaluated to also address singleton issues for Windows.

marcalff added a commit to marcalff/opentelemetry-cpp that referenced this issue Sep 9, 2022
This fix is for gcc and clang only.
@marcalff
Member

Fixed by #1604

@lalitb

Please close, I don't have permissions to do so.
