-
Notifications
You must be signed in to change notification settings - Fork 423
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Opentelemetry API singletons (TraceProvider etc.) do not play well in shared library uses that hide symbol visibility #1520
Comments
The issue has been discussed earlier for Windows - Support building DLL for windows. The header-only API design is the core issue here, and there are discussions on removing this constraint (no decision and timelines yet). Just to add, moving @owais, @marcalff @esigo @ThomsonTan - Just in case you have more to add here. I think @marcalff saw similar issue here #1409 (though this was SDK). |
If your project use opentelemetry-cpp as internal component, you can also build opentelemetry-cpp as static library with As mentioned by @lalitb .The header-only API design is the core issue of not exporting symbols, and |
@lalitb I looked at #1105 and indeed the issue and solution are the same - is there any movement on adding these files to a library? Would the approach there be creating a new API shared library? If that would be an otel build time decision we could work with that because we will be building otel packages ourselves to install before our own library package. |
@astitcher - Would it be possible for you to prototype and see if this works, which will help us make a decision on the right approach? |
Thanks @astitcher for the detailed problem description. Also discussed during the opentelemetry-cpp meeting today. The same issue was reported recently (#1409) and fixed (#1420) There is still some areas to fix in the SDK, see (#1429), The same coding pattern (header only singleton) is also used in the API, I tried below to summarize the problem and constraints, All opinions are my own. I) HistoryFor some history about this decision, from the meeting notes: Oct 7th, 2019:Bogdan - can we have pure C++ header API? Ryan/Max - singleton/global would be hard. Feb 24th, 2020:[Evgeny] Does library provide access to globals/singletons, e.g. TraceFactory Multiple libraries, built separately, how do they share/access the same singleton? [Ryan] The archives will have multiple symbols but the linker will resolve it. Bigger problems around late binding. [Evgeny] Alternative: abandon header-only approach and put the singleton business into a *.cc file. [Ryan] We decided early on to try to be header-only to be easier to install. Takeaway from these early discussions:The challenge of implementing singletons in header only code has been The specific issue about shared libraries ("Bigger problems around late A header only library is indeed easier to install, but note the wording II) Desirable architectural qualitiesII-A) Singletons MUST be singletons.I think we all agree on this, otherwise everything will fall apart. No matter what the final technical solution is, singletons must really be II-B) Instrumenting a library MUST depend only on API header files.When instrumenting library X, the library owner has no control,
The best way to achieve this is to compile the instrumented library This is what hooks like the trace provider singleton are for in the first The desired goal here is for library authors to ship only one library, II-C) Deployment ease of use.Assume an existing application composed of:
The main application links against libraries A, B and C. When instrumenting libraries with opentelemetry:
To deploy the application, several choices are then possible: a) Not using opentelemetryLink:
In this deployment, linking the final application does not change, What the "header only API" property provides, is the possibility to link This is desirable to facilitate distributing libraries with/without b) Using opentelemetryLink:
In this case, the fact that the API is header only is irrelevant, II-D) Non intrusive architectural constraintsNo matter what technical choices are made in the opentelemetry-cpp space, In particular, if a given application has reasons of its own to compile code III) Current stateAs of now, singletons like the global trace provider are:
For applications using With the code as is, II-C-a) is satisfied at the expense of II-D) IV) Possible solutionsIV-A) Do not support -fvisibility=hidden.This "solution" is listed for completeness only to point out that it is not For opentelemetry in general, and opentelemetry-cpp in particular, to be Disclosure: In the application we (at work) instrument, Changing this is not an option, making II-D) a critical point. IV-B) Make header-only API and -fvisibility=hidden work together.If feasible, this would be the best technical solution, as it Adding explicit visibility directives to the singleton globals My main concern is:
In particular, what if this is not possible for a given (old) version of a Is dropping support for compiler X an acceptable outcome ? IV-C) Revisit the "header-only API" principle.To take an example with the tracer provider singleton,
As a side comment, not only the provider singleton should be unique, Looking at each part:
The fact that the setter function is defined and implemented in the API Moving the singleton implementation in a opentelemetry-api-singletons library Consequences are that:
The later point is not desirable, but this is a compromise if IV-B can not In this solution, II-D) is supported at the expense of II-C-a) V) Decision needed.I think it is time to revisit the header-only API principle, If the pattern can be amended with visibility directives to make it work all The first step is technical investigations, to see if solution IV-B is Otherwise, opentelemetry-cpp will have to provide an API library for The trade off between II-C and II-D is a major one. Not supporting II-C has major impact on everyone not using opentelemetry. Not supporting II-D has major impact on everyone using opentelemetry. |
Thanks for summarizing it nicely @marcalff . Appreciated :) Agree on investigating the header-only approach with annotations (IV-B above) first. As long as it works on all supported platforms (Windows, Mac, Linux), and commonly used compilers as documented here - https://github.com/open-telemetry/opentelemetry-cpp#supported-development-platforms - this should be preferred.
|
@marcalff - Thanks for very neatly summarizing all of this and explaining the history - I think I'm in the camp that really considers the singletons to really be part of the SDK as they are actually implementation artifacts, but I totally get that requiring a new library to link with the API is highly undesirable at this point. |
Methods can stay inline, this should not break anything. Only singleton member variables need exactly one copy. Try to mark them with class Singleton
{
private:
__attribute__((weak, visibility(default)))
static Singleton* instance;
public:
static Singleton* get()
{
if (nullptr != &instance)
return instance;
else
return nullptr;
}
}; This solution also require to provide non-weak symbol somewhere, so API also would need library with definitions of all singleton members. It would be linked with app binary only. |
To my understanding, ELF ABI said it will use the first weak symbol when there are more than one weak symbol, or use the strong one when there several weak symbols and one strong symbol with the same name. So But on Windows, with PE ABI, every dll and exe will have their own copy of codes and data written in headers , except they are declared as Example 1:Here is a example with test_dll.h#pragma once
#include <iostream>
struct foo {
__attribute__((visibility("default"))) static void print_addr() {
__attribute__((visibility("default"))) static foo inst;
inst.print();
std::cout<< "print_addr address: "<< foo::print_addr<< std::endl;
}
void print() {
std::cout<< "Instance address: "<< this<< std::endl;
}
}; test_dll.cpp#include "test_dll.h"
#include <iostream>
__attribute__((dllexport)) void call_print() {
foo::print_addr();
} test_main.cpp#include "test_dll.h"
__attribute__ ((dllimport)) void call_print();
int main() {
foo::print_addr();
call_print();
return 0;
} Compile and Run$ g++ test_dll.cpp -shared -o test_dll.dll
In file included from test_dll.cpp:1:
test_dll.h: In static member function 'static void foo::print_addr()':
test_dll.h:7:56: warning: 'visibility' attribute ignored [-Wattributes]
7 | __attribute__((visibility("default"))) static foo inst;
| ^~~~
$ g++ test_main.cpp -o test_dll.exe -L. -ltest_dll
In file included from test_main.cpp:1:
test_dll.h: In static member function 'static void foo::print_addr()':
test_dll.h:7:56: warning: 'visibility' attribute ignored [-Wattributes]
7 | __attribute__((visibility("default"))) static foo inst;
| ^~~~
$ ./test_dll.exe
Instance address: 0x22e66cb1410
print_addr address: 1
Instance address: 0x22e66b834a0
print_addr address: 1
$ cat /etc/os-release
NAME=MSYS2
ID=msys2
PRETTY_NAME="MSYS2"
ID_LIKE="cygwin arch"
HOME_URL="https://www.msys2.org"
BUG_REPORT_URL="https://github.com/msys2/MSYS2-packages/issues"
$ gcc -v
Using built-in specs.
COLLECT_GCC=C:\msys64\mingw64\bin\gcc.exe
COLLECT_LTO_WRAPPER=C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/12.1.0/lto-wrapper.exe
Target: x86_64-w64-mingw32
Configured with: ../gcc-12.1.0/configure --prefix=/mingw64 --with-local-prefix=/mingw64/local --build=x86_64-w64-mingw32 --host=x86_64-w64-mingw32 --target=x86_64-w64-mingw32 --with-native-system-header-dir=/mingw64/include --libexecdir=/mingw64/lib --enable-bootstrap --enable-checking=release --with-arch=x86-64 --with-tune=generic --enable-languages=c,lto,c++,fortran,ada,objc,obj-c++,jit --enable-shared --enable-static --enable-libatomic --enable-threads=posix --enable-graphite --enable-fully-dynamic-string --enable-libstdcxx-filesystem-ts --enable-libstdcxx-time --disable-libstdcxx-pch --enable-lto --enable-libgomp --disable-multilib --disable-rpath --disable-win32-registry --disable-nls --disable-werror --disable-symvers --with-libiconv --with-system-zlib --with-gmp=/mingw64 --with-mpfr=/mingw64 --with-mpc=/mingw64 --with-isl=/mingw64 --with-pkgversion='Rev2, Built by MSYS2 project' --with-bugurl=https://github.com/msys2/MINGW-packages/issues --with-gnu-as --with-gnu-ld --disable-libstdcxx-debug --with-boot-ldflags=-static-libstdc++ --with-stage1-ldflags=-static-libstdc++
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.1.0 (Rev2, Built by MSYS2 project) As the result above the
Example 2:In example 2, I change the position of singleton variable. test_dll.h#pragma once
#include <iostream>
struct foo {
__attribute__((visibility("default"))) static void print_addr() {
inst.print();
std::cout<< "print_addr address: "<< foo::print_addr<< std::endl;
}
void print() {
std::cout<< "Instance address: "<< this<< std::endl;
}
__attribute__((weak, visibility("default"))) static foo inst;
};
__attribute__((weak, visibility("default"))) foo foo::inst; And the rest files and compiling commands are the same as above, the result is here: $ ./test_dll.exe
Instance address: 0x7ff7750f7040
print_addr address: 1
Instance address: 0x7ff87b247020
print_addr address: 1 The singleton variable is also not a real singleton. Example 3:I also test test_dll.h#pragma once
#include <iostream>
struct foo {
static void print_addr() {
inst.print();
std::cout<< "print_addr address: "<< foo::print_addr<< std::endl;
}
void print() {
std::cout<< "Instance address: "<< this<< std::endl;
}
static foo inst;
};
__declspec(selectany) foo foo::inst; test_dll.cpp#include "test_dll.h"
#include <iostream>
__declspec(dllexport) void call_print() {
foo::print_addr();
} test_main.cpp#include "test_dll.h"
__declspec(dllimport) void call_print();
int main() {
foo::print_addr();
call_print();
return 0;
} Compile and Runcl /nologo /O2 /MD /Zi /Z7 /LD test_dll.cpp
cl /nologo /O2 /MD /Zi /Z7 test_main.cpp /link test_dll
d:\workspace\test\vcconsole\testdll>test_main.exe
Instance address: 00007FF6551F9150
print_addr address: 00007FF6551F1140
Instance address: 00007FF6551F9150
print_addr address: 00007FF6551F1140 The good news is this flag works well on MSVC with cl /nologo /O2 /MD /Zi /Z7 /LD test_dll.cpp
cl /nologo /O2 /MD /Zi /Z7 test_main.cpp /link test_dll.lib
d:\workspace\test\vcconsole\testdll>test_main.exe
Instance address: 00E08130
print_addr address: 00E0114A
Instance address: 796A8120
print_addr address: 796A1122 And cmake seems always use Example 4:After learning from #1525 and https://gcc.gnu.org/onlinedocs/gcc/Microsoft-Windows-Variable-Attributes.html , I also test test_dll.h#pragma once
#include <iostream>
struct foo {
__attribute__((visibility("default"))) static void print_addr() {
inst.print();
std::cout<< "print_addr address: "<< foo::print_addr<< std::endl;
}
void print() {
std::cout<< "Instance address: "<< this<< std::endl;
}
__attribute__((visibility("default"), weak)) static foo inst;
};
__attribute__((visibility("default"), weak, selectany)) foo foo::inst; The rest files and compiling commands are the same as Example 1: above, the result is here: $ ./test_dll.exe
Instance address: 0x7ff735ac30b0
print_addr address: 1
Instance address: 0x7ff8d19c3070
print_addr address: 1 The singleton variable is still not a real singleton, or have I missed something? @marcalff |
I have just recalled that MSVC has special pragma which tells compiler to add specified library to linker library list. This is potential workaround to automatically link extra API lib with singleton definitions on Windows and pretend that API is still header-only. |
This just tell MSVC to link |
Fix header only API for singletons (open-telemetry#1520)
Fixed asan build
Thanks @owent for the tests. For gcc on windows, could you try this:
My understanding is that:
For GCC on the Windows OS, I would expect to use either:
For Clang on windows, not sure The good news is that we seem to have a working solution so far for:
Still to be tested:
|
My findings so far, about the position of the singleton variable ... static variable inside a function or method
Here the visibility of the singleton also depends on the visibility of the class and method, To avoid. Member, in class
This creates a LOT of trouble when adding annotations. To avoid. Member, out of class
This is what I ended up using in #1525 Seems to be the best solution. Note that depending on platforms the declaration and/or the definition needs to have special attributes. |
I didn't get time this week to rough out any actual code, but I did enough messing/research to have come to different conclusion to others here:
|
owent@OWENTOU-PC2 MINGW64 /d/workspace/test/vcconsole/testdll
$ cat test_dll.h
#pragma once
#include <iostream>
struct foo {
__attribute__((selectany)) static void print_addr() {
inst.print();
std::cout<< "print_addr address: "<< foo::print_addr<< std::endl;
}
void print() {
std::cout<< "Instance address: "<< this<< std::endl;
}
static foo inst;
};
__attribute__((selectany)) foo foo::inst;
owent@OWENTOU-PC2 MINGW64 /d/workspace/test/vcconsole/testdll
$ g++ test_dll.cpp -shared -o test_dll.dll
In file included from test_dll.cpp:1:
test_dll.h:6:54: error: 'selectany' attribute applies only to initialized variables with external linkage
6 | __attribute__((selectany)) static void print_addr() {
| ^ GCC accept both |
Yes
Yes, it's unnesessary to add |
For it appears we have a C++ language issue which maybe means that the Windows solution doesn't "work": Defining the static member out of line in a header file is an ODR violation and so technically is not allowed by the language. viz:
This code works (and I assume the variant using Curiously the solution with a static inside the member function doesn't violate ODR from the language perspective (afaict). I'm not sure if this practically matters, but it is worth knowing because the ODR rule is there precisely to avoid exactly the kind of issue we're trying to solve here. |
Also worth noting that "weak" symbol references are really not what we are trying to achieve here so I don't know why people keep on using that terminology. We are trying to ensure that we only pick out a single symbol reference from multiple of the same name (and size) potentially in different objects. That is not what weak references are primarily for: The major point of a weak reference is to be able to override one implementation of something with a more specialized implementation, or to allow the weak implementation not to exist at all - neither of these fits what we need here. |
hi all, sorry for the late reply (was away on holiday) Just a quick note to say that I think the only way to support Windows + DLL is by building the API itself as a dll / library (see explanation by @owent for details). I have been successfully testing this approach in my fork/branch as mentioned in #1105. I suggested a hybrid version (#1105) where the API can be compiled as either header-only or static/dynamic library, but it had a lot of maintenance overhead and was (rightly) rejected as a viable solution. |
hi all - is there a status update on this issue? personally, I think just converting the API to a "regular" library would be easier - especially to attract more contributors with a common model :) I'm kind of biased, though, because I would ideally like to avoid maintaining a fork just to be able to compile on Windows as a DLL :) |
The plan is to limit the scope to Linux for now: #1525 (comment) |
@esigo oh :( |
@meastp I think that supporting Windows requires a DLL so it will come later . It is possible to make the library inclusion 'automatic' by using a |
@astitcher yes, I think it does require a dll. Is the plan still to split the API into a header-only library and a static/dll library for everything that needs to be in a static/dll library to work properly? Library inclusion is not a problem, I wouldn't worry about it at this stage. A working DLL is more important ;) |
There are two separate issues to address here. For windows, the opentelemetry-cpp code does not support building a windows DLL (as far as I understand), so this needs to be resolved first, independently of uniqueness issues for singletons. I did not investigate first hand, but from discussions with peers there are specific concerns about memory allocation and de allocations to be addressed. For all platforms, singletons must be singletons, so that the opentelemetry-cpp shared library / DDL for the SDK can be used by many components, not just one. Work is in progress to resolve the second part, for all platforms that currently can be built as shared libraries (so, no windows yet) Once the first part is resolved (windows DDL), this will be re evaluated to also address singletons issues for windows. |
This fix is for gcc and clang only.
Describe your environment
We are using the opentelemetry tracing API (and SDK) inside a Linux C++ shared library (.so file) built with the standard gcc toolchain. This library is provides messaging using the AMQP protocol (the library is Qpid Proton C++). We create tracing spans internal to the library to represent the lifecycle of messages sent and received by the library. The contexts associated with the spans are propagated with the messages so that distributed tracing is achieved.
The intention is that any application using our library should be able to create its own tracing spans that are naturally related to the library generated spans (by default using the active span as the parent span). This requires that the TracingProvider accessible to the library and to the application using it is the same TracingProvider.
Steps to reproduce
Our library is built with
-fvisibility=hidden
. This means that any symbol that should be exported from the library needs some extra annotation (__attribute__((visibility("default")))
for gcc/clang). We use this to carefully control the visible API/ABI from our library and to be sure that internal details aren't visible outside.However the way that the singleton inside
Provider
is implemented in a header file means that it is defined as a static symbol inside the inline member functionProvider::GetProvider
. This means that there are duplicate symbols for every time theprovider.h
header file is included in for example our library and an application that uses it.If the symbols are all visible to the linker at link and runtime then the symbol will be effectively deduplicated and there will be only one used - to my understanding the application symbol takes precedence, but as long as there is only one it will work correctly.
However when the library symbols are hidden by default as in our library there is no way for the runtime loader to know that there should only be one version of the singleton and what happens is that the library and the application end up with different
TracerProvider
s.This is a serious problem for us as we ship our library with the symbols exported explicitly for very good reason and so we can't just use
Provider::GetTracerProvider
in the library and application and have it work correctly together.What is the expected behavior?
The way I would expect this to work is that the singletons are not defined in header files and so duplicated in every object file that includes them, but rather the header file only declares
Provider::GetProvider
and it is defined in one of the opentelemetry-cpp libraries (the common lib?). I understand that this moves the singleton out of what you think of as the 'API' and into the 'SDK'. But I really think that architecturally singletons are actuall implementation artifacts not API artifiacts.Additional context
I note that this is not an issue at all if you statically link everything as in that case the final link phase takes care of the issue.
I also note that this should be an issue using DLLs on windows as by symbols are not exported by default (afair) and also need to be explicitly marked to be exported (using
__declspec(dllexport)
The text was updated successfully, but these errors were encountered: