-
Notifications
You must be signed in to change notification settings - Fork 294
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
thread_local variables cause data races (UB) when used with userver #242
Comments
We now provide However, third-party libraries still may and do use |
We've disabled LTO by default due to ongoing issues with third-party libraries. For example, when linking Jemalloc statically, the following code causes a data race: void* smth = malloc(...);
engine::Yield();
free(smth); We are currently discussing the possibility of implementing a "fiber-safe optimizations" flag in Clang. This will take several months. |
Disable lto by default #242 I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/ Tests: протестировано CI 8d6da950fac0d2dd020b334ce68867c312dc16a2 Pull Request resolved: #516
We've started the discussion on the topic at https://discourse.llvm.org/t/misoptimization-of-tls-and-attibute-const-in-stackful-coroutines-ffunction-thread-migration/80081 Also, there's a chance that |
Minimal reproducer:
Let's say that after the context switch, the task is scheduled on another thread.
It's expected then that after
Yield
, the new instance ofx
(which is local to the current thread) will be incremented. But that's not what happens! The compiler can't imagine the situation that the address of athread_local
variable changes, so it only loads the address ofx
once and caches it across the suspension point. Then it keeps using the variable of another thread, causing a data race, because another thread may use it at the same time (it has every right to do so, because it is its own instance of thethread_local
variable).This bug has been documented with little reaction from compiler maintainers:
Recent versions of Clang (at least Clang-14) do this optimization under LTO even when variable access happens inside a function.
Marking the variable
atomic
,volatile
, addingasm volatile("" ::: "memory");
does not help.The only thing that helps is to mark the function that accesses the variable as
noinline
, as we've done in bug core: prevent clang 14 LTO UB thread_local access. We, however, cannot expect all the user and library code to do that.A reproducer test has been added in feat engine: demonstrate that thread_local variables don't work with userver.
The text was updated successfully, but these errors were encountered: