-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UB due to lack of a full fence in full_fence (on x86) #70
Comments
The MSRV of the |
Filed #71 to resolve this.
When this crate was first released, inline assembly was not stable. And for a long time after that, we used MSRV that older than 1.59. Since we have recently raised the MSRV of some smol-rs crates to 1.63, I think this is an appropriate time to address this issue.
The C++20 memory model states that the SC operation "establish a single total modification order of all atomic operations that are so tagged". https://en.cppreference.com/w/cpp/atomic/memory_order#Sequentially-consistent_ordering While it may be possible to "merge two atomic operations when there are only pure operations in the middle" as described in the article you linked to, I don't think it is allowed to completely remove independent SC operations (unless the compiler can prove that there are never any SC operations that could be synchronized with it by inspecting all programs). (It may be possible to replace SC CAS with SC fence or other SC RMW, but it doesn't lose the property we need here.) So, unless the C++ memory model changes the property mentioned above in the future, I expect that no wrong code will be generated here, but inline assembly is the "definitely correct" choice here, and I agree that using inline assembly is definitely preferable. |
Ah, if it's just an MSRV issue that is great! And thanks @taiki-e for getting the ball rolling with that PR. :) If the
Yeah, but the way that order interacts with the other orders isn't always entirely how one would expect, I think... so it's certainly not obvious to me that compilers are not allowed to drop the access. In particular this order AFAIK does not participate when determining synchronizes-with edges -- but an SC fence has the effects of a release-acquire fence and can hence establish synchronizes-with edges. You have to study the formalizations of this model, I don't think the English prose can be precise enough for subtle questions like this. |
LLVM certainly does not agree with your reasoning: this becomes a NOP. pub fn test() {
let x = AtomicUsize::new(0);
x.load(Ordering::SeqCst);
} Stores also get removed. I think they just didn't implement the pass that removes dead RMWs. |
Oh, that is interesting. I can reproduce it, even with a very old rustc, even if I used compiler_fence around SC load/store. https://godbolt.org/z/xsvG5Ke58 |
I was recently made aware of this code:
event-listener/src/notify.rs
Lines 562 to 577 in 0ea4641
As the comment says, this is UB -- and the myth that no sane compiler is going to optimize atomics is a myth. However, I admit I don't know enough about this specific case to say whether this is a risk for this crate.
What I am wondering, why doesn't this use inline assembly? That is the intended mechanism to force the compiler to generate code in a particular way, and it is clearly sound here. The alternative is even mentioned in the comment, but unfortunately without an explanation of why it was discarded.
The text was updated successfully, but these errors were encountered: