-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AtomicPtr without losing pointer provenance #70765
Conversation
r? @kennytm (rust_highfive has picked a reviewer for you, use r? to override) |
I think we should make This in general LGTM, but I find that |
Well, transmute says it "reinterprets" the value and that it is "transmute is semantically equivalent to a bitwise move of one type into another". The problem is that most people think a ptr-to-int cast also just reinterprets the value, but unfortunately, that is wrong. If anything we should adjust the docs for casts, whatever would be the right place for that. However, it also also hard to say anything precise here while LLVM is still full of bugs in this area. |
@RalfJung: If I understand correctly, this change is only about preventing false-positives in Miri's leak checker, right? That is, casting to an integer is fine correctness-wise? If this is the case, then I'm concerned that Miri could run into this same issue on other projects. As @kennytm pointed out, the I have no objection to this PR itself, but I think it would be a good idea to open an issue on Miri. It might make sense for Miri to store some extra information during int-ptr casts to allow the leak checker to avoid false positives (though I'm not sure how difficult this would be). |
Well. The full answer is complicated. Casting and transmuting to an integer are not the same operation even if we entirely ignore Miri, and pretending that they are has lead to hard to fix and still outstanding bugs in multiple compilers:
So yes, this is dangerously subtle, but it is a sad fact of low-level languages such as C, C++ and Rust today that we already have to make a difference between casts and transmutes from pointers to integers (independent of whatever Miri is doing), and that hardly anyone is aware of that fact. In light of this, I find it very odd that a type named However, as the paper I linked above shows, using integers here (that carry pointer values, still with provenance) is not entirely free of problems either (there are some arithmetic optimizations on integers that go wrong if integers are allowed to actually be pointers). Really what we would want is to be able to atomically load and store pointers in memory without changing their type ever -- but LLVM's
That might work for some special cases like this one, at the expense of making the core Miri engine notably more complicated (as now there would be things that do not have provenance in the Abstract Machine, but do have provenance in Miri). For a case that is already harder than the one here, imagine that the program stored the pointer cast to an integer and with the last bit of the pointer (guaranteed to be zero due to alignment) used for storing some boolean flag. For more extreme cases, imagine the XOR of two pointers is stored together with one pointer (so the other can be restored by another XOR). Or maybe the program is using some clever packing scheme to fit more pointers into the same memory. This is a never-ending game of whack-a-mole, which I am not sure is worth pursuing. I like having a clear specification for Miri's leak checker: it will consider pointers, but not integers, when checking which allocations are live. In that sense I lean towards calling this a "wontfix" issue on the Miri side, but could be convinced otherwise. (I should mention that another option would be to make ptr-to-int casts a NOP again in Miri, and properly preserve provenance until e.g. an arithmetic operation happens. This is different from an approach where we do make the cast lose provenance as far as the program execution is concerned, and preserve provenance solely for the purpose of leak checking. This alternative approach is not currently feasible, but we have some plans that could enable this in the future. However, that would make Miri most likely differ from the actual behavior of the Rust Abstract Machine, where I do not think we can get away with such an approach if we want to use LLVM correctly as a backend.)
Fair. I have several responses to that. First of all, even though Miri is already used on a dozen or more libraries for UB testing, as far as we know none had this particular problem so far. Secondly, if it becomes a problem for a library, it is legitimate to disable the leak check locally if they prefer that over adjusting their code to preserve provenance. And finally, the code this is patching here is somewhat special as it affects every program running on a Windows target. |
I am very much in favor of mentioning this in the documentation somewhere, though I don't quite know where. Maybe in the nomicon too? We could, perhaps, even make it a clippy lint? As someone who does write a decent amount of unsafe code, I would be perfectly happy changing usize casts to transmutes if it helps tools that check for correctness, I just need to know that that's necessary in the first place! |
I am honestly somewhat worried about promoting the "transmute to |
I am giving up on this approach for now. A better solution to enable leak checks on Windows might be to implement the modern RwLock APIs, that side-steps the issue. Atomic pointer operations remain a problem independent of that, but maybe it's better to make Miri track provenance some more than to introduce |
This came up while debugging why Miri kept reporting memory leaks on Windows. The leaked memory came from libstd static
sys:common::mutex::Mutex
. When such a mutex gets initialized, allocation happens (on Windows XP, which is also the code path that Miri uses), and then the pointer to that allocation is cast to an integer and stored in static memory. When doing leak checking, Miri cannot tell that there is a pointer here, it just sees an integer, and thus the memory that the integer "points to" is considered leaked.The easiest fix would be to transmute the pointer to an integer instead of casting it. That avoids dropping the provenance which Miri uses to later do leak checking.
However, since this could happen with anyone handling pointers atomically, I wanted to use
AtomicPtr
instead -- which, as it turns out, internally also performed ptr-to-int casts, so I replaced those by transmutes.Ideally we'd call
atomic_store
and friends at pointer type, but that leads to a codegen error:If that codegen check can be relaxed to also accept pointers, we could avoid transmutes and casts entirely, but that goes beyond my LLVM knowledge.
(In principle whether or not provenance is dropped should also make a difference for optimizations in LLVM, but currently LLVM is buggy in that area and does not make this distinction.)