Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AtomicPtr without losing pointer provenance #70765

Closed
wants to merge 3 commits into from

Conversation

RalfJung
Copy link
Member

@RalfJung RalfJung commented Apr 4, 2020

This came up while debugging why Miri kept reporting memory leaks on Windows. The leaked memory came from libstd static sys:common::mutex::Mutex. When such a mutex gets initialized, allocation happens (on Windows XP, which is also the code path that Miri uses), and then the pointer to that allocation is cast to an integer and stored in static memory. When doing leak checking, Miri cannot tell that there is a pointer here, it just sees an integer, and thus the memory that the integer "points to" is considered leaked.

The easiest fix would be to transmute the pointer to an integer instead of casting it. That avoids dropping the provenance which Miri uses to later do leak checking.

However, since this could happen with anyone handling pointers atomically, I wanted to use AtomicPtr instead -- which, as it turns out, internally also performed ptr-to-int casts, so I replaced those by transmutes.

Ideally we'd call atomic_store and friends at pointer type, but that leads to a codegen error:

error[E0511]: invalid monomorphization of `atomic_load` intrinsic: expected basic integer type, found `*mut ()`
    --> /home/r/src/rust/rustc.2/src/libcore/sync/atomic.rs:2277:19
     |
2277 |         SeqCst => intrinsics::atomic_load(dst),
     |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If that codegen check can be relaxed to also accept pointers, we could avoid transmutes and casts entirely, but that goes beyond my LLVM knowledge.

(In principle whether or not provenance is dropped should also make a difference for optimizations in LLVM, but currently LLVM is buggy in that area and does not make this distinction.)

@rust-highfive
Copy link
Collaborator

r? @kennytm

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 4, 2020
@kennytm
Copy link
Member

kennytm commented Apr 4, 2020

I think we should make lock a union of c::SRWLOCK and AtomicPtr<ReentrantMutex> to get rid of all those casting.

This in general LGTM, but I find that ptr as usize having subtly different behavior than transmute(ptr): usize is too easy to misuse, in particular transmute() does not mention that it preserves provenance.

@RalfJung
Copy link
Member Author

RalfJung commented Apr 4, 2020

I find that ptr as usize having subtly different behavior than transmute(ptr): usize is too easy to misuse, in particular transmute() does not mention that it preserves provenance.

Well, transmute says it "reinterprets" the value and that it is "transmute is semantically equivalent to a bitwise move of one type into another".

The problem is that most people think a ptr-to-int cast also just reinterprets the value, but unfortunately, that is wrong. If anything we should adjust the docs for casts, whatever would be the right place for that. However, it also also hard to say anything precise here while LLVM is still full of bugs in this area.

@Aaron1011
Copy link
Member

Aaron1011 commented Apr 5, 2020

@RalfJung: If I understand correctly, this change is only about preventing false-positives in Miri's leak checker, right? That is, casting to an integer is fine correctness-wise?

If this is the case, then I'm concerned that Miri could run into this same issue on other projects. As @kennytm pointed out, the transmute cast distinction seems extremely subtle, and does not seem to be clearly documented at the moment. I think it will be difficult to convince other projects to add unsafe code if the only purpose to avoid certain Miri false-positives, when using a cast is both safe and correct.

I have no objection to this PR itself, but I think it would be a good idea to open an issue on Miri. It might make sense for Miri to store some extra information during int-ptr casts to allow the leak checker to avoid false positives (though I'm not sure how difficult this would be).

@RalfJung
Copy link
Member Author

RalfJung commented Apr 6, 2020

If I understand correctly, this change is only about preventing false-positives in Miri's leak checker, right? That is, casting to an integer is fine correctness-wise?

Well. The full answer is complicated. Casting and transmuting to an integer are not the same operation even if we entirely ignore Miri, and pretending that they are has lead to hard to fix and still outstanding bugs in multiple compilers:

So yes, this is dangerously subtle, but it is a sad fact of low-level languages such as C, C++ and Rust today that we already have to make a difference between casts and transmutes from pointers to integers (independent of whatever Miri is doing), and that hardly anyone is aware of that fact.

In light of this, I find it very odd that a type named AtomicPtr would internally lose the provenance of the pointers it is working with.

However, as the paper I linked above shows, using integers here (that carry pointer values, still with provenance) is not entirely free of problems either (there are some arithmetic optimizations on integers that go wrong if integers are allowed to actually be pointers). Really what we would want is to be able to atomically load and store pointers in memory without changing their type ever -- but LLVM's atomicrmw only works with integer types, probably because LLVM devs thought of ptr-int-casts as NOPs, so they saw no need for truly ptr-typed atomic instructions.

I have no objection to this PR itself, but I think it would be a good idea to open an issue on Miri. It might make sense for Miri to store some extra information during int-ptr casts to allow the leak checker to avoid false positives (though I'm not sure how difficult this would be).

That might work for some special cases like this one, at the expense of making the core Miri engine notably more complicated (as now there would be things that do not have provenance in the Abstract Machine, but do have provenance in Miri). For a case that is already harder than the one here, imagine that the program stored the pointer cast to an integer and with the last bit of the pointer (guaranteed to be zero due to alignment) used for storing some boolean flag. For more extreme cases, imagine the XOR of two pointers is stored together with one pointer (so the other can be restored by another XOR). Or maybe the program is using some clever packing scheme to fit more pointers into the same memory.

This is a never-ending game of whack-a-mole, which I am not sure is worth pursuing. I like having a clear specification for Miri's leak checker: it will consider pointers, but not integers, when checking which allocations are live. In that sense I lean towards calling this a "wontfix" issue on the Miri side, but could be convinced otherwise.

(I should mention that another option would be to make ptr-to-int casts a NOP again in Miri, and properly preserve provenance until e.g. an arithmetic operation happens. This is different from an approach where we do make the cast lose provenance as far as the program execution is concerned, and preserve provenance solely for the purpose of leak checking. This alternative approach is not currently feasible, but we have some plans that could enable this in the future. However, that would make Miri most likely differ from the actual behavior of the Rust Abstract Machine, where I do not think we can get away with such an approach if we want to use LLVM correctly as a backend.)

If this is the case, then I'm concerned that Miri could run into this same issue on other projects.

Fair. I have several responses to that. First of all, even though Miri is already used on a dozen or more libraries for UB testing, as far as we know none had this particular problem so far. Secondly, if it becomes a problem for a library, it is legitimate to disable the leak check locally if they prefer that over adjusting their code to preserve provenance. And finally, the code this is patching here is somewhat special as it affects every program running on a Windows target.

@jonhoo
Copy link
Contributor

jonhoo commented Apr 10, 2020

I am very much in favor of mentioning this in the documentation somewhere, though I don't quite know where. Maybe in the nomicon too? We could, perhaps, even make it a clippy lint? As someone who does write a decent amount of unsafe code, I would be perfectly happy changing usize casts to transmutes if it helps tools that check for correctness, I just need to know that that's necessary in the first place!

@RalfJung
Copy link
Member Author

RalfJung commented Apr 10, 2020

I am honestly somewhat worried about promoting the "transmute to usize" approach too much, because at least some variants of solving the "type punning" problem described in §8 of this paper actually involve making ptr-to-usize transmutes return poison/undef...

@joelpalmer joelpalmer added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 21, 2020
@RalfJung
Copy link
Member Author

I am giving up on this approach for now. A better solution to enable leak checks on Windows might be to implement the modern RwLock APIs, that side-steps the issue.

Atomic pointer operations remain a problem independent of that, but maybe it's better to make Miri track provenance some more than to introduce transmutes in the code here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants