miri engine: basic support for pointer provenance tracking #54461
The mono hash map should almost certainly go in librustc_data_structures
It is awfully specific in its API -- quirky and limited to what
As a general note, I'd like to not merge such changes until multiple people have looked at the problem, and decided that the new data structure is needed. EDIT: I looked at it - we planned to do this ages ago, but we actually settled on using arenas instead - could you use a
I considered using an arena, but then when stack frames get popped or boxes dropped, their allocations would not get removed. I could try recycling allocations from the arena, but at that point I'd also write a new data structure.
Okay, why not use
You mean, we should use
@RalfJung Oops, I read your PR description again, so this is for allocating Why not use a slightly different map than
I don't understand how that would help at all? It'd still be I know you struck out that part of your comment, but my plans are described at https://www.ralfj.de/blog/2018/08/07/stacked-borrows.html :D
@RalfJung Sorry, I keep getting confused. Something workable could be keeping a map from
That's not enough information. I need to track every pointer, so I'd need another relocation-like map somewhere.
I think that would get really messy, this is nicer.
Adding new unsafe code is not nice, IMO. However, if you want to use unsafe code...
Not sure where you think a We could have a
Having talked out the various options with @RalfJung, I have to say that
c82f505 to 5af85da
@eddyb @oli-obk @nikomatsakis This is currently blocking progress on my main internship project. How do we proceed?
Noticed two typos
src/librustc_mir/const_eval.rs
Outdated
fn static_with_default_tag(
    alloc: &'_ Allocation
) -> Cow<'_, Allocation<Self::PointerTag>> {
    // We do not use a tag so we can just cheapyl forward the reference
*cheaply
/// the appropriate tags on each pointer.
///
/// This should avoid copying if no work has to be done! If this returns an owned
/// allocation (because a copy had to be done to add the tags), machibe memory will
*machine
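To make the "avoid copying if no work has to be done" contract concrete, here is a minimal, self-contained sketch of how such a hook could use `Cow`. The simplified `Allocation`, the `Machine` trait shape, and the `ConstEval`/`Tagging` types are assumptions made for illustration, not the actual rustc definitions; only the `static_with_default_tag` signature is taken from the diff above.

```rust
use std::borrow::Cow;

/// Simplified, hypothetical stand-in for an allocation: raw bytes plus a
/// list of (offset, tag) entries for the pointers stored inside it.
#[derive(Clone, Debug, PartialEq)]
pub struct Allocation<Tag = ()> {
    pub bytes: Vec<u8>,
    pub relocations: Vec<(u64, Tag)>,
}

/// Hypothetical, cut-down machine trait with only the hook under discussion.
pub trait Machine {
    type PointerTag: Clone + 'static;

    /// Convert an untagged static allocation into the machine's tagged form.
    fn static_with_default_tag(alloc: &Allocation) -> Cow<'_, Allocation<Self::PointerTag>>;
}

/// A machine without tags can forward the reference without copying.
pub struct ConstEval;

impl Machine for ConstEval {
    type PointerTag = ();

    fn static_with_default_tag(alloc: &Allocation) -> Cow<'_, Allocation> {
        Cow::Borrowed(alloc)
    }
}

/// A machine with real tags has to build a new, owned allocation.
pub struct Tagging;

impl Machine for Tagging {
    type PointerTag = u32;

    fn static_with_default_tag(alloc: &Allocation) -> Cow<'_, Allocation<u32>> {
        Cow::Owned(Allocation {
            bytes: alloc.bytes.clone(),
            // Attach a default tag (0 here, chosen arbitrarily) to every pointer.
            relocations: alloc.relocations.iter().map(|&(off, ())| (off, 0)).collect(),
        })
    }
}
```

In this sketch the caller can match on the returned `Cow` to decide whether the result needs to be stored somewhere (the `Owned` case) or whether the original reference can be handed out directly (the `Borrowed` case).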
@RalfJung my vote is to land as is; we can always revisit it
I do agree with @eddyb that it's a good idea to be very cautious with introducing new unsafe abstractions. However, this one feels quite self-contained, and I don't see any other solution to the problem that doesn't involve large scale refactorings (I think there are various approaches one could take if we did want to do large-scale refactoring, but we could also do that as a follow-up, and it's not clear that such a thing would be good). Alternative solutions I can see:
Other than that, I don't think there are other options.
Rebased and fixed typos.
79ab37b to 1d65fe6
src/librustc_mir/const_eval.rs
Outdated
fn static_with_default_tag(
    alloc: &'_ Allocation
) -> Cow<'_, Allocation<Self::PointerTag>> {
    // We do not use a tag so we can just cheapyl forward the reference
cheapyl
2eb29e5 to 1cf357d
Thanks to @oli-obk I was able to get rid of the unsafe code here. :) The actual map used by memory to manage its allocations is now abstracted away through the
0293a45 to 9a9dbff
Rebased, and I hope tidy is also happy now.
    write!(msg, "└{0:─^1$}┘ ", target, relocation_width as usize).unwrap();
    pos = i + self.pointer_size();
}
trace!("{}", msg);
you'll probably want to do tag-printing here at some point, right?
Yeah I think so, but I am not sure what the best way is to do that (and to avoid printing useless `()`).
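For readers unfamiliar with the format string in the diff above: `{0:─^1$}` prints argument 0 centered, padded with `─`, to the width given by argument 1. A small sketch, where the helper name `relocation_marker` and the use of a plain `&str` target are assumptions for illustration:

```rust
use std::fmt::Write;

/// Hypothetical helper: render one relocation marker of the given inner width.
fn relocation_marker(target: &str, width: usize) -> String {
    let mut msg = String::new();
    // `{0:─^1$}`: argument 0, fill character '─', centered ('^'),
    // width taken from argument 1 (`1$`).
    write!(msg, "└{0:─^1$}┘ ", target, width).unwrap();
    msg
}
```

With this helper, `relocation_marker("a5", 6)` yields `└──a5──┘ ` (including the trailing space). Printing a tag here would mean widening `target`, which is where skipping a useless `()` becomes a formatting question.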
// Downcasts only change the layout
assert_eq!(base.extra, None);
assert!(base.meta.is_none());
the `assert_eq` was on purpose, to make sure that `base.extra` was debug printed if the assertion failed
I wanted to avoid an `Eq` bound on tags. But I now added that to be able to put them into `HashMap`, so I guess I could change these all back.
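The trade-off in this exchange can be sketched as follows. The function names are hypothetical; the point is only which trait bounds each style of assertion forces onto the tag type.

```rust
use std::fmt::Debug;

/// `assert_eq!` needs `PartialEq` (to compare) and `Debug` (to print both
/// sides on failure), so the tag type must carry those bounds.
fn check_no_extra_eq<Tag: PartialEq + Debug>(extra: Option<Tag>) {
    assert_eq!(extra, None);
}

/// `assert!(.. .is_none())` needs no bounds on the tag at all, but a failure
/// only reports the failed expression, not the offending value.
fn check_no_extra_unbounded<Tag>(extra: &Option<Tag>) {
    assert!(extra.is_none());
}
```

Once tags need `Eq` and `Hash` anyway (for use as `HashMap` keys or values that get compared), the first form costs nothing extra and gives the more useful failure message.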
@bors r+
📌 Commit bc9435d has been approved by
miri engine: basic support for pointer provenance tracking

This enriches pointers with a new member, `tag`, that can be used to do provenance tracking. This is a new type parameter that propagates up through everything. It defaults to `()` (no tag), which is also the value used by CTFE -- but miri will use another type.

The only actually interesting piece here, I think, is what I had to do in the memory's `get`. The problem is that `tcx` (storing the allocations for statics) uses `()` for provenance information. But the machine might need another tag. The machine has a function to do the conversion, but if a conversion actually happened, we need to store the result of this *somewhere* -- we cannot return a pointer into `tcx` as we usually would.

So I introduced `MonoHashMap` which uses `RefCell` to be able to insert new entries even when we just have a shared ref. However, it is important that we can also return shared refs into the map without holding the `RefCell` open. This is achieved by boxing the values stored in the map, so their addresses remain stable even when the map's table gets reallocated. This is all implemented in `mono_hash_map.rs`.

NOTE: This PR also contains the commits from #54380 (comment). Only the [last two commits](https://github.com/rust-lang/rust/pull/54461/files/8e74ee0998a5b11f28d61600dbb881c7168a4a40..HEAD) are new.
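As a rough illustration of a tag type parameter that defaults to `()`, consider the sketch below. The field names, `AllocId`, and the helper methods are assumptions for illustration and are not the actual rustc definitions.

```rust
/// Hypothetical stand-in for an allocation identifier.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct AllocId(pub u64);

/// A pointer carrying an extra `tag`. With the default `Tag = ()`, the tag is
/// a zero-sized type and adds nothing to the pointer's size.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct Pointer<Tag = ()> {
    pub alloc_id: AllocId,
    pub offset: u64,
    pub tag: Tag,
}

impl Pointer<()> {
    /// Build an untagged pointer (what a tag-free machine would use).
    pub fn new(alloc_id: AllocId, offset: u64) -> Self {
        Pointer { alloc_id, offset, tag: () }
    }

    /// Attach a tag, producing a pointer of a different type.
    pub fn with_tag<Tag>(self, tag: Tag) -> Pointer<Tag> {
        Pointer { alloc_id: self.alloc_id, offset: self.offset, tag }
    }
}

impl<Tag> Pointer<Tag> {
    /// Drop the tag again, e.g. when handing the pointer to tag-agnostic code.
    pub fn erase_tag(self) -> Pointer<()> {
        Pointer { alloc_id: self.alloc_id, offset: self.offset, tag: () }
    }
}
```

In this sketch `std::mem::size_of::<Pointer>()` equals the size of the two `u64` fields alone, while `Pointer<u64>` is one word larger, which is the sense in which the untagged case only adds a zero-sized member.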
☀️ Test successful - status-appveyor, status-travis
Tested on commit rust-lang/rust@2243fab. Direct link to PR: <rust-lang/rust#54461> 💔 miri on windows: test-pass → build-fail (cc @oli-obk @RalfJung @eddyb, @rust-lang/infra). 💔 miri on linux: test-pass → build-fail (cc @oli-obk @RalfJung @eddyb, @rust-lang/infra).
Uh, no, that's odd. It should just have added a bunch of ZST in a couple places. How does one go about debugging such things?
Oh wait. I got rid of interning vtables... that could be it. I left a FIXME to add a cache. I guess I should put implementing that cache further up on my list.^^ I am not sure if vtables really should be put into
This enriches pointers with a new member, `tag`, that can be used to do provenance tracking. This is a new type parameter that propagates up through everything. It defaults to `()` (no tag), which is also the value used by CTFE -- but miri will use another type.

The only actually interesting piece here, I think, is what I had to do in the memory's `get`. The problem is that `tcx` (storing the allocations for statics) uses `()` for provenance information. But the machine might need another tag. The machine has a function to do the conversion, but if a conversion actually happened, we need to store the result of this somewhere -- we cannot return a pointer into `tcx` as we usually would.

So I introduced `MonoHashMap` which uses `RefCell` to be able to insert new entries even when we just have a shared ref. However, it is important that we can also return shared refs into the map without holding the `RefCell` open. This is achieved by boxing the values stored in the map, so their addresses remain stable even when the map's table gets reallocated. This is all implemented in `mono_hash_map.rs`.

NOTE: This PR also contains the commits from #54380 (comment). Only the last two commits are new.
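The idea of a map that accepts insertions through a shared reference while still handing out long-lived shared references to its values can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the contents of `mono_hash_map.rs`: the type name `MonoMap` and the method `get_or_insert_with` are hypothetical, and the soundness argument relies on entries never being removed or replaced through `&self`.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical sketch of an insert-only-through-`&self` map. Values are
/// boxed so that their heap addresses stay fixed when the table regrows.
pub struct MonoMap<K, V>(RefCell<HashMap<K, Box<V>>>);

impl<K: Hash + Eq, V> MonoMap<K, V> {
    pub fn new() -> Self {
        MonoMap(RefCell::new(HashMap::new()))
    }

    /// Return the value for `k`, inserting `make()` first if it is missing.
    /// The `RefCell` borrow ends before this function returns.
    pub fn get_or_insert_with(&self, k: K, make: impl FnOnce() -> V) -> &V {
        let ptr: *const V = {
            let mut map = self.0.borrow_mut();
            let boxed = map.entry(k).or_insert_with(|| Box::new(make()));
            &**boxed as *const V
        };
        // SAFETY (assumption of this sketch): through `&self`, entries are
        // only ever added, never removed or overwritten, and each value lives
        // in its own `Box`, so the pointee stays valid for as long as `self`
        // is borrowed, even if the table itself is reallocated.
        unsafe { &*ptr }
    }
}
```

Any operation that removes or replaces an entry would have to take `&mut self` for this reasoning to hold. If `make` itself tries to use the same map, the `RefCell` panics rather than causing undefined behavior.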