-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
read_target_uint => read_target_ptr outside rustc_middle (nfc) #84826
read_target_uint => read_target_ptr outside rustc_middle (nfc) #84826
Conversation
This function has no external users at the moment, so lock it down. This logic can be derived from std methods, so it's no real loss.
Some changes occured to rustc_codegen_cranelift cc @bjorn3 |
Most public callers of read_target_uint are using it for the purpose of acquiring a pointer-sized uint. Using read_target_ptr instead makes this usage more clear, and exploits the knowledge of the caller to make the function into a clear "table lookup". This enables making read_target_uint private, and is intended to help simplify refactoring the interpreter in rustc_middle. In cg_clif, one usage of read_target_uint was to obtain a u32, which has been specialized to match the exact numeric width.
3229bc7
to
c04c0b9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// as the modules using this don't support wider pointer values | ||
/// FIXME(jubilee): Move this out of here and closer to the things using it! | ||
#[inline] | ||
pub fn read_target_ptr(endian: Endian, ptr_size: usize, bytes: &[u8]) -> u64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably take Size
instead of usize
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Size is a u64, so on a 32-bit host that is over-weighting this input (even tho' the output should be u64 regardless). Might not matter after inlining, I guess.
I thought about adding an enum to rustc_target so that can be matched on directly for more type-strictness, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just standard to use Size
to describe such a property of the target, and the consistency helps avoid bugs. Also 32-bit host performance shouldn't be a significant concern (even tho memory usage in data structures may be).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@workingjubilee, this code doesn't seem terribly idiomatic, and if your concern is input overhead, why not just call bytes.len()
? Besides, it's definitely going to be optimized away during inlining, and if you call bytes.len()
then it's easier for the optimizer to see what the variable means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@workingjubilee Oh, you're probably right! Even taking an input seems a bit silly, now that I think of it. I must be reading too many C codebases lately. 😅
); | ||
read_target_uint(endianness, bytes).unwrap() | ||
read_target_ptr(endianness, ptr_size, bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name makes it seem like it reads the whole pointer instead of just the addend, leaving the target to which the addend is relative out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
read_target_usize
perhaps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe
Co-authored-by: bjorn3 <bjorn3@users.noreply.github.com>
r? @oli-obk (highfive ignored the previous re-assignment) |
(Endian::Little, &[a, b, c, d]) => u32::from_le_bytes([a, b, c, d]), | ||
(Endian::Big, &[a, b, c, d]) => u32::from_be_bytes([a, b, c, d]), | ||
(_, _) => unreachable!(), | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has become significantly more complex when the complexity was hidden behind a function before. Why was this change made?
I'm afraid I don't see where your future changes are going. To me this PR just makes things more complex without any (to me) visible gain. While I do appreciate having more concrete functions for usize reads, that can be achieved by adding a method that calls the current one and trims the result. Can you talk a bit more about your motivation for this change and the future plans? As it is I would prefer not to accept this PR |
(2, Endian::Little) => u16::from_le_bytes(bytes.try_into().unwrap()) as u64, | ||
(2, Endian::Big) => u16::from_be_bytes(bytes.try_into().unwrap()) as u64, | ||
(4, Endian::Little) => u32::from_le_bytes(bytes.try_into().unwrap()) as u64, | ||
(4, Endian::Big) => u32::from_be_bytes(bytes.try_into().unwrap()) as u64, | ||
(8, Endian::Little) => u64::from_le_bytes(bytes.try_into().unwrap()), | ||
(8, Endian::Big) => u64::from_be_bytes(bytes.try_into().unwrap()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like explicitly handling several possibilities using host integers, when it can be done by e.g. copying bytes
into an [0u8; 8]
, reversing the ..bytes.len()
slice of the array if Endian::Big
, and then using u64::fom_le_bytes
.
@oli does miri already have some uniform (over integer sizes) code for doing this?
Actually, why not just keep using read_target_uint
? It seems like the correct abstraction, am I missing something? If you don't want it to part of the rustc_mir::mir::interpret
interface, we just have to move it around a bit, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does miri already have some uniform (over integer sizes) code for doing this?
Yes, as you noted, read_target_uint is that abstraction. I do see the point in an abstraction returning u64 for usize purposes, but imo that should just be a wrapper around the existing functions.
As to where to place them... if rustc_middle doesn't need them they can go to rustc_mir, otherwise idk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I definitely was overthinking this, in retrospect. Anyways, my main thought was that it should probably exist in, say, rustc_target
or something like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this probably isn't acceptable without changes, on review. I'm not sure why this @workingjubilee person submitted it. Perhaps to start a conversation? Couldn't they have just poked you on Zulip or something? :^)
But if I had to guess, I would surmise that they were thinking about how to minimize enmeshment between the codegen backends and rustc_middle
and had wound up tinkering with this for a bit as a straw example of what reducing that entanglement might look like. Though it's obvious to me, thinking about it now, that such an effort is probably doomed per se as of course the codegen backends have to depend on rustc_middle
! However, probably some pieces of the miri interpreter and such could be factored out, or if that would expose too much, then alternatively pieces of rustc_middle
like this that are just widely useful and could go into more "utility-oriented" crates, minimizing the weight of what is in the critical path.
/// as the modules using this don't support wider pointer values | ||
/// FIXME(jubilee): Move this out of here and closer to the things using it! | ||
#[inline] | ||
pub fn read_target_ptr(endian: Endian, ptr_size: usize, bytes: &[u8]) -> u64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@workingjubilee, this code doesn't seem terribly idiomatic, and if your concern is input overhead, why not just call bytes.len()
? Besides, it's definitely going to be optimized away during inlining, and if you call bytes.len()
then it's easier for the optimizer to see what the variable means.
/// as the modules using this don't support wider pointer values | ||
/// FIXME(jubilee): Move this out of here and closer to the things using it! | ||
#[inline] | ||
pub fn read_target_ptr(endian: Endian, ptr_size: usize, bytes: &[u8]) -> u64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@workingjubilee Oh, you're probably right! Even taking an input seems a bit silly, now that I think of it. I must be reading too many C codebases lately. 😅
(2, Endian::Little) => u16::from_le_bytes(bytes.try_into().unwrap()) as u64, | ||
(2, Endian::Big) => u16::from_be_bytes(bytes.try_into().unwrap()) as u64, | ||
(4, Endian::Little) => u32::from_le_bytes(bytes.try_into().unwrap()) as u64, | ||
(4, Endian::Big) => u32::from_be_bytes(bytes.try_into().unwrap()) as u64, | ||
(8, Endian::Little) => u64::from_le_bytes(bytes.try_into().unwrap()), | ||
(8, Endian::Big) => u64::from_be_bytes(bytes.try_into().unwrap()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I definitely was overthinking this, in retrospect. Anyways, my main thought was that it should probably exist in, say, rustc_target
or something like that.
☔ The latest upstream changes (presumably #85376) made this pull request unmergeable. Please resolve the merge conflicts. |
@workingjubilee Ping from triage, what's next steps here? |
@workingjubilee Ping from triage. Still has a merge conflict |
Ah sorry. Missed pings! Going to close this and maybe return to a slightly sideways idea. |
Uses a more narrow function for the actual purposes the callers have put it to, and locks down some previously public functions into private ones. A small refactoring with no functional changes, but hopefully makes future ones easier down the line.
I want to be sure the refactoring in cg_clif is correct, and if any further improvement can be made there, or if there's an obvious better place for read_target_ptr to live, so
r? @bjorn3