Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

read_target_uint => read_target_ptr outside rustc_middle (nfc) #84826

Closed

Conversation

workingjubilee
Copy link
Member

Uses a more narrow function for the actual purposes the callers have put it to, and locks down some previously public functions into private ones. A small refactoring with no functional changes, but hopefully makes future ones easier down the line.

I want to be sure the refactoring in cg_clif is correct, and if any further improvement can be made there, or if there's an obvious better place for read_target_ptr to live, so

r? @bjorn3

This function has no external users at the moment, so lock it down.
This logic can be derived from std methods, so it's no real loss.
@rust-highfive
Copy link
Contributor

Some changes occured to rustc_codegen_cranelift

cc @bjorn3

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 2, 2021
Most public callers of read_target_uint are using it for the purpose
of acquiring a pointer-sized uint. Using read_target_ptr instead
makes this usage more clear, and exploits the knowledge of the caller
to make the function into a clear "table lookup".

This enables making read_target_uint private, and is intended to help
simplify refactoring the interpreter in rustc_middle.

In cg_clif, one usage of read_target_uint was to obtain a u32, which
has been specialized to match the exact numeric width.
@workingjubilee workingjubilee force-pushed the refactor-target-uint branch from 3229bc7 to c04c0b9 Compare May 2, 2021 16:28
Copy link
Member

@bjorn3 bjorn3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes to cg_clif seem correct. I don't personally see the point of this, but I will leave it to @oli-obk to decide.

r? @oli-obk

/// as the modules using this don't support wider pointer values
/// FIXME(jubilee): Move this out of here and closer to the things using it!
#[inline]
pub fn read_target_ptr(endian: Endian, ptr_size: usize, bytes: &[u8]) -> u64 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably take Size instead of usize.

Copy link
Member Author

@workingjubilee workingjubilee May 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Size is a u64, so on a 32-bit host that is over-weighting this input (even tho' the output should be u64 regardless). Might not matter after inlining, I guess.

I thought about adding an enum to rustc_target so that can be matched on directly for more type-strictness, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just standard to use Size to describe such a property of the target, and the consistency helps avoid bugs. Also 32-bit host performance shouldn't be a significant concern (even tho memory usage in data structures may be).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@workingjubilee, this code doesn't seem terribly idiomatic, and if your concern is input overhead, why not just call bytes.len()? Besides, it's definitely going to be optimized away during inlining, and if you call bytes.len() then it's easier for the optimizer to see what the variable means.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@workingjubilee Oh, you're probably right! Even taking an input seems a bit silly, now that I think of it. I must be reading too many C codebases lately. 😅

);
read_target_uint(endianness, bytes).unwrap()
read_target_ptr(endianness, ptr_size, bytes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name makes it seem like it reads the whole pointer instead of just the addend, leaving the target to which the addend is relative out.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read_target_usize perhaps?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe

Co-authored-by: bjorn3 <bjorn3@users.noreply.github.com>
@bjorn3
Copy link
Member

bjorn3 commented May 2, 2021

r? @oli-obk

(highfive ignored the previous re-assignment)

@rust-highfive rust-highfive assigned oli-obk and unassigned bjorn3 May 2, 2021
(Endian::Little, &[a, b, c, d]) => u32::from_le_bytes([a, b, c, d]),
(Endian::Big, &[a, b, c, d]) => u32::from_be_bytes([a, b, c, d]),
(_, _) => unreachable!(),
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has become significantly more complex when the complexity was hidden behind a function before. Why was this change made?

@oli-obk
Copy link
Contributor

oli-obk commented May 2, 2021

I'm afraid I don't see where your future changes are going. To me this PR just makes things more complex without any (to me) visible gain.

While I do appreciate having more concrete functions for usize reads, that can be achieved by adding a method that calls the current one and trims the result.

Can you talk a bit more about your motivation for this change and the future plans? As it is I would prefer not to accept this PR

Comment on lines +599 to +604
(2, Endian::Little) => u16::from_le_bytes(bytes.try_into().unwrap()) as u64,
(2, Endian::Big) => u16::from_be_bytes(bytes.try_into().unwrap()) as u64,
(4, Endian::Little) => u32::from_le_bytes(bytes.try_into().unwrap()) as u64,
(4, Endian::Big) => u32::from_be_bytes(bytes.try_into().unwrap()) as u64,
(8, Endian::Little) => u64::from_le_bytes(bytes.try_into().unwrap()),
(8, Endian::Big) => u64::from_be_bytes(bytes.try_into().unwrap()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like explicitly handling several possibilities using host integers, when it can be done by e.g. copying bytes into an [0u8; 8], reversing the ..bytes.len() slice of the array if Endian::Big, and then using u64::fom_le_bytes.

@oli does miri already have some uniform (over integer sizes) code for doing this?

Actually, why not just keep using read_target_uint? It seems like the correct abstraction, am I missing something? If you don't want it to part of the rustc_mir::mir::interpret interface, we just have to move it around a bit, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does miri already have some uniform (over integer sizes) code for doing this?

Yes, as you noted, read_target_uint is that abstraction. I do see the point in an abstraction returning u64 for usize purposes, but imo that should just be a wrapper around the existing functions.

As to where to place them... if rustc_middle doesn't need them they can go to rustc_mir, otherwise idk

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I definitely was overthinking this, in retrospect. Anyways, my main thought was that it should probably exist in, say, rustc_target or something like that.

Copy link
Member Author

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this probably isn't acceptable without changes, on review. I'm not sure why this @workingjubilee person submitted it. Perhaps to start a conversation? Couldn't they have just poked you on Zulip or something? :^)

But if I had to guess, I would surmise that they were thinking about how to minimize enmeshment between the codegen backends and rustc_middle and had wound up tinkering with this for a bit as a straw example of what reducing that entanglement might look like. Though it's obvious to me, thinking about it now, that such an effort is probably doomed per se as of course the codegen backends have to depend on rustc_middle! However, probably some pieces of the miri interpreter and such could be factored out, or if that would expose too much, then alternatively pieces of rustc_middle like this that are just widely useful and could go into more "utility-oriented" crates, minimizing the weight of what is in the critical path.

/// as the modules using this don't support wider pointer values
/// FIXME(jubilee): Move this out of here and closer to the things using it!
#[inline]
pub fn read_target_ptr(endian: Endian, ptr_size: usize, bytes: &[u8]) -> u64 {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@workingjubilee, this code doesn't seem terribly idiomatic, and if your concern is input overhead, why not just call bytes.len()? Besides, it's definitely going to be optimized away during inlining, and if you call bytes.len() then it's easier for the optimizer to see what the variable means.

/// as the modules using this don't support wider pointer values
/// FIXME(jubilee): Move this out of here and closer to the things using it!
#[inline]
pub fn read_target_ptr(endian: Endian, ptr_size: usize, bytes: &[u8]) -> u64 {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@workingjubilee Oh, you're probably right! Even taking an input seems a bit silly, now that I think of it. I must be reading too many C codebases lately. 😅

Comment on lines +599 to +604
(2, Endian::Little) => u16::from_le_bytes(bytes.try_into().unwrap()) as u64,
(2, Endian::Big) => u16::from_be_bytes(bytes.try_into().unwrap()) as u64,
(4, Endian::Little) => u32::from_le_bytes(bytes.try_into().unwrap()) as u64,
(4, Endian::Big) => u32::from_be_bytes(bytes.try_into().unwrap()) as u64,
(8, Endian::Little) => u64::from_le_bytes(bytes.try_into().unwrap()),
(8, Endian::Big) => u64::from_be_bytes(bytes.try_into().unwrap()),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I definitely was overthinking this, in retrospect. Anyways, my main thought was that it should probably exist in, say, rustc_target or something like that.

@bors
Copy link
Collaborator

bors commented May 19, 2021

☔ The latest upstream changes (presumably #85376) made this pull request unmergeable. Please resolve the merge conflicts.

@crlf0710 crlf0710 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 5, 2021
@crlf0710
Copy link
Member

crlf0710 commented Jun 5, 2021

@workingjubilee Ping from triage, what's next steps here?

@JohnCSimon
Copy link
Member

@workingjubilee Ping from triage. Still has a merge conflict

@workingjubilee
Copy link
Member Author

Ah sorry. Missed pings! Going to close this and maybe return to a slightly sideways idea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants