-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementation of repr struct alignment RFC 1358. #39999
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @eddyb (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks pretty solid, I mostly have an issue with naming of abi_align
that conflicts with the existing semantics.
src/librustc/ty/layout.rs
Outdated
pub align: Align, | ||
|
||
/// Alignment of struct fields, for comparison with LLVM type. | ||
pub abi_align: Align, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name is confusing IMO, since each Align
tracks two values ("preferred" and "ABI-required") - fields_align
is probably better (if at all needed?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming is hard :) The purpose off this field is to keep track of what LLVM thinks the struct's alignment is. This is used in a few different ways:
- Calculating manual field padding when Rust and LLVM have different ideas about alignment
- Telling alloca alignment of over aligned types (I wanted the be conservative and only specify alloca alignment when they the type is over aligned)
- Sanity checking Rust and LLVM type sizes and alignment in https://github.com/bitshifter/rust/blob/6b9c5d1ff6df1e0d6d181fb5049c3108359b47ee/src/librustc_trans/type_of.rs#L120-L133
Originally I called this field base_align
which you also suggested in another comment, so perhaps I should go back to that. Other options might be llvm_align
, default_align
, orig_align
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't be talking about LLVM in librustc
, ideally. How about primitive_align
? That is, the alignment obtained from primitives' alignment, without any user annotations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
primitive_align
works for me.
src/librustc/ty/layout.rs
Outdated
@@ -541,6 +545,9 @@ pub struct Struct { | |||
pub memory_index: Vec<u32>, | |||
|
|||
pub min_size: Size, | |||
|
|||
/// Size of padding needed by the struct due to requested repr alignment | |||
pub padding: Vec<Size>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't this be computed on-demand based on the sizes of the fields and the alignments we can get out of LLVM types? Keep in mind constant enum variant values already have to deal with recomputing padding based on the values at hand, I'd rather generalize that algorithm than have this vector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You comment on this later so I guess you are OK with it?
It could be done on demand, but I thought it would be more overhead as it's needed in a couple of places (https://github.com/bitshifter/rust/blob/6b9c5d1ff6df1e0d6d181fb5049c3108359b47ee/src/librustc_trans/adt.rs#L265-L266 and https://github.com/bitshifter/rust/blob/6b9c5d1ff6df1e0d6d181fb5049c3108359b47ee/src/librustc/ty/layout.rs#L775-L776).
If we keep the vector I was thinking of putting it in an Option, since I think most people won't need repr(align)
so it would generally be all zeros.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it's fine if an on-demand iterator happens to use it.
src/librustc/ty/layout.rs
Outdated
@@ -560,13 +567,27 @@ impl<'a, 'gcx, 'tcx> Struct { | |||
repr: &ReprOptions, kind: StructKind, | |||
scapegoat: Ty<'gcx>) -> Result<Struct, LayoutError<'gcx>> { | |||
let packed = repr.packed; | |||
let repr_align = repr.align; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO copying fields out of the struct like that is confusing, both should be referred as repr.foo
.
src/librustc/ty/layout.rs
Outdated
pub fn memory_index_with_padding_fields(&self, index: usize) -> usize { | ||
let index = self.memory_index[index] as usize; | ||
self.padding.iter().take(index).fold(index, |acc, &pad| | ||
acc + if pad.bytes() > 0 { 1 } else { 0 }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, this seems like an okay use of the padding vector.
src/librustc/ty/layout.rs
Outdated
} | ||
|
||
/// An untagged union. | ||
#[derive(PartialEq, Eq, Hash, Debug)] | ||
pub struct Union { | ||
pub align: Align, | ||
pub abi_align: Align, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, although field_align
might not work as well here. member_align
might work for both.
src/librustc/ty/layout.rs
Outdated
@@ -1513,6 +1603,30 @@ impl<'a, 'gcx, 'tcx> Layout { | |||
} | |||
} | |||
} | |||
|
|||
/// Returns alignment before repr alignment is applied | |||
pub fn abi_align(&self, dl: &TargetDataLayout) -> Align { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe content_align
? Or, given that it seems to be recursive, base_align
? (user-specified alignment is still ABI alignment, in the sense that it affects padding, whereas preferred alignment is for allocations, e.g. to speed up SSE).
src/librustc/ty/mod.rs
Outdated
@@ -1363,6 +1364,7 @@ pub struct ReprOptions { | |||
pub packed: bool, | |||
pub simd: bool, | |||
pub int: Option<attr::IntType>, | |||
pub align: Option<u64>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd use an u16
and default it to 0
(we don't support larger alignments anyway - if you want to, you can use a pair of bytes Align
instead of just a byte, bringing the max alignment from 2^15
to 2^255
).
src/librustc_typeck/check/mod.rs
Outdated
@@ -1294,6 +1298,33 @@ pub fn check_simd<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>, sp: Span, def_id: DefId | |||
} | |||
} | |||
|
|||
fn has_align<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>, root_def_id: DefId, def_id: DefId) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it could easily go into infinite recursion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was assuming that structs couldn't reference themselves but probably that's assuming too much :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, this only applies to nested struct
s? That explains why no regressions, however, you probably want to handle it so you don't infinitely recurse in the compiler when some other part of the compiler wants to report a non-fatal error about it (see AdtDef::calculate_sized_constraint_inner
, you probably don't need the cache though, just the stack to check for cycles).
src/libsyntax/attr.rs
Outdated
if let Ok(align) = u64::from_str(&*value.as_str()) { | ||
if align.is_power_of_two() { | ||
// rustc::ty::layout::Align restricts align to <= 32768 | ||
if align <= 32768 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh there it is 😄.
src/libsyntax/attr.rs
Outdated
} | ||
} else { | ||
span_err!(diagnostic, item.span, E0584, | ||
"alignment representation hint is not a u64"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, we could probably use a single error message for an invalid #[repr(align)]
(also, "hint" is probably the wrong thing to call it?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure what the done thing was, lots of specific errors versus one general one. The "hint" part was just from copying the language used in other error messages. I can change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rebased with master and got conflicts on these error numbers, so this all got a bit messed up. Still working on the other feedback.
src/libsyntax/attr.rs
Outdated
if unrecognised { | ||
// Not a word we recognize | ||
span_err!(diagnostic, item.span, E0552, | ||
"unrecognized representation hint"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This and the one above it seem like they serve the same function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think it's because there are different code paths for name and named value and for a while I had different error messages. I'll collapse to a single message.
src/libsyntax/attr.rs
Outdated
@@ -877,10 +886,38 @@ pub fn find_repr_attrs(diagnostic: &Handler, attr: &Attribute) -> Vec<ReprAttr> | |||
if let Some(h) = hint { | |||
acc.push(h); | |||
} | |||
} else if let Some((name, value)) = item.name_value_str() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like you want to support only #[repr(align = "16")]
but not #[repr(align(16))]
?
@alexcrichton used to think otherwise, but I hope he can be persuaded to only allow the latter form, as it's already implemented and thus likelier to be stabilized first. In fact, we can stabilize it for specific attributes (like this one) and leave the general case unstable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My historical position was that #[repr(align(16))]
is grammatically invalid (e.g. would simply not parse), but if we've changed the grammar since then then that seems fine by me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, [repr(align(16))]
is much nicer. Do you know what else is using that form so I can see how it's implemented?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure anything else does that already, but for the record, it's a MetaItemKind::List
(.meta_item_list()
on Attribute
) with a single NestedMetaItemKind::Literal
element (.literal()
on NestedMetaItem
), that's a LitKind::Int(16, LitIntType::Unsuffixed)
.
☔ The latest upstream changes (presumably #39765) made this pull request unmergeable. Please resolve the merge conflicts. |
6b9c5d1
to
a78d0a9
Compare
@eddyb hopefully I've address all your feedback. I made a couple of additional changes:
|
src/librustc/ty/layout.rs
Outdated
@@ -546,7 +546,8 @@ pub struct Struct { | |||
|
|||
pub min_size: Size, | |||
|
|||
/// Size of padding needed by the struct due to requested repr alignment | |||
/// Size of padding needed by the struct due to requested repr alignment. | |||
/// This is created as required and will be empty if the structs fields require no padding. | |||
pub padding: Vec<Size>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm tempted to say that we should have offsets and field sizes, which is more general, and then creating a LLVM type can compute how much padding to use between each two consecutive fields. Check out rust-lang/miri#146 (comment), where I have a potential use for this: removing the implied discriminant field of enum
variant Struct
s, which in an Vec<(field_offset, field_size)>
encoding just be the first field starting e.g. 4
bytes in and thus the LLVM type having a padding field there.
We might also be able to always emit padding fields (sometimes zero-sized) and get LLVM to play along.
cc @rust-lang/compiler Anyone think this is terrible idea? IMO it might be better than memory_index_with_padding_fields
which makes field access O(n)
instead of O(1)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean replacing the Struct
offset
Vec
with (field_offset, field_size)
tuple and calculating padding on demand when the LLVM type is created? This on it's own wouldn't remove the need for memory_index_with_padding
though and translating the index to include padding fields would get more expensive. If you stored the adjusted memory index also then that could be avoided.
I did think about always emitting padding even for primitive aligned types but I wasn't sure what impact that might have on LLVM internals, does that mean a lot more work for it to do? Is it possible to emit a zero sized field? Generally I've taken a cautious approach and only changed behaviour if repr align is being used. It would simplify some things if you could assume that every field has a padding field though.
BTW, the memory_index_with_padding_fields
is now only O(n)
if repr align is being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, [0 x i8]
or {}
(LLVM's 0-tuple, our ()
) both work to that effect (LLVM ZST).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so if we emit all padding fields we can make memory_index_with_padding_fields
always O(1)
(basically index * 2) and calculate padding on demand. Will wait and see what @rust-lang/compiler says.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did have a couple of unsuccessful attempts at always emitting padding fields. I basically wasn't able to get to the point where I could actually build the compiler, I think it was libcore that I was failing on. I'm not sure how to best debug that. In any case it seemed like it might be a non-trivial change, it seems somewhat independent to the repr align changes, but would make the repr align stuff cleaner. Was there any comment from the compiler team about it?
src/librustc/ty/layout.rs
Outdated
#[inline] | ||
pub fn memory_index_with_padding_fields(&self, index: usize) -> usize { | ||
if self.padding.is_empty() { | ||
index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops. This should be self.memory_index[index]
.
☔ The latest upstream changes (presumably #40008) made this pull request unmergeable. Please resolve the merge conflicts. |
#![feature(attr_literals)] | ||
#![allow(dead_code)] | ||
|
||
#[repr(align(16))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these attributes be feature-gated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear: I know there is the attr_literals
feature gate, but I mean should align
specifically be feature-gated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I have the rustc experience to answer that question :) Happy to add a feature gate if it's needed.
Continuing #39999 (comment): at the compiler teem meeting this week @nikomatsakis seemed fine with the change to always emit padding fields, as long as it had no impact on ABI (and AFAICT, it can't). @bitshifter Can you push the branch that you couldn't get working somewhere? Maybe I can help. |
@eddyb sure, I haven't spent a huge amount of time on it so probably I'm just doing something silly. I've committed my current WIP to bitshifter@00eb421. What I'm hitting at the moment is some of my asserts around expected offsets and sizes in |
@eddyb I've pushed some changes to that branch so the padding fields are hopefully correct now. I've added a lot of asserts and debug output to Just as I was writing this comment I was trying to attach rust-gdb to the compiler's pid, which seemed to have worked, and I have a stack trace http://pastebin.com/3b8E8Xez. I might take a break from this for now but that trace looks like it should point me in the right direction. If you have any insights or tips for debugging this let me now. I guess in particular I don't have a good understanding of the different phases of building the compiler. Stage 0 is just downloaded right? Stage 1 appears to start self hosting? At this point it must have built some of the compiler and has started building other libs with that I assume. |
More invesitgation, log output for
That source line is in The crash is inside |
@bitshifter Oh I haven't gotten a chance to look at it, I still need to bring my own branch to a point where I can rebase yours on it to remove any suspicions of ABI confusion. But if it's debuginfo, that could be missing changes in |
The pointer for |
@bitshifter Can you log (just before this) |
@eddyb log the |
@bitshifter |
@eddyb Ah, that's helpful. I've pushed a fix which seems to get me past that problem. The next issue is:
Don't know if the |
It seems that |
Constants are cast so you wouldn't see that. Seems like more field index confusion? |
@eddyb Maybe something to do with
Perhaps the The code it's failing to compile is most likely the
|
@bitshifter We should not be emitting vector types, they're useless for this 😞 |
@eddyb I guess this code is trying to not add extra fields for padding. I wonder what the motivation for using a vector here was. Maybe I could just remove the vector branch. |
@eddyb I think I understand what this code is trying to do - to create space for the tagged enum variants using a type that has the same alignment as the max enum variant alignment. I'm not sure why this is necessary though? This code is really old, I think the vector code path was added 3 years ago in 8658a43. The layout should store the maximum variant's alignment and I think this is the alignment that gets used. That's the case for repr aligned variants anyway. Can you see any problem with just emitting i8's for the enum contents? I'm trying it now, so far it appears to work, but perhaps there are arch differences with this approach. This is all assuming that this is the cause of the problem on ARM of course. |
@bitshifter It's just a belief that larger elements are better, and I think vectors were used to get 16-byte elements but I don't know more. You can only really do it for |
@eddyb If the alignment doesn't matter, perhaps the easiest thing to do is just remove that assert. |
@bitshifter Hmm now that I think more about it, I believe the problem is comparing an alignment from LLVM with |
@eddyb that would make sense, except for some reason this assert isn't firing on x86_64. I really don't understand why not. |
@bitshifter Just give an |
@eddyb I did try that, it still didn't assert... I'll try changing the assert to primitive align anyway and see what happens. |
@bitshifter Sorry, I meant something like |
Hopefully will fix assert on ARM where vector types are being used as the fill type for enums containing repr aligned types greater than the largest possible native type, thus don't match the Layout's alignment and triggers an assert.
@bors r+ |
📌 Commit 946f8e6 has been approved by |
@eddyb weirdly, the 64 bit aligned Anyway, I don't see any problem with the change I made, so hopefully it passes homu tests. I don't think alignment should be important here, as it should be handled by the |
@bitshifter I am dumb, 64 bytes is AVX512. I should've said 1024 bytes or something. |
I did try 1024 yesterday with the enum sample I pasted on an earlier comment
…On 22 Apr 2017 3:46 p.m., "Eduard-Mihai Burtescu" ***@***.***> wrote:
@bitshifter <https://github.com/bitshifter> I am dumb, 64 bytes is
AVX512. I should've said 1024 bytes or something.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#39999 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAISFMhQH2IUEbnUTE1NdZ-Ny4Iz9prwks5ryZQngaJpZM4MHQVz>
.
|
Implementation of repr struct alignment RFC 1358. The main changes around rustc::ty::Layout::struct: * Added abi_align field which stores abi alignment before repr align is applied * align field contains transitive repr alignment * Added padding vec which stores padding required after fields The main user of this information is rustc_trans::adt::struct_llfields which determines the LLVM fields to be used by LLVM, including padding fields. A possible future optimisation would be to put the padding Vec in an Option, since it will be unused unless you are using repr align.
☀️ Test successful - status-appveyor, status-travis |
I should have edited the PR description before it got merged, it's pretty out of date :) There's some follow up things that should probably be done in the future:
What would the process be for tracking these, create some new issues? |
@bitshifter I'd put them in the tracking issue, if there is one for |
Allow #[repr(align(x))] on enums (#57996) Tracking issue: #57996 Implements an extension of [RFC 1358](https://github.com/rust-lang/rfcs/blob/master/text/1358-repr-align.md) behind a feature flag (`repr_align_enum`). Originally introduced here for structs: #39999. It seems like only HIR-level changes are required, since enums are already aware of their alignment (due to alignment of their limbs). cc @bitshifter
Allow #[repr(align(x))] on enums (#57996) Tracking issue: #57996 Implements an extension of [RFC 1358](https://github.com/rust-lang/rfcs/blob/master/text/1358-repr-align.md) behind a feature flag (`repr_align_enum`). Originally introduced here for structs: #39999. It seems like only HIR-level changes are required, since enums are already aware of their alignment (due to alignment of their limbs). cc @bitshifter
The main changes around rustc::ty::Layout::struct:
The main user of this information is rustc_trans::adt::struct_llfields
which determines the LLVM fields to be used by LLVM, including padding
fields.
A possible future optimisation would be to put the padding Vec in an Option, since it will be unused unless you are using repr align.