Commit bdd4bda
Auto merge of #62072 - eddyb:generator-memory-index, r=tmandry
rustc: correctly transform memory_index mappings for generators.

Fixes #61793, closes #62011 (previous attempt at fixing #61793).

During #60187, I made the mistake of suggesting that the (re-)computation of `memory_index` in `ty::layout`, after generator-specific logic split/recombined fields, be done off of the `offsets` of those fields (which needed to be computed anyway), as opposed to the `memory_index`.

`memory_index` maps each field to its in-memory order index, which ranges over the same `0..n` values as the fields themselves, making it a bijective mapping, and more specifically a permutation (indeed, it's the permutation resulting from field reordering optimizations). Each field has a unique "memory index", so a sort based on memory indices, even an unstable one, will not put the fields in the wrong order. But offsets don't have that property, because of ZSTs (which do not increase the offset), so sorting based on the offsets of fields alone can (and did) result in wrong orders.

Instead of going back to sorting based on (slices/subsets of) `memory_index`, or special-casing ZSTs so that sorting based on offsets produces the right results (presumably), as #62011 does, I opted to drop sorting altogether and focus on `O(n)` operations involving *permutations*:

* a permutation is easily inverted (see the `invert_mapping` `fn`)
* an `inverse_memory_index` was already employed in other parts of the `ty::layout` code (that is, a mapping from memory order to field indices)
* inverting twice produces the original permutation, so you can invert, modify, and invert again, if it's easier to modify the inverse mapping than the direct one
* you can modify/remove elements in a permutation, as long as the result remains dense (i.e. using every integer in `0..len`, without gaps)
* for splitting a `0..n` permutation into disjoint `0..x` and `x..n` ranges, you can pick the elements based on an `i < x` / `i >= x` predicate, and for the latter, also subtract `x` to compact the range to `0..n-x`
* in the general case, for taking an arbitrary subset of the permutation, you need a renumbering from that subset to a dense `0..subset.len()` - but notably, this is still `O(n)`!
* you can merge permutations, as long as the result remains disjoint (i.e. each element is unique)
* for concatenating two `0..n` and `0..m` permutations, you can renumber the elements in the latter to `n..n+m`
* some of these operations can be combined, and an inverse mapping (be it a permutation or not) can still be used instead of a forward one, by changing the "domain" of the loop performing the operation

I wish I had a nicer / more mathematical description of the recombinations involved, but my focus was to fix the bug (in a way which preserves information more directly than sorting would), so I may have missed potential changes in the surrounding generator layout code that would make this all more straightforward.

r? @tmandry
2 parents 5f9c044 + fad27df commit bdd4bda
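
To make the permutation toolkit above concrete, here is a minimal standalone sketch (not part of the commit): `invert_mapping` mirrors the helper this patch adds to `ty::layout`, while the permutation data and the concatenation example are made up for illustration.

// Same shape as the helper added by this patch; example data is made up.
fn invert_mapping(map: &[u32]) -> Vec<u32> {
    let mut inverse = vec![0; map.len()];
    for i in 0..map.len() {
        inverse[map[i] as usize] = i as u32;
    }
    inverse
}

fn main() {
    // memory_index: field i lives at memory position memory_index[i].
    let memory_index: Vec<u32> = vec![2, 0, 3, 1];

    // Inversion yields inverse_memory_index: memory position -> field index.
    let inverse_memory_index = invert_mapping(&memory_index);
    assert_eq!(inverse_memory_index, vec![1, 3, 0, 2]);

    // Inverting twice round-trips, so you can invert, modify, invert again.
    assert_eq!(invert_mapping(&inverse_memory_index), memory_index);

    // Concatenating a 0..n and a 0..m permutation: renumber the latter
    // into n..n+m so the result stays disjoint (each element unique).
    let (a, b): (Vec<u32>, Vec<u32>) = (vec![1, 0], vec![0, 2, 1]);
    let n = a.len() as u32;
    let concat: Vec<u32> = a.iter().copied().chain(b.iter().map(|&i| i + n)).collect();
    assert_eq!(concat, vec![1, 0, 2, 4, 3]);
}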

3 files changed: +107 -38 lines

src/librustc/ty/layout.rs

+78 -37
@@ -226,6 +226,19 @@ enum StructKind {
     Prefixed(Size, Align),
 }
 
+// Invert a bijective mapping, i.e. `invert(map)[y] = x` if `map[x] = y`.
+// This is used to go between `memory_index` (source field order to memory order)
+// and `inverse_memory_index` (memory order to source field order).
+// See also `FieldPlacement::Arbitrary::memory_index` for more details.
+// FIXME(eddyb) build a better abstraction for permutations, if possible.
+fn invert_mapping(map: &[u32]) -> Vec<u32> {
+    let mut inverse = vec![0; map.len()];
+    for i in 0..map.len() {
+        inverse[map[i] as usize] = i as u32;
+    }
+    inverse
+}
+
 impl<'tcx> LayoutCx<'tcx, TyCtxt<'tcx>> {
     fn scalar_pair(&self, a: Scalar, b: Scalar) -> LayoutDetails {
         let dl = self.data_layout();
@@ -303,7 +316,9 @@ impl<'tcx> LayoutCx<'tcx, TyCtxt<'tcx>> {
         // That is, if field 5 has offset 0, the first element of inverse_memory_index is 5.
         // We now write field offsets to the corresponding offset slot;
         // field 5 with offset 0 puts 0 in offsets[5].
-        // At the bottom of this function, we use inverse_memory_index to produce memory_index.
+        // At the bottom of this function, we invert `inverse_memory_index` to
+        // produce `memory_index` (see `invert_mapping`).
+
 
         let mut offset = Size::ZERO;
 
@@ -360,13 +375,9 @@ impl<'tcx> LayoutCx<'tcx, TyCtxt<'tcx>> {
         // Field 5 would be the first element, so memory_index is i:
         // Note: if we didn't optimize, it's already right.
 
-        let mut memory_index;
+        let memory_index;
         if optimize {
-            memory_index = vec![0; inverse_memory_index.len()];
-
-            for i in 0..inverse_memory_index.len() {
-                memory_index[inverse_memory_index[i] as usize] = i as u32;
-            }
+            memory_index = invert_mapping(&inverse_memory_index);
         } else {
             memory_index = inverse_memory_index;
         }
@@ -1311,18 +1322,7 @@ impl<'tcx> LayoutCx<'tcx, TyCtxt<'tcx>> {
     ) -> Result<&'tcx LayoutDetails, LayoutError<'tcx>> {
         use SavedLocalEligibility::*;
         let tcx = self.tcx;
-        let recompute_memory_index = |offsets: &[Size]| -> Vec<u32> {
-            debug!("recompute_memory_index({:?})", offsets);
-            let mut inverse_index = (0..offsets.len() as u32).collect::<Vec<_>>();
-            inverse_index.sort_unstable_by_key(|i| offsets[*i as usize]);
 
-            let mut index = vec![0; offsets.len()];
-            for i in 0..index.len() {
-                index[inverse_index[i] as usize] = i as u32;
-            }
-            debug!("recompute_memory_index() => {:?}", index);
-            index
-        };
         let subst_field = |ty: Ty<'tcx>| { ty.subst(tcx, substs.substs) };
 
         let info = tcx.generator_layout(def_id);
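
As background for the deleted closure above: a standalone sketch (made-up offsets, not part of the commit) of how sorting by offset loses the memory order once a ZST shares its offset with a non-ZST field, which is the failure mode behind #61793.

// A ZST does not advance the offset, so it can share an offset with a
// neighboring field; offsets alone then cannot recover the memory order.
fn main() {
    // Memory order (X, ()): a non-ZST field at offset 0, and a ZST,
    // also at offset 0.
    let offsets: Vec<u64> = vec![0, 0];
    let mut inverse_index: Vec<u32> = (0..offsets.len() as u32).collect();
    // Mirrors the removed `recompute_memory_index` closure: both sort
    // keys are 0, so an unstable sort may emit either order, and nothing
    // distinguishes `(X, ())` from `((), X)` anymore.
    inverse_index.sort_unstable_by_key(|&i| offsets[i as usize]);
    println!("recovered memory order (arbitrary!): {:?}", inverse_index);
}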
@@ -1349,14 +1349,34 @@ impl<'tcx> LayoutCx<'tcx, TyCtxt<'tcx>> {
         // get included in each variant that requested them in
         // GeneratorLayout.
         debug!("prefix = {:#?}", prefix);
-        let (outer_fields, promoted_offsets) = match prefix.fields {
-            FieldPlacement::Arbitrary { mut offsets, .. } => {
-                let offsets_b = offsets.split_off(discr_index + 1);
+        let (outer_fields, promoted_offsets, promoted_memory_index) = match prefix.fields {
+            FieldPlacement::Arbitrary { mut offsets, memory_index } => {
+                let mut inverse_memory_index = invert_mapping(&memory_index);
+
+                // "a" (`0..b_start`) and "b" (`b_start..`) correspond to
+                // "outer" and "promoted" fields respectively.
+                let b_start = (discr_index + 1) as u32;
+                let offsets_b = offsets.split_off(b_start as usize);
                 let offsets_a = offsets;
 
-                let memory_index = recompute_memory_index(&offsets_a);
-                let outer_fields = FieldPlacement::Arbitrary { offsets: offsets_a, memory_index };
-                (outer_fields, offsets_b)
+                // Disentangle the "a" and "b" components of `inverse_memory_index`
+                // by preserving the order but keeping only one disjoint "half" each.
+                // FIXME(eddyb) build a better abstraction for permutations, if possible.
+                let inverse_memory_index_b: Vec<_> =
+                    inverse_memory_index.iter().filter_map(|&i| i.checked_sub(b_start)).collect();
+                inverse_memory_index.retain(|&i| i < b_start);
+                let inverse_memory_index_a = inverse_memory_index;
+
+                // Since `inverse_memory_index_{a,b}` each only refer to their
+                // respective fields, they can be safely inverted
+                let memory_index_a = invert_mapping(&inverse_memory_index_a);
+                let memory_index_b = invert_mapping(&inverse_memory_index_b);
+
+                let outer_fields = FieldPlacement::Arbitrary {
+                    offsets: offsets_a,
+                    memory_index: memory_index_a,
+                };
+                (outer_fields, offsets_b, memory_index_b)
             }
             _ => bug!(),
         };
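
A standalone sketch (not part of the commit) of the disentangling step in this hunk, with made-up data; `invert_mapping` is the helper added by this patch, and `b_start = 2` stands in for `discr_index + 1`.

fn main() {
    let memory_index: Vec<u32> = vec![2, 0, 3, 1];
    let mut inverse_memory_index = invert_mapping(&memory_index); // [1, 3, 0, 2]

    let b_start = 2u32;
    // Keep each disjoint "half" in its existing order; `checked_sub` both
    // filters out the "a" elements and compacts "b" to the dense range 0..2.
    let inverse_memory_index_b: Vec<u32> =
        inverse_memory_index.iter().filter_map(|&i| i.checked_sub(b_start)).collect();
    inverse_memory_index.retain(|&i| i < b_start);
    let inverse_memory_index_a = inverse_memory_index;

    assert_eq!(inverse_memory_index_a, vec![1, 0]); // memory order of "outer" fields
    assert_eq!(inverse_memory_index_b, vec![1, 0]); // memory order of "promoted" fields

    // Each half is dense over 0..len, so it can be inverted independently.
    assert_eq!(invert_mapping(&inverse_memory_index_a), vec![1, 0]);
    assert_eq!(invert_mapping(&inverse_memory_index_b), vec![1, 0]);
}

// Same shape as the helper added by this patch.
fn invert_mapping(map: &[u32]) -> Vec<u32> {
    let mut inverse = vec![0; map.len()];
    for i in 0..map.len() {
        inverse[map[i] as usize] = i as u32;
    }
    inverse
}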
@@ -1386,30 +1406,51 @@ impl<'tcx> LayoutCx<'tcx, TyCtxt<'tcx>> {
                 StructKind::Prefixed(prefix_size, prefix_align.abi))?;
             variant.variants = Variants::Single { index };
 
-            let offsets = match variant.fields {
-                FieldPlacement::Arbitrary { offsets, .. } => offsets,
+            let (offsets, memory_index) = match variant.fields {
+                FieldPlacement::Arbitrary { offsets, memory_index } => {
+                    (offsets, memory_index)
+                }
                 _ => bug!(),
             };
 
             // Now, stitch the promoted and variant-only fields back together in
             // the order they are mentioned by our GeneratorLayout.
-            let mut next_variant_field = 0;
-            let mut combined_offsets = Vec::new();
-            for local in variant_fields.iter() {
-                match assignments[*local] {
+            // Because we only use some subset (that can differ between variants)
+            // of the promoted fields, we can't just pick those elements of the
+            // `promoted_memory_index` (as we'd end up with gaps).
+            // So instead, we build an "inverse memory_index", as if all of the
+            // promoted fields were being used, but leave the elements not in the
+            // subset as `INVALID_FIELD_IDX`, which we can filter out later to
+            // obtain a valid (bijective) mapping.
+            const INVALID_FIELD_IDX: u32 = !0;
+            let mut combined_inverse_memory_index =
+                vec![INVALID_FIELD_IDX; promoted_memory_index.len() + memory_index.len()];
+            let mut offsets_and_memory_index = offsets.into_iter().zip(memory_index);
+            let combined_offsets = variant_fields.iter().enumerate().map(|(i, local)| {
+                let (offset, memory_index) = match assignments[*local] {
                     Unassigned => bug!(),
                     Assigned(_) => {
-                        combined_offsets.push(offsets[next_variant_field]);
-                        next_variant_field += 1;
+                        let (offset, memory_index) = offsets_and_memory_index.next().unwrap();
+                        (offset, promoted_memory_index.len() as u32 + memory_index)
                     }
                     Ineligible(field_idx) => {
                         let field_idx = field_idx.unwrap() as usize;
-                        combined_offsets.push(promoted_offsets[field_idx]);
+                        (promoted_offsets[field_idx], promoted_memory_index[field_idx])
                     }
-                }
-            }
-            let memory_index = recompute_memory_index(&combined_offsets);
-            variant.fields = FieldPlacement::Arbitrary { offsets: combined_offsets, memory_index };
+                };
+                combined_inverse_memory_index[memory_index as usize] = i as u32;
+                offset
+            }).collect();
+
+            // Remove the unused slots and invert the mapping to obtain the
+            // combined `memory_index` (also see previous comment).
+            combined_inverse_memory_index.retain(|&i| i != INVALID_FIELD_IDX);
+            let combined_memory_index = invert_mapping(&combined_inverse_memory_index);
+
+            variant.fields = FieldPlacement::Arbitrary {
+                offsets: combined_offsets,
+                memory_index: combined_memory_index,
+            };
 
             size = size.max(variant.size);
             align = align.max(variant.align);
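
A standalone sketch (not part of the commit) of the sentinel-based recombination in this hunk, with made-up field assignments; the slot arithmetic that the real code derives from `assignments` is written out by hand here.

// Two promoted fields with memory order [0, 1] and one variant-only
// field; this variant uses promoted field 1 plus the variant-only field,
// and its GeneratorLayout mentions the variant-only field first.
fn main() {
    let promoted_memory_index: Vec<u32> = vec![0, 1];
    let memory_index: Vec<u32> = vec![0]; // variant-only fields

    const INVALID_FIELD_IDX: u32 = !0;
    let mut combined =
        vec![INVALID_FIELD_IDX; promoted_memory_index.len() + memory_index.len()];

    // Combined field 0 (Assigned): slots for variant-only fields come
    // after all promoted slots, i.e. at promoted_memory_index.len() + ...
    combined[(promoted_memory_index.len() as u32 + memory_index[0]) as usize] = 0;
    // Combined field 1 (Ineligible, promoted field 1):
    combined[promoted_memory_index[1] as usize] = 1;

    // Promoted field 0 is unused here, leaving a sentinel hole at slot 0;
    // dropping it compacts the inverse mapping back into a permutation.
    combined.retain(|&i| i != INVALID_FIELD_IDX);
    assert_eq!(combined, vec![1, 0]);

    // Invert to get the per-variant memory_index: the variant-only field
    // (field 0) sits after the promoted field (field 1) in memory.
    assert_eq!(invert_mapping(&combined), vec![1, 0]);
}

// Same shape as the helper added by this patch.
fn invert_mapping(map: &[u32]) -> Vec<u32> {
    let mut inverse = vec![0; map.len()];
    for i in 0..map.len() {
        inverse[map[i] as usize] = i as u32;
    }
    inverse
}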

src/librustc_target/abi/mod.rs

+10 -1
@@ -699,7 +699,16 @@ pub enum FieldPlacement {
         offsets: Vec<Size>,
 
         /// Maps source order field indices to memory order indices,
-        /// depending how fields were permuted.
+        /// depending on how the fields were reordered (if at all).
+        /// This is a permutation, with both the source order and the
+        /// memory order using the same (0..n) index ranges.
+        ///
+        /// Note that during computation of `memory_index`, sometimes
+        /// it is easier to operate on the inverse mapping (that is,
+        /// from memory order to source order), and that is usually
+        /// named `inverse_memory_index`.
+        ///
         // FIXME(eddyb) build a better abstraction for permutations, if possible.
         // FIXME(camlorn) also consider small vector optimization here.
         memory_index: Vec<u32>
     }
+19 (new test file)
@@ -0,0 +1,19 @@
+// This testcase used to ICE in codegen due to inconsistent field reordering
+// in the generator state, claiming a ZST field was after a non-ZST field,
+// while those two fields were at the same offset (which is impossible).
+// That is, memory ordering of `(X, ())`, but offsets of `((), X)`.
+
+// compile-pass
+// edition:2018
+
+#![feature(async_await)]
+#![allow(unused)]
+
+async fn foo<F>(_: &(), _: F) {}
+
+fn main() {
+    foo(&(), || {});
+    async {
+        foo(&(), || {}).await;
+    };
+}
