This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Avoid creating illegal byref pointers #17524

Merged
merged 2 commits into from
Apr 14, 2018

Conversation

BruceForstall
Member

Byref pointers need to point within their "host" object -- thus
the alternate name "interior pointers". If the JIT creates and
reports a pointer as a "byref", but it points outside the host
object, and a GC occurs that moves the host object, the byref
pointer will not be updated. If a subsequent calculation puts
the byref "back" into the host object, it will actually be pointing
to garbage, since the host object has moved.

This occurred on ARM with array index calculations, in particular
because ARM doesn't have a single-instruction "base + scale*index + offset"
addressing mode. Thus, we were generating, for the jaggedarr_cs_do
test case, `ProcessJagged3DArray()` function:
```
// r0 = array object, r6 = computed index offset. We mark r4 as a byref.
add r4, r0, r6

// r4 - 32 is the offset of the object we care about. Then we load the array element.
// In this case, the loaded element is a gcref, so r4 becomes a gcref.
ldr r4, [r4-32]
```

We get this math because the user code uses `a[i - 10]`, which is
essentially `a + (i - 10) * 4 + 8` for element size 4. This is optimized
to `a + i * 4 - 32`. In the above code, `r6` is `i * 4`. In this case,
after the first instruction, `r4` can point beyond the array.
If a GC happens, `r4` isn't updated, and the second instruction loads garbage.
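To make the arithmetic concrete, here is a small Python sketch of the offsets involved. The element size (4), array-data offset (8), array length (16), and index value are assumed for illustration; the real layout includes a length field that is ignored here:

```python
ELEM_SIZE = 4    # assumed element size in bytes
DATA_OFFSET = 8  # assumed offset of the first element from the array object base

def offset_unfolded(i):
    # a[i - 10] as written: adjust the index, scale it, add the data offset
    return (i - 10) * ELEM_SIZE + DATA_OFFSET

def offset_folded(i):
    # after constant folding: a + i*4 - 32
    return i * ELEM_SIZE - 32

def offset_intermediate(i):
    # what "add r4, r0, r6" computes: base + i*4, before the -32 displacement
    return i * ELEM_SIZE

# The folded form is arithmetically identical to the original...
assert all(offset_unfolded(i) == offset_folded(i) for i in range(10, 26))

# ...but for a hypothetical 16-element array (object spans offsets [0, 72)),
# a perfectly legal access like a[15] (i == 25) produces an intermediate
# pointer that lands past the end of the object:
n = 16
obj_end = DATA_OFFSET + n * ELEM_SIZE     # 72
i = 25
assert 0 <= i - 10 < n                    # the access itself is in bounds
assert offset_intermediate(i) >= obj_end  # 100: outside the object
assert offset_folded(i) < obj_end         # 68: final address is back inside
```

If a GC relocates the object between the two instructions, the out-of-range intermediate is not treated as pointing into the object, so the final in-range address is computed from a stale base.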

There are two fixes:

  1. Change array morphing in `fgMorphArrayIndex()` to rearrange the array index
    IR node creation to only create a byref pointer that is precise, and no "intermediate"
    byref pointers that don't represent the actual array element address being
    computed.
  2. Change `fgMoveOpsLeft()` to prevent the left-weighted reassociation optimization
    `[byref]+ (ref, [int]+ (int, int)) => [byref]+ ([byref]+ (ref, int), int)`. This
    optimization creates "incorrect" byrefs that don't necessarily point within
    the host object.
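The reassociation in fix 2 can be sketched numerically. The base address, object size, and addends below are invented for illustration; both association orders produce the same final address, but only the left-weighted one materializes an out-of-bounds byref:

```python
base = 0x1000        # hypothetical address of the array object
obj_size = 72        # hypothetical object size in bytes
scaled_index = 100   # a scaled index (i * 4) large enough to overshoot
displacement = -32   # folded constant part of the element offset

# ref + (int + int): the integer addends combine first, so the only byref
# ever formed is the precise element address, inside the object.
precise = base + (scaled_index + displacement)
assert base <= precise < base + obj_size

# (ref + int) + int: the left-weighted form creates an intermediate byref
# (base + scaled_index) that points past the object, even though the final
# sum is identical.
intermediate = base + scaled_index
assert not (base <= intermediate < base + obj_size)
assert intermediate + displacement == precise
```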

These fixes are all-platform.

Fixes #17517.

There are many, many diffs. They are, perhaps surprisingly, overwhelmingly positive.

For AMD64 SuperPMI, the diffs are a 7.6% size win for 5194 functions! This
appears to be due to less code cloning, and sometimes better optimization.

For ARM32 ngen-based desktop asm diffs, it is a 0.30% improvement across all
framework assemblies. A lot of the diffs seem to be because we CSE the entire
array address offset expression, not just the index expression.

@BruceForstall
Member Author

@dotnet-bot test Ubuntu arm Cross Checked gcstress0xc Build and Test
@dotnet-bot test Ubuntu arm Cross Checked gcstress0xc_jitstress1 Build and Test
@dotnet-bot test Ubuntu arm Cross Checked gcstress0xc_jitstress2 Build and Test
@dotnet-bot test Ubuntu arm Cross Checked jitstress1 Build and Test
@dotnet-bot test Ubuntu arm Cross Checked jitstress2 Build and Test
@dotnet-bot test Ubuntu arm Cross Checked normal Build and Test

@dotnet-bot test Windows_NT arm Cross Checked gcstress0xc Build and Test
@dotnet-bot test Windows_NT arm Cross Checked normal Build and Test

@dotnet-bot test Windows_NT arm64 Cross Checked gcstress0xc Build and Test
@dotnet-bot test Windows_NT arm64 Cross Checked normal Build and Test

@dotnet-bot test Windows_NT x86 Checked gcstress0xc
@dotnet-bot test Windows_NT x86 Checked gcstress0xc_jitstress1
@dotnet-bot test Windows_NT x86 Checked gcstress0xc_jitstress2
@dotnet-bot test Windows_NT x86 Checked gcstress0xc_minopts_heapverify1
@dotnet-bot test Windows_NT x86 Checked gcstress0xc_zapdisable
@dotnet-bot test Windows_NT x86 Checked gcstress0xc_zapdisable_heapverify1
@dotnet-bot test Windows_NT x86 Checked gcstress0xc_zapdisable_jitstress2
@dotnet-bot test Windows_NT x86 Checked heapverify1
@dotnet-bot test Windows_NT x86 Checked jitstress1
@dotnet-bot test Windows_NT x86 Checked jitstress2
@dotnet-bot test Windows_NT x86 Checked jitstress2_jitstressregs0x10
@dotnet-bot test Windows_NT x86 Checked jitstress2_jitstressregs0x1000
@dotnet-bot test Windows_NT x86 Checked jitstress2_jitstressregs0x80
@dotnet-bot test Windows_NT x86 Checked jitstress2_jitstressregs1
@dotnet-bot test Windows_NT x86 Checked jitstress2_jitstressregs2
@dotnet-bot test Windows_NT x86 Checked jitstress2_jitstressregs3
@dotnet-bot test Windows_NT x86 Checked jitstress2_jitstressregs4
@dotnet-bot test Windows_NT x86 Checked jitstress2_jitstressregs8
@dotnet-bot test Windows_NT x86 Checked jitstressregs0x10
@dotnet-bot test Windows_NT x86 Checked jitstressregs0x1000
@dotnet-bot test Windows_NT x86 Checked jitstressregs0x80
@dotnet-bot test Windows_NT x86 Checked jitstressregs1
@dotnet-bot test Windows_NT x86 Checked jitstressregs2
@dotnet-bot test Windows_NT x86 Checked jitstressregs3
@dotnet-bot test Windows_NT x86 Checked jitstressregs4
@dotnet-bot test Windows_NT x86 Checked jitstressregs8
@dotnet-bot test Windows_NT x86 Checked minopts

@dotnet-bot test Windows_NT x64 Checked gcstress0xc
@dotnet-bot test Windows_NT x64 Checked gcstress0xc_jitstress1
@dotnet-bot test Windows_NT x64 Checked gcstress0xc_jitstress2
@dotnet-bot test Windows_NT x64 Checked gcstress0xc_minopts_heapverify1
@dotnet-bot test Windows_NT x64 Checked gcstress0xc_zapdisable
@dotnet-bot test Windows_NT x64 Checked gcstress0xc_zapdisable_heapverify1
@dotnet-bot test Windows_NT x64 Checked gcstress0xc_zapdisable_jitstress2
@dotnet-bot test Windows_NT x64 Checked heapverify1
@dotnet-bot test Windows_NT x64 Checked jitstress1
@dotnet-bot test Windows_NT x64 Checked jitstress2
@dotnet-bot test Windows_NT x64 Checked jitstress2_jitstressregs0x10
@dotnet-bot test Windows_NT x64 Checked jitstress2_jitstressregs0x1000
@dotnet-bot test Windows_NT x64 Checked jitstress2_jitstressregs0x80
@dotnet-bot test Windows_NT x64 Checked jitstress2_jitstressregs1
@dotnet-bot test Windows_NT x64 Checked jitstress2_jitstressregs2
@dotnet-bot test Windows_NT x64 Checked jitstress2_jitstressregs3
@dotnet-bot test Windows_NT x64 Checked jitstress2_jitstressregs4
@dotnet-bot test Windows_NT x64 Checked jitstress2_jitstressregs8
@dotnet-bot test Windows_NT x64 Checked jitstressregs0x10
@dotnet-bot test Windows_NT x64 Checked jitstressregs0x1000
@dotnet-bot test Windows_NT x64 Checked jitstressregs0x80
@dotnet-bot test Windows_NT x64 Checked jitstressregs1
@dotnet-bot test Windows_NT x64 Checked jitstressregs2
@dotnet-bot test Windows_NT x64 Checked jitstressregs3
@dotnet-bot test Windows_NT x64 Checked jitstressregs4
@dotnet-bot test Windows_NT x64 Checked jitstressregs8
@dotnet-bot test Windows_NT x64 Checked minopts

@dotnet-bot test Ubuntu x64 Checked gcstress0xc

@BruceForstall
Member Author

cc @dotnet/jit-contrib

@dotnet/arm64-contrib Not sure if this is a potential problem/fix for arm64 as well.

@RussKeldorph RussKeldorph added this to the 2.1.0 milestone Apr 12, 2018
Member

@erozenfeld erozenfeld left a comment


LGTM


@CarolEidt CarolEidt left a comment


LGTM with just one question

```cpp
//
if (varTypeIsGC(op1->TypeGet()) && op2->TypeGet() == TYP_I_IMPL)
{
    noway_assert(varTypeIsGC(tree->TypeGet()) && (oper == GT_ADD));
```


It's not entirely clear to me why this must be a noway_assert. Isn't it just conservative to break out of this? Or am I misunderstanding?

Member Author


You're right. I just hoisted this assert from below. In fact, the code I've written supersedes that case and assert, so perhaps I should clean that up as well.


@briansull briansull left a comment


Looks Good

Byref pointers need to point within their "host" object -- thus
the alternate name "interior pointers". If the JIT creates and
reports a pointer as a "byref", but it points outside the host
object, and a GC occurs that moves the host object, the byref
pointer will not be updated. If a subsequent calculation puts
the byref "back" into the host object, it will actually be pointing
to garbage, since the host object has moved.

This occurred on ARM with array index calculations, in particular
because ARM doesn't have a single-instruction "base + scale*index + offset"
addressing mode. Thus, we were generating, for the jaggedarr_cs_do
test case, `ProcessJagged3DArray()` function:
```
// r0 = array object, r6 = computed index offset. We mark r4 as a byref.
add r4, r0, r6

// r4 - 32 is the offset of the object we care about. Then we load the array element.
// In this case, the loaded element is a gcref, so r4 becomes a gcref.
ldr r4, [r4-32]
```
We get this math because the user code uses `a[i - 10]`, which is
essentially `a + (i - 10) * 4 + 8` for element size 4. This is optimized
to `a + i * 4 - 32`. In the above code, `r6` is `i * 4`. In this case,
after the first instruction, `r4` can point beyond the array.
If a GC happens, `r4` isn't updated, and the second instruction loads garbage.

There are several fixes:
1. Change array morphing in `fgMorphArrayIndex()` to rearrange the array index
IR node creation to only create a byref pointer that is precise; don't create
"intermediate" byref pointers that don't represent the actual array element
address being computed. The tree matching code that annotates the generated tree
with field sequences needs to be updated to match the new form.
2. Change `fgMoveOpsLeft()` to prevent the left-weighted reassociation optimization
`[byref]+ (ref, [int]+ (int, int)) => [byref]+ ([byref]+ (ref, int), int)`. This
optimization creates "incorrect" byrefs that don't necessarily point within
the host object.
3. Add an additional condition to the `Fold "((x+icon1)+icon2) to (x+(icon1+icon2))"`
morph optimization to prevent merging of constant TYP_REF nodes, which were now
being recognized due to different tree shapes. This was probably always a problem,
but the particular tree shape wasn't seen before.
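Fix 3 can be modeled with a toy Python constant folder. The node shapes and names below are invented for illustration (the real code operates on GenTree nodes), and the stated rationale for the guard is an inference from the commit text, not a quote:

```python
# A node is either ('icon', value, type) for a constant, ('var', name),
# or ('add', left, right).
TYP_INT, TYP_REF = 'int', 'ref'

def try_fold(tree):
    """Fold ((x + icon1) + icon2) into (x + (icon1 + icon2)) when safe."""
    if tree[0] != 'add':
        return tree
    inner, icon2 = tree[1], tree[2]
    if inner[0] != 'add' or icon2[0] != 'icon':
        return tree
    x, icon1 = inner[1], inner[2]
    if icon1[0] != 'icon':
        return tree
    # The added condition: never merge when either constant is TYP_REF --
    # presumably a constant object reference must remain visible as a
    # reference rather than vanish into raw integer arithmetic.
    if icon1[2] == TYP_REF or icon2[2] == TYP_REF:
        return tree
    return ('add', x, ('icon', icon1[1] + icon2[1], TYP_INT))

x = ('var', 'x')
folded = try_fold(('add', ('add', x, ('icon', 8, TYP_INT)), ('icon', -32, TYP_INT)))
assert folded == ('add', x, ('icon', -24, TYP_INT))   # int constants merge

blocked = try_fold(('add', ('add', x, ('icon', 0x1000, TYP_REF)), ('icon', 8, TYP_INT)))
assert blocked[1][0] == 'add'   # unchanged: the ref constant was not merged
```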

These fixes are all-platform. However, to reduce risk at this point, they are
enabled for ARM only, under the `FEATURE_PREVENT_BAD_BYREFS` `#ifdef`.

Fixes #17517.

There are many, many diffs.

For ARM32 ngen-based desktop asm diffs, it is a 0.30% improvement across all
framework assemblies. A lot of the diffs seem to be because we CSE the entire
array address offset expression, not just the index expression.
@BruceForstall BruceForstall force-pushed the FixBadArrayByrefGCHole branch from e08df99 to 6dab63d Compare April 13, 2018 23:21
@BruceForstall
Member Author

@dotnet-bot test Ubuntu arm Cross Checked gcstress0xc Build and Test
@dotnet-bot test Ubuntu arm Cross Checked gcstress0xc_jitstress1 Build and Test
@dotnet-bot test Ubuntu arm Cross Checked gcstress0xc_jitstress2 Build and Test
@dotnet-bot test Ubuntu arm Cross Checked jitstress1 Build and Test
@dotnet-bot test Ubuntu arm Cross Checked jitstress2 Build and Test
@dotnet-bot test Ubuntu arm Cross Checked normal Build and Test

@dotnet-bot test Windows_NT arm Cross Checked gcstress0xc Build and Test
@dotnet-bot test Windows_NT arm Cross Checked normal Build and Test

@BruceForstall
Member Author

@briansull @CarolEidt @erozenfeld I put all the code under #ifdef for ARM only, to avoid perturbing other platforms at this point. I also fixed a couple bugs that showed up due to various tree pattern matching that changed with the new code.


@briansull briansull left a comment


Looks Good


Successfully merging this pull request may close these issues.

[RyuJIT/arm32][GCStress=C] jaggedarr_cs_do.cmd failure
6 participants