Document the behavior of the CLR when unsafe casts are used #6176

GSPP · 2016-06-18T09:35:23Z

In the System.Runtime.CompilerServices.Unsafe package there is a method that allows unsafe casts from anything to anything:

  .method public hidebysig static !!T As<class T>(object o) cil managed aggressiveinlining
  {
        .custom instance void System.Runtime.Versioning.NonVersionableAttribute::.ctor() = ( 01 00 00 00 )
        .maxstack 1
        ldarg.0
        ret
  } // end of method Unsafe::As

This means that string s = F(); s.GetType(); can be SomeClass or string s = F(); (string)s can throw. In fact it might not throw because the C# compiler or the JIT might optimize that cast out.

Please document the usage rules for unsafe casts in general and document what exact "unsafe" operations are safe to perform on the CLR.

Note, that As<T> can be implemented right now using an unsafe union type e.g.

struct CastHelper<T1, T2> where T1, T2 : class
{ [FieldOffset(0)] T1 T1; [FieldOffset(0)] T2 T2; }

For that reason the question is relevant right now, even without the System.Runtime.CompilerServices.Unsafe package.

It should be documented as well what the CLR thinks about such unsafe structs. In particular I wonder what the optimizing JIT and the GC do with object references that point to an unexpected type.

The text was updated successfully, but these errors were encountered:

jkotas · 2016-06-19T16:42:34Z

These unsafe behaviors are not guaranteed to be the same accross runtimes. My description applies to CLR/CoreCLR.

GC does not care about the exact types, e.g. if type of local object reference variable is not compatible with what is actually stored in it, the GC will still track it fine. The lowest level constructs (load/store fields, call non-virtual non-generic method) work ok too - of course, you have to know what you are doing.

Most of the higher level constructs (virtual methods, etc.) do depend on exact types and can get confused:

Do not call virtual methods on the object references with mismatched types, or perform other similar operations that consult the actual type
Do not pass them to any framework or 3rd party libraries, because of they may do the above on your behalf

buybackoff · 2016-08-16T16:35:40Z

@jkotas Could you please confirm that the following example is "safe" in terms of GC stability, i.e. unsafe casting of an array KVP<DateTme,Decimal> to byte[] and then fixing the pointer to &bytes[0] and using it in unsafe context is "safe in unsafe context", that is it will not corrupt runtime.

In the code below, asBytes.Length is 2 and during debug it is still shown as KeyValuePair<DateTime, decimal>[], but then fixing and unsafe access works as expected. Is this correct usage of the new package? Any previous hacks did not succeed without this package.

using System.Runtime.CompilerServices.Unsafe;
[Test]
public unsafe void CouldUseNewUnsafePackage() {
    var dt = new KeyValuePair<DateTime, decimal>[2];
    dt[0] = new KeyValuePair<DateTime, decimal>(DateTime.UtcNow.Date, 123.456M);
    dt[1] = new KeyValuePair<DateTime, decimal>(DateTime.UtcNow.Date.AddDays(1), 789.101M);
    var obj = (object)dt;
    byte[] asBytes = Unsafe.As<byte[]>(obj);
    //Console.WriteLine(asBytes.Length); // prints 2
    fixed (byte* ptr = &asBytes[0]) {
        // reading this: https://github.com/dotnet/coreclr/issues/5870
        // it looks like we could fix byte[] and actually KeyValuePair<DateTime, decimal>[] will be fixed
        // because:
        // "GC does not care about the exact types, e.g. if type of local object 
        // reference variable is not compatible with what is actually stored in it, 
        // the GC will still track it fine."
        for (int i = 0; i < (8 + 16) * 2; i++) {
            Console.WriteLine(*(ptr + i));
        }
        var firstDate = *(DateTime*)ptr;
        Assert.AreEqual(DateTime.UtcNow.Date, firstDate);
        Console.WriteLine(firstDate);
        var firstDecimal = *(decimal*)(ptr + 8);
        Assert.AreEqual(123.456M, firstDecimal);
        Console.WriteLine(firstDecimal);
        var secondDate = *(DateTime*)(ptr + 8 + 16);
        Assert.AreEqual(DateTime.UtcNow.Date.AddDays(1), secondDate);
        Console.WriteLine(secondDate);
        var secondDecimal = *(decimal*)(ptr + 8 + 16 + 8);
        Assert.AreEqual(789.101M, secondDecimal);
        Console.WriteLine(secondDecimal);
    }
}

jkotas · 2016-08-16T18:04:17Z

It will work in CoreCLR today, but unsafe casting of KeyValuePair[] to byte[] is a bit questionable. There is no guarantee that KeyValuePair[] and byte[] will have same in-memory layout. Some runtimes may insert different amount of padding before the start of the first element based on the array element type.

A bit more robust way would be to do the unsafe cast on the address of the first array element. It requires current preview version of C# compiler (https://github.com/dotnet/roslyn/) and the unsafe package (https://github.com/dotnet/corefx/tree/master/src/System.Runtime.CompilerServices.Unsafe).

var dt = new KeyValuePair<DateTime, decimal>[2];
ref byte asRefByte = ref Unsafe.As<KeyValuePair<DateTime, decimal>, byte>(ref dt[0]);
fixed (byte * ptr = &asRefByte)
{


}

GSPP · 2016-08-16T18:07:08Z

There should be more formal documentation that can be used to clearly decide whether any given piece of code is safe or not. @jkotas your comments are helpful but not sufficient to decide on safety. For example, could the JIT optimize the following assuming that a and b do not alias?

class A { int X; }
class B { int X; }

var a = GetSomeA();
var b = GetSomeB();
a.X = 1;
b.X = 2; //Is b a different object? Who knows...

Console.WriteLine(a.X); //Prints: RefEquals(a, b) ? 2 : 1

If a and b can alias but the JIT assumes them to not alias based on type it might optimize this code to Console.WriteLine(1)!

There needs to be more formal and complete documentation on what effects unsafe operations can cause.

jkotas · 2016-08-16T18:35:23Z

I do not believe that he JIT makes assumptions about no-aliasing today. There are number of variants of it, e.g.:

void foo(ref int x, ref float y)
{
    x = 1;
    y = 2.0;
    if (x == 1) // Can JIT assume that x could not have been modified by the above assignment of float?
    {
       ...
    }
}

I agree it would be nice to have more precise specification for valid unveriable code - that includes unsafe casts. There is some specification for it in ECMA-335, but it is not as precise as the specification for verifiable code. Ideally it should be added to ECMA-335 spec.

cc @dotnet/jit-contrib

JosephTremoulet · 2016-08-16T20:30:32Z

Actually, if I compile this code with the current JIT:

    class A { public int x; }

    class B { public int y; }

    ...
        static int DoIt(A a, B b) {
            int x = a.x;
            b.y = 2;
            return x + a.x;
        }

then the codegen for DoIt does indeed assume that a.x and b.y don't alias:

; Assembly listing for method N.Foo:DoIt(ref,ref):int
; Emitting BLENDED_CODE for X64 CPU with SSE2
; optimized code
; rsp based frame
; partially interruptible
; Final local variable assignments
;
;  V00 arg0         [V00,T00] (  3,   3  )     ref  ->  rcx
;  V01 arg1         [V01,T01] (  3,   3  )     ref  ->  rdx
;  V02 tmp0         [V02,T02] (  2,   4  )     int  ->  rcx
;# V03 OutArgs      [V03    ] (  1,   1  )  lclBlk ( 0) [rsp+0x00]
;  V04 cse0         [V04,T03] (  3,   3  )     int  ->  rax
;
; Lcl frame size = 0

G_M49388_IG01:

G_M49388_IG02:
       8B4108               mov      eax, dword ptr [rcx+8]
       8BC8                 mov      ecx, eax
       C7420802000000       mov      dword ptr [rdx+8], 2
       03C1                 add      eax, ecx

G_M49388_IG03:
       C3                   ret

; Total bytes of code 15, prolog size 0 for method N.Foo:DoIt(ref,ref):int
; ============================================================

It looks like the assumption the JIT makes to acheive this is that different class fields (i.e. with different field handles) will not alias each other so long as CORINFO_FLG_OVERLAPPING_FIELDS is not set for them.

As a more general rule, I'm used to trusting that fields/locals/args/elements whose type is a reference type point to either null or an instance of the class type indicated, but trusting basically nothing about byrefs or unmanaged pointers. But that's my mental model and is informed by other projects, I'm not sure to what extent the current JIT code base actually makes use of such an assumption (aside from the case mentioned above).

GSPP · 2016-08-16T21:07:03Z

Looks like even BCL-internal code is treading on glass using these unsafe primitives.

And it would be a shame to lose optimizations based on aliasing. Also, even if a given optimization is not performed right now we probably don't want to commit ourselves to never perform it for compat reasons. Much stronger optimizers (e.g. LLVM/LLILC) are on the horizon.

CarolEidt · 2016-08-16T21:24:06Z

I think it would require some serious analysis of JIT behavior and the current specification, plus some language-lawyering, to come up with the right specification for how the JIT should behave with regard to these types of unsafe casts. However, I think that it is imperative that we retain the following invariants, from the JIT's perspective:

Type T is never referenced as type U, unless T derives from U.
A field reference T.f is never used to reference a field U.g, where f and g are different fields.

A short list, not at all exhaustive list of implications for developers when using unsafe casts:

Ensure that any unsafe references do not escape the method, OR that the references are mutually exclusive (i.e. they do not overlap) and are held only by the current method (e.g. it was allocated by the current method).
If code violates type safety for managed types (including value types, but excluding unsafe pointers to native memory) it should be marked [MethodImpl(MethodImplOptions.NoOptimization)]

DemiMarie · 2016-12-01T02:03:44Z

@CarolEldt So I assume that unsafe downcasts of discriminated union types (after having already verified that the cast is safe) are okay?

CarolEidt · 2016-12-01T03:40:20Z

@DemiMarie - Yes, at least from the JIT's perspective. Any value types that have overlapping fields are treated conservatively by the JIT.

danmoseley · 2017-04-13T04:33:55Z

@CarolEidt would it make sense to close this now?

CarolEidt · 2017-04-20T17:31:29Z

@danmosemsft - I'm not sure what is the correct way to handle this kind of issue. Ideally, it would be reassigned as a spec issue, and handled there, but we do not currently have an active standard working group, nor a mechanism for filing and tracking these kinds of issues. But, yes, I think this issue should be closed. I've made a note of it, for consideration in future spec work.

danmoseley closed this as completed Apr 20, 2017

msftgits transferred this issue from dotnet/coreclr Jan 31, 2020

msftgits added this to the Future milestone Jan 31, 2020

ghost locked as resolved and limited conversation to collaborators Dec 30, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Document the behavior of the CLR when unsafe casts are used #6176

Document the behavior of the CLR when unsafe casts are used #6176

GSPP commented Jun 18, 2016

jkotas commented Jun 19, 2016

buybackoff commented Aug 16, 2016

jkotas commented Aug 16, 2016

GSPP commented Aug 16, 2016

jkotas commented Aug 16, 2016

JosephTremoulet commented Aug 16, 2016

GSPP commented Aug 16, 2016 •

edited

Loading

CarolEidt commented Aug 16, 2016

DemiMarie commented Dec 1, 2016

CarolEidt commented Dec 1, 2016

danmoseley commented Apr 13, 2017

CarolEidt commented Apr 20, 2017

Document the behavior of the CLR when unsafe casts are used #6176

Document the behavior of the CLR when unsafe casts are used #6176

Comments

GSPP commented Jun 18, 2016

jkotas commented Jun 19, 2016

buybackoff commented Aug 16, 2016

jkotas commented Aug 16, 2016

GSPP commented Aug 16, 2016

jkotas commented Aug 16, 2016

JosephTremoulet commented Aug 16, 2016

GSPP commented Aug 16, 2016 • edited Loading

CarolEidt commented Aug 16, 2016

DemiMarie commented Dec 1, 2016

CarolEidt commented Dec 1, 2016

danmoseley commented Apr 13, 2017

CarolEidt commented Apr 20, 2017

GSPP commented Aug 16, 2016 •

edited

Loading