Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invariant Arrays #23393

Closed
mjp41 opened this issue Aug 30, 2017 · 21 comments
Closed

Invariant Arrays #23393

mjp41 opened this issue Aug 30, 2017 · 21 comments
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Runtime backlog-cleanup-candidate An inactive issue that has been marked for automated closure. no-recent-activity
Milestone

Comments

@mjp41
Copy link
Member

mjp41 commented Aug 30, 2017

Rationale and Proposed Usage

.NET arrays are covariant. These leads to many complexities in the implementation of the runtime for dealing with checking if assignments to the array are valid. For example,

object [] a = ...; //A
a[1]= "hello";  //B

The assignment labelled //B must check if the a allows strings to be stored in it. If the initialisation, had been:

object [] a = new Node[10];

this would raise an exception at //B.

There are clever/ugly workarounds to remove the runtime type checks, such as https://codeblog.jonskeet.uk/2013/06/22/array-covariance-not-just-ugly-but-slow-too/. But they do not provide nice interop as moving between covariant arrays and invariant arrays using this approach requires coping.

Generics negates many of the reasons for covariant subtyping on arrays. Let's add arrays that are invariantly typed, and thus do not require type checks for assignments.

This proposal is to wrap standard covariant arrays with a generic struct to make its use invariantly typed, and thus enable optimization at the runtime level. If we had

InvariantArray<Node> a = new InvariantArray<Node>(10);

We would simply not be allowed to cast it to a different type as generic parameters to classes and structs are invariant with respect to subtyping.

I think an initially beneficiary of this addition would be System.Collections.Generic.List, where this is used to store lots of objects the runtime type checking can be noticeable.

Proposed API

public struct InvariantArray<T>
{
    public InvariantArray(T[] array) {}
    public InvariantArray(int length) {}
    public ref T this[int index] { get }
    public int Length { get; }
    public T[] AsArray() { return null; }
    public static implicit operator T[](InvariantArray<T> ia) { return null; }
    public static void Copy<S,U>(InvariantArray<S> dst, U[] src) where U : S {}
    /// Other things from ArrayNative
}

This should be expanded to cover most of the things in System.Array.

Implementation sketch

I believe this could be implemented using similar tricks to those used in Span.

public struct InvariantArray<T>
{
    private T[] inner;

    public InvariantArray(T[] array)
    {
        if (array.GetType() == typeof(T[]))
        {
            inner = array;
        }
        else 
        {
            throw new Exception();
        }
    }

    public InvariantArray(int length)
    {
        inner = new T[length];
    }

    public ref T this[int index] 
    {
        get
        {
            // Replace with something like the following unverifiable code:
            // ldloc inner
            // ldloc index
            // readonly ldelema T
            return ref inner[index];
        }
    }

    public int Length => inner.Length;

    public T[] AsArray()
    {
        return inner;
    }
    
    public static implicit operator T[](InvariantArray<T> ia)
    {
        return ia.AsArray();
    }

    public static void Copy<S,U>(InvariantArray<S> dst, U[] src) where U : S
    {
        // Implement array copy without requiring type checking
    }

    /// Other things from ArrayNative
}
  • This could be initially implemented as an internal struct inside CoreCLR, where a few collections could be ported to use it. Most notably System.Collections.Generic.List. This could demonstrate the performance benefits.
  • There may need to be an additional Jit helper to implement the index property efficiently, but I believe this may just work with unsafe IL. So it could be possible to prototype on Corefxlab as was done with Span.
  • This approach would allow zero allocation interop with existing array code, where covariant subtyping has not be used. It is wrapping the covariant array, rather than making an internal wrapper, hence the private field can be returned to get a standard array.

Issues

  • The API needs fleshing out to reach parity with System.Array, but I wanted to gauge if invariant arrays might be taken before doing that work.
@Kosat
Copy link

Kosat commented Aug 30, 2017

As Jon Skeet rightfully concluded:

I think it would be a very rare application which noticed a significant performance boost here, but ...

People just don't write code like object [] a = in their projects.
From an idealistic perspective it can be done, but from a practical standpoint the value of this change goes to zero. Unlike Spans, which introduction boosts performance in so many places in corefx, this invariant array wrappers may be only beneficial for some obscure legacy non-generic collections and other .NET 1.0 remnants which nobody is supposed to use anyway. IMHO

@mjp41
Copy link
Member Author

mjp41 commented Aug 30, 2017

@Kosat thanks for the comment. This proposal is not about code that uses object [] a. The runtime has checks that impact every array assignment of a reference type as it must check the types are compatible. Most code will never fail these checks, but they are required because the type system does not rule out covariant subtyping of arrays.

For example, you can see some checks for coreclr here:

https://github.com/dotnet/coreclr/blob/32f0f9721afb584b4a14d69135bea7ddc129f755/src/vm/amd64/JitHelpers_Fast.asm#L790

The common case is where the element type and the type of the value are equal. This still has to perform tests for null (as it doesn't have a method table), and equality of types. Now potentially the JIT can lift some checks out of loops, but by actually providing proper invariant arrays the performance can be directly improved, by removing these checks in the type system.

@Kosat
Copy link

Kosat commented Aug 31, 2017

Thank you for the explanation, but with current implementation of JIT_Stelem_Ref, which you reffered to, when we've got an exact match between types (e.g. class A in line 5)

class A{}
 
A var1 = new A();
A[] arr = new A[1];
arr[0] = var1; //line 5

when it comes to type checking, we only pay for comparing the array element type with the type of the rvalue

mov     r9, [r10 + OFFSETOF__MethodTable__m_ElementType]   ; 10h -> typehandle offset

; check for exact match
cmp     r9, [r8]
jne     NotExactMatch
...

In our case they match so we directly processed to writing bits into array. That's pretty much all if types are the same and there is naturally more overhead if there's a typecasting involved.

If you want to get rid of those 3 instructions without doing struct wrappers like in Skeet's blog post, you still need to somehow make the static compiler (Roslyn) do that checks for you so jit would gain luxury of not doing those checks in runtime.

How can it be done to disable polymorphism in line 5 so that compiler would prevent me from doing this?

class A  {  }
class B: A {  }
 		
InvariantArray<A> arr3 = new InvariantArray<A>(10);
arr3[0] = new B(); //line 5

Regarding nulls, there will be changes in C# 8.
https://channel9.msdn.com/Blogs/Seth-Juarez/A-Preview-of-C-8-with-Mads-Torgersen

@mjp41
Copy link
Member Author

mjp41 commented Aug 31, 2017

How can it be done to disable polymorphism in line 5 so that compiler would prevent me from doing this?

This does not need to be disabled. It is perfectly valid to perform the assignment unchecked, if you do not allow covariant array subtyping. With invariant arrays, it is not that every element is of that precise type. it is that every view of this as an array thinks it is the same type of array. Covariant subtyping on arrays is generally a bad idea, but is very useful if you don't have generics.

My plan was to use a particular IL instruction which disables the type checking component of array access (readonly ldelema T). I need to implement the mapping, to see if it works, and will do that in the next day or so. Verification would forbid assigning to this address, it is called "controlled mutability" in the ECMA spec, but by disabling verification for this method, we can bypass that issue. This type of trick was done for a lot of the Span work, but I don't believe this approach was used, so needs testing.

@benaadams
Copy link
Member

benaadams commented Sep 1, 2017

If you put a fast Span (.NET Core 2.0+) over the array and assign using the span rather than via the array; does that give you invariance?

e.g.

string [] a = ...; //A
Span<string> s = a;
s[1] = "hello";  // Invariant array assignment

@benaadams
Copy link
Member

benaadams commented Sep 1, 2017

Though you pay a type check on construction so isn't better for single assignments, only if you were doing multiple assignments in a loop etc; since it could only live on the stack and needs to be created each time.

List would likely be the single assignment case.

@mjp41
Copy link
Member Author

mjp41 commented Sep 1, 2017

@benaadams I think there is potential to use Span to replace some of the bulk operations on arrays, but I don't think it will work for single assignment case. Although, the JIT has more potential to optimise the Span as the type comparison is done in managed code, rather than the C++ runtime.

So far for use cases I think

The ToArray calls and the temporary arrays in SortedSet could both be optimised without code rewriting by a simple analysis in the JIT or AOT. It is statically simple for the compiler to see that the array is not covariantly typed. The others are fields of an object, so requiring a much more complex analysis to show they are invariantly typed. This would certainly be beyond the analysis that the JIT could do.

@benaadams
Copy link
Member

@mjp41 agreed, it needs to be a heap object you can store with a single initialization, rather than span else will be paying invariance check cost multiple times anyway and it wouldn't help.

@Suchiman
Copy link
Contributor

Suchiman commented Apr 3, 2018

Instead of using an array wrapper, what about using an attribute, similiar to SkipLocalsInit

@jkotas
Copy link
Member

jkotas commented Apr 3, 2018

@Suchiman I like your idea of enabling this optimization without requiring code changes everywhere! We can start with implementing it in ILLinker to measure the benefit and prove its usefulness, and maybe later in Roslyn. We have taken the same journey for SkipLocalsInit. The implementation in ILLinker can:

  • Do escape analysis to identify all arrays that are just used as internal implementation detail (that do not escape outside the assembly - assuming there is no private reflection) and that do not use array covariance
  • Use alternative more efficient sequence for storing elements into them.

@benaadams
Copy link
Member

Raised issue for automatic conversion via escape analysis dotnet/linker#296

However it wouldn't cover all cases; i.e. where escape analysis couldn't be performed, manual opt in.

@GrabYourPitchforks
Copy link
Member

Resurrecting this old conversation. If the automatic escape analysis is still some way off, would it make sense to implement InvariantArray as an internal type for now and experiment with plumbing it through our collections? That would prevent a proliferation of unsafe code (it'd all be confined to a single file) and could allow us to collect the evidence we need that the escape analysis improvement would be worthwhile to investigate.

@jkotas
Copy link
Member

jkotas commented Mar 29, 2019

Take a look at the conversation in dotnet/coreclr#17386. Wrapping arrays in InvariantArray tends to disable optimizations that are happily kicking in today...

@GrabYourPitchforks
Copy link
Member

GrabYourPitchforks commented Mar 29, 2019

@jkotas I just did a quick experiment on my box and JIT seems to be eliding the bounds check correctly, and InvariantArray<T>.get_Item is being inlined into List<T>.Add, and List<T>.Add is also being inlined into its caller. Aside from the register allocator being a bit too greedy with eax, it seems quite promising.

00007ffc`6dd9ceeb ff470c          inc     dword ptr [rdi+0Ch]  ; _version++
00007ffc`6dd9ceee 488b5710        mov     rdx,qword ptr [rdi+10h]  ; make local enregistered copy of _items
00007ffc`6dd9cef2 8b4f08          mov     ecx,dword ptr [rdi+8]  ; make local enregistered copy of _size
00007ffc`6dd9cef5 8b4208          mov     eax,dword ptr [rdx+8]  ; unnecessary local copy of _items.Length
00007ffc`6dd9cef8 3bc1            cmp     eax,ecx  ; compare size to _items.Length
00007ffc`6dd9cefa 7618            jbe     TO_RESIZE_CODE
00007ffc`6dd9cefc 8d4101          lea     eax,[rcx+1]
00007ffc`6dd9ceff 894708          mov     dword ptr [rdi+8],eax
00007ffc`6dd9cf02 4863c9          movsxd  rcx,ecx
00007ffc`6dd9cf05 488d4cca10      lea     rcx,[rdx+rcx*8+10h]
00007ffc`6dd9cf0a 488bd6          mov     rdx,rsi
00007ffc`6dd9cf0d e8deab8a5f      call    CoreCLR!JIT_CheckedWriteBarrier (00007ffc`cd647af0)

https://github.com/GrabYourPitchforks/coreclr/blob/4c9c6ab0b706b0f50cb845b5d94896303b580a75/src/System.Private.CoreLib/shared/System/Collections/Generic/List.cs#L202

https://github.com/GrabYourPitchforks/coreclr/blob/4c9c6ab0b706b0f50cb845b5d94896303b580a75/src/System.Private.CoreLib/shared/System/InvariantArray.cs#L74

Method Toolchain Mean Error StdDev Ratio RatioSD
AddObjToObjList 3.0-master 33.34 us 0.6181 us 0.6070 us 1.00 0.00
AddObjToObjList invariantarray 28.99 us 0.4035 us 0.3369 us 0.87 0.02
AddStringToObjList 3.0-master 44.22 us 0.6714 us 0.5952 us 1.00 0.00
AddStringToObjList invariantarray 31.50 us 0.6296 us 0.8619 us 0.71 0.01
AddIntToIntList 3.0-master 15.71 us 0.1710 us 0.1516 us 1.00 0.00
AddIntToIntList invariantarray 16.21 us 0.2210 us 0.1725 us 1.03 0.01

@mjp41
Copy link
Member Author

mjp41 commented Mar 29, 2019

Nice, that looks like it might finally be a win. Is the AddIntToIntList test a repeatable perf regression? It looks above noise.

@GrabYourPitchforks
Copy link
Member

@mjp41 It's noise. A subsequent run showed 0.98, not 1.03. The get_Item indexer benchmark (not shown) is also noise: bounces around .99 and 1.02.

@GrabYourPitchforks
Copy link
Member

I'll spin up a new PR with the proposed changes.

@GrabYourPitchforks
Copy link
Member

New PR: dotnet/coreclr#23571

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 5.0 milestone Jan 31, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@joperezr joperezr removed the untriaged New issue has not been triaged by the area owner label Jul 6, 2020
@joperezr joperezr modified the milestones: 5.0.0, Future Jul 6, 2020
@Neme12
Copy link

Neme12 commented May 11, 2022

I think instead of a separate type, there should be a runtime mode that would disable array variance completely, and this would be enabled for new projects.

@dakersnar dakersnar added the backlog-cleanup-candidate An inactive issue that has been marked for automated closure. label Dec 13, 2022
@ghost
Copy link

ghost commented Dec 13, 2022

Due to lack of recent activity, this issue has been marked as a candidate for backlog cleanup. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will undo this process.

This process is part of our issue cleanup automation.

@ghost ghost added the no-recent-activity label Dec 13, 2022
@ghost
Copy link

ghost commented Dec 28, 2022

This issue will now be closed since it had been marked no-recent-activity but received no further activity in the past 14 days. It is still possible to reopen or comment on the issue, but please note that the issue will be locked if it remains inactive for another 30 days.

@ghost ghost closed this as completed Dec 28, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Jan 27, 2023
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Runtime backlog-cleanup-candidate An inactive issue that has been marked for automated closure. no-recent-activity
Projects
None yet
Development

No branches or pull requests