Add MemoryExtensions to CoreLib and type forward for .NET Core #24107

ahsonkhan · 2017-11-10T21:46:28Z

A larger concern for me around the extension method isn't that they're exposed from System.Memory.dll, but that they're not available for use inside corelib, such that we've had to duplicate some of the functionality in order to implement various things in corelib, and those duplicated implementations aren't as good / efficient / complex as the ones in System.Memory.dll.
https://github.com/dotnet/coreclr/blob/master/src/mscorlib/shared/System/StringSpanHelpers.cs, for example the trivial IndexOf implementation here versus the optimized implementation in https://github.com/dotnet/corefx/blob/master/src/System.Memory/src/System/SpanExtensions.cs
This is a good reason why to move SpanExtensions into CoreLib.

From thread (for context): dotnet/corefx#25120 (comment)

cc @KrzysztofCwalina, @jkotas, @stephentoub

benaadams · 2017-11-10T22:20:28Z

SpanHelpers uses vectors though which aren't in coreclr?

ahsonkhan · 2017-11-10T22:26:49Z

SpanHelpers uses vectors though which aren't in coreclr?

This is why I had initially split the SpanExtension class into separate files and SpanExtensions.cs contained methods that could not be implemented in core:
https://github.com/dotnet/corefx/blob/master/src/System.Memory/src/System/SpanExtensions.cs#L13

stephentoub · 2017-11-10T22:50:35Z

Can we instead use the new intrinsics? My understanding is that those will be available in corelib.

benaadams · 2017-11-10T23:07:21Z

benaadams · 2017-11-10T23:13:19Z

I feel I let myself down with that meme; apologies...

jkotas · 2017-11-10T23:33:58Z

Just move the dozen .cs files that implement Vector<T> and friends to CoreLib as well.

System.Numerics.Vectors is left over from the too fine grained factoring era. It is tightly coupled with the JIT, so it is not like that the two can version independently from the runtime in a reasonable way. CoreLib is the right place for it.

Can we instead use the new intrinsics? My understanding is that those will be available in corelib.

Eventually. I am not sure whether Intel will be able to get all of them in place for .NET Core 2.1.

Also, Vector<T> could be eventually implemented using the new intrinsics as well - it would be simple that way than the current implementation baked into the JIT.

KrzysztofCwalina · 2018-01-04T00:17:15Z

In addition, there are some duplicated methods in the platform now:

    public static class Span {
        public static ReadOnlySpan<byte> AsBytes<T>(this ReadOnlySpan<T> source) where T : struct;
        public static Span<byte> AsBytes<T>(this Span<T> source) where T : struct;

        public static ReadOnlyMemory<char> AsReadOnlyMemory(this string text);

        public static ReadOnlySpan<char> AsReadOnlySpan(this string text);

        public static ReadOnlySpan<TTo> NonPortableCast<TFrom, TTo>(this ReadOnlySpan<TFrom> source) where TFrom : struct where TTo : struct;
        public static Span<TTo> NonPortableCast<TFrom, TTo>(this Span<TFrom> source) where TFrom : struct where TTo : struct;        public static bool TryGetString(this ReadOnlyMemory<char> readOnlyMemory, out string text, out int start, out int length);
    }

    public static class MemoryExtensions {
        public static ReadOnlySpan<byte> AsBytes<T>(this ReadOnlySpan<T> source) where T : struct;
        public static Span<byte> AsBytes<T>(this Span<T> source) where T : struct;

        public static ReadOnlyMemory<char> AsReadOnlyMemory(this string text);

        public static ReadOnlySpan<char> AsReadOnlySpan(this string text);

        
        public static ReadOnlySpan<TTo> NonPortableCast<TFrom, TTo>(this ReadOnlySpan<TFrom> source) where TFrom : struct where TTo : struct;
        public static Span<TTo> NonPortableCast<TFrom, TTo>(this Span<TFrom> source) where TFrom : struct where TTo : struct;
        
        public static bool TryGetString(this ReadOnlyMemory<char> readOnlyMemory, out string text, out int start, out int length);
    }

We should get rid of this duplication as part of this workitem

eerhardt · 2018-01-10T21:32:15Z

I'm not sure moving the Vector class into corelib is going to be that simple. The System.Numerics.Vectors library ships OOB from .NET Core.

The hardware intrinsics are coming along. Maybe we should re-evaluate whether we can use them or not.

/cc @fiigii @tannergooding @CarolEidt - do you have a timeline when each ISA will be complete? Specifically, it looks like for sure we would need Sse2 implemented in order to do int IndexOf(ref byte searchSpace, byte value, int length).

fiigii · 2018-01-10T21:46:46Z

do you have a timeline when each ISA will be complete?

I am trying my best to ship all the Avx/Avx2 intrinsics in .NET Core 2.1. @tannergooding and @4creators are helping me to extend 2.1 with Sse and Sse2.

jkotas · 2018-01-10T21:50:12Z

The System.Numerics.Vectors library ships OOB from .NET Core.

@eerhardt It is not OOB for NET Core. It is inbox on .NET Core and it is tigthly coupled with the JIT.

tannergooding · 2018-01-10T21:51:52Z

I have Sse essentially complete for the basic codegen scenario (minus the name changes and the couple missing APIs that were just done that I need to account for).

Provided it gets proper review/sign-off, I think Sse could be ready for 2.1.

I believe the majority of the remaining work on Sse is containment analysis for better codegen (which I have started on) and adding a lot more tests.

Getting the table-driven PR (dotnet/coreclr#15749) resolved and merged will also greatly help accelerate some of these efforts (as less duplicate code needs to be written).

eerhardt · 2018-01-10T21:52:36Z

It is not OOB for NET Core. It is inbox on .NET Core and it is tigthly coupled with the JIT.

Agreed. But we also ship a NuGet package out of band that supports netstandard, netcoreapp, xamarin, net46 TFMs. See https://www.nuget.org/packages/System.Numerics.Vectors/

tannergooding · 2018-01-10T21:53:54Z

It is inbox on .NET Core and it is tigthly coupled with the JIT.

Pending investigation, performance checks, etc. It might also be possible to rewrite the library to use HWIntrinsics and decouple it from the JIT in the future

jkotas · 2018-01-10T21:56:01Z

we also ship a NuGet package out of band that supports

Right. Nothing would change about this NuGet package. We are on the same plan for number of other NuGet packages.

eerhardt · 2018-01-10T21:57:21Z

So the code would be duplicated between coreclr and corefx?

jkotas · 2018-01-10T21:58:27Z

Right. Nothing would change about this NuGet package. We are on the same duplicated code plan for number of other NuGet packages. E.g. ArrayPool or Span - lives in CoreLib for .NET Core, and slightly different more portable slower implementation ships in NuGet package.

jkotas · 2018-01-10T22:00:04Z

https://github.com/dotnet/corefx/issues/25188 is about making management of the duplicates easier.

4creators · 2018-01-10T22:10:19Z

Regarding SSE2 it should be possible to get it shipped with 2.1, it's close to be ready. It should not be a problem to integrate it with new jit tablegen for intrinsics.

fiigii · 2018-01-11T03:49:20Z

Moving S.N.Vector down to mscorlib looks not a good idea, which requires many changes in RyuJIT/VM (most changes are about intrinsic recognition). And these changes would become useless soon if we implement S.N.Vector in hardware intrinsic in the future.

@ahsonkhan Which vector intrinsics do you need? Maybe I can try to enable them in Intel HW intrinsics at first.

jkotas · 2018-01-11T03:52:49Z

Which vector intrinsics do you need?

We need Vector<T>.

tannergooding · 2018-01-11T03:53:35Z

We need Vector.

So a good portion of Sse, Sse2, Avx, and Avx2

jkotas · 2018-01-11T03:54:51Z

And we do not want the complexity of the implementation to go through the roof. E,g. we want the following IndexOf method to be CoreLib: https://github.com/dotnet/corefx/blob/master/src/System.Memory/src/System/SpanHelpers.byte.cs#L96

jkotas · 2018-01-11T03:56:54Z

which requires many changes in RyuJIT/VM

I do not think it is many changes.

fiigii · 2018-01-11T04:08:05Z

I do not think it is many changes.

For the intrinsic types, we just need to add [Intrinsic] for the types of Vector<T>/Vector2/Vector3... and change a bit code in RyuJIT.
For the intrinsic methods,
- if we do not want the many changes, the current recognition system should be maintained. That extends the scope of searching Vector<T> intrinsics from S.N.Vector.dll to the whole mscorlib. JIT-throughput would be significantly impacted.
- if we want Vector<T> intrinsics to be recognized by the new-style named intrinsics ([Intrinsic]), I guess more engineering efforts would be required.

cc @AndyAyersMS @CarolEidt

CarolEidt · 2018-01-11T04:46:36Z

if we want Vector intrinsics to be recognized by the new-style named intrinsics ([Intrinsic]), I guess more engineering efforts would be required.

I think that's the only reasonable way to do this.

weshaggard · 2018-01-11T16:58:52Z

Right. Nothing would change about this NuGet package. We are on the same duplicated code plan for number of other NuGet packages. E.g. ArrayPool or Span - lives in CoreLib for .NET Core, and slightly different more portable slower implementation ships in NuGet package.

@jkotas unlike Span and ArrayPool, Vectors ships an OOB implementation for netcoreapp which allows the application to override the inbox version cleanly. I'm not sure if we have a great scenario to ship updates with the application but we eliminate that as a possibility if we move it into CoreLib. We also require the JIT intrinsics to be different and I suspect we will need to make them recognize both CoreLib and Vectors library for finding them. Beyond the intrinics we may need to carry the IL fallback implementation in corelib which is a decent size increase (~228KB). To me that seems like a pretty high price just for one method in Span.

CarolEidt · 2018-01-11T17:29:53Z

I can see both sides of this, but I hope that we don't make a decision in the interest of short-term expediency, rather than what is the best long-term approach.

Perhaps there is some obvious flaw in my thinking, but I believe that one of the main sources of size & complexity of the IL implementation is that it must support variable-sized vectors. At that time, we didn't use the "mustExpand when recursive" approach, so the IL implementation had to support variable-length vectors in the event that the method was exposed indirectly, or was exposed for the purposes of diagnostic tools, etc.

What if the IL implementation only had to deal with a default vector size (e.g. 16 bytes)? It could look something like this (pseudo code for method Foo on a Vector v)

if (Vector.IsAccelerated)
{
    Foo();
}
else
{
    ThrowIfVectorSizeNot16Bytes();
    // code implementing for 16-byte vector
}

What have I missed? If we went the route of supporting both methods for recognizing Vector<T> and its methods, this is something that could be phased in without having to do so in lock-step.

tannergooding · 2018-01-11T17:46:10Z

What if the IL implementation only had to deal with a default vector size (e.g. 16 bytes)? It could look something like this (pseudo code for method Foo on a Vector v)

That makes sense to me. The code should be functionally equivalent to before, so back-compat doesn't seem like a concern.

It would likely also produce a perf improvement for any indirect calls to the methods.

I put some of my thoughts on this (the general topic) here, but it can basically be summed up as:

If we think Vector<T> will be more generally used throughout corlib, we should probably move it. The biggest argument against it is being able to updated the library (System.Numerics.Vector) OOB, but I'm not sure the scenarios where that is useful are outweigh the cost of duplicating code between the libraries.
If its only for a couple methods and it isn't going to be more widely used, then we probably shouldn't move it (at least right now).

jkotas · 2018-01-11T17:47:10Z

Vectors ships an OOB implementation for netcoreapp which allows the application to override the inbox version cleanly.

I would not call this a clean override. Vectors are coupled with the JIT. If app overrides the inbox version and there are new methods added, they will not be accelerated by the old JIT and it is hard to reason about how to write a performant code against it under these conditions.

When we were versioning System.Numerics independently in .NET Framework, I have seen people writing code that checked clrjit.dll version to figure out which methods on Vector are good and fast, and then pick the right algorithm implementation based on that.

To me that seems like a pretty high price just for one method in Span.

It is not just one method in Span. There are number of them on Span, and folks wanted to use in CoreLib outside the Span for performance optimizations.

IL fallback implementation in corelib which is a decent size increase (~228KB)

This is just moving the bytes from one .dll to other .dll that ship together. I do not think it has any material negative impact at end-to-end metrics, like .NET Core runtime distro size. In fact, having this in corelib sets us up to not compile in the IL fallback on platforms that do not need it. E.g. we require SSE2+ on x86 and x64 today and so it should be safe to assume that Vector<T> will be always accelerated on these platforms.

I think that the alternative is to leave System.Numerics alone and create a smaller internal Vector<T> for CoreLib as @tannergooding explained in dotnet/coreclr#15821 (comment) . We have prior art for this. It would be fine with me too if you think it is better.

4creators · 2018-01-11T18:03:36Z

There is large group of most basic algorithms which were successfully vectorized and in .NET Core large part of them lives in CoreLib. Once Vector<T> will get expanded with functionality required to implement them (shuffle, shifts, scatter, gather etc) which are missing right now new implementation possibilities will open up. Examples: string search, string comparison, string length, sorting (quick sort and merge sort), fast collections, encoding conversions i.e. base64, utf-8, and many others.

weshaggard · 2018-01-11T18:56:09Z

I don't personally have a strong opinion in either direction I just wanted to make sure we are considering all the options and understand the implications. Looks like @eerhardt and @ViktorHofer are the current owners for this library according to https://github.com/dotnet/corefx/blob/master/Documentation/project-docs/issue-guide.md so the decision is up to them.

ahsonkhan · 2018-01-11T18:57:10Z

cc'ing @terrajobst as well

benaadams · 2018-01-11T22:58:53Z

Could type forward the Span methods and use intrinsics rather than Vector<T> #if def'd on platform?

Would mean it would be a completely different implementation to corefx though

ahsonkhan · 2018-01-17T00:12:47Z

To be clear, we will still have the MemoryExtensions class with all the APIs implemented in System.Memory.dll for the portable span. For netcore specifically though, we will type forward down to corelib. This will result in some duplication (which we can manage by the mirroring that was recently setup, cc @tarekgh). Does anyone have concerns with this approach?

@eerhardt, what's the verdict?

eerhardt · 2018-01-17T15:33:02Z

Do we have to duplicate the whole assembly? Or could just Vector<T> be mirrored/duplicated?

How much work would the smaller internal Vector<T> for CoreLib be?

ahsonkhan · 2018-01-22T19:16:26Z

How much work would the smaller internal Vector<T> for CoreLib be?

I am not really sure since I am still unfamiliar with S.N.V code. I don't know what changes are required on the JIT side, but I would imagine that work would remain unchanged regardless of whether we duplicate the whole assembly or only have an internal Vector<T>.

cc @tannergooding, @AndyAyersMS - can you please confirm?

jkotas · 2018-01-22T19:39:36Z

The work required for internal Vector<T> is a quite a bit more (weeks) than just the work required to moving Vector<T> to CoreLib. The reason is that the JIT and runtime have to treat two Vector<T> types in the special way, not just one - going from 1 to 2 of anything tends to always require non-trivial changes. Plus there is cost of the ongoing maintenance for the dual scheme, and making sure that that the internal Vector<T> actually works because of there would not be any direct test coverage for it.

CarolEidt · 2018-01-22T23:53:21Z

It seems that the best approach is to move Vector<T> to CoreLib. Are there motivating factors against making that move? It seems that most of the discussion has reinforced that choice.

ahsonkhan · 2018-02-28T07:47:03Z

Done

AceHack · 2019-04-12T22:47:15Z

Where can I find AsBytes? I need this function.

jkotas · 2019-04-13T05:06:51Z

@AceHack https://www.bing.com/search?q=.NET%20AsBytes

AceHack · 2019-04-13T16:20:04Z

Wow I googled for it and could not find it. Thanks.

ahsonkhan self-assigned this Jan 11, 2018

ahsonkhan changed the title ~~Move MemoryExtensions down to CoreLib~~ Add MemoryExtensions to CoreLib and typeforward for .NET Core Jan 17, 2018

ahsonkhan changed the title ~~Add MemoryExtensions to CoreLib and typeforward for .NET Core~~ Add MemoryExtensions to CoreLib and type forward for .NET Core Jan 17, 2018

ahsonkhan closed this as completed Feb 28, 2018

msftgits transferred this issue from dotnet/corefx Jan 31, 2020

msftgits added this to the 2.1.0 milestone Jan 31, 2020

davidwrighton mentioned this issue Jan 31, 2020

Remove workaround for using Arm64 without updating System.Reflection.Metadata #25278

Closed

ghost locked as resolved and limited conversation to collaborators Dec 19, 2020

Add MemoryExtensions to CoreLib and type forward for .NET Core #24107

Add MemoryExtensions to CoreLib and type forward for .NET Core #24107

Comments

ahsonkhan commented Nov 10, 2017

benaadams commented Nov 10, 2017

ahsonkhan commented Nov 10, 2017

stephentoub commented Nov 10, 2017

benaadams commented Nov 10, 2017

benaadams commented Nov 10, 2017

jkotas commented Nov 10, 2017 • edited Loading

KrzysztofCwalina commented Jan 4, 2018

eerhardt commented Jan 10, 2018

fiigii commented Jan 10, 2018

jkotas commented Jan 10, 2018 • edited Loading

tannergooding commented Jan 10, 2018

eerhardt commented Jan 10, 2018 • edited Loading

tannergooding commented Jan 10, 2018

jkotas commented Jan 10, 2018

eerhardt commented Jan 10, 2018

jkotas commented Jan 10, 2018

jkotas commented Jan 10, 2018

4creators commented Jan 10, 2018

fiigii commented Jan 11, 2018

jkotas commented Jan 11, 2018

tannergooding commented Jan 11, 2018

jkotas commented Jan 11, 2018

jkotas commented Jan 11, 2018

fiigii commented Jan 11, 2018

CarolEidt commented Jan 11, 2018

weshaggard commented Jan 11, 2018

CarolEidt commented Jan 11, 2018

tannergooding commented Jan 11, 2018

jkotas commented Jan 11, 2018 • edited Loading

4creators commented Jan 11, 2018

weshaggard commented Jan 11, 2018

ahsonkhan commented Jan 11, 2018

benaadams commented Jan 11, 2018

ahsonkhan commented Jan 17, 2018 • edited Loading

eerhardt commented Jan 17, 2018

ahsonkhan commented Jan 22, 2018

jkotas commented Jan 22, 2018

CarolEidt commented Jan 22, 2018

ahsonkhan commented Feb 28, 2018

AceHack commented Apr 12, 2019

jkotas commented Apr 13, 2019

AceHack commented Apr 13, 2019

jkotas commented Nov 10, 2017 •

edited

Loading

jkotas commented Jan 10, 2018 •

edited

Loading

eerhardt commented Jan 10, 2018 •

edited

Loading

jkotas commented Jan 11, 2018 •

edited

Loading

ahsonkhan commented Jan 17, 2018 •

edited

Loading