Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] Metamorphic treatment for arrays and scalar values #40418

Open
msedi opened this issue Aug 5, 2020 · 4 comments
Open

[Proposal] Metamorphic treatment for arrays and scalar values #40418

msedi opened this issue Aug 5, 2020 · 4 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Runtime
Milestone

Comments

@msedi
Copy link

msedi commented Aug 5, 2020

Motivation

For mathematical operations there is currently no support in C# that allows "vectorized" operations on arrays and Spans (simply called vectors here). For that reason we have created our own library that is able to do this. For a simple operation like Add this looks like (shorted):

public static T[] Add<T>(T[] a, T[] b)
{
  T[] = new T[a.Length];

 for (int i=0; i<a.Length; i++)
  c[i]=a[i]+b[i]; // This does not work currently but for the sake of brevity it shown like this.

return c;
}

But it's not always like only adding two vectors, often you need to have a vector+scalar operation. So the methods above extend to:

public static T[] Add<T>(T[] a, T[] b) ;
public static T[] Add<T>(T[] a, T b);
public static T[] Add<T>(T a, T[] b);

This is causing a lot of code writing where the inner core of the loop as shown above is always the same. But instead of incrementing through the vector the scalar stays constant. The methods shown above create their result by allocating new memory, but memory reasons we also need to have methods were you can hand over memory from the array, memory pool or other sources. So in the end for each operation we have 6 variants, with slightly different treatment due to scalar and vector differences. And the examples repeat for numerous math routines - and - the code gets even more complex when using intrinsics or Vector.

With the advent of Span we have also rewritten our methods so that they now use Span

public static T[] Add<T>(ReadOnlySpan<T> a, ReadOnlySpan<T> b) ;
public static T[] Add<T>(ReadOnlySpan<T> a, T b);
public static T[] Add<T>(T a, ReadOnlySpan<T> b);

Writing all of these operations is highly prone to errors. .NET/C# fortunately introduce new performance improvements that sometimes force us to update the whole code base to gather the performance improvements (which we of course highly like ;-) so at this point - thank you for that!

Proposal

I suggest a type that always mimics to be a vector type but has the ability to adapt it behavior. In the end it a form of an iterator with the same performance of a vector or a scalar depending on what type it was initialized with. It should somehow be like following example

   public ref struct Value<T> where T : struct
    {
        private readonly ReadOnlySpan<T> Vector;
        private readonly T Scalar;
        private readonly bool IsVector;
        public readonly int Length;

        public Value(T scalar)
        {
            IsVector = false;
            Vector = default; // Should be omitted somehow because not needed later
            Scalar = scalar;
            Length = 1;
        }

        public Value(ReadOnlySpan<T> vector)
        {
            IsVector = false;
            Vector = vector;
            Scalar = default; // Should be omitted somehow because not needed later
            Length = vector.Length;
        }

        public T this[int i]
        {
            get
            {
                // Either or of these calls below should be omitted depending on IsVector.
                // And the call should be optimized to be as fast as the native access when doing vector or scalar operations.
                if (IsVector) return Vector[i];
                return Scalar;
            }
        }

        public static implicit operator Value<T>(T[] vector) => new Value<T>(vector);
        public static implicit operator Value<T>(Span<T> vector) => new Value<T>(vector);
        public static implicit operator Value<T>(T scalar) => new Value<T>(scalar);

        public Vector<T> AsVector(int i)
        {
            // Either or of these calls below should be omitted depending on IsVector.
            // And the call should be optimized to be as fast as the native access when doing vector or scalar operations.
            if (IsVector) return new Vector<T>(Vector.Slice(i, Vector<T>.Count));
            return new Vector<T>(Scalar); // Should be only done once for scalars
        }
    }

So that I'm able to write a method like this:

 public static T[] Add<T>(Value<T> a, Value<T> b)
        {
            T[] c = new T[a.Length];

            for (int i = 0; i < a.Length; i++)
                c[i] = a[i] + b[i];

            return c;
        }

and can further be used like this

  static void Main(string[] args)
        {
            int[] A = new int[100];
            int[] B = new int[100];
            var C = Add<int>(A, B);
        }

Keep in mind, the examples above are not fully functional.

Open questions

Currently most of the things can already be done as shown in proposed Value type. The only problem is that there is no runtime support that is able to see which code parts can be elided at runtime. My benchmarks of the implementations above show a tremedous performance drop when trying to do something like the above.

This is the reason why I called this type metamorphic because in contrast to regular generics the structure is not only bound to the type and a version of this type can be created, it is also dependent on the datatype that is used during construction of the type.

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Aug 5, 2020
@Dotnet-GitSync-Bot
Copy link
Collaborator

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@svick
Copy link
Contributor

svick commented Aug 5, 2020

To me, the proposed Value<T> seems to be quite specific for your needs and not widely usable. That means it would not be worth adding to .Net, especially if adding it would require runtime changes.

It seems you already have a solution that works and the only problem is that it's very repetitive (with all the issues that come from that). To improve that, I'd suggest using some form of code generation, possibly using C# Source Generators (currently in preview).

Another approach would be to have the methods be generic over ref struct types. That's currently not possible, but it's being considered for C# 10.0: dotnet/csharplang#1148 (comment).

@msedi
Copy link
Author

msedi commented Aug 6, 2020

@svick : In the end you are right it's specific to my needs. Here I only wanted to show that the runtime is currently not able to optimize things that are constant at the time of the construction of the object.

So for example the IsVector could be completely elided if it's not a vector and the call could be competely inlined (see the following code). The type is defined at "JIT" time, but the IsVector is defined at construction time. The runtime can therefore maybe not detect that this is also a "JIT" time constant and I cannot express it.

 public T this[int i]
        {
            get
            {
                // Either or of these calls below should be omitted depending on IsVector.
                // And the call should be optimized to be as fast as the native access when doing vector or scalar operations.
                if (IsVector) return Vector[i];
                return Scalar;
            }
        }

So in the end the runtime needs to identify constant structures that I can currently only express with readonly which is maybe to weak.

In C++ there was something for template where in the template you could not only hand over the type but also a value (in my case a bool) which would fit the need here but it's maybe a bit to specific, but I would also prefer a more generic solution. Something like

Value<T, bool>

where I can simply write

new Value<float,true>()

which constructs a type variant that runtime is able to optimize the in the required way. This is still of course written in my specific need but I think there's more to get when you could something like described above.

@joperezr joperezr added api-suggestion Early API idea and discussion, it is NOT ready for implementation and removed untriaged New issue has not been triaged by the area owner labels Aug 27, 2020
@joperezr joperezr added this to the Future milestone Aug 27, 2020
@msedi
Copy link
Author

msedi commented Aug 28, 2020

Just as a reference some people call this requirement frugal object or discriminated union. There is also some similar implemention for strings called StringValues in Microsoft.Extensions.Primitives. So obviously despite my needs there is also need from others ;-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Runtime
Projects
None yet
Development

No branches or pull requests

5 participants