Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API for inserting a Span<T> into a List<T> efficiently #1530

Closed
Tracked by #77391
josetr opened this issue Mar 21, 2019 · 18 comments · Fixed by #76274
Closed
Tracked by #77391

API for inserting a Span<T> into a List<T> efficiently #1530

josetr opened this issue Mar 21, 2019 · 18 comments · Fixed by #76274
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Collections
Milestone

Comments

@josetr
Copy link

josetr commented Mar 21, 2019

EDITED 9/22/2022 by @stephentoub:

namespace System.Collections.Generic
{
    public class List<T>
    {
+        public void AddSpan(ReadOnlySpan<T> span);
+        public void InsertSpan(int index, ReadOnlySpan<T> span);

+        public void CopyTo(Span<T> span);
    }
}
  • We can't just add an AddRange overload as it's ambiguous for anything convertible to both span and enumerable, like array. We could alternatively call it AddRange and add overloads as well for arrays and strings.

Perhaps an AddRange overload taking a Span<T> can be added to the List class?

var list = new List<byte>();
list.AddRange(new Span<byte>());

Workarounds

foreach(var b in span)
    list.Add(b); // Inefficient only copying one byte at a time
var list = new List<byte>();
var span = new Span<byte>();
list.AddRange(span.ToArray()); Silly allocation
@ahsonkhan
Copy link
Member

Do you have a scenario to help motivate this addition (maybe example code today where the workarounds are noticeably inefficient)? Are there other List APIs that fall into this category (for instance what about the List<T> ctor)?

Similarly, we have the CopyTo that takes arrays.

Presumably, there was a reason such APIs didn't meet the bar during the first round:
https://github.com/dotnet/corefx/issues/21281

There are some other namespaces that could likely benefit from Span/Buffer, but we should probably handle separately from this initial push:

  • System.Collections. It’s not clear to me to what extent we should add span/buffer-based APIs to collections. CopyTo overloads that take Span<T>?

@DaZombieKiller
Copy link
Contributor

DaZombieKiller commented Mar 21, 2019

Just to add to the List<T> of workarounds 😛

public static class ListMarshal
{
    static readonly IntPtr _items   = Marshal.OffsetOf(typeof(List<>), "_items");
    static readonly IntPtr _size    = Marshal.OffsetOf(typeof(List<>), "_size");
    static readonly IntPtr _version = Marshal.OffsetOf(typeof(List<>), "_version");

    class RawData
    {
        public byte Data;
    }

    static ref TField GetField<T, TField>(List<T> list, IntPtr offset)
    {
        var raw  = Unsafe.As<RawData>(list);
        return ref Unsafe.As<byte, TField>(ref Unsafe.Add(ref raw.Data, offset));
    }

    public static T[]     GetArray     <T>(List<T> list) =>     GetField<T, T[]>(list, _items);
    public static ref int GetCountRef  <T>(List<T> list) => ref GetField<T, int>(list, _size);
    public static ref int GetVersionRef<T>(List<T> list) => ref GetField<T, int>(list, _version);
}

Which you could turn into the following extension method:

public static class ListExtensions
{
    public static void AddRange<T>(this List<T> self, ReadOnlySpan<T> values)
    {
        if (self.Count + values.Length > self.Capacity)
            self.Capacity = self.Count + values.Length;

        var array = ListMarshal.GetArray(self);
        values.CopyTo(array.AsSpan(self.Count));

        ListMarshal.GetCountRef  (self) += values.Length;
        ListMarshal.GetVersionRef(self)++;
    }
}

Which would be used as follows:

var list = new List<int> { 1, 2, 3, 4 };
var data = new[] { 5, 6, 7, 8 }.AsSpan();
list.AddRange(data);

foreach (var value in list)
    Console.WriteLine(value);

It feels incredibly dirty to do this though, so a proper solution in CoreFX would be greatly appreciated.

@jkotas
Copy link
Member

jkotas commented Mar 21, 2019

It feels incredibly dirty to do this though

Yes, it is dirty and it does not even work. You will get Type 'System.Collections.Generic.List1[T]' cannot be marshaled as an unmanaged structure; no meaningful size or offset can be computed.exception thrown from theMarshal.OffsetOf` call.

@jkotas
Copy link
Member

jkotas commented Mar 21, 2019

The best current workaround for this is to roll your own List<T> type that does exactly what you need.

@DaZombieKiller
Copy link
Contributor

It feels incredibly dirty to do this though

Yes, it is dirty and it does not even work. You will get Type 'System.Collections.Generic.List1[T]' cannot be marshaled as an unmanaged structure; no meaningful size or offset can be computed.exception thrown from theMarshal.OffsetOf` call.

Interestingly, it worked perfectly fine while running on Mono (must be an oversight on their part). However on .NET Core I get the error you mentioned. It's fixable by using reflection, although it requires some extra ceremony.

The best current workaround for this is to roll your own List<T> type that does exactly what you need.

I'd normally recommend this too, but it's not an option when you're dealing with a third party API taking a List<T>.

@safern safern transferred this issue from dotnet/corefx Jan 9, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.Collections untriaged New issue has not been triaged by the area owner labels Jan 9, 2020
@safern safern added this to the 5.0 milestone Jan 9, 2020
@safern safern added api-needs-work API needs work before it is approved, it is NOT ready for implementation and removed untriaged New issue has not been triaged by the area owner labels Jan 9, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@layomia layomia removed the untriaged New issue has not been triaged by the area owner label Jun 24, 2020
@layomia layomia modified the milestones: 5.0.0, Future Jun 24, 2020
@ghost
Copy link

ghost commented Oct 26, 2021

Due to lack of recent activity, this issue has been marked as a candidate for backlog cleanup. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will undo this process.

This process is part of the experimental issue cleanup initiative we are currently trialing in a limited number of areas. Please share any feedback you might have in the linked issue.

@ghost
Copy link

ghost commented Nov 9, 2021

This issue will now be closed since it had been marked no recent activity but received no further activity in the past 14 days. It is still possible to reopen or comment on the issue, but please note that the issue will be locked if it remains inactive for another 30 days.

@ghost ghost closed this as completed Nov 9, 2021
@Raymonf
Copy link

Raymonf commented Nov 11, 2021

Since this is now closed (I didn’t see the comment by the bot from 2 weeks ago), what is the preferred solution for doing this while avoiding the allocation that would come from ToArray()? Does one exist?

@ghost ghost removed the no-recent-activity label Nov 11, 2021
@eiriktsarpalis
Copy link
Member

@Raymonf to my knowledge the only known workarounds are the ones listed in the OP. In the future it might be possible to do this using the CollectionsMarshall.AsSpan method, however we're still missing a method for presizing lists (cf. #55217). In pseudocode it would look as follows:

public static void AddRange<T>(this List<T> list, ReadOnlySpan<T> items)
{
    int initialCount = list.Count;
    list.Resize(initialCount + items.Length); // !!! NB this method is currently not implemented
    Span<T> buffer = CollectionsMarshal.AsSpan(list);
    items.CopyTo(buffer.Slice(initialCount));
}

MichalStrehovsky pushed a commit to MichalStrehovsky/runtime that referenced this issue Dec 9, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Dec 11, 2021
@MihaZupan
Copy link
Member

@eiriktsarpalis how come we're closing this? Was there a decision to reject the proposal altogether?

@stephentoub
Copy link
Member

#1530 (comment)

@MihaZupan
Copy link
Member

I guess the question is "why was no-recent-activity added"

@eiriktsarpalis
Copy link
Member

See #60288 for context.

@eiriktsarpalis eiriktsarpalis added the backlog-cleanup-candidate An inactive issue that has been marked for automated closure. label Feb 18, 2022
@stephentoub stephentoub reopened this Sep 23, 2022
@stephentoub stephentoub modified the milestones: Future, 8.0.0 Sep 23, 2022
@stephentoub stephentoub added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-needs-work API needs work before it is approved, it is NOT ready for implementation backlog-cleanup-candidate An inactive issue that has been marked for automated closure. labels Sep 23, 2022
@stephentoub
Copy link
Member

I updated the top post. I think we should add this. List<T> is the primary growable array type we push people to use, and ReadOnlySpan<T> is now a critically-important data type; you should be able to add the latter to the former efficiently.

@eiriktsarpalis
Copy link
Member

Are there opportunities for adding other span overloads? At quick glance, a constructor accepting ROS<T> and a CopyTo(Span<T>) overload seem like obvious candidates.

@dotnet dotnet unlocked this conversation Sep 23, 2022
@tfenise
Copy link
Contributor

tfenise commented Sep 23, 2022

Adding the proposed overload public void AddRange(ReadOnlySpan<T> collection); (as well as InsertRange) would be a source code breaking change, because there would be an ambiguity when passing a T[] as the argument collection. Potential solutions include introducing another overload taking a T[] as parameter, or adding the proposed overload as an extension method, or maybe renaming the proposed overload (but we may want an argument of type T[] to go through the overload with parameter ReadOnlySpan<T> instead of IEnumerable<T>).

@stephentoub
Copy link
Member

Are there opportunities for adding other span overloads? At quick glance, a constructor accepting ROS and a CopyTo(Span) overload seem like obvious candidates.

CopyTo would be reasonable. It does have a workaround today, which is that someone can use CollectionsMarshal.AsSpan to get a span for the list, and then CopyTo with that as the source... there's no good workaround today for AddRange.

A ctor is going to have the same ambiguity problem that tfenise calls out, and we can't workaround that with a name.

@terrajobst
Copy link
Member

terrajobst commented Sep 27, 2022

Video

  • We should talk to @jaredpar / @dotnet/ldm about whether his proposal for "binary compatibility only" could be used here to avoid having to add new method group. We could also make them extension methods in which case they wouldn't cause ambiguities but only be used for spans. That being said, given that F# and VB wouldn't have the hypothetical feature so we probably would do the extension method for a core type anyway.
  • There is another concern about generic instantiations and lack of pay-for-play for methods only used for some Ts.

This would be the desired API:

namespace System.Collections.Generic;

public partial class List<T>
{
    public void AddRange(ReadOnlySpan<T> span);
    public void InsertRange(int index, ReadOnlySpan<T> span);
    public void CopyTo(Span<T> span);
}

If we go down the extension method route, it would look like this:

namespace System.Collections.Generic;

public static class CollectionExtensions
{
    public static void AddRange<T>(this List<T> list, ReadOnlySpan<T> source);
    public static void InsertRange<T>(this List<T> list, int index, ReadOnlySpan<T> source);
    public static void CopyTo<T>(this List<T> list, Span<T> destination);
    // We discussed this but decided against this until there is a need.
    // public static bool TryCopyTo<T>(this List<T> list, Span<T> destination);
}

We're inclined to approve the extension method shape and simultaneously work with the compiler team to see whether we can use a language feature to avoid using the extension methods.

@terrajobst terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Sep 27, 2022
@stephentoub stephentoub self-assigned this Sep 27, 2022
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Sep 27, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Sep 29, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Oct 29, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Collections
Projects
None yet
Development

Successfully merging a pull request may close this issue.