Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LINQ - use TryGetSpan in Enumerable.Count with predicate #102696

Closed
Meir017 opened this issue May 26, 2024 · 4 comments · Fixed by #102884
Closed

LINQ - use TryGetSpan in Enumerable.Count with predicate #102696

Meir017 opened this issue May 26, 2024 · 4 comments · Fixed by #102884
Labels
area-System.Linq tenet-performance Performance related issue
Milestone

Comments

@Meir017
Copy link
Contributor

Meir017 commented May 26, 2024

Description

similar how Max.cs & Min try to convert to span

if (source.TryGetSpan(out ReadOnlySpan<T> span))

if (source.TryGetSpan(out ReadOnlySpan<decimal> span))

Iterating over a span when possible would be faster then iterating over enumerable.

Configuration

// * Summary *

BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3593/23H2/2023Update/SunValley3)
11th Gen Intel Core i9-11950H 2.60GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK 8.0.205
[Host] : .NET 8.0.5 (8.0.524.21615), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
ShortRun : .NET 8.0.5 (8.0.524.21615), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI

Job=ShortRun IterationCount=3 LaunchCount=1
WarmupCount=3

Method count Mean Error StdDev Allocated
Count_IEnumerable 100 1,732.7 ns 741.6 ns 40.65 ns 48 B
Count_List 100 209.5 ns 427.7 ns 23.44 ns 40 B
Count_Array 100 196.8 ns 256.8 ns 14.08 ns 32 B
Count_IEnumerable_New 100 1,638.3 ns 1,091.5 ns 59.83 ns 48 B
Count_List_New 100 164.6 ns 366.9 ns 20.11 ns -
Count_Array_New 100 114.8 ns 135.1 ns 7.41 ns -
Count_IEnumerable 10000 161,695.6 ns 64,981.1 ns 3,561.83 ns 48 B
Count_List 10000 42,592.3 ns 17,101.9 ns 937.41 ns 40 B
Count_Array 10000 44,711.1 ns 15,163.0 ns 831.14 ns 32 B
Count_IEnumerable_New 10000 161,839.9 ns 210,702.2 ns 11,549.30 ns 48 B
Count_List_New 10000 36,906.0 ns 32,625.2 ns 1,788.30 ns -
Count_Array_New 10000 36,233.2 ns 7,040.9 ns 385.94 ns -
Count_IEnumerable 1000000 14,854,601.0 ns 12,265,409.7 ns 672,308.49 ns 60 B
Count_List 1000000 6,323,082.3 ns 9,923,756.9 ns 543,954.60 ns 43 B
Count_Array 1000000 6,402,595.8 ns 2,037,387.6 ns 111,676.09 ns 35 B
Count_IEnumerable_New 1000000 15,711,805.2 ns 13,675,725.7 ns 749,612.67 ns 54 B
Count_List_New 1000000 4,253,217.7 ns 3,217,607.6 ns 176,367.93 ns 3 B
Count_Array_New 1000000 4,241,221.4 ns 4,164,944.1 ns 228,294.64 ns 3 B

// * Legends *
count : Value of the 'count' parameter
Mean : Arithmetic mean of all measurements
Error : Half of 99.9% confidence interval
StdDev : Standard deviation of all measurements
Allocated : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
1 ns : 1 Nanosecond (0.000000001 sec)

// * Diagnostic Output - MemoryDiagnoser *

// ***** BenchmarkRunner: End *****
Run time: 00:01:54 (114.04 sec), executed benchmarks: 18

Global total time: 00:02:04 (124.39 sec), executed benchmarks: 18

using System.Runtime.InteropServices;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

namespace MyBenchmarks;

[MemoryDiagnoser(false)]
[ShortRunJob]
public class EnumerableExcept
{
    [Params(100, 10000, 1000000)]
    public int count;

    private IEnumerable<int> collection;
    
    private List<int> list;
    private int[] array;


    [GlobalSetup]
    public void GlobalSetup()
    {
        Random random = new Random(42);
        collection = Enumerable.Range(0, count).Select(i => random.Next());
        
        list = collection.ToList();
        array = collection.ToArray();
    }

    [Benchmark]
    public int Count_IEnumerable() => collection.Count(x => x % 2 == 0);

    [Benchmark]
    public int Count_List() => list.Count(x => x % 2 == 0);

    [Benchmark]
    public int Count_Array() => array.Count(x => x % 2 == 0);

    [Benchmark]
    public int Count_IEnumerable_New() => MyEnumerable.Count(collection, x => x % 2 == 0);

    [Benchmark]
    public int Count_List_New() => MyEnumerable.Count(list, x => x % 2 == 0);

    [Benchmark]
    public int Count_Array_New() => MyEnumerable.Count(array, x => x % 2 == 0);
}

public static class MyEnumerable
{

    public static int Count<TSource>(IEnumerable<TSource> collection, Func<TSource, bool> predicate)
    {
        ArgumentNullException.ThrowIfNull(collection);
        ArgumentNullException.ThrowIfNull(predicate);

        if (collection is TSource[] array) 
        {
            return SpanCount(array, predicate);
        }

        if (collection is List<TSource> list)
        {
            return SpanCount(CollectionsMarshal.AsSpan(list), predicate);
        }

        int count = 0;
        foreach (var item in collection)
        {
            if (predicate(item))
            {
                count++;
            }
        }

        return count;

        static int SpanCount(Span<TSource> span, Func<TSource, bool> predicate)
        {
            int count = 0;
            for (int i = 0; i < span.Length; i++)
            {
                if (predicate(span[i]))
                {
                    count++;
                }
            }

            return count;
        }
    }
}

public class Program
{
    public static void Main(string[] args)
    {
        var summary = BenchmarkRunner.Run<EnumerableExcept>();
    }
}

Regression?

No

Analysis

call source.TryGetSpan(out ReadOnlySpan<TSource> span) and then a dedicated foreach loop for the read-only-span

@Meir017 Meir017 added the tenet-performance Performance related issue label May 26, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label May 26, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label May 26, 2024
@Meir017 Meir017 changed the title use TryGetSpan in Enumerable.Count with predicate LINQ - use TryGetSpan in Enumerable.Count with predicate May 26, 2024
@filipnavara filipnavara added area-System.Linq and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels May 26, 2024
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-linq
See info in area-owners.md if you want to be subscribed.

@stephentoub
Copy link
Member

Yes, we've experimented with this multiple times, eg main...stephentoub:runtime:fasterlinqenumeration, but haven't yet moved ahead with it as for some scenarios it makes things worse.

@Meir017
Copy link
Contributor Author

Meir017 commented May 26, 2024

@stephentoub can you share the benchmark results?

@eiriktsarpalis
Copy link
Member

@stephentoub is there a tracking issue for this, or should we just keep this open in lieu of one?

@eiriktsarpalis eiriktsarpalis removed the untriaged New issue has not been triaged by the area owner label Jul 5, 2024
@eiriktsarpalis eiriktsarpalis added this to the Future milestone Jul 5, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Aug 22, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Linq tenet-performance Performance related issue
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants