Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve List performance in IEnumerable.SequenceEqual #67722

Closed
wants to merge 1 commit into from

Conversation

yesmey
Copy link
Contributor

@yesmey yesmey commented Apr 7, 2022

Currently IEnumerable.SequenceEqual is only vectorized when both the source and comparison enumerable are arrays. This change allows them to be either an array or a List.

To be able to use the already existing TryGetSpan I had to remove the struct constraint. I still wanted to keep the original intent as the comment suggests, so I replaced it with a Type.IsValueType condition. The method was introduced in #64624
@stephentoub Please let me know what you think of this.

The fast path of MemoryExtensions.SequenceEqual is restricted to RuntimeHelpers.IsBitwiseEquatable, and it seem a bit too low-level and implementation coupled to use here, so I chose to keep the restriction to IsValueType for that reason.

For benchmark numbers it's pretty much as expected, it scales really well with large ranges. For this example I mixed comparing an array with a List

public class SequenceEqualsListArrayBenchmark
{
    [Params(1, 4, 32, 256, 1024)]
    public int Length { get; set; }

    private IEnumerable<int> _first;
    private IEnumerable<int> _second;

    [GlobalSetup]
    public void Setup()
    {
        _first = Enumerable.Range(1, Length).ToList();
        _second = Enumerable.Range(1, Length).ToArray();
    }

    [Benchmark]
    public bool SequenceEquals() => _first.SequenceEqual(_second);
}
Method Job Toolchain Length Mean Error StdDev Ratio
SequenceEquals Job-OIGBKS main 1 26.86 ns 0.018 ns 0.016 ns 1.00
SequenceEquals Job-NXVCHG sequenceequals 1 16.82 ns 0.163 ns 0.152 ns 0.63
SequenceEquals Job-OIGBKS main 4 42.17 ns 0.053 ns 0.047 ns 1.00
SequenceEquals Job-NXVCHG sequenceequals 4 17.61 ns 0.192 ns 0.180 ns 0.42
SequenceEquals Job-OIGBKS main 32 187.94 ns 0.134 ns 0.119 ns 1.00
SequenceEquals Job-NXVCHG sequenceequals 32 18.18 ns 0.211 ns 0.197 ns 0.10
SequenceEquals Job-OIGBKS main 256 1,342.10 ns 2.395 ns 1.870 ns 1.00
SequenceEquals Job-NXVCHG sequenceequals 256 26.63 ns 0.238 ns 0.223 ns 0.02
SequenceEquals Job-OIGBKS main 1024 5,228.43 ns 6.054 ns 4.726 ns 1.00
SequenceEquals Job-NXVCHG sequenceequals 1024 79.60 ns 0.266 ns 0.249 ns 0.02

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Apr 7, 2022
@ghost
Copy link

ghost commented Apr 7, 2022

Tagging subscribers to this area: @dotnet/area-system-linq
See info in area-owners.md if you want to be subscribed.

Issue Details

Currently IEnumerable.SequenceEqual is only vectorized when both the source and comparison enumerable are arrays. This change allows them to be either an array or a List.

To be able to use the already existing TryGetSpan I had to remove the struct constraint. I still wanted to keep the original intent as the comment suggests, so I replaced it with a Type.IsValueType condition. The method was introduced in #64624
@stephentoub Please let me know what you think of this.

The fast path of MemoryExtensions.SequenceEqual is restricted to RuntimeHelpers.IsBitwiseEquatable, and it seem a bit too low-level and implementation coupled to use here, so I chose to keep the restriction to IsValueType for that reason.

For benchmark numbers it's pretty much as expected, it scales really well with large ranges. For this example I mixed comparing an array with a List

public class SequenceEqualsListArrayBenchmark
{
    [Params(1, 4, 32, 256, 1024)]
    public int Length { get; set; }

    private IEnumerable<int> _first;
    private IEnumerable<int> _second;

    [GlobalSetup]
    public void Setup()
    {
        _first = Enumerable.Range(1, Length).ToList();
        _second = Enumerable.Range(1, Length).ToArray();
    }

    [Benchmark]
    public bool SequenceEquals() => _first.SequenceEqual(_second);
}
Method Job Toolchain Length Mean Error StdDev Ratio
SequenceEquals Job-OIGBKS main 1 26.86 ns 0.018 ns 0.016 ns 1.00
SequenceEquals Job-NXVCHG sequenceequals 1 16.82 ns 0.163 ns 0.152 ns 0.63
SequenceEquals Job-OIGBKS main 4 42.17 ns 0.053 ns 0.047 ns 1.00
SequenceEquals Job-NXVCHG sequenceequals 4 17.61 ns 0.192 ns 0.180 ns 0.42
SequenceEquals Job-OIGBKS main 32 187.94 ns 0.134 ns 0.119 ns 1.00
SequenceEquals Job-NXVCHG sequenceequals 32 18.18 ns 0.211 ns 0.197 ns 0.10
SequenceEquals Job-OIGBKS main 256 1,342.10 ns 2.395 ns 1.870 ns 1.00
SequenceEquals Job-NXVCHG sequenceequals 256 26.63 ns 0.238 ns 0.223 ns 0.02
SequenceEquals Job-OIGBKS main 1024 5,228.43 ns 6.054 ns 4.726 ns 1.00
SequenceEquals Job-NXVCHG sequenceequals 1024 79.60 ns 0.266 ns 0.249 ns 0.02
Author: yesmey
Assignees: -
Labels:

area-System.Linq, community-contribution

Milestone: -

@EgorBo
Copy link
Member

EgorBo commented Apr 7, 2022

Nice! I assume this change slightly regresses e.g. List of floats (and any struct)

@Wraith2
Copy link
Contributor

Wraith2 commented Apr 8, 2022

Is removal of a generic constraint on a public method a breaking change? It can cause a callsite to become ambiguous where it was not before if there was a user provided extension that did not have the constraint.

@danmoseley
Copy link
Member

@bartonjs for that breaking change question

@stephentoub
Copy link
Member

Is removal of a generic constraint on a public method a breaking change?

The method the constraint is being removed from isn't public.

I assume this change slightly regresses

I'm more concerned this is going to regress the performance of arrays of reference types. Prior to this change, they could take the fast path that uses MemoryExtensions.SequenceEqual on spans, avoiding all enumeration and interface dispatch costs. With this change, they will be forced down the IList<T> code path.

@yesmey, can you please share benchmark numbers for reference types?

@yesmey
Copy link
Contributor Author

yesmey commented Apr 9, 2022

here's the benchmark results for IEnumerable<Type>

Method Job Toolchain Length IsArray Mean Error StdDev Median Ratio RatioSD
SequenceEquals Job-GZRMNK main 1 False 28.28 ns 0.033 ns 0.031 ns 28.27 ns 1.00 0.00
SequenceEquals Job-OTBSIO sequenceequals 1 False 24.76 ns 0.255 ns 0.238 ns 24.71 ns 0.88 0.01
SequenceEquals Job-GZRMNK main 1 True 28.62 ns 0.048 ns 0.042 ns 28.62 ns 1.00 0.00
SequenceEquals Job-OTBSIO sequenceequals 1 True 40.81 ns 0.032 ns 0.030 ns 40.81 ns 1.43 0.00
SequenceEquals Job-GZRMNK main 4 False 47.84 ns 0.374 ns 0.350 ns 47.67 ns 1.00 0.00
SequenceEquals Job-OTBSIO sequenceequals 4 False 44.26 ns 0.065 ns 0.058 ns 44.23 ns 0.92 0.01
SequenceEquals Job-GZRMNK main 4 True 38.37 ns 0.546 ns 0.484 ns 38.32 ns 1.00 0.00
SequenceEquals Job-OTBSIO sequenceequals 4 True 68.04 ns 0.268 ns 0.251 ns 68.02 ns 1.77 0.02
SequenceEquals Job-GZRMNK main 32 False 220.79 ns 0.639 ns 0.533 ns 220.77 ns 1.00 0.00
SequenceEquals Job-OTBSIO sequenceequals 32 False 216.49 ns 0.718 ns 0.671 ns 216.34 ns 0.98 0.00
SequenceEquals Job-GZRMNK main 32 True 128.59 ns 2.592 ns 6.260 ns 132.19 ns 1.00 0.00
SequenceEquals Job-OTBSIO sequenceequals 32 True 327.15 ns 0.273 ns 0.213 ns 327.11 ns 2.54 0.12
SequenceEquals Job-GZRMNK main 256 False 1,574.82 ns 5.065 ns 4.230 ns 1,576.47 ns 1.00 0.00
SequenceEquals Job-OTBSIO sequenceequals 256 False 1,677.98 ns 1.263 ns 1.055 ns 1,677.94 ns 1.07 0.00
SequenceEquals Job-GZRMNK main 256 True 722.74 ns 14.354 ns 33.267 ns 745.19 ns 1.00 0.00
SequenceEquals Job-OTBSIO sequenceequals 256 True 2,473.36 ns 1.086 ns 0.907 ns 2,473.33 ns 3.41 0.14

like you said @stephentoub the regression is high, so I'll try to see if there's another way to approach this. We can put this on hold until then

@yesmey yesmey marked this pull request as draft April 9, 2022 19:29
@ghost ghost added the no-recent-activity label Apr 23, 2022
@ghost
Copy link

ghost commented Apr 23, 2022

This pull request has been automatically marked no-recent-activity because it has not had any activity for 14 days. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will remove no-recent-activity.

@yesmey yesmey closed this Apr 29, 2022
@ghost ghost locked as resolved and limited conversation to collaborators May 29, 2022
@yesmey yesmey deleted the sequenceequals branch May 9, 2023 18:11
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Linq community-contribution Indicates that the PR has been added by a community member no-recent-activity
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants