Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unnecessary outer loop in CountBy #106110

Closed
AlexRadch opened this issue Aug 8, 2024 · 6 comments
Closed

Unnecessary outer loop in CountBy #106110

AlexRadch opened this issue Aug 8, 2024 · 6 comments
Labels
area-System.Linq in-pr There is an active PR which will close this issue when it is merged needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration tenet-performance Performance related issue
Milestone

Comments

@AlexRadch
Copy link
Contributor

AlexRadch commented Aug 8, 2024

The implementation of the new CountBy adds an unnecessary outer loop through the dictionary elements.

foreach (KeyValuePair<TKey, int> countBy in BuildCountDictionary(enumerator, keySelector, keyComparer))
{
yield return countBy;
}

I think we can return created Dictionary enumerator:

        public static IEnumerable<KeyValuePair<TKey, int>> CountBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector, IEqualityComparer<TKey>? keyComparer = null) where TKey : notnull
        {
            // Same code 
            return new CountByEnumerable<TSource, TKey>(source, keySelector, keyComparer);
        }

        private sealed class CountByEnumerable<TSource, TKey>(IEnumerable<TSource> source, Func<TSource, TKey> keySelector, IEqualityComparer<TKey>? keyComparer) : IEnumerable<KeyValuePair<TKey, int>> where TKey : notnull
        {
            public IEnumerator<KeyValuePair<TKey, int>> GetEnumerator()
            {
                using IEnumerator<TSource> enumerator = source.GetEnumerator();

                if (!enumerator.MoveNext())
                {
                    return Empty<KeyValuePair<TKey, int>>().GetEnumerator();
                }

                return BuildCountDictionary(enumerator, keySelector, keyComparer).GetEnumerator();
            }
        }
@AlexRadch AlexRadch added the tenet-performance Performance related issue label Aug 8, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Aug 8, 2024
@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Aug 8, 2024
@eiriktsarpalis
Copy link
Member

The proposed solution allocates an additional delegate and inner enumerator, whereas the existing code iterates over a struct enumerator. It also makes the code substantially less readable, so this doesn't seem like a clear-cut win to me. Can you share any benchmark numbers?

@eiriktsarpalis eiriktsarpalis added the needs-author-action An issue or pull request that requires more info or actions from the author. label Aug 8, 2024
@AlexRadch
Copy link
Contributor Author

Benchmarks

Ratio 0.92
Alloc Ratio 0.97

| Method               | Job        | Toolchain                                                                                                  | Mean    | Error    | StdDev   | Median  | Min     | Max     | Ratio | Gen0        | Allocated | Alloc Ratio |
|--------------------- |----------- |----------------------------------------------------------------------------------------------------------- |--------:|---------:|---------:|--------:|--------:|--------:|------:|------------:|----------:|------------:|
| CountBy00LinqMethodX | Job-EJMQQA | \runtimeU\artifacts\bin\testhost\net9.0-windows-Release-x64\shared\Microsoft.NETCore.App\9.0.0\CoreRun.exe | 3.854 s | 0.0267 s | 0.0250 s | 3.858 s | 3.802 s | 3.888 s |  1.00 | 279000.0000 |   1.09 GB |        1.00 |
| CountBy00LinqMethodX | Job-CQXFKD | \runtime\artifacts\bin\testhost\net9.0-windows-Release-x64\shared\Microsoft.NETCore.App\9.0.0\CoreRun.exe  | 3.558 s | 0.0302 s | 0.0268 s | 3.552 s | 3.522 s | 3.605 s |  0.92 | 271000.0000 |   1.06 GB |        0.97 |

@dotnet-policy-service dotnet-policy-service bot removed the needs-author-action An issue or pull request that requires more info or actions from the author. label Aug 8, 2024
@eiriktsarpalis
Copy link
Member

eiriktsarpalis commented Aug 8, 2024

Would it be possible to share results for smaller numbers? It would be interesting to see how it fares with < 10-20 sources and empty sources specifically (since it now allocates an enumerator when it previously didn't).

@AlexRadch
Copy link
Contributor Author

The proposed solution allocates an additional delegate and inner enumerator, whereas the existing code iterates over a struct enumerator. It also makes the code substantially less readable, so this doesn't seem like a clear-cut win to me. Can you share any benchmark numbers?

I made the code readable and removed delegate allocation.

@AlexRadch
Copy link
Contributor Author

Would it be possible to share results for smaller numbers? It would be interesting to see how it fares with < 10-20 sources and empty sources specifically (since it now allocates an enumerator when it previously didn't).

I created benchmarks for 1_000_000, 10_00, 100, 1, and 0 elements.

| Method               | Job        | Toolchain                                                                                                  | Mean                 | Error               | StdDev              | Median               | Min                  | Max                  | Ratio | RatioSD | Gen0        | Allocated    | Alloc Ratio |
|--------------------- |----------- |----------------------------------------------------------------------------------------------------------- |---------------------:|--------------------:|--------------------:|---------------------:|---------------------:|---------------------:|------:|--------:|------------:|-------------:|------------:|
| CountBy00LinqMethodX | Job-GDABRL | \runtimeU\artifacts\bin\testhost\net9.0-windows-Release-x64\shared\Microsoft.NETCore.App\9.0.0\CoreRun.exe | 3,910,704,407.143 ns |  47,932,915.4754 ns |  42,491,281.0757 ns | 3,910,284,800.000 ns | 3,862,096,400.000 ns | 4,001,268,400.000 ns |  1.00 |    0.01 | 279000.0000 | 1168000400 B |        1.00 |
| CountBy00LinqMethodX | Job-MIBMKO | \runtime\artifacts\bin\testhost\net9.0-windows-Release-x64\shared\Microsoft.NETCore.App\9.0.0\CoreRun.exe  | 3,626,278,040.000 ns | 117,166,355.5391 ns | 134,928,949.3195 ns | 3,638,968,850.000 ns | 3,449,981,700.000 ns | 3,961,468,900.000 ns |  0.93 |    0.04 | 269000.0000 | 1128000400 B |        0.97 |
|                      |            |                                                                                                            |                      |                     |                     |                      |                      |                      |       |         |             |              |             |
| CountBy01LinqMethodX | Job-GDABRL | \runtimeU\artifacts\bin\testhost\net9.0-windows-Release-x64\shared\Microsoft.NETCore.App\9.0.0\CoreRun.exe |    38,913,778.571 ns |     633,673.1341 ns |     561,734.7283 ns |    38,809,525.000 ns |    38,209,625.000 ns |    40,176,100.000 ns |  1.00 |    0.02 |   2750.0000 |   11680100 B |        1.00 |
| CountBy01LinqMethodX | Job-MIBMKO | \runtime\artifacts\bin\testhost\net9.0-windows-Release-x64\shared\Microsoft.NETCore.App\9.0.0\CoreRun.exe  |    37,106,366.071 ns |     402,289.7268 ns |     356,619.3645 ns |    37,172,125.000 ns |    36,270,925.000 ns |    37,495,000.000 ns |  0.95 |    0.02 |   2500.0000 |   11280100 B |        0.97 |
|                      |            |                                                                                                            |                      |                     |                     |                      |                      |                      |       |         |             |              |             |
| CountBy02LinqMethodX | Job-GDABRL | \runtimeU\artifacts\bin\testhost\net9.0-windows-Release-x64\shared\Microsoft.NETCore.App\9.0.0\CoreRun.exe |       392,563.812 ns |      10,905.2740 ns |      12,558.5297 ns |       391,153.333 ns |       375,399.583 ns |       417,625.417 ns |  1.00 |    0.04 |     25.0000 |     116802 B |        1.00 |
| CountBy02LinqMethodX | Job-MIBMKO | \runtime\artifacts\bin\testhost\net9.0-windows-Release-x64\shared\Microsoft.NETCore.App\9.0.0\CoreRun.exe  |       371,972.559 ns |       2,695.0589 ns |       2,104.1247 ns |       372,177.148 ns |       366,644.922 ns |       374,392.188 ns |  0.95 |    0.03 |     23.4375 |     112802 B |        0.97 |
|                      |            |                                                                                                            |                      |                     |                     |                      |                      |                      |       |         |             |              |             |
| CountBy03LinqMethodX | Job-GDABRL | \runtimeU\artifacts\bin\testhost\net9.0-windows-Release-x64\shared\Microsoft.NETCore.App\9.0.0\CoreRun.exe |         3,937.227 ns |          67.8874 ns |          63.5019 ns |         3,938.165 ns |         3,852.516 ns |         4,044.015 ns |  1.00 |    0.02 |      0.2668 |       1168 B |        1.00 |
| CountBy03LinqMethodX | Job-MIBMKO | \runtime\artifacts\bin\testhost\net9.0-windows-Release-x64\shared\Microsoft.NETCore.App\9.0.0\CoreRun.exe  |         3,884.312 ns |          75.5855 ns |          70.7027 ns |         3,863.425 ns |         3,790.323 ns |         4,040.566 ns |  0.99 |    0.02 |      0.2657 |       1128 B |        0.97 |
|                      |            |                                                                                                            |                      |                     |                     |                      |                      |                      |       |         |             |              |             |
| CountBy04LinqMethodX | Job-GDABRL | \runtimeU\artifacts\bin\testhost\net9.0-windows-Release-x64\shared\Microsoft.NETCore.App\9.0.0\CoreRun.exe |             3.113 ns |           0.1465 ns |           0.1687 ns |             3.112 ns |             2.775 ns |             3.415 ns |  1.00 |    0.08 |           - |            - |          NA |
| CountBy04LinqMethodX | Job-MIBMKO | \runtime\artifacts\bin\testhost\net9.0-windows-Release-x64\shared\Microsoft.NETCore.App\9.0.0\CoreRun.exe  |             3.966 ns |           0.0786 ns |           0.0696 ns |             3.990 ns |             3.825 ns |             4.046 ns |  1.28 |    0.07 |           - |            - |          NA |

@jeffhandley jeffhandley added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed untriaged New issue has not been triaged by the area owner in-pr There is an active PR which will close this issue when it is merged labels Aug 8, 2024
@jeffhandley jeffhandley added this to the Future milestone Aug 8, 2024
@jeffhandley jeffhandley added the in-pr There is an active PR which will close this issue when it is merged label Aug 8, 2024
@eiriktsarpalis
Copy link
Member

Closing per feedback in #106111.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Linq in-pr There is an active PR which will close this issue when it is merged needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration tenet-performance Performance related issue
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants