Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Makes GetChildKeys more efficient #67186

Merged
merged 14 commits into from
Mar 31, 2022
Merged

Conversation

maryamariyan
Copy link
Member

@maryamariyan maryamariyan commented Mar 26, 2022

Details

Prevents ChainedConfigurationProvider.GetChildKeys from splitting strings or making unnecessary allocation when keys do not contain delimiter or when keys are empty.

Fixes #62510

Benchmarks

Benchmark code: link to gist
Also dotnet/performance PR: dotnet/performance#2336

Without diff:

|                                     Method |        Mean |     Error |      StdDev |    Gen 0 |  Gen 1 | Allocated |
|------------------------------------------- |------------:|----------:|------------:|---------:|-------:|----------:|
|         AddChainedConfigurationNoDelimiter |  1,255.4 us |  24.20 us |    50.52 us | 130.8594 | 5.8594 |  1,011 KB |
|               AddChainedConfigurationEmpty | 37,957.7 us | 718.05 us | 1,466.79 us |  76.9231 |      - |  1,010 KB |
|       AddChainedConfigurationWithSplitting |    563.1 us |  11.11 us |    27.88 us |  61.5234 | 5.8594 |    472 KB |
| AddChainedConfigurationWithSplittingLooped |    557.0 us |  11.01 us |    23.69 us |  61.5234 | 5.8594 |    472 KB |

UPDATED TO LATEST:
With latest PR diff (updated using b128f63):

|                                     Method |        Mean |     Error |    StdDev |   Gen 0 |  Gen 1 | Allocated |
|------------------------------------------- |------------:|----------:|----------:|--------:|-------:|----------:|
|         AddChainedConfigurationNoDelimiter |    838.2 us |  16.30 us |  21.19 us | 18.5547 | 2.9297 |    143 KB |
|               AddChainedConfigurationEmpty | 20,030.8 us | 398.32 us | 545.23 us |       - |      - |    143 KB |
|       AddChainedConfigurationWithSplitting |    445.4 us |   8.12 us |   7.20 us |  4.8828 | 0.4883 |     40 KB |
| AddChainedConfigurationWithSplittingLooped |    430.9 us |   8.54 us |  22.50 us |  4.8828 |      - |     40 KB |

@ghost
Copy link

ghost commented Mar 26, 2022

Tagging subscribers to this area: @dotnet/area-extensions-configuration
See info in area-owners.md if you want to be subscribed.

Issue Details

Details

Prevents ChainedConfigurationProvider.GetChildKeys from splitting strings or making unnecessary allocation when keys do not contain delimiter or when keys are empty.

Fixes #62510

Benchmarks

Benchmark code: link to gist

  • Without the PR changes:
image
  • With the PR changes:
image
Author: maryamariyan
Assignees: -
Labels:

area-Extensions-Configuration

Milestone: -

@maryamariyan
Copy link
Member Author

@adamsitnik not sure if I need to change a setting for the MultimodalDistribution warning showing up in the screenshot

@davidfowl
Copy link
Member

Can you also add a performance test for keys that do need the split?

@maryamariyan
Copy link
Member Author

Can you also add a performance test for keys that do need the split?

As you suspected the commit was not optimized for the keys needing split. Now with the newest diff I changed it to be optimized for split case. Using:

        {
// in Setup:
            var splittingKeys = new Dictionary<string, string>() { };
            for (int i = 1000; i < 2000; i++)
            {
                splittingKeys.Add("a:" + i.ToString(), string.Empty);
            }

            _chainedConfigWithSplitting = new ChainedConfigurationProvider(new ChainedConfigurationSource
            {
                Configuration = new ConfigurationBuilder()
                    .Add(new MemoryConfigurationSource { InitialData = splittingKeys })
                    .Build(),
                ShouldDisposeConfiguration = false,
            });
        }

        [Benchmark]
        public void AddChainedConfigurationWithSplitting()
            => _chainedConfigWithSplitting.GetChildKeys(new string[0], null);
  • Without changes

image

  • After new commit: optimized for split case

Screen Shot 2022-03-26 at 1 17 28 AM

@danmoseley
Copy link
Member

Should your perf tests get added to the perf repo?

BTW you can copy paste the results tables into an issue like this as they are valid markdown. That’s easier than pasting screenshots.

Copy link
Member

@gfoidl gfoidl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits.

y = yParts[i];
if (x!.Contains(ConfigurationPath.KeyDelimiter) && y!.Contains(ConfigurationPath.KeyDelimiter))
{
string[] xParts = x!.Split(_keyDelimiterArray, StringSplitOptions.RemoveEmptyEntries);
Copy link
Member

@tarekgh tarekgh Mar 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xParts

We can optimize this more by avoiding the allocations occur because of String.Split. I think this is possible by using String.indexof() then compare substrings by indexes (or even using spans).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @tarekgh.

BTW it's a pity that the code which is consuming this type is using string to represent a complex type and uses indexof/split back and forth.

The index is known here:

int indexOf = key.IndexOf(ConfigurationPath.KeyDelimiter, prefixLength, StringComparison.OrdinalIgnoreCase);

then a substring is performed:

return indexOf < 0 ? key.Substring(prefixLength) : key.Substring(prefixLength, indexOf - prefixLength);

then all the logic is reversed to be able to sort the keys:

results.Sort(ConfigurationKeyComparer.Comparison);

It's just a digression, I don't expect @maryamariyan to fix it in this PR (these types are public)

Copy link
Member

@adamsitnik adamsitnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if I need to change a setting for the MultimodalDistribution warning showing up in the screenshot

You don't need to change anything. This benchmark just seems to be multimodal.

y = yParts[i];
if (x!.Contains(ConfigurationPath.KeyDelimiter) && y!.Contains(ConfigurationPath.KeyDelimiter))
{
string[] xParts = x!.Split(_keyDelimiterArray, StringSplitOptions.RemoveEmptyEntries);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @tarekgh.

BTW it's a pity that the code which is consuming this type is using string to represent a complex type and uses indexof/split back and forth.

The index is known here:

int indexOf = key.IndexOf(ConfigurationPath.KeyDelimiter, prefixLength, StringComparison.OrdinalIgnoreCase);

then a substring is performed:

return indexOf < 0 ? key.Substring(prefixLength) : key.Substring(prefixLength, indexOf - prefixLength);

then all the logic is reversed to be able to sort the keys:

results.Sort(ConfigurationKeyComparer.Comparison);

It's just a digression, I don't expect @maryamariyan to fix it in this PR (these types are public)

@maryamariyan
Copy link
Member Author

maryamariyan commented Mar 28, 2022

With new changes,

Without diff:

|                                     Method |        Mean |     Error |      StdDev |    Gen 0 |  Gen 1 | Allocated |
|------------------------------------------- |------------:|----------:|------------:|---------:|-------:|----------:|
|         AddChainedConfigurationNoDelimiter |  1,255.4 us |  24.20 us |    50.52 us | 130.8594 | 5.8594 |  1,011 KB |
|               AddChainedConfigurationEmpty | 37,957.7 us | 718.05 us | 1,466.79 us |  76.9231 |      - |  1,010 KB |
|       AddChainedConfigurationWithSplitting |    563.1 us |  11.11 us |    27.88 us |  61.5234 | 5.8594 |    472 KB |
| AddChainedConfigurationWithSplittingLooped |    557.0 us |  11.01 us |    23.69 us |  61.5234 | 5.8594 |    472 KB |

With changes:

|                                     Method |        Mean |     Error |    StdDev |   Gen 0 |  Gen 1 | Allocated |
|------------------------------------------- |------------:|----------:|----------:|--------:|-------:|----------:|
|         AddChainedConfigurationNoDelimiter |    832.3 us |  16.46 us |  28.39 us | 18.5547 | 2.9297 |    143 KB |
|               AddChainedConfigurationEmpty | 18,357.2 us | 365.24 us | 676.99 us |       - |      - |    143 KB |
|       AddChainedConfigurationWithSplitting |    391.5 us |   7.66 us |  14.01 us |  4.8828 |      - |     40 KB |
| AddChainedConfigurationWithSplittingLooped |    376.5 us |   6.71 us |   9.19 us |  4.8828 | 0.4883 |     40 KB |

For AddChainedConfigurationWithSplittingLooped we are using keys with a:b:c as the common path, e.g. a:b:c111 gets compared to a:b:c112.

@danmoseley
Copy link
Member

Should your benchmarks above be added to dotnet/performance to protect your change from regressions?

@maryamariyan
Copy link
Member Author

Should your benchmarks above be added to dotnet/performance to protect your change from regressions?

I plan to add those. Are you recommending to add them before this PR goes in?

@danmoseley
Copy link
Member

I plan to add those. Are you recommending to add them before this PR goes in?

That's your call. Adding them before the main PR (at least say half a day before) has the nice effect of gathering numbers before the change, so you see a nice uptick in the graph when your change goes in, and if it regresses it might possibly be interesting what the original value was. I don't think it matters otherwise.

@maryamariyan
Copy link
Member Author

maryamariyan commented Mar 29, 2022

Found certain use cases for which replacing Split with IndexOf would regress for: e.g. for empty segments, the Split API also is taking RemoveEmptyEntries option and so a string like ::: would get and empty parts array.

The replacement function needs to behave similar to Split, returning the same count, not place allocation while we traverse one segment at a time to compare against the other string's segment.

One way forward would be to make use of something like the SpanSplitEnumerator that was reverted back in this PR https://github.com/dotnet/runtime/pull/37757/files.

Vaguely something like:

var splitterX = spanX.Split(',', StringSplitOptions.RemoveEmptyEntries);
var splitterY = spanY.Split(',', StringSplitOptions.RemoveEmptyEntries);
while (splitterX.TryMoveNext(out var sliceX) && splitterY.TryMoveNext(out var sliceY))
{
    // then compare sliceX with sliceY
}
// return splitterX.Length - splitterY.Length

UPDATE

Seems like StringTokenizer can do the logic I need, will be using that instead https://github.com/dotnet/runtime/blob/bd43f55230d533af326e727366f61cfd3ed0e050/src/libraries/Microsoft.Extensions.Primitives/src/StringTokenizer.cs

@tarekgh
Copy link
Member

tarekgh commented Mar 29, 2022

Seems like StringTokenizer can do the logic I need, will be using that instead bd43f55/src/libraries/Microsoft.Extensions.Primitives/src/StringTokenizer.cs

This looks a good idea to use.

- Also fixing a regression from last commit + adds test
@maryamariyan
Copy link
Member Author

maryamariyan commented Mar 30, 2022

I first tried using StringTokenizer and that would have given us the allocation improvement we wanted but not time perf benefits. So I ended up using an improved version of IndexOf for this fix, so we get perf benefits in both time and allocation.

Without diff:

|                                     Method |        Mean |     Error |      StdDev |    Gen 0 |  Gen 1 | Allocated |
|------------------------------------------- |------------:|----------:|------------:|---------:|-------:|----------:|
|         AddChainedConfigurationNoDelimiter |  1,255.4 us |  24.20 us |    50.52 us | 130.8594 | 5.8594 |  1,011 KB |
|               AddChainedConfigurationEmpty | 37,957.7 us | 718.05 us | 1,466.79 us |  76.9231 |      - |  1,010 KB |
|       AddChainedConfigurationWithSplitting |    563.1 us |  11.11 us |    27.88 us |  61.5234 | 5.8594 |    472 KB |
| AddChainedConfigurationWithSplittingLooped |    557.0 us |  11.01 us |    23.69 us |  61.5234 | 5.8594 |    472 KB |

UPDATED TO LATEST:
With latest PR diff (updated using b128f63):

|                                     Method |        Mean |     Error |    StdDev |   Gen 0 |  Gen 1 | Allocated |
|------------------------------------------- |------------:|----------:|----------:|--------:|-------:|----------:|
|         AddChainedConfigurationNoDelimiter |    838.2 us |  16.30 us |  21.19 us | 18.5547 | 2.9297 |    143 KB |
|               AddChainedConfigurationEmpty | 20,030.8 us | 398.32 us | 545.23 us |       - |      - |    143 KB |
|       AddChainedConfigurationWithSplitting |    445.4 us |   8.12 us |   7.20 us |  4.8828 | 0.4883 |     40 KB |
| AddChainedConfigurationWithSplittingLooped |    430.9 us |   8.54 us |  22.50 us |  4.8828 |      - |     40 KB |

Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We know that xDelimiterIndex is at a delimiter. There is no reason to slice to that location, you can skip past it.

Co-authored-by: Eric Erhardt <eric.erhardt@microsoft.com>
Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Thanks for the great work here, @maryamariyan.

@maryamariyan maryamariyan merged commit 09d8346 into dotnet:main Mar 31, 2022
@ghost ghost locked as resolved and limited conversation to collaborators May 1, 2022
@adamsitnik adamsitnik added the tenet-performance Performance related issue label Nov 8, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ChainedConfigurationProvider.GetChildKeys is inefficient
8 participants