System.Text.Json Migration - Adding code to parse the Project.Assets.Json file using STJ. #5558

Contributor

@jgonz120 jgonz120 commented Dec 22, 2023

Bug

Fixes: NuGet/Home#12715

Regression? Last working version:

Description

This PR updates the LockFileFormat class to use STJ for parsing the assets file. Some of the comments raised on #5530 should, I think, be handled in a separate PR if we decide to move forward with them, so there is potentially one more PR to come.

Benchmark Results

| Method | Runtime | InputFile | Mean | StdDev | Ratio | Gen0 | Gen1 | Gen2 | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|---|---|---|---|
| 'LockFileFormat read StreamSTJ' | .NET 8.0 | 10KB.json | 1,096.2 us | 15.93 us | 0.88 | 1.9531 | - | - | 57.64 KB | 0.29 |
| 'LockFileFormat read NJ' | .NET 8.0 | 10KB.json | 1,239.3 us | 17.15 us | 1.00 | 11.7188 | 3.9063 | - | 199.3 KB | 1.00 |
| 'LockFileFormat read StreamSTJ' | .NET Framework 4.7.2 | 10KB.json | 380.8 us | 6.29 us | 0.65 | 10.2539 | 0.9766 | - | 65.13 KB | 0.30 |
| 'LockFileFormat read NJ' | .NET Framework 4.7.2 | 10KB.json | 590.1 us | 11.59 us | 1.00 | 35.1563 | 9.7656 | - | 218.74 KB | 1.00 |
| 'LockFileFormat read StreamSTJ' | .NET 8.0 | 11527KB.json | 172,547.8 us | 6,243.82 us | 0.43 | 4500.0000 | 4000.0000 | 1000.0000 | 63580.02 KB | 0.47 |
| 'LockFileFormat read NJ' | .NET 8.0 | 11527KB.json | 397,688.1 us | 8,228.42 us | 1.00 | 10000.0000 | 9000.0000 | 2000.0000 | 136010.63 KB | 1.00 |
| 'LockFileFormat read StreamSTJ' | .NET Framework 4.7.2 | 11527KB.json | 293,957.7 us | 7,402.11 us | 0.64 | 11500.0000 | 4500.0000 | 1500.0000 | 66057.4 KB | 0.47 |
| 'LockFileFormat read NJ' | .NET Framework 4.7.2 | 11527KB.json | 462,337.6 us | 10,347.79 us | 1.00 | 24000.0000 | 9000.0000 | 2000.0000 | 140931.8 KB | 1.00 |
| 'LockFileFormat read StreamSTJ' | .NET 8.0 | 1308KB.json | 6,410.1 us | 69.73 us | 0.39 | 273.4375 | 257.8125 | - | 4473.46 KB | 0.47 |
| 'LockFileFormat read NJ' | .NET 8.0 | 1308KB.json | 16,530.2 us | 292.99 us | 1.00 | 562.5000 | 531.2500 | - | 9534.15 KB | 1.00 |
| 'LockFileFormat read StreamSTJ' | .NET Framework 4.7.2 | 1308KB.json | 12,922.8 us | 147.81 us | 0.46 | 765.6250 | 375.0000 | - | 4769.82 KB | 0.47 |
| 'LockFileFormat read NJ' | .NET Framework 4.7.2 | 1308KB.json | 28,322.2 us | 152.84 us | 1.00 | 1750.0000 | 781.2500 | 218.7500 | 10053.53 KB | 1.00 |
| 'LockFileFormat read StreamSTJ' | .NET 8.0 | 2756KB.json | 18,265.2 us | 126.52 us | 0.29 | 718.7500 | 656.2500 | - | 11935.02 KB | 0.44 |
| 'LockFileFormat read NJ' | .NET 8.0 | 2756KB.json | 66,284.9 us | 3,021.89 us | 1.00 | 2000.0000 | 1666.6667 | 333.3333 | 27390.5 KB | 1.00 |
| 'LockFileFormat read StreamSTJ' | .NET Framework 4.7.2 | 2756KB.json | 46,018.3 us | 729.88 us | 0.50 | 2250.0000 | 916.6667 | 333.3333 | 12489.87 KB | 0.44 |
| 'LockFileFormat read NJ' | .NET Framework 4.7.2 | 2756KB.json | 92,955.5 us | 1,347.34 us | 1.00 | 5000.0000 | 2166.6667 | 833.3333 | 28609.74 KB | 1.00 |
| 'LockFileFormat read StreamSTJ' | .NET 8.0 | 786KB.json | 4,713.7 us | 30.15 us | 0.42 | 179.6875 | 164.0625 | - | 2989.07 KB | 0.41 |
| 'LockFileFormat read NJ' | .NET 8.0 | 786KB.json | 11,352.3 us | 86.55 us | 1.00 | 437.5000 | 406.2500 | - | 7343.26 KB | 1.00 |
| 'LockFileFormat read StreamSTJ' | .NET Framework 4.7.2 | 786KB.json | 8,629.2 us | 32.36 us | 0.45 | 500.0000 | 250.0000 | - | 3156.93 KB | 0.41 |
| 'LockFileFormat read NJ' | .NET Framework 4.7.2 | 786KB.json | 18,987.8 us | 365.01 us | 1.00 | 1250.0000 | 625.0000 | 93.7500 | 7696.09 KB | 1.00 |

PR Checklist

  • PR has a meaningful title

  • PR has a linked issue.

  • Described changes

  • Tests

    • Automated tests added
    • OR
    • Test exception
    • OR
    • N/A
  • Documentation

    • Documentation PR or issue filled
    • OR
    • N/A

@jgonz120 jgonz120 marked this pull request as ready for review December 22, 2023 21:59
@jgonz120 jgonz120 requested a review from a team as a code owner December 22, 2023 21:59
@jgonz120 jgonz120 changed the base branch from dev to dev-jgonz120-FeatureBranch-Stj-Migration December 22, 2023 21:59
@ghost ghost added the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Dec 30, 2023
@ghost

ghost commented Dec 30, 2023

This PR has been automatically marked as stale because it has no activity for 7 days. It will be closed if no further activity occurs within another 7 days of this comment. If it is closed, you may reopen it anytime when you're ready again, as long as you don't delete the branch.

@ghost ghost removed the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Jan 4, 2024
@@ -141,5 +170,36 @@ internal static JToken WriteString(string item)
{
return item != null ? new JValue(item) : JValue.CreateNull();
}

internal static NuGetVersion ParseNugetVersion(string value)
Contributor Author

#5529 (comment)

@davkean @nkolev92 continuing this conversation here. Would we want the cache of version/version range to live only during the parsing of the file?

Member

I think defining a great lifetime for these caches would be challenging.

I'm comfortable if this change starts with the assets file read scope, but we should look into expanding it.

Member

The title of this PR is to migrate to System.Text.Json, and the existing Newtonsoft.Json code doesn't do caching, so can you bump caching work to a different PR?

I'm behind on my other priorities, but ideally I should start work on NuGet/Home#12124 in the next sprint or two. For that work, I need this PR merged into the feature branch, and the feature branch merged into the dev branch.

If merging the assets file System.Text.Json work is delayed, then that will delay my ability to start on this other work. Therefore, I'd like to minimize scope creep that blocks other work.

Contributor Author

I love that, so I went ahead and undid the caching stuff and saved a stash of it.

{
/// <summary>
/// A <see cref="Utf8JsonStreamReaderConverter{T}"/> to allow read JSON into <see cref="LockFileTargetLibrary"/>
/// </summary>
Contributor

It would be super helpful if these converters showed an example of exactly what they are parsing, so that it can be matched to the code.
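As an illustration of the kind of example being requested, here is a minimal sketch of the input a `LockFileTargetLibrary` converter would see. It assumes the usual `project.assets.json` layout, where each entry under a target framework is keyed by `Name/Version`. The class name `TargetLibraryKeyExample`, the sample package, and the `IndexOf`/`Substring` split (standing in for the PR's `SplitInTwo` helper) are hypothetical and not taken from the PR.

```csharp
using System;
using System.Text;
using System.Text.Json;

public static class TargetLibraryKeyExample
{
    // Sample of one library entry as it appears under
    // "targets" -> "<framework>" in project.assets.json (values are illustrative).
    public const string SampleJson = @"{
      ""Newtonsoft.Json/13.0.3"": {
        ""type"": ""package"",
        ""compile"": { ""lib/net6.0/Newtonsoft.Json.dll"": {} },
        ""runtime"": { ""lib/net6.0/Newtonsoft.Json.dll"": {} }
      }
    }";

    // Reads the first property name and splits it into name and version.
    // This is a stand-in for the converter logic, not the PR's implementation.
    public static void ReadKey(string json, out string name, out string version)
    {
        var reader = new Utf8JsonReader(Encoding.UTF8.GetBytes(json));
        while (reader.Read())
        {
            if (reader.TokenType != JsonTokenType.PropertyName)
            {
                continue;
            }

            string propertyName = reader.GetString();
            int slash = propertyName.IndexOf('/');
            if (slash < 0)
            {
                // No version part in the key.
                name = propertyName;
                version = null;
                return;
            }

            name = propertyName.Substring(0, slash);
            version = propertyName.Substring(slash + 1);
            return;
        }

        throw new JsonException("No property name found.");
    }
}
```

Calling `TargetLibraryKeyExample.ReadKey(TargetLibraryKeyExample.SampleJson, out var name, out var version)` yields the two halves of the key, which is the point where the converter would assign `Name` and parse `Version`.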

var propertyName = reader.GetString();
var (targetLibraryName, version) = propertyName.SplitInTwo('/');
lockFileTargetLibrary.Name = targetLibraryName;
lockFileTargetLibrary.Version = version is null ? null : JsonUtility.ParseNugetVersion(version);
Contributor

@davkean davkean Jan 11, 2024

Separate from this PR, but JsonUtility.ParseNugetVersion should be a candidate for parsing from a UTF-8 string, or at least a ReadOnlySpan, to avoid the wasted allocation of splitting the version into a string only to throw it away. This is about 3% of allocations in the trace you shared.
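A minimal sketch of the span-based idea, assuming the key has already been decoded to characters: the split returns slices of the original buffer instead of new strings. `SpanKeySplitter` and `TrySplit` are hypothetical names. A version parser that accepts a span is not shown, because no such overload appears in this PR.

```csharp
using System;

public static class SpanKeySplitter
{
    // Splits "Name/Version" into two slices of the original buffer.
    // No substrings are allocated; both outputs point into 'key'.
    public static bool TrySplit(
        ReadOnlySpan<char> key,
        out ReadOnlySpan<char> name,
        out ReadOnlySpan<char> version)
    {
        int slash = key.IndexOf('/');
        if (slash < 0)
        {
            // No separator: the whole key is the name and the version is empty.
            name = key;
            version = default;
            return false;
        }

        name = key.Slice(0, slash);
        version = key.Slice(slash + 1);
        return true;
    }
}
```

For example, `SpanKeySplitter.TrySplit("Newtonsoft.Json/13.0.3".AsSpan(), out var name, out var version)` produces two slices without creating intermediate strings. On .NET Framework this relies on the System.Memory package for `ReadOnlySpan<T>`.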

{
string propertyName = reader.GetString();
string versionString = reader.ReadNextTokenAsString();

Contributor

35% of allocations come from these values. I feel like sharing the strings for common dependencies would help reduce allocations, especially given that the version number is immediately thrown away.
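One way string sharing could look, as a rough sketch only: a pool scoped to a single file read that maps UTF-8 bytes to an already created string, so a repeated dependency name is decoded once. `Utf8StringPool` is a hypothetical type that is not part of this PR, and the sketch assumes the bytes passed in are contiguous and contain no JSON escape sequences.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public sealed class Utf8StringPool
{
    // Buckets keyed by a hash of the UTF-8 bytes; each entry keeps the
    // bytes (for exact comparison) and the string decoded from them.
    private readonly Dictionary<int, List<KeyValuePair<byte[], string>>> _buckets =
        new Dictionary<int, List<KeyValuePair<byte[], string>>>();

    // Returns a shared string for the given UTF-8 bytes, decoding only
    // the first time a particular byte sequence is seen.
    public string GetOrAdd(ReadOnlySpan<byte> utf8)
    {
        int hash = ComputeHash(utf8);
        List<KeyValuePair<byte[], string>> bucket;
        if (_buckets.TryGetValue(hash, out bucket))
        {
            foreach (KeyValuePair<byte[], string> entry in bucket)
            {
                if (utf8.SequenceEqual(new ReadOnlySpan<byte>(entry.Key)))
                {
                    return entry.Value;
                }
            }
        }
        else
        {
            bucket = new List<KeyValuePair<byte[], string>>();
            _buckets.Add(hash, bucket);
        }

        byte[] copy = utf8.ToArray();
        string text = Encoding.UTF8.GetString(copy);
        bucket.Add(new KeyValuePair<byte[], string>(copy, text));
        return text;
    }

    // FNV-1a hash over the raw bytes.
    private static int ComputeHash(ReadOnlySpan<byte> utf8)
    {
        unchecked
        {
            uint hash = 2166136261u;
            foreach (byte b in utf8)
            {
                hash = (hash ^ b) * 16777619u;
            }

            return (int)hash;
        }
    }
}
```

The trade-off is that the first occurrence of each value costs an extra byte-array copy, so a pool like this only pays off when the same names recur often, as they would for common dependencies.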

Member

@zivkan zivkan left a comment

I have a few correctness concerns about methods in the shared/generic Utf8JsonStreamReader class. Even if specific converters use them in a safe way, it is a utility class that could be called by any (internal) method, which I think raises the quality bar: it should adhere to the principle of least surprise.

@jgonz120 jgonz120 requested a review from zivkan January 22, 2024 23:06
@jgonz120
Contributor Author

@zivkan @nkolev92 One thought: the default restore path currently uses STJ when the environment variable isn't set. Do we want to release with that default, or release still using Newtonsoft by default and ask internal folks to opt in?

@zivkan
Member

zivkan commented Jan 23, 2024

We don't expect 3rd party tools to write the asset file, and we have a lot of integration tests. So, I think if the tests pass using STJ, then it should be quite safe, but the env var to fall back to NJ is a sufficient safety barrier. So, I think that STJ by default is fine.

@nkolev92
Member

Great job getting this done!

You probably need to do a rebase before merging into dev.
If you look at the feature branch and see which dev commits are missing (dev-jgonz120-FeatureBranch-Stj-Migration...dev), this one, 95f6e0c, has a conflict with your changes, so it'll probably need some work to get it to work equivalently.

@jgonz120 jgonz120 merged commit ed7f00c into dev-jgonz120-FeatureBranch-Stj-Migration Jan 26, 2024
16 checks passed
@jgonz120 jgonz120 deleted the dev-FeatureBranch-jgonz120-Stj-LockfileFormat-Migration branch January 26, 2024 20:40
jgonz120 added a commit that referenced this pull request Jan 27, 2024
…Json file using STJ. (#5558)

Update the LockFileFormat class to use STJ for parsing the assets file.
jgonz120 added a commit that referenced this pull request Feb 5, 2024
…Json file using STJ. (#5558)

Update the LockFileFormat class to use STJ for parsing the assets file.
jgonz120 added a commit that referenced this pull request Feb 12, 2024
Created a new struct to parse the assets file using System.Text.Json instead of Newtonsoft. It reads the file without loading it completely into memory, reducing memory allocations.
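The general technique of reading JSON from a stream in chunks can be sketched as follows. This is not the PR's `Utf8JsonStreamReader`; it is a minimal illustration built on `Utf8JsonReader` and `JsonReaderState`, and the `ChunkedJsonReadExample` class and its property-counting task are hypothetical. It assumes the stream has no UTF-8 byte order mark.

```csharp
using System;
using System.IO;
using System.Text.Json;

public static class ChunkedJsonReadExample
{
    // Counts property names while holding only one buffer of the file in memory.
    public static int CountPropertyNames(Stream stream, int initialBufferSize = 4096)
    {
        byte[] buffer = new byte[initialBufferSize];
        int bytesInBuffer = 0;
        int propertyCount = 0;
        bool isFinalBlock = false;
        var state = new JsonReaderState();

        while (!isFinalBlock)
        {
            // Fill the free part of the buffer; 0 bytes read means end of stream.
            int bytesRead = stream.Read(buffer, bytesInBuffer, buffer.Length - bytesInBuffer);
            bytesInBuffer += bytesRead;
            isFinalBlock = bytesRead == 0;

            var reader = new Utf8JsonReader(
                new ReadOnlySpan<byte>(buffer, 0, bytesInBuffer), isFinalBlock, state);

            // Read() returns false when the next token is incomplete in this chunk.
            while (reader.Read())
            {
                if (reader.TokenType == JsonTokenType.PropertyName)
                {
                    propertyCount++;
                }
            }

            // Carry parser state forward and move unread bytes to the front.
            state = reader.CurrentState;
            int consumed = (int)reader.BytesConsumed;
            bytesInBuffer -= consumed;
            Buffer.BlockCopy(buffer, consumed, buffer, 0, bytesInBuffer);

            // A single token larger than the buffer: grow so progress is possible.
            if (bytesInBuffer == buffer.Length)
            {
                Array.Resize(ref buffer, buffer.Length * 2);
            }
        }

        return propertyCount;
    }
}
```

Peak memory in this pattern is bounded by the buffer size plus whatever objects the caller builds from the tokens, rather than by the size of the file.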

PRs associated with this:
* Create class for reading Json files in chunks (#5530)
* System.Text.Json Migration - Adding code to parse the PackageSpec using STJ (#5541)
* System.Text.Json Migration - Adding code to parse the Project.Assets.Json file using STJ.  (#5558)
Development

Successfully merging this pull request may close these issues.

Stop using JObject in assets file reading to reduce allocations
5 participants