Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync project level info in parallel. #72976

Merged
merged 58 commits into from
Apr 12, 2024
Merged
Show file tree
Hide file tree
Changes from 15 commits
Commits
Show all changes
58 commits
Select commit Hold shift + click to select a range
da8d7e0
Add perf logging
CyrusNajmabadi Apr 10, 2024
9829da1
sync in strips
CyrusNajmabadi Apr 10, 2024
3552d4b
in progress'
CyrusNajmabadi Apr 10, 2024
6377536
fix
CyrusNajmabadi Apr 10, 2024
d150afd
downstream
CyrusNajmabadi Apr 10, 2024
a255fa8
Fetch attributes up front'
CyrusNajmabadi Apr 10, 2024
6d82a4b
get info and text together
CyrusNajmabadi Apr 10, 2024
87de924
Merge remote-tracking branch 'upstream/main' into syncStripes
CyrusNajmabadi Apr 10, 2024
046d435
sync in parallel
CyrusNajmabadi Apr 10, 2024
c8b9a0f
fix
CyrusNajmabadi Apr 10, 2024
d140c85
delete
CyrusNajmabadi Apr 10, 2024
f82f5ad
Apply suggestions from code review
CyrusNajmabadi Apr 10, 2024
58ae336
revert
CyrusNajmabadi Apr 10, 2024
ee6ce0a
Merge branch 'syncStripes' of https://github.com/CyrusNajmabadi/rosly…
CyrusNajmabadi Apr 10, 2024
1033273
Merge branch 'syncStripes' into syncStripesParallel
CyrusNajmabadi Apr 10, 2024
c63c074
remove log helper
CyrusNajmabadi Apr 10, 2024
cdc412a
rename file
CyrusNajmabadi Apr 11, 2024
2439f04
Search less
CyrusNajmabadi Apr 11, 2024
939b9c4
rename
CyrusNajmabadi Apr 11, 2024
357bfeb
rename file
CyrusNajmabadi Apr 11, 2024
c808509
Search less
CyrusNajmabadi Apr 11, 2024
542020e
rename
CyrusNajmabadi Apr 11, 2024
170a0d3
Merge branch 'syncStripes' into syncStripesParallel
CyrusNajmabadi Apr 11, 2024
4c8be11
Merge remote-tracking branch 'upstream/main' into syncStripesParallel2
CyrusNajmabadi Apr 11, 2024
bd55071
Hybrid mode
CyrusNajmabadi Apr 11, 2024
bc626e2
Docs
CyrusNajmabadi Apr 11, 2024
6cbd1b5
Share code
CyrusNajmabadi Apr 11, 2024
e15a73c
optimize
CyrusNajmabadi Apr 11, 2024
69528e5
Add docs
CyrusNajmabadi Apr 11, 2024
21ae618
Docs
CyrusNajmabadi Apr 11, 2024
510331c
Share code
CyrusNajmabadi Apr 11, 2024
d8eb43a
Docs
CyrusNajmabadi Apr 11, 2024
92654f4
Docs
CyrusNajmabadi Apr 11, 2024
afb12f6
fine grained kinds
CyrusNajmabadi Apr 11, 2024
d49c484
Simplify pattern
CyrusNajmabadi Apr 11, 2024
dc8ed7b
Only search solutions and project checksums
CyrusNajmabadi Apr 11, 2024
5b3db93
Add asset hints
CyrusNajmabadi Apr 11, 2024
8929756
REmove unused
CyrusNajmabadi Apr 11, 2024
38902cf
Simplify more
CyrusNajmabadi Apr 11, 2024
2a39c5c
Add implicit
CyrusNajmabadi Apr 11, 2024
4304094
Use kinds uniformly
CyrusNajmabadi Apr 11, 2024
b4efb7f
Updates
CyrusNajmabadi Apr 11, 2024
e91b985
speeling
CyrusNajmabadi Apr 11, 2024
6142d02
Simplify
CyrusNajmabadi Apr 11, 2024
9cdb0bc
Do not capture
CyrusNajmabadi Apr 11, 2024
c9193fd
Finer grained kinds
CyrusNajmabadi Apr 11, 2024
84e2b7a
Finer grained kinds
CyrusNajmabadi Apr 11, 2024
633f604
in progress
CyrusNajmabadi Apr 11, 2024
b044039
in progress
CyrusNajmabadi Apr 11, 2024
5569fb5
in progress
CyrusNajmabadi Apr 11, 2024
2ff97df
in progress
CyrusNajmabadi Apr 11, 2024
10f5b5e
in progress
CyrusNajmabadi Apr 11, 2024
9c9f17b
Only doc text
CyrusNajmabadi Apr 11, 2024
1229821
in progress
CyrusNajmabadi Apr 11, 2024
06dd3e5
in progress
CyrusNajmabadi Apr 11, 2024
4639794
Flesh out
CyrusNajmabadi Apr 11, 2024
b7638f6
tests
CyrusNajmabadi Apr 11, 2024
ec5ad04
docs
CyrusNajmabadi Apr 11, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Editor.UnitTests.Workspaces;
using Microsoft.CodeAnalysis.PooledObjects;
using Microsoft.CodeAnalysis.Remote;
using Microsoft.CodeAnalysis.Remote.Testing;
using Microsoft.CodeAnalysis.Serialization;
Expand Down Expand Up @@ -137,7 +138,11 @@ public async Task TestProjectSynchronization()
var assetSource = new SimpleAssetSource(workspace.Services.GetService<ISerializerService>(), map);

var service = new AssetProvider(sessionId, storage, assetSource, remoteWorkspace.Services.GetService<ISerializerService>());
await service.SynchronizeProjectAssetsAsync(await project.State.GetStateChecksumsAsync(CancellationToken.None), CancellationToken.None);

using var _ = ArrayBuilder<ProjectStateChecksums>.GetInstance(out var allProjectChecksums);
allProjectChecksums.Add(await project.State.GetStateChecksumsAsync(CancellationToken.None));

await service.SynchronizeProjectAssetsAsync(allProjectChecksums, CancellationToken.None);

TestUtils.VerifyAssetStorage(map, storage);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,8 @@ internal readonly struct AssetPath
/// </summary>
public static readonly AssetPath SolutionAndTopLevelProjectsOnly = new(AssetPathKind.Solution | AssetPathKind.TopLevelProjects);

public static readonly AssetPath ProjectOnly = new(AssetPathKind.Projects);
CyrusNajmabadi marked this conversation as resolved.
Show resolved Hide resolved

/// <summary>
/// Special instance, allowed only in tests/debug-asserts, that can do a full lookup across the entire checksum
/// tree. Should not be used in normal release-mode product code.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -272,14 +272,13 @@ public async Task FindAsync(

if (projectState.TryGetStateChecksums(out var projectStateChecksums))
{
if (searchingChecksumsLeft.Remove(projectStateChecksums.Checksum))
result[projectStateChecksums.Checksum] = projectStateChecksums;

if (searchingChecksumsLeft.Remove(projectStateChecksums.Info))
result[projectStateChecksums.Info] = projectState.Attributes;

if (searchingChecksumsLeft.Remove(projectStateChecksums.CompilationOptions))
result[projectStateChecksums.CompilationOptions] = projectState.CompilationOptions!;
await projectStateChecksums.FindAsync(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code was duplicating exactly what projectStateChecksums.FindAsync already does. We just want to be able to search projects, just without a particular project-id provided. so this now defers to that helper to do that. thanks the enum values this is easy.

projectState,
// Only search the project level info of the project we're diving into. Do not go into documents.
AssetPath.ProjectOnly,
searchingChecksumsLeft,
result,
cancellationToken).ConfigureAwait(false);
}
}
}
Expand Down
105 changes: 70 additions & 35 deletions src/Workspaces/Remote/ServiceHub/Host/AssetProvider.cs
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis.Diagnostics;
using Microsoft.CodeAnalysis.Internal.Log;
using Microsoft.CodeAnalysis.PooledObjects;
using Microsoft.CodeAnalysis.Serialization;
Expand All @@ -22,7 +23,7 @@ namespace Microsoft.CodeAnalysis.Remote;
internal sealed partial class AssetProvider(Checksum solutionChecksum, SolutionAssetCache assetCache, IAssetSource assetSource, ISerializerService serializerService)
: AbstractAssetProvider
{
private const int PooledChecksumArraySize = 256;
private const int PooledChecksumArraySize = 1024;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i upped this as i was gonig over 256 a fair bit (easy when we have 900 projects in roslyn alone).

private static readonly ObjectPool<Checksum[]> s_checksumPool = new(() => new Checksum[PooledChecksumArraySize], 16);

private readonly Checksum _solutionChecksum = solutionChecksum;
Expand Down Expand Up @@ -109,16 +110,21 @@ await this.SynchronizeAssetsAsync<object, Dictionary<Checksum, object>>(
static (checksum, asset, checksumToObjects) => checksumToObjects.Add(checksum, asset),
arg: checksumToObjects, cancellationToken).ConfigureAwait(false);

using var _3 = ArrayBuilder<ProjectStateChecksums>.GetInstance(out var allProjectStateChecksums);

// fourth, get all projects and documents in the solution
foreach (var (projectChecksum, _) in stateChecksums.Projects)
{
var projectStateChecksums = (ProjectStateChecksums)checksumToObjects[projectChecksum];
await SynchronizeProjectAssetsAsync(projectStateChecksums, cancellationToken).ConfigureAwait(false);
allProjectStateChecksums.Add(projectStateChecksums);
}

await SynchronizeProjectAssetsAsync(allProjectStateChecksums, cancellationToken).ConfigureAwait(false);
}
}

public async ValueTask SynchronizeProjectAssetsAsync(ProjectStateChecksums projectChecksums, CancellationToken cancellationToken)
public async ValueTask SynchronizeProjectAssetsAsync(
ArrayBuilder<ProjectStateChecksums> allProjectChecksums, CancellationToken cancellationToken)
{
// this will pull in assets that belong to the given project checksum to this remote host. this one is not
// supposed to be used for functionality but only for perf. that is why it doesn't return anything. to get
Expand All @@ -129,58 +135,87 @@ public async ValueTask SynchronizeProjectAssetsAsync(ProjectStateChecksums proje
// GetAssetAsync call will most likely cache hit. it is most likely since we might change cache heuristic in
// future which make data to live a lot shorter in the cache, and the data might get expired before one actually
// consume the data.
using (Logger.LogBlock(FunctionId.AssetService_SynchronizeProjectAssetsAsync, Checksum.GetProjectChecksumsLogInfo, projectChecksums, cancellationToken))
using (Logger.LogBlock(FunctionId.AssetService_SynchronizeProjectAssetsAsync, message: null, cancellationToken))
CyrusNajmabadi marked this conversation as resolved.
Show resolved Hide resolved
{
await SynchronizeProjectAssetsWorkerAsync().ConfigureAwait(false);
}

async ValueTask SynchronizeProjectAssetsWorkerAsync()
{
// get children of project checksum objects at once
using var _ = ArrayBuilder<Task>.GetInstance(out var tasks);

// Make parallel requests for all the project data across all projects at once.
tasks.Add(SynchronizeProjectAssetAsync<ProjectInfo.ProjectAttributes>(static p => p.Info, cancellationToken));
tasks.Add(SynchronizeProjectAssetAsync<CompilationOptions>(static p => p.CompilationOptions, cancellationToken));
tasks.Add(SynchronizeProjectAssetAsync<ParseOptions>(static p => p.ParseOptions, cancellationToken));
tasks.Add(SynchronizeProjectAssetCollectionAsync<ProjectReference>(static p => p.ProjectReferences, cancellationToken));
tasks.Add(SynchronizeProjectAssetCollectionAsync<MetadataReference>(static p => p.MetadataReferences, cancellationToken));
tasks.Add(SynchronizeProjectAssetCollectionAsync<AnalyzerReference>(static p => p.AnalyzerReferences, cancellationToken));

// Then sync each project's documents in parallel with each other.
foreach (var projectChecksums in allProjectChecksums)
tasks.Add(SynchronizeProjectDocumentsAsync(projectChecksums, cancellationToken));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cancellationToken

Any concern about a potential large increase in cancellation exceptions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can cross that bridge if it happens :-)


await Task.WhenAll(tasks).ConfigureAwait(false);
}

static void AddAll(HashSet<Checksum> checksums, ChecksumCollection checksumCollection)
{
foreach (var checksum in checksumCollection)
checksums.Add(checksum);
}

async Task SynchronizeProjectAssetAsync<TAsset>(Func<ProjectStateChecksums, Checksum> getChecksum, CancellationToken cancellationToken)
{
await Task.Yield();
using var _ = PooledHashSet<Checksum>.GetInstance(out var checksums);

checksums.Add(projectChecksums.Info);
checksums.Add(projectChecksums.CompilationOptions);
checksums.Add(projectChecksums.ParseOptions);
AddAll(checksums, projectChecksums.ProjectReferences);
AddAll(checksums, projectChecksums.MetadataReferences);
AddAll(checksums, projectChecksums.AnalyzerReferences);
AddAll(checksums, projectChecksums.Documents.Checksums);
AddAll(checksums, projectChecksums.AdditionalDocuments.Checksums);
AddAll(checksums, projectChecksums.AnalyzerConfigDocuments.Checksums);
foreach (var projectChecksums in allProjectChecksums)
checksums.Add(getChecksum(projectChecksums));

// First synchronize all the top-level info about this project.
await this.SynchronizeAssetsAsync<object, VoidResult>(
assetPath: AssetPath.ProjectAndDocuments(projectChecksums.ProjectId), checksums, callback: null, arg: default, cancellationToken).ConfigureAwait(false);
await this.SynchronizeAssetsAsync<TAsset, VoidResult>(
AssetPath.SolutionAndTopLevelProjectsOnly, checksums, callback: null, arg: default, cancellationToken).ConfigureAwait(false);
CyrusNajmabadi marked this conversation as resolved.
Show resolved Hide resolved
}

checksums.Clear();
async Task SynchronizeProjectAssetCollectionAsync<TAsset>(Func<ProjectStateChecksums, ChecksumCollection> getChecksums, CancellationToken cancellationToken)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is almost the same as the prior method. just the primary method says "we have one checksum of a particular type per project" whereas thsi method says "we have many checksums of a particular type per project".

Note; lots of checksums will be duplicated across projects (like metadata and analyzer references) so thsi hashset dedupes a lot most of the time.

{
await Task.Yield();
using var _ = PooledHashSet<Checksum>.GetInstance(out var checksums);

// Then synchronize the info about all the documents within.
await CollectChecksumChildrenAsync(checksums, projectChecksums.Documents).ConfigureAwait(false);
await CollectChecksumChildrenAsync(checksums, projectChecksums.AdditionalDocuments).ConfigureAwait(false);
await CollectChecksumChildrenAsync(checksums, projectChecksums.AnalyzerConfigDocuments).ConfigureAwait(false);
foreach (var projectChecksums in allProjectChecksums)
AddAll(checksums, getChecksums(projectChecksums));

await this.SynchronizeAssetsAsync<object, VoidResult>(
assetPath: AssetPath.ProjectAndDocuments(projectChecksums.ProjectId), checksums, callback: null, arg: default, cancellationToken).ConfigureAwait(false);
await this.SynchronizeAssetsAsync<TAsset, VoidResult>(
AssetPath.SolutionAndTopLevelProjectsOnly, checksums, callback: null, arg: default, cancellationToken).ConfigureAwait(false);
}

async ValueTask CollectChecksumChildrenAsync(HashSet<Checksum> checksums, ChecksumsAndIds<DocumentId> collection)
async Task SynchronizeProjectDocumentsAsync(ProjectStateChecksums projectChecksums, CancellationToken cancellationToken)
{
// This GetAssetsAsync call should be fast since they were just retrieved above. There's a small chance
// the asset-cache GC pass may have cleaned them up, but that should be exceedingly rare.
var allDocChecksums = await this.GetAssetsAsync<DocumentStateChecksums>(
AssetPath.ProjectAndDocuments(projectChecksums.ProjectId), collection.Checksums, cancellationToken).ConfigureAwait(false);
foreach (var docChecksums in allDocChecksums)
await Task.Yield();
using var _1 = PooledHashSet<Checksum>.GetInstance(out var checksums);

AddAll(checksums, projectChecksums.Documents.Checksums);
AddAll(checksums, projectChecksums.AdditionalDocuments.Checksums);
AddAll(checksums, projectChecksums.AnalyzerConfigDocuments.Checksums);

// First, fetch all the DocumentStateChecksums for all the documents in the project.
using var _2 = ArrayBuilder<DocumentStateChecksums>.GetInstance(out var allDocumentStateChecksums);
await this.SynchronizeAssetsAsync<DocumentStateChecksums, ArrayBuilder<DocumentStateChecksums>>(
assetPath: AssetPath.ProjectAndDocuments(projectChecksums.ProjectId), checksums,
CyrusNajmabadi marked this conversation as resolved.
Show resolved Hide resolved
static (_, documentStateChecksums, allDocumentStateChecksums) => allDocumentStateChecksums.Add(documentStateChecksums),
allDocumentStateChecksums,
cancellationToken).ConfigureAwait(false);

// Now go and fetch the info and text for all of those documents.
checksums.Clear();
foreach (var docChecksums in allDocumentStateChecksums)
{
checksums.Add(docChecksums.Info);
checksums.Add(docChecksums.Text);
}
}

static void AddAll(HashSet<Checksum> checksums, ChecksumCollection checksumCollection)
{
foreach (var checksum in checksumCollection)
checksums.Add(checksum);
await this.SynchronizeAssetsAsync<object, VoidResult>(
assetPath: AssetPath.ProjectAndDocuments(projectChecksums.ProjectId), checksums, callback: null, arg: default, cancellationToken).ConfigureAwait(false);
CyrusNajmabadi marked this conversation as resolved.
Show resolved Hide resolved
}
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -264,16 +264,25 @@ await _assetProvider.GetAssetsAsync<CompilationOptions, VoidResult>(
}

using var _2 = ArrayBuilder<ProjectInfo>.GetInstance(out var projectInfos);
using var _3 = ArrayBuilder<ProjectStateChecksums>.GetInstance(out var projectStateChecksumsToAdd);

// added project
foreach (var (projectId, newProjectChecksums) in newProjectIdToStateChecksums)
{
if (!oldProjectIdToStateChecksums.ContainsKey(projectId))
{
// bulk sync added project assets fully since we'll definitely need that data, and we won't want
// to make tons of intermediary calls for it.
projectStateChecksumsToAdd.Add(newProjectChecksums);
}

// bulk sync added project assets fully since we'll definitely need that data, and we won't want

await _assetProvider.SynchronizeProjectAssetsAsync(projectStateChecksumsToAdd, cancellationToken).ConfigureAwait(false);

await _assetProvider.SynchronizeProjectAssetsAsync(newProjectChecksums, cancellationToken).ConfigureAwait(false);
foreach (var (projectId, newProjectChecksums) in newProjectIdToStateChecksums)
{
if (!oldProjectIdToStateChecksums.ContainsKey(projectId))
{
// Now make a ProjectInfo corresponding to the new project checksums. This should be fast due
// to the bulk sync we just performed above.
var projectInfo = await _assetProvider.CreateProjectInfoAsync(projectId, newProjectChecksums.Checksum, cancellationToken).ConfigureAwait(false);
projectInfos.Add(projectInfo);
}
Expand All @@ -295,7 +304,7 @@ await _assetProvider.GetAssetsAsync<CompilationOptions, VoidResult>(
}
}

using var _3 = ArrayBuilder<ProjectId>.GetInstance(out var projectsToRemove);
using var _4 = ArrayBuilder<ProjectId>.GetInstance(out var projectsToRemove);

// removed project
foreach (var (projectId, _) in oldProjectIdToStateChecksums)
Expand Down Expand Up @@ -549,7 +558,9 @@ await _assetProvider.GetAssetsAsync<DocumentStateChecksums, Dictionary<DocumentI
// 🔗 https://devdiv.visualstudio.com/DevDiv/_workitems/edit/1365014
if (newDocumentIdToStateChecksums.Count > 2)
{
await _assetProvider.SynchronizeProjectAssetsAsync(projectChecksums, cancellationToken).ConfigureAwait(false);
using var _ = ArrayBuilder<ProjectStateChecksums>.GetInstance(out var allProjectChecksums);
allProjectChecksums.Add(projectChecksums);
await _assetProvider.SynchronizeProjectAssetsAsync(allProjectChecksums, cancellationToken).ConfigureAwait(false);
}

return await UpdateDocumentsAsync(project, addDocuments, removeDocuments, oldDocumentIdToStateChecksums, newDocumentIdToStateChecksums, cancellationToken).ConfigureAwait(false);
Expand Down
Loading