Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable parallel builds across repos of VMR #18824

Merged
merged 3 commits into from
Mar 8, 2024

Conversation

mthalman
Copy link
Member

@mthalman mthalman commented Feb 29, 2024

Fixes dotnet/source-build#3608

This updates the VMR build configuration so that the repos are built in parallel according to their dependencies. This means that two or more repos with no dependencies between them, such as xdt and symreader, can be built at the same time. Note that the previous configuration of building repos sequentially was already making fairly efficient use of available resources (CPU/disk) so the benefit gained from introducing parallelism is marginal. This will be explained with numbers further below.

By default, the configuration will have parallelism enabled. It can be disabled with /p:BuildInParallel=false. One side effect of parallelism is the behavior that occurs when an error occurs in the build. For CI builds, you'd want to collect errors that may occur from all the repo builds that may be happening in parallel so that they can be examined afterwards. For dev builds, it's preferable to fail the build on the first error. Otherwise, the build may continue building other repos that are not dependent on the one that failed and just wasting time. There's already the StopOnFirstFailure setting that is set to true when building the repos but, by default, this is implicitly ignored when BuildInParallel is true as shown in this output from MSBuild:

StopOnFirstFailure will have no effect when the following conditions are all present: 1) The system is running in multiple process mode 2) The BuildInParallel property is true. 3) The RunEachTargetSeparately property is false.

Therefore, in order to handle this, RunEachTargetSeparately is set to true. But that is only done for dev builds. For CI builds, we essentially want StopOnFirstFailure to be ignored.

Now onto the numbers. As mentioned, enabling parallelism only provides marginal benefit. Build times for source build show the most benefit compared to the vertical builds but that may be because not all repos are yet enabled for vertical builds. Why is there only a small benefit? Because the machine resources are already being used efficiently, running the repo builds in parallel is trying to make use of additional resources that just aren't there. So it ends up taking each of those repos longer to build in total time than it did when being run in a sequential manner. Here are some example build times of repos showing their build time when being built sequentially vs in parallel:

Repo Sequential build time Parallel build time
cecil 00:00:13 00:00:48
roslyn-analyzers 00:03:30 00:05:10
razor 00:03:28 00:07:15

This leads to marginally better build times or even just equivalent build times compared to the original sequential behavior. For total time, this can get source build from a time of 1:12:00 down to 1:06:00, shaving off 6 minutes in the build. For vertical builds, it varies according to the platform. It shows an improvement of anywhere from 0 to 4 mins.

@mthalman mthalman requested a review from a team as a code owner February 29, 2024 14:50
@mthalman mthalman requested a review from mmitche February 29, 2024 14:50
Copy link
Member

@mmitche mmitche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surprised to see so little benefit compared to my earlier experiments. Let's see how it plays out as the last few repos get enabled.

Copy link
Member

@ViktorHofer ViktorHofer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now with repositories running in parallel, how does the console output look like when building locally? I would expect it to not be readable anymore. I wonder if we should re-enable MinimalConsoleLogOutput (we disabled that in the past) with parallel builds enabled.

I wonder if a sequential build with full logging (current experience) makes more sense locally.

@mthalman
Copy link
Member Author

mthalman commented Mar 5, 2024

Now with repositories running in parallel, how does the console output look like when building locally? I would expect it to not be readable anymore. I wonder if we should re-enable MinimalConsoleLogOutput (we disabled that in the past) with parallel builds enabled.

I wonder if a sequential build with full logging (current experience) makes more sense locally.

Yes, it can be jumbled. Here's a good example snippet from the output:

  source-build-externals using package version properties saved at /vmr/artifacts/obj/PackageVersions/PackageVersions.source-build-externals.Current.props and /vmr/artifacts/obj/PackageVersions/PackageVersions.source-build-externals.Previous.props
  diagnostics using package version properties saved at /vmr/artifacts/obj/PackageVersions/PackageVersions.diagnostics.Current.props and /vmr/artifacts/obj/PackageVersions/PackageVersions.diagnostics.Previous.props
  [15:01:39.40] Building diagnostics
  Running command:
    /vmr/src/diagnostics/eng/common/build.sh --restore --build --pack --publish --ci --configuration Release -bl /p:DotNetBuildRepo=true /p:DotNetBuildOrchestrator=true /p:RestoreConfigFile=/vmr/artifacts/obj/diagnostics/NuGet.config /p:ArcadeBuildFromSource=true /p:DotNetBuildSourceOnly=true /p:PreviouslySourceBuiltNupkgCacheDir="/vmr/prereqs/packages/previously-source-built/" /p:ReferencePackageNupkgCacheDir="/vmr/prereqs/packages/reference/"
    Log: /vmr/artifacts/log/Release/diagnostics.log
    With Environment Variables:
      SOURCE_BUILT_SDK_VERSION_TRAVERSAL=3.4.0
      DOTNET_INSTALL_DIR=/vmr/.dotnet/
      DOTNET_PATH=/vmr/.dotnet/
      SOURCE_BUILT_SDK_DIR_ARCADE=/vmr/artifacts/source-built-sdks/Microsoft.DotNet.Arcade.Sdk/
      DOTNET_HOST_PATH=/vmr/.dotnet/dotnet
      _InitializeDotNetCli=/vmr/.dotnet/
      SOURCE_BUILT_SDK_DIR_ARCADE_CMAKE_SDK=/vmr/artifacts/source-built-sdks/Microsoft.DotNet.CMake.Sdk/
      _DotNetInstallDir=/vmr/.dotnet/
      SOURCE_BUILT_SDK_DIR_ARCADE_SHARED_FX_SDK=/vmr/artifacts/source-built-sdks/Microsoft.DotNet.SharedFramework.Sdk/
      _InitializeToolset=/vmr/artifacts/source-built-sdks/Microsoft.DotNet.Arcade.Sdk/tools/Build.proj
      SOURCE_BUILT_SDK_DIR_NOTARGETS=/vmr/artifacts/source-built-sdks/Microsoft.Build.NoTargets/
      _OverrideArcadeInitializeBuildToolFramework=net9.0
      DotNetUseShippingVersions=true
      SOURCE_BUILT_SDK_DIR_TRAVERSAL=/vmr/artifacts/source-built-sdks/Microsoft.Build.Traversal/
  DirSize Before Building command-line-api
  Filesystem      Size  Used Avail Use% Mounted on
  /dev/sda1        63G   12G   49G  19% /vmr
      PreReleaseVersionLabel=preview
      PackageVersionStamp=preview
      ContinuousIntegrationBuild=true
  [15:01:39.40] Building source-build-externals
  Running command:
    /vmr/src/source-build-externals/eng/common/build.sh --restore --build --pack --publish --ci --configuration Release -bl /p:DotNetBuildRepo=true /p:DotNetBuildOrchestrator=true /p:RestoreConfigFile=/vmr/artifacts/obj/source-build-externals/NuGet.config /p:ArcadeBuildFromSource=true /p:DotNetBuildSourceOnly=true /p:PreviouslySourceBuiltNupkgCacheDir="/vmr/prereqs/packages/previously-source-built/" /p:ReferencePackageNupkgCacheDir="/vmr/prereqs/packages/reference/" /p:UseInnerClone=true /p:CopySrcInsteadOfClone=true /p:CopyWipIntoInnerSourceBuildRepo=true
    Log: /vmr/artifacts/log/Release/source-build-externals.log

I'll investigate the use of MinimalConsoleLogOutput.

@ViktorHofer
Copy link
Member

I'll investigate the use of MinimalConsoleLogOutput.

The thing is, we intentionally disabled the minimal console log output feature when building locally, a few months ago. I'm not sure if we want to revert that for the sake of parallel build support given that the gains aren't significant on your machine. Even with minimal console log, the log could theoretically still be jumbled. That said, I'm not against enabling it but I wonder if there's a way to prevent the jumbled output.

@mthalman
Copy link
Member Author

mthalman commented Mar 5, 2024

What was the motivation for disabling minimal output? Just to provide more info?

@ViktorHofer
Copy link
Member

ViktorHofer commented Mar 5, 2024

What was the motivation for disabling minimal output? Just to provide more info?

Couple of things:

  1. Full console output makes it possible to search / investigate issues faster. With minimal console log output you need to open the separate repo log text file.
  2. During the unified-build bring-up we wanted to fully understand what repositories build. The minimal console log makes understanding the WHAT harder.
  3. The minimal console log back then had couple of issues, i.e. it always appended instead of overwriting.

cc @mmitche in case you remember

cc @rainersigwald in case you have suggestions / opinions here. Minimal console log output means that output from an msbuild Exec task is redirected into a separate file. Now that we want to enable parallel repo builds, we wonder if we should re-enable that minimal console log feature of if there's something else that we could do in terms of logging.

@mthalman mthalman requested a review from a team as a code owner March 8, 2024 18:32
@mthalman mthalman enabled auto-merge (squash) March 8, 2024 18:32
@mthalman
Copy link
Member Author

mthalman commented Mar 8, 2024

I'll just get this merged for now and we can follow up later with any perceived usability issues with the log.

@mthalman
Copy link
Member Author

mthalman commented Mar 8, 2024

Got an interesting build error:

##[error]repo-projects\Directory.Build.targets(464,5): error MSB3026: Could not copy "D:\a\_work\1\vmr\src\aspire\artifacts\sb\package-cache\microsoft.diasymreader.pdb2pdb\1.1.0-beta2-19575-01\microsoft.diasymreader.pdb2pdb.1.1.0-beta2-19575-01.nupkg" to "D:\a\_work\1\vmr\.packages\microsoft.diasymreader.pdb2pdb\1.1.0-beta2-19575-01\microsoft.diasymreader.pdb2pdb.1.1.0-beta2-19575-01.nupkg". Beginning retry 1 in 1000ms. The process cannot access the file 'D:\a\_work\1\vmr\.packages\microsoft.diasymreader.pdb2pdb\1.1.0-beta2-19575-01\microsoft.diasymreader.pdb2pdb.1.1.0-beta2-19575-01.nupkg' because it is being used by another process. 

I think what's happening here is that two repos that are being built in parallel are both attempting to copy the restored microsoft.diasymreader.pdb2pdb package to the NuGetPackagesRoot dir, .packages, as part of the CopyInnerBuildRestoredPackages target:

<Copy SourceFiles="@(_InnerPackageCacheFiles)"
DestinationFiles="$(NuGetPackageRoot)%(RecursiveDir)%(Filename)%(Extension)"
SkipUnchangedFiles="true"
Condition="'@(_InnerPackageCacheFiles)' != ''" />

I think this can be resolved with just configuring retries on the Copy task. But I also question whether this target should be executed at all in the unified-build. It seems as though it's only used for prebuilt detection which is only relevant to source only builds.

@mthalman mthalman merged commit 3de6912 into dotnet:main Mar 8, 2024
22 checks passed
@mthalman mthalman deleted the sb3608-parallel branch March 8, 2024 21:59
@rainersigwald
Copy link
Member

Minimal console log output means that output from an msbuild Exec task is redirected into a separate file. Now that we want to enable parallel repo builds, we wonder if we should re-enable that minimal console log feature of if there's something else that we could do in terms of logging.

I don't think we have any great options for you other than those right now. cc @baronfel

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proposal for changes to source-build to support parallelism
4 participants