Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Analysis performance regression when lockfile is enabled #19952

Closed
purkhusid opened this issue Oct 26, 2023 · 9 comments
Closed

Analysis performance regression when lockfile is enabled #19952

purkhusid opened this issue Oct 26, 2023 · 9 comments
Labels
area-Bzlmod Bzlmod-specific PRs, issues, and feature requests P1 I'll work on this now. (Assignee required) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug

Comments

@purkhusid
Copy link

purkhusid commented Oct 26, 2023

Description of the bug:

While migrating our repo to bzlmod we encountered an 3x slowdown in analysis when using bzlmod.

Looking further into it it seems like it's caused by the newly introduced lockfile because the performance returns back to normal when --lockfile_mode=off is set.

I can unfortunately not share an reproduction since the repo is private. The repo is using MODULE.bazel and WORKSPACE.bzlmod since not all of our external deps have been migrated to bzlmod.

I'll share a profile with and without the lockfile:
timing_profile_with_lockfile.gz
timing_profile_without_lockfile.gz

Tagging @fmeum and @Wyverald since I had talked to them on Slack

Which operating system are you running Bazel on?

Linux Fedora 38

What is the output of bazel info release?

release 7.0.0rc2

@Wyverald
Copy link
Member

@SalmaSamy @meteorcloudy

@Wyverald
Copy link
Member

I'm going to preemptively mark this as a release blocker. 3x slowdown is unacceptable

@Wyverald Wyverald added this to the 7.0.0 release blockers milestone Oct 26, 2023
@Wyverald Wyverald added P1 I'll work on this now. (Assignee required) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. area-Bzlmod Bzlmod-specific PRs, issues, and feature requests and removed untriaged labels Oct 26, 2023
@iancha1992
Copy link
Member

@bazel-io fork 7.0.0

@fmeum
Copy link
Collaborator

fmeum commented Oct 26, 2023

@purkhusid Could you share the output of bazel info after each of these builds?

@purkhusid
Copy link
Author

With lockfile_mode=off (25s runtime):

bazel-bin: /workspaces/bazel_output_base/execroot/_main/bazel-out/k8-fastbuild/bin
bazel-genfiles: /workspaces/bazel_output_base/execroot/_main/bazel-out/k8-fastbuild/bin
bazel-testlogs: /workspaces/bazel_output_base/execroot/_main/bazel-out/k8-fastbuild/testlogs
character-encoding: file.encoding = ISO-8859-1, defaultCharset = ISO-8859-1
command_log: /workspaces/bazel_output_base/command.log
committed-heap-size: 6954MB
execution_root: /workspaces/bazel_output_base/execroot/_main
gc-count: 59
gc-time: 1137ms
install_base: /workspaces/bazel_user_root/install/e0df88ff6ee693b65a8a08e8e987fb40
java-home: /workspaces/bazel_user_root/install/e0df88ff6ee693b65a8a08e8e987fb40/embedded_tools/jdk
java-runtime: OpenJDK Runtime Environment (build 21+35) by Azul Systems, Inc.
java-vm: OpenJDK 64-Bit Server VM (build 21+35, mixed mode) by Azul Systems, Inc.
local_resources: RAM=64290MB, CPU=16.0
max-heap-size: 16861MB
output_base: /workspaces/bazel_output_base
output_path: /workspaces/bazel_output_base/execroot/_main/bazel-out
package_path: %workspace%
release: release 7.0.0rc2
repository_cache: null
server_log: /workspaces/bazel_output_base/java.log.codespaces-4c3412.vscode.log.java.20231026-201446.5072
server_pid: 5072
used-heap-size: 2798MB
workspace: /workspaces/monorepo

With --lockfile_mode=update (96s runtime):

bazel-bin: /workspaces/bazel_output_base/execroot/_main/bazel-out/k8-fastbuild/bin
bazel-genfiles: /workspaces/bazel_output_base/execroot/_main/bazel-out/k8-fastbuild/bin
bazel-testlogs: /workspaces/bazel_output_base/execroot/_main/bazel-out/k8-fastbuild/testlogs
character-encoding: file.encoding = ISO-8859-1, defaultCharset = ISO-8859-1
command_log: /workspaces/bazel_output_base/command.log
committed-heap-size: 4445MB
execution_root: /workspaces/bazel_output_base/execroot/_main
gc-count: 143
gc-time: 2837ms
install_base: /workspaces/bazel_user_root/install/e0df88ff6ee693b65a8a08e8e987fb40
java-home: /workspaces/bazel_user_root/install/e0df88ff6ee693b65a8a08e8e987fb40/embedded_tools/jdk
java-runtime: OpenJDK Runtime Environment (build 21+35) by Azul Systems, Inc.
java-vm: OpenJDK 64-Bit Server VM (build 21+35, mixed mode) by Azul Systems, Inc.
local_resources: RAM=64290MB, CPU=16.0
max-heap-size: 16861MB
output_base: /workspaces/bazel_output_base
output_path: /workspaces/bazel_output_base/execroot/_main/bazel-out
package_path: %workspace%
release: release 7.0.0rc2
repository_cache: null
server_log: /workspaces/bazel_output_base/java.log.codespaces-4c3412.vscode.log.java.20231026-201700.7299
server_pid: 7299
used-heap-size: 3670MB
workspace: /workspaces/monorepo

With --lockfile_mode=error (96s runtime):

bazel-bin: /workspaces/bazel_output_base/execroot/_main/bazel-out/k8-fastbuild/bin
bazel-genfiles: /workspaces/bazel_output_base/execroot/_main/bazel-out/k8-fastbuild/bin
bazel-testlogs: /workspaces/bazel_output_base/execroot/_main/bazel-out/k8-fastbuild/testlogs
character-encoding: file.encoding = ISO-8859-1, defaultCharset = ISO-8859-1
command_log: /workspaces/bazel_output_base/command.log
committed-heap-size: 3019MB
execution_root: /workspaces/bazel_output_base/execroot/_main
gc-count: 139
gc-time: 3255ms
install_base: /workspaces/bazel_user_root/install/e0df88ff6ee693b65a8a08e8e987fb40
java-home: /workspaces/bazel_user_root/install/e0df88ff6ee693b65a8a08e8e987fb40/embedded_tools/jdk
java-runtime: OpenJDK Runtime Environment (build 21+35) by Azul Systems, Inc.
java-vm: OpenJDK 64-Bit Server VM (build 21+35, mixed mode) by Azul Systems, Inc.
local_resources: RAM=64290MB, CPU=16.0
max-heap-size: 16861MB
output_base: /workspaces/bazel_output_base
output_path: /workspaces/bazel_output_base/execroot/_main/bazel-out
package_path: %workspace%
release: release 7.0.0rc2
repository_cache: null
server_log: /workspaces/bazel_output_base/java.log.codespaces-4c3412.vscode.log.java.20231026-201940.9941
server_pid: 9941
used-heap-size: 1371MB
workspace: /workspaces/monorepo

I did bazel shutdown between each invocation.

@fmeum
Copy link
Collaborator

fmeum commented Oct 28, 2023

I am able to reproduce the issue with a synthetic module extension that generates 2000 repos. This is the call stack responsible for the slowdown:
IMG-20231028-WA0001.jpg

@fmeum
Copy link
Collaborator

fmeum commented Oct 29, 2023

@purkhusid Are you able to run Bazel built from source (bazel build //src:bazel)? If so, could you build and test #19970?

@purkhusid
Copy link
Author

Ran the proposed fix on on our repo:
With lockfile_mode=off: 50s
With lockfile_mode=update: 41s

Note that it's not the same machine as I used for the earlier runs so the overall slowdown compared to that is expected

The proposed fix seems to have worked and then some additional perf gain!

bazel-io pushed a commit to bazel-io/bazel that referenced this issue Nov 2, 2023
This commit fixes a regression in analysis time caused by 19c0c80. Since `BazelModuleResolutionEvent` and `ModuleExtensionResolutionEvent` are both marked as `storeForReplay`, they are stored in a nested set for essentially all analysis phase nodes. This resulted in frequent (i.e., per target) calls to their `hashCode` methods, which are not cached and delegated to the likewise uncached methods on large `ImmutableMap` and `ImmutableTable` instances.

Since there is no need for these events to be deduplicated, switching to reference equality resolves this issue.

The following analysis phase measurements for a synthetic project with a single "hub and spokes" module extension and 2,000 repos illustrate the effect:

* without lockfile: 4.3s
* with lockfile before this commit: 8.3s
* with lockfile after this commit: 4.2s

Fixes bazelbuild#19952

Closes bazelbuild#19970.

PiperOrigin-RevId: 578734010
Change-Id: I870867c5c509389632793b0d5fe5c43ddc6176f3
bazel-io pushed a commit to bazel-io/bazel that referenced this issue Nov 2, 2023
This commit fixes a regression in analysis time caused by 19c0c80. Since `BazelModuleResolutionEvent` and `ModuleExtensionResolutionEvent` are both marked as `storeForReplay`, they are stored in a nested set for essentially all analysis phase nodes. This resulted in frequent (i.e., per target) calls to their `hashCode` methods, which are not cached and delegated to the likewise uncached methods on large `ImmutableMap` and `ImmutableTable` instances.

Since there is no need for these events to be deduplicated, switching to reference equality resolves this issue.

The following analysis phase measurements for a synthetic project with a single "hub and spokes" module extension and 2,000 repos illustrate the effect:

* without lockfile: 4.3s
* with lockfile before this commit: 8.3s
* with lockfile after this commit: 4.2s

Fixes bazelbuild#19952

Closes bazelbuild#19970.

PiperOrigin-RevId: 578734010
Change-Id: I870867c5c509389632793b0d5fe5c43ddc6176f3
iancha1992 pushed a commit that referenced this issue Nov 2, 2023
This commit fixes a regression in analysis time caused by
19c0c80. Since
`BazelModuleResolutionEvent` and `ModuleExtensionResolutionEvent` are
both marked as `storeForReplay`, they are stored in a nested set for
essentially all analysis phase nodes. This resulted in frequent (i.e.,
per target) calls to their `hashCode` methods, which are not cached and
delegated to the likewise uncached methods on large `ImmutableMap` and
`ImmutableTable` instances.

Since there is no need for these events to be deduplicated, switching to
reference equality resolves this issue.

The following analysis phase measurements for a synthetic project with a
single "hub and spokes" module extension and 2,000 repos illustrate the
effect:

* without lockfile: 4.3s
* with lockfile before this commit: 8.3s
* with lockfile after this commit: 4.2s

Fixes #19952

Closes #19970.

Commit
435d1c2

PiperOrigin-RevId: 578734010
Change-Id: I870867c5c509389632793b0d5fe5c43ddc6176f3

Co-authored-by: Fabian Meumertzheim <fabian@meumertzhe.im>
@iancha1992
Copy link
Member

A fix for this issue has been included in Bazel 7.0.0 RC5. Please test out the release candidate and report any issues as soon as possible. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-Bzlmod Bzlmod-specific PRs, issues, and feature requests P1 I'll work on this now. (Assignee required) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants