Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: provisional parallelization support #1202

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

novas0x2a
Copy link
Contributor

@novas0x2a novas0x2a commented May 16, 2024

This is a proof of concept for one potential (limited) version of parallelization which doesn't require complex implementation and doesn't put the number of CPUS into the cache key (and break caching across machines, the way using --action_env=CMAKE_BUILD_PARALLEL_LEVEL). This is not fully implemented in this MR, known things that are missing:

  1. Windows and FreeBSD don't have impls because I'm not currently sure what to put there.
  2. The ability to enable and disable this on a per-target basis is not currently implemented and is probably necessary (since otherwise, this would be a bit of a compat break, and it's not uncommon for projects to break under parallelization). I'm open to suggestions on the best way to plumb this.

This might eventually implement a simplified form of #329

@jsharpe I know you did some similar work on this in #848; do you think an approach like this would be mergeable (if we figured out good ways of doing the TODOs above?)

Copy link
Member

@jsharpe jsharpe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main challenge I see is that all current approaches to parallelism over provision the resources that bazel has assigned to the action; this is often fine in single user systems but when you are on a multi-tenant system or RBE then this is not being a good citizen of the resources.
I think if we were to land something this it would have to be opt-in via a config_setting to turn it on, and then overridable on a per target basis.
I'm also not that thrilled at the idea of adding to the shell toolchain; longer term the shell toolchain should probably be replaced by a cross-platform wrapper tool - TBH I think I prefer the approach of placing this logic in tool wrappers rather than centrally - this means that any consumers of e.g. make would be able to get the parallel invocation semantics without needing support at the callsite; this also solves the opt-in problem as it just becomes a toolchain selection and so then there is no extra attributes or config settings required.
Incidently ninja defaults to parallel builds, we likely should be explicitly wrapping that and limiting it to 1 thread but defaulting CMake builds to the Ninja generator is a quick hack to get parallel builds.

Looking at bazelbuild/bazel#10443 though -i think resource_sets are probably the correct way to pass parallelism info to bazel but only allows a fixed size, and not a dynamically allocated value based on how much resource is currently available.

def enable_parallel_build():
return """
_cpu_count=$(nproc)
export MAKEFLAGS="-j$_cpu_count${MAKEFLAGS:+ }${MAKEFLAGS:-}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also suggest setting -l$_cpu_count here just to be a good citizen to the machine as bazel has only assigned a single core to the action in its scheduling and so limiting the parallelism under load is a useful thing to do.
I'm not familiar enough with CMake's parallelism config to know if there's something equivalent there?

@@ -38,6 +38,13 @@ def enable_tracing():
def disable_tracing():
return "set +x"

def enable_parallel_build():
return """
_cpu_count=$(nproc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a slightly naiive way of getting a cpu account as nproc is not cgroup aware AFAIK.
When running under RBE its likely that you will be constrained in a cgroup and so this added parallelism there will likely actually result in slower builds in those environments. I've yet to see a good way of interrogating cgroups to work out how many processes you actually want to spawn in a reliable way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants