Cooperative Parallelism #10443

kastiglione · 2019-12-18T23:48:53Z

Description of the problem / feature request:

This is an umbrella issue of problems that arise from using build tools that have their own internal parallelism.

In this Google Groups thread, @jmmv asked to file an issue about this:

https://groups.google.com/d/msg/bazel-discuss/_oHaU50P5Rg/imx5Y49MAwAJ

A little context: swiftc is the Swift compiler driver. It's a non-traditional compiler: it doesn't build one source file at a time, it builds one module of N source files at a time. swiftc spawns "swift frontend" invocations, and the number of spawned processes is very often greater than 1.

There are two related problems:

1. Tools that perform parallel sub-actions cannot express this use of parallelism to Bazel.
2. Actions have no API through which they can specify maximum parallelism.

In the first case, it would be good if the action API could express to Bazel how much parallelism is used by an action. This would avoid the problem of N Bazel actions each running M sub-actions, for N × M processes in total.

In the second case, it would be good if the action API could express a range of parallelism an action is capable of using. This would really help the performance of bottleneck actions in the critical path. For example, Bazel could see that it's not using its full amount of jobs, and donate the extra parallelism to the bottleneck action. We see this as particularly useful at the tail end of builds, where there are fewer targets left to build. This problem shows up even more in incremental builds, where the action graph is often much more flat, even linear.
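As a rough illustration of the "range of parallelism" idea, here is a minimal sketch of how a scheduler might first give each runnable action its minimum and then donate spare jobs to actions that can use more. The `allocate` function and the `(min, max)` tuples are hypothetical; nothing like this exists in Bazel's action API today.

```python
def allocate(total_jobs, ranges):
    """Grant threads to runnable actions.

    ranges: list of (min_threads, max_threads), in priority order.
    Returns one grant per action; 0 means the action has to wait.
    """
    grants = []
    remaining = total_jobs
    # First pass: give each action its minimum if it still fits.
    for lo, _hi in ranges:
        if lo <= remaining:
            grants.append(lo)
            remaining -= lo
        else:
            grants.append(0)
    # Second pass: donate leftover jobs to started actions, up to their maximum.
    for i, (_lo, hi) in enumerate(ranges):
        if grants[i] == 0:
            continue
        extra = min(hi - grants[i], remaining)
        grants[i] += extra
        remaining -= extra
    return grants
```

With 8 jobs and a single bottleneck action declaring a range of 1 to 8, `allocate(8, [(1, 8)])` grants all 8 jobs to it, which is the tail-of-build case described above.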

As @allevato pointed out in the google groups thread, this would require some way for actions to pass args that are known not to affect output, such as a -j<N> flag. This would also need to preserve the cache keys.

Feature requests: what underlying problem are you trying to solve with this feature?

This feature allows us to avoid two current problems:

1. Oversubscribing the CPU during the build.
2. Slower-than-necessary builds caused by wasted parallelism.

The first issue can happen with any swift module over 25 files. The default batching logic creates one swift frontend for each group of 25 files. A swift module with 100 files will spawn 4 sub-actions, unbeknownst to Bazel.
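The arithmetic behind that example, as a small sketch. The batch size of 25 is the figure quoted above; the function names are made up for illustration, and the worst case assumes every Bazel job happens to be compiling such a module at the same moment.

```python
import math

# Batch size quoted in this issue: one swift frontend per group of 25 files.
BATCH_SIZE = 25

def frontend_count(num_files, batch_size=BATCH_SIZE):
    """Number of frontend processes spawned for one module (ceiling division)."""
    if num_files <= 0:
        return 0
    return math.ceil(num_files / batch_size)

def worst_case_processes(bazel_jobs, files_per_module):
    """Processes running if every Bazel job compiles such a module at once."""
    return bazel_jobs * frontend_count(files_per_module)
```

For example, `frontend_count(100)` is 4, so a build running 8 such actions at once could have 32 frontend processes competing for the CPU while Bazel accounts for only 8.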

As mentioned, the second case is something that causes slowdowns for incremental development builds.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

If needed, I can make a rules_swift project that demonstrates the issue.

We see the problem in our build by looking at --experimental_generate_json_trace_profile and by comparing to Xcode's builds, which can sometimes be faster due to its seemingly hard-coded use of -j8.

What operating system are you running Bazel on?

macOS

What's the output of `bazel info release`?

release 1.2.0

Have you found anything relevant by searching the web?

As mentioned above, a small amount of discussion happened on Google Groups:

https://groups.google.com/d/msg/bazel-discuss/_oHaU50P5Rg/imx5Y49MAwAJ

I've also posted a general (non-bazel) question to the Swift Forums.

https://forums.swift.org/t/globally-optimized-build-parallelism/31802/2

rupertks · 2019-12-20T21:44:26Z

Thanks for creating this issue!

Since @jmmv specifically asked for it to be created, I am removing the untriaged label and giving it an initial P2 priority.

jmmv · 2020-05-14T15:14:19Z

Just one more addition since I just closed #11275: if we do this and explicitly tell an action that it should use X threads, we also have to go the other way and ensure the action doesn't use more than X threads (1 in the general case!) when told so.

mzeren-vmw · 2020-10-08T13:58:32Z

> we also have to go the other way and ensure the action doesn't use more than X threads (1 in the general case!) when told so.

I don't see this as a requirement, at least not until someone provides a use case. We have a local patch that lets us set concurrency, and we have cases where an action may peak briefly at 4 threads but empirically has a steady state of 2 threads, for example. A multi-threaded process does not usually consume cores in 100% core increments.

motiejus · 2020-12-28T20:14:06Z

Consider xz -T0, which uses all available cores. Assume:

genrule(
    name = name+"-xz",
    srcs = [name],
    outs = [name + ".xz"],
    cmd = "xz -T0 -f $< > $@",
)

I would like to tell Bazel that "this rule uses N-1 workers, where N is the number of available cores"; I would still like to leave 1-2 slots, in case they are IO bound.

pauldraper · 2021-11-11T14:45:00Z

Note for comparison: GNU make supports this via its "jobserver": https://www.gnu.org/software/make/manual/html_node/Job-Slots.html
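For readers unfamiliar with the jobserver, here is a minimal single-process sketch of its token protocol on a POSIX pipe: the pipe is preloaded with one byte per spare slot, a job reads a byte before starting extra work, and writes the same byte back when done. The `TokenPool` class is purely illustrative; it is not make's implementation and not anything Bazel provides.

```python
import os

class TokenPool:
    """Pipe-backed pool of job tokens, in the style of GNU make's jobserver."""

    def __init__(self, slots):
        # Every process implicitly owns one slot, so only slots - 1 tokens
        # are placed in the pipe.
        self.read_fd, self.write_fd = os.pipe()
        os.write(self.write_fd, b"+" * (slots - 1))

    def acquire(self):
        """Block until a token is available, then take it."""
        return os.read(self.read_fd, 1)

    def release(self, token):
        """Return a token so another job can start."""
        os.write(self.write_fd, token)

    def close(self):
        os.close(self.read_fd)
        os.close(self.write_fd)
```

In the real protocol the two file descriptors (or a named pipe) are shared with child processes, so every cooperating tool draws from the same pool and total parallelism stays within the configured limit.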

brentleyjones · 2021-11-16T21:32:03Z

xcodebuild is using the swift driver library to accomplish this: https://twitter.com/BenchR/status/1460699068846456832

matts1 · 2022-12-06T02:24:52Z

GNU make recently added support for jobservers via named pipes (previously they had to be passed around via file descriptors).

Could we have Bazel create named pipes for its own jobserver, and provide some kind of mechanism to pass those pipes to actions (variable substitution?)? Someone made a workaround which literally just has a jobserver service running in the background for rules_foreign_cc's make, but it'd be nice if we could have this run with fully self-contained builds.

larsrc-google · 2022-12-07T08:20:28Z

One thing you can do now is provide better estimates of the number of CPUs your jobs will use. @wilwell submitted d7f0724, which allows specifying the expected amount of CPU/RAM depending on the number of inputs. And I have work in progress to use cgroups for sandboxes, which would allow more flexible limits. Neither of those is as powerful as negotiating with the processes, but I want to see a strong need for that power before complicating matters even more.

lukokr-aarch64 · 2023-12-12T14:34:24Z

We have a very compute-heavy build action that would like to use as many cores as it can.

We have no option to break it down into smaller chunks.

Having cooperative parallelism would go a long way for us, as right now there is no obvious way to obtain the number of jobs that Bazel has access to within a rule context such that we can return it via resource_set.

We could use a workaround with a repository rule calling out to nproc, but that would not account for Bazel's --jobs=N flag.

In the short term it would be useful if the resource_set callable received the upper limits on the resources. Something like:

# This breaks the current API but it demonstrates what we would like to be able to do. 
def _resources(default, limit, platform):
    default["cpu"] = _jobs(max = limit["cpu"], min = 1, diff = -2)
    return default

The idea for us is on a machine with 56 cores we could reserve some cores for other smaller actions to trickle through.

Alternatively, a tag could mark all actions of a certain type as exclusive, similar to how tests can be marked with the exclusive tag.

fmeum · 2023-12-12T23:13:35Z

With resource_set, you can approximate having an action run on min(N, $(nproc)) cores for a constant N due to how scheduling works in Bazel: If a resource isn't in use (i.e., all remaining actions are waiting for the big one), an action is executed even if it requests resources in excess of the available total.

This doesn't allow any kind of $(nproc) - 2 logic though, which would be quite convenient. But I think that this would require special support in ctx.actions.args so that the action's command line can indicate the available level of parallelism. Just making the resource API more flexible wouldn't be sufficient.
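A sketch of the constant-N approximation. The callback follows the documented resource_set shape (OS name and number of inputs in, dictionary of estimates out); the constant 16 is an arbitrary assumed value, and `can_start` is only a toy model of the scheduling behavior described in this comment, not Bazel's actual code.

```python
# Assumed value: the number of threads the tool is configured to use.
BIG_ACTION_CPUS = 16

def big_action_resources(os_name, num_inputs):
    """resource_set-style callback: returns the resources the action requests."""
    return {"cpu": BIG_ACTION_CPUS}

def can_start(requested_cpus, cpus_in_use, total_cpus):
    """Toy model of the described scheduling rule.

    An action starts if its request fits, or if nothing else is using the
    resource, even when the request exceeds the machine total.
    """
    return cpus_in_use == 0 or cpus_in_use + requested_cpus <= total_cpus
```

In a rule, the callback would be passed as `resource_set = big_action_resources` to `ctx.actions.run`. On an 8-core machine the model gives `can_start(16, 0, 8) == True` once everything else has drained, which is what makes the request behave like min(N, nproc) in practice.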

blackliner · 2024-12-14T01:19:57Z

rules_rust has a similar challenge: bazelbuild/rules_rust#3101

fmeum · 2024-12-14T08:43:10Z

Bazel gained a --experimental_cpu_load_scheduling flag that schedules actions based on continuously measured actual (not modeled) CPU load. That could be interesting for rulesets to try out.

jin added team-Local-Exec Issues and PRs for the Execution (Local) team untriaged labels Dec 20, 2019

rupertks added P2 We'll consider working on this in future. (Assignee optional) and removed untriaged labels Dec 20, 2019

kastiglione mentioned this issue Feb 3, 2020

Use multiprocessing to speed up bundletool bazelbuild/rules_apple#698

Closed

jmmv added the type: feature request label May 13, 2020

jmmv mentioned this issue May 14, 2020

Limit CPU usage of actions (or all sandboxed operations) #11275

Closed

keith mentioned this issue Aug 11, 2020

Build times and swiftc parallelism bazelbuild/rules_swift#258

Closed

HackAttack mentioned this issue Sep 26, 2020

Parallel build support bazel-contrib/rules_foreign_cc#329

Open

jmmv mentioned this issue Oct 8, 2020

Allow to control the degree of parallelism for specific actions #12143

Closed

phlax mentioned this issue Jun 7, 2022

[WIP] protodoc: Don't use aspect envoyproxy/envoy#21579

Closed

jwnimmer-tri mentioned this issue Jan 18, 2023

kcov fails ("status: 0x85") reproducibly on some binaries (tracker for kcov bug 339) RobotLocomotion/drake#17978

Closed

fmorency mentioned this issue Feb 6, 2023

tags and build_script_tags crate annotations bazelbuild/rules_rust#1821

Open

benradf mentioned this issue Aug 4, 2023

Distributed build for nixpkgs itself. tweag/rules_nixpkgs#208

Closed

jin mentioned this issue Nov 29, 2023

Consistent resource management across actions and resource types #19679

Open

benradf mentioned this issue Mar 6, 2024

Gracefuly handle available CPU cores tweag/rules_nixpkgs#482

Open

jsharpe mentioned this issue May 17, 2024

WIP: provisional parallelization support bazel-contrib/rules_foreign_cc#1202

Draft
