
API for Skylark rules and genrules to declare ResourceSets #6477

Open
ghost opened this issue Oct 23, 2018 · 23 comments
Labels
not stale Issues or PRs that are inactive but not considered stale P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Local-Exec Issues and PRs for the Execution (Local) team type: feature request

Comments

@ghost

ghost commented Oct 23, 2018

Description of the problem / feature request:

My project uses multithreaded code emitters written in Java. They may eat CPU and RAM, and when running concurrently may overwhelm the build host. It'd be great if I could somehow signal to Bazel's scheduler that these processes are heavier so that it could adjust accordingly.

Feature requests: what underlying problem are you trying to solve with this feature?

When porting a project from Maven to Bazel, I observe longer build times with Bazel. Looking at the critical path, I see that a few actions (most notably one of the code emitters) take significantly longer under Bazel (23s) than when run from the command line (15s) or Maven.

If I understand the code correctly (GenRuleAction.java), Bazel treats all genrules the same. They're assumed to consume ~300M of RAM and a single CPU. My tasks, however, typically occupy 2-4 CPUs, so I'd like Bazel to take this into account when scheduling actions running on my build host.

I'm still too new to Bazel to suggest a concrete API, but maybe something like

genrule(
    ...,
    resources=(512, 3.0, 0.0),  # Corresponds to ResourceSet: 512M, 3 CPUs, 0/default for I/O
)

# or

genrule(
    ...,
    resources_ram=512,
    resources_cpu=3.0,
)    

would be feasible? (Ditto for plain rules, too!)

What operating system are you running Bazel on?

Linux (CentOS 6.6)

What's the output of bazel info release?

$ bazel info release
release 0.16.1- (@non-git)

(This is 0.18.0rc8.)

If bazel info release returns "development version" or "(@non-git)", tell us how you built Bazel.

n/a. This is just a feature request.

What's the output of git remote get-url origin ; git rev-parse master ; git rev-parse HEAD ?

n/a. This is just a feature request.

Have you found anything relevant by searching the web?

#988 touches on a similar topic.

Any other information, logs, or outputs that you want to share?

@jin
Member

jin commented Oct 23, 2018

cc @ulfjack @jmmv

@jmmv
Contributor

jmmv commented Oct 24, 2018

I think this makes a lot of sense. (The need for this came up as well during the Bazel hackathon but I cannot remember the details right now.)

I don't like the first proposal of using a tuple: unnamed parameters are very hard to read. I prefer either a dictionary with named quantities, the separate attributes, or maybe better, a higher-level data type to represent the resource set with named fields (a la collections.namedtuple).

I'd also try to avoid floats even if the backing ResourceSet supports them (because I think we should get rid of them). For CPU reservations, floats are very hard to deal with, and for I/O, we can just represent them as an integer in the [0,100] range.

@ittaiz
Member

ittaiz commented Oct 24, 2018 via email

@ulfjack
Contributor

ulfjack commented Oct 24, 2018

We already have a mechanism to configure cpu, but not one to configure ram. I don't think adding a mechanism to configure I/O makes any sense, given that the use of the "IO" resource in Bazel is inconsistent and serves more to address one-off restriction cases than as any sort of general I/O model. Basically, what "I/O" means is completely undefined, and people have been abusing the setting to implement random restrictions. I think those are basically all wrong because they inherently don't scale. Bazel assumes a machine has "1.0 I/O", so a setting of 0.3 means "don't run more than 3 of these at the same time", regardless of whether I have an uber-powerful 52-core workstation with high-throughput memory and disks in a RAID-0 configuration, or a puny 2-core laptop with a spinning platter.

The way to configure cpu is to set tags = ["cpu:3"]. It's possible that this isn't properly processed everywhere, but that would be easy to fix. Arguably, using tags isn't ideal because it's not typesafe. However, adding this to all sorts of rules and actions as custom type-safe fields isn't ideal either, and I think this solution is a reasonable compromise. Certainly, if someone were to add "mem:" or similar, I would insist that the user must provide a unit at the end, e.g., "mem:5g" or "mem:3kb".
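A minimal BUILD sketch of the tag mechanism described above; target, file, and tool names are illustrative, and the `cpu:3` tag is only a hint for local scheduling:

```starlark
# BUILD sketch (illustrative names): reserve ~3 CPUs for a heavy genrule.
genrule(
    name = "heavy_codegen",
    srcs = ["spec.json"],
    outs = ["generated.cc"],
    cmd = "$(location :emitter) $(SRCS) > $@",
    tools = [":emitter"],
    tags = ["cpu:3"],  # tell the local scheduler this action occupies ~3 CPUs
)
```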

@meisterT meisterT added P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Execution and removed untriaged labels Nov 29, 2018
@jin jin added team-Local-Exec Issues and PRs for the Execution (Local) team and removed team-Execution labels Jan 14, 2019
@dslomov dslomov removed the team-Performance Issues for Performance teams label Feb 15, 2019
@dslomov
Contributor

dslomov commented Feb 15, 2019

Please do not assign issues to more than one team.

@Globegitter

Globegitter commented Jul 3, 2019

@beasleyr-vmw I think this is great, but is it also possible to specify cpu and ram in a dynamic way, similar to how local_ram_resources and local_cpu_resources can be set dynamically? I.e. something like HOST_CPUS-1 etc. I'm used to configuring multi-threaded apps based on the number of available host CPUs rather than a fully static amount.
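For reference, the host-relative expressions mentioned here are what `--local_cpu_resources` and `--local_ram_resources` already accept for the machine-wide limits, e.g. in a .bazelrc (values illustrative):

```
# .bazelrc sketch (illustrative values)
build --local_cpu_resources=HOST_CPUS-1
build --local_ram_resources=HOST_RAM*.67
```

The open question in this thread is exposing the same kind of dynamic expression per target or per action.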

@kastiglione
Contributor

kastiglione commented Dec 30, 2019

Note that the cpu:n feature works only for tests, not other actions. From Ulf's comment above, I thought it was generally available, but it's not.

To do some local testing, I hacked in cpu:n support for Starlark actions, but in my initial testing of swift builds, the total build time was not better for our build (which has many small modules). I'm not sure if I'll continue pursuing it. kastiglione@0a1e78b

@IlyaNiklyaev

We have a big project with heavy compilations and heavy tests. Tests also require some resources which are not CPU or RAM (for example, GPU memory or a special CPU mode).
For RAM-heavy tests we initially set the exclusive tag. I've implemented a ram:<int> tag just like cpu:<int>, and it works like a charm. But there are some issues:

  • this option isn't flexible; I'd like to be able to multiply RAM requirements for all targets globally (for example, for sanitizer or code coverage runs)
  • this way of setting requirements by tags tends to become more complex if we add special resources (like GPU memory)

So I guess we need two things:

  • adjustable memory estimations for test sizes instead of hardcoded ones (via an option like --test_sizes=100M,500M,1G,5G)
  • some mechanism for describing resources/requirements in a more flexible way

As for the latter, I guess it may look something like this in rules:

cc_test(
  ...
  resources = {
    "cpu": "2",
    "ram": "2G",  # not an ideal solution, but may be useful
    "gpu_ram": "2",
    ...
  },
  ...
)

As for new types of resources like gpu_ram we can use some new workspace rule like this:

resource(
  name = "gpu_ram",
  capacity = 16,
)

I think we can contribute adjustable memory estimations for now. A mechanism of describing custom resources looks much bigger.

@juliexxia
Contributor

juliexxia commented Feb 25, 2020

I haven't looked too deeply into this comment thread, but the new Action Groups proposal might be of interest to you, specifically the bit on exec_properties.

@brentleyjones
Contributor

@kastiglione

To do some local testing, I hacked in cpu:n support for Starlark actions, but in my initial testing of swift builds, the total build time was not better for our build (which has many small modules). I'm not sure if I'll continue pursuing it.

Were you setting it to mimic what was being passed in for -j? I assume that having that value match what swift is using would be best. I also assume that we would want the value to be dynamic, to reflect that some compiles will use fewer jobs.

@IlyaNiklyaev

@juliexxia
Sorry for the late response. Yes, exec_properties would cover the case of "custom resources" like GPU memory (actually, even without the proposal you mentioned), but AFAIK Bazel's local ResourceManager doesn't respect exec_properties. Maybe you know about any plans/proposals of supporting exec_properties in ResourceManager?

BTW is it a good idea to do that?

@juliexxia
Contributor

Ah, you are correct. Apologies for not understanding the needs of this post more deeply earlier. exec_properties is meant to be read by remote execution machinery and isn't supported by the local ResourceManager. Bazel in general doesn't do any real parsing of exec_properties; it just forwards it along. I am not aware of any plans to make ResourceManager respect exec_properties, and agreed, I'm not sure it's a great idea to do that.

@emidln

emidln commented Aug 20, 2020

Asking for the underinformed, but why is it a problem if the local ResourceManager respects certain exec_properties? If this is a needed piece of functionality (and I'd argue that it is; Bazel habitually underestimates my static CppLink actions to the point that I might have to run a fork to make the memory estimate configurable), we'd need some way of specifying this. Is having a local_exec_properties and a remote_exec_properties better than declaring a single exec_properties which is interpreted to the best of a ResourceManager's abilities?

@uri-canva
Contributor

From the discussion so far it sounds like having ResourceManager interpret exec_properties would be exactly what we're looking for. @juliexxia what are the reasons why it might not be a good idea?

@ulfjack
Contributor

ulfjack commented Sep 29, 2020

I'm in favor of making ResourceManager obey exec_properties where appropriate. It seems like a good way to unify local and remote execution, rather than treating them as inherently separate.

IIRC, @katre said in another thread that exec_properties is purely for remote execution; adding him here for visibility.

@katre
Member

katre commented Sep 29, 2020

I think I wasn't clear in my previous emails with @ulfjack: exec_properties isn't only for remote execution, it's for any execution strategy that is interested in the execution platform (and with the Execution Platforms vs Strategies proposal in development, that's all of them).

However, Bazel does not currently set any meanings for the keys in exec_properties: that is up to the execution strategy. Currently the keys used by Bazel's Remote Execution API are defined in the API description, and are only meaningful in the context of that strategy.

We could have the local execution strategies use some of these keys as well, but this raises the problem that Google's internal remote execution uses a completely different set of keys. We'd need to figure out a way to parameterize those keys to work with the existing internal/external remote execution systems. This makes things a bit harder for those of us working with both systems, and I'm sure is frustrating to people only using Bazel, but it does need to be considered for us.

@IlyaNiklyaev

As far as I can see, we can go further in one of two ways:

  1. Design and support a predefined set of properties. In that case local or remote execution systems may support a subset of it, and each property should have a certain defined meaning.
  2. Let users describe resources in BUILD/WORKSPACE/bzl files. These descriptions may, for example, contain executable scripts to interact with hardware or something similar.

@benjaminp
Collaborator

d7f0724

@joeljeske
Contributor

@benjaminp and @wilwell thanks! That looks like a fantastic step in the right direction.

This looks like it will be very helpful for allowing rule authors to determine the resources needed for a specific action.

For generic rulesets, it does seem like it might be a bit more difficult to adopt. If I understand the constraints correctly, the rule itself must reference a global Starlark function that is callable within the rule at the time of declaring the run action. This makes it difficult for the consumer of a rule to determine what resources should be reserved for a particular action.

For example, rules_nodejs has a npm_package_bin generic rule that allows a consumer to execute an arbitrary NPM tool with arbitrary arguments and file inputs. I don't see a good way for npm_package_bin to allow its consumers to specify the resources that would be needed for this rule given the constraints of always requiring a global starlark function.

It does make sense why lambdas or nested functions could not be allowed. Would it be feasible to allow rule authors to specify the resource_set dictionary directly, as opposed to a function that returns it? That would make it feasible for these generic rules to accept some form of attrs so consumers can specify their constraints.
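For readers following along, the callback-based API under discussion looks roughly like this. The shape below (a top-level function receiving the OS name and number of inputs, returning a dict of cpu/memory amounts, passed as `resource_set` to `ctx.actions.run`) matches the documented Starlark parameter; the rule internals, attribute names, and numbers are illustrative:

```starlark
# Sketch of the resource_set callback discussed in this thread. The
# callback must be a top-level function (no lambdas or nested functions);
# it receives the OS name and the number of inputs, and returns a dict of
# resource amounts. Names and numbers here are illustrative.
def _emitter_resources(os, inputs_size):
    return {
        "cpu": 3.0,
        "memory": 512.0 + 10.0 * inputs_size,  # scale RAM with input count
    }

def _impl(ctx):
    out = ctx.actions.declare_file(ctx.label.name + ".out")
    ctx.actions.run(
        executable = ctx.executable._emitter,  # hypothetical tool attribute
        inputs = ctx.files.srcs,
        outputs = [out],
        arguments = [out.path],
        resource_set = _emitter_resources,
    )
    return [DefaultInfo(files = depset([out]))]
```

The top-level-function restriction is exactly what the comment above points out: a generic rule cannot easily build this callback from per-target attrs supplied by its consumers.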

@c-nuro

c-nuro commented Mar 24, 2022

+1 for @joeljeske's ask above.

Another scenario is rules_foreign_cc, which invokes make to compile. It needs the -j parameter to run in parallel explicitly, so the resource set could be determined directly by that parameter, but it's hard to infer from the input of the global function.

@cloudhan

cloudhan commented Mar 30, 2022

+1 for @joeljeske's ask above.
+1 for @c-nuro's case.

Why pass a callable when the callable cannot have a full overview of the action? Why not just let the rule author pass the resource set dict in directly, or even better, a ResourceSetInfo? At the moment actions.run is constructed, the rule author has almost all the information available; the so-called input_size is also available.

And using a callable is a weird design choice; Bazel is not designed around higher-order functions in any sense...

Also, what does the resource set mean? Does it impose a hard limit, or is it just a hint to the scheduler? Sometimes I still want to oversubscribe the CPU: not everything can be embarrassingly parallel, there is always something that must run serially, it takes time, and sometimes you cannot easily break it into multiple actions.run calls.
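On the hard-limit question: as I understand Bazel's local resource accounting, a declared resource set is a reservation counted at scheduling time, not an enforced cap on the process. A toy model of that semantics (an editorial sketch, not Bazel code; all names here are made up):

```python
import threading

# Toy model of reservation-style scheduling: an action only starts once
# its declared CPU share fits in the remaining capacity. Nothing stops
# the spawned process from using more CPU than it declared -- the
# reservation is a scheduling hint, not an enforced cap.
class ResourcePool:
    def __init__(self, total_cpus):
        self.available = total_cpus
        self.cond = threading.Condition()

    def acquire(self, cpus):
        with self.cond:
            while self.available < cpus:
                self.cond.wait()  # block until enough capacity is free
            self.available -= cpus

    def release(self, cpus):
        with self.cond:
            self.available += cpus
            self.cond.notify_all()

pool = ResourcePool(total_cpus=4)
completed = []

def run_action(name, cpus):
    pool.acquire(cpus)
    try:
        completed.append(name)  # stand-in for actually spawning the action
    finally:
        pool.release(cpus)

run_action("codegen", 3)  # fits: 3 <= 4 available
run_action("link", 2)     # runs once codegen has released its share
print(completed)          # ['codegen', 'link']
```

Under this model, oversubscription is still possible: the declared numbers only shape when actions start, not how much CPU they actually consume.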

@github-actions

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

@github-actions github-actions bot added the stale Issues or PRs that are stale (no activity for 30 days) label Jun 13, 2023
@fmeum
Collaborator

fmeum commented Jun 13, 2023

@bazelbuild/triage not stale

@sgowroji sgowroji removed the stale Issues or PRs that are stale (no activity for 30 days) label Jun 13, 2023
@iancha1992 iancha1992 added the not stale Issues or PRs that are inactive but not considered stale label Jun 13, 2023