API for Skylark rules and genrules to declare ResourceSets #6477
Comments
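I think this makes a lot of sense. (The need for this came up as well during the Bazel hackathon but I cannot remember the details right now.) I don't like the first proposal of using a tuple: unnamed parameters are very hard to read. I prefer either a dictionary with named quantities, the separate attributes, or maybe better, a higher-level data type to represent the resource set with named fields (a la collections.namedtuple). I'd also try to avoid floats even if the backing ResourceSet supports them (because I think we should get rid of them). For CPU reservations, floats are very hard to deal with, and for I/O, we can just represent them as an integer in the [0,100] range. |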
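To make the named-fields suggestion concrete, here is a minimal Python sketch. Starlark has no collections.namedtuple, so this only illustrates the shape of the data, not a Bazel API. The name ResourceRequest, its fields cpus, ram_mb and io_percent, and the default values are all hypothetical.

```python
from collections import namedtuple

# Hypothetical resource request with named, integer-only fields.
ResourceRequest = namedtuple("ResourceRequest", ["cpus", "ram_mb", "io_percent"])


def make_request(cpus=1, ram_mb=300, io_percent=0):
    """Validate and build a request; rejects floats and out-of-range I/O."""
    for name, value in (("cpus", cpus), ("ram_mb", ram_mb), ("io_percent", io_percent)):
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an int, got {type(value).__name__}")
        if value < 0:
            raise ValueError(f"{name} must be non-negative")
    if io_percent > 100:
        raise ValueError("io_percent must be in the range [0, 100]")
    return ResourceRequest(cpus, ram_mb, io_percent)


heavy_emitter = make_request(cpus=4, ram_mb=2048, io_percent=30)
print(heavy_emitter.cpus, heavy_emitter.ram_mb, heavy_emitter.io_percent)
```

Compared with a bare tuple such as (4, 2048, 30), each quantity is named at both the definition and the use site, and a fractional value like cpus=1.5 is rejected up front.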
Should this plug into the same (or a similar) mechanism as the one for configuring
“requirements” for remote workers? It sounds related to configurability, but I
might be wrong (cc @katre)
…On Wed, 24 Oct 2018 at 4:44 Julio Merino ***@***.***> wrote:
I think this makes a lot of sense. (The need for this came up as well
during the Bazel hackathon but I cannot remember the details right now.)
I don't like the first proposal of using a tuple: unnamed parameters are
very hard to read. I prefer either a dictionary with named quantities, the
separate attributes, or maybe better, a higher-level data type to represent
the resource set with named fields (a la collections.namedtuple).
I'd also try to avoid floats even if the backing ResourceSet supports
them (because I think we should get rid of them). For CPU reservations,
floats are very hard to deal with, and for I/O, we can just represent them
as an integer in the [0,100] range.
|
We already have a mechanism to configure cpu, but not one to configure ram. I don't think adding a mechanism to configure I/O makes any sense, given that the use of the "IO" resource in Bazel is inconsistent and more to address one-off restriction cases than as any sort of general I/O model.
Basically, what I/O is is completely undefined and people have been abusing the setting to implement random restrictions. I think those are basically all wrong because they inherently don't scale. Bazel assumes a machine has "1.0 I/O", so a setting of 0.3 means "don't run more than 3 of these at the same time", regardless of whether I have an uber-powerful 52 core workstation with x-point high throughput memory in a raid-0 configuration for disk or a puny 2 core laptop with a spinning platter.
The way to configure cpu is to set tags = ["cpu:3"]. It's possible that this isn't properly processed everywhere, but that would be easy to fix. Arguably, using tags isn't ideal because it's not typesafe. However, adding this to all sorts of rules and actions as custom type-safe fields isn't ideal either, and I think this solution is a reasonable compromise. Certainly, if someone were to add "mem:" or similar, I would insist that the user must provide a unit at the end, e.g., "mem:5g" or "mem:3kb". |
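As a rough illustration of the two points in the comment above, the Python sketch below parses resource tags and works through the fixed I/O budget. Only the "cpu:3" tag form is described as existing. The "mem:" form, the unit table, and both function names are hypothetical. The I/O budget is written as an integer percentage (30 of 100) instead of 0.3 of 1.0 to sidestep float rounding.

```python
import re

# Hypothetical unit table for a "mem:<n><unit>" tag (binary multiples assumed).
_MEM_UNITS = {"kb": 1024, "mb": 1024**2, "gb": 1024**3,
              "k": 1024, "m": 1024**2, "g": 1024**3}
_MEM_RE = re.compile(r"^mem:(\d+)([a-z]+)$")
_CPU_RE = re.compile(r"^cpu:(\d+)$")


def parse_resource_tags(tags):
    """Return a dict with "cpu" and/or "mem_bytes" for the resource tags found."""
    result = {}
    for tag in tags:
        cpu = _CPU_RE.match(tag)
        if cpu:
            result["cpu"] = int(cpu.group(1))
            continue
        if tag.startswith("mem:"):
            mem = _MEM_RE.match(tag)
            # Insist on a unit, as the comment proposes: "mem:5" is an error.
            if not mem or mem.group(2) not in _MEM_UNITS:
                raise ValueError(f"memory tag needs a known unit: {tag!r}")
            result["mem_bytes"] = int(mem.group(1)) * _MEM_UNITS[mem.group(2)]
    return result


def max_concurrent(io_per_action_percent, machine_io_percent=100):
    """How many such actions fit into the fixed per-machine I/O budget."""
    return machine_io_percent // io_per_action_percent


print(parse_resource_tags(["cpu:3", "mem:5g", "manual"]))
print(max_concurrent(30))  # 3, on a workstation and a laptop alike
```

The second function shows why the setting does not scale: the result depends only on the declared share, never on how capable the machine's storage actually is.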
Please do not assign issues to more than one team. |
@beasleyr-vmw I think this is great, but is it also possible to specify cpu and ram in a dynamic way, similar to how |
Note that the To do some local testing, I hacked in |
We have a big project with heavy compilations and heavy tests. Tests also require some resources which are not CPU or RAM (for example, GPU memory or special CPU mode).
So I guess we need two things:
As for the latter, I guess it may look something like this in rules:
As for new
I think we can contribute adjustable memory estimations for now. A mechanism of describing custom resources looks much bigger. |
I haven't looked too deeply into this comment thread, but the new Action Groups proposal might be of interest to you, specifically the bit on exec_properties. |
Were you setting it to mimic what was being passed in for |
@juliexxia BTW is it a good idea to do that? |
Ah, you are correct. Apologies for not understanding the needs of this post more deeply earlier. |
Asking for the underinformed, but why is it a problem if the local |
From the discussion so far it sounds like having |
I'm in favor of making IIRC, @katre said in another thread that |
I think I wasn't clear in my previous emails with @ulfjack: However, Bazel does not currently set any meanings for the keys in We could have the local execution strategies use some of these keys as well, but this raises the problem that Google's internal remote execution uses a completely different set of keys. We'd need to figure out a way to parameterize those keys to work with the existing internal/external remote execution systems. This makes things a bit harder for those of us working with both systems, and I'm sure is frustrating to people only using Bazel, but it does need to be considered for us. |
As far as I can see, we can go further in one of two ways:
|
@benjaminp and @wilwell thanks! That looks like a fantastic step in the right direction. It looks like it will be very helpful for allowing rule authors to determine the resources needed for a specific action. For generic rulesets, it does seem like it might be a bit more difficult to adopt. If I understand the constraints correctly, the rule itself must reference a global Starlark function that is callable within the rule at the time of declaring the run action. This makes it difficult for the consumer of a rule to determine what resources should be reserved for a particular action. For example, It does make sense why you could not allow lambdas or nested functions. Would it be feasible to allow rule authors to specify the resource_set dictionary directly, as opposed to a function that defines it? That would make it feasible for these generic rules to accept some form of attrs so the consumers can specify their constraints. |
+1 for @joeljeske's ask above. Another scenario is for rules_foreign_cc that invokes |
+1 for @joeljeske's ask above. Why pass a callable when the callable cannot have a full overview of the action? Why not just let the rule author pass the resource set dict in directly, or even better, a And using a callable is a weird design choice; Bazel is not designed around higher-order functions by any means... Also, what does the resource set mean? Does it impose a hard limit, or is it just a hint to the scheduler? I may still want to oversubscribe the CPU: not everything can be embarrassingly parallel, there is always something that must be serial and takes time, and sometimes you cannot easily break it into multiple |
Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team ( |
@bazelbuild/triage not stale |
Description of the problem / feature request:
My project uses multithreaded code emitters written in Java. They may eat CPU and RAM, and when running concurrently may overwhelm the build host. It'd be great if I could somehow signal to Bazel's scheduler that these processes are heavier so that it could adjust accordingly.
Feature requests: what underlying problem are you trying to solve with this feature?
When porting a project from Maven to Bazel, I observe longer build times with Bazel. Looking at the critical path, I see that a few actions, most notably one of the code emitters, take significantly longer under Bazel (23s) than when run from the command line (15s) or Maven.
If I understand the code correctly (GenRuleAction.java), Bazel treats all genrules the same. They're assumed to consume ~300M of RAM and a single CPU. My tasks, however, typically occupy 2-4 CPUs, so I'd like Bazel to take this into account when scheduling actions running on my build host.
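The effect described above can be illustrated with a deliberately simplified model, which is not Bazel's actual scheduler. It assumes the scheduler admits actions purely by dividing host CPUs by the assumed per-action CPU cost. The 8-CPU host and both function names are hypothetical.

```python
def concurrent_actions(host_cpus, assumed_cpus_per_action):
    """Actions a CPU-based scheduler admits at once, given its assumption."""
    return max(1, host_cpus // assumed_cpus_per_action)


def cpu_demand(host_cpus, assumed_cpus_per_action, actual_cpus_per_action):
    """Total CPUs the admitted actions really try to use."""
    admitted = concurrent_actions(host_cpus, assumed_cpus_per_action)
    return admitted * actual_cpus_per_action


HOST_CPUS = 8  # hypothetical build host

# The scheduler believes each genrule uses 1 CPU, but each really uses 4.
print(cpu_demand(HOST_CPUS, assumed_cpus_per_action=1, actual_cpus_per_action=4))
# The scheduler is told the real cost (4 CPUs each), so only 2 run at once.
print(cpu_demand(HOST_CPUS, assumed_cpus_per_action=4, actual_cpus_per_action=4))
```

Under this model the first case asks for 32 CPUs on an 8-CPU host, which is one plausible reason a multithreaded emitter would run slower inside the build than on its own.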
I'm still too new to Bazel to suggest a concrete API, but maybe something like
would be feasible? (Ditto for plain rules, too!)
What operating system are you running Bazel on?
Linux (CentOS 6.6)
What's the output of bazel info release?
(This is 0.18.0rc8.)
If bazel info release returns "development version" or "(@non-git)", tell us how you built Bazel.
n/a. This is just a feature request.
What's the output of git remote get-url origin ; git rev-parse master ; git rev-parse HEAD?
n/a. This is just a feature request.
Have you found anything relevant by searching the web?
#988 touches on a similar topic.
Any other information, logs, or outputs that you want to share?