Proposal: Overhaul command line args for choosing a spawn strategy #4153
Comments
Thanks a lot! I'll read and think through it in the next days. :) |
There is another dimension which you didn't take into account. In particular, we allow action 'tags' to enable or disable specific features. For example, an action marked as 'nosandbox' can disable sandboxing for that action, and an action marked as 'worker' enables persistent workers for that action. My plan was to have one flag to specify strategy order and fallback, and another flag to add or remove tags from actions identified by mnemonic. (And of course, per-strategy flags to configure the strategy.) One difference to your proposal is that my proposal does not unify worker sandboxing and sandboxing. TODO: I need to write down a more complete description. |
Two use-cases I'd like to add here are:
1. Trying to read from a remote cache and falling back to "local remote
executor" (LRE). This serves a few purposes. It enables network isolation
on OS X, more closely aligns one with remote execution and sidesteps
sandboxing issues which don't exist in remote execution.
2. Another use-case which is related to the above is the ability to decide
on remote execution for different filters, in my case based on the test
tags. This would make it possible to run fast unit tests locally with sandboxing
(since they're usually very isolated from the environment) and to run
integration tests in the LRE.
Use case 2 might be a bit disconnected from this issue, but I think use case 1
is definitely related here.
(In reply to Ulf Adams's comment of Thu, 23 Nov 2017, 20:21.)
|
Ulf, you're right that I neglected to mention tags. As far as I know, the
currently available tags could be thought of as constraints on the set of
available strategies for that action (e.g. "workers are allowed" or
"sandboxing is not allowed"). I was picturing it working as follows: Let's
say the command line flags include "--sandbox" or something, meaning to use
sandboxing when possible, but a specific action has "nosandbox," meaning
that rule does not allow sandboxing — then that rule would just run
non-sandboxed.
(As an aside, I didn't know about "nosandbox", because it is not in the
documentation. I'll add an issue asking that it be added to the docs. When
you say "worker" are you referring to the "supports-workers" value with an
action's "execution_requirements"? That's all I know about.)
Regarding unifying of worker sandboxing and non-worker sandboxing, that's
one part of my proposal that I'm a little hesitant about. From a user
perspective that feels like the right thing to do — if a user wants
sandboxing then they probably want it everywhere, or, at a minimum, they
don't want to be surprised to learn that even though they thought they had
turned sandboxing on, it was still off for some of their rules. From an
implementation perspective, I haven't looked at the code but I wouldn't be
surprised if worker sandboxing and non-worker sandboxing are quite
different beasts, so it might feel wrong to put them under the same
umbrella. The thing about avoiding "surprise" feels pretty important to me,
though.
Strategy order and fallback might be a reasonable way to do things. I know
it's a bit unusual for me to write a super-long proposal out of the blue,
especially since it's unlikely that I would be doing the work; feel free to
ignore whatever you want to ignore, but I do hope that the use cases I
spelled out are informative, and that you consider them if you work on a
strategy-order-and-fallback set of command line flags.
My motivations for writing this were that (a) it took us a long time to get
comfortable and familiar with the nuances of strategy selection, and (b) we
keep running into scenarios that are either messy or impossible with the
current set of flags. (Having the remote cache fall back to a sandboxed
build was a very high priority for us, so I got that working in our fork in
a slightly hacky way, but I'd love not to have our own fork.)
Thanks! - Mike
|
One more point about strategies, as food for thought:
I think that for those of you who are the actual developers on Bazel, it's
quite natural to think in terms of strategies. From the perspective of the
Bazel source code, strategies are an important abstraction — they define
what needs to happen in order to get the build artifacts. But from a user's
perspective, I could see a possible world where strategies are not an
essential concept for the user to think about. It's obvious to the user,
for example, that if you can't find something in the cache then you'll have
to run a compiler; it's obvious that sandboxing will involve doing some
setup stuff before running the compiler, and some cleanup after; and so on.
The way I think of how Bazel runs one action is: "It will look in the
remote cache. If it can't find what it needs, it will run a build tool
somewhere. There will be some qualifications on how it runs that build
tool: is it sandboxed, will it use a worker, etc." Strategies don't come
into my mental model (or at least, they didn't until I got much more
familiar with Bazel).
That's why, in a strategy-centered view of the world, it feels a little
unnatural to unify worker sandboxing and non-worker sandboxing: From a
strategy perspective, those are just different things. But viewed a
different way, they are more closely related.
(Follow-up to Mike Morearty's comment of Thu, Nov 23, 2017, 11:43 PM.)
|
big +1 |
Ha! I disagree that it's natural to think in terms of strategies. We've actually had a lot of problems in the past, and I've independently come to the conclusion that we'll need to redo the way this is currently done. In general, I agree with the points you're making, but I wanted to point out that the proposal is incomplete, and there are a few additional use cases we need to cover. |
- remove BaseSpawn.Local; instead, all callers pass in the full set of execution requirements they want to set
- disable caching and sandboxing for the symlink tree action - it does not declare outputs, so it can't be cached or sandboxed (fixes #4041)
- centralize the existing execution requirements in the ExecutionRequirements class
- centralize checking for execution requirements in the Spawn class (it's possible that we may need a more decentralized, extensible design in the future, but for now having them in a single place is simple and effective)
- update the documentation
- forward the relevant tags to execution requirements in TargetUtils (progress on #3960)
- this also contributes to #4153
PiperOrigin-RevId: 177288598
cc @buchgr @ishikhman with https://blog.bazel.build/2019/06/19/list-strategy.html and other work on strategy selection recently, is this issue resolved? |
Yes, I think so. It seems that it's been covered by #7480. Closing the issue for now. |
This is a proposal for a pretty big change to the command line arguments that are used to choose a build strategy. Please consider it to be a design doc of sorts. It's a strawman; probably needs improvement. It's based on our experiences at Asana, and a number of complex scenarios we have run into.
Introduction
There are currently a number of command line arguments that are used to choose which spawn strategy (e.g. standalone, sandboxed, worker, remote) is used for a given action. Each of these command line arguments serves a specific need; but taken as a whole, they (a) are somewhat difficult to learn, and (b) make it difficult to express some commonly desired workflows.
Here are some (all?) of the current flags:
- --spawn_strategy=...: specify the overall default strategy
- --genrule_strategy=...: specify the strategy to use for genrules, if different from spawn_strategy
- --strategy=<mnemonic>=...: specify an override, the strategy to use with a specific mnemonic
- --worker_sandboxing: doesn't actually choose a spawn strategy, but it's related; it turns on sandboxing for persistent workers
Also, currently the strategy names local and standalone are synonyms. Apparently local is the old, deprecated name, and standalone is the preferred name meaning "a regular, local, non-sandboxed, non-worker build." In this document, if I use the word "local", I mean "any action that is running on the same machine that Bazel is running on." In other words, "local" is the opposite of "remote"; but in my terminology, a local build might be standalone or sandboxed or worker.
Some current problems
- If you want to use a remote cache, and fall back to a local build (--spawn_strategy=remote --remote_rest_cache=...), the local build will always be standalone. There is no way to tell Bazel to fall back to worker or sandboxed.
- If you specify worker but, for whatever reason, the action can't be run in worker, it always falls back to standalone. No way to specify sandboxed.
- Worker sandboxing (--worker_sandboxing) is separate from other sandboxing (--spawn_strategy=sandboxed). This could lead to lack of clarity on the part of the developer: They might think they have turned on safe, fully sandboxed builds with --spawn_strategy=sandboxed, when in fact anything they run in a worker isn't actually safe.
- In more complex scenarios, switching between sandboxed and standalone can require numerous changes to the command line. For example, as mentioned above, if you are using workers then it might require changing both the --spawn_strategy and the --worker_sandboxing arguments. Or, suppose you usually use remote, but you have several specific actions that you don't want to be cached remotely. So you might have a bazelrc file that looks like this:
  […]
  Switching that to use sandboxing will require changing five lines. Putting standalone did not actually express your goal; your goal was actually "do the build locally, somehow; don't use remote."
- Finally, a different kind of problem: Today's command-line flags can take a long time to get comfortable with. For example, the difference between --spawn_strategy and --strategy; questions such as "if an item is not in the cache, what sort of local build does it fall back to, and can I control that" (I learned by reading the source); and so on.
Strategy selection goals
I feel that it would be desirable for the strategy-related command line arguments to have the following characteristics:
- […] --spawn_strategy and --strategy.
It would then be up to Bazel to take this set of guidelines, and resolve it to an appropriate build action.
Examples
A complex scenario might play out like this:
- […] --remote_rest_cache=..., no more args needed). So Bazel queries the remote cache, but does not find the build artifacts.
- […] --remote_local_fallback, on by default). So Bazel needs to decide which local strategy to use.
- […] --workers or something). But this action does not.
- […] --sandboxing). So we do a local, sandboxed build.
- […] --remote_upload_sandboxed_results). So Bazel uploads the build artifacts.
Another scenario, this time with a non-sandboxed local build:
- […] --sandboxing plus --nosandboxing=ThisMnemonic). So, no sandboxing.
- […] standalone build.
- […] --noremote_upload_standalone_results; the default). So Bazel does not upload the build artifacts.
Another scenario, with no remote cache, and with a worker:
- […] --workers or something). This action does support workers. So, great.
- […] --sandboxing). So, we do the build in a sandboxed worker.
Strategy selection algorithm
The strategy selection approach is pretty clearly indicated by the above examples. To spell it out:
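Read together, the three example scenarios suggest roughly the following procedure. This is a minimal Python sketch, not Bazel code: the Flags and Action classes, their field names, and the step strings are all hypothetical, and it deliberately leaves out remote execution, --remote_local_fallback, the upload flags, and --worker_sandboxing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flags:
    """Hypothetical parsed form of the proposed flags (names are illustrative)."""
    remote_cache: bool = False               # --remote_rest_cache=... was given
    workers: bool = False                    # --workers
    sandboxing: bool = False                 # --sandboxing
    no_workers: frozenset = frozenset()      # mnemonics from --noworkers=<mnemonic>
    no_sandboxing: frozenset = frozenset()   # mnemonics from --nosandboxing=<mnemonic>


@dataclass(frozen=True)
class Action:
    mnemonic: str
    supports_workers: bool = False
    allows_sandboxing: bool = True           # False for e.g. a 'nosandbox' action


def select_local(flags: Flags, action: Action) -> str:
    """Pick the kind of local build for one action."""
    use_worker = (flags.workers
                  and action.mnemonic not in flags.no_workers
                  and action.supports_workers)
    use_sandbox = (flags.sandboxing
                   and action.mnemonic not in flags.no_sandboxing
                   and action.allows_sandboxing)
    if use_worker:
        return "sandboxed worker" if use_sandbox else "worker"
    return "sandboxed" if use_sandbox else "standalone"


def plan(flags: Flags, action: Action, cache_hit: bool) -> list:
    """List the steps taken for one action, in order."""
    steps = []
    if flags.remote_cache:
        steps.append("query remote cache")
        if cache_hit:
            return steps + ["use cached outputs"]
    steps.append("local build: " + select_local(flags, action))
    return steps
```

In this sketch a per-mnemonic "no" flag always wins over the global flag, which matches the second scenario (--sandboxing plus --nosandboxing=ThisMnemonic results in no sandboxing).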
Suggested flags
Although I'm happy with the spirit of the following suggestions, I'm not sure if these are the best names for the flags.
Remote execution:
- --remote_executor, with nothing more, would imply that remote execution should be used.
Remote caching:
- --remote_rest_cache=...: Same as today. Implies allowing downloads from the remote cache, and falling back to a local build.
- --[no]remote_upload_sandboxed_results (default true) and --[no]remote_upload_nonsandboxed_results (default false). Note, it says nonsandboxed instead of standalone, because e.g. non-sandboxed worker builds should fit in the remote_upload_nonsandboxed_results category.
- --[no]remote_upload_worker_results (default true). (As an aside, it doesn't make any sense to add a --[no]remote_upload_nonworker_results, because that is implied by the presence of --remote_rest_cache.)
With those defaults for remote-cache uploads, a sandboxed worker build would upload; a non-sandboxed worker build would not.
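One way to read how the upload flags combine is that every flag applying to a given build must allow the upload. The proposal only states the two outcomes above, so the "all applicable flags must agree" rule in this Python sketch is an assumption, and UploadFlags and should_upload are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UploadFlags:
    # Defaults as suggested in the proposal.
    upload_sandboxed: bool = True       # --[no]remote_upload_sandboxed_results
    upload_nonsandboxed: bool = False   # --[no]remote_upload_nonsandboxed_results
    upload_worker: bool = True          # --[no]remote_upload_worker_results


def should_upload(flags: UploadFlags, sandboxed: bool, worker: bool) -> bool:
    """Assumed rule: every flag that applies to this build must allow the upload."""
    sandbox_ok = flags.upload_sandboxed if sandboxed else flags.upload_nonsandboxed
    # Non-worker builds have no worker-specific flag to consult.
    worker_ok = flags.upload_worker if worker else True
    return sandbox_ok and worker_ok
```

With the default flags, should_upload(UploadFlags(), sandboxed=True, worker=True) is true and should_upload(UploadFlags(), sandboxed=False, worker=True) is false, matching the two cases described.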
Sandboxing:
- --[no]sandboxing: This would mean: "For all mnemonics: If we are doing a local build, and if sandboxes are supported, do it in a sandbox." Or with "no" prefix, never use a sandbox. Applies to local worker actions in addition to local non-worker actions.
- --[no]sandboxing=<mnemonic>: Same, but for a specific mnemonic.
(Open issue: Maybe it should be --strategy_sandboxing instead of --sandboxing)
Workers:
- --[no]workers: This would mean: "For all mnemonics: If we are doing a local build, and if persistent workers are supported, do it in a persistent worker." Or with "no" prefix, never use a persistent worker.
- --[no]workers=<mnemonic>: Same, but for a specific mnemonic.
- --[no]worker_sandboxing: Whether to allow worker+sandbox. It's important to default to true (the opposite of today's default) in order to make the model consistent and easy to understand.
(Open issue: Maybe it should be --strategy_workers instead of --workers. But that would not affect --worker_sandboxing, which is in a separate option group.)
No need for strategy-specific "fallback" strategies
One result of all of the above is that the list of strategies now becomes a "flat" list, with no need for one strategy to know about any others. (The worker-with-sandboxing scenario is a little subtle; but that is basically just the worker strategy with an option for whether to use sandboxing.)
As the code is written today, many strategies -- remote, worker, and sandbox -- need a "fallback" capability right in the runner code (and they usually fall back to standalone). With this new approach, the selection process would happen before we get to the runner; the runner can just do its work.
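The separation described here, selection first and then a runner with no fallback logic of its own, could look something like the following Python sketch. The runner functions, the RUNNERS table, and the ordered preference list are hypothetical illustrations, not Bazel's actual classes; the preference list is just one simple way to express "what was asked for".

```python
from typing import Callable, Dict, List


# Each runner does exactly one thing and knows nothing about the others.
def run_standalone(action: str) -> str:
    return f"standalone:{action}"


def run_sandboxed(action: str) -> str:
    return f"sandboxed:{action}"


def run_worker(action: str) -> str:
    return f"worker:{action}"


RUNNERS: Dict[str, Callable[[str], str]] = {
    "standalone": run_standalone,
    "sandboxed": run_sandboxed,
    "worker": run_worker,
}


def select(preferences: List[str], usable: Callable[[str], bool]) -> str:
    """Return the first preferred strategy that is usable for this action."""
    for name in preferences:
        if usable(name):
            return name
    return "standalone"  # assumed always-available last resort


def execute(action: str, preferences: List[str],
            usable: Callable[[str], bool]) -> str:
    strategy = select(preferences, usable)  # decided once, up front
    return RUNNERS[strategy](action)        # the runner just does its work
```

For example, execute("Compile", ["worker", "sandboxed"], lambda name: name != "worker") picks the sandboxed runner without the worker runner ever being involved.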
Not only does this make the code simpler, but it also (as described extensively above) simplifies the mental model for the user. There is no need to ask questions like "what does a remote build fall back to"; it just falls back to an appropriate kind of local build.
In fact, to some degree, the concept of "strategies" becomes less important to the end user. To the user, it's more of just a multidimensional selection process of different characteristics of the build (remote execution yes/no, remote cache yes/no, worker yes/no, sandboxed yes/no). It's a little easier (for me anyway) to think of it that way than to think of it as "I'm using the remote strategy, which fell back to worker" or "I'm using the worker strategy, which fell back to standalone".
The "problem" scenarios from above
With these new flags, here is how the problem scenarios from above would be handled:
- […] --remote_rest_cache=...; the rest of the selection will fall out naturally from other flags.
- If you specify worker but, for whatever reason, the action can't be run in worker, it will fall back to sandboxed if that has been specified and is available.
- Switching between sandboxed and standalone now just requires --[no]sandboxing, and nothing more.