Declare v2 products on v1 Tasks #4769
Comments
I'm placing this in the …
Attached an image from the design discussion.
#4973 is related, because this new design for tasks to request products should incorporate …
Another aspect that we should consider incorporating here is the ability for a frontend task to run in a loop... possibly by turning the execute method into a context manager?
@stuhood Can you describe the motivation for running in a loop?
@illicitonion : I was thinking in particular of hot-reload usecases such as jrebel, play-framework, or mobile simulators: not needing to fork/daemonize the JVM would make a task that wanted to continuously reload a child much easier to implement... on each iteration you'd have a (partially) new classpath to feed to the hot reload framework.
...and LSP servers.
Alright, I think that this looks eminently feasible. The proposed initial API (can add more selector types in future) on a Task is:
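A minimal sketch of what such a declaration might look like, assuming hypothetical names (the `Select`-style selector and the `product_requests` classmethod below are stand-ins for illustration, not the actual pants API):

```python
# Sketch only: `Select` and `product_requests` are invented stand-ins for whatever
# selector types the real API would expose.
class Select(object):
  """Selects a v2 product of the given type for each target root."""

  def __init__(self, product_type):
    self.product_type = product_type


class FilterTask(object):
  """Example: `filter` needs only un-hydrated target objects, not a full BuildGraph."""

  @classmethod
  def product_requests(cls):
    # A default implementation returning None could mean "hydrate a complete BuildGraph".
    return [Select('UnhydratedStructs')]
```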
The restriction that allows us to avoid walking the task graph to determine whether we need to compute a `BuildGraph` is … . Pre-fork, we will warm the … . Post-fork, we will re-compute the above "no v1 tasks" property to skip actually constructing the `BuildGraph`.

So, this actually seems relatively straightforward. The only thing is: I don't feel good about diving in here until #4401 is resolved. I'm going to dive into that over the next few days.
As long as this is experimental, sounds good :) I'm sure we'll want to change this as we start working on more migration, but I don't have any suggestions, so let's push ahead.
I have something working here, although it needs a bit of polish. Will post once #4401 is (more) solidly landed.
Sounds good to me too.
From #5003 (and other discussion): there is a fundamental question of how root selection will need to change in order for us to succeed in a meaningful way for a command like … . I've been thinking for a while that it might be the case that the …
On #5639, @illicitonion had the interesting suggestion of making frontend tasks a "special" kind of rule.

The big, big sticking point is that in order for any of the requests the task makes to be memoized, it would need to be running pre-fork. And that's a very high bar... all usage of Subsystems would have to be via #5788, and no other mutable APIs would be safe for usage. But jumping over this high bar would also allow us to kill forking for these frontend tasks... and that would be super exciting.

And in terms of doing clean-room implementations of tasks (as opposed to ports of existing v1 tasks), this would be a pretty killer API to use: it totally solves the "how do I write a declaration of what some imperative code will need in order to execute?" problem. Getting more excited about it.
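As a rough illustration of the generator-style idea (hypothetical names, not the engine's actual interface), a frontend rule could declare what it needs by yielding requests that the engine fulfills:

```python
from collections import namedtuple

# Hypothetical request type: "give me this product for this subject".
Get = namedtuple('Get', ['product', 'subject'])


def list_frontend_rule(target_roots):
  """A frontend rule as a generator: each yield is a memoizable product request."""
  addresses = yield Get('Addresses', target_roots)
  return '\n'.join(addresses)


def run_rule(rule, subject):
  """A toy driver standing in for the engine: it fulfills each yielded request."""
  gen = rule(subject)
  request = next(gen)  # e.g. Get(product='Addresses', subject=target_roots)
  try:
    gen.send(['src/python/foo:foo', 'src/python/bar:bar'])  # pretend memoized result
  except StopIteration as result:
    return result.value
```

Because every request goes back through the engine, the results could in principle be memoized pre-fork, which is exactly the high bar described above.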
Hm... one unfortunate downside of frontend tasks operating that way would be that in order to avoid interleaving outputs, … . Certainly for something like …
It feels like there are two distinct modes that tasks should be run in:

1. Batched: a task's output is buffered and emitted atomically when it completes.
2. Interactive/streaming: output is printed as it's produced, for things you're actively debugging.

Separating the design of the output models here could be useful, and also not terribly difficult, given we already model … . By default, we could batch up the output of any particular task, and atomically output it when it completes. This means things execute in parallel and don't interleave their output, but that we don't get great responsiveness. That's only really a problem for things which you're interactively debugging (e.g. test output, which should be handled by mode 2), or things which are showing progress (e.g. artifact upload, which should be handled somehow from the new UX design).

But in some situations (perhaps if there's one goal? Or one goal + target? Or if a flag is specified?) we should only run one frontend task at a time, and print lines as they're yielded. This allows us to not worry too much about interleaving, but still provide the power to debug where needed.
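A small sketch of the "batch by default" output model described above (the class and lock here are purely illustrative, not existing pants code):

```python
import io
import threading

_console_lock = threading.Lock()


class TaskOutputBuffer(object):
  """Buffers one task's output and emits it atomically when the task completes."""

  def __init__(self, task_name):
    self._task_name = task_name
    self._buffer = io.StringIO()

  def write_line(self, line):
    # Buffered rather than printed immediately, so parallel tasks never interleave.
    self._buffer.write(line + '\n')

  def flush(self):
    # Atomic emit: take the console once and dump everything this task produced.
    with _console_lock:
      print('--- {} ---'.format(self._task_name))
      print(self._buffer.getvalue(), end='')
```

The interactive mode would instead bypass the buffer and hold the console for the single running task, printing lines as they're yielded.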
Unless there is exactly one running, maybe? Yea, I think that makes sense. Because certainly we're going to have some other UX for the progress of …
### Problem

As described in #5736, some of the metrics we've recently added introduce dependencies on having constructed a `BuildGraph`. In order to unblock removing the `BuildGraph` requirement in #4769, we'd like to replace those metrics with forward compatible metrics computed in the native engine.

### Solution

Split out a `Session` type which represents a scope for metrics capture. Currently the `Session` records the root requests that were made during its lifetime, which allows it to later determine which nodes are reachable from those roots. In future, additional metrics could be captured in the `Session` object at the beginning (ie, when it is created) or "end" (when `session.metrics()` is called) of the `Session`.

The python objects representing the scheduler were renamed to `Scheduler` and `SchedulerSession` to represent this split. `pantsd` reuses a `Scheduler` for its entire lifetime, but creates a new `SchedulerSession` per client request, which is then used post fork as well.

### Result

The `affected_target_file_count` metric was replaced with an `affected_file_count` metric, based on the count of digests reachable from all of the roots in a session. Fixes #5736 and unblocks adding further scheduler metrics.
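Roughly, the split looks like this (a sketch with simplified shapes, not the actual source):

```python
class Scheduler(object):
  """Long-lived: pantsd keeps one of these (and its graph) for its whole lifetime."""

  def new_session(self):
    return SchedulerSession(self)


class SchedulerSession(object):
  """Per client request: records root requests so per-run metrics can be derived later."""

  def __init__(self, scheduler):
    self._scheduler = scheduler
    self._roots = []

  def product_request(self, product, subjects):
    self._roots.extend((product, subject) for subject in subjects)
    # ...delegate the actual computation to self._scheduler...

  def metrics(self):
    # Illustrative only: the real metric counts digests reachable from the recorded roots.
    return {'affected_file_count': len(self._roots)}


scheduler = Scheduler()            # reused across client runs by pantsd
session = scheduler.new_session()  # fresh per client request, used post-fork as well
session.product_request('Addresses', ['src/python/foo::'])
print(session.metrics())
```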
A few open questions around the proposed "frontend rules" model: …

The last one is the big sticking point for me right now, so it would be interesting to brainstorm that a bit with @kwlzn and anyone else who might be interested.
regarding parallelism, I'm leaning towards this rough sketch for the case of all existing …

if something needs interactivity (e.g. …), …

another interesting idea might be to port …

one possible further idea for #2 might involve live streaming …

other stray thoughts:
or to be able to compute those requests pre-fork somehow, such that we could warm them pre-fork and then re-request post-fork cheaply? It's not immediately clear to me what sort of dynamic product requests one of these sorts of rules would be making that couldn't be computed pre-fork or via a regular, underlying …
this makes perfect sense to me and I think it'll all come down to the UI/UX. I think there's also a question of what sort of logic the …
they're probably installed like any other rule + some … . From there, kickoff subject types are probably options, targetroots, …?
trying to understand a concrete case where these would need real access to the filesystem vs the ability to operate in a sandboxed snapshot as a subprocess - or just delegating that to a sub-…

I think we virtualize stderr/stdout, for sure. ttys are trickier, particularly if we need to hand a real tty to a subprocess like in the case of v1 repls. thinking that initially it'd probably be easiest to just synchronize access to the main tty for any 'interactive' thing, then make everything else non-interactive by default. we'd also talked about the idea of directly …

I think all of the console access here by rules is fully virtualized, so we'd ultimately have control over how to display progress in the UI/UX impl. there are many console UI techniques we could employ here, and thinking about the potential there is making me pretty antsy to begin work on that :)
Thanks for thinking about this! Much appreciated. It would likely be good to transfer some of this into a doc for easier commenting, but to respond quickly to a few items.
Sort of. The issue is that unless an …
In the curses mode, interleaving is a non-issue. In the "single-output-file" mode (as in naive jenkins without tailing for multiple output files), this results in interleaving. I think I'm convinced that appropriately indented-and-labeled interleavings would be ok, but this is a case that we'll need to be careful of.
I think my feeling on this is that in order to get something that renders individual test runs as they complete, individual test runs would have to be rendered in the underlying … . The reason I say this is because there is no API currently to "stream" results to an … . This is a new realization (for me): will comment more on it in the last paragraph.
This is essentially what I had been thinking of doing on this ticket in the first place, but the big big challenge in that case is: what does the definition language look like for the queries that run pre-fork? Basically, it reduces to exactly the problem we had with … .

In spite of what I've said above about … .

While writing this response, the fact that …
Interesting - I had been assuming that …
I'm increasingly thinking that we're going to need to change the fork model here. I feel like we're having to dance around a lot of circles, which aren't good circles to be dancing around, to make sure that things happen at the right time relative to the fork. ("Either we can run the tests pre-fork and debugging them is hard, or we run them post-fork and lose our caching.") Either keeping the cache out of memory, making the post-fork cache populate the pre-fork cache, or making a rust process be the master process and scheduling executions in threads rather than processes, would make this dancing go away.
Why is filesystem access required in either?

All publishing of files into …
If we're going to have a concept of …
Formatters definitely feel like things which fit into a "You only run one of these, and it takes a lock" kind of use case - I can't see how anything else can be deterministic, as …
Yea, it's possible that both of those usecases could be explicit.

@kwlzn will be splitting this ticket up into a bunch of UX-related tickets, with "exposing an alternative to v1 Tasks" as the highest priority.
@kwlzn split this up into tickets with the v2 python pipeline label: …

While it's possible that we'll still need to add a "(cacheable) v2 products from v1 tasks" API in order to support remoting, I think that can be added as we go to unblock particular usecases (@illicitonion's work on codegen in #5994 seems like the most immediate beneficiary, as running codegen pre-fork to make it cacheable in memory would clean things up significantly).
I've had a realization that the original goal of this ticket is actually trivially accomplishable using a mechanism similar to … . The difference from what I originally suggested here is pretty slim: it's essentially just that rather than using a list of …
aka "Remove requirement for tasks to fully hydrate a BuildGraph"

Currently, all pants Tasks (including things like `list`, `filter`, etc) implicitly require a complete BuildGraph to be hydrated.

To avoid actually walking the entire filesystem in order to hydrate that graph for tasks that don't examine source globs, the v1 engine uses lazily expanded `LazyFilesetWithSpec` objects. The downside to this laziness is that it cannot be cached in the daemon, so the v2 engine eagerly expands globs and creates `EagerFilesetWithSpec` objects, but this means that all file fingerprinting happens eagerly in the v2 engine.

The root cause is that Tasks are not specific about which products they need in order to run: instead, they implicitly require a `BuildGraph`. `list`, for example, only requires `Addresses` for the target spec roots, which are very cheap to compute. And `filter` only requires un-hydrated target objects (in the current v2 Rule datamodel, this means it only needs `UnhydratedStructs`).

Exposing a new API from Task that allows it to explicitly declare what v2 products it needs (perhaps with a default implementation that indicates that the Task requires a complete BuildGraph) would allow the v2 engine codepaths to avoid instantiating a BuildGraph for tasks that do not require it. It's also possible that Tasks should be able to expose v2 `Rules` that can assist in calculating its product requirements.
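For concreteness, a minimal sketch of that proposal (the method and sentinel names here are invented for illustration): the base Task would default to requiring a full BuildGraph, while a cheap goal like `list` overrides that to ask only for `Addresses`:

```python
class Task(object):
  # Assumed default sentinel meaning "this task still needs a fully hydrated BuildGraph".
  REQUIRES_BUILD_GRAPH = object()

  @classmethod
  def required_products(cls):
    return cls.REQUIRES_BUILD_GRAPH


class ListTask(Task):
  @classmethod
  def required_products(cls):
    # `list` only needs the Addresses of the target spec roots, which are very cheap to compute.
    return ['Addresses']
```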