-
-
Notifications
You must be signed in to change notification settings - Fork 358
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Turn on parallelism by default, ensure Commands always run serially at the end #3265
Conversation
Since it's not forbidden, it's possible to depend on commands from other targets, even if this is not recommended. When we now delay command execution to happen serially at the end, it could be problematic. We should have at least one test with a target depending on a command and assert correct behavior. |
We have one such test in |
All tests are green, should be ready for a review. This can go into a binary compatible 0.12.x |
// given but run the commands in linear order | ||
evaluateTerminals( | ||
tasks, | ||
if (sys.env.contains("MILL_TEST_SUITE")) _ => "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment with some explanation here?
I guess, time is ripe to have a table with all used environment variables in our readme.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. This is an internal environment variable, so users should not need to know about it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had other developers (myself included) in mind. I thought, our readme is targeting devlopers.
Besides, nothing we need to do in this PR, just some loud thinking.
val (_, tasksTransitive0) = Plan.plan(Agg.from(tasks0.map(_.task))) | ||
|
||
val tasksTransitive = tasksTransitive0.toSet | ||
val (tasks, commands) = terminals0.partition { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we either add some comment or rename commands
to make it more obvious, it's only those commands that don't have downstream dependencies?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I renamed it leafCommands
. Ideally I would want terminalCommands
, but Terminal
already has another meaning in this code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, IIUC, we don't evaluate all command serially, only those, that don't have downstream dependencies. That's a bit odd, as I was hoping, we could the user some guarantee, that commands are always executed serially.
Now, that I think about it, we probably only want commands, that are interactive to run serially. But we currently can't determite, if a tasks wants to be interactive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not just interactive commands, there's probably lots of commands to do side effects, tests that are not well behaved (e.g. writing to shared paths on disk), and commands that print stuff (and probably don't want the [#threadid]
prefix). Those are all solvable, but I think "only commands at the end of the evaluation graph run serially" is a reasonable compromise that gets us most of the way there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood.
I don't know how often targets depend on commands, but I could imagine, it would be fine to just run those downstream dependencies serially too. So we could make this guarantee for all commands without implementing multiple parallel/serial blocks.
As a consequence of this PR, users will no longer be able to run command-based tests in parallel. I guess, we should promote our |
Yeah we can add |
Remove the hardcoded setting of
threadCount = Some(1)
Split
Task
s andCommand
s into two separate batches, to only runCommand
s serially after allTask
s have runIntroduce a
MILL_TEST_DEST_FOLDER
environment variable in our standardTestModule
. This is something we needed to do to avoid collisions between parallel test modules in Mill's own build (e.g. everyone stomping overtarget/worksources
) and will be something that Mill users will need as wellCommands are typically the less well-behaved tasks, whereas Targets and others are meant to be pure functional and thread safe without side effects. So running commands serially at the end is a decent tradeoff the do the right thing by default, and if someone wants parallelism they can always use a task (e.g.
.testCached
instead of.test
)Commands called from other tasks, e.g. those tested by
example/misc/5-module-run-task
, are run in parallel together with the main group. We assume that these commands are well behaved, as their results will potentially be cached (as part of downstream tasks), and so I run them as part of the primary task graph.(An alternative could have been to run these sequentially as well, but given the nature of commands-run-as-part-of-tasks I think that would be unnecessarily conservative)
I disable the thread-number log prefixes in the integration tests, since all they would do there is add clutter.
Pull request: #3265