Add API for spawning task-futures, use it for grouping and parallelization of test classes within a single module #3478
@@ -404,6 +404,7 @@ trait MillScalaModule extends ScalaModule with MillJavaModule with ScalafixModul
def moduleDeps = outer.testModuleDeps
def ivyDeps = super.ivyDeps() ++ outer.testIvyDeps()
def forkEnv = super.forkEnv() ++ outer.forkEnv()
// override def testForkGrouping = discoveredTestClasses().grouped(1).toSeq
We can turn this on properly once we re-bootstrap
def testForkGrouping = discoveredTestClasses().grouped(3).toSeq is not a great grouping strategy because it will spawn too many JVMs.

def testForkGrouping = T {
  val classes = discoveredTestClasses()
  classes.grouped(classes.size / 3)
}
Mill does not have the information to do balancing "properly" though: it's possible we could do something like dynamic work stealing, or store historical test run times and group based on that, but for an initial pass I didn't want to be too sophisticated. At my previous job we did per-class forking and despite the per-JVM overhead it generally worked pretty well, and this is essentially the API that SBT provides, so IMO it's a reasonable first pass and we can try to be more sophisticated in a follow-up.
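For reference, here is a minimal sketch of a "fixed number of groups" strategy written as a plain function, independent of Mill. The names GroupingSketch and groupInto are hypothetical and not part of this PR. It is motivated by two properties of grouped(classes.size / 3): it fixes the group size rather than the group count, and grouped(0) throws when there are fewer than 3 classes.

```scala
object GroupingSketch {
  /** Splits `classes` into at most `n` groups whose sizes differ by at most one.
    * Hypothetical helper for illustration only; not part of Mill.
    */
  def groupInto(classes: Seq[String], n: Int): Seq[Seq[String]] = {
    require(n > 0, "n must be positive")
    if (classes.isEmpty) Seq.empty
    else {
      // Never create more groups than there are classes.
      val groupCount = math.min(n, classes.size)
      // Round-robin assignment by index keeps the group sizes balanced.
      classes.zipWithIndex
        .groupBy { case (_, i) => i % groupCount }
        .toSeq
        .sortBy { case (groupIndex, _) => groupIndex }
        .map { case (_, members) => members.map { case (name, _) => name } }
    }
  }
}
```

Inside a build, the body of testForkGrouping could presumably call such a helper on discoveredTestClasses(), though that wiring is an assumption and is not shown in this PR.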
This PR exposes a T.fork.async/.await/.awaitAll API for tasks to spawn futures that share the task evaluator's execution context, integrates it with the PromptLogger, and uses it to allow Mill to parallelize test suites on a per-test-class basis. This is only the first such use case for this .sandboxedFuture API, and I expect there'll be many more (e.g. I've been wanting to parallelize the sonatype uploader).

Spawned futures integrate with Mill's --jobs configuration, folder sandboxing, and terminal UI/logging, rather than the status quo where such thread pools are invisible things running silently in the background doing god-knows-what.

The futures are shown in the PromptLogger, such that the prompt shows them grouped under their parent task, with their prefix [106-0] or [106-1] associated with their parent task [106]. To do this, the PromptLogger now treats the keys as key: Seq[String] rather than key: String.

def async[T](dest: Path, key: String, message: String)(t: => T): Future[T] (naming could be better?) is designed to fit into Mill, ensuring you provide a place to put filesystem logs, a filesystem sandbox folder, TUI log prefix and prompt label, etc. That means that such Futures are as observable and sandboxed as normal Mill tasks, and have most of the same properties.

The test grouping is roughly a port of the SBT testGrouping feature (https://www.scala-sbt.org/1.x/docs/Testing.html#Forking+tests) and serves the same purpose. This is most useful for codebases with large modules, each of which has a lot of test classes, and the flexible nature of def testForkGrouping gives the user room to tune exactly how the tests are grouped for maximal performance.

To test the results for testForkGrouping, I ran time ./mill -i scalalib.test with and without test grouping on my 10 core macbook pro (after breaking up HelloWorldTests.scala for better granularity), and saw about a 3x speedup.

Without Test Grouping, all test classes in 1 JVM (default)
With Test Grouping, 3 test classes per JVM (def testForkGrouping = discoveredTestClasses().grouped(3).toSeq)
With Test Grouping, 1 test class per JVM (def testForkGrouping = discoveredTestClasses().grouped(1).toSeq)

The limited speedup is likely due to the heavy nature of Mill tests, which means that even when run sequentially they already use multiple cores; I would expect a greater speedup for most projects, whose tests would be more lightweight. We can also see that 1-test-class-per-JVM is somewhat slower than 3-test-classes-per-JVM in this case, likely due to JVM overhead becoming significant.
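To make the shape of the async signature described above a bit more concrete, here is a rough stand-alone sketch of spawning one future per test group and waiting for all of them. ForkSketch and ForkSketchDemo are hypothetical stand-ins built on scala.concurrent, not Mill's implementation: they use java.nio.file.Path for dest, print the key instead of talking to the PromptLogger, and replace "run this group in a subprocess" with a placeholder.

```scala
import java.nio.file.{Files, Path}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical stand-in mirroring the parameter list of `async`; not Mill code.
object ForkSketch {
  def async[T](dest: Path, key: String, message: String)(t: => T)(
      implicit ec: ExecutionContext
  ): Future[T] =
    Future {
      Files.createDirectories(dest) // each future gets its own sandbox folder
      println(s"[$key] $message")   // stand-in for the log prefix and prompt label
      t
    }

  // Blocks the caller until every future has completed, preserving order.
  def awaitAll[T](futures: Seq[Future[T]])(implicit ec: ExecutionContext): Seq[T] =
    Await.result(Future.sequence(futures), 1.minute)
}

object ForkSketchDemo {
  import ExecutionContext.Implicits.global

  // Spawns one future per group, keyed "<parentKey>-<index>" like [106-0], [106-1].
  def runGroups(base: Path, parentKey: String, groups: Seq[Seq[String]]): Seq[Int] = {
    val futures = groups.zipWithIndex.map { case (group, i) =>
      ForkSketch.async(
        base.resolve(s"group-$i"),
        s"$parentKey-$i",
        s"running ${group.size} test classes"
      ) {
        group.size // placeholder for running the group in its own JVM
      }
    }
    ForkSketch.awaitAll(futures)
  }
}
```

The point of the extra dest/key/message parameters is that every spawned future carries enough information to be sandboxed and displayed, which a bare Future { ... } does not.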
This feature is opt-in via def testForkGrouping = discoveredTestClasses().map(Seq(_)). The default behavior of running all tests in a single JVM is unchanged.

Implementation Notes
We re-use the same ExecutionContext that Mill uses internally for scheduling its targets, allowing the scheduling to be cooperative.

We convert the default FixedThreadPool into a ForkJoinPool and provide a blocking{...} operation to allow the ForkJoinPool to spawn an additional thread when an existing thread is blocked waiting.

This is used while waiting on the Futures spawned for each test class, as the task-level thread is idle and we want to continue making use of the available CPUs.

This is similar to what scala.concurrent.ExecutionContext.global does, but global's implementation is private and not re-usable, so I have to duplicate the small amount of code wiring it up.

The ForkJoinPool concurrency and thread-management model is pretty battle-tested in Java land, so although it's new here I'm not too worried about its performance and robustness, and anyway Mill is a pretty low-concurrency system overall so it shouldn't be pushing any limits.

Each test class runs in a subprocess in a separate JVM with a separate sandbox folder, and their outputs are then all read and consolidated back into the combined output for the original test task.
This is controlled by the target def testForkGrouping: Seq[Seq[String]], which defaults to Seq(discoveredTestClasses()) to put them all in one group, but can be customized in arbitrary ways.

This is covered by additional unit tests and java/scala/kotlin example tests included in the docsite.
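Going back to the blocking{...} operation from the implementation notes: the sketch below shows one way such an operation can be built on the JDK's ForkJoinPool.managedBlock, which lets the pool add a compensating thread while a worker is blocked. BlockingSketch and managedBlocking are hypothetical names, and this is not necessarily how the PR wires it up. A pool with a single worker is used so that the parent future, which waits on its children, would likely stall without the managed block.

```scala
import java.util.concurrent.ForkJoinPool
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object BlockingSketch {
  // Hypothetical helper: runs `body` as a ManagedBlocker so that a blocked
  // ForkJoinPool worker may be compensated with an additional thread.
  def managedBlocking[T](body: => T): T = {
    var result: Option[T] = None
    ForkJoinPool.managedBlock(new ForkJoinPool.ManagedBlocker {
      def block(): Boolean = { result = Some(body); true } // true: no more blocking needed
      def isReleasable(): Boolean = result.isDefined
    })
    result.get
  }

  def run(): Int = {
    val pool = new ForkJoinPool(1) // one worker, so blocking it matters
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      val parent = Future {
        // Child futures are scheduled on the same pool as the parent.
        val children = (1 to 4).map(i => Future(i * i))
        // The parent's thread is idle while waiting, so tell the pool about it.
        managedBlocking(Await.result(Future.sequence(children), 30.seconds)).sum
      }
      Await.result(parent, 60.seconds)
    } finally pool.shutdown()
  }
}
```

The same idea is what lets a task thread wait on per-test-class futures without reducing the number of CPUs doing useful work.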