-
-
Notifications
You must be signed in to change notification settings - Fork 649
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add --use-native-image option to scalafmt and add thread parallelism to RewriteBase #8772
Conversation
The intent is to convert this to v2 very soon, and this PR is to demonstrate the specific effect of sharding scalafmt across threads or processes. This should be immediately and directly applicable to converting scalafmt into a v2 rule. |
Note that #8760 has a legitimate implementation of both fetching |
71a3901
to
505f01e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, except for the lack of tests - can you add some? :) Particularly one where the number of files being formatted isn't an even divisor of the number of parallel processes, and one showing that if we say the process limit is more than the number of files, we don't start extra processes?
505f01e
to
1d950f2
Compare
Thank you! Done!! |
self.set_options(skip=False, **kwargs) | ||
self.set_options_for_scope('scalafmt', use_native_image=True) | ||
|
||
uuid = '_' + str(uuid4()).replace('-', '_') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we introduce randomness into these tests? I'd rather either use a constant, or num_sources
, so that when we re-run the tests we expect identical code to run.
@@ -18,20 +84,22 @@ class ScalaFmt(RewriteBase): | |||
process_result to run different scalafmt commands. | |||
""" | |||
|
|||
def __init__(self, *args, **kwargs): | |||
super().__init__(*args, **kwargs) | |||
self._all_command_lines = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than modifying the actual production code to also act like a spy, could we in the test code use a fake binary for scalafmt which logs its command line or something? Or unit test the thread-splitting logic specifically?
Closing due to being stale + the blocking review. Please feel free to reopen if you'd like to revive this. Graal is indeed very neat. |
Problem
We would like to experiment with using the native-image version of scalafmt, and especially we would like to investigate parallelism in scalafmt. We suspect that this may be significantly faster and use much less memory than a JVM execution, as per this conference doc for the Graal workshop at the CGO 2019 conference. At Twitter's 2018 hackweek, @ShaneDelmore and I were able to demonstrate better performance with a Graal
native-image
-compiled version ofscalafmt
than the nailgun approach.Solution
ScalaFmtSubsystem
to contain thescalafmt
JVM tool and the native-image version as aNativeTool
(this is similar to what we've done forZinc
!).--files-per-process
and--total-number-parallel-processes
options toRewriteBase
, implemented with threading, since formatting is an embarrassingly parallel problem!Benchmarking
Not complete yet, but:
This demonstrates that native-image beats a "warm" nailgun (a nailgun that has already been executed against several times) with
--files-per-process=2
, which is about half of the time when using scalafmt with nailgun and no splitting of files (~1.6s).Note that the
main:fmt:scalafmt
time is what matters, notmain:fmt:scalafmt:scalafmt
, which describes the cumulative time over all threads.Benchmarking in a larger repo
In a larger monorepo, we observed the following speedup on a 6-core i9 macbook pro (with hyper-threading):
Result
Scalafmt can now be run with configurable performance, and the native-image version can be used transparently by any pants user without extra configuration!