-
-
Notifications
You must be signed in to change notification settings - Fork 648
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] split NailgunProtocol into PailgunProtocol versions of everything, add PGRP chunk #6579
Closed
cosmicexplorer
wants to merge
2
commits into
pantsbuild:master
from
cosmicexplorer:differentiate-nailgun-pailgun-pgrp
Closed
[WIP] split NailgunProtocol into PailgunProtocol versions of everything, add PGRP chunk #6579
cosmicexplorer
wants to merge
2
commits into
pantsbuild:master
from
cosmicexplorer:differentiate-nailgun-pailgun-pgrp
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This was referenced Oct 5, 2018
81066cd
to
81260fe
Compare
cosmicexplorer
added a commit
that referenced
this pull request
Jan 18, 2019
…classpaths (#7092) *Resolves #7089.* ### Problem `RscCompile` is the one task in pants which invokes multiple JVM tools over the course of its run, as a consequence of [using outlining to generate semantic information with rsc which zinc can compile against](https://github.com/twitter/rsc/blob/master/docs/compiler.md#typechecking-in-rsc-mid-2018). This doesn't play nice with our `NailgunExecutor`, and causes error messages like the following when `--worker-count` > 1 (see #7089): ``` metacp(jdk) failed: Problem launching via NailgunClient(host=u'127.0.0.1', port=55511, workdir=u'/Users/dmcclanahan/tools/pants') command scala.meta.cli.Metacp --verbose --out .pants.d/tmp/tmpX8iJid.pants.d/compile/rsc/a04416cba788/--jdk--/index /Library/Java/JavaVirtualMachines/TwitterJDK/Contents/Home/jre/lib/rt.jar:/Library/Java/JavaVirtualMachines/TwitterJDK/Contents/Home/lib/dt.jar:/Library/Java/JavaVirtualMachines/TwitterJDK/Contents/Home/jre/lib/jce.jar:/Library/Java/JavaVirtualMachines/TwitterJDK/Contents/Home/lib/tools.jar: (u'Problem talking to nailgun server (address: 127.0.0.1:55511, remote_pid=<remote PID chunk not yet received!>, remote_pgrp=<remote PGRP chunk not yet received!>): TruncatedHeaderError(u"Failed to read nailgun chunk header (TruncatedRead(u\'Expected 5 bytes before socket shutdown, instead received 0\',)).",)', TruncatedHeaderError(u"Failed to read nailgun chunk header (TruncatedRead(u'Expected 5 bytes before socket shutdown, instead received 0',)).",)) E Traceback: E File "/Users/dmcclanahan/tools/pants/src/python/pants/backend/jvm/tasks/jvm_compile/execution_graph.py", line 276, in worker E work() E E File "/Users/dmcclanahan/tools/pants/src/python/pants/backend/jvm/tasks/jvm_compile/execution_graph.py", line 44, in __call__ E self.fn() E E File "/Users/dmcclanahan/tools/pants/src/python/pants/backend/jvm/tasks/jvm_compile/rsc/rsc_compile.py", line 339, in work_for_vts_rsc_jdk E output_dir=rsc_index_dir) E E File "/Users/dmcclanahan/tools/pants/src/python/pants/backend/jvm/tasks/jvm_compile/rsc/rsc_compile.py", line 819, in _runtool E dist=distribution E E File "/Users/dmcclanahan/tools/pants/src/python/pants/backend/jvm/tasks/nailgun_task.py", line 111, in runjava E raise TaskError(e) E [info] Compiling 1 Java source to /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpX8iJid.pants.d/compile/rsc/a04416cba788/examples.src.java.org.pantsbuild.example.hello.greet.greet/current/zinc/classes ... E [info] Done compiling. E [info] Compile success at Jan 16, 2019 6:36:30 PM [2.258s] E E FAILURE: Compilation failure: Failed jobs: metacp(jdk) ``` ### Solution - Introduce `JvmToolMixin.register_combined_jvm_tools()` as an API for accessing specific JVM tools as a combined classpath (a suggestion from @xeno-by) but different main classes, allowing the use of a single nailgun instance (resolving #7089). - Introduce `NailgunTaskBase#do_for_execution_strategy_variant()` which allows specifying different actions to perform for different values of the `--execution-strategy` option in a structured way. - Use the above to add a code path for a nailgun execution for `RscCompile`, and add testing for `--worker-count` > 1. - Introduce `ZincCompile#get_zinc_compiler_classpath()` to allow `RscCompile` to override it with the combined classpath to persist the nailgun. - Remove the mysterious `or self.cmd != self._distribution.java` in `nailgun_executor.py` -- this is necessary for this PR to work, but an option can be plumbed in if it breaks pantsd (this might motivate #6579). ### Result `RscCompile` can be invoked with > 1 nailgun instance at a time by bundling all the tool jars into a single classpath, without affecting the performance for any of the other execution strategies.
d6f6b17
to
2ddcffd
Compare
…erything we were using something that kinda supported either without making it clear
2ddcffd
to
fa98448
Compare
These chunks were removed in #9389. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
WIP because I wasn't able to get the
java/
tests all the way passing last night, there's probably hella refactoring I could doProblem
Our implementation of the Nailgun protocol has an extension in the form of a PID chunk, currently, which we will only worry about if we receive a chunk of that type (types are defined in
ChunkType
). The only difference in the way we use pailgun clients is that the nailgun client (e.g. in RemotePantsRunner) will secretly wait for a PID chunk (even if it's for a nailgun invocation that doesn't support that). This actually works pretty well, and I only ran into issues when I wanted to try adding the PGRP chunk or doing anything based on it, because we currently wait for the PID chunk in the execution loop inNailgunClientSession#process_session()
, and that doesn't really make it clear when or whether we can expect that chunk or whether there should be more than one.I realized that the step of receiving the PID/PGRP chunk can be separated into a distinct initialization phase, after the execution request is sent from the nailgun client to pantsd, after our pantsd-runner process self-daemonizes. Not doing that now means we don't reliably know the remote pid or pgrp, when we can instead require that these are sent before any execution occurs (this was already deterministically true). So I ran with that and here's the justification for the current changes:
There is (or rather, there's not now at all, but should be) some implicit sequencing required in negotiating a pailgun execution given a live (what is now called)
PailgunClient
:One contribution of this PR is to make that sequencing (which is specific to pailgun) explicit by composing contextmanagers -- this makes ownership of the underlying resources and the IPC needed to obtain them much more clear, imo. The PID/PGRP dissonance at the top is a prime example of how trying to munge them together is hard (and it ends up being not too much work to un-munge them, even if the diff ends up being big).
If you're concerned that this sounds like it's introducing too much state into the session, I was actually able to remove most (all?) mutations by using contextmanagers, which also makes a pailgun execution simply a matter of a
with
statement inside a nailgun execution (which if you were to write it in pseudocode could maybe look like that as well). Please let me know if there are any minor to sweeping changes or misconceptions to fix on nailgun, python, or IPC in general.TODO
datatype
to be specific)? This sounds like the right thing to do.(explain the context of the problem and why you're making this change. include
references to all relevant github issues.)
Solution
PailgunChunkType
classPailgunProtocol
,PailgunClient
, etc (was trying to avoid this lengthy mirroring to whatever extent possible but not sure if that is a good idea or feasible at this point)(describe the modifications you've made.)
Result
(describe how your changes affect the end-user behavior of the system. this section is
optional, and should generally be summarized in the title of the pull request.)
TODO