-
-
Notifications
You must be signed in to change notification settings - Fork 636
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pantsd] Launch the daemon via the thin client. #4931
Conversation
354bc37
to
1daf133
Compare
noting that some changes here further perpetuate #4917 by moving options from subsystem scopes to bootstrap options. my intent is to circle back to #4917 ASAP to resolve this unilaterally for all affected options - but we've declared a "fixit week" at Twitter this week so I likely won't be able to get started on that until next week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -59,7 +59,7 @@ def register_goals(): | |||
# Pantsd. | |||
kill_pantsd = task(name='kill-pantsd', action=PantsDaemonKill) | |||
kill_pantsd.install() | |||
kill_pantsd.install('clean-all') | |||
kill_pantsd.install('clean-all', first=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you comment here as to why this is necessary to run first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -36,6 +37,8 @@ def register_bootstrap_options(cls, register): | |||
status as "bootstrap options" is only pertinent during option registration. | |||
""" | |||
buildroot = get_buildroot() | |||
default_distdir = os.path.join(buildroot, 'dist') | |||
default_rel_distdir = '/{}/'.format(fast_relpath(default_distdir, get_buildroot())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is computed a few lines above, so using fast_relpath
to tear it back apart doesn't make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
@@ -164,6 +165,9 @@ def _run_services(self, services): | |||
self.shutdown(service_thread_map) | |||
raise self.StartupFailure('service {} failed to start, shutting down!'.format(service)) | |||
|
|||
# Once all services are strted, write our pid. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
started
Also, does it make sense to wait until everything starts successfully before writing the PID? If you fail to start up, you'd still want to be killable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
today, yes. any service failure at or just after startup will automatically cause the daemon to teardown by design. once the pid is written, the thin client will immediately try to connect - so it's important to wait until everything is successfully started otherwise we'll be racing a pailgun run vs the teardown.
self._subproject_roots = bootstrap_options.subproject_roots | ||
self._metadata_base_dir = bootstrap_options.pants_subprocessdir | ||
|
||
# TODO(kwlzn): Thread filesystem path ignores here to Watchman's subscription registration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can link to #3479 here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/python/pants/pantsd/watchman.py
Outdated
@@ -147,8 +149,12 @@ def watch_project(self, path): | |||
try: | |||
return self.client.query('watch-project', os.path.realpath(path)) | |||
finally: | |||
self.client.setTimeout(self._timeout) | |||
self._logger.debug('setting post-startup watchman timeout to %s', self._timeout) | |||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nested try-catches not terribly clear... maybe break into another method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
1daf133
to
d03e488
Compare
@kwlzn : Fine with me! |
merging this based on a green CI mod 1 master breakage and 1 flaky test |
### Problem #5169 describes a deadlock that occurs when any native-engine lock is acquired by a non-main thread during a fork. Pre #4931 the `context_lock` was acquired for the entire length of a request, but during that change that lock acquisition was removed. Kris fixed this as part of #5156, but there is not yet a test confirming that the issue is fixed. ### Solution Add a test to cover #5169. ### Result Covers #5169 with an integration test.
…d#5192) ### Problem pantsbuild#5169 describes a deadlock that occurs when any native-engine lock is acquired by a non-main thread during a fork. Pre pantsbuild#4931 the `context_lock` was acquired for the entire length of a request, but during that change that lock acquisition was removed. Kris fixed this as part of pantsbuild#5156, but there is not yet a test confirming that the issue is fixed. ### Solution Add a test to cover pantsbuild#5169. ### Result Covers pantsbuild#5169 with an integration test.
Problem
Currently, the daemon is launched in the middle of a pants run using a Subsystem. Because this happens in the middle of the run and relies on things like Subsystem and options initialization the first run is always "throw away" and doesn't use the daemon. Furthermore, having the lifecycle in the middle of the run vs at the beginning hobbles our ability to cleanly perform lifecycle operations like inline daemon restarts.
Solution
Move all daemon related options to bootstrap options to enable use of the previous daemon-supporting Subsystem instances earlier in the run. Move the responsibility of launching the daemon to the thin client runner, so that it can synchronously start and use the daemon inline.
Result
The thin client can now launch the daemon itself and use it for the first run.