store rank 0 instance URIs in enclosing instance KVS #1429
Codecov Report
@@ Coverage Diff @@
## master #1429 +/- ##
==========================================
- Coverage 78.73% 78.71% -0.02%
==========================================
Files 163 163
Lines 30113 30114 +1
==========================================
- Hits 23708 23704 -4
- Misses 6405 6410 +5
|
That sounds perfect!
…On 5 Apr 2018, at 7:31, Jim Garlick wrote:
This PR is an attempt to address #1422 by writing an instance's rank 0
FLUX_URI and a derived ssh:// URI to `lwj.X.X.X.flux.local-uri` and
`lwj.X.X.X.flux.remote-uri` in the enclosing instance.
I need to provide a test but thought I'd post early to get any
feedback from @trws.
-- Commit Summary --
* wreck: set FLUX_JOB_KVSPATH
* rc1: inform enclosing instance of useful URIs
-- File Changes --
M configure.ac (6)
M etc/Makefile.am (3)
A etc/rc1.d/01-enclosing-instance (16)
M src/modules/wreck/wrexecd.c (1)
-- Patch Links --
https://github.com/flux-framework/flux-core/pull/1429.patch
https://github.com/flux-framework/flux-core/pull/1429.diff
|
Great. I'll add a |
Nice! Is there a requirement here to add to our requirements/use case doc? |
It might be good for a use-case doc. This is a good step one for
working with nested flux instances, especially when users need them for
job step execution in regular jobs. We’re doing a lot of that for the
splash app, and I found myself wishing for this frequently.
…On 5 Apr 2018, at 9:49, Mark Grondona wrote:
Nice! Is there a requirement here to add to our requirements/use case
doc?
|
OK, subcommand looks like this:
|
That looks good. Would it be possible to get a variant where I provide
a job-id and it prints just the uri for that job? It would help me with
scripting helpers if that isn’t too much trouble.
…On 5 Apr 2018, at 10:16, Jim Garlick wrote:
OK, subcommand looks like this:
```
$ flux wreck uri
    ID NTASKS STATE      FLUX_URI                        COMMAND
     1      1 exited     -                               hostname
     2      1 exited     -                               hostname
     3      1 exited     ssh://jimbo//tmp/flux-qCbPz5    flux
```
|
Yes, good idea. |
Force-pushed from 89eb4c1 to c8f627e.
OK, I added a Added a couple tests also. This is probably close to ready if travis agrees.
I've made a note to return to this after the splash fire drill - probably! And it may be that we'll want something that works for deeper recursion long term... |
Dropped the lua |
Rebased. |
Problem: a child instance of Flux doesn't know the path to its KVS directory in the enclosing instance. Set the path in the env variable FLUX_JOB_KVSPATH for each job. For example, in the current exec implementation, this might have the value "lwj.0.0.1".
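As a rough illustration of how a process inside a job might use this variable, the sketch below builds a key name in the enclosing instance's KVS from `FLUX_JOB_KVSPATH`. The helper name `job_kvs_key` is hypothetical and not part of this PR; only the variable name and the example value "lwj.0.0.1" come from the commit message.

```shell
# Hypothetical helper (not part of this PR): print the enclosing-instance
# KVS key for a given suffix, e.g. "flux.remote_uri".
# Returns non-zero if FLUX_JOB_KVSPATH is unset or empty, which is assumed
# here to mean the process was not launched as a job.
job_kvs_key() {
    if [ -z "${FLUX_JOB_KVSPATH:-}" ]; then
        echo "FLUX_JOB_KVSPATH is not set" >&2
        return 1
    fi
    printf '%s.%s\n' "$FLUX_JOB_KVSPATH" "$1"
}
```

With `FLUX_JOB_KVSPATH=lwj.0.0.1`, calling `job_kvs_key flux.remote_uri` prints `lwj.0.0.1.flux.remote_uri`.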
Problem: it is inconvenient to determine the URI to use to connect to a sub-instance.

Write URIs to the job's KVS directory in the enclosing instance:

    lwj.X.X.X.flux.local_uri=local://...
    lwj.X.X.X.flux.remote_uri=ssh://...

Fixes flux-framework#1422
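One way the ssh:// URI could be derived from the local one is sketched below. This is an assumption inferred from the example URI `ssh://jimbo//tmp/flux-qCbPz5` shown in this thread, not a copy of the rc1.d script; the function names `derive_remote_uri` and `uri_kvs_pairs` are hypothetical.

```shell
# Hypothetical sketch: derive an ssh:// URI from a local:// URI by dropping
# the scheme and prefixing "ssh://HOST/". The local path keeps its leading
# slash, which yields the double slash seen in the example output.
derive_remote_uri() {
    # $1 = hostname, $2 = local URI (e.g. local:///tmp/flux-qCbPz5)
    printf 'ssh://%s/%s\n' "$1" "${2#*://}"
}

# Print the two key=value pairs described in the commit message.
# $1 = job KVS path (e.g. lwj.0.0.1), $2 = hostname, $3 = local URI
uri_kvs_pairs() {
    printf '%s.flux.local_uri=%s\n' "$1" "$3"
    printf '%s.flux.remote_uri=%s\n' "$1" "$(derive_remote_uri "$2" "$3")"
}
```

An rc1 script could then pass these pairs to `flux kvs put` while pointed at the enclosing instance; how the real script locates the enclosing instance is not shown here.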
Add a subcommand that lists the lwj.X.Y.Z.flux.remote_uri value, if available.

Usage is similar to flux wreck ls, e.g.

    flux wreck uri [-n, --max=count] [-b, --bare] [JOBIDS...]

If called with --bare, only the URI for exactly one job is listed, by itself for easy parsing. It is an error if there is not exactly one job specified, or if the job is not a Flux instance.

If called without --bare, each job is listed with minimal state information. The FLUX_URI field is left blank for jobs that are not Flux instances.

    $ flux wreck uri
        ID NTASKS STATE      FLUX_URI                        COMMAND
         1      1 exited                                     hostname
         2      1 exited                                     hostname
         3      1 exited     ssh://jimbo//tmp/flux-qCbPz5    flux
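The `--bare` form is meant for scripting. A minimal sketch of a wrapper that might use it follows; the function name `run_in_subinstance` is hypothetical, and it assumes that setting `FLUX_URI` in a command's environment is enough to direct that command at the sub-instance.

```shell
# Hypothetical wrapper: run a command against the Flux instance hosted by
# the given job ID, using the URI printed by "flux wreck uri --bare".
# Note: jobid and uri are plain (global) shell variables in this sketch.
run_in_subinstance() {
    jobid=$1
    shift
    # --bare prints only the URI and fails if the job is not a Flux instance
    uri=$(flux wreck uri --bare "$jobid") || return 1
    FLUX_URI=$uri "$@"
}
```

For example, `run_in_subinstance 3 flux wreck ls` would list the jobs of the instance running as job 3, assuming that instance is reachable over ssh.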
Add a few tests to ensure that Flux running Flux results in KVS content that can be found by "flux wreck uri". N.B. this sharness script needed to drop the "wreck" personality and use the default full personality in order to execute the rc1.d script that updates the enclosing KVS.
Thanks! |
This looks really great, looking forward to testing it, thanks! |
This PR is an attempt to address #1422 by writing an instance's rank 0 FLUX_URI and a derived ssh:// URI to `lwj.X.X.X.flux.local-uri` and `lwj.X.X.X.flux.remote-uri` in the enclosing instance.

I need to provide a test but thought I'd post early to get any feedback from @trws.