Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

store rank 0 instance URIs in enclosing instance KVS #1429

Merged
merged 5 commits into from
Apr 6, 2018

Conversation

garlick
Copy link
Member

@garlick garlick commented Apr 5, 2018

This PR is an attempt to address #1422 by writing an instance's rank 0 FLUX_URI and a derived ssh:// URI to lwj.X.X.X.flux.local-uri and lwj.X.X.X.flux.remote-uri in the enclosing instance.

I need to provide a test but thought I'd post early to get any feedback from @trws.

@coveralls
Copy link

coveralls commented Apr 5, 2018

Coverage Status

Coverage decreased (-0.02%) to 79.016% when pulling d4d1090 on garlick:ssh_uri into 95f8d8f on flux-framework:master.

@codecov-io
Copy link

codecov-io commented Apr 5, 2018

Codecov Report

Merging #1429 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1429      +/-   ##
==========================================
- Coverage   78.73%   78.71%   -0.02%     
==========================================
  Files         163      163              
  Lines       30113    30114       +1     
==========================================
- Hits        23708    23704       -4     
- Misses       6405     6410       +5
Impacted Files Coverage Δ
src/modules/wreck/wrexecd.c 75.7% <100%> (+0.02%) ⬆️
src/bindings/lua/flux-lua.c 80.97% <0%> (-0.79%) ⬇️
src/broker/overlay.c 74.45% <0%> (-0.32%) ⬇️
src/modules/connector-local/local.c 74.38% <0%> (-0.21%) ⬇️
src/broker/module.c 84.07% <0%> (+0.27%) ⬆️
src/common/libflux/message.c 81.6% <0%> (+0.58%) ⬆️

@trws
Copy link
Member

trws commented Apr 5, 2018 via email

@garlick
Copy link
Member Author

garlick commented Apr 5, 2018

Great. I'll add a flux wreck uri command to make it easier to access.

@grondo
Copy link
Contributor

grondo commented Apr 5, 2018

Nice! Is there a requirement here to add to our requirements/use case doc?

@trws
Copy link
Member

trws commented Apr 5, 2018 via email

@garlick
Copy link
Member Author

garlick commented Apr 5, 2018

OK, subcommand looks like this:

$ flux wreck uri
    ID NTASKS STATE     FLUX_URI                                 COMMAND
     1      1 exited                                             hostname
     2      1 exited                                             hostname
     3      1 exited    ssh://jimbo//tmp/flux-qCbPz5             flux

@trws
Copy link
Member

trws commented Apr 5, 2018 via email

@garlick
Copy link
Member Author

garlick commented Apr 5, 2018

Yes, good idea.

@garlick garlick force-pushed the ssh_uri branch 2 times, most recently from 89eb4c1 to c8f627e Compare April 5, 2018 20:56
@garlick
Copy link
Member Author

garlick commented Apr 5, 2018

OK, I added a --bare option to print just the URI for a single job.

Added a couple tests also. This is probably close to ready if travis agrees.

Nice! Is there a requirement here to add to our requirements/use case doc?

I've made a note to return to this after the splash fire drill - probably! And it may be that we'll want something that works for deeper recursion long term...

@garlick
Copy link
Member Author

garlick commented Apr 6, 2018

Dropped the lua f:kvs_rawget() function, which had no tests of its own, because it's not really necessary. I just put the values in the kvs with flux kvs put --json instead and the the old f:kvs_get() works and this PR becomes smaller.

@garlick
Copy link
Member Author

garlick commented Apr 6, 2018

Rebased.

garlick added 5 commits April 6, 2018 14:28
Problem: a child instance of Flux doesn't know the path
to its KVS directory in the enclosing instance.

Set the path in the env variable FLUX_JOB_KVSPATH for each
job.  For example, in the current exec implementation, this
might have the value "lwj.0.0.1".
Problem: it is inconvenient to determine the URI to
use to connect to a sub-instance.

Write URIs to the job's KVS directory in the enclosing
instance:

lwj.X.X.X.flux.local_uri=local://...
lwj.X.X.X.flux.remote_uri=ssh://...

Fixes flux-framework#1422
Add a subcommand that lists the lwj.X.Y.Z.flux.remote_uri
value, if available.

Usage is similar to flux wreck ls, e.g.
  flux wreck uri [-n, --max=count] [b, --bare] [JOBIDS...]

If called with --bare, only the URI for exactly one job is
listed, by itself for easy parsing.  It is an error if there
is not exactly one job specified, or if the job is not a
Flux instance.

If called without --bare, each job is listed with minimal
state information.  The FLUX_URI fields is left blank for
jobs that are not Flux instances.

$ flux wreck uri
ID NTASKS STATE     FLUX_URI                                 COMMAND
 1      1 exited                                             hostname
 2      1 exited                                             hostname
 3      1 exited    ssh://jimbo//tmp/flux-qCbPz5             flux
Add a few tests to ensure that Flux running Flux results
in KVS content that can be found by "flux wreck uri".

N.B. this sharness script needed to drop the "wreck"
personality and use the default full personality
in order to execute the rc1.d script that updates
the enclosing KVS.
@chu11 chu11 merged commit 0e0f83f into flux-framework:master Apr 6, 2018
@garlick
Copy link
Member Author

garlick commented Apr 6, 2018

Thanks!

@garlick garlick deleted the ssh_uri branch April 7, 2018 12:40
@trws
Copy link
Member

trws commented Apr 8, 2018

This looks really great, looking forward to testing it, thanks!

@grondo grondo mentioned this pull request May 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants