
create an "atlas" or "cheat sheet" for accessing workflow-relevant information #92

Open
SteVwonder opened this issue Feb 3, 2021 · 2 comments

Comments

@SteVwonder
Member

We have a CLI for poking at job-specific information stored within the KVS (flux job info), but @Larofeticus pointed out that documentation about where to find job-specific information within the KVS would also be helpful. To start, maybe we can make a table of workflow/job-relevant information that lists where each item is stored in the KVS, whether and where it is available in the environment, and perhaps where you used to be able to find it in Slurm and similar systems.

Data that comes to mind:

  • jobid
  • job size (in nodes)
  • rank id
  • hostnames for all nodes in job
  • job endtime
  • cpu mask (i.e., allocated cores)
  • job account/bank
  • job working directory
  • job stdout/stderr location (either file or KVS stream)
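As a starting point for the environment column of such a table, here is a minimal sketch covering only the first three items. It assumes the job environment exposes `FLUX_JOB_ID`, `FLUX_JOB_NNODES`, and `FLUX_TASK_RANK`, and that the Slurm counterparts are `SLURM_JOB_ID`, `SLURM_JOB_NUM_NODES`, and `SLURM_PROCID`; please verify these names against the installed versions before relying on them. The `ENV_ATLAS` table and `lookup` helper are hypothetical names used only for illustration.

```python
import os

# (item, assumed Flux variable, assumed Slurm variable).
# The variable names are assumptions to be checked against the docs.
ENV_ATLAS = [
    ("jobid", "FLUX_JOB_ID", "SLURM_JOB_ID"),
    ("job size (in nodes)", "FLUX_JOB_NNODES", "SLURM_JOB_NUM_NODES"),
    ("rank id", "FLUX_TASK_RANK", "SLURM_PROCID"),
]


def lookup(env=None):
    """Return {item: (variable, value)} for every item found in env.

    The Flux variable is tried first, then the Slurm one; items with
    neither variable set are left out of the result.
    """
    env = os.environ if env is None else env
    found = {}
    for item, flux_var, slurm_var in ENV_ATLAS:
        for var in (flux_var, slurm_var):
            if var in env:
                found[item] = (var, env[var])
                break
    return found


if __name__ == "__main__":
    for item, (var, value) in lookup().items():
        print(f"{item}: {value} (from {var})")
```

The remaining items (hostnames, end time, cpu mask, bank, working directory, output location) are left out on purpose, since where they live is exactly what this issue is meant to document.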
@wihobbs
Member

wihobbs commented Jul 7, 2022

This was discussed today at our Flux weekly developers meeting.

@garlick mentioned that several of these items are documented in Flux RFCs. Specifically, RFCs 16, 18, and 21 seem relevant. Based on my first read I will try to summarize how each one relates to this issue:

RFC 16 - More generally describes the KVS event log and the content produced by it.
RFC 18 - Provides some keywords for the KVS log, such as timestamp, name, and context, and what they do.
RFC 21 - Describes the states that get logged to KVS, with example KVS entries for each logged change in state.
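To make the RFC 18 keywords concrete, here is a rough sketch of reading an event log. It assumes each entry is a single JSON object on its own line with `timestamp` and `name` keys and an optional `context` object. The sample log is made up for illustration (the context contents in particular are not taken from a real job), and `parse_eventlog` and `elapsed` are hypothetical helper names.

```python
import json


def parse_eventlog(text):
    """Parse newline-delimited JSON entries into (timestamp, name, context)."""
    events = []
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        entry = json.loads(line)
        # context is optional, so default to an empty dict
        events.append((entry["timestamp"], entry["name"], entry.get("context", {})))
    return events


def elapsed(events, first, second):
    """Seconds between the first occurrence of two named events."""
    stamps = {}
    for timestamp, name, _context in events:
        stamps.setdefault(name, timestamp)
    return stamps[second] - stamps[first]


# Made-up sample log; timestamps and context values are illustrative only.
SAMPLE = "\n".join([
    '{"timestamp": 1.0, "name": "submit", "context": {"userid": 1000}}',
    '{"timestamp": 2.5, "name": "start"}',
    '{"timestamp": 9.0, "name": "finish", "context": {"status": 0}}',
])

if __name__ == "__main__":
    events = parse_eventlog(SAMPLE)
    print([name for _timestamp, name, _context in events])
    print(elapsed(events, "start", "finish"))
```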

@chu11 suggested that we have some flux-specific commands to get at this information, specifically:

  • flux job info (described above in initial comment)
  • flux job attach for getting stdout/stderr location.

Just starting a thread here to document what we learn about the KVS as more information comes out.

@chu11
Member

chu11 commented Jul 7, 2022

Just to add, flux jobs is the more user-friendly tool for getting tons of job information (see the manpage for the list of things you can retrieve). flux job list or flux job list-ids <JOBID> can also be used to get all of the raw job list data as a JSON dump.
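For scripting against that raw dump, something along these lines might work. It is only a sketch: it assumes `flux job list-ids <JOBID>` prints one JSON object per line, and it makes no claim about which keys those objects contain. `job_list_records` is a hypothetical helper, and the `run` parameter exists only so the command can be stubbed out on a machine without Flux.

```python
import json
import subprocess


def job_list_records(jobid, run=subprocess.check_output):
    """Run `flux job list-ids <jobid>` and parse each output line as JSON.

    Assumes one JSON object per non-empty line of output.
    """
    out = run(["flux", "job", "list-ids", str(jobid)])
    text = out.decode() if isinstance(out, bytes) else out
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```

Calling `job_list_records(jobid)` inside a Flux instance would return a list of dicts to inspect; which fields to expect is best taken from the flux jobs manpage rather than guessed here.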
