job-archive: gap analysis for fields needed to create job usage reports #3136
Comments
I think an assumption of Flux is that there is a global userid space across the center. Hopefully the system on which the flux-accounting scripts are run has a fully populated passwd file, in which case you can use the same code as
import pwd

def get_username(userid):
    # Fall back to the numeric ID if it has no passwd entry
    try:
        return pwd.getpwuid(userid).pw_name
    except KeyError:
        return str(userid)
We'll need to determine a spot where the group/account gets set in jobspec. Then you can pull it from there, or it may be important enough to add to We may want to open an issue on this in the rfc repo.
Yes, and it looks like job-archive also stores the "ranks" which Also, I wonder if instead a |
Let me go ahead and cc @ryanday36 on this thread so we can get his input on this as well!
Thank you for pointing this out. I can fetch a username by passing in the
try:
    userid = pwd.getpwnam(username).pw_uid
except KeyError:
    return str(username)
Accounting by total number of cores works fine as long as users get charged for all of the cores when we give them a whole node. I.e., in slurm, a user can run something like 'srun -n1 ...' and only ask for 1 task on one core, but, on most of our clusters, we're giving them the whole node anyway, so we need to charge them for all 36 (or whatever) cores. Something else that you probably don't need to worry about right now, but might want to think about in terms of designing in some flexibility, is accounting for GPUs or other resources. Some folks from Sandia were recently asking about whether we did anything to account for users requesting GPUs in the context of Slurm. Since we're scheduling our GPU clusters by node currently and don't have any plans to change that, it's not something that we're worried about immediately, but it's not hard to imagine some sort of Sierra-like cluster that we allocate at a sub-node level and want to charge jobs more if they request a GPU. |
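To make the whole-node charging rule concrete, here is a minimal Python sketch. The function names, the `node_exclusive` flag, and the 36-core node size are illustrative assumptions, not part of flux-accounting; the point is only that a job on a node-scheduled cluster is charged for every core on its nodes, whatever it requested.

```python
def charged_cores(requested_cores, nnodes, cores_per_node, node_exclusive):
    """Return the number of cores a job should be charged for.

    Hypothetical helper: on node-scheduled clusters the user is charged
    for every core on each allocated node, regardless of the request.
    """
    if node_exclusive:
        return nnodes * cores_per_node
    return requested_cores


def core_hours(cores, t_run, t_inactive):
    """Core-hours between two timestamps given in seconds."""
    return cores * (t_inactive - t_run) / 3600.0


# A 1-task job on a whole 36-core node that ran for two hours
cores = charged_cores(requested_cores=1, nnodes=1,
                      cores_per_node=36, node_exclusive=True)
usage = core_hours(cores, t_run=0, t_inactive=7200)
```

Keeping the charging rule in one function leaves room to add a per-GPU weight later without touching the callers.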
Flux records all assigned resources in the resource set R, so accounting for all resources should be possible. |
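As a rough sketch of how the node, core, and GPU counts could be pulled out of R, the Python below assumes R is stored as a JSON string in the version 1 layout (an `execution.R_lite` list whose entries carry a `rank` idset and a `children` map of idsets). The helper names are hypothetical, and the idset parser is deliberately simplified: it handles only comma-separated ids and ranges.

```python
import json


def idset_count(idset):
    """Count the ids in a simple idset string such as "0-3,7"."""
    count = 0
    for part in str(idset).strip("[]").split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            count += int(hi) - int(lo) + 1
        else:
            count += 1
    return count


def summarize_R(R_json):
    """Return totals such as {"nnodes": 2, "core": 72, "gpu": 8}."""
    R = json.loads(R_json)
    totals = {"nnodes": 0}
    for entry in R["execution"]["R_lite"]:
        # Each entry describes a set of ranks (nodes) with identical children
        n = idset_count(entry["rank"])
        totals["nnodes"] += n
        for rtype, ids in entry.get("children", {}).items():
            totals[rtype] = totals.get(rtype, 0) + n * idset_count(ids)
    return totals


# Made-up example R: two nodes, each with 36 cores and 4 GPUs
example_R = json.dumps({
    "version": 1,
    "execution": {
        "R_lite": [
            {"rank": "0-1", "children": {"core": "0-35", "gpu": "0-3"}}
        ]
    },
})
summary = summarize_R(example_R)
```

Because the totals are keyed by resource type, a new child resource type appearing in R would be counted without code changes.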
We could also add any number of |
Thanks @chu11! Yeah, having an
|
Yes, Fluxion will have hostnames in its |
Now that flux-core's interfaces to job-list are more evolved and documented, would it make sense for flux-accounting to take ownership of the job archive db, either by folding its tables in with the accounting db or by starting it as a separate, standalone systemd service? It feels to me like that puts the database schema in a place where the stakeholders are going to feel more empowered to adapt it to their needs. If, as discussed in #4336, we end up with another database implementation in flux-core for historical job information, then we can think about whether it makes sense for flux-accounting to switch over to that, but it wouldn't be required. Edit: I guess I didn't mention that the archive db gets its job information from job-list. I was just thinking that the fact that job-list now presents as an external interface with a relatively stable API makes it seem like the better interface between core and accounting, rather than using the archive database schema as the interface.
Just a random side comment about something I did in #4336 relevant to this thread. I dump all data from |
I like this idea. This would allow both services to "fork" from each other for their specific needs and there's less need to worry about breaking one another with changes.
I would actually be inclined to say that |
This is an interesting thought! Does EDIT: i.e. here's a query I currently use for the
select_stmt = (
    "SELECT userid,id,t_submit,t_run,t_inactive,ranks,R,jobspec FROM jobs "
)
I think you'd still want the archive tables since the data available from job-list gets pruned from time to time. |
But it could be a python script in flux-accounting that pulls data periodically from job-list and stores it. So the above query returns all jobs since the beginning of time. Is that going to become a problem? Should there be a window outside of which jobs should be ignored? If so then if you own the archive tables, you could prune the database... |
Oh, sorry, I should have been clearer. The query gets refined to only look for jobs after a certain timestamp because there is a window outside of which jobs are no longer considered a part of a user/bank's usage; I believe it can be adjusted by an administrator, but the default cutoff is one month. Once a job is older than the cutoff, it no longer affects a user/bank's usage and fair share, so I don't think the flux-accounting code that calculates a user/bank's total job usage looks for those old jobs any more.
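A windowed version of that query might look like the sketch below. It reuses the column list from the query quoted earlier in the thread. The function name, the use of `t_inactive` as the cutoff column, and the 30-day default are assumptions for illustration, not the actual flux-accounting code.

```python
import sqlite3
import time

THIRTY_DAYS = 30 * 24 * 3600  # assumed stand-in for the "one month" default


def fetch_recent_jobs(conn, cutoff_seconds=THIRTY_DAYS, now=None):
    """Return only jobs that became inactive within the usage window."""
    now = time.time() if now is None else now
    select_stmt = (
        "SELECT userid,id,t_submit,t_run,t_inactive,ranks,R,jobspec "
        "FROM jobs WHERE t_inactive > ?"
    )
    # A bound parameter keeps the timestamp out of the SQL string itself
    return conn.execute(select_stmt, (now - cutoff_seconds,)).fetchall()
```

Passing `now` explicitly makes the window easy to test and lets a report be regenerated for a past date.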
Nice, then if you own the db, you can nix those old records (after backing them up to csv for @ryanday36 of course). Right now we do not ever prune the archive db 😱 Edit: sorry, I didn't mean to oversimplify. There are reporting considerations too that would influence that decision.
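If pruning does turn out to be acceptable for reporting, a backup-then-delete step could be sketched roughly as follows. The function name and CSV layout are hypothetical; the table and column names follow the query quoted earlier in the thread.

```python
import csv
import sqlite3

COLUMNS = ["userid", "id", "t_submit", "t_run", "t_inactive",
           "ranks", "R", "jobspec"]


def archive_and_prune(conn, cutoff_ts, csv_path):
    """Write jobs at or before cutoff_ts to a CSV file, then delete them.

    Returns the number of rows archived.
    """
    rows = conn.execute(
        "SELECT " + ",".join(COLUMNS) + " FROM jobs WHERE t_inactive <= ?",
        (cutoff_ts,),
    ).fetchall()
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(COLUMNS)
        writer.writerows(rows)
    # Delete only after the backup file has been written and closed
    conn.execute("DELETE FROM jobs WHERE t_inactive <= ?", (cutoff_ts,))
    conn.commit()
    return len(rows)
```

Writing the CSV before the `DELETE` means a failure during the backup leaves the database untouched.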
An aside on this discussion, if |
They could install flux accounting. |
Sorry, haha, I think I'm a little confused on the end-goal responsibility of flux-accounting and I want to make sure I understand you correctly. Is the consideration here whether flux-accounting should become responsible for implementing its own job-archive DB instead of fetching job records from flux-core's job-archive module, where the current job-archive DB is located?
Yep that's what I was throwing out there to see if it stuck to the wall :-) |
Now that flux-accounting has its own job archive database, seems like we can close this. |
I am in the process of developing a front-end for the job-archive (as required by stakeholders like Ryan Day to create job usage reports). This involves building a JobRecord class which holds job data.

I have been able to grab most of these fields from the job-archive, but a couple either are not present, or I have a question on how to fetch that data -

user - Currently I can grab the user ID from the job-archive, but not a username. Is it planned to include a username field along with a userid? Is there a way to do a lookup to find a username given a user ID?

group (project probably fits in here too) - these are actually the account (and in the future, wckey) fields from flux-accounting. From speaking with Ryan, I believe this is specified when users are submitting jobs; if not specified, it defaults to some bank. Realistically, probably not a high priority item at this moment, but something we will need in the near future.

nnodes - Is there a way to get nnodes from the R or jobspec columns from job-archive? Is it the rank field in R?