Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

python: split flux.job module into multiple files #3162

Merged
merged 2 commits into from
Aug 27, 2020

Conversation

grondo
Copy link
Contributor

@grondo grondo commented Aug 24, 2020

At this point, this work is a request for comments. I don't know if there are drawbacks to splitting a single python module into multiple files -- will this slow down import flux.job significantly?

This does prepare the way for an easier addition of the flux.job.JobInfo class (and related functions) to the bindings though.

@grondo grondo force-pushed the python-split-job-module branch 3 times, most recently from eafe664 to f0a67dc Compare August 25, 2020 01:53
@grondo
Copy link
Contributor Author

grondo commented Aug 25, 2020

Ok, sorry for the noise. It took me a few tries to get the Makefile right. The recursive Makefile.am files under src/bindings/python/flux are now collapsed into a single Makefile.am using nobase_fluxpy_PYTHON, obviating the need for definition of "extra" automake dirs like fluxpycoredir fluxpyjobdir. Also, all python sources can now go under the same rule. Hopefully this simplifies future efforts as well.

@grondo
Copy link
Contributor Author

grondo commented Aug 25, 2020

Restarted two builders that hit an error in the initial clone from github.

@garlick
Copy link
Member

garlick commented Aug 25, 2020

I had not encountered nobase before. Nice cleanup!

@grondo grondo requested review from SteVwonder and trws August 25, 2020 14:47
@grondo
Copy link
Contributor Author

grondo commented Aug 25, 2020

Ok, I think this is largely working. No code changes, just reorganization, is probably a good place to stop for this PR, so probably one of the resident Python experts should review this. (@SteVwonder and/or @trws)

I'm a little worried this change could have an adverse performance impact.

Copy link
Member

@trws trws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, and should have pretty small performance impact. I'd say zero, but packaged as separate files it will have a small impact at load time unless the directory is zipped as part of the install process, then it would be basically zero.

My only thought is that the wrapper file, or the RAW in it might be good to be renamed as _RAW or _wrapper.py or both. I realize I started the whole RAW thing 🤦 but making it something that doesn't get loaded from a * import and at least conventionally signals it's internal would be nice.

The actual re-homing looks really nice btw. 👍

@SteVwonder
Copy link
Member

Awesome! This looks great to me! job.py was getting unwieldy, and this really cleans things up.

In terms of performance, there will be an added cost since the python interpreter will need to find and import 9 more files than it used to, but my guess is that will be completely in the noise except in pathological cases (i.e., having a Spack environment loaded and running across 1000s of nodes). For all of the other imports that are now duplicated across the new files (e.g., import json, import yaml, import flux), python caches imported modules, so they should essentially be free.

For the pathological cases, I suspect Spindle will be more than enough to handle those cases.

@grondo
Copy link
Contributor Author

grondo commented Aug 25, 2020

My only thought is that the wrapper file, or the RAW in it might be good to be renamed as _RAW or _wrapper.py or both.

Great suggestion! I was wondering exactly how I should do that. I'm fine renaming both.

@trws
Copy link
Member

trws commented Aug 25, 2020

In terms of performance, there will be an added cost since the python interpreter will need to find and import 9 more files than it used to, but my guess is that will be completely in the noise except in pathological cases

Yeah, this should be right. One thing I mentioned as an aside above but found really, really helped when doing some of the initial mummi stuff was setting up the big directory of modules in a zip file: import modules from zip archives.

The interpreter can treat a zip as a directory, and it caches the whole file along with its internal directory listing on initial load then does lookups in it rather than loading all from disk. So if it became an issue a bit of packaging rework could probably squash it without too much trouble.

@grondo grondo force-pushed the python-split-job-module branch from f0a67dc to 43c4563 Compare August 25, 2020 18:22
@grondo
Copy link
Contributor Author

grondo commented Aug 25, 2020

Thanks @trws and @SteVwonder! I've force-pushed a change to rename flux.job.wrapper to flux.job._wrapper and also RAW to _RAW. The previous users of RAW now use from flux.job._wrapper import _RAW as RAW.

@grondo grondo changed the title WIP: python: split flux.job module into multiple files python: split flux.job module into multiple files Aug 25, 2020
@grondo grondo force-pushed the python-split-job-module branch from 43c4563 to 5f75ad3 Compare August 25, 2020 19:19
@digitalresistor
Copy link
Contributor

digitalresistor commented Aug 26, 2020 via email

@grondo
Copy link
Contributor Author

grondo commented Aug 26, 2020 via email

@grondo
Copy link
Contributor Author

grondo commented Aug 27, 2020

It looks like this one was approved so I'm setting MWP since I have other work requiring this PR.

@grondo
Copy link
Contributor Author

grondo commented Aug 27, 2020

Actually this won't pass travis until #3169 is merged.

Combine the Makefile.am files in src/bindings/python/flux and
src/bindings/python/flux/core into a single Makefile.am. Use nobase_
prefix so that all Python source files can be listed in the same
rule. The special `fluxpycoredir` automake variable is no longer
needed, so it can be dropped.

This should make the bindings Makefiles a bit easier to maintain
by reducing duplication and the number of files to edit when new
Python source files are added.
@grondo grondo force-pushed the python-split-job-module branch from 5f75ad3 to 548ded9 Compare August 27, 2020 20:18
Problem: The Python 'flux.job' module is large (>1200 lines)
and thus getting difficult to read and/or modify. This module will
only get larger since there will certainly be more classes and
methods related to Flux jobs added to the Python API.

Split `src/bindings/python/flux/job.py` into multiple files, while
attempting to keep related code together without making any one component
too large.

For backwards compatibility, recreate the previous `flux.job` namespace
in the __init__.py in the new job directory.
@grondo grondo force-pushed the python-split-job-module branch from 548ded9 to 6cc9dda Compare August 27, 2020 20:45
@codecov-commenter
Copy link

Codecov Report

Merging #3162 into master will increase coverage by 0.03%.
The diff coverage is 93.80%.

@@            Coverage Diff             @@
##           master    #3162      +/-   ##
==========================================
+ Coverage   81.21%   81.24%   +0.03%     
==========================================
  Files         286      294       +8     
  Lines       44670    44366     -304     
==========================================
- Hits        36277    36045     -232     
+ Misses       8393     8321      -72     
Impacted Files Coverage Δ
src/bindings/python/flux/job/JobID.py 85.36% <85.36%> (ø)
src/bindings/python/flux/job/wait.py 92.59% <92.59%> (ø)
src/bindings/python/flux/job/event.py 93.50% <93.50%> (ø)
src/bindings/python/flux/job/submit.py 93.75% <93.75%> (ø)
src/bindings/python/flux/job/__init__.py 100.00% <100.00%> (ø)
src/bindings/python/flux/job/_wrapper.py 100.00% <100.00%> (ø)
src/bindings/python/flux/job/kill.py 100.00% <100.00%> (ø)
src/bindings/python/flux/job/kvs.py 100.00% <100.00%> (ø)
src/bindings/python/flux/job/list.py 100.00% <100.00%> (ø)
src/modules/job-info/watch.c 70.98% <0.00%> (-1.56%) ⬇️
... and 12 more

@mergify mergify bot merged commit e31fab0 into flux-framework:master Aug 27, 2020
@grondo grondo deleted the python-split-job-module branch August 28, 2020 01:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants