You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Currently, SeisFlows is built upon sys.modules, where each sub-class (e.g., seisflows.workflow.inversion) is imported and then stored in sys.modules to be globally accessible to all other modules. Parameters and paths are also read in and shared in this way. Some pseudocode example of how this works looks like:
With this implementation, you MUST import file_1.py BEFORE importing file_2.py.
The Problem
Although this approach makes SeisFlows flexible in how it passes information between modules, it simultaneously makes it difficult to use SeisFlows in a Pythonic manner. Some key issues include:
Unable to import SeisFlows modules without first establishing the entire workflow. This makes it impossible to use individual pieces of SeisFlows without first instantiating the entire package. In other words, you cannot run the following code:
fromseisflows.optimize.LBFGSimportLBFGS# this will throw a KeyError
Modules must be imported in a specific order so that they may access one another through sys.modules. This forces a mandatory set up procedure each time the code is to be used, and leads to undesirable in-function runtime import statements for modules which need to access each other out of order:
seisflows.preprocess.pyatoa.Pyatoa
defsetup_event_pyaflowa(self, source_name=None):
# Late import because preprocess is loaded before optimize,# Optimize required to know which iteration/step_count we are atsolver=sys.modules["seisflows_solver"]
optimize=sys.modules["seisflows_optimize"]
iteration=optimize.iterifsource_nameisNone:
source_name=solver.source_names[0]
...
Unit testing becomes a problem since each module may be reliant on others. E.g., you cannot test the optimization library alone, as it is coupled to the solver module (related issue: Improve unit testing #12)
Goals
The proposed solution would decouple the package from the sys.modules implementation in an effort to improve the modularity of SeisFlows. The end goals of this would be that:
Each module (solver, optimize etc.) should be importable and usable on its own, without any reliance on the other modules.
fromseisflows.optimize.LBFGSimportLBFGSasoptimize# we want this to workdirection=optimize.compute_direction(gradient=g_new)
Modules should communicate to one another via function call I/O, rather than globally accessible imports.
This will involve a large code refactor of SeisFlows that will affect almost every file and effectively re-design the core code implementation. Similarly I foresee the public API changing on how SeisFlows will be implemented.
Drive this refactor with tests (TDD). Each rewritten component should be unit tested.
Remove sys.modules implementation. Any previous reliance on globally accessible modules should be rewritten as parameter passing via function inputs/outputs
Concentrate parameter, path and module setup in a main workflow script controlled by the User.
Remove global parameter accessibility in favor of initializing parameters via init() and a separate check() function which checks the validity of input parameters.
Decouple the hardcoded directory structure. Expose paths explicitly in a main workflow script, rather than having them be implicitly defined throughout the code.
Remove the complicated start up procedure that occurs during seisflows submit. Expose any simplified start up in the main workflow script.
Ideally I would like to have a main workflow script that looked something like this:
importosfromseisflows.configimportStatus# Used to keep track of processesfromseisflows.configimportcustom_importfromseisflows.tools.wrappersimportloadyamlif__name__=="__main__":
workdir=os.getcwd()
paths, parameters=loadyaml("parameters.yaml")
statusfile=Status(path="status.txt")
# Standardized 'setup' chunksystem=custom_import(parameters.system)(**parameters)
workflow=custom_import(parameters.workflow)(**parameters)
solver=custom_import(parameters.solver)(**parameters)
optimize=custom_import(parameters.optimize)(**parameters)
preprocess=custom_import(parameters.preprocess)(**parameters)
postprocess=custom_import(parameters.postprocess)(**parameters)
# Check that the input parameters are set correctlyformodulein [system, workflow, solver, optimize, preprocess, postprocess]:
module.check()
foriterationinrange(parameters.begin, parameters.end):
system.run(solver.generate_data, path=paths.solver, status=statusfile)
system.run(solver.generate_initial_model, paths=path.model_init, status=statusfile)
system.run(solver.forward_simulation, paths=path.gradient, status=statusfile)
gradient_new=solver.load_model(path=path.gradient)
system.run(postprocess.smooth_gradient, gradient=gradient, status=statusfile)
...
The text was updated successfully, but these errors were encountered:
Background
Currently, SeisFlows is built upon
sys.modules
, where each sub-class (e.g., seisflows.workflow.inversion) is imported and then stored insys.modules
to be globally accessible to all other modules. Parameters and paths are also read in and shared in this way. Some pseudocode example of how this works looks like:file_1.py
file_2.py
With this implementation, you MUST import file_1.py BEFORE importing file_2.py.
The Problem
Although this approach makes SeisFlows flexible in how it passes information between modules, it simultaneously makes it difficult to use SeisFlows in a Pythonic manner. Some key issues include:
seisflows.preprocess.pyatoa.Pyatoa
Goals
The proposed solution would decouple the package from the sys.modules implementation in an effort to improve the modularity of SeisFlows. The end goals of this would be that:
main.py
Plan of Attack
This will involve a large code refactor of SeisFlows that will affect almost every file and effectively re-design the core code implementation. Similarly I foresee the public API changing on how SeisFlows will be implemented.
seisflows submit
. Expose any simplified start up in the main workflow script.Ideally I would like to have a main workflow script that looked something like this:
The text was updated successfully, but these errors were encountered: