Proposal
We can use the package loader in Jinja2 if we require pipelines to be valid Python packages. This is a big change to pipelines, but it opens up the possibility of cross-pipeline imports.
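To make the cross-pipeline import idea concrete, here is a rough sketch. It uses DictLoader as a stand-in for installed pipeline packages so it runs without installing anything; the pipeline names ("common", "phoenix") and the macro are made up for illustration. With real packages, each entry would presumably be a PackageLoader instead.

```python
from jinja2 import DictLoader, Environment, PrefixLoader

# Stand-ins for two installed pipelines. Under the proposal, each value would
# instead be something like PackageLoader("some_pipeline_package", "templates")
# (package name hypothetical).
loaders = {
    "common": DictLoader({
        "macros.jst": "{% macro task(name) %}- name: {{ name }}{% endmacro %}",
    }),
    "phoenix": DictLoader({
        "template.jst": (
            '{% import "common/macros.jst" as common %}'
            '{{ common.task("align") }}'
        ),
    }),
}

# PrefixLoader routes "prefix/template" lookups to the matching loader, so a
# template in one pipeline can import from another by name.
env = Environment(loader=PrefixLoader(loaders))
rendered = env.get_template("phoenix/template.jst").render()
print(rendered)
```

The prefix-per-pipeline layout is only one option; a ChoiceLoader would also work, but it would not namespace templates by pipeline.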
Current pipeline minimal requirements:
template.jst
pipeline.yaml
Proposed pipeline requirements:
__init__.py
setup.py
templates/
template.jst
To aid in the development of pipelines, a new pipeline helper script could be developed that would create the initial boilerplate.
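A minimal sketch of what such a helper might look like, following the proposed layout listed above. The function name create_pipeline and the generated file contents are assumptions, not an existing jetstream API. The generated setup.py sets zip_safe=False, which is the setuptools flag for projects that should not be run from a zipped egg.

```python
from pathlib import Path

# Hypothetical boilerplate; a real helper would likely fill in more metadata.
SETUP_PY = '''from setuptools import setup

setup(
    name="{name}",
    version="0.1.0",
    zip_safe=False,  # templates are read from disk, so avoid zipped eggs
)
'''

INIT_PY = '"""Pipeline package created by the helper script."""\n'


def create_pipeline(root, name):
    """Create the proposed pipeline layout under root/name and return its path."""
    project = Path(root) / name
    templates = project / "templates"
    templates.mkdir(parents=True)  # also creates the project directory
    (project / "__init__.py").write_text(INIT_PY)
    (project / "setup.py").write_text(SETUP_PY.format(name=name))
    (templates / "template.jst").write_text("")
    return project
```

Calling create_pipeline(".", "jetstream-phoenix") would produce __init__.py, setup.py, and templates/template.jst inside a new jetstream-phoenix directory.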
Benefits
Pipelines can be managed with pip, which allows them to be easily distributed from GitHub or PyPI. For example:
pip install jetstream-phoenix
Pipelines can include arbitrary code in their __init__.py that can be used to set up the configuration data used for rendering the template. This adds an incredible amount of power to pipelines:
"""Standard boilerplate __init__.py created with helper script

Pipelines could be discovered and introspected with the plugin interface. After
discovery, the jinja2 loader could be configured to include all of the pipelines
that are currently installed.
"""
from pkg_resources import resource_filename

# load_yaml is not defined in this sketch; it is assumed to be provided elsewhere
manifest = load_yaml(resource_filename('jetstream-phoenix', 'pipeline.yaml'))

# Allowing customization here has potential for amazing things...
from magicpackage import download_database

db = 'temp/data.txt'
download_database(db)

data = {
    'foo': 'bar',
    'database': db,  # Always download the latest database to the project before rendering
}
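The discovery step mentioned in the docstring could be sketched with entry points. This uses the standard library's importlib.metadata rather than pkg_resources, and the group name "jetstream.pipelines" is an assumption: pipelines would have to register themselves under whatever group name is actually chosen.

```python
from importlib.metadata import entry_points


def discover_pipelines(group="jetstream.pipelines"):
    """Return {name: entry point} for every pipeline registered under group.

    The group name is hypothetical; nothing registers under it today.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry_points() supports select()
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a plain dict keyed by group
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


print(sorted(discover_pipelines()))
```

Each discovered name could then be handed to the jinja2 loader configuration, so that only installed pipelines are importable from templates.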
Drawbacks
As with all Python packages, supporting multiple versions of the same package in a single environment is problematic. Python scripts typically write "import package", not "import package@v1.0". There are some features in pkg_resources and the __require__ dunder which may allow this (I don't know much about it), but it's not easy to pull off. The typical solution to this problem is virtual environments.
Removing the ability to have several versions of a pipeline installed (without resorting to a virtual
environment) is probably not a very painful change at this point.
Keep in mind, if a true import system is made available (imports work from any installed pipeline),
there would be bigger problems to solve if multiple versions were somehow made possible. This is the
same problem as with any dependency system:
a requires x v1.0
b requires x v2.0
c requires a and b
what happens?
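The scenario above can be written out as a toy check. This is only an illustration of why the question has no clean answer, not a proposed resolver; the data structure and function name are made up.

```python
from collections import defaultdict

# name -> {dependency: pinned version, or None when any version is fine}
REQUIREMENTS = {
    "a": {"x": "1.0"},
    "b": {"x": "2.0"},
    "c": {"a": None, "b": None},
}


def find_conflicts(root, requirements):
    """Walk the dependency graph from root and report incompatible pins."""
    pins = defaultdict(set)
    seen = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        for dep, version in requirements.get(name, {}).items():
            if version is not None:
                pins[dep].add(version)
            stack.append(dep)
    # A dependency pinned to more than one version cannot be satisfied
    # in a single environment.
    return {dep: sorted(versions) for dep, versions in pins.items() if len(versions) > 1}


print(find_conflicts("c", REQUIREMENTS))  # {'x': ['1.0', '2.0']}
```

In a single Python environment only one version of x can be installed, so c cannot be satisfied without loosening one of the pins.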
How do we deal with eggs? There may be some mechanism in setup.py to indicate that pipelines
cannot be packaged into eggs.
Conclusion
It's still going to take a lot of planning, but in principle it's a small change with some massive benefits. I would appreciate any thoughts on the idea.