Adding a pipeline manager to AgriFoodPy #56

jucordero · 2023-12-14T13:18:45Z

jucordero
Dec 14, 2023
Maintainer

One of the goals of AgriFoodPy is to provide an accessible tool to model agrifood systems.
A pipeline manager can help us achieve this by closing the gap between the non-expert user who wants to test the effect of certain interventions on the food system and the low-level API we are developing.

My very basic view of the pipeline manager is as follows:

The user generates a configuration file describing the list of interventions, models, and outputs.
Each of these is described by a set of parameters, which can be set alongside the list above.
Interventions and models are applied to, and the chosen outputs extracted from, a set of provided data.
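
As a rough illustration of the configuration idea above, here is a minimal sketch that reads an ordered list of steps and their parameters from a JSON string. The file layout, the key names (`pipeline`, `parameters`, `outputs`) and the step names are all assumptions made up for this example; none of them exist in AgriFoodPy, and the real format could just as well be INI or YAML.

```python
import json

# Hypothetical configuration: every key, step name and value below is
# invented for illustration and is not part of AgriFoodPy.
CONFIG_TEXT = """
{
  "pipeline": ["scale_food_item", "total_emissions"],
  "parameters": {
    "scale_food_item": {"item": "beef", "factor": 0.5},
    "total_emissions": {}
  },
  "outputs": ["emissions_total"]
}
"""


def load_steps(config_text):
    """Parse the configuration into an ordered list of (name, params) pairs.

    Returns a tuple (steps, outputs), where outputs lists the keys the user
    wants to extract at the end of the run.
    """
    config = json.loads(config_text)
    params = config.get("parameters", {})
    # Keep the order given by the user; steps without parameters get {}.
    steps = [(name, params.get(name, {})) for name in config["pipeline"]]
    return steps, config.get("outputs", [])


steps, outputs = load_steps(CONFIG_TEXT)
for name, kwargs in steps:
    print(name, kwargs)
print("requested outputs:", outputs)
```

Keeping the step list and the parameters in the same file means a non-expert user only has to edit one place to change the interventions or their settings.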

A known example of how this could work is CosmoSIS, which is used in astrophysics as a parameter inference tool. It connects a sampler with a series of user-selected modules, which run in sequence and return a likelihood to the sampler.
The part I would be interested in replicating is the module sequence, which, in very simple terms, works by reading data from and writing data to a central structure called the Datablock.

In AgriFoodPy, our Datablock can be a dictionary containing the different datasets each module will require, as well as the metrics and outputs of each module or model. These outputs can be used as intermediate input data for the next modules in the pipeline, or can be extracted at the end to be analysed, plotted, etc.
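
To make the dictionary-as-Datablock idea concrete, here is a minimal sketch of modules that read from and write to a shared dictionary and are run in sequence. The function names, the dictionary keys and the numbers are hypothetical placeholders, not AgriFoodPy functions or real data, and this is not how CosmoSIS itself is implemented.

```python
def scale_food_item(datablock, item, factor):
    """Hypothetical intervention: scale the consumption of one food item."""
    consumption = dict(datablock["consumption"])  # copy, leave input intact
    consumption[item] = consumption[item] * factor
    datablock["consumption"] = consumption
    return datablock


def total_emissions(datablock):
    """Hypothetical model: write a total emissions metric to the datablock."""
    consumption = datablock["consumption"]
    factors = datablock["emission_factors"]
    datablock["emissions_total"] = sum(
        consumption[item] * factors[item] for item in consumption
    )
    return datablock


def run_pipeline(datablock, steps):
    """Run each (function, kwargs) step in order on the shared datablock."""
    for function, kwargs in steps:
        datablock = function(datablock, **kwargs)
    return datablock


# Invented numbers, only to show the data flow between modules.
datablock = {
    "consumption": {"beef": 10.0, "wheat": 50.0},
    "emission_factors": {"beef": 30.0, "wheat": 1.0},
}
steps = [
    (scale_food_item, {"item": "beef", "factor": 0.5}),
    (total_emissions, {}),
]

result = run_pipeline(datablock, steps)
print(result["emissions_total"])  # 5.0 * 30.0 + 50.0 * 1.0 = 200.0
```

Each module only needs to know which keys it reads and which it writes, so intermediate results such as the scaled consumption are available to later modules and to the user at the end of the run.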

I'm not aware of any generic tool to facilitate this. As far as I know, CosmoSIS was written from scratch by Joe Zuntz.
If it follows a well-known strategy or paradigm, we could explore that instead.

Ian suggested some pipeline management tools by email some time ago:

There is a list here
https://github.com/pditommaso/awesome-pipeline

Things I have heard people mention / use:

snakemake

Other sensible things(?) from skimming the list:

pydra
taskgraph
scipipe
dagster

I will start investigating these in detail now. We can use the Dashboard as a testing canvas for pipeline execution.

JP
