
added explanatory text to the jobflow tutorial #80


Merged
merged 1 commit into from
Apr 24, 2025
219 changes: 153 additions & 66 deletions example_workflows/arithmetic/jobflow.ipynb
@@ -24,58 +24,63 @@
{
"id": "982a4fbe-7cf9-45dd-84ae-9854149db0b9",
"cell_type": "markdown",
"source": "# jobflow",
"source": [
"# jobflow"
],
"metadata": {}
},
{
"id": "e6180712-d081-45c7-ba41-fc5191f10427",
"cell_type": "markdown",
"source": "## Define workflow with jobflow",
"source": [
"## Define workflow with jobflow\n",
"\n",
"This tutorial demonstrates how to use the Python Workflow Definition (PWD) with `jobflow` and how to load the resulting workflow with `aiida` and `pyiron`.\n",
"\n",
"[`jobflow`](https://joss.theoj.org/papers/10.21105/joss.05995) was developed to simplify the development of high-throughput workflows. It uses a decorator-based approach to define “Job”s, which can be connected to form complex workflows (“Flow”s). `jobflow` is the workflow language of the workflow library [`atomate2`](https://chemrxiv.org/engage/chemrxiv/article-details/678e76a16dde43c9085c75e9), which is designed to replace [atomate](https://www.sciencedirect.com/science/article/pii/S0927025617303919), the library that was central to the development of the [Materials Project](https://pubs.aip.org/aip/apm/article/1/1/011002/119685/Commentary-The-Materials-Project-A-materials) database."
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"First, we import the `job` decorator and the `Flow` class from `jobflow`, as well as the necessary modules from the Python Workflow Definition and the example arithmetic workflow."
],
"metadata": {
"collapsed": false
},
"id": "69bedfb9ec12c092"
},
{
"id": "000bbd4a-f53c-4eea-9d85-76f0aa2ca10b",
"cell_type": "code",
"source": "from jobflow import job, Flow",
"source": [
"from jobflow import job, Flow"
],
"metadata": {
"trusted": true,
"ExecuteTime": {
"end_time": "2025-04-24T10:30:16.328511Z",
"start_time": "2025-04-24T10:30:16.309562Z"
"end_time": "2025-04-24T12:51:34.747117656Z",
"start_time": "2025-04-24T12:51:33.203979325Z"
}
},
"outputs": [
{
"ename": "ModuleNotFoundError",
"evalue": "No module named 'jobflow'",
"output_type": "error",
"traceback": [
"\u001B[31m---------------------------------------------------------------------------\u001B[39m",
"\u001B[31mModuleNotFoundError\u001B[39m Traceback (most recent call last)",
"\u001B[36mCell\u001B[39m\u001B[36m \u001B[39m\u001B[32mIn[4]\u001B[39m\u001B[32m, line 1\u001B[39m\n\u001B[32m----> \u001B[39m\u001B[32m1\u001B[39m \u001B[38;5;28;01mfrom\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[34;01mjobflow\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;28;01mimport\u001B[39;00m job, Flow\n",
"\u001B[31mModuleNotFoundError\u001B[39m: No module named 'jobflow'"
]
}
],
"execution_count": 4
"outputs": [],
"execution_count": 1
},
{
"id": "06c2bd9e-b2ac-4b88-9158-fa37331c3418",
"cell_type": "code",
"source": "from python_workflow_definition.jobflow import write_workflow_json",
"source": [
"from python_workflow_definition.jobflow import write_workflow_json"
],
"metadata": {
"trusted": true
},
"outputs": [],
"execution_count": 2
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-04-24T10:30:04.618439Z",
"start_time": "2025-04-24T10:30:04.598701Z"
}
},
"metadata": {},
"cell_type": "code",
"source": [
"from workflow import (\n",
@@ -85,7 +90,17 @@
],
"id": "f9217ce7b093b5fc",
"outputs": [],
"execution_count": 1
"execution_count": null
},
{
"cell_type": "markdown",
"source": [
"Using the `job` decorator, the functions imported from the arithmetic workflow are transformed into jobflow “Job”s. These “Job”s delay the execution of the underlying Python functions and can be chained into workflows (“Flow”s). A “Job” can return serializable outputs (e.g., a number, a dictionary, or a Pydantic model) or a so-called “Response” object, which enables dynamic workflows in which the number of nodes is not known before the workflow is executed."
],
"metadata": {
"collapsed": false
},
"id": "2639deadfae9c591"
},
{
"metadata": {
@@ -95,7 +110,9 @@
}
},
"cell_type": "code",
"source": "workflow_json_filename = \"jobflow_simple.json\"",
"source": [
"workflow_json_filename = \"jobflow_simple.json\""
],
"id": "1feba0898ee4e361",
"outputs": [],
"execution_count": 2
@@ -110,31 +127,17 @@
"get_prod_and_div = job(_get_prod_and_div)"
],
"metadata": {
"trusted": true,
"ExecuteTime": {
"end_time": "2025-04-24T10:30:05.169761Z",
"start_time": "2025-04-24T10:30:05.043635Z"
}
"trusted": true
},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'job' is not defined",
"output_type": "error",
"traceback": [
"\u001B[31m---------------------------------------------------------------------------\u001B[39m",
"\u001B[31mNameError\u001B[39m Traceback (most recent call last)",
"\u001B[36mCell\u001B[39m\u001B[36m \u001B[39m\u001B[32mIn[3]\u001B[39m\u001B[32m, line 1\u001B[39m\n\u001B[32m----> \u001B[39m\u001B[32m1\u001B[39m get_sum = \u001B[43mjob\u001B[49m(_get_sum)\n\u001B[32m 2\u001B[39m get_prod_and_div = job(_get_prod_and_div, data=[\u001B[33m\"\u001B[39m\u001B[33mprod\u001B[39m\u001B[33m\"\u001B[39m, \u001B[33m\"\u001B[39m\u001B[33mdiv\u001B[39m\u001B[33m\"\u001B[39m])\n",
"\u001B[31mNameError\u001B[39m: name 'job' is not defined"
]
}
],
"execution_count": 3
"outputs": [],
"execution_count": null
},
{
"id": "ecef1ed5-a8d3-48c3-9e01-4a40e55c1153",
"cell_type": "code",
"source": "obj = get_prod_and_div(x=1, y=2)",
"source": [
"obj = get_prod_and_div(x=1, y=2)"
],
"metadata": {
"trusted": true
},
@@ -144,7 +147,9 @@
{
"id": "2b88a30a-e26b-4802-89b7-79ca08cc0af9",
"cell_type": "code",
"source": "w = get_sum(x=obj.output.prod, y=obj.output.div)",
"source": [
"w = get_sum(x=obj.output.prod, y=obj.output.div)"
],
"metadata": {
"trusted": true
},
@@ -154,17 +159,31 @@
{
"id": "a5e5ca63-2906-47c9-bac6-adebf8643cba",
"cell_type": "code",
"source": "flow = Flow([obj, w])",
"source": [
"flow = Flow([obj, w])"
],
"metadata": {
"trusted": true
},
"outputs": [],
"execution_count": 8
},
{
"cell_type": "markdown",
"source": [
"As jobflow itself is only a workflow language, the workflows are typically executed on high-performance computers with a workflow manager such as [FireWorks](https://onlinelibrary.wiley.com/doi/full/10.1002/cpe.3505) or [jobflow-remote](https://github.com/Matgenix/jobflow-remote). For smaller and test workflows, jobflow itself can execute the workflow graph in a simple linear, non-parallel fashion. All outputs of individual jobs are saved in a database. For high-throughput applications, a MongoDB database is typically used. For testing and smaller workflows, an in-memory database can be used instead."
],
"metadata": {
"collapsed": false
},
"id": "27688edd256f1420"
},
{
"id": "e464da97-16a1-4772-9a07-0a47f152781d",
"cell_type": "code",
"source": "write_workflow_json(flow=flow, file_name=workflow_json_filename)",
"source": [
"write_workflow_json(flow=flow, file_name=workflow_json_filename)"
],
"metadata": {
"trusted": true
},
@@ -174,7 +193,9 @@
{
"id": "bca646b2-0a9a-4271-966a-e5903a8c9031",
"cell_type": "code",
"source": "!cat {workflow_json_filename}",
"source": [
"!cat {workflow_json_filename}"
],
"metadata": {
"trusted": true
},
@@ -187,16 +208,34 @@
],
"execution_count": 10
},
{
"cell_type": "markdown",
"source": [
"Finally, you can write the workflow data into a JSON file to be imported later."
],
"metadata": {
"collapsed": false
},
"id": "65389ef27c38fdec"
},
{
"id": "87a27540-c390-4d34-ae75-4739bfc4c1b7",
"cell_type": "markdown",
"source": "## Load Workflow with aiida",
"source": [
"## Load Workflow with aiida\n",
"\n",
"In this part, we will demonstrate how to import the `jobflow` workflow into `aiida` via the PWD."
],
"metadata": {}
},
{
"id": "66a1b3a6-3d3b-4caa-b58f-d8bc089b1074",
"cell_type": "code",
"source": "from aiida import load_profile\n\nload_profile()",
"source": [
"from aiida import load_profile\n",
"\n",
"load_profile()"
],
"metadata": {
"trusted": true
},
@@ -215,17 +254,32 @@
{
"id": "4679693b-039b-45cf-8c67-5b2b3d705a83",
"cell_type": "code",
"source": "from python_workflow_definition.aiida import load_workflow_json",
"source": [
"from python_workflow_definition.aiida import load_workflow_json"
],
"metadata": {
"trusted": true
},
"outputs": [],
"execution_count": 12
},
{
"cell_type": "markdown",
"source": [
"We import the necessary modules from `aiida` and the PWD, and then load the workflow JSON file."
],
"metadata": {
"collapsed": false
},
"id": "cc7127193d31d8ef"
},
{
"id": "68c41a61-d185-47e8-ba31-eeff71d8b2c6",
"cell_type": "code",
"source": "wg = load_workflow_json(file_name=workflow_json_filename)\nwg",
"source": [
"wg = load_workflow_json(file_name=workflow_json_filename)\n",
"wg"
],
"metadata": {
"trusted": true
},
@@ -246,10 +300,22 @@
],
"execution_count": 13
},
{
"cell_type": "markdown",
"source": [
"Finally, we can run the workflow with `aiida`."
],
"metadata": {
"collapsed": false
},
"id": "4816325767559bbe"
},
{
"id": "05228ece-643c-420c-8df8-4ce3df379515",
"cell_type": "code",
"source": "wg.run()",
"source": [
"wg.run()"
],
"metadata": {
"trusted": true
},
@@ -265,13 +331,19 @@
{
"id": "2c942094-61b4-4e94-859a-64f87b5bec64",
"cell_type": "markdown",
"source": "## Load Workflow with pyiron_base",
"source": [
"## Load Workflow with pyiron_base\n",
"\n",
"In this part, we will demonstrate how to import the `jobflow` workflow into `pyiron` via the PWD."
],
"metadata": {}
},
{
"id": "ea102341-84f7-4156-a7d1-c3ab1ea613a5",
"cell_type": "code",
"source": "from python_workflow_definition.pyiron_base import load_workflow_json",
"source": [
"from python_workflow_definition.pyiron_base import load_workflow_json"
],
"metadata": {
"trusted": true
},
@@ -281,7 +353,10 @@
{
"id": "8f2a621d-b533-4ddd-8bcd-c22db2f922ec",
"cell_type": "code",
"source": "delayed_object_lst = load_workflow_json(file_name=workflow_json_filename)\ndelayed_object_lst[-1].draw()",
"source": [
"delayed_object_lst = load_workflow_json(file_name=workflow_json_filename)\n",
"delayed_object_lst[-1].draw()"
],
"metadata": {
"trusted": true
},
@@ -300,7 +375,9 @@
{
"id": "cf80267d-c2b0-4236-bf1d-a57596985fc1",
"cell_type": "code",
"source": "delayed_object_lst[-1].pull()",
"source": [
"delayed_object_lst[-1].pull()"
],
"metadata": {
"trusted": true
},
@@ -322,14 +399,24 @@
"execution_count": 17
},
{
"id": "9d819ed0-689c-46a7-9eff-0afb5ed66efc",
"cell_type": "code",
"source": "",
"cell_type": "markdown",
"source": [
"Here, the procedure is the same as before: import the necessary `pyiron_base` module from the PWD, load the workflow JSON file, and run the workflow with `pyiron`."
],
"metadata": {
"trusted": true
"collapsed": false
},
"id": "9414680d1cbc3b2e"
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"execution_count": null
"source": [],
"metadata": {
"collapsed": false
},
"id": "c199b28f3c0399cc"
}
]
}
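
The new tutorial text says that a “Job” delays the execution of a Python function and that “Job”s can be chained into a “Flow” which is then executed linearly. The following is a minimal, standard-library-only sketch of that idea. It is not jobflow's implementation: `Node`, `Reference`, `delayed`, and `run_in_order` are hypothetical names made up for illustration, and the bodies of `_get_prod_and_div` and `_get_sum` are assumptions, because the diff does not show the contents of the `workflow` module. The sketch also uses item access (`obj.output["prod"]`) where the notebook uses attribute access (`obj.output.prod`).

```python
class Reference:
    """Placeholder for a value that a node will produce once it has run."""

    def __init__(self, node, key=None):
        self.node = node
        self.key = key

    def __getitem__(self, key):
        # Refer to a single entry of a dictionary-valued output.
        return Reference(self.node, key)


class Node:
    """Stores a function and its arguments without calling the function."""

    def __init__(self, func, kwargs):
        self.func = func
        self.kwargs = kwargs
        self.output = Reference(self)


def delayed(func):
    """Hypothetical stand-in for a job-style decorator: calling the wrapped
    function records the call as a Node instead of executing it."""

    def wrapper(**kwargs):
        return Node(func, kwargs)

    return wrapper


def run_in_order(nodes):
    """Execute nodes one after another, replacing references by real values."""
    results = {}
    for node in nodes:
        resolved = {}
        for name, value in node.kwargs.items():
            if isinstance(value, Reference):
                produced = results[id(value.node)]
                value = produced if value.key is None else produced[value.key]
            resolved[name] = value
        results[id(node)] = node.func(**resolved)
    return results


# Assumed bodies of the example arithmetic functions (not shown in the diff).
def _get_prod_and_div(x, y):
    return {"prod": x * y, "div": x / y}


def _get_sum(x, y):
    return x + y


get_prod_and_div = delayed(_get_prod_and_div)
get_sum = delayed(_get_sum)

obj = get_prod_and_div(x=1, y=2)  # nothing is computed yet
w = get_sum(x=obj.output["prod"], y=obj.output["div"])

results = run_in_order([obj, w])  # linear, non-parallel execution
final_value = results[id(w)]
print(final_value)
```

Because building the graph and running it are separate steps, the same list of nodes could in principle be serialized or handed to a different executor, which is the property the PWD export in the notebook relies on.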