Output nodes, global parameter names, and default port handling in PWD format

In its current state, the PWD also contains the inputs with concrete values, e.g., for the simple arithmetic workflow:

```json
{
  "nodes": [
    {"id": 0, "function": "workflow.get_prod_and_div"},
    {"id": 1, "function": "workflow.get_sum"},
    {"id": 2, "value": 1},
    {"id": 3, "value": 2}
  ],
  "edges": [
    {"target": 0, "targetPort": "x", "source": 2, "sourcePort": null},
    {"target": 0, "targetPort": "y", "source": 3, "sourcePort": null},
    {"target": 1, "targetPort": "x", "source": 0, "sourcePort": "prod"},
    {"target": 1, "targetPort": "y", "source": 0, "sourcePort": "div"}
  ]
}
```

We see this as problematic for a few reasons. First, it is inconsistent: it implies that data nodes should be an explicit part of the workflow definition, yet neither the final `result` output nor the intermediate values `prod` and `div` appear as data nodes in the PWD graph. In the current representation, `prod` and `div` are represented only by edges, and `result` does not appear at all. If _some_ data objects are part of the PWD, then _all_ of them should be.

Further, providing concrete input values means that the definition contained in the JSON PWD corresponds to a concrete workflow "instance" rather than just the general workflow logic. I recall that this was also brought up in one of the comments on the paper draft, which mentioned that we should showcase that it’s possible to modify the input values after loading a workflow into a framework from the PWD.

Therefore, we propose removing the data nodes from the "nodes" section of the PWD JSON and, to keep the relevant information, recording the workflow inputs and outputs in the "edges" section instead. This would give the following PWD:

```json
{
  "nodes": [
    {"id": 0, "function": "workflow.get_prod_and_div"},
    {"id": 1, "function": "workflow.get_sum"}
  ],
  "edges": [
    {"target": 0, "targetPort": "x", "source": null, "sourcePort": null},
    {"target": 0, "targetPort": "y", "source": null, "sourcePort": null},
    {"target": 1, "targetPort": "x", "source": 0, "sourcePort": "prod"},
    {"target": 1, "targetPort": "y", "source": 0, "sourcePort": "div"},
    {"target": null, "targetPort": null, "source": 1, "sourcePort": "result"}
  ]
}
```
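As a rough illustration of the transformation, here is a minimal Python sketch that strips the value nodes from the current format and turns their edges into dangling input edges. The function name `strip_data_nodes` and its return shape are hypothetical and not part of any existing PWD tooling. Note that the dangling output edge for `result` cannot be derived this way, because the current format does not record that port anywhere.

```python
def strip_data_nodes(pwd):
    """Return (new_pwd, removed_values) for a PWD dict in the current format.

    Hypothetical helper: value nodes are dropped and every edge that started
    at a value node gets source=None and sourcePort=None. removed_values maps
    (target, targetPort) to the concrete value that was removed.
    """
    values = {n["id"]: n["value"] for n in pwd["nodes"] if "value" in n}
    nodes = [n for n in pwd["nodes"] if "value" not in n]
    edges, removed_values = [], {}
    for edge in pwd["edges"]:
        if edge["source"] in values:
            removed_values[(edge["target"], edge["targetPort"])] = values[edge["source"]]
            # Copy the edge so the input dict is not modified in place.
            edge = {**edge, "source": None, "sourcePort": None}
        edges.append(edge)
    return {"nodes": nodes, "edges": edges}, removed_values


old_pwd = {
    "nodes": [
        {"id": 0, "function": "workflow.get_prod_and_div"},
        {"id": 1, "function": "workflow.get_sum"},
        {"id": 2, "value": 1},
        {"id": 3, "value": 2},
    ],
    "edges": [
        {"target": 0, "targetPort": "x", "source": 2, "sourcePort": None},
        {"target": 0, "targetPort": "y", "source": 3, "sourcePort": None},
        {"target": 1, "targetPort": "x", "source": 0, "sourcePort": "prod"},
        {"target": 1, "targetPort": "y", "source": 0, "sourcePort": "div"},
    ],
}

new_pwd, removed_values = strip_data_nodes(old_pwd)
```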

That means that the "global" inputs and outputs of the workflow are represented by "dangling" edges of the workflow graph, i.e., edges whose `source` or `target` is `null`. While it's not ideal, we think this is fine for now to keep the modifications minimal. Note that, with this modification, any unused return values of intermediate functions would automatically become global outputs of the workflow. One could instead add additional keys to the edges, such as `workflowInputName` and `workflowOutputName`, to explicitly expose those, e.g.:

```json
{
  "nodes": [
    {"id": 0, "function": "workflow.get_prod_and_div"},
    {"id": 1, "function": "workflow.get_sum"}
  ],
  "edges": [
    {"target": 0, "targetPort": "x", "source": null, "sourcePort": null, "workflowInputName": "a"},
    {"target": 0, "targetPort": "y", "source": null, "sourcePort": null, "workflowInputName": "b"},
    {"target": 1, "targetPort": "x", "source": 0, "sourcePort": "prod"},
    {"target": 1, "targetPort": "y", "source": 0, "sourcePort": "div"},
    {"target": null, "targetPort": null, "source": 1, "sourcePort": "result", "workflowOutputName": "c"}
  ]
}
```
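To make the idea concrete, here is a small Python sketch of how a consumer could collect the named global ports from such a PWD. The function `workflow_ports` is hypothetical, and falling back to the port name when no `workflowInputName` or `workflowOutputName` is given is an assumption of this sketch, not something decided in this proposal.

```python
import json

PWD_JSON = """
{
  "nodes": [
    {"id": 0, "function": "workflow.get_prod_and_div"},
    {"id": 1, "function": "workflow.get_sum"}
  ],
  "edges": [
    {"target": 0, "targetPort": "x", "source": null, "sourcePort": null, "workflowInputName": "a"},
    {"target": 0, "targetPort": "y", "source": null, "sourcePort": null, "workflowInputName": "b"},
    {"target": 1, "targetPort": "x", "source": 0, "sourcePort": "prod"},
    {"target": 1, "targetPort": "y", "source": 0, "sourcePort": "div"},
    {"target": null, "targetPort": null, "source": 1, "sourcePort": "result", "workflowOutputName": "c"}
  ]
}
"""


def workflow_ports(pwd):
    """Collect global inputs and outputs from the dangling edges of a PWD dict.

    Hypothetical helper. Inputs map a name to a list of (node id, port) pairs,
    since one workflow input may feed several ports. Outputs map a name to a
    single (node id, port) pair.
    """
    inputs, outputs = {}, {}
    for edge in pwd["edges"]:
        if edge["source"] is None:
            # Assumption: fall back to the target port name if no explicit name is set.
            name = edge.get("workflowInputName", edge["targetPort"])
            inputs.setdefault(name, []).append((edge["target"], edge["targetPort"]))
        elif edge["target"] is None:
            # Assumption: fall back to the source port name if no explicit name is set.
            name = edge.get("workflowOutputName", edge["sourcePort"])
            outputs[name] = (edge["source"], edge["sourcePort"])
    return inputs, outputs


inputs, outputs = workflow_ports(json.loads(PWD_JSON))
```

With explicit names, a framework could accept new values for `a` and `b` after loading the PWD without relying on the port names of individual functions.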

Another option would be to add a dedicated `global_ports` section to the PWD instead of extra edge keys. This is something to think about in the future.
Ping @mbercx and @giovannipizzi.
