feat: Updated workflows documentation for issue #47 #51

Merged 4 commits on Aug 7, 2024
2 changes: 1 addition & 1 deletion .github/workflows/python-test-ci.yml
@@ -39,7 +39,7 @@ jobs:
- uses: s-weigand/setup-conda@v1
- name: Install built package
run: |
conda install -c /tmp/output/noarch/*.conda --update-deps --use-local dewret
conda install -c /tmp/output/noarch/*.conda --update-deps --use-local dewret -y
conda install pytest
$CONDA/bin/pytest
python -m pytest --doctest-modules --ignore=example
175 changes: 161 additions & 14 deletions docs/workflows.md
@@ -35,6 +35,10 @@ graph TD
In code, this would be:

```python
>>> import sys
>>> import yaml
>>> from dewret.tasks import task, construct
>>> from dewret.renderers.cwl import render
>>> @task()
... def increment(num: int) -> int:
... """Increment an integer."""
@@ -123,6 +127,31 @@ Notice that the `increment` tasks appears twice in the CWL workflow definition,
This duplication can be avoided by explicitly indicating that the parameters are the same, with the `param` function.

```python
>>> import sys
>>> import yaml
>>> from dewret.workflow import param
>>> from dewret.tasks import task, construct
>>> from dewret.renderers.cwl import render
>>> @task()
... def increment(num: int) -> int:
... """Increment an integer."""
... return num + 1
>>>
>>> @task()
... def double(num: int) -> int:
... """Double an integer."""
... return 2 * num
>>>
>>> @task()
... def mod10(num: int) -> int:
... """Take num mod 10."""
... return num % 10
>>>
>>> @task()
... def sum(left: int, right: int) -> int:
... """Add two integers."""
... return left + right
>>>
>>> num = param("num", default=3)
>>> result = sum(
... left=double(num=increment(num=num)),
@@ -191,6 +220,11 @@ While global variables are implicit input to the Python function **note that**:

For example:
```python
>>> import sys
>>> import yaml
>>> from dewret.workflow import param
>>> from dewret.tasks import task, construct
>>> from dewret.renderers.cwl import render
>>> INPUT_NUM = 3
>>> @task()
... def rotate(num: int) -> int:
@@ -248,7 +282,16 @@ graph TD
As code:

```python
>>> from dewret.tasks import nested_task
>>> import sys
>>> import yaml
>>> from dewret.tasks import task, construct, nested_task
>>> from dewret.renderers.cwl import render
>>> INPUT_NUM = 3
>>> @task()
... def rotate(num: int) -> int:
... """Rotate an integer."""
... return (num + INPUT_NUM) % INPUT_NUM
>>>
>>> @nested_task()
... def double_rotate(num: int) -> int:
... """Rotate an integer twice."""
@@ -304,6 +347,7 @@ nested task does not have an impact on the return value,
it will disappear__.
For example, the following code renders the same workflow as in the previous example:


```python
@nested_task()
def double_rotate(num: int) -> int:
@@ -336,8 +380,12 @@ graph TD
As code:

```python
>>> import sys
>>> import yaml
>>> from attrs import define
>>> from numpy import random
>>> from dewret.tasks import task, construct
>>> from dewret.renderers.cwl import render
>>> @define
... class PackResult:
... hearts: int
@@ -431,9 +479,13 @@ steps:

Here, we show the same example with `dataclasses`.

```python
```python
>>> import sys
>>> import yaml
>>> from dataclasses import dataclass
>>> from numpy import random
>>> from dewret.tasks import task, construct
>>> from dewret.renderers.cwl import render
>>> @dataclass
... class PackResult:
... hearts: int
@@ -453,6 +505,7 @@ Here, we show the same example with `dataclasses`.
>>> @task()
... def sum(left: int, right: int) -> int:
... return left + right
>>>
>>> red_total = sum(
... left=shuffle(max_cards_per_suit=13).hearts,
... right=shuffle(max_cards_per_suit=13).diamonds
@@ -531,9 +584,33 @@ A special form of nested task is available to help divide up
more complex workflows: the *subworkflow*. By wrapping logic in subflows,
dewret will produce multiple output workflows that reference each other.

```
>>> from dewret.tasks import subworkflow
>>> my_param = param("num", typ=int)
```python
>>> import sys
>>> import yaml
>>> from attrs import define
>>> from numpy import random
>>> from dewret.tasks import task, construct, subworkflow
>>> from dewret.renderers.cwl import render
>>> @define
... class PackResult:
... hearts: int
... clubs: int
... spades: int
... diamonds: int
>>>
>>> @task()
... def sum(left: int, right: int) -> int:
... return left + right
>>>
>>> @task()
... def shuffle(max_cards_per_suit: int) -> PackResult:
... """Fill a random pile from a card deck, suit by suit."""
... return PackResult(
... hearts=random.randint(max_cards_per_suit),
... clubs=random.randint(max_cards_per_suit),
... spades=random.randint(max_cards_per_suit),
... diamonds=random.randint(max_cards_per_suit)
... )
>>> @subworkflow()
... def red_total():
... return sum(
@@ -585,7 +662,48 @@ As we have used subworkflow to wrap the colour totals, the outer workflow
contains references to them only. The subworkflows are now returned by `render`
as a second term.

```
```python
>>> import sys
>>> import yaml
>>> from attrs import define
>>> from numpy import random
>>> from dewret.tasks import task, construct, subworkflow
>>> from dewret.renderers.cwl import render
>>> @define
... class PackResult:
... hearts: int
... clubs: int
... spades: int
... diamonds: int
>>>
>>> @task()
... def shuffle(max_cards_per_suit: int) -> PackResult:
... """Fill a random pile from a card deck, suit by suit."""
... return PackResult(
... hearts=random.randint(max_cards_per_suit),
... clubs=random.randint(max_cards_per_suit),
... spades=random.randint(max_cards_per_suit),
... diamonds=random.randint(max_cards_per_suit)
... )
>>> @task()
... def sum(left: int, right: int) -> int:
... return left + right
>>>
>>> @subworkflow()
... def red_total():
... return sum(
... left=shuffle(max_cards_per_suit=13).hearts,
... right=shuffle(max_cards_per_suit=13).diamonds
... )
>>> @subworkflow()
... def black_total():
... return sum(
... left=shuffle(max_cards_per_suit=13).spades,
... right=shuffle(max_cards_per_suit=13).clubs
... )
>>> total = sum(left=red_total(), right=black_total())
>>> workflow = construct(total, simplify_ids=True)
>>> cwl, subworkflows = render(workflow)
>>> yaml.dump(subworkflows["red_total-1"], sys.stdout, indent=2)
class: Workflow
cwlVersion: 1.2
@@ -662,10 +780,25 @@ the chosen renderer has the capability.

Below is the default output, treating `Pack` as a task.

```
>>> from dewret.tasks import subworkflow, factory
>>> my_param = param("num", typ=int)
```python
>>> import sys
>>> import yaml
>>> from dewret.tasks import subworkflow, factory, nested_task, construct, task
>>> from attrs import define
>>> from dewret.renderers.cwl import render
>>> @define
... class PackResult:
... hearts: int
... clubs: int
... spades: int
... diamonds: int
>>>
>>> Pack = factory(PackResult)
>>>
>>> @task()
... def sum(left: int, right: int) -> int:
... return left + right
>>>
>>> @nested_task()
... def black_total(pack: PackResult):
... return sum(
@@ -740,10 +873,24 @@ steps:
The CWL renderer is also able to treat `pack` as a parameter, if complex
types are allowed.

```
>>> from dewret.tasks import subworkflow, factory
>>> my_param = param("num", typ=int)
```python
>>> import sys
>>> import yaml
>>> from dewret.tasks import task, factory, nested_task, construct
>>> from attrs import define
>>> from dewret.renderers.cwl import render
>>> @define
... class PackResult:
... hearts: int
... clubs: int
... spades: int
... diamonds: int
>>>
>>> Pack = factory(PackResult)
>>> @task()
... def sum(left: int, right: int) -> int:
... return left + right
>>>
>>> @nested_task()
... def black_total(pack: PackResult):
... return sum(
@@ -759,7 +906,7 @@ cwlVersion: 1.2
inputs:
PackResult-1:
label: PackResult-1
type: PackResult
type: record
outputs:
out:
label: out
@@ -776,4 +923,4 @@ steps:
- out
run: sum

```
```
17 changes: 14 additions & 3 deletions example/workflow_complex.py
@@ -7,15 +7,26 @@
```
"""

from dewret.tasks import nested_task
from dewret.tasks import subworkflow
from workflow_tasks import sum, double, increase

STARTING_NUMBER: int = 23


@nested_task()
@subworkflow()
def nested_workflow() -> int | float:
"""Creates a graph of task calls."""
"""Creates a complex workflow with a nested task.

Workflow Steps:
1. **Increase**: The starting number (`STARTING_NUMBER`) is incremented by 1 using the `increase` task.
2. **Double**: The result from the first step is then doubled using the `double` task.
3. **Increase Again**: Separately, the number 17 is incremented twice using the `increase` task.
4. **Sum**: Finally, the results of the two branches (left and right) are summed together using the `sum` task.

Returns:
- `int | float`: The result of summing the doubled and increased values, which may be an integer or a float depending on the operations.

"""
left = double(num=increase(num=STARTING_NUMBER))
right = increase(num=increase(num=17))
return sum(left=left, right=right)
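
Reviewer note: the four steps listed in the new docstring can be checked by hand with an eager, plain-Python stand-in. This is only a sketch, not dewret code: it assumes that `increase` adds 1 (as the docstring states), that `double` multiplies by 2, and that `sum` adds its two arguments. The real implementations live in `workflow_tasks`, which is not part of this diff, and real dewret tasks build a lazy workflow graph rather than computing values immediately. The name `add` is a hypothetical stand-in for the `sum` task, chosen to avoid shadowing the builtin.

```python
# Eager stand-ins for the tasks imported from workflow_tasks.
# ASSUMPTION: these bodies are guesses based on the docstring, not the real tasks.
def increase(num: int) -> int:
    """Assumed behaviour: increment by 1."""
    return num + 1


def double(num: int) -> int:
    """Assumed behaviour: multiply by 2."""
    return 2 * num


def add(left: int, right: int) -> int:
    """Hypothetical stand-in for the `sum` task: add two numbers."""
    return left + right


STARTING_NUMBER: int = 23


def nested_workflow_eager() -> int:
    """Mirror the structure of nested_workflow, but evaluate immediately."""
    left = double(num=increase(num=STARTING_NUMBER))  # steps 1 and 2
    right = increase(num=increase(num=17))            # step 3
    return add(left=left, right=right)                # step 4


result = nested_workflow_eager()
print(result)
```

Under those assumptions the left branch evaluates to 48 and the right branch to 19, so the sketch prints 67.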