Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support placeholders for input_path and output_path for all States (except Fail) and items_path for MapState #158

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

ca-nguyen
Copy link
Contributor

@ca-nguyen ca-nguyen commented Sep 3, 2021

Description

With this change, it will be possible to use:

  1. Placeholders for input_path and output_path for all States (except Fail)
  2. Placeholders for items_path for Map State
  3. Context Object Data for Map states

Fixes #101

Why is the change necessary?

This enables the capacity to define input_path and output_path values dynamically for all States (except Fail State). This also supports using placeholder for items_path and context object for MapState.

Solution

Support Placeholders for input_path, output_path and items_path

During workflow definition serialization, replace placeholder with json path when the parsed argument is one of the three (input_path, output_pat, items_path).

Support Context Object Data for Map State

Add new Placeholder objects MapItemValue and MapItemIndex with a json string template to use during workflow definition serialization.

Placeholder Gets replaced by json string template
MapItemValue Value of the array item that is being processed in the current iteration $$.Map.Item.Value{}
MapItemIndex Index number of the array item that is being processed in the current iteration $$.Map.Item.Index
Example
map_item_value = MapItemValue(schema={
        'name': str,
        'age': str
    })

map_item_index = MapItemIndex()

map_state = Map(
    'MapState01',
    parameters={
        "MapIndex": map_item_index,
        "Name": map_item_value['name'],
        "Age": map_item_value['age']
    }
)
iterator_state = Pass(
    'TrainIterator'
)

map_state.attach_iterator(iterator_state)
workflow_definition = Chain([map_state])

workflow = Workflow(
    name="MapItemExample",
    definition=workflow_definition,
    role=workflow_execution_role
)

Workflow definition will be:

{
    "StartAt": "MapState01",
    "States": {
        "MapState01": {
            "Parameters": {
                "MapIndex.$": "$$.Map.Item.Index",
                "Name.$": "$$.Map.Item.Value['name']",
                "Age.$": "$$.Map.Item.Value['age']"
            },
            "Type": "Map",
            "End": true,
            "Iterator": {
                "StartAt": "TrainIterator",
                "States": {
                    "TrainIterator": {
                        "Type": "Pass",
                        "End": true
                    }
                }
            }
        }
    }
}

Testing

  • Added integ and unit tests

Pull Request Checklist

Please check all boxes (including N/A items)

Testing

  • Unit tests added
  • integration test added

Documentation

  • docs: All relevant docs updated
  • docstrings: All public APIs documented

Title and description

  • Change type: Title is prefixed with change type: and follows conventional commits
  • References: Indicate issues fixed via: Fixes #xxx

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license.

@ca-nguyen ca-nguyen marked this pull request as draft September 3, 2021 23:55
@ca-nguyen ca-nguyen changed the title feat: Support placeholders for Map state [DRAFT] feat: Support placeholders for Map state Sep 3, 2021
@ca-nguyen ca-nguyen requested a review from wong-a September 8, 2021 17:22
@ca-nguyen ca-nguyen marked this pull request as ready for review September 8, 2021 17:33
@ca-nguyen ca-nguyen changed the title [DRAFT] feat: Support placeholders for Map state feat: Support placeholders for Map state Sep 8, 2021
Copy link
Contributor

@shivlaks shivlaks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for putting this PR together! left some feedback, let me know if you have any questions.

some higher level thoughts:

  • I think we should also update the docs related to placeholders. Might make sense to re-purpose the example in your commit body to illustrate its usage.
  • does it make sense to add an integration test that exercises placeholders? as we expand the use cases and scenarios we support, I'm thinking it would be useful to have some basic integ tests as unit tests are more brittle.

}
)

map_state.attach_iterator(iterator_state)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: what happens with this? - does it become part of the workflow definition somewhere? I thought the Iterator and ItemsPath properties were required for Map states

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ItemsPath is optional. Similar to InputPath and OutputPath, the default is $.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The iterator is added to the Map State here and will be added to the workflow definition when the Map state itself is added to the workflow definition

The docstring will need to be updated - it seems like iterator is required in the Map State constructor when in fact, it gets creates successfully without it. However, the workflow will fail to create if no iterator is set - we could add a validation here and raise an exception if the iterator is not set. At the moment, if fails here with the following message

ValueError: Expected branch to be a State or a Chain, but got `None`

This can be done in another PR to avoid making the scope larger

}
)

map_state.attach_iterator(iterator_state)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ItemsPath is optional. Similar to InputPath and OutputPath, the default is $.

@wong-a
Copy link
Contributor

wong-a commented Sep 10, 2021

Clarified what "support placeholders for Map state" means in the initial issue comments: #101 (comment). So far, this PR only addresses the 3rd use case, which isn't actually what the requester was trying to do.

@ca-nguyen
Copy link
Contributor Author

I think we should also update the docs related to placeholders. Might make sense to re-purpose the example in your commit body to illustrate its usage.

Agreed - will update the placeholder docs

does it make sense to add an integration test that exercises placeholders? as we expand the use cases and scenarios we support, I'm thinking it would be useful to have some basic integ tests as unit tests are more brittle.

Yes it does - will include one for Map state

@wong-a
Copy link
Contributor

wong-a commented Sep 10, 2021

@ca-nguyen This change doesn't just affect Map state. There's the 3 things I mentioned here: #101 (comment)

InputPath and OutputPath are allowed in all state types except Fail. Please update the PR title, description, and tests accordingly.

wong-a
wong-a previously requested changes Sep 10, 2021
@ca-nguyen ca-nguyen changed the title feat: Support placeholders for Map state feat: Support placeholders for input_path, output_path for all States (except Fail) and items_path for MapState Sep 10, 2021
@ca-nguyen ca-nguyen changed the title feat: Support placeholders for input_path, output_path for all States (except Fail) and items_path for MapState feat: Support placeholders for input_path and output_path for all States (except Fail) and items_path for MapState Sep 10, 2021
ca-nguyen and others added 3 commits September 11, 2021 00:55
…cstring and tests for all states that support input_path and output_path
Co-authored-by: Adam Wong <55506708+wong-a@users.noreply.github.com>
@ca-nguyen
Copy link
Contributor Author

InputPath and OutputPath are allowed in all state types except Fail. Please update the PR title, description, and tests accordingly.

Updated the PR title and description

@ca-nguyen ca-nguyen dismissed stale reviews from wong-a and shivlaks September 15, 2021 22:29

Requested changes were addressed in the last commits

Copy link
Contributor

@wong-a wong-a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not finished reviewing, but here are some comments. Implementation seems fine at a quick glance.

@ca-nguyen ca-nguyen requested a review from wong-a November 1, 2021 18:56
@StepFunctions-Bot
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-sEHrOdk7acJc
  • Commit ID: b4ddbbd
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature Request: Support placeholders for Map state
4 participants