Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Update AWS Glue Databrew StartJobRun step to use integration pattern input #176

Merged
merged 8 commits into from
Nov 15, 2021

Conversation

ca-nguyen
Copy link
Contributor

Description

Update AWS Glue Databrew service integration to use integration_pattern input instead or wait_for_completion flag.

Fixes #(issue) (N/A)

Why is the change necessary?

This change is necessary for consistency with the new service integration implementation pattern introduced in commit (Add support for Nested Step Functions) that uses the integration_pattern arg in the step constructor to build the resource.

Support for AWS Glue Databrew service integration was added in this commit, but not released yet.
A later commit (Add support for Nested Step Functions) introduced a new implementation pattern using the IntegrationPattern enum as input to construct the step instead of the wait_for_completion flag. (See PR for more detail on rationale behind the implementation).

Solution

Replace the wait_for_completion flag with integration_pattern arg in StartJobRun step construction.

The IntegrationPattern is used to build the Resource arn as follow:

IntegrationPattern Resource Doc
WaitForCompletion "arn:aws:states:::states:databrew:startJobRun.sync" Run A job
CallAndContinue "arn:aws:states:::states:databrew:startJobRun" Request Response

See Service Integration Patterns for more details

Normally, replacing a constructor argument would be a breaking change, but since we have not released support for AWS Glue Databrew service integration yet, it is acceptable to do so. After next release, it making such changes will be considered as not being backward compatible.

Testing

  • Updated unit tests for all resources
  • No integ tests added as external resources were required for testing (Databrew Job)
  • Manual test done for all resources

Manual Tests

Create a Databrew job for manual tests (<databrew_job_name>).
For each test, create a workflow and execute as follow:

workflow = Workflow(
    name=<state_machine_name>,
    definition=<step>,
    role=<workflow_execution_role>
)

workflow.create()
workflow.execute()
  • Call And Continue:
# Create a workflow with the following steps & execute
step_call_and_continue = GlueDataBrewStartJobRunStep(
        'Start Glue DataBrew Job Run - CallAndContinue',
        integration_pattern=IntegrationPattern.CallAndContinue, parameters={
            "Name": <databrew_job_name>
        }
)

step_default = GlueDataBrewStartJobRunStep('Start Glue DataBrew Job Run - Default', parameters={
        "Name":  <databrew_job_name>
    }
)

# Validate that the workflows end after starting the  <databrew_job_name> job
  • Wait For Completion
# Create a workflow with the following step & execute
step_wait_for_completion = GlueDataBrewStartJobRunStep(
        'Start Glue DataBrew Job Run - WaitForCompletion',
        integration_pattern=IntegrationPattern.WaitForCompletion, parameters={
            "Name":  <databrew_job_name>
        }
)
# Validate that the workflow only ends after the  <databrew_job_name> job completes

Pull Request Checklist

Please check all boxes (including N/A items)

Testing

  • Unit tests added
  • Integration test added - N/A
  • Manual testing - why was it necessary? could it be automated? No integ tests added as external resources were required for testing

Documentation

  • docs: All relevant docs updated
  • docstrings: All public APIs documented

Title and description

  • Change type: Title is prefixed with change type: and follows conventional commits
  • References: Indicate issues fixed via: Fixes #xxx - N/A

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license.

tests/unit/test_service_steps.py Show resolved Hide resolved
@ca-nguyen ca-nguyen requested a review from shivlaks November 2, 2021 23:29
@ca-nguyen ca-nguyen dismissed shivlaks’s stale review November 2, 2021 23:29

Suggested changes were added in last commit

tests/unit/test_service_steps.py Outdated Show resolved Hide resolved
tests/unit/test_service_steps.py Outdated Show resolved Hide resolved
ca-nguyen and others added 2 commits November 4, 2021 12:10
Co-authored-by: Shiv Lakshminarayan <shivlaks@amazon.com>
@ca-nguyen ca-nguyen requested a review from shivlaks November 4, 2021 19:24
shivlaks
shivlaks previously approved these changes Nov 5, 2021
@ca-nguyen ca-nguyen requested a review from shivlaks November 8, 2021 18:25
kwargs[Field.Resource.value] = get_service_integration_arn(GLUE_DATABREW_SERVICE_NAME,
GlueDataBrewApi.StartJobRun,
integration_pattern)
"""

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit (non-blocking): would be good to have the comment precede the Field.Resource assignment.

@StepFunctions-Bot
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-sEHrOdk7acJc
  • Commit ID: 5c5f235
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@ca-nguyen ca-nguyen merged commit 382241a into aws:main Nov 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants