(aws-stepfunctions-tasks): GlueStartJobRunProps missing support of AllocatedCapacity or MaxCapacity #12757

Closed
Labels
@aws-cdk/aws-stepfunctions-tasks effort/small Small work item – less than a day of effort feature-request A feature should be added or improved. p1

Comments

@wzhishen

wzhishen commented Jan 28, 2021

❓ General Issue

The Question

Today, GlueStartJobRunProps in CDK does not support AllocatedCapacity or MaxCapacity for configuring the DPUs of a Glue job run. However, based on the Step Functions service integration, Step Functions does support the AllocatedCapacity parameter when invoking the Glue StartJobRun API. Will GlueStartJobRunProps support the AllocatedCapacity or MaxCapacity parameters? (see details of these two parameters here)

Environment

  • CDK CLI Version: 1.83.0
  • Module Version: 1.83.0
  • Node.js Version: 14.14.20
  • OS: Linux 4.9.230-0.1.ac.223.84.332.metal1.x86_64 x86_64
  • Language (Version): TypeScript (3.6.4)

Other information

@wzhishen wzhishen added guidance Question that needs advice or information. needs-triage This issue or PR still needs to be triaged. labels Jan 28, 2021
@kirintwn
Contributor

The documentation is outdated. The StartJobRun API has introduced a few new parameters, some of which I have tested and found to work well with CloudFormation. We can certainly do this with CDK.

This is related to: awsdocs/aws-step-functions-developer-guide#46

@wzhishen
Author

I understand that we can use the AllocatedCapacity parameter directly in the state machine definition of a CloudFormation template. But we cannot do the same in CDK, because GlueStartJobRunProps doesn't expose a property we can set to generate a template that contains it.

@wzhishen
Author

Hi any update?

@cprice404-aws

Hi I'm running into this as well; I really need to increase the number of workers for certain jobs.

As far as I can tell, there is no way to drop down to the Cfn* resource types for just this one action in the state machine. Is that accurate? If so, that means that I have to either port my entire state machine definition OFF of the high-level stepfunctions/stepfunctions-tasks CDK APIs and down to the low-level, or else it's just not possible to override the number of workers when starting the glue job?

@johnhill2424

@cprice404-aws have you found any workaround for this?

@cprice404-aws

Unfortunately I think the (awful) solution that I came up with was to set the default number of workers to the highest number I might need, when I defined the job. When invoking the job via other means besides step functions I pass in an override to reduce the number. Not a very good approach but I didn't find any other way to do it.

@johnhill2424

@shivlaks any update on this?

@NGL321
Contributor

NGL321 commented May 31, 2021

A workaround has been given by the Glue team. Since AllocatedCapacity has been deprecated, users should use the NumberOfWorkers and WorkerType fields together. However, those are not presently included in the CDK.

With that in mind, I am changing this to a Feature Request and am investigating what it would take to add those fields as we speak.


@NGL321 NGL321 added effort/small Small work item – less than a day of effort feature-request A feature should be added or improved. p1 and removed guidance Question that needs advice or information. needs-triage This issue or PR still needs to be triaged. labels May 31, 2021
@shivlaks
Contributor

shivlaks commented May 31, 2021

the documentation for the service pattern needs to be updated as well.

@NGL321 - the main question to answer re: adding support is whether the new properties can be specified as part of the state input path (dynamically, rather than as a static value of a fixed type). That is what will determine whether NumberOfWorkers should be a number or a more complex type. Aside from that, WorkerType might need to be an enum.

I'm happy to take a look. Going to have to try this out myself since it seems we have some documentation gaps here.

@johnhill2424 Another workaround would be to bypass the GlueStartJobRun construct and use ASL directly via a custom state escape hatch. You can take the output that GlueStartJobRun generates through its toStateJson method and supplement it with any properties you like before using it to define your custom state.
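
A minimal sketch of the "supplement the generated state JSON" step from this workaround, written without CDK imports so that it runs on its own. The helper name `withWorkerSettings` and the `WorkerSettings` shape are hypothetical, and the `rendered` object is only a hand-written stand-in for whatever toStateJson returns; the patched result is what would then be passed as `stateJson` to a custom state.

```typescript
// Hypothetical helper for illustration; it is not part of the CDK.
interface WorkerSettings {
  numberOfWorkers: number;
  workerType: string; // e.g. 'G.1X'; valid values are defined by the Glue API
}

type StateJson = { [key: string]: unknown };

// Returns a copy of the state with NumberOfWorkers and WorkerType merged into
// its Parameters. The input object is left untouched.
function withWorkerSettings(stateJson: StateJson, settings: WorkerSettings): StateJson {
  const parameters = (stateJson.Parameters ?? {}) as StateJson;
  return {
    ...stateJson,
    Parameters: {
      ...parameters,
      NumberOfWorkers: settings.numberOfWorkers,
      WorkerType: settings.workerType,
    },
  };
}

// Assumed stand-in for the JSON that GlueStartJobRun's toStateJson would produce.
const rendered: StateJson = {
  Type: 'Task',
  Resource: 'arn:aws:states:::glue:startJobRun.sync',
  Parameters: { JobName: 'my-glue-job' },
};

const patched = withWorkerSettings(rendered, { numberOfWorkers: 10, workerType: 'G.1X' });
console.log(JSON.stringify(patched, null, 2));
```

Copying the state instead of mutating it keeps the original construct's output reusable, which may matter if the same task is rendered more than once.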

@joehillen
Contributor

@shivlaks I think you meant @johnhill2424

@shivlaks
Contributor

shivlaks commented Jun 7, 2021

@joehillen thanks for that! corrected.

@shivlaks
Contributor

shivlaks commented Jun 7, 2021

update: checking with the Step Functions team whether they're aware of these changes given that it's still not documented

Remaining

  • model the NumberOfWorkers and WorkerType properties
    • one thing to consider is whether they're supported as dynamic properties (can be supplied through JSON Path) which will determine what types they should be
  • expose properties through GlueStartJobRunProps
  • get Step Functions documentation updated
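
The static-versus-dynamic question in the list above comes down to which key the rendered state uses: in the Amazon States Language, a parameter whose name ends in `.$` is read as a JSON Path into the state input, while a plain name carries a literal value. A rough sketch of one way to model that follows. The `NumberOfWorkersValue` type and the `renderNumberOfWorkers` function are hypothetical names used for illustration only, and this is not how the CDK itself models the property.

```typescript
// Hypothetical model: either a literal number or a JSON Path into the state input.
type NumberOfWorkersValue =
  | { kind: 'static'; value: number }
  | { kind: 'jsonPath'; path: string };

// Renders the Parameters fragment for NumberOfWorkers.
function renderNumberOfWorkers(v: NumberOfWorkersValue): { [key: string]: number | string } {
  if (v.kind === 'static') {
    // Literal value: plain key.
    return { NumberOfWorkers: v.value };
  }
  if (!v.path.startsWith('$')) {
    throw new Error(`JSON Path must start with '$', got '${v.path}'`);
  }
  // Dynamic value: the '.$' suffix tells Step Functions to resolve the path at run time.
  return { 'NumberOfWorkers.$': v.path };
}

console.log(renderNumberOfWorkers({ kind: 'static', value: 10 }));
console.log(renderNumberOfWorkers({ kind: 'jsonPath', path: '$.glue.numberOfWorkers' }));
```

If dynamic values are supported, a plain `number` type would not be enough on its own, which is why the property type depends on the answer.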

@ericsun95

Do we have any updates on this?

@AlJohri

AlJohri commented Dec 10, 2022

Bumping this issue: there are a bunch of parameters in the JobRun structure that aren't exposed in the CDK L2 constructs.

https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-jobs-runs.html

Here is a custom construct my team uses to customize NumberOfWorkers and WorkerType:

import { Construct } from 'constructs';
import * as sfn from 'aws-cdk-lib/aws-stepfunctions';

export interface RetryProps {
    maxAttempts: number;
    backoffRate: number;
}

export interface GlueStartJobRunProps {
    name: string;
    arguments: { [name: string]: string };
    retry: RetryProps;
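    // JSON Path into the state input (e.g. '$.numWorkers'), not a literal value,
    // because it is rendered under the 'NumberOfWorkers.$' key below.
    numWorkers: string;
    // JSON Path into the state input (e.g. '$.workerType'), not a literal value.
    workerType: string;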
}

/**
 * Custom construct for a Glue job. A custom construct is required
 * because the built-in GlueStartJobRun task (aws-stepfunctions-tasks)
 * is missing key features, like the ability to configure
 * NumberOfWorkers and WorkerType.
 */
export class GlueStartJobRun extends sfn.CustomState {
    constructor(scope: Construct, id: string, props: GlueStartJobRunProps) {
        const stateJson = {
            Type: 'Task',
            ResultPath: null,
            Retry: [
                {
                    ErrorEquals: ['States.ALL'],
                    MaxAttempts: props.retry.maxAttempts,
                    BackoffRate: props.retry.backoffRate,
                },
            ],
            Resource: 'arn:aws:states:::glue:startJobRun.sync',
            Parameters: {
                'JobName': props.name,
                'Arguments': props.arguments,
                'NumberOfWorkers.$': props.numWorkers,
                'WorkerType.$': props.workerType,
            },
        };
        super(scope, id, { stateJson: stateJson });
    }
}

@yuntaoL

yuntaoL commented Sep 19, 2023

Is there any update for this issue? Here is a workaround we are using now.

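import { Construct } from 'constructs';
import * as tasks from 'aws-cdk-lib/aws-stepfunctions-tasks';
// Assumption: WorkerType is the class from the Glue alpha module, which exposes `name`.
import { WorkerType } from '@aws-cdk/aws-glue-alpha';

export interface GlueStartJobRunProps extends tasks.GlueStartJobRunProps {
  readonly numberOfWorkers?: number;
  readonly workerType?: WorkerType;
}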

export class GlueStartJobRun extends tasks.GlueStartJobRun {
  constructor(scope: Construct, id: string, private readonly enhancedProps: GlueStartJobRunProps) {
    super(scope, id, enhancedProps);
  }

  // _renderTask is an internal CDK method, so this override may break between CDK versions.
  protected override _renderTask(): any {
    const state = super._renderTask();
    if (this.enhancedProps.workerType) {
      state.Parameters.WorkerType = this.enhancedProps.workerType.name;
    }

    if (this.enhancedProps.numberOfWorkers) {
      state.Parameters.NumberOfWorkers = this.enhancedProps.numberOfWorkers;
    }

    return state;
  }
}

@Fares-Tabet

Any updates on this ?

@mergify mergify bot closed this as completed in #30319 May 27, 2024
mergify bot pushed a commit that referenced this issue May 27, 2024
…StartJobRun class (#30319)

### Issue # (if applicable)

Closes #12757.

### Reason for this change
Missing property


### Description of changes
Add workerType and numberOfWorkers to GlueStartJobRun class.

The reasons for this change are as follows:

* AllocatedCapacity is deprecated.
* MaxCapacity can only be used with Glue version 1 and earlier, which have already reached end of support (EOS).
* Glue version 2 and later use WorkerType and NumberOfWorkers.

For more information, see the documents below.

https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-jobs-runs.html#aws-glue-api-jobs-runs-StartJobRun

https://docs.aws.amazon.com/glue/latest/dg/glue-version-support-policy.html




### Description of how you validated changes
Add unit tests and integ tests.



### Checklist
- [x] My code adheres to the [CONTRIBUTING GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and [DESIGN GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md)

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*