Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aws-stepfunctions-tasks: EmrContainersStartJobRun could not use Role from SF input #21319

Open
2 tasks
SerjChaoz opened this issue Jul 25, 2022 · 2 comments
Open
2 tasks
Assignees
Labels
@aws-cdk/aws-stepfunctions-tasks effort/small Small work item – less than a day of effort feature-request A feature should be added or improved. p2

Comments

@SerjChaoz
Copy link

Describe the feature

EmrContainersStartJobRunProps support readonly executionRole?: iam.IRole;
I was wondering if EmrContainersStartJobRunProps can accept optional executionRoleArn and use it if executionRole not provided. The reason for this request is a limitation in Role.fromRoleArn()

In the Role documentation for fromRoleArn()
https://docs.aws.amazon.com/cdk/api/v1/docs/@aws-cdk_aws-iam.Role.html#static-fromwbrrolewbrnamescope-id-rolename

If the imported Role ARN is a Token (such as a CfnParameter.valueAsString or a Fn.importValue()) 
and the referenced role has a path (like arn:...:role/AdminRoles/Alice), 
the roleName property will not resolve to the correct value. Instead it will resolve to the first path component. 
We unfortunately cannot express the correct calculation of the full path name as a CloudFormation expression. 
In this scenario the Role ARN should be supplied without the path in order to resolve the correct role resource.

I tried different options but all of them failed during build/deployment. I was able to make it work using CustomState.
Without this feature it is not possible to use EmrContainersStartJobRun with roles extracted from StepFunction input.

Use Case

This is needed because "cdk deploy" throws an error when Role must be picked from SF input:

executionRole: Role.fromRoleArn(this, "EmrExecutionRole", JsonPath.stringAt("$.executionRoleArn")),

Code snippet and error message provided under "Other Information"

Proposed Solution

In order to handle Arn from JSON Input I suggest to add extra property to EmrContainersStartJobRunProps:

export interface EmrContainersStartJobRunProps extends sfn.TaskStateBaseProps {
    ...
    /**
     * The execution role for the job run.
     *
     * If `virtualClusterId` is from a JSON input path, an execution role must be provided.
     * If an execution role is provided, follow the documentation to update the role trust policy.
     * @see https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/setting-up-trust-policy.html
     *
     * @default - Automatically generated only when the provided `virtualClusterId` is not an encoded JSON path
     */
    readonly executionRole?: iam.IRole;
    /**
     * The execution role arn for the job run.
     *
     * Used when both `virtualClusterId` and `executionRoleArn` needs to be taken from a JSON input path.
     * Conflicts with `executionRole` when both provided.
     */
    readonly executionRoleArn?: String;
    ...

Other Information

Code:

export class SparkSubmitJobDriverClass implements SparkSubmitJobDriver {
    entryPoint: TaskInput;
    entryPointArguments: TaskInput;
    sparkSubmitParameters?: string;

    constructor(entryPoint: TaskInput, entryPointArguments: TaskInput, sparkSubmitParameters: string) {
        this.entryPoint = entryPoint;
        this.entryPointArguments = entryPointArguments;
        this.sparkSubmitParameters = sparkSubmitParameters;
    }
}

export class JobDriverClass implements JobDriver {
    sparkSubmitJobDriver: SparkSubmitJobDriver;

    constructor(sparkDriver: SparkSubmitJobDriver) {
        this.sparkSubmitJobDriver = sparkDriver;
    }
}

export class EmrMonitoring implements Monitoring {
    logGroup: ILogGroup;
    logStreamNamePrefix: string;
    persistentAppUI: boolean;

    constructor(logGroup: ILogGroup, logStreamNamePrefix: string, persistentAppUI: boolean) {
        this.logGroup = logGroup;
        this.logStreamNamePrefix = logStreamNamePrefix;
        this.persistentAppUI = persistentAppUI;
    }
}

const startJobRunSync = new EmrContainersStartJobRun(this, "StartJobRunSync", {
    virtualCluster:  VirtualClusterInput.fromVirtualClusterId(JsonPath.stringAt(`$.virtualClusterId`)),
    jobName: JsonPath.stringAt(`$.name`),
    // JsonPath returns a $Token which should be picked up from execution input
    executionRole: Role.fromRoleArn(this, "EmrExecutionRole", JsonPath.stringAt("$.executionRoleArn")),
    // hardcoded role works
    // executionRole: Role.fromRoleArn(this, "EmrExecutionRole", "arn:aws:iam::0123:role/SomeExistingRole"),
    releaseLabel: new ReleaseLabel(JsonPath.stringAt(`$.releaseLabel`)),
    jobDriver: new JobDriverClass(new SparkSubmitJobDriverClass(
        TaskInput.fromJsonPathAt(`$.entryPoint`),
        TaskInput.fromObject(JsonPath.listAt(`$.entryPointArgs`)),
        JsonPath.stringAt("$.sparkSubmitParameters"))),
    monitoring: new EmrMonitoring(
        LogGroup.fromLogGroupName(this, "StartJobRunLogGroup", JsonPath.stringAt(`$.logGroupName`)),
        JsonPath.stringAt(`$.logStreamNamePrefix`),
        true)
});

This throws an error during "cdk deploy":

Stack failed: Error [ValidationError]: Template error: Fn::Select  cannot select nonexistent value at index 4

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

CDK version used

2.14.0

Environment details (OS name and version, etc.)

macOS Big Sur 11.6.6

@SerjChaoz SerjChaoz added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Jul 25, 2022
@peterwoodworth peterwoodworth added p2 effort/small Small work item – less than a day of effort and removed needs-triage This issue or PR still needs to be triaged. labels Sep 1, 2022
@msambol
Copy link
Contributor

msambol commented Aug 29, 2023

@peterwoodworth Can you assign this to me?

@peterwoodworth
Copy link
Contributor

Got it 👍🏻 Thanks for contributing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/aws-stepfunctions-tasks effort/small Small work item – less than a day of effort feature-request A feature should be added or improved. p2
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants