Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thoughts about resource IDs #1424

Closed
eladb opened this issue Dec 23, 2018 · 22 comments
Closed

Thoughts about resource IDs #1424

eladb opened this issue Dec 23, 2018 · 22 comments
Labels
@aws-cdk/core Related to core CDK functionality guidance Question that needs advice or information.

Comments

@eladb
Copy link
Contributor

eladb commented Dec 23, 2018

@abrooksv provided some feedback on resource IDs:

Construct name must be unique within a subtree. When joined with the parent construct it will also be unique. Is there a reason why we append random characters to the end of a logical name?

If my parent Construct is called “CreateReealseEAP” and my construct is “BuildNotificationFailed”, why do we append “4402183C”? It leads to harder to read resource names that are already pretty long. For human readable resources such as CodeBuild project and Pipelines this leads me to using the name override features. According to the docs, it mentions that it is done due to the delimiter can’t be added into the logical ID. If I am understanding this right, it seems like a rare edge case between two separate constructs Foo-BarLambda vs FooBar-Lambda causing a collision. This seems like such a small edge case that we could handle it by adding in the suffix in that case only, and not in the common case.

One of the main value propositions of the CDK is to enable composition of abstract cloud architectures into higher level abstractions, stacks and apps. Since CloudFormation's logical ID namespace is flat, we needed a way to make sure that resource IDs do not collide when reusable components from multiple uncoordinated sources (i.e. 3rd parties, etc) are composed together and synthesized into the same eventual CloudFormation template.

The natural solution would have been to use a delimited full path as the logical ID of each resource. However, this wouldn't have worked because CloudFormation puts rather strict constraints on logical IDs (only alphanumeric, limited in length).

We couldn't only add a suffix in the case where there is a conflict because this means that logical IDs would stop being stable: users might add a resource and suddenly some other resource would have their logical ID change. That's unacceptable behavior, since a change of logical IDs mean recreation of physical resources - definitely something we can't afford.

We've explored options with CloudFormation, and realized they have good reasons for these constraints and that changing them would not be possible without a huge breakage potential across their huge surface area. To overcome those limitations we came up with this model where we render logical IDs that contain both a “human” part and a hash of the full delimited path. The human part is rendered by concatenating the path and making it “nice” while the hash is basically a short MD5 hash, to ensure uniqueness within the template.

CreateReleaseEAPBuildNotificationFailed4402183C vs CreateReleaseEAPBuildNotificationFailed
The concatenating leads to resource names that are arguably “worthless”. Some of this is CFN’s fault due to the stack prefix, but the long logical IDs don’t help.
: Example: I have 2 sibling Lambdas that have the Construct tree Stack->Pipeline->Stage->Action->Lambda since that is logically how it is laid out. This leads to two Lambdas:

  • sam-dev-AWSToolk-AwsToolkitJetBrainsRelea-XX8M2X3P1ET7
  • sam-dev-AWSToolk-AwsToolkitJetBrainsRelea-1EHAWJ25E4M58
    There is no way to differentiate the Lambdas without adding in descriptions. Descriptions are not useful when attempting to find the metrics in CloudWatch or the log streams though.

Yes, definitely not a very useful set of names. Would love to explore any ideas you might have on helping resolve this. Things that come to mind:

  • Auto-tag the resources using construct full path so people can at least look them up as tags?
  • Allow you to override the human part of the logical ID. We still need the hash suffix for the reasons I outlined above, but the human part can be anything. What would be a nice API to do that? I am not sure.
  • Render the human part of the logical ID in reverse order - this might make it easier to distinguish between multiple resources that are in the same level of the tree...

The full Logical name for one of the Lambdas is:

AwsToolkitJetBrainsReleasePreReleaseCloseReleaseGateCloseReleaseGateLambdaC27581F5. It is full of duplicates because the logical name for each construct should be descriptive enough on its own for it to understand what it does.

  1. Pipeline: AwsToolkitJetBrainsRelease
  2. Stage: PreRelease
  3. Action: CloseReleaseGate
  4. Lambda: CloseReleaseGateLambda

I'd argue that the “full logical ID” is:

AwsToolkitJetBrainsRelease/PreRelease/CloseReleaseGate/CloseReleaseGateLambda

and I wonder, for example, why not use shorter logical IDs? (I know it won't solve your problem, but maybe eliviate it in certain cases):

AwsToolkitJetBrains/PreRelease/CloseGate/Lambda*

This makes me think the Construct tree should never be more than 2-3 layers deep at the most in order to prevent the long name. What are the teams’ thoughts on this?

As mentioned above, this will create a limitation that we can't have. Being able to use arbitrarily deep trees is probably the single most important part of the CDK, as this is how abstractions are composed.

If we use a name override (such as codebuild Project.withProjectName()) in order to alleviate issues mentioned above, and changing the parent constructs name lead to a logical name change, would this lead to a stack that can no longer be deployed due to the logical names getting altered but the resource name override remaining the same? Due to the long names mentioned above, I could see people commonly overriding the name for resources such as Lambdas and Projects.

Yes, that's a CloudFormation limitation that we can't really get around. If you use a physical name, CFN won't be able to replace it with another resource with the same physical name.

Again, this brings up the idea of perhaps allowing you to override only the human part of the logical ID... Definitely worth exploring this direction.

@eladb eladb added ❓ question @aws-cdk/core Related to core CDK functionality labels Dec 24, 2018
@eladb eladb changed the title Thoughts about the human part of logical IDs Thoughts about resource IDs Dec 24, 2018
@srchase srchase added guidance Question that needs advice or information. and removed question labels Jan 3, 2019
@mipearson
Copy link

I’d want to go further and allow users to override the entire logical ID as a standard feature of constructs, with the caveat that if another resource ends up using the same ID, it’s up to the user to deal with that.

This would have the same effect as the renaming “hack”, without needing to put the renames at the top of the stack.

I see it as a the stack author signposting “I care about the root CFN resource this construct creates outside of this app”

@eladb
Copy link
Contributor Author

eladb commented Jan 15, 2019

Yes. The current experience is not good enough.

How about something like resource.overrideLogicalId(newId). We can implement something like that if logicalId would be a lazy token instead of a concrete value.

@abrooksv
Copy link

I think I can achieve want by providing my own naming system instead of using overrides everywhere but I have not tried it yet:

https://awslabs.github.io/aws-cdk/refs/_aws-cdk_cdk.html?highlight=iaddressingscheme#logicalids

But is there a way to get the actual Resource from inside of https://awslabs.github.io/aws-cdk/refs/_aws-cdk_cdk.html?highlight=iaddressingscheme#@aws-cdk/cdk.IAddressingScheme ?

@mipearson
Copy link

@eladb yep, I like that as a solution - better than having to put them all at the top of the stack.

Potential use case is using CDK to generate templates for other teams to consume, and in that case "clean" resource IDs will matter.

@eladb
Copy link
Contributor Author

eladb commented Mar 4, 2019

#1670 and #1595 should provide a pretty good solution for customizing logical IDs

@cal5barton
Copy link

#1670 and #1595 should provide a pretty good solution for customizing logical IDs

@eladb do you know if there are any plans to make this easier to override on a global level? It would be nice to not have to go down to the Cfn construct level for every single resource that we create.

@skinny85
Copy link
Contributor

#1670 and #1595 should provide a pretty good solution for customizing logical IDs

@eladb do you know if there are any plans to make this easier to override on a global level? It would be nice to not have to go down to the Cfn construct level for every single resource that we create.

You can also do it by subclassing Stack and overriding the allocateLogicalId method: https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_core.Stack.html#protected-allocate-wbr-logical-wbr-idcfnelement

@cal5barton
Copy link

#1670 and #1595 should provide a pretty good solution for customizing logical IDs

@eladb do you know if there are any plans to make this easier to override on a global level? It would be nice to not have to go down to the Cfn construct level for every single resource that we create.

You can also do it by subclassing Stack and overriding the allocateLogicalId method: https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_core.Stack.html#protected-allocate-wbr-logical-wbr-idcfnelement

Thanks for the link to that. Unfortunately, it doesn't produce the behavior that I want. I'd like to strip or remove the auto-generated hash completely and allow for the logical id to be the id that I pass in the constructor of the resource. Basically, by overriding that method, I'm replacing the auto-generated hash with a hash of my own. It would be nice to disable that functionality altogether and let us assume the risk of collisions of logicalIds.

@skinny85
Copy link
Contributor

#1670 and #1595 should provide a pretty good solution for customizing logical IDs

@eladb do you know if there are any plans to make this easier to override on a global level? It would be nice to not have to go down to the Cfn construct level for every single resource that we create.

You can also do it by subclassing Stack and overriding the allocateLogicalId method: https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_core.Stack.html#protected-allocate-wbr-logical-wbr-idcfnelement

Thanks for the link to that. Unfortunately, it doesn't produce the behavior that I want. I'd like to strip or remove the auto-generated hash completely and allow for the logical id to be the id that I pass in the constructor of the resource. Basically, by overriding that method, I'm replacing the auto-generated hash with a hash of my own. It would be nice to disable that functionality altogether and let us assume the risk of collisions of logicalIds.

I'm not sure what you mean by that... here's an example implementation:

export class TestStack extends Stack {
    protected allocateLogicalId(cfnElement: CfnElement): string {
        return cfnElement.node.id;
    }
}

Note you need to be aware the separation between L2s (which you create explicitly) and L1 classes (those starting with the Cfn prefix), which are created implicitly for you with logical IDs like Resource.

@cal5barton
Copy link

cal5barton commented Apr 24, 2020

#1670 and #1595 should provide a pretty good solution for customizing logical IDs

@eladb do you know if there are any plans to make this easier to override on a global level? It would be nice to not have to go down to the Cfn construct level for every single resource that we create.

You can also do it by subclassing Stack and overriding the allocateLogicalId method: https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_core.Stack.html#protected-allocate-wbr-logical-wbr-idcfnelement

Thanks for the link to that. Unfortunately, it doesn't produce the behavior that I want. I'd like to strip or remove the auto-generated hash completely and allow for the logical id to be the id that I pass in the constructor of the resource. Basically, by overriding that method, I'm replacing the auto-generated hash with a hash of my own. It would be nice to disable that functionality altogether and let us assume the risk of collisions of logicalIds.

I'm not sure what you mean by that... here's an example implementation:

export class TestStack extends Stack {
    protected allocateLogicalId(cfnElement: CfnElement): string {
        return cfnElement.node.id;
    }
}

Note you need to be aware the separation between L2s (which you create explicitly) and L1 classes (those starting with the Cfn prefix), which are created implicitly for you with logical IDs like Resource.

@skinny85 I've tried and tried to override this method with the code you provided but cfnElement.node.id just returns Resource, S3VersionKey, CFDistribution, etc.... I would expect the id to be what was passed via the constructor when creating a certain construct. I've debugged several other properties on the node object and don't see where the id I pass is located.

@skinny85
Copy link
Contributor

skinny85 commented Apr 27, 2020

Yes. That's why I said:

Note you need to be aware the separation between L2s (which you create explicitly) and L1 classes (those starting with the Cfn prefix), which are created implicitly for you with logical IDs like Resource.

You can try something like:

    protected allocateLogicalId(cfnElement: CfnElement): string {
        const allScopes = cfnElement.node.scopes;
        const stackIndex = allScopes.indexOf(cfnElement.stack);
        const resourceScopes = allScopes.slice(stackIndex + 1);
        return resourceScopes
            .map(cons => cons.node.id)
            .reduce((id, acc) => id + acc, '');
    }

to try and get rid of the hashes at the end of the logical IDs, while having a better chance of them being unique.

Feel free to come up with your own algorithm as you see fit (for example, you might want to filter out generic IDs in the scopes like Resource, Default, etc.).

@RoKish
Copy link

RoKish commented Apr 29, 2020

@skinny85 This is also an issue for me because I want to migrate to CDK from a simple CloudFormation template. I was under the impression that I would be able to describe a CDK stack which would generate a template similar to what I have. Because of the hash codes at the end of the names of the resources, the tables/buckets will be replaced, which I want to avoid. So, I don't want to generate a new unique logical id, I just want to use the one I gave in the constructor.

@skinny85
Copy link
Contributor

@RoKish for that, you have a few options:

  1. If you use the Layer 1 classes (those that start with Cfn*), and you give Stack as the parent, the logical ID will be preserved:
import * as cdk from '@aws-cdk/core';
import s3 = require('@aws-cdk/aws-s3');

export class MyStack extends cdk.Stack {
    constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
        super(scope, id, props);

        new s3.CfnBucket(this, 'MyBucket')
    }
}

results in the template:

Resources:
  MyBucket:
    Type: AWS::S3::Bucket
    Metadata:
      aws:cdk:path: MyStack/MyBucket
  1. If you'd rather use the Layer 2 classes, you can use the overrideLogicalId method on the underlying L1 classes that the the L2s create:
import * as cdk from '@aws-cdk/core';
import s3 = require('@aws-cdk/aws-s3');

export class MyStack extends cdk.Stack {
    constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
        super(scope, id, props);

        const bucket = new s3.Bucket(this, 'MyBucket');
        const cfnBucket = bucket.node.defaultChild as cdk.CfnResource;
        cfnBucket.overrideLogicalId('MyBucket');
    }
}

results in the template:

Resources:
  MyBucket:
    Type: AWS::S3::Bucket
    UpdateReplacePolicy: Retain
    DeletionPolicy: Retain
    Metadata:
      aws:cdk:path: MyStack/MyBucket/Resource

Hope this helps!

@RoKish
Copy link

RoKish commented Apr 30, 2020

@skinny85 Thank you! The second should work for me.

@skinny85
Copy link
Contributor

NP, glad I could help 🙂

@SKIIDK
Copy link

SKIIDK commented Jun 2, 2020

@skinny85 How would you go about overriding allocateLogicalId in Python? Directly translating the code doesn't seem to work for me.

@mgwidmann
Copy link

mgwidmann commented Jul 21, 2020

This strategy continues to be a huge burden to an iterative development process. Creating a construct tree path of Stack -> MyLambda produces something like StackMyLambdaABC123 as a logical ID. Once we've rolled this out to production and have come back to development we are left with a problem.

If we wrap MyLambda with a construct which does nothing except create a new MyLambda(this) construct, the path has changed to StackMyConstructMyLambdaABC123, resulting in cloud formation destroying the resource and creating a new one. This is frequent in iterative development as refactors happen continuously. This sounds like its fine for a lambda and questionable for other resources but its actually problematic even for lambdas as well as other resources which don't contain any state. The reason for that is because cloud formation will create the new resource first before deleting to make rollbacks easier. This leads to a name conflict, since the "primary key" in cloud formation changed, but the "primary key" in the service did not. The root cause of this issue is that cloud formation tracks a different primary identifier than the underlying services do, the difference between Logical ID and Physical ID, and I understand CDK can't change that, but this strategy magnifies the issue more than if you were setting logical IDs yourself manually like you would with JSON or YAML. The only workaround is to put it back to how it was (not often a viable choice) or to rename your resource to be called something else (which sometimes can be an issue too). The last workaround is to roll out a change which deletes the resource (by commenting it out) followed by one which recreates it (by uncommenting it out); this is also not a viable strategy most times as it leaves the production system without a component of the system for some period of time.

I don't think any one solution works here, as has been discussed both have drawbacks to one side or another. I believe the default behavior should be that the ID (second parameter to all constructs) should be used as the logical ID, since they're typically hard coded or something hard coded plus some suffix for most end-user CDK code which is written. This leaves the issue of when "reusable components from multiple uncoordinated sources (i.e. 3rd parties, etc) are composed together". This is where the generation strategy which is the current default should be applied. The only time this issue comes into play again is when you wrap a construct which contains the third party construct and so happens way less frequently. Other strategies could be introduced potentially, but I believe that there is no one size fits all here and that each construct should decide how it sets the logical Ids for its children with the stack's default to being just returning the ID of the construct.

@laimonassutkus
Copy link

Please add a functionality to use logical ids as resource ids (second parameter to all constructs).

@shellscape
Copy link

I second the request from @laimonassutkus. While I understand that the intent of the current ID scheme is meant to protect us from ourselves and ensure uniqueness, it's burdensome for those of us who don't need the extra guards.

@chrisandrewcl
Copy link

chrisandrewcl commented Mar 29, 2021

Meanwhile, this should help:

class EnforceLegacyLogicalIds {
  visit(element) {
    if (element.node.id && element.node.defaultChild) {
      element.node.defaultChild.overrideLogicalId(element.node.id);
    }
  }
}

export { EnforceLegacyLogicalIds };
Aspects.of(stack).add(new EnforceLegacyLogicalIds());

This will use constructs' IDs (the second parameter) as the logical ID. Besides the default child, other L1 constructs created by L2 and L3 constructs still need to be manually renamed.

@ruebroad
Copy link

Terraform doesn't appear to have this issue. Cloudformation has so many issues that it makes no sense to have created CDK as a wrapper around it. Why not implement CDK as a wrapper around the API and drop the Cfn entirely? I appreciate this does nothing to help the discussion but I still find it very difficult to understand why anyone continues to use Cfn or CDK when there are such better alternatives.

@jk2l
Copy link

jk2l commented Nov 30, 2022

i also agree this create lot of issues in term of refactoring. when code start small with only one or two resources it look simple, but as the project grow over the years, if i ever want to create L3 construct this will trigger a new CDK path and completely create new resources. the solution i end up is create L3 construct but not actual construct. however this feel kinda bad design overall

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/core Related to core CDK functionality guidance Question that needs advice or information.
Projects
None yet
Development

No branches or pull requests