Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC655: Enhanced L1s #657

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open

RFC655: Enhanced L1s #657

wants to merge 8 commits into from

Conversation

aaythapa
Copy link

This is a request for comments about enhancing L1s in CDK. See #655 for additional details.

APIs are signed off by @alias.


By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache-2.0 license

@aaythapa aaythapa marked this pull request as ready for review November 19, 2024 17:30
@aaythapa aaythapa changed the title RFC 655: Enhanced L1s RFC655: Enhanced L1s Nov 26, 2024
Copy link

@mikejgray mikejgray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's exciting to see big changes being proposed in the CDK! Thank you for the work on this one.

Comment on lines 86 to 99
Currently, L1 constructs in the CDK perform minimal validation on properties, mostly ensuring type correctness. We can extend these validations by leveraging additional data from our sources. Depending on the property type, various validation rules can be applied, as defined in the schema:

1. String
1. `minLength`: min length the string can be
2. `maxLength`: max length the string can be
3. `pattern`: the pattern the string must match (commonly used for arns)
4. `enum`: the enum values the string can be, this likely won’t be part of the validation as we’ll have an enum type
5. `const`: the constant the string must be
2. Array
1. `minItems`: min number of items in the array
2. `maxItems`: max number of items in the array
3. Integer
1. `minimum`
2. `maximum`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be a huge help. It's immensely frustrating to have so many checks in your CDK code only to have a simple validation check fail in Cloudformation. Thank you for recommending this!

Comment on lines +141 to +150
const func = new lambda.CfnFunction(this, 'test', {
code: {
...
},
handler: 'index.handler',
role: r.attrArn,
runtime_v2: lambda.CfnFunction.Runtime.PYTHON310, // new V2 property
// if both runtime and runtime_v2 are set we can throw an error
timeout: 30
})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of a better way to manage the backwards compatibility issues here, but I'm not crazy about the possibility of having runtime_v2 defined but not runtime. Can't think of specific examples but I get the impression validation becomes a little more difficult - if you must have a property defined, now you have to account for that property and its _v2 variant, which could introduce issues for certain use cases.


### POC

There were some internal experiments done around this idea before. We concluded that generating `ICfn<Resource>` is fairly straight forward but actually backporting the interface to our existing L2 API is manual and runs the risk of accidentally creating breaking changes e.x turning a public property `IBucket` into `ICfnBucket` (since users could be referencing `IBucket` -specific properties) or renaming a parameter to an API can cause breaking changes in Python. If we move forward with this we'll continue the work from the experiments while keeping the findings in mind.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cautiously optimistic about this approach, and definitely curious about the findings.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can look at this branch for a deeper look on this POC: https://github.com/aws/aws-cdk/tree/conroy/generate

@johnf
Copy link

johnf commented Dec 18, 2024

I love everything being proposed here. This will bring significant improvements!

Except for the _v2 suggestion. I understand why this approach is being suggested, but it doesn't feel right to me.

I think this adds cognitive load to the developer, I have to now think about wether to use _v2 or not. Also, does this stay around forever, or does it get dumped in a future change? DO we end up with _v3 down the track?

My suggestion: Bite the bullet, make a breaking change and bump the major version.

A possible approach to help, if we release today

  • release aws-cdk-lib 3.174.0 and 2.174.0.
  • Keep the versions in sync
  • The only difference would be v3 has the prop_v2 logic written out to prop
  • So this should be automatable

Some alternatives I can think of

  • Make a breaking change and release a codemod to help everyone upgrade their code
  • Typescript magic: pass in something like cdk_type_version: 2 into every prop, which gives me the new types - in the future, we could make this go away. I suspect there is some Typescript magic that might allow us to set something globally and have this still work

@rehanvdm
Copy link

I love the idea of having Interfaces for the CFN L1s and improving the L1-L2 interoperability.

I agree with @johnf, it has been a few years since a major release and there are many Flags in cdk.json and @deprecated functions. Including this big change, it justifies a new major release for me.

@mrgrain
Copy link
Contributor

mrgrain commented Dec 19, 2024

At re:Invent 2024 I had the opportunity to discuss this with a number of AWS Heroes: @hoegertn @Osterjour @pgarbe This is a write-up of this discussion from my perspective. Please add to it as required.

It was questioned if enums are necessary in L1s as they don't seem to add much value and take L1s further away from what the CFN is. It was also noted that the use of _v2 is confusing and annoying. In particular deploying this pattern for enums will make it a regular occurrence across the L1 library, as opposed to using the suffix only in situations when an upstream type change is breaking. In response to this, it was noted that old properties would be marked as deprecated and that it is an IDE and documentation concern to ensure that users are not flooded with multiple of the same props to choose from.

The discussion then dived into what L1s are representing, what they should be representing and what they could be representing in future. There was consensus that L1 are currently roughly equal to CFN, however it was not clear what that means exactly. As an example, Validations were brought up which are currently not part of L1s but having them would make them arguably closer to what CFN does. Another example was IAM grants. While the group agreed that this is not currently part of CFN, it would be very useful if grants can be auto-generated from a data source.

Based on this, the idea was proposed that L1s could be everything that can be auto generated (i.e. not hand-written) from a data source. It was suggested that maybe this should be separate layer and the proposed features in the RFC plus future features like grants could be introduced as a new layer between the current L1 and L2. However concerns were raised that this was also confusing.

End of the discussion.


### Recommended approach

Adding new property to the existing L1s would be the recommended approach. The suffix of `_v2` should make it clear which property to use and alleviate any confusion. We can also deprecate the older properties to signal to developers to use the new one. By deprecating the older properties rather than removing them, we ensure that existing code continues to function without breaking. Developers can still use the original properties, while being encouraged to adopt the new "_v2" version at their own pace.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love the more thorough validations added with the enhanced L1s!

One question, how does this work when the original prop is required? For example in CfnTableProps, the keySchema property is required, something like below:

interface CfnTableProps {
  keySchema: FooType
}

After enhancing L1, we cannot make both original and v2 prop required, because users only need to set one.

// NOT WORK
interface CfnTableProps {
  keySchema: FooType
  keySchema_v2: FooType
}

So I think we have to make the interface something like this:

interface CfnTableProps {
  keySchema?: FooType
  keySchema_v2?: FooType
}

But imho this is actually worse than the original L1, because it lost the information that keySchema is a required property. (of course we can do runtime validation, but type check is faster.)

This is another reason why I'm not a big fan of option 1. I like option 3 better with the upgrade of aws-cdk-lib to v3 (and possibly aws-cdk v3) as @johnf said. With cdk v3, we can also remove all the feature flags that have been added since the v2 release, which has been a burden for construct library maintainers to validate their code with various possible flags.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good example, my thought was we could do runtime validation (making sure one of these is set) but as you mentioned type check is faster. I'll add this example to the cons for option 1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn this stuff is hard!

For option 1, I agree that the loss of strong typing (turning required properties to optional) is a major drawback.

But option 1 is tricky for another reason: you will likely want to deprecate the legacy properties and keep only the modern versions. In the above example, it means that you will need to delete keySchema and only leave keySchema_v2 which is not a good outcome. If you end up modernizing keySchema, you'll need to remove keySchema_v2 and then you will penalize the users who migrated to the modern version earlier by breaking or deprecating their code.

I would say the best approach would be to go with option 3 and couple this feature with a new major version (v3) of the CDK, which will allow you to break the existing APIs. When we planned v2 we had a GitHub issue where we accumulated these major breaking changes, and at some point we got to a critical mass that justified a new major version.

If v3 is too far away and there's a very strong push to introduce this capabilities sooner, I think option 2 is a reasonable tradeoff albeit there's a chance it will cause the library to be too large.

I am wondering if perhaps as an intermediate step (before a major version is released), you can just add better documentation and runtime (synth-time) checks that the value is valid. This will go a long way to shift-left errors without breaking. In TypeScript, you could even use a type union like "foo" | "bar" | "baz" to improve typings without impacting JSII languages.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you end up modernizing keySchema, you'll need to remove keySchema_v2 and then you will penalize the users who migrated to the modern version earlier by breaking or deprecating their code.

By modernizing keySchema do you mean if/when we go to CDK V3 we'd have to go back to keySchema?

I am wondering if perhaps as an intermediate step (before a major version is released), you can just add better documentation and runtime (synth-time) checks that the value is valid.

Updating the docs is a great idea and the implementation should be fairly straight forward. For the runtime validation, is it possible to do without introducing a new property and without breaking compatibility?

When we planned v2 we had a GitHub issue where we accumulated these major breaking changes?

Do you have a link to this github issue? Couldn't find it through a quick search 😅

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you end up modernizing keySchema, you'll need to remove keySchema_v2 and then you will penalize the users who migrated to the modern version earlier by breaking or deprecating their code.

By modernizing keySchema do you mean if/when we go to CDK V3 we'd have to go back to keySchema?

Yes. In the long run that's the "right" name.

I am wondering if perhaps as an intermediate step (before a major version is released), you can just add better documentation and runtime (synth-time) checks that the value is valid.

Updating the docs is a great idea and the implementation should be fairly straight forward. For the runtime validation, is it possible to do without introducing a new property and without breaking compatibility?

I believe it should be possible to only add runtime validations instead of introducing new types.

When we planned v2 we had a GitHub issue where we accumulated these major breaking changes?

Do you have a link to this github issue? Couldn't find it through a quick search 😅

https://github.com/aws/aws-cdk-rfcs/blob/main/text/0079-cdk-2.0.md

That's the relevant RFC

@go-to-k
Copy link

go-to-k commented Dec 20, 2024

This is an excellent suggestion. It would be especially nice to have the built-in validation in L1 by default.

And most of my opinions have already been mentioned above. Introducing V2 for each parameter could lead to complexity (props become bloated, required properties become optional, etc.).

Also, many users have created a mechanism to check against L1 resource properties using Aspects, and I'm concerned that the addition of this v2 property might break that (existing) mechanism.

Therefore, I agree with the opinion to upgrade with CDK V3 (or aws-cdk-lib-v2 mentioned as option.3) as they say.

@aaythapa
Copy link
Author

aaythapa commented Dec 23, 2024

Thank you all for the feedback everyone!

The general consensus seems to be that the features are valuable and would make a great addition to the L1s. However, the proposed implementation (_v2 suffix on the new properties), compromises UX while aiming to maintain backward compatibility. Most would prefer moving forward with CDK V3 instead.

I'll share this feedback with the team and provide updates here as they come. In the meantime, please feel free to share any additional comments or suggestions. Thanks again

text/0655-enhanced-l1.md Show resolved Hide resolved
text/0655-enhanced-l1.md Show resolved Hide resolved

The `aws-service-spec` currently integrates various data sources (listed [here](https://github.com/cdklabs/awscdk-service-spec?tab=readme-ov-file#data-sources)), combining them into a single JSON file that serves as the source of truth for L1 generation. This JSON file is then published to NPM. The CDK repository uses this file in conjunction with the `spec2cdk` script ([here](https://github.com/aws/aws-cdk/tree/main/tools/%40aws-cdk/spec2cdk)) to generate L1s. While we can enhance the CDK L1s with enums and validations by adjusting our generation code, we first need to ensure the necessary data for this generation is available.

Within our current list of data sources we can find some enums and validation information. The registry schema, which forms the base of our source of truth, already includes some enums and validation information. We just need to modify the service spec to incorporate this information into the published JSON file. We can then expand our data sources to improve coverage. `cfn-lint` is a command line tool that helps validate CFN templates for syntax and best practices. It contains a schema that builds on the CFN Resource Provider Schemas by adding additional information from various SDKs, then repackages these schemas for distribution. We can include the information provided by `cfn-lint` into our source of truth to achieve better coverage.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does cfn-lint get the data from? Any reason not to include this data source directly?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cfn-lint already has a process built that pulls the data from the AWS SDK (specifically boto3). Idea is to not re-implement that ourselves. CFN Lint has similar data requirements/usage, so (imo) makes more sense to re-use. Ideally this data should just be in the CFN schemas per service directly, which is something we can work with cfn-lint and CFN on directly. That is outside of this, but helps with the full picture.


Within our current list of data sources we can find some enums and validation information. The registry schema, which forms the base of our source of truth, already includes some enums and validation information. We just need to modify the service spec to incorporate this information into the published JSON file. We can then expand our data sources to improve coverage. `cfn-lint` is a command line tool that helps validate CFN templates for syntax and best practices. It contains a schema that builds on the CFN Resource Provider Schemas by adding additional information from various SDKs, then repackages these schemas for distribution. We can include the information provided by `cfn-lint` into our source of truth to achieve better coverage.

In cases where a schema doesn’t contain the required information, it doesn’t matter whether we retrieve this data from a JSON file or by executing an AWS CLI command. For instance this PR updates the enum values available for a property based on the output of an AWS CLI command defined in the docs for that property. We can build workflows that run these commands and update specific properties with new data. These workflows should support all necessary AWS CLI commands and allow flexible specification of which command to run and which property to update.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK AWS CLI is based on the Smithy service definitions. Would it make sense to pull these as a data source and cross reference with the CFN service spec?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is (more or less) what cfn-lint is doing (see above comment) but packaging it as CFN schemas that we can just leverage instead of doing ourselves.

text/0655-enhanced-l1.md Show resolved Hide resolved
1. Drawback: The UX may be confusing as developers have to decide which version of a property they want to use
2. Create new L1 files in the existing library: Here, the suffix would be added to the file name rather than the properties, e.g., `aws_lambda` vs. `aws_lambda_v2`.
1. Drawback: This would overload the current library and may be redundant in some cases as not all properties have a _v2 so we would essentially be copying over the same thing to the new `_v2` file. For libraries with `_v2` files there may be very minimal changes.
3. Generate a new library with the new L1s: The suffix would be applied to the library name itself, e.g., `aws-cdk-lib` vs. `aws-cdk-lib-v2`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's enough "meat" for CDKv3, then I would recommend choosing this approach, but bear in mind that a major version is a big project (v2 tool us almost two years to release).

It is also okay to punt this particular enhancement until you are ready to release v3 given its breaking nature.


### Recommended approach

Adding new property to the existing L1s would be the recommended approach. The suffix of `_v2` should make it clear which property to use and alleviate any confusion. We can also deprecate the older properties to signal to developers to use the new one. By deprecating the older properties rather than removing them, we ensure that existing code continues to function without breaking. Developers can still use the original properties, while being encouraged to adopt the new "_v2" version at their own pace.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn this stuff is hard!

For option 1, I agree that the loss of strong typing (turning required properties to optional) is a major drawback.

But option 1 is tricky for another reason: you will likely want to deprecate the legacy properties and keep only the modern versions. In the above example, it means that you will need to delete keySchema and only leave keySchema_v2 which is not a good outcome. If you end up modernizing keySchema, you'll need to remove keySchema_v2 and then you will penalize the users who migrated to the modern version earlier by breaking or deprecating their code.

I would say the best approach would be to go with option 3 and couple this feature with a new major version (v3) of the CDK, which will allow you to break the existing APIs. When we planned v2 we had a GitHub issue where we accumulated these major breaking changes, and at some point we got to a critical mass that justified a new major version.

If v3 is too far away and there's a very strong push to introduce this capabilities sooner, I think option 2 is a reasonable tradeoff albeit there's a chance it will cause the library to be too large.

I am wondering if perhaps as an intermediate step (before a major version is released), you can just add better documentation and runtime (synth-time) checks that the value is valid. This will go a long way to shift-left errors without breaking. In TypeScript, you could even use a type union like "foo" | "bar" | "baz" to improve typings without impacting JSII languages.

readonly artifactBucket?: s3.ICfnBucket;
}
```
If those locations use a functionality from the L2 interface, we can do an upconversion to an `IBucket` by implementing an API that converts an `ICfn<Resource>` into an `I<Resource>`. We do a similar operation when a resource that isn’t created in the current CDK stack is referenced. This needs to done to maintain backwards compatibility and not break existing customers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if perhaps it would be sufficient to just add fromCfnXxx to all L2s instead of juggling all of these interfaces.

These conversion static methods would be straightforward to implement:

public static fromCfnBucket(bucket: CfnBucket): IBucket {
  return this.fromBucketArn(bucket.attrBucketArn);
}

Indeed, it won't allow users to directly pass an L1 everywhere an IBucket is needed but it feels like this would be good enough to improve the ergonomics without too many breaking changes.

Copy link
Author

@aaythapa aaythapa Dec 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bucket L2 already has a fromCfnBucket that implements what you mentioned but I'm not sure how many other L2s have something similar or what the effort to add the missing ones would be. Could be worthwhile alternative but I don't think it fixes the root issue (and it becomes another thing we have to manually make rather than having it be auto generated :/ )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be easy to enforce the fromCfnXxx static methods using awslint so it's semi-manual.

text/0655-enhanced-l1.md Outdated Show resolved Hide resolved
@mrpackethead
Copy link
Contributor

It was questioned if enums are necessary in L1s as they don't seem to add much value and take L1s further away from what the CFN is. It was also noted that the use of _v2 is confusing and annoying. In particular deploying this pattern for enums will make it a regular occurrence across the L1 library, as opposed to using the suffix only in situations when an upstream type change is breaking.

I may have missed the point here, but to me, having enums where there is a defined list of possible values, is very valuable. Its an almost every day occurrence having to look at the docs, to check on valid values.

Why would this not be valuable?

@aaythapa aaythapa added status/proposed Newly proposed RFC pr/do-not-merge Let mergify know not to auto merge labels Jan 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr/do-not-merge Let mergify know not to auto merge status/proposed Newly proposed RFC
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants