How do we protect against breaking changes in construct libraries? #145

eladb · 2018-06-20T08:31:52Z

Users expressed a concern that CDK constructs, that are routinely updated to fix issues and add functionality, will not be deterministic. i.e. A developer updates the construct libraries on their machine via a package manager. These constructs contain changes to IAM policy that is not obvious from just inspecting the repo. They then push changes to a stack that results in unexpected security permission changes.

The cdk diff command can help identify unexpected modifications to infrastructure, but that seems like half the story. Is tight versioning of CDK constructs the other part of the story; so that CDK stacks are more deterministic?

The text was updated successfully, but these errors were encountered:

eladb · 2018-06-20T09:13:44Z

The short answer is, "yes". The long answer is "it's complicated".

Versioning is the mechanism to ensure that we don't break our customers unexpectedly. Post-1.0, we are committed to semver, which means that breaking changes in any of our modules will be communicated by bumping the "major" component of the version. This is the industry-standard and package managers usually protect consumers from accidentally taking unsolicited major versions.

So the question is: how do we decide what is a breaking change? That's where things get complicated.

Academically, a component is considered backwards-compatible if for the same inputs, it produces the same outputs as a previous compatible version. A common practice to protect against breaking changes is regression tests: when you release a new version you execute the tests of previous compatible versions against it and verify that they all pass. For examples the cracks plugin for semantic-release can do that automatically.

How do we translate these practices to construct libraries?

Theoretically, CDK constructs should be extremely easy to test because they are purely functional: there are no side-effects by design (output is always a synthesized artifact). And at the moment, indeed many of our tests today follow this pattern:

Define a stack, with the constructs you wish to test, configured in a certain way (i.e. define a Bucket with KMS encryption enabled).
Synthesize a CloudFormation template from the stack
Compare the result 1:1 with an expected CloudFormation template

This allows us to very quickly write tests that validate the behavior of the construct under certain conditions and protect against unexpected regressions.

The problem is that these tests are too rigid. It practically means that if we change anything in how the construct behaves (i.e. add a resource to the construct), the test will fail and we will have to update the expected result. In some of these cases, these failures will indicate a breaking change and in other cases, they won't. This means that we can't use these naive testing approach to automatically determine if a change was breaking.

Moreover, the heuristics for determining if a change was breaking are quite elaborate. Here are some examples:

An added resource might be fine, but may also open up unwanted security surface.
Changes to logical IDs will result in tearing down deployed resources, but this might be fine resources that don't have any persistent state behind them (i.e. event rules) and incredibly bad for resources with state (i.e. a database).
More specific IAM permissions could be breaking, but might not (maybe they were overly permissive).
etc.

So what should we do today?

This is not a unique challenge reserved for construct libraries. It's a generally complex problem in software engineering. The holly grail is to have good tools to write robust tests that can be used to automatically discover breaking changes in future versions, and that's exactly why we started the aws-cdk-assert library.

There's a long way to go until we can build enough confidence in our understanding of the various use cases and semantics of breaking changes in synthesis, such that we can codify them into aws-cdk-assert and utilize them across our codebase.

In the meantime, no unlike many other software projects, it is our responsibility to review every change to construct libraries critically, and make sure to call-out breaking changes, both in inputs (API shapes) and outputs (synthesized templates), but reason about the actual nature of the change. If a change is breaking, the commit message must include the text BREAKING CHANGE (per conventionalcommits) so that when the software is released, a major-version bump will be included.

How do we reach nirvana?

In parallel, we should try to get to the point where aws-cdk-assert lets us author robust tests and have built-in heuristics to identify suspects for breaking changes. Perhaps we can setup some pipelines to automatically execute old tests against new should-be-compatible versions and open issues to that will help us improve the robustness of the test and add to the knowledge base of aws-cdk-assert.

rix0rrr · 2019-03-27T15:03:13Z

Plan of attack:

Parse stability annotations from docs
Expose stability annotations through jsii-reflect
Write jsii tool to compare two assemblies for compatibility (jsii-compat?)
Update awslint to require people to write docs and add stability annotations to all constructs, no experimental types used as arguments to stable methods.
Reflect stability in docs
Use jsii-compat in integ tests

eladb · 2019-03-27T23:38:09Z

My thought for a name for this tool was jsii-diff, that can be used to generally compare two APIs, but can also work in a "superset/subset" mode.

I am thinking that perhaps we should run this as an integration test in the pipeline and not during bump. It would cause the pipeline to stop and then people would need to fix the error and commit a fix to revert any breaking changes.

rix0rrr · 2019-03-28T08:07:25Z

You're right, when we're in post-1.0 we should be finding incompatibilities as early on as possible.

The big question will be: where are we going to get the previous release's .jsii file?

Check it in? Download from GitHub?

eladb · 2019-03-28T08:34:49Z

I've been contemplating on this. I think download from npm because that's that's the authoritative last published version by definition, no?

rix0rrr · 2019-03-28T08:37:20Z

Yes. It's just not so nice to do a download from NPM for every build.

But I guess we can make it an option.

eladb · 2019-03-28T08:40:07Z

That's why I felt it should be an integration test. Also, it should be possible to just download the tar.gz file from npm without actually installing all the dependencies.

rix0rrr · 2019-03-28T12:35:01Z

That's not working because jsii-reflect borks if it can't find a package's dependencies.

Add a check to `cdk-test` to confirm that no breaking API changes have been introduced with respect to the most recently published version. Because this will incur an NPM fetch, add a flag (`--quick`/`-q`) to disable it. Addresses #145.

Add a check to `cdk-test` to confirm that no breaking API changes have been introduced with respect to the most recently published version. Because this will incur an NPM fetch, add a flag (`--quick`/`-q`) to disable it. Addresses aws#145.

skinny85 · 2019-07-26T23:28:15Z

Is this done? :)

eladb · 2019-07-28T05:55:40Z

Yes

eladb mentioned this issue Jul 10, 2018

Automated tests for semantic versions backwards compatibility #253

Closed

eladb added the ops-excellence Operational Excellence label Oct 9, 2018

fulghum added p0 package/awscl Cross-cutting issues related to the AWS Construct Library package/tools Related to AWS CDK Tools or CLI labels Mar 6, 2019

rix0rrr self-assigned this Mar 25, 2019

rix0rrr mentioned this issue Apr 23, 2019

feat(cdk-test): check API compatibility #2356

Merged

4 tasks

eladb closed this as completed Jul 28, 2019

This was referenced Dec 12, 2019

[Snyk] Upgrade @aws-cdk/cdk from 0.22.0 to 0.35.0 MechanicalRock/scdk#3

Closed

[Snyk] Upgrade @aws-cdk/cdk from 0.24.1 to 0.35.0 MechanicalRock/cdk-constructs#8

Open

[Snyk] Upgrade @aws-cdk/cdk from 0.24.1 to 0.35.0 MechanicalRock/account-reaper#6

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How do we protect against breaking changes in construct libraries? #145

How do we protect against breaking changes in construct libraries? #145

eladb commented Jun 20, 2018

eladb commented Jun 20, 2018

rix0rrr commented Mar 27, 2019 •

edited

Loading

eladb commented Mar 27, 2019

rix0rrr commented Mar 28, 2019

eladb commented Mar 28, 2019

rix0rrr commented Mar 28, 2019

eladb commented Mar 28, 2019

rix0rrr commented Mar 28, 2019

skinny85 commented Jul 26, 2019

eladb commented Jul 28, 2019

How do we protect against breaking changes in construct libraries? #145

How do we protect against breaking changes in construct libraries? #145

Comments

eladb commented Jun 20, 2018

eladb commented Jun 20, 2018

How do we translate these practices to construct libraries?

So what should we do today?

How do we reach nirvana?

rix0rrr commented Mar 27, 2019 • edited Loading

eladb commented Mar 27, 2019

rix0rrr commented Mar 28, 2019

eladb commented Mar 28, 2019

rix0rrr commented Mar 28, 2019

eladb commented Mar 28, 2019

rix0rrr commented Mar 28, 2019

skinny85 commented Jul 26, 2019

eladb commented Jul 28, 2019

rix0rrr commented Mar 27, 2019 •

edited

Loading