Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli: allow disabling context lookups (was commit cdk.context.json) #6929

Closed
philtay opened this issue Mar 22, 2020 · 11 comments
Closed

cli: allow disabling context lookups (was commit cdk.context.json) #6929

philtay opened this issue Mar 22, 2020 · 11 comments
Labels
effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. p1 package/tools Related to AWS CDK Tools or CLI
Milestone

Comments

@philtay
Copy link

philtay commented Mar 22, 2020

The docs state that cdk.context.json should be checked-in to ensure reproducibility. However, that file may contain information that's usually considered sensitive like AWS account numbers (e.g. aws-samples). Such data shouldn't be commited and utilities like git-secrets actively try to prevent that. Could you please share what's the best practice in these cases?

@philtay philtay added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Mar 22, 2020
@SomayaB SomayaB added the guidance Question that needs advice or information. label Mar 25, 2020
@ghost ghost self-assigned this Mar 27, 2020
@mjsilva
Copy link

mjsilva commented Apr 5, 2020

Same questions as OP. It would be good to understand better the pros and cons of commiting cdk.context.json. For example on our environment we have 2 organisation accounts staging and production. On our CI we use different IAM credentials to deploy to each account, in this case is there a benefit of commiting it?

As far as I've understand you'd only commit the file if you want to be able to run synth without having to login to AWS and possible speed things up a bit since is saves some lookups.

Another thing I'd like to understand is the difference between cdk.json and cdk.context.json. I noticed if I put things like "@aws-cdk/core:enableStackNameDuplicates": "true", in cdk.json once I run diff they are moved to cdk.context.json, not sure why.

@NGL321 NGL321 removed their assignment Apr 6, 2020
@NGL321 NGL321 added documentation This is a problem with documentation. package/tools Related to AWS CDK Tools or CLI and removed feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Apr 6, 2020
@NGL321 NGL321 assigned shivlaks and unassigned shivlaks Apr 6, 2020
@rix0rrr rix0rrr assigned eladb and unassigned ghost Apr 17, 2020
@rix0rrr
Copy link
Contributor

rix0rrr commented Apr 17, 2020

As far as I've understand you'd only commit the file if you want to be able to run synth without having to login to AWS and possible speed things up a bit since is saves some lookups.

Not really, depending on what's in there. If there are 3 AZs in your region and you deploy a CDK app, a VPC will be created with a specific IP layout.

If you don't commit context.json and on the next deploy AWS happens to have added an AZ to your region, your VPC will now try to span 4 AZs, have a different IP space layout and all your subnets will have to be destroyed and recreated (this will probably fail so you won't actually lose any data, but your deployment will be stuck with no way to move forward).

Similarly, if you've started an instance off of an Amazon Linux AMI ID, and there now happens to be a new version of that AMI available... do you want your instance to be automatically replaced with a new version of the OS? What if you had state on that machine?

cdk.context.json is not just an API call cache, it's a store of nondeterministic decisions taken in the past, which you must commit in order to ensure consistency in the future. It also gives you the opportunity to update the values piecemeal. ("YES I will take a new AMI ID now, NO I will not be spreading to new AZs").

As for account IDs, we never considered them very sensitive, similar to how usernames aren't usually considered sensitive. AWS itself publishes many of its own account IDs for users to put in their IAM policies, for example. The keys to the castle are in the access credentials to the account, not the account ID itself.

In fact, we generally expect people to put account IDs in their source as well (or build their application to read them from a configuration file), especially in the upcoming CI/CD implementation where we're going to force it.

It may be possible that our opinions differ from other departments whose specialization is security. The biggest risk I personally see is bucket sniping.

I see a couple of solutions to this, but they all come with their own downsides:

Account IDs in context file:

  • Storing context in the account itself. We could store the context for an account INSIDE the account, let's say in SSMPS. There would be no more cdk.context.json, so nothing to commit (yay). Would not get rid of account ids in the source files though.
  • Make account IDs nonreversible. In the context.json file, we could hash the account IDs to make then nonreversible. Would not get rid of account ids in the source files though.

Account IDs in source:

  • Symbolic account IDs. Instead of specifying account IDs in source files, we'd have a symbolic identifier such as $prod1, $prod2, $prod3 etc. The next complication then becomes: where do we store the mapping from { "$prod1": "123456789012" }? It will then be the user's responsibility to make sure the mapping is transported and kept in sync between all developer's machines. Also, these values could not be stored in the context.json file directly, because there's no guarantee that $prod1 on one machine is the same account as $prod1 on another machine, so we'd need one of the other solutions in addition to this.

I think the end result of this will be that we may implement one of the solutions for the context, and it will be users responsibility to hide account IDs from their sources if desired, to be implemented in a way that suits them.

@mjsilva
Copy link

mjsilva commented Apr 17, 2020

thanks for taking the time @rix0rrr. That provided a lot of clarity on things I was wondering about.

I'm still not completely clear though, I understand the use case I just think I'm missing the the development workflow.

Since we run cdk diff + cdk deploy as part of our CI/CD pipeline. When am I supposed to update/create cdk.context.json, should I just run cdk synthesise while I'm developing to make sure cdk.context.json has the right data in and then commit?

@rix0rrr
Copy link
Contributor

rix0rrr commented Apr 17, 2020

When am I supposed to update/create cdk.context.json, should I just run cdk synthesise while I'm developing to make sure cdk.context.json has the right data in and then commit?

Yes.

Honestly it's probably also a good idea to add a flag for the CLI to fail unexpected context lookups (--no-lookups), which you can then use in CI/CD to make sure the output is deterministic (@shivlaks I just came up with another feature request :) )

@eladb
Copy link
Contributor

eladb commented May 3, 2020

@rix0rrr @shivlaks should we create a new issue for the CLI feature request and close this one out?

@pgollucci
Copy link

This one has a lot of good info in it. Definitely interested in this one.

@eladb
Copy link
Contributor

eladb commented May 27, 2020

@shivlaks ping

@eladb eladb changed the title commit cdk.context.json cli: allow disabling context lookups (was commit cdk.context.json) May 27, 2020
@eladb eladb added the feature-request A feature should be added or improved. label May 27, 2020
@eladb eladb assigned shivlaks and unassigned eladb May 27, 2020
@shivlaks
Copy link
Contributor

@eladb sorry, this one skipped my notifications inbox. That makes sense to me - saw that you already re-titled it. I dropped the guidance label as well. Let's use this issue to track the feature if there are no objections.

@rix0rrr rix0rrr added this to the [CLI] Soon milestone Aug 13, 2020
@shivlaks shivlaks added effort/medium Medium work item – several days of effort p1 labels Aug 20, 2020
@NGL321 NGL321 assigned rix0rrr and unassigned shivlaks Jan 25, 2021
@schourode
Copy link
Contributor

schourode commented Nov 24, 2022

@eladb Unless I have misread the thread above entirely, I believe this issue can now be closed as fixed:

$ cdk --help
…
      --lookups            Perform context lookups (synthesis fails if this is
                           disabled and context lookups need to be performed)
                                                       [boolean] [default: true]
…

@tim-finnigan tim-finnigan removed the documentation This is a problem with documentation. label Jan 25, 2023
@TheRealAmazonKendra
Copy link
Contributor

Closing this issue based off the previous comment.

@github-actions
Copy link

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. p1 package/tools Related to AWS CDK Tools or CLI
Projects
None yet
Development

No branches or pull requests