[WIP] Towards automation of deployment creation/teardown #926

Closed
wants to merge 1 commit into from

@xbrianh xbrianh commented Jan 25, 2018

Standalone script that semi-automates deployment, and completely automates teardown.

Connects to #856

Test plan

Deploy and teardown a lot.

@ghost ghost assigned xbrianh Jan 25, 2018
@ghost ghost added code review labels Jan 25, 2018
resp = input('Proceed (y/N)').lower() or 'n'

if 'y' != resp:
    sys.exit(1)

Instead of writing this yourself, consider using http://click.pocoo.org/5/prompts/
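
For illustration, the behaviour a prompt helper such as `click.confirm` provides can be sketched with the standard library alone. This is only a sketch: the `confirm` function and its `reader` parameter are hypothetical names, not code from this PR. It treats empty input as the default answer and accepts `y`/`yes` and `n`/`no` case-insensitively.

```python
def confirm(prompt, default=False, reader=input):
    """Ask a yes/no question; return True for yes and False for no.

    `reader` is injectable (hypothetical parameter) so the function can be
    exercised without a terminal.
    """
    suffix = " [Y/n] " if default else " [y/N] "
    while True:
        answer = reader(prompt + suffix).strip().lower()
        if not answer:
            # Empty input falls back to the default answer.
            return default
        if answer in ("y", "yes"):
            return True
        if answer in ("n", "no"):
            return False
        # Anything else: ask again rather than guessing.
```

A caller would then write something like `if not confirm("Proceed?"): sys.exit(1)`.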



aws = Command(find('aws'))
gcloud = Command(find('gcloud'))

I'm not 100% sure about gcloud, but for aws I would consider using boto3 directly instead. It makes things less painful down the road when you inevitably accrue more logic for handling errors and other conditions.
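
As a rough sketch of what the boto3 route could look like for teardown, the functions below list and optionally delete CloudWatch Logs groups by name prefix. The function names and the `dry_run` flag are hypothetical and not part of this PR; `logs_client` is assumed to behave like `boto3.client("logs")`, and the calls used are `describe_log_groups` (via a paginator) and `delete_log_group`.

```python
def find_log_groups(logs_client, prefix):
    """Return the names of all log groups whose name starts with `prefix`."""
    names = []
    paginator = logs_client.get_paginator("describe_log_groups")
    for page in paginator.paginate(logGroupNamePrefix=prefix):
        names.extend(group["logGroupName"] for group in page.get("logGroups", []))
    return names


def delete_log_groups(logs_client, prefix, dry_run=True):
    """Delete matching log groups; with dry_run=True only report them."""
    # Collect names first so deletion does not interfere with pagination.
    names = find_log_groups(logs_client, prefix)
    if not dry_run:
        for name in names:
            logs_client.delete_log_group(logGroupName=name)
    return names


# Example use (requires boto3 and AWS credentials; the prefix is illustrative):
#   import boto3
#   delete_log_groups(boto3.client("logs"), "dss-index-mystage", dry_run=False)
```

Because the responses are plain dictionaries and failures surface as exceptions, adding error handling later does not involve parsing CLI output.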

@kislyuk My original intent was to avoid tools unavailable on the command line. Today's discussion with @hannes-ucsc will cause some refactoring, however, and I may head down the boto path.

@xbrianh xbrianh force-pushed the bhannafi-create-destroy branch 2 times, most recently from 3acf44e to 8a309c3 Compare February 5, 2018 21:54
dss.tf Outdated
@@ -0,0 +1,119 @@
{

I would recommend using HCL rather than JSON, as it is much more human-readable and editable.

@ryanking ryanking left a comment

A few general comments:

  • You should really use Terraform remote state files; otherwise there's no good way to collaborate.
  • This stuff should be human-readable: use HCL rather than JSON for the Terraform configs.
  • Rather than switching on variables files, I find it cleaner to put your code in a module and then have multiple invocations of that module. This makes it easier to get a high-level view of the architecture.

ryanking commented Feb 6, 2018

Also, we have a bunch of tools at CZI to make using this stuff easier. We'd like to open source them, but until we do we could look at sharing the source.

@xbrianh xbrianh force-pushed the bhannafi-create-destroy branch 5 times, most recently from b39e15e to 47d1e5d Compare February 16, 2018 19:15
@xbrianh xbrianh force-pushed the bhannafi-create-destroy branch 12 times, most recently from b4fc5bf to 860c6bc Compare February 28, 2018 19:19
@xbrianh xbrianh force-pushed the bhannafi-create-destroy branch 2 times, most recently from 9963f92 to 05de38d Compare March 3, 2018 01:19
@xbrianh xbrianh force-pushed the bhannafi-create-destroy branch 2 times, most recently from 990f24a to 80cba9b Compare March 14, 2018 18:29
Makefile Outdated
for comp in $(components); do \
$(MAKE) -C deployment apply COMPONENT=$$comp; \
done
curl https://dss.dev.data.humancellatlas.org/internal/application_secrets > application_secrets.json

@kislyuk kislyuk Mar 14, 2018

Oops, I accidentally saved a draft review note as a single comment:

Just a quick note in case you're not tracking this: The deployment should not depend on a deployed component, and separate deployments should not depend on each other.

Tracked! Although it seems reasonable that infrastructure such as the Route 53 zone and certificates may be shared among deployments.

Unfortunately the creation of application_secrets.json requires a visit to the GCP console.

@Bento007 Bento007 left a comment

  • Consider adding the AWS CLI to the requirements and automating the installation: `pip install awscli --upgrade`.
  • Add some instructions on things that need to be set up prior to running config, for example certificates, Route 53, the environment file, and the Terraform state bucket.
  • If the resource already exists, ask whether the user would like to create another, overwrite, or use the existing one.
    Ignore deployment files.
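
The "already exists" prompt suggested above could be sketched roughly as follows. Everything here is hypothetical (the `resolve_existing` name, the action strings, and the injectable `reader`); it only illustrates the three-way choice, not how the PR's scripts are structured.

```python
# Map single-letter answers to the action the caller should take.
CHOICES = {"c": "create_another", "o": "overwrite", "u": "use_existing"}


def resolve_existing(resource_name, exists, reader=input):
    """Decide what to do with a resource that may already exist."""
    if not exists:
        # Nothing to reconcile: just create it.
        return "create"
    prompt = (
        f"{resource_name} already exists: "
        "(c)reate another, (o)verwrite, or (u)se existing? "
    )
    while True:
        answer = reader(prompt).strip().lower()
        if answer in CHOICES:
            return CHOICES[answer]
        # Unrecognised input: ask again.
```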

README.md Outdated
7. Enable required APIs: `gcloud service-management enable cloudfunctions.googleapis.com`; `gcloud service-management
enable runtimeconfig.googleapis.com`
Now you may deploy the cloud assets with
make deploy-infra

change to make deploy-infra

1. Copy `environment.local.example` to `environment.local`
2. Edit `environment.local` to add custom entries that override the default values in `environment`

Run `source environment` now and whenever these environment files are modified.

Add some instructions on things that need to be set up prior to running config, for example certificates or Route 53.

@xbrianh xbrianh force-pushed the bhannafi-create-destroy branch 3 times, most recently from 70f117c to 988dc37 Compare March 27, 2018 17:44

@Bento007 Bento007 left a comment

Please add documentation to the suggested areas.

I don't understand why we have deployment/dev and deployment/staging committed to this repo. We can suggest that users create multiple stages, but I don't see the benefit in committing them to the repo when users will create their own anyway. What are they used for?

README.md Outdated
Now you may deploy the cloud assets with
make deploy-infra

##### GCP Application Secrets

When do we deal with gcp-credentials.json?

README.md Outdated

8. Generate OAuth application secrets to be used for your instance:
Now you may deploy the cloud assets with
make deploy-infra

make deploy-infra


Hint: To create GCS buckets from the command line, use `gsutil mb -c regional -l REGION gs://BUCKET_NAME/`.
Run `source environment` now and whenever `configure.py` is executed.

Is this how you change deployments?

@@ -0,0 +1,46 @@
#!/usr/bin/env python

Are users supposed to run these? If so, when? Otherwise we should make them private.

@@ -0,0 +1,22 @@
#!/bin/bash -x

Document what steps from the readme this is for.

@@ -0,0 +1,15 @@
#!/bin/bash -x

Document what steps from the readme this is for.

@@ -0,0 +1,10 @@
#!/bin/bash -x

Document what steps from the readme this is for.

@@ -0,0 +1,248 @@
#!/usr/bin/env python

@Bento007 Bento007 Mar 27, 2018

Document what this tool is for.

@Bento007

resolve all references to deploy_checkout_lifecycle.py

$(DSS_HOME)/scripts/deploy_checkout_lifecycle.py; \

@Bento007

Running into this error when creating Elasticsearch logs.

Here is the short version:

* aws_elasticsearch_domain.elasticsearch: ValidationException: The Resource Access Policy specified for the CloudWatch Logs log group dss-index-tsmith1-search-logs does not grant sufficient permissions for Amazon Elasticsearch Service to create a log stream. Please check the Resource Access Policy.
	status code: 400, request id: 2f6880e5-32b4-11e8-80b2-db6dd6a4c1d3

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.


make[2]: *** [apply-component] Error 1
make[1]: *** [apply] Error 2
make: *** [deploy-infra] Error 2

Here is the full version:

cd active/elasticsearch; terraform apply
aws_cloudwatch_log_group.dss-index-log: Refreshing state... (ID: dss-index-tsmith1-index-logs)
aws_cloudwatch_log_group.dss-search-log: Refreshing state... (ID: dss-index-tsmith1-search-logs)
data.aws_caller_identity.current: Refreshing state...
data.aws_region.current: Refreshing state...

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  + aws_cloudwatch_log_group.dss-index-log
      id:                                                          <computed>
      arn:                                                         <computed>
      name:                                                        "dss-index-tsmith1-index-logs"
      retention_in_days:                                           "0"

  + aws_cloudwatch_log_group.dss-search-log
      id:                                                          <computed>
      arn:                                                         <computed>
      name:                                                        "dss-index-tsmith1-search-logs"
      retention_in_days:                                           "0"

  + aws_elasticsearch_domain.elasticsearch
      id:                                                          <computed>
      access_policies:                                             "    {\n      \"Version\": \"2012-10-17\",\n      \"Statement\": [\n        {\n          \"Effect\": \"Allow\",\n          \"Principal\": {\n            \"AWS\": \"arn:aws:iam::719818754276:root\"\n          },\n          \"Action\": \"es:*\",\n          \"Resource\": \"arn:aws:es:us-east-1:719818754276:domain/dss-index-tsmith1/*\"\n        },\n        {\n          \"Effect\": \"Allow\",\n          \"Principal\": {\n            \"AWS\": \"*\"\n          },\n          \"Action\": \"es:*\",\n          \"Resource\": \"arn:aws:es:us-east-1:719818754276:domain/dss-index-tsmith1/*\",\n          \"Condition\": {\n            \"IpAddress\": {\n              \"aws:SourceIp\": [\n              ]\n            }\n          }\n        }\n      ]\n    }\n  "
      advanced_options.%:                                          "1"
      advanced_options.rest.action.multi.allow_explicit_index:     "true"
      arn:                                                         <computed>
      cluster_config.#:                                            "1"
      cluster_config.0.dedicated_master_enabled:                   "false"
      cluster_config.0.instance_count:                             "1"
      cluster_config.0.instance_type:                              "t2.small.elasticsearch"
      domain_id:                                                   <computed>
      domain_name:                                                 "dss-index-tsmith1"
      ebs_options.#:                                               "1"
      ebs_options.0.ebs_enabled:                                   "true"
      ebs_options.0.volume_size:                                   "10"
      ebs_options.0.volume_type:                                   "gp2"
      elasticsearch_version:                                       "5.5"
      encrypt_at_rest.#:                                           <computed>
      endpoint:                                                    <computed>
      kibana_endpoint:                                             <computed>
      log_publishing_options.#:                                    "2"
      log_publishing_options.~1336667104.cloudwatch_log_group_arn: "${aws_cloudwatch_log_group.dss-index-log.arn}"
      log_publishing_options.~1336667104.enabled:                  "true"
      log_publishing_options.~1336667104.log_type:                 "INDEX_SLOW_LOGS"
      log_publishing_options.~793069664.cloudwatch_log_group_arn:  "${aws_cloudwatch_log_group.dss-search-log.arn}"
      log_publishing_options.~793069664.enabled:                   "true"
      log_publishing_options.~793069664.log_type:                  "SEARCH_SLOW_LOGS"
      snapshot_options.#:                                          "1"
      snapshot_options.0.automated_snapshot_start_hour:            "23"


Plan: 3 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

aws_cloudwatch_log_group.dss-search-log: Creating...
  arn:               "" => "<computed>"
  name:              "" => "dss-index-tsmith1-search-logs"
  retention_in_days: "" => "0"
aws_cloudwatch_log_group.dss-index-log: Creating...
  arn:               "" => "<computed>"
  name:              "" => "dss-index-tsmith1-index-logs"
  retention_in_days: "" => "0"
aws_cloudwatch_log_group.dss-index-log: Creation complete after 1s (ID: dss-index-tsmith1-index-logs)
aws_cloudwatch_log_group.dss-search-log: Creation complete after 1s (ID: dss-index-tsmith1-search-logs)
aws_elasticsearch_domain.elasticsearch: Creating...
  access_policies:                                           "" => "    {\n      \"Version\": \"2012-10-17\",\n      \"Statement\": [\n        {\n          \"Effect\": \"Allow\",\n          \"Principal\": {\n            \"AWS\": \"arn:aws:iam::719818754276:root\"\n          },\n          \"Action\": \"es:*\",\n          \"Resource\": \"arn:aws:es:us-east-1:719818754276:domain/dss-index-tsmith1/*\"\n        },\n        {\n          \"Effect\": \"Allow\",\n          \"Principal\": {\n            \"AWS\": \"*\"\n          },\n          \"Action\": \"es:*\",\n          \"Resource\": \"arn:aws:es:us-east-1:719818754276:domain/dss-index-tsmith1/*\",\n          \"Condition\": {\n            \"IpAddress\": {\n              \"aws:SourceIp\": [\n              ]\n            }\n          }\n        }\n      ]\n    }\n  "
  advanced_options.%:                                        "" => "1"
  advanced_options.rest.action.multi.allow_explicit_index:   "" => "true"
  arn:                                                       "" => "<computed>"
  cluster_config.#:                                          "" => "1"
  cluster_config.0.dedicated_master_enabled:                 "" => "false"
  cluster_config.0.instance_count:                           "" => "1"
  cluster_config.0.instance_type:                            "" => "t2.small.elasticsearch"
  domain_id:                                                 "" => "<computed>"
  domain_name:                                               "" => "dss-index-tsmith1"
  ebs_options.#:                                             "" => "1"
  ebs_options.0.ebs_enabled:                                 "" => "true"
  ebs_options.0.volume_size:                                 "" => "10"
  ebs_options.0.volume_type:                                 "" => "gp2"
  elasticsearch_version:                                     "" => "5.5"
  encrypt_at_rest.#:                                         "" => "<computed>"
  endpoint:                                                  "" => "<computed>"
  kibana_endpoint:                                           "" => "<computed>"
  log_publishing_options.#:                                  "" => "2"
  log_publishing_options.173942822.cloudwatch_log_group_arn: "" => "arn:aws:logs:us-east-1:719818754276:log-group:dss-index-tsmith1-index-logs:*"
  log_publishing_options.173942822.enabled:                  "" => "true"
  log_publishing_options.173942822.log_type:                 "" => "INDEX_SLOW_LOGS"
  log_publishing_options.175525206.cloudwatch_log_group_arn: "" => "arn:aws:logs:us-east-1:719818754276:log-group:dss-index-tsmith1-search-logs:*"
  log_publishing_options.175525206.enabled:                  "" => "true"
  log_publishing_options.175525206.log_type:                 "" => "SEARCH_SLOW_LOGS"
  snapshot_options.#:                                        "" => "1"
  snapshot_options.0.automated_snapshot_start_hour:          "" => "23"

Error: Error applying plan:

1 error(s) occurred:

* aws_elasticsearch_domain.elasticsearch: 1 error(s) occurred:

* aws_elasticsearch_domain.elasticsearch: ValidationException: The Resource Access Policy specified for the CloudWatch Logs log group dss-index-tsmith1-search-logs does not grant sufficient permissions for Amazon Elasticsearch Service to create a log stream. Please check the Resource Access Policy.
	status code: 400, request id: 2f6880e5-32b4-11e8-80b2-db6dd6a4c1d3

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.


make[2]: *** [apply-component] Error 1
make[1]: *** [apply] Error 2
make: *** [deploy-infra] Error 2
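
One plausible reading of the `ValidationException` above is that the account has no CloudWatch Logs resource policy allowing the `es.amazonaws.com` service to create log streams and put events in the two log groups. The sketch below builds such a policy and attaches it with the CloudWatch Logs `put_resource_policy` call. It is an assumption about the fix, not something confirmed in this thread; the function names and the policy name are hypothetical, and `logs_client` is assumed to behave like `boto3.client("logs")`.

```python
import json


def es_log_policy_document(log_group_arns):
    """Build a resource policy letting Amazon Elasticsearch Service write logs."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "es.amazonaws.com"},
            "Action": ["logs:PutLogEvents", "logs:CreateLogStream"],
            "Resource": list(log_group_arns),
        }],
    })


def grant_es_log_access(logs_client, policy_name, log_group_arns):
    """Attach the policy to the account's CloudWatch Logs and return the JSON."""
    document = es_log_policy_document(log_group_arns)
    logs_client.put_resource_policy(policyName=policy_name, policyDocument=document)
    return document


# Example use (requires boto3 and credentials; the policy name is illustrative,
# the ARN is the search log group from the plan output above):
#   import boto3
#   grant_es_log_access(
#       boto3.client("logs"),
#       "dss-es-slow-logs",
#       ["arn:aws:logs:us-east-1:719818754276:log-group:dss-index-tsmith1-search-logs:*"],
#   )
```

If this is indeed the cause, the same policy could instead be declared in Terraform (the AWS provider has an `aws_cloudwatch_log_resource_policy` resource) so that it is created before the Elasticsearch domain.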

Bento007 commented Mar 29, 2018

Why isn't DSS_TEST_ES_PATH set anywhere? I think we should continue to source environment.local after environment to make the transition easier for everyone. Otherwise we should write instructions in the README to direct users where to modify the variable added to environment.

@Bento007

Import new deployment changes into current deployment.

@ghost ghost added the in progress label Apr 3, 2018
@xbrianh xbrianh force-pushed the bhannafi-create-destroy branch from 0d948c6 to 9b6d7c4 Compare April 6, 2018 18:34
@xbrianh xbrianh closed this Apr 6, 2018
@ghost ghost removed the in progress label Apr 6, 2018