Terragrunt v0.0.1, with DynamoDB locking #2

Merged on Jun 1, 2016 (31 commits)
9e266be
RDD: Create Readme to describe how terragrunt will work
brikis98 May 25, 2016
4d5af92
README cleanup
brikis98 May 25, 2016
3714665
Put in place basic skeleton of terragrunt
brikis98 May 26, 2016
47e23d7
Update install instructions
brikis98 May 26, 2016
618af5f
Update README TODOs
brikis98 May 26, 2016
4118f8f
Update config syntax
brikis98 May 26, 2016
5c40850
Implement DynamoDB lock
brikis98 May 26, 2016
783cf1e
Simplify command parsing. Update docs.
brikis98 May 26, 2016
b08374a
Initial work on git locking
brikis98 May 30, 2016
7378c9d
Handle case where lock file already exists. Refactor packages.
brikis98 May 30, 2016
933c3c7
Remove git implementation and docs
brikis98 May 30, 2016
84736a6
Use Glide. Remove unused code.
brikis98 May 30, 2016
17e3e20
Add automated tests. Add max retries. Split up dynamo code.
brikis98 May 30, 2016
0ac31db
Configure CircleCI to build, test, and push the app
brikis98 May 30, 2016
d102c55
Add app-name parameter to circle.yml.
brikis98 May 30, 2016
eb717b6
Update README about DynamoDB pricing
brikis98 May 30, 2016
cedabc5
Update example output in README
brikis98 May 30, 2016
3b74873
Move note about charges later in README
brikis98 May 30, 2016
65cf3df
Minor typo fix in README
brikis98 May 30, 2016
a2ddae2
Add ldflags. Fix circle.yml. Several README fixes.
brikis98 May 30, 2016
294cf38
Add Glide to path. Add note on releases.
brikis98 May 30, 2016
3f2d6d2
Add set -e flags to _ci scripts
brikis98 May 30, 2016
14eb7fa
Use rm instead of rmdir to remove symlink
brikis98 May 30, 2016
aedb037
Fix glide path. Update test info in README.
brikis98 May 30, 2016
77d83e1
Try to fix glide path yet again
brikis98 May 30, 2016
33f69bd
Fix link in README
brikis98 May 30, 2016
e5b727b
Fix run-tests.sh script. Update test section in README.
brikis98 May 30, 2016
82db053
Enable GO15VENDOREXPERIMENT in circle.yml
brikis98 May 30, 2016
0897697
Remove vendor experiment setting
brikis98 May 30, 2016
60eda07
Fix install-dependencies.sh call in circle.yml
brikis98 May 30, 2016
4d62f6c
Need rm -rf in install-dependencies.sh
brikis98 May 30, 2016
1 change: 1 addition & 0 deletions .gitignore
@@ -1,2 +1,3 @@
.idea
vendor
.terragrunt
205 changes: 205 additions & 0 deletions README.md
@@ -0,0 +1,205 @@
# Terragrunt

Terragrunt is a thin wrapper for the [Terraform client](https://www.terraform.io/) that provides a distributed locking
mechanism which allows multiple people to collaborate on the same Terraform state without overwriting each other's
changes. Terragrunt currently uses Amazon's [DynamoDB](https://aws.amazon.com/dynamodb/) to acquire and release locks.
DynamoDB is part of the [AWS free tier](https://aws.amazon.com/dynamodb/pricing/), so if you're already using AWS, this
locking mechanism should be completely free. Other locking mechanisms may be added in the future.
Contributor

s/Other locking mechanisms/Other locking mechanism options/

Member Author

Fixed in the #4 readme to avoid merge conflicts.


## Motivation

When you use Terraform to provision infrastructure, it records the state of your infrastructure in [state
files](https://www.terraform.io/docs/state/). In order to make changes to your infrastructure, everyone on your
team needs access to these state files. You could check the files into version control (not a great idea, as the state
files may contain secrets) or use a supported [remote state
backend](https://www.terraform.io/docs/state/remote/index.html) to store the state files in a shared location such as
[S3](https://www.terraform.io/docs/state/remote/s3.html),
[Consul](https://www.terraform.io/docs/state/remote/consul.html),
or [etcd](https://www.terraform.io/docs/state/remote/etcd.html). The problem is that none of these options provide
*locking*, so if two team members run `terraform apply` on the same state files at the same time, they may overwrite
each other's changes. The official solution to this problem is to use [HashiCorp's
Atlas](https://www.hashicorp.com/atlas.html), but that requires using a SaaS platform for all Terraform operations and
can cost a lot of money.

The goal of Terragrunt is to provide a simple, free locking mechanism that allows multiple people to safely collaborate
on Terraform state.

Contributor

Well said.

## Install

1. Install [Terraform](https://www.terraform.io/).
1. Install Terragrunt by going to the [Releases Page](https://github.com/gruntwork-io/terragrunt/releases), downloading
the binary for your OS, renaming it to `terragrunt`, and adding it to your PATH.

## Quick start

Go into the folder with your Terraform templates and create a `.terragrunt` file. This file uses the same
[HCL](https://github.com/hashicorp/hcl) syntax as Terraform and is used to configure Terragrunt and tell it how to do
locking. To use DynamoDB for locking (see [Locking using DynamoDB](#locking-using-dynamodb)), `.terragrunt` should
have the following contents:

```hcl
lockType = "dynamodb"
stateFileId = "my-app"
```
Contributor

Don't I need to specify a DynamoDB table name, or is that what stateFileId is?

Contributor

Can you clarify why the following isn't needed here?

dynamoLock = {
   awsRegion = "us-east-1"
   tableName = "terragrunt_locks"
   maxLockRetries = 360
 }

Member Author

This was a left-over from the initial work that included git locking. I've refactored this in #3. All DynamoDB config is now under the dynamoLock setting.


Now everyone on your team can use Terragrunt to run all the standard Terraform commands:

```bash
terragrunt get
terragrunt plan
terragrunt apply
terragrunt output
terragrunt destroy
```

Terragrunt forwards most commands directly to Terraform. However, for the `apply` and `destroy` commands, it will first
Contributor

It may be worth mentioning whether you can blindly pass in args as well, and whether terragrunt will automatically support new terraform commands as they are added, or whether we have to issue a new release to handle that.

Member Author

Updated docs to say we forward args/options too and that we are just shelling out to the Terraform you have installed, so it'll use whatever version you have.

acquire a lock using [DynamoDB](#locking-using-dynamodb):

```
terragrunt apply
[terragrunt] 2016/05/30 16:55:29 Attempting to acquire lock for state file my-app in DynamoDB
[terragrunt] 2016/05/30 16:55:30 Attempting to create lock item for state file my-app in DynamoDB table terragrunt_locks
[terragrunt] 2016/05/30 16:55:30 Lock acquired!
[terragrunt] 2016/05/30 16:55:30 Running command: terraform apply
terraform apply

aws_instance.example: Creating...
ami: "" => "ami-0d729a60"
instance_type: "" => "t2.micro"

[...]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

[terragrunt] 2016/05/27 00:39:19 Attempting to release lock for state file my-app in DynamoDB
[terragrunt] 2016/05/27 00:39:19 Lock released!
```
Contributor

Very cool.
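
The forwarding behavior described in the Quick start can be sketched in Go roughly as follows. This is a minimal illustration, not the code from this PR: the function names `needsLock` and `terraformCommand` are hypothetical, and the only facts taken from the README and the discussion above are that `apply` and `destroy` acquire a lock first and that all other commands, along with their args, are passed straight to the installed `terraform` binary.

```go
package main

import (
	"os"
	"os/exec"
)

// needsLock reports whether a command should be wrapped in a lock.
// Per the README, only apply and destroy acquire a lock first.
// (Hypothetical helper name, for illustration only.)
func needsLock(command string) bool {
	switch command {
	case "apply", "destroy":
		return true
	default:
		return false
	}
}

// terraformCommand builds, but does not run, the terraform invocation.
// All args are forwarded unchanged, so whatever Terraform version is
// installed decides which commands and options are valid.
func terraformCommand(args []string) *exec.Cmd {
	cmd := exec.Command("terraform", args...)
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd
}
```

A caller would check `needsLock(args[0])`, acquire the lock if required, and then call `Run()` on the returned command. Because nothing here enumerates Terraform's commands other than the two that lock, new Terraform commands would be forwarded without a new release.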


## Locking using DynamoDB

Terragrunt can use Amazon's [DynamoDB](https://aws.amazon.com/dynamodb/) to acquire and release locks. DynamoDB supports
[strongly consistent reads](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.DataConsistency.html)
as well as [conditional writes](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.SpecifyingConditions.html),
which are all the primitives we need for a very basic distributed lock system. It's also part of [AWS's free
tier](https://aws.amazon.com/dynamodb/pricing/), and given the tiny amount of data we are working with and the
relatively small number of times per day you're likely to run Terraform, it _should_ be a free option for teams already
using AWS. We take no responsibility for any charges you may incur.
Contributor

We take no responsibility for any charges you may incur.

Ha, I think this goes without saying but comes off a little brusque. Taking a second look, do you still think we need to include this statement?

Member Author

Packer and Terraform docs say basically the same thing.


#### DynamoDB locking prerequisites

To use DynamoDB for locking, you must:

1. Already have an AWS account.
1. Set your AWS credentials in the environment using one of the following options:
    1. Set your credentials as the environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.
    1. Run `aws configure` and fill in the details it asks for.
    1. Run Terragrunt on an EC2 instance with an IAM Role.
1. Your AWS user must have an [IAM
policy](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/access-control-identity-based.html)
granting all DynamoDB actions (`dynamodb:*`) on the table `terragrunt_locks` (see the
Contributor

@josh-padnick May 31, 2016

Ah, so terragrunt always uses the same table name. Does that create any conflicts if multiple different terraform templates are using it?

I guess the implication is that you only get one lock across all terraform templates in your entire infrastructure?

Either way, we should explicitly discuss what it looks like to use terragrunt in parallel with multiple different Terraform templates.

Contributor

Ok, just saw stateFileId. It may be worth clarifying the parallel runs bit sooner, not a big deal, though.

Member Author

Added a comment clarifying this in #4.

[DynamoDB locking configuration](#dynamodb-locking-configuration) for how to configure this table name). Here is an
example IAM policy that grants the necessary permissions on the `terragrunt_locks` table in region `us-west-2` for
an account with account id `1234567890`:

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "",
    "Effect": "Allow",
    "Action": "dynamodb:*",
    "Resource": "arn:aws:dynamodb:us-west-2:1234567890:table/terragrunt_locks"
  }]
}
```

#### DynamoDB locking configuration

For DynamoDB locking, Terragrunt supports the following settings in `.terragrunt`:

```hcl
lockType = "dynamodb"
stateFileId = "my-app"

dynamoLock = {
  awsRegion = "us-east-1"
  tableName = "terragrunt_locks"
  maxLockRetries = 360
}
```

* `lockType`: (Required) Must be set to `dynamodb`.
* `stateFileId`: (Required) A unique id for the state file for these Terraform templates. Many teams have more than
one set of templates, and therefore more than one state file, so this setting is used to disambiguate locks for one
state file from another.
* `awsRegion`: (Optional) The AWS region to use. Default: `us-east-1`.
* `tableName`: (Optional) The name of the table in DynamoDB to use to store lock information. Default:
`terragrunt_locks`.
* `maxLockRetries`: (Optional) The maximum number of times to retry acquiring a lock. Terragrunt waits 10 seconds
between retries. Default: 360 retries (one hour).
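
As a rough illustration of how the optional settings and their defaults fit together, here is a minimal Go sketch. The struct `DynamoLockConfig` and the helpers `withDefaults` and `maxWait` are hypothetical names chosen for this example and are not taken from the PR; the default values and the 10-second retry interval come from the list above.

```go
package main

import "time"

// retryInterval is the fixed wait between lock attempts stated in the README.
const retryInterval = 10 * time.Second

// DynamoLockConfig mirrors the optional settings of the dynamoLock block.
// The struct and field names are hypothetical.
type DynamoLockConfig struct {
	AWSRegion      string
	TableName      string
	MaxLockRetries int
}

// withDefaults fills in the documented default for every unset field.
// Note that this sketch treats 0 as "unset", so zero retries cannot be
// configured explicitly.
func withDefaults(c DynamoLockConfig) DynamoLockConfig {
	if c.AWSRegion == "" {
		c.AWSRegion = "us-east-1"
	}
	if c.TableName == "" {
		c.TableName = "terragrunt_locks"
	}
	if c.MaxLockRetries == 0 {
		c.MaxLockRetries = 360
	}
	return c
}

// maxWait returns the total time spent sleeping between retries before
// giving up: MaxLockRetries times the retry interval.
func maxWait(c DynamoLockConfig) time.Duration {
	return time.Duration(c.MaxLockRetries) * retryInterval
}
```

With the defaults, `maxWait` works out to 360 × 10 seconds = 3,600 seconds, which is the one hour quoted for `maxLockRetries`.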

#### How DynamoDB locking works

When you run `terragrunt apply` or `terragrunt destroy`, Terragrunt does the following:

1. Create the `terragrunt_locks` table if it doesn't already exist.
1. Try to write an item to the `terragrunt_locks` table with `stateFileId` equal to the id specified in your
`.terragrunt` file. This item will include useful metadata about the lock, such as who created it (e.g. your
username) and when.
1. Note that the write is a conditional write that will fail if an item with the same `stateFileId` already exists.
Contributor

I didn't see this in the docs, but just wanted to do a sanity check: Is the conditional write's reading of stateFileId itself strongly consistent?

Member Author

1. If the write succeeds, it means we have a lock!
1. If the write does not succeed, it means someone else has a lock. Keep retrying every 10 seconds until we get a
lock.
1. Run `terraform apply` or `terraform destroy`.
1. When Terraform is done, delete the item from the `terragrunt_locks` table to release the lock.
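
The steps above can be sketched in Go as follows. This is an illustrative outline under stated assumptions, not the implementation in this PR: `LockStore`, `memStore`, `acquireLock`, and `withLock` are hypothetical names, and the in-memory store stands in for the DynamoDB table so that the example runs without AWS credentials. Against DynamoDB itself, `PutIfAbsent` would correspond to a conditional write, for example a `PutItem` request with an `attribute_not_exists` condition on the key.

```go
package main

import (
	"errors"
	"sync"
	"time"
)

// ErrLockHeld is returned when an item with the same stateFileId exists.
var ErrLockHeld = errors.New("lock already held")

// LockStore is a hypothetical abstraction over the lock table.
type LockStore interface {
	PutIfAbsent(stateFileID, owner string) error // conditional write
	Delete(stateFileID string) error             // release
}

// memStore is an in-memory stand-in for the table, for illustration only.
type memStore struct {
	mu    sync.Mutex
	items map[string]string // stateFileId -> owner metadata
}

func newMemStore() *memStore { return &memStore{items: map[string]string{}} }

func (s *memStore) PutIfAbsent(id, owner string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, exists := s.items[id]; exists {
		return ErrLockHeld // the write fails if the item already exists
	}
	s.items[id] = owner
	return nil
}

func (s *memStore) Delete(id string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.items, id)
	return nil
}

// acquireLock attempts the conditional write once and then retries up to
// maxRetries more times, sleeping for wait between attempts.
func acquireLock(store LockStore, id, owner string, maxRetries int, wait time.Duration) error {
	for attempt := 0; ; attempt++ {
		err := store.PutIfAbsent(id, owner)
		if err == nil {
			return nil // the write succeeded, so we hold the lock
		}
		if !errors.Is(err, ErrLockHeld) || attempt >= maxRetries {
			return err
		}
		time.Sleep(wait)
	}
}

// withLock acquires the lock, runs fn (standing in for `terraform apply`
// or `terraform destroy`), and deletes the item afterwards, even if fn fails.
func withLock(store LockStore, id, owner string, maxRetries int, wait time.Duration, fn func() error) error {
	if err := acquireLock(store, id, owner, maxRetries, wait); err != nil {
		return err
	}
	defer store.Delete(id)
	return fn()
}
```

With the documented settings, a caller would pass `maxRetries = 360` and `wait = 10 * time.Second`. The deferred delete only runs if the process exits normally, which is why a lock can be left behind after `CTRL+C` or a crash.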

## Cleaning up old locks

If Terragrunt is shut down before it releases a lock (e.g. via `CTRL+C` or a crash), the lock might not be deleted, and
will prevent future changes to your state files. To clean up old locks, you can use the `release-lock` command:

```
terragrunt release-lock
Are you sure you want to forcibly remove the lock for stateFileId "my-app"? (y/n):
```
Contributor

Nice. It might be nice to have a terragrunt show-locks command as well that reports for how long a lock has been in place so you can see right from terragrunt which lock is probably obsolete.

Member Author

Added as a TODO


## Developing terragrunt

#### Running tests

**Note**: The tests in the `dynamodb` folder for Terragrunt run against a real AWS account and will add and remove
real data from DynamoDB. DO NOT hit `CTRL+C` while the tests are running, as this will prevent them from cleaning up
temporary tables and data in DynamoDB. We are not responsible for any charges you may incur.

Before running the tests, you must configure your AWS credentials as explained in the [DynamoDB locking
prerequisites](#dynamodb-locking-prerequisites) section.

To run all the tests:

```bash
go test -v -parallel 128 $(glide novendor)
```
Contributor

glide novendor! Didn't realize this existed!

To run a single test called `TestFoo`:

```bash
go test -v -parallel 128 -run TestFoo
```

#### Releasing new versions

To release a new version, just go to the [Releases Page](https://github.com/gruntwork-io/terragrunt/releases) and
create a new release. The CircleCI job for this repo has been configured to:

1. Automatically detect new tags.
1. Build binaries for every OS using that tag as a version number.
1. Upload the binaries to the release in GitHub.

See `circle.yml` and `_ci/build-and-push-release-asset.sh` for details.

## TODO

* Implement best-practices in Terragrunt, such as checking if all changes are committed, calling `terraform get`,
calling `terraform configure`, etc.
* Consider implementing alternative locking mechanisms, such as using Git instead of DynamoDB.
* Consider embedding the Terraform Go code within Terragrunt instead of calling out to it.
Contributor

Consider embedding the Terraform Go code within Terragrunt instead of calling out to it.

I'm not a fan of this personally. It means that we now have to make sure we're tracking the main terraform repo to ensure that how we invoke each command is identical to how terraform invokes each command. At least if we shell out, we guarantee we're using their "official" interface.

Member Author

That's my thinking as well, but there is one compelling reason to consider embedding in the future: we get full control over the tfstate files. I've never been a fan of Terraform's cavalier attitude with a) storing secrets in tfstate files and then b) copying them, unencrypted, to any system where you run Terraform.

One solution for a future version would be for Terragrunt to manage tfstate completely. It could store it in an encrypted remote store (e.g. S3, Vault) or encrypt it itself using KMS and store it wherever (e.g. DynamoDB, along with the locks). When you run terragrunt, it would load the tfstate files into memory (not disk) and feed them directly to the embedded Terraform Go code.