feat: Adding AWS stack walkthrough #3

Merged on Sep 5, 2024 (1 commit)

@yhakbar yhakbar commented Sep 4, 2024

Adds a practical example of how Stacks can be used to solve problems with real AWS infrastructure.

Closes #1.
Closes #2.

Review notes:
This will be pretty hard to read in the PR. It's probably easier to just start reading from here

@yhakbar yhakbar merged commit b061355 into main Sep 5, 2024
@yhakbar yhakbar deleted the feat/adding-aws-stack branch September 5, 2024 19:41
Comment on lines +3 to +5
It's a safe guess that most Terragrunt users use it to manage AWS infrastructure.

Here we'll take a look at how Stacks can be used to manage AWS infrastructure in a more scalable way.

@josh-padnick commented:

/s/Many Terragrunt users use Terragrunt to manage AWS infrastructure, so here we'll take a look...

@yhakbar (Owner, Author) replied:

Do we actually know that to be true? I didn't.


When writing up this walkthrough, I did so in the following order:

1. I created the `live/dev/services/api/terragrunt.hcl` file and created a `live/dev/services/api/main.tf` file with the minimal configuration to create a Lambda function.

@josh-padnick commented:

and created a modules/api/main.tf

@josh-padnick commented:

Oh, I see you really did create it in live first. You might say that you built a blueprint/module first. Otherwise, it's confusing why you're writing Tofu code where the Terragrunt code normally lives.

@yhakbar (Owner, Author) replied:

It's how we teach users Terragrunt:
https://terragrunt.gruntwork.io/docs/getting-started/quick-start/

It's the "minimal overhead" approach to adopting Terragrunt if you're currently using OpenTofu.

@josh-padnick commented:

Understood. In my case, I was confused why you started that way. Might be worth clarifying.

@josh-padnick left a review comment:

Lots of great content and ideas here, and this is a solid notch better than the previous walkthroughs, but still some improvements to make IMO. Thank you for considering all the feedback!

walkthrough/05-aws/01-no-stacks/README.md

When writing up this walkthrough, I did so in the following order:

1. I created the `live/dev/services/api/terragrunt.hcl` file and created a `live/dev/services/api/main.tf` file with the minimal configuration to create a Lambda function.

6. Finally, I set up the `terragrunt.hcl` file at the root of this walkthrough so that the contents in the `live` directory could be stored in a central state backend, using an S3 bucket.

This is a common pattern when writing IaC configurations. Start with adding a small piece of the infrastructure, and iterate on it until it's correct. Then, refactor out the common patterns into modules and shared configurations, then repeat the process.

This comment was marked as resolved.

walkthrough/05-aws/02-stacks/README.md

## Next Steps

The [next chapter](../03-no-shared/) will demonstrate how a little more refactoring can further reduce the amount of code that has to be maintained.

@josh-padnick commented:

Separate comment, but I'm looking at:

And they're copying the stack definition. So...if I decide that all my stack instances should now have an S3 bucket, I have to manually update every one of these definitions. If I have to do that...what's the point of using stacks?

@yhakbar (Owner, Author) replied:

The point at this stage is that the Units are defined separately here, once, instead of being repeated in each "Stack" as copied terragrunt.hcl files.

This way, if dev and prod reference the same Unit definition, there's only one place to edit Unit configurations instead of N places.

As you point out later, there is a DRY-er approach where the entire Stack is an encapsulated artifact, but users should be able to choose the level of abstraction that works for them.
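
To make the trade-off concrete, here is a minimal sketch of the layout being described. It assumes the proposed `unit` block takes the same `source` and `path` attributes as the `stack` block quoted elsewhere in this review; the unit names and the `units/` directory are hypothetical and not taken from the walkthrough.

```hcl
# live/dev/terragrunt.stack.hcl (hypothetical path)
# live/prod/terragrunt.stack.hcl would carry a copy of this same file.

# Each unit block points at a single, shared Unit definition.
# Assumption: `unit` accepts `source` and `path`, mirroring `stack`.
unit "api" {
  source = "../../units/api" # hypothetical shared Unit definition
  path   = "api"
}

unit "db" {
  source = "../../units/db" # hypothetical shared Unit definition
  path   = "db"
}
```

Under these assumptions, editing `units/api/terragrunt.hcl` changes the Unit for both environments, while adding a new Unit to every environment still means editing each copied `terragrunt.stack.hcl`, which is the limitation raised in the comment above.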

@josh-padnick commented:

I actually love that ability to choose the abstraction level, but it wasn't clear to me that's an option here, so might be worth making that more explicit -- that here we're copying the stack definition, but that if you want to reuse the same definition, you'll show that later.

## Next Steps

The [next chapter](../03-no-shared/) will demonstrate how a little more refactoring can further reduce the amount of code that has to be maintained.

@josh-padnick commented:

It's not clear why the mock-stage-generate.sh file is there, though maybe that goes away when we publish this to the community b/c stacks will have been implemented.

@@ -0,0 +1,179 @@
# 03 - Reusing Stacks

This chapter will detail how Stacks themselves can be re-used by Stacks to reproduce a set of infrastructure instead of just reproducing individual Units.

@josh-padnick commented:

Why are we jumping right to this advanced use case of recursive stacks...before we've even shown inputs, or really played around with the stacks lifecycle at all?

Before getting into recursion, I'd expect you to walk through:

  • Anatomy of a stack
    • New blocks like unit
    • How you define a unit template
    • How you define a stack
    • How you reuse a stack
    • How you pass input vals
  • What happens when you update a stack definition
    • What happens when you update a stack instance configuration
  • What happens when you destroy a stack

@josh-padnick commented:

Ohh...you're using recursion to introduce the concept of a stack definition. Ok, I think what's throwing me off here is that I'm thinking in terms of how I want to solve the problem whereas you're writing to showcase one feature at a time. So when the feature you intro seems "incomplete" to me, I get confused.

I love that I can look through the file system that corresponds to your README; that's a really neat idea, but maybe we should treat this as an "advanced" walkthrough and create a separate "get an idea of how it works" walkthrough that introduces the concepts in the following order:

  • Plain old Terraform/Tofu
  • Terragrunt Units
  • Terragrunt stacks with a unique definition per stack and no inputs
  • Terragrunt stacks with a shared definition
  • Terragrunt stacks with inputs

stack "stateful_service" {
  source = "../../stacks/stateful-service"
  path   = "stateful-service"
}

@josh-padnick commented:

It's confusing that we use the file name terragrunt.stack.hcl for both an instance and the definition of the stack. I wonder if we should introduce a convention like the following:

  • terragrunt.stack.hcl and terragrunt.stack.def.hcl are both valid stack file names
  • We recommend using terragrunt.stack.def.hcl to define your Stack definitions, though you can use them for whatever you want.
  • We might do something similar with units (e.g. terragrunt.def.hcl)

@yhakbar (Owner, Author) replied:

I don't see what that buys us.

Users are already using OpenTofu/Terraform, where .tf files are used both for module files and instantiating modules. I don't think we need to complicate matters, at least not for the initial release, if a more minimal implementation with one valid filename will do the job just as well.

...
name = "${local.project_name}-${basename(get_terragrunt_dir())}-${local.random_id}-${local.environment}"
...
```

@josh-padnick commented:

The idea that we use the inputs of the unit for the inputs of the stack is not at all expected, but it's actually really cool. I love the idea that you could define the unit in such a way that the path sets the input value.

But...it's hard to read and understand, so I'd still really like to see us introduce a "base case" approach to passing inputs to units where you can see inputs specified inline. We could then highlight the limitations of this and call out that users could choose from different options with different pros/cons.

At the end of the day, it's just really jarring to see an instance of a stack (a terragrunt.stack.hcl file), and see no configuration whatsoever. The fact that the stack instance and the stack instance configuration are separated or inferred is really surprising, so I'm worried that both Stack authors and Stack consumers (people who just drop into the codebase) are going to be confused by what's happening.
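
As an illustration of the two styles contrasted in this comment, the following sketch shows a Unit where one input is inferred from the directory the Unit lives in and another is given as an inline literal. `get_terragrunt_dir()` and `basename()` are existing functions and appear in the quoted line above; the file path, input names, and values are hypothetical.

```hcl
# units/api/terragrunt.hcl (hypothetical path)

locals {
  # get_terragrunt_dir() returns the directory containing this terragrunt.hcl;
  # basename() keeps only its final path segment.
  unit_name = basename(get_terragrunt_dir())
}

inputs = {
  # Inferred: the value follows from where the Unit is placed on disk.
  name = "example-${local.unit_name}"

  # Inline literal: the value is visible directly where it is set.
  memory_size = 128
}
```

The inferred form needs no edits when the Unit is copied to a new location, at the cost of being harder to read at a glance; the literal form is the reverse.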

@yhakbar (Owner, Author) replied:

Stacks don't have inputs. The single responsibility of a Stack is to define how Units (or other Stacks) fit together as part of a collective.

I think this is an X Y problem. The thing users want isn't actually to set inputs on Stacks. It's to have Stacks with dynamic Units.

I know we've debated the pros and cons of this quite a bit, but I would like to make two final arguments for why I think we should start with this approach:

  1. This pattern is what users currently have in their codebases. Terragrunt users currently don't have a way to parameterize terragrunt.hcl files, so they already have logic like this to dynamically determine inputs based on pulling in external configurations, using the names of directories, etc.

    Having this be how you make Stacks dynamic on release as the "base case" is how we maximize the likelihood that zero code changes to terragrunt.hcl files have to occur in order to adopt Stacks (note that in this walkthrough I explicitly pointed out, under "This chapter was authored by:", that I didn't edit the terragrunt.hcl files at all).

    In the next chapter, the only way terragrunt.hcl files are edited is to simplify them using configurations that already exist in Terragrunt, and work today. No new syntax, and less indirection.

  2. This walkthrough wouldn't work with inputs on terragrunt.stack.hcl files.

    Note that all of the examples in this repository actually work using the latest release of Terragrunt. The only "magic" functionality used here is the bash script that generates the Stack.

    While I could just tell readers of the walkthrough to use their imaginations, the practical takeaway of that fact is things like:

    a. Introducing Stacks involves fewer moving parts in the codebase (no additional logic has to be introduced to have inputs merged between those defined on terragrunt.hcl files and terragrunt.stack.hcl files via some sort of override mechanism).

    b. The CI systems that users have built will require no updates. Users can adopt Stacks with zero changes to their CI systems, as from an operational perspective, Stacks are transparent. Units work just like they always have, and users can leverage diffs, etc to determine which Units need updates.

OK, it was kind of four arguments, but hopefully this is persuasive that we are fine to move forward with the RFC as is.

@josh-padnick commented:

Hm, this point I disagree on.

> The thing users want isn't actually to set inputs on Stacks. It's to have Stacks with dynamic Units.

Strong disagree. Consider Brian's canonical use case from GruntCon: deploy a common set of units, each with slightly different configuration. At the end of the day, there is a need for both a stack definition and stack instance, and each stack instance needs the potential for unique configuration. In my mind, having some solution for each of these is table stakes for this feature.

At an absolute minimum, we need to demonstrate how we handle each of those concepts with the proposed implementation.

> Terragrunt users currently don't have a way to parameterize terragrunt.hcl files

In my mind the Tofu module is the "unit definition" and the terragrunt.hcl file is the "unit instance", and you specify inputs to give configuration.

I don't have more time to dig into this right now, but I'd be grateful if you could address the above points.

@yhakbar (Owner, Author) replied:

Is it clear that the same terragrunt.stack.hcl file is the definition of a stack, and the instance of a stack in different contexts?

In this example, we're using this file as the "definition" of a stack:
https://github.com/yhakbar/terragrunt-3313-stacks-walkthrough/blob/main/walkthrough/05-aws/04-reusing-stacks/stacks/stateful-service/terragrunt.stack.hcl

When a stack references it:
https://github.com/yhakbar/terragrunt-3313-stacks-walkthrough/blob/main/walkthrough/05-aws/04-reusing-stacks/live/dev/terragrunt.stack.hcl

It generates a .terragrunt-stack/stateful-service directory containing that terragrunt.stack.hcl file, which then recursively generates more .terragrunt-stack directories until all Stacks have been generated.

The terragrunt.stack.hcl file in the .terragrunt-stack/stateful-service directory is now an "instance" of the stack, with the same contents.

The only file that's committed to the repo is the "definition", but having the file work both ways is what allows Stacks to be recursively nested.

Each instance does have the potential for unique configuration; it just doesn't get it via inputs.

I tried to explain this here. If you could, please let me know what part of that doesn't address the concerns here.
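
The definition/instance relationship described above can be sketched as follows. The `stack` block is the one quoted earlier in this review; the contents of the definition file are an assumption (the `unit` block is assumed to take `source` and `path` like `stack`, and the unit names and sources are hypothetical).

```hcl
# stacks/stateful-service/terragrunt.stack.hcl
# The committed "definition". Unit names and sources are hypothetical.
unit "db" {
  source = "../../units/db"
  path   = "db"
}

unit "api" {
  source = "../../units/api"
  path   = "api"
}

# live/dev/terragrunt.stack.hcl
# References the definition above (block quoted from the review).
stack "stateful_service" {
  source = "../../stacks/stateful-service"
  path   = "stateful-service"
}

# After generation, per the description above, a copy of the definition
# appears as an "instance" at:
#   live/dev/.terragrunt-stack/stateful-service/terragrunt.stack.hcl
# and is itself processed to generate further .terragrunt-stack directories.
# How relative sources resolve after generation is not covered by this sketch.
```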


> TBD
>
> Please submit feedback if you would like to see more content here.

@josh-padnick commented:

One thing missing here is the canonical use case of specifying different configurations for different stack instances using plain old literals. Could you show how each stack would receive different literal values, similar to how each terragrunt.hcl file can specify literals directly in the inputs?

@yhakbar (Owner, Author) replied:

They don't.

The design here is that you always define all the configurations for all of your inputs to OpenTofu/Terraform modules in the terragrunt.hcl files. Full stop. There's no hidden secondary way to override values.

Authors of terragrunt.hcl files have to decide that they want to pull in external configurations, like the environment.hcl file, rather than having overriding configurations pushed down from above, further removed from the context in which the inputs are used in OpenTofu/Terraform modules.
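
A minimal sketch of the "pull" pattern described here, assuming an `environment.hcl` file sits in a parent directory of the Unit. `read_terragrunt_config` and `find_in_parent_folders` are existing Terragrunt functions; the file contents, local names, and input names are hypothetical.

```hcl
# --- live/dev/environment.hcl (hypothetical contents) ---
locals {
  environment = "dev"
}

# --- terragrunt.hcl for a Unit somewhere below live/dev ---
locals {
  # Walk up the directory tree to the nearest environment.hcl and parse it.
  env_config = read_terragrunt_config(find_in_parent_folders("environment.hcl"))

  # Values from the parsed file's locals block are exposed under `.locals`.
  environment = local.env_config.locals.environment
}

inputs = {
  # Every input to the OpenTofu/Terraform module is still set here,
  # in the Unit itself; nothing is overridden from above.
  environment = local.environment
  name        = "example-api-${local.environment}"
}
```

With this shape, a second environment gets different values by placing a different `environment.hcl` above the same Unit configuration, rather than by passing inputs through the Stack.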

@josh-padnick commented:

Understood, but from everything I've seen, users want that per-stack configuration. The fact that it's "configurable" that way is actually pretty cool, but it's not at all obvious. Could you update the walkthrough to highlight this very point? Mainly, that:

  1. All inputs are defined in the units themselves
  2. But if you wanted each stack to have its own unique configuration, here's an example of how you'd configure the units and the stack itself.

It'd be especially helpful to see this with a canonical stack definition which has several stack instances, each with different configuration.

If I may, I understand your pushback on some of my comments, but the idea of stack definitions and stack instances seems fundamental to the problem, so I ask that you explicitly address that use case, even if the design doesn't treat those as first-class concepts.

@josh-padnick commented:

One final thought: I'm a little confused about the role of all the different walkthroughs. We have the "chicken" one, and now this is the "AWS" one. They're each thoughtful, thorough, and insightful, but what I'm really looking for is a single, minimal walkthrough of the stacks concept that we can include in a blog post (and as a summary for the GitHub issue RFC) that introduces the stack concept, the elements of a stack (e.g. the unit block), and talks about how you'd want to use stacks from the perspective of common use cases:

  1. Hello world stack
  2. Shared stack definition
  3. Shared stack definition with per-stack config

In that example, it'd be helpful to use simple AWS concepts like VPC, Lambda, and RDS.

My end goal is that I'd like someone -- as quickly as possible -- to absorb the stacks concept from reading the walkthrough so that they can more quickly react to it.

Successfully merging this pull request may close these issues:

  • Explain Why .terraform.lock.hcl Files Exist in the Walkthrough
  • Add Practical Examples Using AWS Resources

2 participants