feat: Adding AWS stack walkthrough #3

yhakbar · 2024-09-04T21:59:11Z

Adds a practical example of how Stacks can be used to solve problems with real AWS infrastructure.

Closes #1.
Closes #2.

Review notes:
This will be pretty hard to read in the PR. It's probably easier to just start reading from here

josh-padnick · 2024-09-06T01:47:00Z

walkthrough/05-aws/README.md

+It's a safe guess that most Terragrunt users use it to manage AWS infrastructure.
+
+Here we'll take a look at how Stacks can be used to manage AWS infrastructure in a more scalable way.


/s/Many Terragrunt users use Terragrunt to manage AWS infrastructure, so here we'll take a look...

Do we actually know that to be true? I didn't.

josh-padnick · 2024-09-06T01:49:47Z

walkthrough/05-aws/01-no-stacks/README.md

+
+When writing up this walkthrough, I did so in the following order:
+
+1. I created the `live/dev/services/api/terragrunt.hcl` file and created a `live/dev/services/api/main.tf` file with the minimal configuration to create a Lambda function.


and created a modules/api/main.tf

Oh, I see you really did create it in live first. You might say that you built a blueprint/module first. Otherwise, it's confusing why you're writing Tofu code where the Terragrunt code normally lives.

It's how we teach users Terragrunt:
https://terragrunt.gruntwork.io/docs/getting-started/quick-start/

It's the "minimal overhead" approach to adopting Terragrunt if you're currently using OpenTofu.

Understood. In my case, I was confused why you started that way. Might be worth clarifying.

josh-padnick

Lots of great content and ideas here, and this is a solid notch better than the previous walkthroughs, but still some improvements to make IMO. Thank you for considering all the feedback!


josh-padnick · 2024-09-06T01:52:07Z

walkthrough/05-aws/01-no-stacks/README.md

+
+When writing up this walkthrough, I did so in the following order:
+
+1. I created the `live/dev/services/api/terragrunt.hcl` file and created a `live/dev/services/api/main.tf` file with the minimal configuration to create a Lambda function.


Oh, I see you really did create it in live first. You might say that you built a blueprint/module first. Otherwise, it's confusing why you're writing Tofu code where the Terragrunt code normally lives.

walkthrough/05-aws/01-no-stacks/README.md

+6. Finally, I setup the `terragrunt.hcl` file at the root of this walkthrough so that the contents in the `live` directory could be stored in a central state backend, using an S3 bucket.
+
+This is a common pattern when writing IaC configurations. Start with adding a small piece of the infrastructure, and iterate on it until it's correct. Then, refactor out the common patterns into modules and shared configurations, then repeat the process.
+



josh-padnick · 2024-09-06T01:59:56Z

walkthrough/05-aws/02-stacks/README.md

+## Next Steps
+
+The [next chapter](../03-no-shared/) will demonstrate how a little more refactoring can further reduce the amount of code that has to be maintained.
+


Separate comment, but I'm looking at:

The dev stack

The prod stack

And they're copying the stack definition. So...if I decide that all my stack instances should now have an S3 bucket, I have to manually update every one of these definitions. If I have to do that...what's the point of using stacks?

The point at this stage is that the definition of the units are separately defined here instead of repeatedly in each "Stack" as copied terragrunt.hcl files.

This way, if dev and prod reference the same Unit definition, there's only one place to edit Unit configurations instead of N places.

As you point out later, there is a DRY-er approach where the entire Stack is an encapsulated artifact, but users should be able to choose the level of abstraction that works for them.

I actually love that ability to choose the abstraction level, but it wasn't clear to me that's an option here, so might be worth making that more explicit -- that here we're copying the stack definition, but that if you want to reuse the same definition, you'll show that later.
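
To make the arrangement discussed in this thread concrete, here is a minimal sketch of a per-environment stack file that points at shared Unit definitions. The `unit` block shape is assumed to mirror the `stack` block shown elsewhere in this PR (`source` plus `path`), and the `units/...` directories and Unit names are hypothetical, not taken from the walkthrough.

```hcl
# live/dev/terragrunt.stack.hcl
# (live/prod/terragrunt.stack.hcl would be a copy of this file at this stage)

# Assumed syntax: each unit block names a shared Unit definition and the
# path it should be generated into. Names and paths are illustrative only.
unit "api" {
  source = "../../units/api" # hypothetical shared Unit definition
  path   = "services/api"
}

unit "db" {
  source = "../../units/db" # hypothetical shared Unit definition
  path   = "services/db"
}
```

Under these assumptions, changing how the `api` Unit is configured means editing `units/api` once, while adding a new Unit to every environment still means editing each copied stack file.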

josh-padnick · 2024-09-06T02:01:03Z

walkthrough/05-aws/02-stacks/README.md

+## Next Steps
+
+The [next chapter](../03-no-shared/) will demonstrate how a little more refactoring can further reduce the amount of code that has to be maintained.
+


It's not clear why the mock-stage-generate.sh file is there, though maybe that goes away when we publish this to the community b/c stacks will have been implemented.

This explains why it's there:
https://github.com/yhakbar/terragrunt-3313-stacks-walkthrough/tree/main/walkthrough/04-stacks/01-basic#01---basic

josh-padnick · 2024-09-06T02:04:51Z

walkthrough/05-aws/04-reusing-stacks/README.md

@@ -0,0 +1,179 @@
+# 03 - Reusing Stacks
+
+This chapter will detail how Stacks themselves can be re-used by Stacks to reproduce a set of infrastructure instead of just reproducing individual Units.


Why are we jumping right to this advanced use case of recursive stacks...before we've even shown inputs, or really played around with the stacks lifecycle at all?

Before getting into recursion, I'd expect you to walk through:

- Anatomy of a stack
- New blocks like unit
- How you define a unit template
- How you define a stack
- How you reuse a stack
- How you pass input vals
- What happens when you update a stack definition
- What happens when you update a stack instance configuration
- What happens when you destroy a stack

Ohh...you're using recursion to introduce the concept of a stack definition. Ok, I think what's throwing me off here is that I'm thinking in terms of how I want to solve the problem whereas you're writing to showcase one feature at a time. So when the feature you intro seems "incomplete" to me, I get confused.

I love that I can look through the file system that corresponds to your README; that's a really neat idea, but maybe we should treat this as an "advanced" walkthrough and create a separate "get an idea of how it works" walkthrough that introduces the concepts in the following order:

1. Plain old Terraform/Tofu
2. Terragrunt Units
3. Terragrunt stacks with a unique definition per stack and no inputs
4. Terragrunt stacks with a shared definition
5. Terragrunt stacks with inputs

That's chapters 1-4, right?

https://github.com/yhakbar/terragrunt-3313-stacks-walkthrough/tree/main/walkthrough

josh-padnick · 2024-09-06T02:15:34Z

walkthrough/05-aws/04-reusing-stacks/live/dev/terragrunt.stack.hcl

+stack "stateful_service" {
+	source = "../../stacks/stateful-service"
+	path   = "stateful-service"
+}


It's confusing that we use the file name terragrunt.stack.hcl for both an instance and the definition of the stack. I wonder if we should introduce a convention like the following:

terragrunt.stack.hcl and terragrunt.stack.def.hcl are both valid stack file names

We recommend using terragrunt.stack.def.hcl to define your Stack definitions, though you can use them for whatever you want.

We might do something similar with units (e.g. terragrunt.def.hcl)

I don't see what that buys us.

Users are already using OpenTofu/Terraform, where .tf files are used both for module files and instantiating modules. I don't think we need to complicate matters, at least not for the initial release, if a more minimal implementation with one valid filename will do the job just as well.

josh-padnick · 2024-09-06T02:21:37Z

walkthrough/05-aws/04-reusing-stacks/README.md

+...
+	name = "${local.project_name}-${basename(get_terragrunt_dir())}-${local.random_id}-${local.environment}"
+...
+```


The idea that we use the inputs of the unit for the inputs of the stack is not at all expected, but it's actually really cool. I love the idea that you could define the unit in such a way that the path sets the input value.

But...it's hard to read and understand, so I'd still really like to see us introduce a "base case" approach to passing inputs to units where you can see inputs specified inline. We could then highlight the limitations of this and call out that users could choose from different options with different pros/cons.

At the end of the day, it's just really jarring to see an instance of a stack (a terragrunt.stack.hcl file), and see no configuration whatsoever. The fact that the stack instance and the stack instance configuration are separated or inferred is really surprising, so I'm worried that both Stack authors and Stack consumers (people who just drop into the codebase) are going to be confused by what's happening.

Stacks don't have inputs. The single responsibility of a Stack is to define how Units (or other Stacks) fit together as part of a collective.

I think this is an X Y problem. The thing users want isn't actually to set inputs on Stacks. It's to have Stacks with dynamic Units.

I know we've debated the pros and cons of this quite a bit, but I would like to make two final arguments for why I think we should start with this approach:

This pattern is what users currently have in their codebases. Terragrunt users currently don't have a way to parameterize terragrunt.hcl files, so they already have logic like this to dynamically determine inputs based on pulling in external configurations, using the names of directories, etc.

Having this be how you make Stacks dynamic on release as the "base case" is how we maximize the likelihood that zero code changes to terragrunt.hcl files have to occur in order to adopt Stacks (note that in this walkthrough I explicitly pointed out that I didn't edit the terragrunt.hcl files at all under "This chapter was authored by:").

In the next chapter, the only way terragrunt.hcl files are edited is to simplify them using configurations that already exist in Terragrunt, and work today. No new syntax, and less indirection.

This walkthrough wouldn't work with inputs on terragrunt.stack.hcl files.

Note that all of the examples in this repository actually work using the latest release of Terragrunt. The only "magic" functionality used here is the bash script that generates the Stack.

While I could just tell readers of the walkthrough to use their imaginations, the practical takeaway of that fact is things like:

a. Introducing Stacks involves fewer moving parts in the codebase (no additional logic has to be introduced to have inputs merged between those defined on terragrunt.hcl files and terragrunt.stack.hcl files via some sort of override mechanism).

b. The CI systems that users have built will require no updates. Users can adopt Stacks with zero changes to their CI systems, as from an operational perspective, Stacks are transparent. Units work just like they always have, and users can leverage diffs, etc. to determine which Units need updates.

OK, it was kind of four arguments, but hopefully this is persuasive that we are fine to move forward with the RFC as is.

Hm, this point I disagree on.

> The thing users want isn't actually to set inputs on Stacks. It's to have Stacks with dynamic Units.

Strong disagree. Consider Brian's canonical use case from GruntCon: deploy a common set of units, each with slightly different configuration. At the end of the day, there is a need for both a stack definition and stack instance, and each stack instance needs the potential for unique configuration. In my mind, having some solution for each of these is table stakes for this feature.

At an absolute minimum, we need to demonstrate how we handle each of those concepts with the proposed implementation.

> Terragrunt users currently don't have a way to parameterize terragrunt.hcl files

In my mind the Tofu module is the "unit definition" and the terragrunt.hcl file is the "unit instance", and you specify inputs to give configuration.
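
As an illustration of that framing, a minimal sketch of a `terragrunt.hcl` acting as a "unit instance" of a module might look like the following. The module path echoes the `modules/api` suggestion earlier in this review; the input names and values are hypothetical.

```hcl
# live/dev/services/api/terragrunt.hcl
# In this framing, the file is the "unit instance".

terraform {
  # The OpenTofu module plays the role of the "unit definition".
  source = "../../../../modules/api" # assumed location of the module
}

# Literal, per-instance configuration passed to the module's variables.
# These input names are illustrative and not taken from the walkthrough.
inputs = {
  name        = "api-dev"
  memory_size = 128
}
```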

I don't have more time to dig into this right now, but I'd be grateful if you could address the above points.

Is it clear that the same terragrunt.stack.hcl file is the definition of a stack, and the instance of a stack in different contexts?

In this example, we're using this file as the "definition" of a stack:
https://github.com/yhakbar/terragrunt-3313-stacks-walkthrough/blob/main/walkthrough/05-aws/04-reusing-stacks/stacks/stateful-service/terragrunt.stack.hcl

When a stack references it:
https://github.com/yhakbar/terragrunt-3313-stacks-walkthrough/blob/main/walkthrough/05-aws/04-reusing-stacks/live/dev/terragrunt.stack.hcl

It generates a .terragrunt-stack/stateful-service directory containing that terragrunt.stack.hcl file, which then recursively generates more .terragrunt-stack directories until all Stacks have been generated.

The terragrunt.stack.hcl file in the .terragrunt-stack/stateful-service directory is now an "instance" of the stack, with the same contents.

The only file that's committed to the repo is the "definition", but having the file work both ways is what allows Stacks to be recursively nested.
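
A rough sketch of that flow, under stated assumptions: the `stack` block is the one from the diff under review, while the `unit` blocks and their names inside the definition are hypothetical, since the contents of the definition file are not reproduced in this thread.

```hcl
# stacks/stateful-service/terragrunt.stack.hcl
# The committed "definition". Unit names and sources are hypothetical.
unit "db" {
  source = "../../units/db"
  path   = "db"
}

unit "api" {
  source = "../../units/api"
  path   = "api"
}

# live/dev/terragrunt.stack.hcl references the definition
# (this block is copied from the diff under review):
#
#   stack "stateful_service" {
#     source = "../../stacks/stateful-service"
#     path   = "stateful-service"
#   }
#
# As described above, generation then places a copy of the definition at
#   live/dev/.terragrunt-stack/stateful-service/terragrunt.stack.hcl
# where the same file now acts as an "instance" and is itself generated,
# repeating until no stack files remain to expand.
```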

Each instance does have the potential for unique configuration, they just don't do it via inputs.

I tried to explain this here. If you could, please let me know what part of that doesn't address the concerns here.

josh-padnick · 2024-09-06T02:22:38Z

walkthrough/05-aws/04-reusing-stacks/README.md

+
+> TBD
+>
+> Please submit feedback if you would like to see more content here.


One thing missing here is the canonical use case of specifying different configurations for different stack instances using plain old literals. Could you show how each stack would receive different literal values, similar to how each terragrunt.hcl file can specify literals directly in the inputs?

They don't.

The design here is that you always define all the configurations for all of your inputs to OpenTofu/Terraform modules in the terragrunt.hcl files. Full stop. There's no hidden secondary way to override values.

Authors of terragrunt.hcl files have to decide that they want to pull external configurations like the environment.hcl file instead of having overriding configurations pushed from above further removed from the context of how the inputs are going to be used in OpenTofu/Terraform modules.
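
A minimal sketch of that pull-based pattern, assuming an `environment.hcl` file somewhere above the Unit that declares a `locals { environment = "dev" }` block (the file's contents and the input names are assumptions for illustration). `read_terragrunt_config` and `find_in_parent_folders` are existing Terragrunt functions.

```hcl
# terragrunt.hcl for a Unit
# All inputs are decided here; nothing is pushed in from a stack file.

locals {
  # Walk up the directory tree to the nearest environment.hcl and parse it.
  # Assumes that file contains: locals { environment = "dev" }
  env_config  = read_terragrunt_config(find_in_parent_folders("environment.hcl"))
  environment = local.env_config.locals.environment
}

inputs = {
  # Hypothetical input names, shown only to illustrate the pattern.
  environment = local.environment
  name        = "api-${local.environment}"
}
```

With this shape, the same Unit generated under a `dev` and a `prod` directory would resolve different values purely from where it sits in the tree.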

Understood, but from everything I've seen, users want that per-stack configuration. The fact that it's "configurable" that way is actually pretty cool, but it's not at all obvious. Could you update the walkthrough to highlight this very point? Mainly, that:

- All inputs are defined in the units themselves
- But if you wanted each stack to have its own unique configuration, here's an example of how you'd configure the units and the stack itself.

It'd be especially helpful to see this with a canonical stack definition which has several stack instances, each with different configuration.

If I may, I understand your pushback on some of my comments, but the idea of stack definitions and stack instances seems fundamental to the problem, so I ask that you explicitly address that use case, even if the design doesn't treat those as first-class concepts.

I mentioned above that I tried to tackle that here:
https://github.com/yhakbar/terragrunt-3313-stacks-walkthrough/tree/main/walkthrough/05-aws/04-reusing-stacks#stack-dynamicity

josh-padnick · 2024-09-09T01:23:49Z

One final thought: I'm a little confused about the role of all the different walkthroughs. We have the "chicken" one, and now this is the "AWS" one. They're each thoughtful, thorough, and insightful, but what I'm really looking for is a single, minimal walkthrough of the stacks concept that we can include in a blog post (and as a summary for the GitHub issue RFC) that introduces the stack concept, the elements of a stack (e.g. the unit block), and talks about how you'd want to use stacks from the perspective of common use cases:

- Hello world stack
- Shared stack definition
- Shared stack definition with per-stack config

In that example, it'd be helpful to use simple AWS concepts like VPC, Lambda, and RDS.

My end goal is that I'd like someone -- as quickly as possible -- to absorb the stacks concept from reading the walkthrough so that they can more quickly react to it.

feat: Adding AWS stack walkthrough

785c722

yhakbar merged commit b061355 into main Sep 5, 2024

yhakbar deleted the feat/adding-aws-stack branch September 5, 2024 19:41

josh-padnick reviewed Sep 6, 2024
