core: Internal implementation of the "refresh only" planning mode and "replace" planning option #28560

apparentlymart · 2021-04-30T01:14:12Z

This PR adds the internal handling of what will eventually become the -refresh-only and -replace=ADDR options when creating a plan. There's not yet any UI-level code to activate these two, so for the moment these are just dead code from an end-user perspective, but I want to land these parts first and then iterate separately on how to present the effects of these in the UI, since I expect that'll be a more subjective discussion with more tradeoffs to weigh.

For "refresh only",at the lowest level of abstraction inside the graph nodes themselves this effectively mirrors the existing option to disable refreshing with a new option to disable change-planning, so that either "half" of the process can be disabled. As far as the nodes are concerned it would be possible in principle to disable both, but the higher-level representation of these modes prevents that combination from reaching Terraform Core in practice, because we block activating both "skip refresh" and the refresh-only planning mode at the same time when creating a context.

The force-replace option is ultimately handled inside the NodeAbstractResourceInstance.plan method, in the same place we handle the similar situation of the provider indicating that replacement is needed, and so the rest of the changes here are just to propagate the settings through all of the layers in order to reach that point. The only externally-visible difference in the result between those two situations is the new ActionReason field we added earlier in #28544, which we'll use in a later commit to describe this situation differently when we present the planned changes to the user.

This also introduces a new Mode field for plans which records which mode was active when the plan was created. As with ActionReason in the earlier PR, this is something we'd intentionally excluded so far in fear of it becoming a load-bearing part of the plan/apply process, but our later UI work here will need to include an explanation in the UI for why a refresh-only plan doesn't plan to change anything and so having core report that out, rather than replicating similar logic in the UI layer, again seems preferable to ensure that these will always stay aligned over time as the rules might change. This field is explicitly documented as for presentation only, since the details in the Changes object should always be totally sufficient to correctly execute the apply step: the plan mode just affects what sort of actions Terraform might include in the plan, not how those chosen actions are to be executed.

Since these two changes have some top-level overlap but are otherwise in different parts of the core package, it might be easier to review them on a per-commit basis.

This only includes the core mechanisms to make it work. There's not yet any way to turn this mode on as an end-user, because we have to do some more work at the UI layer to present this well before we could include it as an end-user-visible feature in a release. At the lowest level of abstraction inside the graph nodes themselves, this effectively mirrors the existing option to disable refreshing with a new option to disable change-planning, so that either "half" of the process can be disabled. As far as the nodes are concerned it would be possible in principle to disable _both_, but the higher-level representation of these modes prevents that combination from reaching Terraform Core in practice, because we block using -refresh-only and -refresh=false at the same time.

This only includes the internal mechanisms to make it work, and not any of the necessary UI changes to "terraform plan" and "terraform apply" to activate it yet. The force-replace options are ultimately handled inside the NodeAbstractResourceInstance.plan method, at the same place we handle the similar situation of the provider indicating that replacement is needed, and so the rest of the changes here are just to propagate the settings through all of the layers in order to reach that point.

mildwonkey

Let's see if my inline comments take this time 😉

mildwonkey · 2021-04-30T12:41:42Z

plans/internal/planproto/planfile.proto

+    // This is saved only for UI purposes, so that Terraform can tailor its
+    // rendering of the plan depending on the mode. This must never be used to
+    // make decisions in Terraform Core during the applying of a plan.
+    Mode mode = 17;


What do you think about changing the name of this to really make the intent obvious? The comment is great, and I'm happy to call it sufficient, but I wouldn't mind renaming the field ModeForDisplay or something that's as descriptive as some of our functions that only print for-display-purposes-only strings (literally ForDisplay())

Ooh! If we foresee other similar fields in the future, I wouldn't be opposed to a plans.DisplayInfo (or whatever name makes more sense) struct that contains this information, and those other fields.

This is a good point, and the only thing that gives me pause is that we usually use "for display" to mean something than be directly displayed to the user, rather than an indicator used as part of choosing what to display. But the string representations of these modes are not user-friendly, at least as currently designed.

I'm thinking about ui_mode as an alternative. I think it isn't quite right but hopefully at least sufficient to prompt a potential user of it to notice the prefix is there and wonder what it means enough to come read this comment. What do you think?

r.e. similar fields: I hesitate to generalize this too soon cause we only have two examples of this so far and they've been in different parts of the structure anyway. Since our plan file format isn't something we promise as stable between releases we will be free to reshape this later if we have new requirements that suggest a different design.

Absolutely, I'm down with ui_mode - anything that makes the scope of that field unavoidably obvious to a developer (me!) is awesome.
I also agree (not that you said this directly) that my second suggestion is a premature optimization, and we can see what's actually useful in time.

mildwonkey · 2021-04-30T12:56:22Z

terraform/node_resource_abstract_instance.go

+		// address to match, but just in case a user forgets to include the
+		// instance key for a multi-instance resource we'll give them a
+		// warning hint.
+		if n.Addr.Resource.Key != addrs.NoKey && candidateAddr.Resource.Key == addrs.NoKey {


🤔 I'm ambivalent on this. It's a really great message for the user, but this is going to generate a diagnostic for every ResourceInstance, which would be pretty unpleasant with anything more than a small number of resource instances. The test case, for example, generates two messages. Can we do this validation earlier, in the command package instead of terraform perhaps? Check if the replace target references an existing resource from state but missing a key?

I liked doing this down here because it allows the error message to include a precise example of what to run to get the intended result, rather than making the user figure out themselves what the instance keys for this resource might be. But I agree it does seem quite bulky.

In principle we could check it earlier in the pre-expansion graph node and generate one message that lists out N separate options to add, but my assumption here is that the primary use-case for this option is that one object has become degraded and needs replacing, and so it felt weird to encourage just unilaterally replacing the whole set rather than picking out just the one you intended to select.

🤔 I'm not sure either. What do you think about running with this implementation for now and then considering adjustments once we see how this option gets used? Hopefully some real-world feedback will help evaluate what cases are most common here and thus what the best UI might be, and specific error message text is never a compatibility constraint so we can freely change it over time as we learn more.

Hrm, I see what you're saying. My feeling is that anything more than one message is too many, and anyone with, say, 3 or more resource instances will end up with too much output to get a useful signal. I would rather see an earlier message that said "here are all of the instances of this resource address, pick one", but I'm also amenable to your suggestion that we wait and see what feedback we get.

I'd have to see what the actual output is, but I have 2 suspicions about what it will look like... the user misses one instance key and gets a wall of diagnostic showing lots of instances they don't care about, or the diagnostics are collapsed and they get a message about one random instance they don't care about.

I think it may be better to do this earlier and craft a message about the address with a list of possible keys. We can also run with this to get the implementation in place and change it later if we want.

Emitting a single message earlier would mean doing this check again in another place, which is fine but I'm inclined to save it for a later enhancement because while I'm still working through this it's likely I'll be fiddling with these details some more before we reach the end of this PR sequence and I'd rather only churn one place. I'll return to this once I get stuck into the UI work for both of these and see about adding a new check in a different graph node.

mildwonkey

This all LGTM; I approve regardless of what you decide to change - including nothing! - based on our conversation.

This is to make it more obvious at all uses of this field that it's not something to be used for anything other than UI decisions, hopefully prompting a reader of code elsewhere to refer to the comments to understand why it has this unusual prefix and thus see what its intended purpose is.

github-actions · 2021-05-31T02:44:19Z

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

apparentlymart added 2 commits April 29, 2021 17:58

apparentlymart added enhancement core 0.15-backport If you add this label to a PR before merging, backport-assistant will open a new PR once merged labels Apr 30, 2021

apparentlymart requested a review from a team April 30, 2021 01:14

apparentlymart self-assigned this Apr 30, 2021

mildwonkey reviewed Apr 30, 2021

View reviewed changes

mildwonkey approved these changes Apr 30, 2021

View reviewed changes

apparentlymart merged commit 06adc69 into main Apr 30, 2021

apparentlymart deleted the f-plan-refresh-only-and-replace branch April 30, 2021 17:31

teamterraform mentioned this pull request Apr 30, 2021

Backport of core: Internal implementation of the "refresh only" planning mode and "replace" planning option into v0.15 #28568

Closed

This was referenced May 1, 2021

command: New -replace=... planning option #28574

Merged

Refresh-only plans and forced-replacement #28297

Closed

github-actions bot locked as resolved and limited conversation to collaborators May 31, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

core: Internal implementation of the "refresh only" planning mode and "replace" planning option #28560

core: Internal implementation of the "refresh only" planning mode and "replace" planning option #28560

apparentlymart commented Apr 30, 2021

mildwonkey left a comment

mildwonkey Apr 30, 2021

mildwonkey Apr 30, 2021

apparentlymart Apr 30, 2021

apparentlymart Apr 30, 2021

mildwonkey Apr 30, 2021

mildwonkey Apr 30, 2021

apparentlymart Apr 30, 2021

mildwonkey Apr 30, 2021

jbardin Apr 30, 2021

apparentlymart Apr 30, 2021

mildwonkey left a comment

github-actions bot commented May 31, 2021

core: Internal implementation of the "refresh only" planning mode and "replace" planning option #28560

core: Internal implementation of the "refresh only" planning mode and "replace" planning option #28560

Conversation

apparentlymart commented Apr 30, 2021

mildwonkey left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mildwonkey left a comment

Choose a reason for hiding this comment

github-actions bot commented May 31, 2021