-
Notifications
You must be signed in to change notification settings - Fork 9.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core: Internal implementation of the "refresh only" planning mode and "replace" planning option #28560
Conversation
This only includes the core mechanisms to make it work. There's not yet any way to turn this mode on as an end-user, because we have to do some more work at the UI layer to present this well before we could include it as an end-user-visible feature in a release. At the lowest level of abstraction inside the graph nodes themselves, this effectively mirrors the existing option to disable refreshing with a new option to disable change-planning, so that either "half" of the process can be disabled. As far as the nodes are concerned it would be possible in principle to disable _both_, but the higher-level representation of these modes prevents that combination from reaching Terraform Core in practice, because we block using -refresh-only and -refresh=false at the same time.
This only includes the internal mechanisms to make it work, and not any of the necessary UI changes to "terraform plan" and "terraform apply" to activate it yet. The force-replace options are ultimately handled inside the NodeAbstractResourceInstance.plan method, at the same place we handle the similar situation of the provider indicating that replacement is needed, and so the rest of the changes here are just to propagate the settings through all of the layers in order to reach that point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's see if my inline comments take this time 😉
// This is saved only for UI purposes, so that Terraform can tailor its | ||
// rendering of the plan depending on the mode. This must never be used to | ||
// make decisions in Terraform Core during the applying of a plan. | ||
Mode mode = 17; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about changing the name of this to really make the intent obvious? The comment is great, and I'm happy to call it sufficient, but I wouldn't mind renaming the field ModeForDisplay
or something that's as descriptive as some of our functions that only print for-display-purposes-only strings (literally ForDisplay()
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh! If we foresee other similar fields in the future, I wouldn't be opposed to a plans.DisplayInfo
(or whatever name makes more sense) struct that contains this information, and those other fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point, and the only thing that gives me pause is that we usually use "for display" to mean something than be directly displayed to the user, rather than an indicator used as part of choosing what to display. But the string representations of these modes are not user-friendly, at least as currently designed.
I'm thinking about ui_mode
as an alternative. I think it isn't quite right but hopefully at least sufficient to prompt a potential user of it to notice the prefix is there and wonder what it means enough to come read this comment. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
r.e. similar fields: I hesitate to generalize this too soon cause we only have two examples of this so far and they've been in different parts of the structure anyway. Since our plan file format isn't something we promise as stable between releases we will be free to reshape this later if we have new requirements that suggest a different design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely, I'm down with ui_mode
- anything that makes the scope of that field unavoidably obvious to a developer (me!) is awesome.
I also agree (not that you said this directly) that my second suggestion is a premature optimization, and we can see what's actually useful in time.
// address to match, but just in case a user forgets to include the | ||
// instance key for a multi-instance resource we'll give them a | ||
// warning hint. | ||
if n.Addr.Resource.Key != addrs.NoKey && candidateAddr.Resource.Key == addrs.NoKey { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 I'm ambivalent on this. It's a really great message for the user, but this is going to generate a diagnostic for every ResourceInstance, which would be pretty unpleasant with anything more than a small number of resource instances. The test case, for example, generates two messages. Can we do this validation earlier, in the command package instead of terraform perhaps? Check if the replace target references an existing resource from state but missing a key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I liked doing this down here because it allows the error message to include a precise example of what to run to get the intended result, rather than making the user figure out themselves what the instance keys for this resource might be. But I agree it does seem quite bulky.
In principle we could check it earlier in the pre-expansion graph node and generate one message that lists out N separate options to add, but my assumption here is that the primary use-case for this option is that one object has become degraded and needs replacing, and so it felt weird to encourage just unilaterally replacing the whole set rather than picking out just the one you intended to select.
🤔 I'm not sure either. What do you think about running with this implementation for now and then considering adjustments once we see how this option gets used? Hopefully some real-world feedback will help evaluate what cases are most common here and thus what the best UI might be, and specific error message text is never a compatibility constraint so we can freely change it over time as we learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrm, I see what you're saying. My feeling is that anything more than one message is too many, and anyone with, say, 3 or more resource instances will end up with too much output to get a useful signal. I would rather see an earlier message that said "here are all of the instances of this resource address, pick one", but I'm also amenable to your suggestion that we wait and see what feedback we get.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd have to see what the actual output is, but I have 2 suspicions about what it will look like... the user misses one instance key and gets a wall of diagnostic showing lots of instances they don't care about, or the diagnostics are collapsed and they get a message about one random instance they don't care about.
I think it may be better to do this earlier and craft a message about the address with a list of possible keys. We can also run with this to get the implementation in place and change it later if we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Emitting a single message earlier would mean doing this check again in another place, which is fine but I'm inclined to save it for a later enhancement because while I'm still working through this it's likely I'll be fiddling with these details some more before we reach the end of this PR sequence and I'd rather only churn one place. I'll return to this once I get stuck into the UI work for both of these and see about adding a new check in a different graph node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all LGTM; I approve regardless of what you decide to change - including nothing! - based on our conversation.
This is to make it more obvious at all uses of this field that it's not something to be used for anything other than UI decisions, hopefully prompting a reader of code elsewhere to refer to the comments to understand why it has this unusual prefix and thus see what its intended purpose is.
I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions. |
This PR adds the internal handling of what will eventually become the
-refresh-only
and-replace=ADDR
options when creating a plan. There's not yet any UI-level code to activate these two, so for the moment these are just dead code from an end-user perspective, but I want to land these parts first and then iterate separately on how to present the effects of these in the UI, since I expect that'll be a more subjective discussion with more tradeoffs to weigh.For "refresh only",at the lowest level of abstraction inside the graph nodes themselves this effectively mirrors the existing option to disable refreshing with a new option to disable change-planning, so that either "half" of the process can be disabled. As far as the nodes are concerned it would be possible in principle to disable both, but the higher-level representation of these modes prevents that combination from reaching Terraform Core in practice, because we block activating both "skip refresh" and the refresh-only planning mode at the same time when creating a context.
The force-replace option is ultimately handled inside the
NodeAbstractResourceInstance.plan
method, in the same place we handle the similar situation of the provider indicating that replacement is needed, and so the rest of the changes here are just to propagate the settings through all of the layers in order to reach that point. The only externally-visible difference in the result between those two situations is the newActionReason
field we added earlier in #28544, which we'll use in a later commit to describe this situation differently when we present the planned changes to the user.This also introduces a new
Mode
field for plans which records which mode was active when the plan was created. As withActionReason
in the earlier PR, this is something we'd intentionally excluded so far in fear of it becoming a load-bearing part of the plan/apply process, but our later UI work here will need to include an explanation in the UI for why a refresh-only plan doesn't plan to change anything and so having core report that out, rather than replicating similar logic in the UI layer, again seems preferable to ensure that these will always stay aligned over time as the rules might change. This field is explicitly documented as for presentation only, since the details in theChanges
object should always be totally sufficient to correctly execute the apply step: the plan mode just affects what sort of actions Terraform might include in the plan, not how those chosen actions are to be executed.Since these two changes have some top-level overlap but are otherwise in different parts of the core package, it might be easier to review them on a per-commit basis.