-
Notifications
You must be signed in to change notification settings - Fork 41
Add new resource for Device Network Mode #240
Conversation
The Acceptance tests succeed. |
Hey @t0mk, Also what is the significance of the string |
@c0dyhi11, I was thinking about keeping it in the old way, but if we keep monitoring/changing the network mode in the packet_device, there is no benefit of having the new resource. If the device resource still waits for the network_type I.e. I think that if we won't remove the network_type handling code from the packet_device, there will be no benefit of this PR. The |
Is there a way we can have a very helpful error message if someone tries to implement the old way? Something along those lines? |
@c0dyhi11 Yes, See hashicorp@d25aab3 in this PR. |
Awesome. Let me prep the |
@c0dyhi11 OK. Thanks. Ping me when you need the provider release with this. Also, you can fix the provider version in the google-anthos master branch to 2.10.1 and do a branch for provider 3.0.0. Considering how extensively you use the TF provider there, it's probably a good idea to fix the version anyway. |
I have concerns about the direction of this PR. (other than that, it looks good for what it does 😄 - see some notes below ("If we proceed in this direction:")) Terraform represents the upstream cloud provider API and the expectation for Terraform users is for the TF provider to faithfully represent the upstream API except in places where that API is not stateful and must be reinterpreted in a stateful way. In the Packet API, The problem being addressed here is that the Packet API is, at times, returning incorrect responses (suspected to be caching). This is a Packet API problem and should be addressed there. This PR has found a creative way to circumvent the caching problem, through cache-busting strategies introduced in this PR (by introducing excludes parameters in the API request). This strategy could be applied whenever the Device state is being fetched (fetch the device and then fetch the network_type with a cache-buster). This client-side fixing strategy could be implemented in packngo to address equinixmetal-archive/packngo#133. By addressing this problem in packngo the Terraform provider would only need a packngo import bump.
Help me understand the other problems that this approach is improving. As far as I know, the only attribute experiencing drift, due to perceived API bugs, is the network_type. If other attributes are experiencing drift they should be addressed separately. We shouldn't need to introduce a new A Does this cache-busting approach alone address the problem? Do any added benefits of this resource justify breaking the API, existing modules, and configurations? As you pointed out, one of the benefits is that
In this scenario, it would seem that a post provision If we proceed in this direction:
|
Hi @displague , thanks for your comment.
This is not the case in general, see the *_attachment resources.
It could be done in packngo, but it can't be configurable then. @c0dyhi11 noticed that in his case, the
The Network Mode of the device is not a common attribute. It can be read simply from jq path ".network_ports[0].network_type", of a device resource JSON, but it can't be set via the device resource. Setting the network mode is a series of API operations, best seen in https://github.com/packethost/packngo/blob/master/ports.go#L321. There's bonding, disbonding and port layer conversions. If those ops are done in the proper order, the device will most likely end up in the desired network mode. You can also ask Packet frontend people about the network mode, as they do it the same way, but in Javascript. I put the network mode change to a separate resource because
Sorry, I don't understand what you mean by this.
Only the device ID is the "Id()" of the packet_device_network_mode. The "mode" is an updatable attribute, storing the state of network_mode of the device. I don't understand what you mean by "primary_key" in context of TF provider resources.
Of course the new resource is tested, I always add tests when possible. See
I don't know what you mean by this.
I renamed it because it's referred to as "Mode" in the API, e.g.:
.. and I thought it's good to be consistent in terminology for UX. I know it will break some setups, but the deprecation strings are informative, I think people can't get confused. |
Thanks for those insights, @t0mk! I'm really seeing the benefits of this approach based on your thorough response. I'll try to clarify some of my earlier points. Largely, you've convinced me of the merits of this approach. I'll follow-up with my code based feedback.
Attachment resources (in any provider) are one of those cases where the general Terraform best practice of faithfully representing the cloud provider's API starts to bend. Providers have the option to represent resources like these IPs and Ports as maps or slices within the resource (device). In this case, the TF Device resource could align with the For some API properties, it certainly does make more sense to offer new resources, especially when that resource creates or has a dependency relationship with provisioning phases. I can see how this would be the case for volume_attachments and port_vlan_attachments. I'm not yet convinced that IP attachments fit these criteria, but I have lots to learn about that and I'm sure you've encountered these use-cases and can speak to the benefits.
This is something that I didn't understand. Thanks for these details and references!
Terraform providers can store partial state to overcome this
But there may be some reasons not to introduce this:
I was not at all clear there. I was thinking about how the google-anthos module changes the network type within a python script provisioner on the device, changing the state of the device after Terraform has provisioned it. This would create a state drift for Terraform, which would cause subsequent Your new proposed resources offer a way to both heal and avoid that condition because network_mode (network_type) state changes and provisioners can be chained. A python script (provisioner) that changes network_type among other things could omit the network type changes, deferring that to Terraform. The python script, as a secondary provisioner with a This is a powerful addition following the precedent of
My mistake, I meant It seems users will have to be careful about the order that the network_attachment is defined. I see that you covered that in the docs: https://github.com/terraform-providers/terraform-provider-packet/pull/240/files#diff-8dd49a56d3344614491b5395d96b9d0eR36 🚀
I missed the test additions because I expected to find new acceptance tests, with a new resource. The tests you added seem sufficient, but it may help to split these tests into multiple
## Import
Packet Device Network Modes can be imported using the device id, e.g.
```
terraform import packet_device_network_mode.mydevicenetmode {device-uuid}
```
I'd argue that this is not represented consistently and I would defer to the name of the API field. The Web UX and other documentation refer to Servers rather than Devices, for example. In Terraform, generally, the provider's upstream API is the source of truth (or at least names). https://www.terraform.io/docs/extend/best-practices/naming.html#naming
|
Required: true, | ||
ValidateFunc: validation.StringInSlice([]string{"layer3", "layer2-bonded", "layer2-individual", "hybrid"}, false), | ||
}, | ||
"http_get_excludes": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you see value in this attribute if the API caching issue were completely resolved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's obviously a workaround for a current issue. Attributes and fields can be added and deprecated with every provider release (for which there's no fixed schedule).
I wanted to offer fast, simple and temporary workaround.
So no, I don't. And with this PR stalling, I see the value of this attribute decreasing every day. But then again, if not for the API caching issue, there's no value in any code, or walls of text in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API problem has been resolved, network_type
should be stable now. 🎉
I think the new resource you are presenting here is still valuable and valid for the reasons that you've given (and defended ❤️).
A low level, invisible, toggle like http_get_excludes
probably doesn't belong in this PR, anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again for helping me understand these changes!
You suggest
OK, I'll add the test steps! It's a good idea to keep the number of acceptance tests low, they're run at the same time (more or less) and they eat quite a lot of resources. All possible resources obviously must be tested, but should not be tested twice. I will also add imports to the doc of the new resource. If you ack all this, I will do the changes. |
Hi @displague, thanks for your feedback I will
I would like to still clarify these:
|
That sounds good @t0mk. Let's stick with network_type in both cases. Do you think users will run into issues with the computed network_type in "device" lagging a refresh cycle behind the packet_device_network_type? Perhaps, if you think this would create a bad experience, the network_type notice should offer advice here, to not use the computed network_type when also using the new resource type. |
OK
Yes, I think that using the packet_device_network_type will make the network_type attr of packet_device inherently incorrect, as the p_d_n_t resource will modify the network type, and the TF state of parent packet_device will not be updated. I ran into this problem when trying to do the 2-phase acceptance tests.
I will make the network_type attr packet_device "Deprecated" (https://www.terraform.io/docs/extend/best-practices/deprecations.html#provider-attribute-removal), it will show that it's now recommended to use the new resource. |
}, | ||
Type: schema.TypeString, | ||
Computed: true, | ||
Deprecated: "You should handle Network Type with the new packet_device_network_type resource.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@displague Please suggest better deprec msg if you feel like.
@displague I did changes that we've agreed. I squashed some commits because there was a lot of changes which were reverted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, @t0mk! I really appreciate the detail you put into the added acceptance tests around network type modifications.
This PR decouples the Network Mode handling from the device resource. We've had a lot of issues with the network mode reading, setting and waiting for it. If the API calls fails, it's better if it's in a separate resource, because some devices have very long provisioning times and re-creating of fixing TF state becomes really ugly.
Relevant issues:
The new usage will then look as:
Note that the
packet_port_vlan_attachment
must depend onpacket_device_network_mode
, not onpacket_device
, so that TF waits for the network mode change.I also added a
http_get_excludes
param, which can be used to fool the Packet API cache and maybe workaround the bug. If TF hangs on the state change, try to addproject_lite
to the excludes, and it might just avoid the cache hit: