
Schema incompatibility on updating from 1.26.0 to 1.28.0 #2

Closed
shapirus opened this issue Oct 3, 2023 · 26 comments

shapirus commented Oct 3, 2023

So I tried to change my TF kops provider from eddycharly/kops v1.26.0-alpha1 to clayrisser/kops v1.28.0. After making the obvious changes reported by TF (such as the root_volume attributes moving from separate attributes to a dedicated root_volume block; a sketch of that change follows the error output below), I am getting the following obscure error on terraform plan:

Planning failed. Terraform encountered an error while generating this plan.

╷
│ Error: missing expected [
│ 
│   with module.cluster.kops_cluster.cluster,
│   on ../modules/base_cluster/cluster.tf line 6, in resource "kops_cluster" "cluster":
│    6: resource "kops_cluster" "cluster" {
│ 
╵
Failed generating plan JSON
Exit code: 1

Failed to marshal plan to json: error marshaling prior state: unsupported attribute "access"
Operation failed: 2 errors occurred:
        * failed running terraform plan (exit 1)
        * failed generating plan JSON: failed running command (exit 1)
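
For reference, the root_volume change mentioned above went roughly from flat attributes to a nested block. A sketch of the two forms (the field names inside the new block are assumptions inferred from the old attribute names, not taken from the actual schema):

  # old (eddycharly/kops v1.26.0-alpha.1): flat attributes
  resource "kops_instance_group" "nodes" {
    # ...
    root_volume_size       = 100
    root_volume_type       = "gp3"
    root_volume_encryption = true
  }

  # new (clayrisser/kops v1.28.0): a dedicated root_volume block (field names assumed)
  resource "kops_instance_group" "nodes" {
    # ...
    root_volume {
      size       = 100
      type       = "gp3"
      encryption = true
    }
  }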

On further research, looking at the output of terraform providers schema -json, I found the following:

  • eddycharly/kops has the following in resource_schemas/kops_cluster/block:
              "kubernetes_api_access": {
                "type": [
                  "list",
                  "string"
                ],
                "description_kind": "plain",
                "optional": true
              },
  • clayrisser/kops has the following in resource_schemas/kops_cluster/block/block_types:
              "api": {
                "nesting_mode": "list",
                "block": {
                  "attributes": {
                    "access": {
                      "type": [
                        "list",
                        "string"
                      ],
                      "description_kind": "plain",
                      "optional": true
                    },
...

Apparently both are produced from the API spec's access field, but terraform plan fails because of the schema change: the access attribute moved from its own kubernetes_api_access definition to a field in the api block.
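
In configuration terms, that means the first form below has to become the second (attribute and block names are taken from the schema dumps above; the CIDR value is a placeholder):

  # eddycharly/kops v1.26.0-alpha.1: top-level attribute
  resource "kops_cluster" "cluster" {
    # ...
    kubernetes_api_access = ["10.0.0.0/8"]
  }

  # clayrisser/kops v1.28.0: the same data nested in the api block
  resource "kops_cluster" "cluster" {
    # ...
    api {
      access = ["10.0.0.0/8"]
    }
  }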

What is the best way of migrating/upgrading in this case? Can it be handled in the provider?

shapirus commented Oct 4, 2023

Update. The error above only occurs when the api.access attribute is defined dynamically, in my case:

access = distinct(concat([aws_vpc.vpc.cidr_block], var.api_access))

If it is set to a fixed list, that error goes away and the following error is produced instead, for yet another schema change:

Failed to marshal plan to json: error marshaling prior state: unsupported attribute "autoscale_priority"
Operation failed: 2 errors occurred:
        * failed running terraform plan (exit 1)
        * failed generating plan JSON: failed running command (exit 1)

I don't have autoscale_priority defined anywhere, so it must have been produced implicitly. I will check the previous and current specs and try to define it explicitly, maybe that'll help.

Still, I guess this is something to be handled in the provider. There should be a clean upgrade path.

shapirus commented Oct 4, 2023

Another update. When autoscale_priority is set to an explicit value (an integer or even null), the error related to autoscale_priority goes away, but the original error related to access is produced again, even if the respective attribute is set to a fixed list.
Now this becomes really weird.
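
A quick way to find where an attribute such as autoscale_priority lives in the new schema is to search the schema dump directly; a sketch, assuming jq is available:

  # print every schema path ending in "autoscale_priority";
  # the path shows which resource/block the attribute belongs to
  terraform providers schema -json \
    | jq -r 'paths | select(.[-1] == "autoscale_priority") | map(tostring) | join("/")'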

shapirus commented Oct 4, 2023

So in the end I saved the raw state to a file using terraform state pull, edited it to remove the conflicting attributes, and then tried to upload it back using terraform state push. Here's what happened:

  1. Failed to persist state: unsupported attribute "additional_sans" -- ok, I don't use this, can be removed
  2. Failed to persist state: unsupported attribute "capacity_rebalance" -- same
  3. Failed to persist state: unsupported attribute "public_name" -- this time, it's a core attribute in the api block which cannot be removed. It is also supported in both the 1.26.0-alpha.1 and 1.28.0 providers, so at this point I am out of further ideas.

I will also try to create a clean new cluster using 1.28.0. Will post an update on how it went (update: at least terraform plan works fine).
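
For reference, the pull/edit/push loop described above looks like this (a sketch; which attributes to delete from state.json is dictated by the errors, and the top-level "serial" usually has to be incremented for the push to be accepted):

  # dump the raw state to a file
  terraform state pull > state.json

  # ...hand-edit state.json: remove the offending attributes
  # and bump the top-level "serial"...

  # upload the edited state back to the backend
  terraform state push state.json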

shapirus commented Oct 4, 2023

...yet another update :)

I did a string replacement, s/eddycharly/clayrisser/g, in the state file and tried to terraform state push it again. That solved the previous errors, but now I started to get more:

  1. Failed to persist state: unsupported attribute "root_volume_encryption" -- solved by removing this attr
  2. Failed to persist state: unsupported attribute "config_base" -- this can't be simply removed. It probably moved to a different location. This value is set from:
provider "kops" {
        state_store = "..."
}

So at this point I guess it's safe to say that upgrading/migration from eddycharly/kops v1.26.0-alpha.1 to clayrisser/kops v1.28.0 is not viable. Creating new clusters, however, should work (but I have yet to try to apply the plan).
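
As an aside: for the provider-address rename itself, there is a built-in alternative to string-replacing in the state file (assuming both forks are published under registry.terraform.io). Note that it only rewrites provider addresses, not the attribute-level schema differences:

  terraform state replace-provider \
    registry.terraform.io/eddycharly/kops \
    registry.terraform.io/clayrisser/kops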

sl1pm4t commented Oct 5, 2023

The config_base should look like this now:

  config_store {
    base = var.config_store_base
  }

Also, this is not exactly the same as the state_store value in the provider definition - it should be in the form:
<bucket>/<cluster_name>
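
Putting those two together, a filled-in sketch (bucket and cluster name are placeholders, and the s3:// scheme is assumed to be included, as in the old config_base value):

  config_store {
    base = "s3://my-kops-state-bucket/cluster.example.com"
  }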

shapirus commented Oct 6, 2023

That's a different thing.

╷
│ Error: Missing required argument
│ 
│ The argument "state_store" is required, but was not set.
╵

The state_store provider configuration attribute is required (the above error is produced when it's not set). It's the same parameter that is set by e.g. the KOPS_STATE_STORE environment variable for command-line kops (see https://kops.sigs.k8s.io/state/#state-store-configuration).

The config_store block is optional and serves a different purpose (see https://pkg.go.dev/k8s.io/kops/pkg/apis/kops#ConfigStoreSpec)
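
To make the distinction concrete, a sketch (bucket and cluster name are placeholders):

  # provider-level, required: where kops keeps its state,
  # equivalent to KOPS_STATE_STORE for the kops CLI
  provider "kops" {
    state_store = "s3://my-kops-state-bucket"
  }

  # resource-level, optional: per-cluster config store settings
  resource "kops_cluster" "cluster" {
    # ...
    config_store {
      base = "s3://my-kops-state-bucket/cluster.example.com"
    }
  }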

shapirus commented Oct 6, 2023

...can confirm that creating a clean new cluster from scratch works fine. Kubernetes 1.28.2 all right.

clayrisser (Owner) commented:

I recommend looking at the following for reference, since I was able to get it to work.

https://gitlab.com/bitspur/rock8s/rock8s-cluster/-/blob/main/main/cluster.tf?ref_type=heads#L65

Because I forked this project and am not the original author, I'm not really able to spend a ton of time on it beyond getting the basics to work. Any support or pull requests from the community are much appreciated.

shapirus commented Nov 23, 2023

Yeah I understand. On my part, I guess I lack (immediate) knowledge to dig deep into this and understand it.

Either way, creating new clusters from scratch works, although with some cosmetic stuff that TF wants to update on the second invocation. After the second invocation it all runs fine.

At this point I'm more worried about the future of the TF kops provider(s) in general, since the only(?) existing one seems to be abandoned now, and there doesn't seem to be much enthusiasm in the community to take it over.

Probably the proper strategy would be to manage clusters without kops, with pure TF. Will see.

clayrisser (Owner) commented:

@shapirus this project is core to my work, so I don't mind being the de facto maintainer until someone else takes it over.

sl1pm4t commented Nov 23, 2023

I've also been actively working on my fork of the provider. The changes I've made have mostly been around better support for GCP, so they're not likely to be of interest to either of you immediately. I've also contributed some fixes and features to kops itself (for GCP), and my TF provider fork is using my custom branch of kops until those are merged.

However, my company deploys kops clusters to both AWS and GCP. Up until this point the AWS deploys have been using the kops CLI to generate Terraform config. I intend to swap that out to use the TF kops provider, so we will be invested in keeping it up to date.

Has anyone been in contact with eddycharly and heard officially that the project is abandoned?
If so, perhaps we should form a GitHub org called terraform-kops (or something) and co-maintain a fork there @clayrisser ?

clayrisser (Owner) commented:

@sl1pm4t yeah, I would feel much more comfortable doing it that way. I just created the organization and added you as an owner.

https://github.com/terraform-kops

We can either move it, fork it, or create it brand new. If you want me to move this project (might be nice because it already has issues on it), then I need a few days, because the Terraform registry is using this repo for publishing.

clayrisser (Owner) commented:

I'll also be adding some of my team members as members of the organization. They will help maintain it.

clayrisser (Owner) commented:

I hope @eddycharly doesn't mind if I add him to the organization also. If he happens to come back and contribute, it would be very welcome.

sl1pm4t commented Nov 24, 2023

Great, thanks for doing that @clayrisser.
I think moving this fork over to the new org makes sense. I'll start opening some PRs to bring in the changes I've made on my fork.
No rush on setting it up - I'll likely still need to use my own fork for a few weeks until the upstream changes to kops have been merged.

shapirus (Author) commented:

...this is how history is made :)

mmckeen commented Jan 22, 2024

Hey folks 👋 I'm interested in trying out this fork and possibly contributing to the new org.

Any chance I can get added?

sl1pm4t commented Jan 23, 2024

That'd be great @mmckeen - are you using kops at Fastly?

clayrisser (Owner) commented:

@mmckeen just added you. I should have this project migrated over by the end of this month.

mmckeen commented Jan 23, 2024

> That'd be great @mmckeen - are you using kops at Fastly?

Indeed we are!

eddycharly commented:

Hey folks 👋

@clayrisser I see the org but no repo? Did something happen?

I just posted this on the kops-dev slack channel https://kubernetes.slack.com/archives/C8MKE2G5P/p1708107317601559

clayrisser (Owner) commented:

Nothing happened. I haven't moved it yet. I will move it next week.

eddycharly commented Feb 16, 2024

> Nothing happened. I haven't moved it yet. I will move it next week.

Cool, happy to see the project is not completely dead :)

clayrisser (Owner) commented:

Definitely not dead. Just got super busy with other things. Thank you for the reminder.

sl1pm4t commented Apr 11, 2024

Hi all - I created a fork in the org - https://github.com/terraform-kops/terraform-provider-kops
I haven't been able to publish to the TF registry yet, because for some reason the registry UI is not detecting the new org.

clayrisser (Owner) commented:

@sl1pm4t I see you got it working. Really appreciate the effort. Sorry I wasn't able to fork it earlier. I did just publish 1.28.4 because I needed the fix below.

kubernetes/kops#16218
