kubernetes_manifest listing all CRDs each time #1651

bchess · 2022-03-18T21:51:34Z

Terraform Version, Provider Version and Kubernetes Version

Terraform version: v0.15.5
Kubernetes provider version: v2.8.0
Kubernetes version: v1.21.3

Affected Resource(s)

kubernetes_manifest

Steps to Reproduce

terraform plan

Expected Behavior

The Kubernetes provider should query the API server once for the list of known CRDs and cache the result for subsequent resource reads.
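A minimal sketch of the "list once, reuse afterwards" behavior described above. This is not the provider's actual code: `crdLister`, `crdCache`, and `newCRDCache` are hypothetical names, and the lister is a stand-in for the real LIST call against the API server.

```go
package main

import (
	"fmt"
	"sync"
)

// crdLister is a hypothetical stand-in for the expensive LIST call against
// /apis/apiextensions.k8s.io/v1/customresourcedefinitions.
type crdLister func() ([]string, error)

// crdCache memoizes the first successful result of the lister.
type crdCache struct {
	mu     sync.Mutex
	list   crdLister
	loaded bool
	crds   []string
}

func newCRDCache(list crdLister) *crdCache {
	return &crdCache{list: list}
}

// Get returns the cached CRD names and only calls the lister when nothing
// has been cached yet. Errors are not cached, so the next call retries.
func (c *crdCache) Get() ([]string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.loaded {
		return c.crds, nil
	}
	crds, err := c.list()
	if err != nil {
		return nil, err
	}
	c.crds, c.loaded = crds, true
	return c.crds, nil
}

func main() {
	calls := 0
	cache := newCRDCache(func() ([]string, error) {
		calls++ // count how often the "API server" is hit
		return []string{"kustomizations.kustomize.toolkit.fluxcd.io"}, nil
	})

	// Simulate 20 resource reads sharing one cache.
	for i := 0; i < 20; i++ {
		if _, err := cache.Get(); err != nil {
			panic(err)
		}
	}
	fmt.Println("LIST calls:", calls)
}
```

With a shared cache like this, the 20 simulated reads trigger a single LIST. The sketch never invalidates its result, so it would miss CRDs created after the first call.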

Actual Behavior

It appears that the provider executes a LIST query of /apis/apiextensions.k8s.io/v1/customresourcedefinitions for each kubernetes_manifest resource. The results are apparently not cached, which makes running plan unnecessarily slow. This occurs even for kubernetes_manifest resources of basic built-in types such as ConfigMap.

Here is the stack trace:

2 @ 0x100a1f140 0x100a1f1cc 0x100a504c8 0x100a5ef24 0x100f4a258 0x100f5a81c 0x100b5525c 0x100b56098 0x100da1618 0x100d9ff58 0x100d9f2d0 0x100da3de8 0x100f5d590 0x100a93558 0x100ce5488 0x101ef08ac 0x101ef0410 0x101ef0128 0x101eeff08 0x101eefb04 0x101ef0334 0x10221780c 0x10272c46c 0x10272a520 0x102727f88 0x10118f3f0 0x1011548cc 0x1011397d8 0x101027b34 0x10102c61c 0x101024c04 0x100a54344
#	0x100a504c7	sync.runtime_notifyListWait+0x157											/opt/homebrew/Cellar/go/1.17.6/libexec/src/runtime/sema.go:513
#	0x100a5ef23	sync.(*Cond).Wait+0x93													/opt/homebrew/Cellar/go/1.17.6/libexec/src/sync/cond.go:56
#	0x100f4a257	golang.org/x/net/http2.(*pipe).Read+0x387										/Users/bchess/terraform-provider-kubernetes/vendor/golang.org/x/net/http2/pipe.go:65
#	0x100f5a81b	golang.org/x/net/http2.transportResponseBody.Read+0xbb									/Users/bchess/terraform-provider-kubernetes/vendor/golang.org/x/net/http2/transport.go:2110
#	0x100b5525b	bufio.(*Reader).fill+0x25b												/opt/homebrew/Cellar/go/1.17.6/libexec/src/bufio/bufio.go:101
#	0x100b56097	bufio.(*Reader).ReadByte+0xb7												/opt/homebrew/Cellar/go/1.17.6/libexec/src/bufio/bufio.go:253
#	0x100da1617	compress/flate.(*decompressor).huffSym+0x97										/opt/homebrew/Cellar/go/1.17.6/libexec/src/compress/flate/inflate.go:719
#	0x100d9ff57	compress/flate.(*decompressor).huffmanBlock+0x87									/opt/homebrew/Cellar/go/1.17.6/libexec/src/compress/flate/inflate.go:494
#	0x100d9f2cf	compress/flate.(*decompressor).Read+0x21f										/opt/homebrew/Cellar/go/1.17.6/libexec/src/compress/flate/inflate.go:347
#	0x100da3de7	compress/gzip.(*Reader).Read+0xa7											/opt/homebrew/Cellar/go/1.17.6/libexec/src/compress/gzip/gunzip.go:251
#	0x100f5d58f	golang.org/x/net/http2.(*gzipReader).Read+0x1af										/Users/bchess/terraform-provider-kubernetes/vendor/golang.org/x/net/http2/transport.go:2578
#	0x100a93557	io.ReadAll+0x1a7													/opt/homebrew/Cellar/go/1.17.6/libexec/src/io/io.go:633
#	0x100ce5487	io/ioutil.ReadAll+0x47													/opt/homebrew/Cellar/go/1.17.6/libexec/src/io/ioutil/ioutil.go:27
#	0x101ef08ab	k8s.io/client-go/rest.(*Request).transformResponse+0xbb									/Users/bchess/terraform-provider-kubernetes/vendor/k8s.io/client-go/rest/request.go:1067
#	0x101ef040f	k8s.io/client-go/rest.(*Request).Do.func1+0x4f										/Users/bchess/terraform-provider-kubernetes/vendor/k8s.io/client-go/rest/request.go:1039
#	0x101ef0127	k8s.io/client-go/rest.(*Request).request.func2.1+0x47									/Users/bchess/terraform-provider-kubernetes/vendor/k8s.io/client-go/rest/request.go:996
#	0x101eeff07	k8s.io/client-go/rest.(*Request).request.func2+0x3a7									/Users/bchess/terraform-provider-kubernetes/vendor/k8s.io/client-go/rest/request.go:1021
#	0x101eefb03	k8s.io/client-go/rest.(*Request).request+0x6b3										/Users/bchess/terraform-provider-kubernetes/vendor/k8s.io/client-go/rest/request.go:1023
#	0x101ef0333	k8s.io/client-go/rest.(*Request).Do+0xa3										/Users/bchess/terraform-provider-kubernetes/vendor/k8s.io/client-go/rest/request.go:1038
#	0x10221780b	k8s.io/client-go/dynamic.(*dynamicResourceClient).List+0x1ab								/Users/bchess/terraform-provider-kubernetes/vendor/k8s.io/client-go/dynamic/simple.go:254
#	0x10272c46b	github.com/hashicorp/terraform-provider-kubernetes/manifest/provider.(*RawProviderServer).lookUpGVKinCRDs+0x42b		/Users/bchess/terraform-provider-kubernetes/manifest/provider/resource.go:218
#	0x10272a51f	github.com/hashicorp/terraform-provider-kubernetes/manifest/provider.(*RawProviderServer).TFTypeFromOpenAPI+0x1ef	/Users/bchess/terraform-provider-kubernetes/manifest/provider/resource.go:91
#	0x102727f87	github.com/hashicorp/terraform-provider-kubernetes/manifest/provider.(*RawProviderServer).ReadResource+0x1417		/Users/bchess/terraform-provider-kubernetes/manifest/provider/read.go:93
#	0x10118f3ef	github.com/hashicorp/terraform-plugin-mux.SchemaServer.ReadResource+0xdf						/Users/bchess/terraform-provider-kubernetes/vendor/github.com/hashicorp/terraform-plugin-mux/schema_server.go:265
#	0x1011548cb	github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).ReadResource+0x4cb				/Users/bchess/terraform-provider-kubernetes/vendor/github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server/server.go:744
#	0x1011397d7	github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_ReadResource_Handler+0x2d7		/Users/bchess/terraform-provider-kubernetes/vendor/github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5/tfplugin5_grpc.pb.go:349
#	0x101027b33	google.golang.org/grpc.(*Server).processUnaryRPC+0x1253									/Users/bchess/terraform-provider-kubernetes/vendor/google.golang.org/grpc/server.go:1282
#	0x10102c61b	google.golang.org/grpc.(*Server).handleStream+0x80b									/Users/bchess/terraform-provider-kubernetes/vendor/google.golang.org/grpc/server.go:1616
#	0x101024c03	google.golang.org/grpc.(*Server).serveStreams.func1.2+0xa3								/Users/bchess/terraform-provider-kubernetes/vendor/google.golang.org/grpc/server.go:921

And here is a pprof sorted by cumulative time:

(pprof) top100 -cum
Showing nodes accounting for 33.86s, 79.39% of 42.65s total
Dropped 319 nodes (cum <= 0.21s)
Showing top 100 nodes out of 169
      flat  flat%   sum%        cum   cum%
         0     0%     0%     25.92s 60.77%  google.golang.org/grpc.(*Server).processUnaryRPC
         0     0%     0%     25.83s 60.56%  github.com/hashicorp/terraform-provider-kubernetes/manifest/provider.(*RawProviderServer).TFTypeFromOpenAPI
         0     0%     0%     25.83s 60.56%  google.golang.org/grpc.(*Server).handleStream
         0     0%     0%     25.73s 60.33%  google.golang.org/grpc.(*Server).serveStreams.func1.2
         0     0%     0%     25.09s 58.83%  k8s.io/client-go/dynamic.(*dynamicResourceClient).List
         0     0%     0%     25.06s 58.76%  github.com/hashicorp/terraform-provider-kubernetes/manifest/provider.(*RawProviderServer).lookUpGVKinCRDs
         0     0%     0%     23.87s 55.97%  k8s.io/apimachinery/pkg/util/json.Unmarshal
         0     0%     0%     23.77s 55.73%  k8s.io/apimachinery/pkg/apis/meta/v1/unstructured.unstructuredJSONScheme.decode
         0     0%     0%     23.75s 55.69%  k8s.io/apimachinery/pkg/apis/meta/v1/unstructured.unstructuredJSONScheme.Decode
         0     0%     0%     23.74s 55.66%  k8s.io/apimachinery/pkg/runtime.Decode
         0     0%     0%     19.68s 46.14%  k8s.io/apimachinery/pkg/apis/meta/v1/unstructured.unstructuredJSONScheme.decodeToList
         0     0%     0%     14.90s 34.94%  encoding/json.(*Decoder).Decode
         0     0%     0%     13.57s 31.82%  encoding/json.(*decodeState).value
         0     0%     0%     13.56s 31.79%  encoding/json.(*decodeState).object
         0     0%     0%     13.49s 31.63%  encoding/json.(*decodeState).unmarshal
         0     0%     0%     11.46s 26.87%  runtime.systemstack
         0     0%     0%      9.86s 23.12%  runtime.gcBgMarkWorker.func2
     0.15s  0.35%  0.35%      9.86s 23.12%  runtime.gcDrain
     0.14s  0.33%  0.68%      9.54s 22.37%  encoding/json.(*decodeState).objectInterface
     0.02s 0.047%  0.73%      9.49s 22.25%  encoding/json.(*decodeState).valueInterface
         0     0%  0.73%      9.16s 21.48%  encoding/json.(*decodeState).array
         0     0%  0.73%      8.96s 21.01%  github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).UpgradeResourceState
         0     0%  0.73%      8.95s 20.98%  github.com/hashicorp/terraform-provider-kubernetes/manifest/provider.(*RawProviderServer).UpgradeResourceState
         0     0%  0.73%      8.94s 20.96%  github.com/hashicorp/terraform-plugin-mux.SchemaServer.UpgradeResourceState
         0     0%  0.73%      8.89s 20.84%  github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_UpgradeResourceState_Handler
         0     0%  0.73%      8.89s 20.84%  github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).PlanResourceChange
         0     0%  0.73%      8.88s 20.82%  github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_PlanResourceChange_Handler
         0     0%  0.73%      8.88s 20.82%  github.com/hashicorp/terraform-plugin-mux.SchemaServer.PlanResourceChange
         0     0%  0.73%      8.88s 20.82%  github.com/hashicorp/terraform-provider-kubernetes/manifest/provider.(*RawProviderServer).PlanResourceChange

References

Restmapper cache #1508 seems related, since it appears to add some caching to manifest, but it doesn't appear to actually cache these requests.

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
If you are interested in working on this issue or have submitted a pull request, please leave a comment


alexsomesan · 2022-03-23T12:34:29Z

Hi, thanks for opening this conversation.

In fact, the provider needs to make sure it knows about any CRDs that may have been created during the same operation, so that it can handle any CRs of that type.

However, I think there is room for optimizing the number of these calls. We previously avoided introducing any optimisations until the provider had stabilized enough. In fact, we had to roll back some caching that we had introduced too early and that was causing hard-to-diagnose issues.

At this point, I think we can take a look at reducing the number of CRD retrieval calls.

github-actions · 2023-03-24T00:00:57Z

Marking this issue as stale due to inactivity. If this issue receives no comments in the next 30 days it will automatically be closed. If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. This helps our maintainers find and focus on the active issues. Maintainers may also remove the stale label at their discretion. Thank you!

varunthakur2480 · 2023-03-27T04:06:27Z

This is starting to hurt us really badly. We have a set of 20 resources (Flux) being created, and it takes ~20 minutes to run a plan, which often times out. Can this please be prioritised ASAP?

bchess · 2023-03-27T16:29:29Z

fwiw we moved to kubectl_manifest and so far it's been much better

varunthakur2480 · 2023-03-29T05:45:34Z

I can try that, but it would be good to address this issue too.

github-actions · 2024-03-29T00:00:28Z

Marking this issue as stale due to inactivity. If this issue receives no comments in the next 30 days it will automatically be closed. If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. This helps our maintainers find and focus on the active issues. Maintainers may also remove the stale label at their discretion. Thank you!

cjgibson · 2024-04-12T20:26:05Z

Just noting this is still an active issue with the upstream Hashi provider here so the bot doesn't close this issue - generally speaking if you're managing CRDs or CRs at all (or anything else applied via bare YAML) you shouldn't use Hashi's provider.

vihangm · 2024-06-08T01:09:44Z

This continues to be quite slow, any ETA on whether this is going to be addressed?

bchess added the bug label Mar 18, 2022

github-actions bot removed the bug label Mar 18, 2022

alexsomesan added help wanted enhancement labels Mar 23, 2022

bfanyuk mentioned this issue Mar 29, 2022

kubernetes_manifest resource should NOT make any CRD call #1665

Open

github-actions bot added the stale label Mar 24, 2023

github-actions bot removed the stale label Mar 27, 2023

github-actions bot added the stale label Mar 29, 2024

github-actions bot removed the stale label Apr 12, 2024
