kubernetes_manifest listing all CRDs each time #1651

Open
bchess opened this issue Mar 18, 2022 · 8 comments

@bchess
bchess commented Mar 18, 2022

Terraform Version, Provider Version and Kubernetes Version

Terraform version: v0.15.5
Kubernetes provider version: v2.8.0
Kubernetes version: v1.21.3

Affected Resource(s)

  • kubernetes_manifest

Steps to Reproduce

  1. terraform plan

Expected Behavior

The Kubernetes provider could query the API server once to get the list of known CRDs and cache the result for subsequent resource reads.
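
A minimal sketch of that expected behavior, assuming a hypothetical `listFunc` that stands in for the LIST call. The names `crdCache` and `Kinds` are illustrative only and are not taken from the provider's code. The first caller pays for the list, and later callers reuse the result:

```go
package main

import (
	"fmt"
	"sync"
)

// listFunc is a hypothetical stand-in for the expensive LIST call against
// /apis/apiextensions.k8s.io/v1/customresourcedefinitions.
type listFunc func() ([]string, error)

// crdCache memoizes the result of the first successful list call.
type crdCache struct {
	mu     sync.Mutex
	loaded bool
	kinds  []string
	list   listFunc
}

// Kinds returns the cached CRD kinds, calling list only if nothing is cached yet.
// The mutex also serializes concurrent readers, so parallel resource reads
// do not each trigger their own list call.
func (c *crdCache) Kinds() ([]string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.loaded {
		return c.kinds, nil
	}
	kinds, err := c.list()
	if err != nil {
		return nil, err // failures are not cached, so the next call retries
	}
	c.kinds, c.loaded = kinds, true
	return c.kinds, nil
}

func main() {
	calls := 0
	cache := &crdCache{list: func() ([]string, error) {
		calls++
		return []string{"HelmRelease", "Kustomization"}, nil
	}}
	// Simulate three resource reads that each need the CRD list.
	for i := 0; i < 3; i++ {
		if _, err := cache.Kinds(); err != nil {
			panic(err)
		}
	}
	fmt.Println("list calls:", calls) // prints "list calls: 1"
}
```

This sketch never refreshes the cached list, so on its own it would miss CRDs created later in the same run.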

Actual Behavior

It appears that the provider executes a LIST query against /apis/apiextensions.k8s.io/v1/customresourcedefinitions for each kubernetes_manifest resource. The results do not appear to be cached, which makes running plan unnecessarily slow. This occurs even for kubernetes_manifest resources of basic built-in types such as ConfigMap.

Here is the stack trace:

2 @ 0x100a1f140 0x100a1f1cc 0x100a504c8 0x100a5ef24 0x100f4a258 0x100f5a81c 0x100b5525c 0x100b56098 0x100da1618 0x100d9ff58 0x100d9f2d0 0x100da3de8 0x100f5d590 0x100a93558 0x100ce5488 0x101ef08ac 0x101ef0410 0x101ef0128 0x101eeff08 0x101eefb04 0x101ef0334 0x10221780c 0x10272c46c 0x10272a520 0x102727f88 0x10118f3f0 0x1011548cc 0x1011397d8 0x101027b34 0x10102c61c 0x101024c04 0x100a54344
#	0x100a504c7	sync.runtime_notifyListWait+0x157											/opt/homebrew/Cellar/go/1.17.6/libexec/src/runtime/sema.go:513
#	0x100a5ef23	sync.(*Cond).Wait+0x93													/opt/homebrew/Cellar/go/1.17.6/libexec/src/sync/cond.go:56
#	0x100f4a257	golang.org/x/net/http2.(*pipe).Read+0x387										/Users/bchess/terraform-provider-kubernetes/vendor/golang.org/x/net/http2/pipe.go:65
#	0x100f5a81b	golang.org/x/net/http2.transportResponseBody.Read+0xbb									/Users/bchess/terraform-provider-kubernetes/vendor/golang.org/x/net/http2/transport.go:2110
#	0x100b5525b	bufio.(*Reader).fill+0x25b												/opt/homebrew/Cellar/go/1.17.6/libexec/src/bufio/bufio.go:101
#	0x100b56097	bufio.(*Reader).ReadByte+0xb7												/opt/homebrew/Cellar/go/1.17.6/libexec/src/bufio/bufio.go:253
#	0x100da1617	compress/flate.(*decompressor).huffSym+0x97										/opt/homebrew/Cellar/go/1.17.6/libexec/src/compress/flate/inflate.go:719
#	0x100d9ff57	compress/flate.(*decompressor).huffmanBlock+0x87									/opt/homebrew/Cellar/go/1.17.6/libexec/src/compress/flate/inflate.go:494
#	0x100d9f2cf	compress/flate.(*decompressor).Read+0x21f										/opt/homebrew/Cellar/go/1.17.6/libexec/src/compress/flate/inflate.go:347
#	0x100da3de7	compress/gzip.(*Reader).Read+0xa7											/opt/homebrew/Cellar/go/1.17.6/libexec/src/compress/gzip/gunzip.go:251
#	0x100f5d58f	golang.org/x/net/http2.(*gzipReader).Read+0x1af										/Users/bchess/terraform-provider-kubernetes/vendor/golang.org/x/net/http2/transport.go:2578
#	0x100a93557	io.ReadAll+0x1a7													/opt/homebrew/Cellar/go/1.17.6/libexec/src/io/io.go:633
#	0x100ce5487	io/ioutil.ReadAll+0x47													/opt/homebrew/Cellar/go/1.17.6/libexec/src/io/ioutil/ioutil.go:27
#	0x101ef08ab	k8s.io/client-go/rest.(*Request).transformResponse+0xbb									/Users/bchess/terraform-provider-kubernetes/vendor/k8s.io/client-go/rest/request.go:1067
#	0x101ef040f	k8s.io/client-go/rest.(*Request).Do.func1+0x4f										/Users/bchess/terraform-provider-kubernetes/vendor/k8s.io/client-go/rest/request.go:1039
#	0x101ef0127	k8s.io/client-go/rest.(*Request).request.func2.1+0x47									/Users/bchess/terraform-provider-kubernetes/vendor/k8s.io/client-go/rest/request.go:996
#	0x101eeff07	k8s.io/client-go/rest.(*Request).request.func2+0x3a7									/Users/bchess/terraform-provider-kubernetes/vendor/k8s.io/client-go/rest/request.go:1021
#	0x101eefb03	k8s.io/client-go/rest.(*Request).request+0x6b3										/Users/bchess/terraform-provider-kubernetes/vendor/k8s.io/client-go/rest/request.go:1023
#	0x101ef0333	k8s.io/client-go/rest.(*Request).Do+0xa3										/Users/bchess/terraform-provider-kubernetes/vendor/k8s.io/client-go/rest/request.go:1038
#	0x10221780b	k8s.io/client-go/dynamic.(*dynamicResourceClient).List+0x1ab								/Users/bchess/terraform-provider-kubernetes/vendor/k8s.io/client-go/dynamic/simple.go:254
#	0x10272c46b	github.com/hashicorp/terraform-provider-kubernetes/manifest/provider.(*RawProviderServer).lookUpGVKinCRDs+0x42b		/Users/bchess/terraform-provider-kubernetes/manifest/provider/resource.go:218
#	0x10272a51f	github.com/hashicorp/terraform-provider-kubernetes/manifest/provider.(*RawProviderServer).TFTypeFromOpenAPI+0x1ef	/Users/bchess/terraform-provider-kubernetes/manifest/provider/resource.go:91
#	0x102727f87	github.com/hashicorp/terraform-provider-kubernetes/manifest/provider.(*RawProviderServer).ReadResource+0x1417		/Users/bchess/terraform-provider-kubernetes/manifest/provider/read.go:93
#	0x10118f3ef	github.com/hashicorp/terraform-plugin-mux.SchemaServer.ReadResource+0xdf						/Users/bchess/terraform-provider-kubernetes/vendor/github.com/hashicorp/terraform-plugin-mux/schema_server.go:265
#	0x1011548cb	github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).ReadResource+0x4cb				/Users/bchess/terraform-provider-kubernetes/vendor/github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server/server.go:744
#	0x1011397d7	github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_ReadResource_Handler+0x2d7		/Users/bchess/terraform-provider-kubernetes/vendor/github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5/tfplugin5_grpc.pb.go:349
#	0x101027b33	google.golang.org/grpc.(*Server).processUnaryRPC+0x1253									/Users/bchess/terraform-provider-kubernetes/vendor/google.golang.org/grpc/server.go:1282
#	0x10102c61b	google.golang.org/grpc.(*Server).handleStream+0x80b									/Users/bchess/terraform-provider-kubernetes/vendor/google.golang.org/grpc/server.go:1616
#	0x101024c03	google.golang.org/grpc.(*Server).serveStreams.func1.2+0xa3								/Users/bchess/terraform-provider-kubernetes/vendor/google.golang.org/grpc/server.go:921

And here is a pprof sorted by cumulative time:

(pprof) top100 -cum
Showing nodes accounting for 33.86s, 79.39% of 42.65s total
Dropped 319 nodes (cum <= 0.21s)
Showing top 100 nodes out of 169
      flat  flat%   sum%        cum   cum%
         0     0%     0%     25.92s 60.77%  google.golang.org/grpc.(*Server).processUnaryRPC
         0     0%     0%     25.83s 60.56%  github.com/hashicorp/terraform-provider-kubernetes/manifest/provider.(*RawProviderServer).TFTypeFromOpenAPI
         0     0%     0%     25.83s 60.56%  google.golang.org/grpc.(*Server).handleStream
         0     0%     0%     25.73s 60.33%  google.golang.org/grpc.(*Server).serveStreams.func1.2
         0     0%     0%     25.09s 58.83%  k8s.io/client-go/dynamic.(*dynamicResourceClient).List
         0     0%     0%     25.06s 58.76%  github.com/hashicorp/terraform-provider-kubernetes/manifest/provider.(*RawProviderServer).lookUpGVKinCRDs
         0     0%     0%     23.87s 55.97%  k8s.io/apimachinery/pkg/util/json.Unmarshal
         0     0%     0%     23.77s 55.73%  k8s.io/apimachinery/pkg/apis/meta/v1/unstructured.unstructuredJSONScheme.decode
         0     0%     0%     23.75s 55.69%  k8s.io/apimachinery/pkg/apis/meta/v1/unstructured.unstructuredJSONScheme.Decode
         0     0%     0%     23.74s 55.66%  k8s.io/apimachinery/pkg/runtime.Decode
         0     0%     0%     19.68s 46.14%  k8s.io/apimachinery/pkg/apis/meta/v1/unstructured.unstructuredJSONScheme.decodeToList
         0     0%     0%     14.90s 34.94%  encoding/json.(*Decoder).Decode
         0     0%     0%     13.57s 31.82%  encoding/json.(*decodeState).value
         0     0%     0%     13.56s 31.79%  encoding/json.(*decodeState).object
         0     0%     0%     13.49s 31.63%  encoding/json.(*decodeState).unmarshal
         0     0%     0%     11.46s 26.87%  runtime.systemstack
         0     0%     0%      9.86s 23.12%  runtime.gcBgMarkWorker.func2
     0.15s  0.35%  0.35%      9.86s 23.12%  runtime.gcDrain
     0.14s  0.33%  0.68%      9.54s 22.37%  encoding/json.(*decodeState).objectInterface
     0.02s 0.047%  0.73%      9.49s 22.25%  encoding/json.(*decodeState).valueInterface
         0     0%  0.73%      9.16s 21.48%  encoding/json.(*decodeState).array
         0     0%  0.73%      8.96s 21.01%  github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).UpgradeResourceState
         0     0%  0.73%      8.95s 20.98%  github.com/hashicorp/terraform-provider-kubernetes/manifest/provider.(*RawProviderServer).UpgradeResourceState
         0     0%  0.73%      8.94s 20.96%  github.com/hashicorp/terraform-plugin-mux.SchemaServer.UpgradeResourceState
         0     0%  0.73%      8.89s 20.84%  github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_UpgradeResourceState_Handler
         0     0%  0.73%      8.89s 20.84%  github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).PlanResourceChange
         0     0%  0.73%      8.88s 20.82%  github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_PlanResourceChange_Handler
         0     0%  0.73%      8.88s 20.82%  github.com/hashicorp/terraform-plugin-mux.SchemaServer.PlanResourceChange
         0     0%  0.73%      8.88s 20.82%  github.com/hashicorp/terraform-provider-kubernetes/manifest/provider.(*RawProviderServer).PlanResourceChange

References

  • Restmapper cache #1508 seems related, since it appears to add some caching to manifest, but it does not appear to actually cache these requests.

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@bchess bchess added the bug label Mar 18, 2022
@github-actions github-actions bot removed the bug label Mar 18, 2022
@alexsomesan
Member

alexsomesan commented Mar 23, 2022

Hi, thanks for opening this conversation.

In fact, the provider needs to make sure it knows about any CRDs that may have been created during the same operation, so that it can handle any CRs of that type.

However, I think there is room for optimizing the number of these calls. We previously avoided introducing any optimisations until the provider had stabilized enough. In fact, we had to roll back some caching that we had introduced too early and that was causing hard-to-diagnose issues.

At this point, I think we can take a look at reducing the number of CRD retrieval calls.
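
One possible way to reconcile the freshness requirement described above with fewer LIST calls is to keep a snapshot of the known CRDs and drop it whenever a CRD is applied. The following is only a sketch under that assumption, not the provider's implementation: `crdIndex`, `groupKind`, `Has`, `Invalidate`, and the `list` callback are all hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// groupKind is a simplified lookup key (hypothetical; versions are ignored here).
type groupKind struct{ Group, Kind string }

// crdIndex caches the set of kinds served by CRDs until it is invalidated.
type crdIndex struct {
	mu    sync.Mutex
	known map[groupKind]bool          // nil means "not loaded" or "invalidated"
	list  func() ([]groupKind, error) // hypothetical stand-in for the CRD LIST call
}

// Has reports whether gk is defined by a CRD, listing only when no snapshot exists.
func (c *crdIndex) Has(gk groupKind) (bool, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.known == nil {
		items, err := c.list()
		if err != nil {
			return false, err
		}
		c.known = make(map[groupKind]bool, len(items))
		for _, it := range items {
			c.known[it] = true
		}
	}
	return c.known[gk], nil
}

// Invalidate drops the snapshot; the next Has call lists again.
// The assumption is that it is called after a CRD is created or changed.
func (c *crdIndex) Invalidate() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.known = nil
}

func main() {
	calls := 0
	crds := []groupKind{{"helm.toolkit.fluxcd.io", "HelmRelease"}}
	idx := &crdIndex{list: func() ([]groupKind, error) {
		calls++
		return crds, nil
	}}
	widget := groupKind{"example.com", "Widget"}

	_, _ = idx.Has(groupKind{"", "ConfigMap"}) // first lookup lists once
	_, _ = idx.Has(widget)                     // answered from the snapshot: not known yet

	// Simulate a new CRD being applied during the same operation.
	crds = append(crds, widget)
	idx.Invalidate()

	found, _ := idx.Has(widget)
	fmt.Println(found, calls) // prints "true 2"
}
```

A limitation of this design is that CRDs created outside the provider (for example by another provider in the same run) would not trigger `Invalidate`, so a real implementation would likely also need a time-based expiry or a re-list on a miss.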

@github-actions

Marking this issue as stale due to inactivity. If this issue receives no comments in the next 30 days it will automatically be closed. If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. This helps our maintainers find and focus on the active issues. Maintainers may also remove the stale label at their discretion. Thank you!

@github-actions github-actions bot added the stale label Mar 24, 2023
@varunthakur2480

This is starting to hurt us really badly. We have a set of 20 resources (Flux) that are being created, and it takes ~20 minutes to run a plan, which often times out. Could this please be prioritised ASAP?

@github-actions github-actions bot removed the stale label Mar 27, 2023
@bchess
Author

bchess commented Mar 27, 2023

FWIW, we moved to kubectl_manifest, and so far it's been much better.

@varunthakur2480

I can try that, but it would be good to address this issue too.

@github-actions github-actions bot added the stale label Mar 29, 2024
@cjgibson

Just noting that this is still an active issue with the upstream Hashi provider, so that the bot doesn't close it. Generally speaking, if you're managing CRDs or CRs at all (or anything else applied via bare YAML), you shouldn't use Hashi's provider.

@github-actions github-actions bot removed the stale label Apr 12, 2024
@vihangm

vihangm commented Jun 8, 2024

This continues to be quite slow. Is there any ETA on whether this is going to be addressed?
